[sqlite] Worstcase consequence of exotic storage failure, and how guard against it?

Simon Slavin Thu, 25 Jun 2015 15:06:24 +0100

On 25 Jun 2015, at 11:49am, Mikael <mikael.trash at gmail.com> wrote:


> Harddrives can fail in the most remarkable, surprising ways imaginable, as
> for instance reflected on
> https://www.usenix.org/legacy/events/fast08/tech/full_papers/bairavasundaram/bairavasundaram_html/index.html
> and https://blog.algolia.com/when-solid-state-drives-are-not-that-solid/ .

A couple of nice pieces of detective work.

> (E.g., writes may fail transparently, random sectors may be zeroed, the
> disk's internal sector map may break so reads may return any garbled data,
> RAM-based disk cache may hide write failures so data initially seems to
> have been written but was not, etc.)

Don't be unduly deterred from SSDs by some reports you'll read on the internet. 
 Many of these articles start from the false premise that "We know rotating 
drives aren't perfect, but SSDs are better technology with no moving parts so 
they should be perfect.".  The actual figures for faulty firmware in rotating 
drives compared with SSDs are about the same for each new model (around 1 in 
150 newly launched models of drive has faulty firmware).  So they're about 
equally likely to show firmware-related faults.

> This motivates me to ask for your complete answer to,
> 
> 1) What is the worst-case SQLite misbehavior(s) that can disk failure can
> lead to, and,

Any misbehaviour by SQLite that gets reported here gets fixed by the developer 
team.  They're really good.  It's one reason why I'm still working with SQLite. 
 So I'm going to answer your question as if it refers to the entire system 
being used: program, APIs, and hardware.

It depends on what you're using the disk for.  In banking, the worst-case 
misbehaviour happens not when the drive stops working, but when incorrect data 
is returned.  This generally takes two simultaneous parity/checksum/hashing 
failures which happens more than you'd think.  It's far better for the bank for 
the disk to fail obviously and completely.  A bank is rich, has an extremely 
good backup protocol, and can easily afford to buy a new disk when one disk 
fails.  If hardware fails in a bank, it just means that a couple of techies are 
going to claim overtime.

If, on the other hand, you have a small measuring device with an embedded 
controller (e.g. GPS unit, measurement devices, calculators), a single 
incorrect figure will probably be obviously wrong, and it will just make you 
shake their head and try again.  If you're at a customer site and you spot the 
number is wrong, the customer will never notice you retrying.  But if you're at 
a customer site and your tools stop working completely it makes you look like 
you buy cheap untrustworthy tools and that can make you or your company look 
bad to the customer.

> 2) Through what return values and otherwise does SQLite report non-exotic
> disk failures, and,

<https://www.sqlite.org/c3ref/c_abort.html>

If the fault happens when the data is not being processed (computer idle or 
file closed) you'll probably get something like _CORRUPT or _NOTADB.  If 
failure happens while data is being handled you'll probably get something like 
_IOERR.  It is rare to see any of these in the wild unless you already know 
your hardware is faulty.  For example, if your computer show unexpected 
unexplained error messages during bootup, don't carry on as normal.

A further breakdown of a number of reasons for SQLITE_IOERR can be found at

<https://www.sqlite.org/c3ref/c_abort_rollback.html>

The lesson here is that your code should check and report any non SQLITE_OK 
result codes returned by /all/ SQLite calls, even if you don't think a fault 
can result, or if the program can't do anything about the error.  Sometimes the 
error for writing to disk doesn't show up in the "UPDATE" command but later on 
when you try to close the file.  Programs I write for life-critical or 
mission-critical situations check and report everything.  Programs I write for 
games, quick lookups or calculations for trivial purposes might not.

> 3) In a server setup, is there any panacea against disk failures - basing
> the storage on RAID1 perhaps?

Nope.  For this question ignore entirely the fact that you're using SQLite and 
just imagine what you'd do for any storage device.  RAID1, for example, is just 
a mirrored pair of disks.  If one returns incorrect data for a 'read' your 
system still falls over because it doesn't know which disk to believe.

If data matters to you, buy a drive rated for server use (24/7 uptime, 
parity/checksums checked, write-before-acknowledge, jumpers set and drivers 
configured for the drive being used in a server).  It will be slower than most 
drives but more trustworthy.  Even then you must assume that your hardware will 
fail sometime, so create a backup procedure, take the backups, and occasionally 
try to restore one to make sure it's a usable backup and your copy/restore 
procedure works.  The most common failure modes in a backup system are not a 
failure to take backups but faults in the procedure so that backups don't 
contain all the data needed, or that the backup can only be restored on the 
hardware which has just failed.

Simon.

[sqlite] Worstcase consequence of exotic storage failure, and how guard against it?

Reply via email to