On Tue, Aug 11, 2009 at 12:07 PM, Jens Alfke <[email protected]> wrote:

> Interesting. Does this really guarantee file integrity even in the case of
> power failure? (I have some experience dealing with file corruption, from
> working on Mac OS X components that use sqlite.) The worst problem is that
> the disk controller will reorder sector writes to reduce seek time, which in
> effect means that if power is lost, some random subset of the last writes
> may not happen. So you won't just end up with a truncated file — you could
> have a file that seems intact and has a correct header at the end, but has
> 4k bytes of garbage somewhere within the last transaction. Does CouchDB's
> file structure guard against that?
It doesn't guard against that, but it's a recoverable situation: if you
think something like that has happened, you can scan back to an older
header and start from there. Effectively, each header is the root of its
own tree, with the append-only writes doing a crude "copy on write" kind
of thing. If you scan back in the file, each header is a "snapshot" of
the db state, and each of those snapshots "should be" consistent.

As for detecting that corruption has happened (like what you suggest), I
don't think couch does anything like that. In theory, it could checksum
the nodes to detect corrupted pages/nodes when they're read, but even
that may not happen until sometime after the corruption has occurred
(though in practice this detection would likely happen during the view
update). Of course, then you get into the situation of "dealing with the
problem". Right now it probably falls into the "don't test for what you
can't handle" bucket.

Regards,

Will Hartung
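P.S. To make the scan-back idea concrete, here's a rough sketch in
Python. It's a toy model I made up for illustration, not CouchDB's
actual Erlang internals or on-disk layout: each commit appends a header
carrying a checksum of its own payload, and recovery walks backward from
the end of the file to the newest header that verifies. Anything past
that point (a torn or reordered sector write, say) just gets ignored.

    import os
    import struct
    import zlib

    MAGIC = b"HDR1"  # marker preceding each commit header (made up)

    def append_header(f, state: bytes) -> None:
        # Append a commit header: magic | 4-byte length | state | crc32(state).
        f.seek(0, os.SEEK_END)
        f.write(MAGIC + struct.pack(">I", len(state)) + state +
                struct.pack(">I", zlib.crc32(state)))
        f.flush()
        os.fsync(f.fileno())

    def find_last_valid_header(f) -> bytes | None:
        # Walk backward from EOF to the newest header whose checksum
        # verifies; everything after it is treated as a torn write.
        f.seek(0)
        data = f.read()  # fine for a sketch; a real store reads in chunks
        pos = data.rfind(MAGIC)
        while pos != -1:
            try:
                (length,) = struct.unpack_from(">I", data, pos + 4)
                state = data[pos + 8 : pos + 8 + length]
                (crc,) = struct.unpack_from(">I", data, pos + 8 + length)
                if len(state) == length and zlib.crc32(state) == crc:
                    return state  # newest consistent snapshot
            except struct.error:
                pass  # truncated tail; keep scanning backward
            pos = data.rfind(MAGIC, 0, pos)
        return None  # no intact header anywhere in the file

    with open("db.bin", "wb+") as f:
        append_header(f, b"snapshot 1")
        append_header(f, b"snapshot 2")
        f.write(b"\x00garbage from a lost sector write")  # torn tail
        print(find_last_valid_header(f))  # -> b'snapshot 2'

With something along those lines, Jens's "random subset of the last
writes never hit the disk" failure degrades to "roll back to the last
snapshot that checks out", which is the recoverable situation I meant
above. (It glosses over details, e.g. the magic bytes turning up inside
a payload by coincidence.)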
