On Tue, Aug 11, 2009 at 12:07 PM, Jens Alfke <[email protected]> wrote:

> Interesting. Does this really guarantee file integrity even in the case of
> power failure? (I have some experience dealing with file corruption, from
> working on Mac OS X components that use sqlite.) The worst problem is that
> the disk controller will reorder sector writes to reduce seek time, which in
> effect means that if power is lost, some random subset of the last writes
> may not happen. So you won't just end up with a truncated file — you could
> have a file that seems intact and has a correct header at the end, but has
> 4k bytes of garbage somewhere within the last transaction. Does CouchDB's
> file structure guard against that?
It doesn't guard against that, but it's a recoverable situation: if you
think something like that has happened, you can scan back to an older
header and start from there. Effectively, each header is the root of its
own tree, with the append-only writes doing a crude "copy on write" kind
of thing. If you scan back in the file, each header is a "snapshot" of
the db state, and each of those snapshots "should be" consistent.

As for detecting that corruption has happened (like what you suggest), I
don't think couch does anything like that. In theory, it could checksum
the nodes to detect corrupted pages/nodes when they're read, but even
that may not happen until sometime after the corruption has occurred
(though in practice this detection would likely happen during the view
update). Of course, then you get into the situation of "dealing with the
problem". Right now it probably falls into the "don't test for what you
can't handle" bucket.

Regards,

Will Hartung
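P.S. To make the scan-back idea concrete, here's a rough sketch in
Python. It's a toy model I made up for illustration, not CouchDB's
actual Erlang internals or on-disk layout: each commit appends a header
carrying a checksum of its own payload, and recovery walks backward from
the end of the file to the newest header that verifies. Anything past
that point (a torn or reordered sector write, say) just gets ignored.

    import os
    import struct
    import zlib

    MAGIC = b"HDR1"  # marker preceding each commit header (made up)

    def append_header(f, state: bytes) -> None:
        # Append a commit header: magic | 4-byte length | state | crc32(state).
        f.seek(0, os.SEEK_END)
        f.write(MAGIC + struct.pack(">I", len(state)) + state +
                struct.pack(">I", zlib.crc32(state)))
        f.flush()
        os.fsync(f.fileno())

    def find_last_valid_header(f) -> bytes | None:
        # Walk backward from EOF to the newest header whose checksum
        # verifies; everything after it is treated as a torn write.
        f.seek(0)
        data = f.read()  # fine for a sketch; a real store reads in chunks
        pos = data.rfind(MAGIC)
        while pos != -1:
            try:
                (length,) = struct.unpack_from(">I", data, pos + 4)
                state = data[pos + 8 : pos + 8 + length]
                (crc,) = struct.unpack_from(">I", data, pos + 8 + length)
                if len(state) == length and zlib.crc32(state) == crc:
                    return state  # newest consistent snapshot
            except struct.error:
                pass  # truncated tail; keep scanning backward
            pos = data.rfind(MAGIC, 0, pos)
        return None  # no intact header anywhere in the file

    with open("db.bin", "wb+") as f:
        append_header(f, b"snapshot 1")
        append_header(f, b"snapshot 2")
        f.write(b"\x00garbage from a lost sector write")  # torn tail
        print(find_last_valid_header(f))  # -> b'snapshot 2'

With something along those lines, Jens's "random subset of the last
writes never hit the disk" failure degrades to "roll back to the last
snapshot that checks out", which is the recoverable situation I meant
above. (It glosses over details, e.g. the magic bytes turning up inside
a payload by coincidence.)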
