On Oct 25, 2011, at 3:09 PM, Mark Hahn wrote:

>> CouchDB does a full sync, because this is the only way to be proof against
>> disasters like power failures
> 
> Isn't couchdb crash-proof due to append-only writing?  What do you gain
> other than possible loss of latest writes, which you can lose anyway with a
> fsync.

You can get corrupted files even with append-only writing. In the worst case, 
let’s say everything gets written to disk before the power failure _except_ for 
one disk block in the middle of the update*. After rebooting, you have what 
appears to be a valid file (it’s got the magic trailer) except that 4096 (or 
512 or whatever) bytes in one of the last updated documents are garbage.

I don’t know the details of how CouchDB finds the trailer in the file, but it 
would have to be doing something like a checksum of every single write to guard 
against that, which seems too expensive to me.
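
To make "a checksum of every single write" concrete, here is a rough sketch in 
Python. The record layout (a length and a CRC-32 in front of each payload) and 
the names frame() and verify() are made up for illustration; this is not a 
description of CouchDB's actual file format.

```python
import struct
import zlib

HEADER = struct.Struct(">II")  # hypothetical layout: payload length, CRC-32


def frame(payload: bytes) -> bytes:
    """Prefix a payload with its length and CRC-32 before appending it."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def verify(record: bytes) -> bool:
    """Return True only if the record is complete and its checksum matches."""
    if len(record) < HEADER.size:
        return False
    length, crc = HEADER.unpack(record[:HEADER.size])
    body = record[HEADER.size:HEADER.size + length]
    return len(body) == length and zlib.crc32(body) == crc
```

The cost I'm worried about is that a reader would have to run verify() over 
every record it touches after a crash, not just look for the trailer.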

Instead, to be safe, what you do is write the payload, wait for a full fsync, 
then write the trailer only after you know that the entire payload is safely on 
the platters.
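
A minimal sketch of that ordering in Python, assuming a single append-only file. 
The TRAILER bytes and the append_update() name are placeholders of mine, not 
anything taken from CouchDB:

```python
import os

TRAILER = b"\x00TRAILER\x00"  # placeholder marker, not CouchDB's real trailer


def append_update(path: str, payload: bytes) -> None:
    """Append payload, sync it, and only then append and sync the trailer."""
    with open(path, "ab") as f:
        f.write(payload)
        f.flush()               # push Python's buffer to the OS
        os.fsync(f.fileno())    # wait until the payload is reported durable

        f.write(TRAILER)        # written only after the payload sync returns
        f.flush()
        os.fsync(f.fileno())    # make the trailer itself durable
```

Note that os.fsync() may not force the drive's own write cache out on every 
platform; macOS, for example, has a separate F_FULLFSYNC fcntl for that, so 
treat the calls above as standing in for whatever "full" sync your OS offers.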

—Jens

* Disk controllers don’t write sectors in the order they receive the write 
requests. They shove them in the cache, then write them out grouped by tracks 
to minimize seek time.
