Thanks Damien. I'm thinking that the situation you describe cannot occur if before_header is enabled in the fsync_options, since any data pointed to by the #db_header that the server found after the restart was already synced. Is that correct?
Adam

On Apr 14, 2010, at 10:26 AM, Damien Katz wrote:

> The reason for fsync on open is the server doesn't know if the data it's
> reading off the file is committed fully to the disk. It's possible the
> server wrote to file and crashed before fsync, then restarted. Then it could
> refresh view indexes on the non-fsynced storage data, for example, and crash
> again, losing data in the storage file, but not the updates to the index
> file. Now the index is permanently out of date with the storage file. But if
> you fsync on opening the storage file, that can't happen.
>
> -Damien
>
>
> On Apr 14, 2010, at 5:52 AM, Adam Kocoloski wrote:
>
>> Initially posted on user@, but maybe it got lost in the noise. Does anyone
>> know why we call fsync when we open a file?
>>
>> Adam
>>
>> Begin forwarded message:
>>
>>> From: Adam Kocoloski <[email protected]>
>>> Date: April 11, 2010 10:44:03 PM EDT
>>> To: [email protected]
>>> Subject: optimal settings for [couchdb] fsync_options?
>>>
>>> Hi folks, I wanted to assemble some concrete information about the purpose
>>> of each of the three fsync_options available in CouchDB and under what
>>> conditions they should be enabled/disabled. These options are
>>>
>>> 1) before_header - calls file:sync(Fd) before writing a DB header to disk.
>>> I believe the goal here is to prevent DB corruption by ensuring that all
>>> the data referred to by the header is durably stored before the header is
>>> written. A system that preserves write ordering could safely disable this
>>> option. Does anyone know an example of such a system? Perhaps a
>>> combination of a noop IO scheduler and a write-through or nonvolatile disk
>>> cache?
>>>
>>> 2) after_header - calls file:sync(Fd) immediately after writing the DB
>>> header. I think this one is done so that we don't lose too much data
>>> following a CouchDB restart, and so that a client can ensure that stored
>>> data will be retrievable after a restart by POSTing to
>>> /db/_ensure_full_commit. It might make sense to disable this option if
>>> e.g. you're relying on replication for durability. Although that's dicey
>>> because the replicator calls ensure_full_commit for both DBs before writing
>>> its own checkpoint record*, and by disabling the after_header option you'd
>>> run the risk of skipping updates on the target in the face of a power
>>> failure.
>>>
>>> 3) on_file_open - calls file:sync(Fd) immediately after opening a DB file.
>>> I really don't know the purpose of this one. Anyone?
>>>
>>> Best, Adam
>>>
>>> * The reason the replicator calls ensure_full_commit on the source is to
>>> detect situations where update_seqs might be reused. I wonder if we could
>>> engineer a way around that ever happening, for example by ensuring that on
>>> restart the update sequence jumps by a large number. But that's a
>>> discussion for d...@.
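
For readers following the thread, the ordering of the three sync points discussed above can be sketched roughly as follows. This is a minimal illustration in Python, not CouchDB's actual Erlang code: the names `open_db`, `commit`, `demo`, and the `log` list are hypothetical, and `os.fsync` stands in for `file:sync(Fd)`. It assumes an append-only file in which a header is written after the data it refers to.

```python
import os
import tempfile

# The three option names come from the thread; everything else is illustrative.
ALL_OPTIONS = ("before_header", "after_header", "on_file_open")


def open_db(path, fsync_options, log):
    """Open an append-only file, optionally syncing it right away.

    Syncing on open flushes anything a previous (possibly crashed) process
    wrote but never synced, so later readers do not build on volatile data.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_APPEND, 0o644)
    if "on_file_open" in fsync_options:
        os.fsync(fd)
        log.append("fsync(on_file_open)")
    return fd


def commit(fd, body, header, fsync_options, log):
    """Append data, then a header that refers to it, with optional syncs."""
    os.write(fd, body)
    log.append("write(body)")

    # Make the referenced data durable before the header can reach disk.
    if "before_header" in fsync_options:
        os.fsync(fd)
        log.append("fsync(before_header)")

    os.write(fd, header)
    log.append("write(header)")

    # Make the header itself durable so the commit survives a restart.
    if "after_header" in fsync_options:
        os.fsync(fd)
        log.append("fsync(after_header)")


def demo(fsync_options=ALL_OPTIONS):
    """Run one open + commit cycle and return (operation log, file bytes)."""
    log = []
    path = os.path.join(tempfile.mkdtemp(), "demo.couch")
    fd = open_db(path, fsync_options, log)
    try:
        commit(fd, b"doc-bytes", b"HEADER", fsync_options, log)
    finally:
        os.close(fd)
    with open(path, "rb") as f:
        return log, f.read()
```

Calling `demo()` with all three options enabled records a sync on open, a sync between the body and header writes, and a sync after the header. Calling `demo(())` records only the two writes, which is the case where durability depends entirely on the OS and the storage stack preserving write order.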
