Initially posted on user@, but maybe it got lost in the noise.  Does anyone 
know why we call fsync when we open a file?

Adam

Begin forwarded message:

> From: Adam Kocoloski <[email protected]>
> Date: April 11, 2010 10:44:03 PM EDT
> To: [email protected]
> Subject: optimal settings for [couchdb] fsync_options?
> 
> Hi folks, I wanted to assemble some concrete information about the purpose of 
> each of the three fsync_options available in CouchDB and under what 
> conditions they should be enabled/disabled.  These options are
> 
> 1) before_header - calls file:sync(Fd) before writing a DB header to disk.  I 
> believe the goal here is to prevent DB corruption by ensuring that all the 
> data referred to by the header is durably stored before the header is 
> written.  A system that preserves write ordering could safely disable this 
> option.  Does anyone know an example of such a system? Perhaps a combination 
> of a noop IO scheduler and a write-through or nonvolatile disk cache?
> 
> 2) after_header - calls file:sync(Fd) immediately after writing the DB 
> header.  I think this one is done so that we don't lose too much data 
> following a CouchDB restart, and so that a client can ensure that stored data 
> will be retrievable after a restart by POSTing to /db/_ensure_full_commit.  
> It might make sense to disable this option if, e.g., you're relying on 
> replication for durability.  That's dicey, though: the replicator 
> calls ensure_full_commit on both DBs before writing its own checkpoint 
> record*, so with after_header disabled you'd run the risk of 
> skipping updates on the target in the face of a power failure.
> 
> 3) on_file_open - calls file:sync(Fd) immediately after opening a DB file.  I 
> really don't know the purpose of this one.  Anyone?
> 
> Best, Adam
> 
> * The reason the replicator calls ensure_full_commit on the source is to 
> detect situations where update_seqs might be reused.  I wonder if we could 
> engineer a way around that ever happening, for example by ensuring that on 
> restart the update sequence jumps by a large number.  But that's a discussion 
> for d...@.
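
For anyone who wants the ordering spelled out, here is a rough sketch of where the three syncs described above would sit around a header write. It is Python rather than Erlang and is not CouchDB's actual code: open_db, commit and the keyword arguments are hypothetical names that only mirror the option names, and os.fsync stands in for file:sync(Fd).

```python
import os
import tempfile


def open_db(path, on_file_open=True):
    """Open (or create) an append-only file, optionally syncing right away."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_APPEND, 0o644)
    if on_file_open:
        os.fsync(fd)  # the on_file_open case: sync immediately after opening
    return fd


def commit(fd, data, header, before_header=True, after_header=True):
    """Append data, then a header; return the ordered list of steps taken.

    Assumes small writes that os.write completes in a single call.
    """
    steps = []
    os.write(fd, data)
    steps.append("write_data")
    if before_header:
        # Make the data durable before a header that refers to it exists.
        os.fsync(fd)
        steps.append("fsync")
    os.write(fd, header)
    steps.append("write_header")
    if after_header:
        # Make the header itself durable before reporting the commit.
        os.fsync(fd)
        steps.append("fsync")
    return steps


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "example.couch")
    fd = open_db(path)
    print(commit(fd, b"doc-body", b"header"))
    os.close(fd)
```

With both header options off, nothing in this sketch forces the data to reach the disk before the header, which is the write-ordering concern in (1).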
