Thanks Damien.  I'm thinking that the situation you describe cannot occur if 
before_header is enabled in the fsync_options, since any data pointed to by the 
#db_header that the server found after the restart was already synced.  Is that 
correct?

Adam

On Apr 14, 2010, at 10:26 AM, Damien Katz wrote:

> The reason for fsync on open is that the server doesn't know whether the 
> data it's reading off the file is fully committed to disk. It's possible that 
> the server wrote to the file and crashed before fsync, then restarted. Then it could 
> refresh view indexes on the non-fsynced storage data, for example, and crash 
> again, losing data in the storage file, but not the updates to the index 
> file. Now the index is permanently out of date with the storage file. But if 
> you fsync on opening the storage file, that can't happen.
> 
> -Damien
> 
> 
> On Apr 14, 2010, at 5:52 AM, Adam Kocoloski wrote:
> 
>> Initially posted on user@, but maybe it got lost in the noise.  Does anyone 
>> know why we call fsync when we open a file?
>> 
>> Adam
>> 
>> Begin forwarded message:
>> 
>>> From: Adam Kocoloski <[email protected]>
>>> Date: April 11, 2010 10:44:03 PM EDT
>>> To: [email protected]
>>> Subject: optimal settings for [couchdb] fsync_options?
>>> 
>>> Hi folks, I wanted to assemble some concrete information about the purpose 
>>> of each of the three fsync_options available in CouchDB and under what 
>>> conditions they should be enabled/disabled.  These options are
>>> 
>>> 1) before_header - calls file:sync(Fd) before writing a DB header to disk.  
>>> I believe the goal here is to prevent DB corruption by ensuring that all 
>>> the data referred to by the header is durably stored before the header is 
>>> written.  A system that preserves write ordering could safely disable this 
>>> option.  Does anyone know an example of such a system? Perhaps a 
>>> combination of a noop IO scheduler and a write-through or nonvolatile disk 
>>> cache?
>>> 
>>> 2) after_header - calls file:sync(Fd) immediately after writing the DB 
>>> header.  I think this one is done so that we don't lose too much data 
>>> following a CouchDB restart, and so that a client can ensure that stored 
>>> data will be retrievable after a restart by POSTing to 
>>> /db/_ensure_full_commit.  It might make sense to disable this option if 
>>> e.g. you're relying on replication for durability.  Although that's dicey 
>>> because the replicator calls ensure_full_commit for both DBs before writing 
>>> its own checkpoint record*, and by disabling the after_header option you'd 
>>> run the risk of skipping updates on the target in the face of a power 
>>> failure.
>>> 
>>> 3) on_file_open - calls file:sync(Fd) immediately after opening a DB file.  
>>> I really don't know the purpose of this one.  Anyone?
>>> 
>>> Best, Adam
>>> 
>>> * The reason the replicator calls ensure_full_commit on the source is to 
>>> detect situations where update_seqs might be reused.  I wonder if we could 
>>> engineer a way around that ever happening, for example by ensuring that on 
>>> restart the update sequence jumps by a large number.  But that's a 
>>> discussion for d...@.
>> 
> 
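
For readers of this thread, the ordering that the three fsync_options control can be sketched roughly as follows. This is a minimal illustration in Python, not CouchDB's actual Erlang code: open_db, commit, ALL_OPTIONS, and the returned trace list are hypothetical names made up for the example, and os.fsync stands in for the file:sync(Fd) calls described above.

```python
import os
import tempfile

# The three option names come from the thread; everything else is assumed.
ALL_OPTIONS = frozenset({"before_header", "after_header", "on_file_open"})


def open_db(path, fsync_options=ALL_OPTIONS):
    """Open (or create) an append-only file. Returns (fd, trace of steps)."""
    trace = ["open"]
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_APPEND, 0o644)
    if "on_file_open" in fsync_options:
        # Flush anything a previous process wrote but never synced, so that
        # whatever we read from now on is already durable.
        os.fsync(fd)
        trace.append("fsync")
    return fd, trace


def commit(fd, body, header, fsync_options=ALL_OPTIONS):
    """Append body, then header, syncing around the header as configured."""
    trace = []
    os.write(fd, body)
    trace.append("write_body")
    if "before_header" in fsync_options:
        # Make the data the header will point to durable first.
        os.fsync(fd)
        trace.append("fsync")
    os.write(fd, header)
    trace.append("write_header")
    if "after_header" in fsync_options:
        # Make the header itself durable before reporting the commit.
        os.fsync(fd)
        trace.append("fsync")
    return trace


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "example.couch")
    fd, opened = open_db(path)
    print(opened)
    print(commit(fd, b"doc-body", b"db-header"))
    os.close(fd)
```

With all three options enabled, every header is bracketed by syncs. Dropping before_header from the set removes the sync between the body write and the header write, which is only safe on a system that preserves write ordering, as discussed in the original message.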
