A CouchDB database *is* a write-ahead journal by design: the file is opened with O_APPEND, and all writes go strictly to the end of the file. Updates become visible only after a new database header has been fully written to disk (partial header writes are detectable; in that event we seek backward to the previous valid header).
With delayed commits set to false, fsync is guaranteed to have been called before the HTTP response to the write is sent, but this is *not* the same as "every write will be immediately flushed to disk": concurrent writes are fsynced together. With delayed commits set to true, fsync is called once per second, which is indeed an opportunity for data loss, since the client receives the ack before the fsync call is made. I strongly recommend setting delayed commits to false; this will be the default in CouchDB 2.0. It is the only sane setting for a database.

I can't speak for MongoDB beyond lamenting their laissez-faire approach to data safety, but I can clarify what happens in CouchDB 2.0 when we have multiple copies of every item of data. In CouchDB 2.0 (and BigCouch, obviously), we keep three copies of each document, on separate nodes. A write attempts to update all three copies in parallel and responds to the client once two of them acknowledge the write (as noted, this happens after the fsync call). In that event, a 201 code is returned. If, for whatever reason, only one write occurred, we return a 202 code as an indication that the write, while persistent on *a* disk, is not yet stored redundantly.

Every write to a copy of a document triggers an internal healing mechanism that ensures it reaches every other replica, so even a 202 will graduate to a 201 internally once the nodes are available again. This mechanism is, to no one's real surprise, exactly the same mechanism as replication between two databases (i.e., it reuses the MVCC power of CouchDB to ensure eventual consistency between replicas).

Does that answer your question?

B.

On 17 Jul 2014, at 08:12, Bhanu <[email protected]> wrote:

> With delayed commits to false, my understanding is that every write will be
> immediately flushed to disk which may affect the overall performance of
> couchdb. Is there any way to get durability guarantee with delayed commits
> set to true?
>
> Does couchdb not use any journalling/write-ahead logging like mongodb does
> -- so that journal files could be written more frequently than the actual
> data file?
>
> Since CouchDB is AP with eventual consistency, I believe it suffers from
> same mongodb issues mentioned here --
> https://groups.google.com/forum/#!topic/mongodb-user/SDY82VKzif0
>
> Is there any case where couchdb will lose write though the write is
> successfully acknowledged to client?
>
> Can we say that if delayed_commits is set to true, then CouchDB might lose
> data? When will client receive acknowledgement in this case (delayed_commits
> = true)? After the flushing or immediately?
>
> Thanks,
> Bhanu
>
> --
> View this message in context:
> http://couchdb-development.1959287.n2.nabble.com/What-are-the-cases-in-which-we-can-see-data-loss-with-CouchDB-tp7593186p7593325.html
> Sent from the CouchDB Development mailing list archive at Nabble.com.
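P.S. For the curious, the n=3/w=2 response logic and the healing step described above can be sketched as a toy model. This is not the actual Erlang implementation inside CouchDB 2.0 / BigCouch; the function names and the set-based replica model here are invented purely for illustration:

```python
def write_status(acks: int, w: int = 2) -> int:
    """Map replica fsync acknowledgements to an HTTP code: 201 when the
    write quorum (w of n=3) is met, 202 when the write is durable on at
    least one disk but not yet stored redundantly."""
    if acks >= w:
        return 201  # quorum met: durable and redundant
    if acks >= 1:
        return 202  # durable on *a* disk; healing covers the rest
    raise RuntimeError("write failed on every replica")


def heal(replicas: list[set], doc: str) -> None:
    """Internal anti-entropy: any replica holding the doc pushes it to
    the others -- the same mechanism as replication between databases."""
    if any(doc in r for r in replicas):
        for r in replicas:
            r.add(doc)
```

So a write that lands on only one node returns 202, and once `heal` has run with the other nodes available again, all three replicas hold the document, which is what "a 202 graduates to a 201 internally" means.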
