Adam,

Thanks for the time you spent explaining things. I should have traced into the code a little further I guess ;-).

I've just created a ticket for the removal of the two batch_save config options.

- Matt

On 14 April 2010 15:02, Adam Kocoloski <[email protected]> wrote:
> On Apr 14, 2010, at 9:38 AM, Matt Goodall wrote:
>
>> On 14 April 2010 13:23, Adam Kocoloski <[email protected]> wrote:
>>> On Apr 14, 2010, at 7:59 AM, Matt Goodall wrote:
>>>
>>>> Hi,
>>>>
>>>> Over in couchdb-python land someone wanted to use batch=ok when
>>>> creating and updating documents, so we added support.
>>>>
>>>> I was semi-surprised to notice that _bulk_docs does not support
>>>> batch=ok. I realise _bulk_docs essentially is a batch update but a
>>>> _bulk_docs batch=ok would presumably allow CouchDB to buffer more in
>>>> memory before writing to disk. What are your thoughts?
>>>
>>> It's probably of limited utility. If you're already batching on the client
>>> side, you can achieve the same effect by sending in a larger batch. I'm
>>> not opposed to it per se, just don't think it will help with throughput all
>>> that much.
>>
>> :nod: given the new behaviour I'm inclined to agree.
>>
>>>
>>>>
>>>> Now, this buffering is where the "implementation concerns" come in.
>>>> According to the wiki:
>>>>
>>>> "There is a query option batch=ok which can be used to achieve higher
>>>> throughput at the cost of lower guarantees. When a PUT (or a document
>>>> POST as described below) is sent using this option, it is not
>>>> immediately written to disk. Instead it is stored in memory on a
>>>> per-user basis for a second or so (or the number of docs in memory
>>>> reaches a certain point). After the threshold has passed, the docs are
>>>> committed to disk."
>>>>
>>>> However, unless I'm missing something (quite likely ;-)), there is no
>>>> "stored in memory on a per-user basis" or any check for when "the
>>>> number of docs in memory reaches a certain point". All it seems to do
>>>> is spawn a new process so the update happens when the Erlang scheduler
>>>> gets around to it. In fact, I don't see any reference to the
>>>> batch_save_interval and batch_save_size configuration options in the
>>>> code.
>>>
>>> The wiki describes the 0.10 implementation of batch=ok. In 0.11 batch mode
>>> takes advantage of the fact that couch_db_updater always merges all waiting
>>> updates to a DB into a single write, and so doesn't bother with the
>>> separate set of supervised processes accumulating documents. In effect the
>>> 0.11 batch=ok is "I'm not going to wait around, but save this as soon as
>>> you get a chance".
>>
>> Ah, I didn't dig far enough into the code to see that happening.
>>
>> So, purely for my understanding, it's now simplified to a delayed
>> commit that happens at most 1000ms after normal changes are received.
>> Anything that causes the commit to happen earlier cancels the pending
>> commit.
>>
>> Does that mean that batch="ok" with delayed_commits=false is meaningless?
>
> So, we should distinguish between writes and fsyncs. CouchDB 0.11 never
> waits to write; if there is an update_docs message in couch_db_updater's
> mailbox it acts on that "immediately" (that is, as soon as it finishes
> whatever else it's doing at the moment). Moreover, it batches together all
> the update_docs messages in its mailbox and does one write operation. At the
> end of this write operation the modified pages may not yet be flushed to
> disk; in fact, they almost certainly are not. The kernel is caching them for
> a period of time. That's where fsync comes in.
>
> The delayed_commits setting controls the frequency with which CouchDB writes
> the DB header and calls fsync. If it is set to false, CouchDB syncs the file
> as soon as it completes a write operation. A write operation can be a single
> document update, or it can update multiple documents in the case of
> concurrent writer threads, batch=ok, or _bulk_docs requests. If
> delayed_commits is set to true, CouchDB syncs the file at 1 second intervals
> (if an update to the file has occurred in that interval, of course).
>
> batch=ok with delayed_commits=false is not quite meaningless, but you're
> right, you probably won't sneak too many updates into a single commit unless
> fsync is really slow. One example is OS X, where Erlang's file:sync calls a
> different fcntl which actually forces the hard disk to flush the data to
> spinning platters. It's super-slow but more reliable than regular-old fsync,
> which just gets the data from the kernel to the hard disk's cache. If you
> have a non-volatile disk cache on your Linux server that's cool, but a
> regular old consumer hard drive in your MacBook does not have that luxury.
>
>> Anyway, it sounds like the two batch_save config options should be
>> removed from etc/couchdb/default.ini.tpl.in.
>
> Yes.
>
>>> This does change the performance characteristics quite a bit; in
>>> particular, when the underlying disk is fast the new batch=ok behavior will
>>> result in significantly larger uncompacted databases.
>>
>> Agh, this suggests I didn't understand the updater's behaviour. A large
>> uncompacted database normally means lots of small additions to the
>> database file. How does fast disk speed affect that?
>
> All I meant there was that if the disk is slow, you can dump a bunch of
> messages into couch_db_updater's mailbox while it's talking to the disk.
> When it finishes what it's doing and looks in the mailbox, it'll batch
> everything in the mailbox together for the next write op. This results in a
> somewhat smaller DB file. If the disk is fast couch_db_updater's mailbox
> will be mostly empty, and it'll be doing a larger number of smaller
> operations.
>
> Best,
> Adam
>
>>>
>>>> Shouldn't batch=ok send the doc off to some background process that
>>>> accumulates docs until either the batch interval or size threshold has
>>>> been reached? This would also ensure that batch=ok updates are handled
>>>> in the order they arrive, although I'm not sure if that matters given
>>>> that the user has basically said they don't care if it succeeds or not
>>>> by using batch=ok.
>>>
>>> I think the document updates are still handled in the order in which they
>>> were received.
>>>
>>>>
>>>> - Matt
>>>
>>>
>>> Best, Adam
>
>
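
The write-versus-fsync distinction drawn in the thread can be illustrated outside CouchDB. The following is only a minimal Python sketch, not CouchDB's code: `append_record` and its `sync` flag are hypothetical names, with `sync=True` standing in loosely for syncing after every write (as with delayed_commits=false) and `sync=False` for leaving the data in the kernel's cache until a later sync.

```python
import os
import tempfile


def append_record(fd, payload, sync):
    """Append payload to fd; optionally ask the OS to flush it to the device.

    Hypothetical helper for illustration only; this is not a CouchDB API.
    """
    # write() hands the bytes to the kernel, which typically caches them.
    os.write(fd, payload)
    if sync:
        # fsync() asks the kernel to push the cached data for this file out
        # to the storage device before returning.
        os.fsync(fd)


fd, path = tempfile.mkstemp()
try:
    append_record(fd, b"doc-1\n", sync=False)  # written, not explicitly synced
    append_record(fd, b"doc-2\n", sync=True)   # written, then synced
    size = os.fstat(fd).st_size                # both records are visible in the file
finally:
    os.close(fd)
    os.remove(path)
```

Both records are visible to readers right after the write; the sync only affects what would survive a power loss. How far a sync actually reaches (drive cache versus platters) depends on the platform, which is the OS X point made in the thread.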

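
The effect of disk speed on the number of write operations can be sketched with a toy model. This is an assumption-laden simulation, not couch_db_updater itself: `simulate_updater`, the one-update-per-millisecond arrival pattern, and the fixed per-write cost are all invented for illustration. The single writer takes everything waiting in its mailbox and performs one write per pass, as described in the thread.

```python
def simulate_updater(arrival_times, write_duration):
    """Return the batch sizes produced by one writer that drains its mailbox.

    arrival_times: sorted integer timestamps (microseconds) of update messages.
    write_duration: microseconds one write takes, regardless of batch size
                    (a simplifying assumption).
    """
    batches = []
    clock = 0
    i = 0
    n = len(arrival_times)
    while i < n:
        # If the mailbox is empty, idle until the next message arrives.
        clock = max(clock, arrival_times[i])
        # Take every message that has arrived by now and write them together.
        j = i
        while j < n and arrival_times[j] <= clock:
            j += 1
        batches.append(j - i)
        # The writer is busy with the disk; new messages queue up meanwhile.
        clock += write_duration
        i = j
    return batches


# 100 updates, one per millisecond (hypothetical workload).
arrivals = list(range(0, 100_000, 1000))
fast = simulate_updater(arrivals, write_duration=500)     # 0.5 ms per write
slow = simulate_updater(arrivals, write_duration=10_000)  # 10 ms per write
print(len(fast), "writes on the fast disk;", len(slow), "writes on the slow disk")
```

In this model the fast disk ends up doing one write per update, while the slow disk groups roughly ten updates into each write, which matches the explanation of why a fast disk leads to more, smaller operations and a larger uncompacted file.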