Hi,

Over in couchdb-python land someone wanted to use batch=ok when creating and updating documents, so we added support for it.
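(For anyone following along: batch=ok is just a query-string option on an ordinary document PUT or POST, so supporting it client-side is mostly URL construction. A minimal sketch; the host and database name are made up:)

```python
from urllib.parse import urlencode, urljoin

# Hypothetical CouchDB instance and database -- adjust to taste.
base = "http://localhost:5984/mydb/"

# batch=ok rides along as a query parameter on the document URL.
url = urljoin(base, "some-doc-id") + "?" + urlencode({"batch": "ok"})
print(url)  # http://localhost:5984/mydb/some-doc-id?batch=ok
```

(The server acknowledges such a request with 202 Accepted rather than 201 Created, since nothing has hit disk yet.)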
I was semi-surprised to notice that _bulk_docs does not support batch=ok. I realise _bulk_docs is essentially a batch update already, but a _bulk_docs batch=ok would presumably allow CouchDB to buffer more in memory before writing to disk. What are your thoughts?

Now, this buffering is where the "implementation concerns" come in. According to the wiki:

"There is a query option batch=ok which can be used to achieve higher throughput at the cost of lower guarantees. When a PUT (or a document POST as described below) is sent using this option, it is not immediately written to disk. Instead it is stored in memory on a per-user basis for a second or so (or the number of docs in memory reaches a certain point). After the threshold has passed, the docs are committed to disk."

However, unless I'm missing something (quite likely ;-)), there is no "stored in memory on a per-user basis", nor any check for when "the number of docs in memory reaches a certain point". All the code seems to do is spawn a new process, so the update happens whenever the Erlang scheduler gets around to it. In fact, I don't see any reference to the batch_save_interval and batch_save_size configuration options in the code at all.

Shouldn't batch=ok hand the doc off to some background process that accumulates docs until either the batch interval or the size threshold has been reached? That would also ensure batch=ok updates are handled in the order they arrive, although I'm not sure ordering matters, given that by using batch=ok the user has essentially said they don't care whether the update succeeds.

- Matt
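P.S. For concreteness, here is a rough sketch of the kind of accumulator I mean. The real thing would of course be an Erlang process inside CouchDB; this is just Python pseudocode-made-runnable, with commit_fn standing in for the actual disk write and the parameter names borrowed from the batch_save_size / batch_save_interval config options:

```python
import time


class BatchAccumulator:
    """Buffers docs in memory and flushes when either threshold is hit."""

    def __init__(self, commit_fn, batch_save_size=1000, batch_save_interval=1.0):
        self.commit_fn = commit_fn          # stand-in for the real disk commit
        self.size = batch_save_size         # flush after this many docs...
        self.interval = batch_save_interval # ...or after this many seconds
        self.docs = []
        self.last_flush = time.monotonic()

    def add(self, doc):
        # Appending preserves arrival order, so batched updates
        # are committed in the order they came in.
        self.docs.append(doc)
        if (len(self.docs) >= self.size
                or time.monotonic() - self.last_flush >= self.interval):
            self.flush()

    def flush(self):
        if self.docs:
            self.commit_fn(self.docs)  # one write covers the whole batch
            self.docs = []
        self.last_flush = time.monotonic()
```

(A real implementation would also need a timer so the interval flush fires even when no new docs arrive, but the size/interval thresholds are the point here.)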
