On Aug 18, 2009, at 4:33 AM, Brian Candler wrote:
On Sat, Aug 15, 2009 at 10:17:28AM -0700, Chris Anderson wrote:
One middle ground implementation that could work for throughput would be to use the batch=ok ets-based storage, but instead of immediately returning 202 Accepted, hold the connection open until the batch is written, and return 201 Created after the batch is written. This would allow the server to optimize batch size without the client needing to worry about things, and we could return 201 Created and maintain our strong consistency guarantees.
Do you mean default to batch=ok behaviour? (In which case, if you don't want to batch you'd specify something else, e.g. x-couch-full-commit: true?)
This is fine by me. Of course, clients doing sequential writes may see very poor performance (i.e. write - wait for response - write - wait for response, etc). However, this approach should work well with HTTP pipelining, as well as with clients which open multiple concurrent HTTP connections. The replicator would need to do pipelining, if it doesn't already.
Errm, it's going to be tough to pipeline PUTs and POSTs, as that's labeled a SHOULD NOT in RFC 2616. Even if we know that it would be safe to pipeline PUTs in CouchDB, HTTP clients are probably not going to let it happen. I certainly agree about the connection pool, though. The replicator does use a connection pool, and it pipelines GET requests, too.
http://www.w3.org/Protocols/rfc2616/rfc2616-sec8.html#sec8.1.2.2
As I was attempting to say before: any solution which makes write guarantees should expose behaviour which is meaningful to the client.
- there's no point doing a full commit on every write unless you delay the HTTP response until after the commit (otherwise there's still a window where the client thinks the data has gone safely to disk, but actually it could be lost)
Right, and we do delay the response in that case, so I think it is
meaningful.
- there's no point having two different forms of non-safe write, because there's no reasonable way for the client to choose between them. Currently we have 'batch=ok', and we also have a normal write without 'x-couch-full-commit: true' - both end up with the data sitting in RAM for a while before going to disk, the difference being whether it's Erlang RAM or VFS buffer cache RAM.
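The distinction being drawn here (application RAM vs. the OS buffer cache vs. the platter) is the same one any program faces; a purely illustrative Python analogy of the three durability levels, not CouchDB code:

```python
import os
import tempfile

# Three durability levels, roughly analogous to the three CouchDB write modes:

buf = [b'{"doc":1}\n']        # 1. batch=ok: data sits in process (Erlang) RAM;
                              #    lost if the server process dies

fd, path = tempfile.mkstemp()
os.write(fd, b'{"doc":2}\n')  # 2. plain write: data sits in the VFS buffer
                              #    cache; survives a process crash, but not
                              #    a machine crash
os.fsync(fd)                  # 3. full commit: flushed to the device
os.close(fd)                  #    (modulo the drive's own write cache)
```

The point of the bullet above is that levels 1 and 2 are indistinguishable from the client's perspective: neither guarantees anything until an fsync happens.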
I like the idea of being able to tune the batch size internally within the server. This could allow CouchDB to automatically adjust for performance without changing consistency guarantees, eg: run large batches when under heavy load, but when accessed by a single user, just do full_commits all the time.
I agree. I also think it would be good to be able to tune this per DB, or more simply, per write.

e.g. a PUT request could specify max_wait=2000 (if not specified, use a default value from the ini file). Subsequent requests could specify their own max_wait params, and a full commit would occur when the earliest of these times occurs. max_wait=0 would then replace the x-couch-full-commit: header, which seems like a bit of a frig to me anyway.
Clients could also be prevented from being resource hogs by specifying a min_wait in the ini file. That is, if you set min_wait=100, then any client which insists on having a full commit by specifying max_wait=0 may find itself delayed up to 0.1s before its request is honoured.
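The scheduling rule being proposed is small enough to sketch. A toy Python version, using the max_wait/min_wait names from the mail (this is not CouchDB's actual API, just the arithmetic):

```python
def commit_time(requests, min_wait_ms):
    """When should the next full commit fire?

    requests: list of (arrival_time_s, max_wait_ms) pairs for the
    pending writes. Each request's deadline is its arrival time plus
    the larger of its own max_wait and the server-wide min_wait floor;
    the commit fires at the earliest such deadline.
    """
    return min(arrival + max(max_wait, min_wait_ms) / 1000.0
               for arrival, max_wait in requests)
```

So a max_wait=0 client with min_wait=100 in the ini file is still delayed 0.1s, while a later request with a short max_wait can pull the commit earlier for everyone already waiting.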
I interpreted Chris' idea differently. Instead of exposing yet more
ways to try to tune the DB, put the tuning logic into the server and
let it choose when to commit in an attempt to optimize both latency
and throughput.
A simple example might be to group together all outstanding write
requests and do one commit for the group. When the write load is low,
we commit after every update. When the disk is slow or the write load
is high, we could have multiple incoming write requests while a single
commit is in progress. Instead of committing each one separately (the
current behavior AFAIK) we'd update them all together like a single
_bulk_docs request. The latency for the earliest requests would
increase, but the throughput would be much higher.
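The group-commit behaviour described above can be sketched in a few dozen lines. A toy Python version (CouchDB itself would do this in Erlang; all names here are made up for illustration): the first writer becomes the "leader" and commits batches; writers that arrive while a commit is on disk pile into the next batch and share a single commit.

```python
import threading

class GroupCommitter:
    """Toy group commit: while one commit is in progress, later writers
    join the next batch, so one durable write covers the whole group."""

    def __init__(self, commit_fn):
        self.commit_fn = commit_fn           # durably writes a list of updates
        self.lock = threading.Lock()
        self.batch = []                      # updates awaiting the next commit
        self.batch_done = threading.Event()  # set once self.batch is committed
        self.leader_busy = False

    def write(self, update):
        with self.lock:
            self.batch.append(update)
            done = self.batch_done           # the event for *our* batch
            follower = self.leader_busy
            if not follower:
                self.leader_busy = True
        if follower:
            done.wait()                      # our batch's commit wakes us
            return
        # Leader: commit batches until none are pending.
        while True:
            with self.lock:
                if not self.batch:
                    self.leader_busy = False
                    return
                batch, self.batch = self.batch, []
                batch_event, self.batch_done = self.batch_done, threading.Event()
            self.commit_fn(batch)            # one commit for the whole group
            batch_event.set()                # release every writer in the batch
```

Under light load every batch has one update, so this degrades to commit-per-write; under heavy load the batch grows to whatever arrived during the previous commit, which is exactly the latency-for-throughput trade described above.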
In a perfect world I'd like to see x-couch-full-commit and _bulk_docs fall into disuse. I realize the latter won't happen because not everyone wants to implement an HTTP connection pool. batch=ok has very different semantics and so would still be useful, although I imagine that most uses of batch=ok are done to maximize throughput, not minimize latency. If the throughput of normal operation were "high enough", batch=ok probably wouldn't be that popular.
Best, Adam