On May 16, 2009, at 8:53 PM, Antony Blakey wrote:
On 17/05/2009, at 12:09 AM, Adam Kocoloski wrote:
So, I think there's still some confusion here. By "open
connections" do you mean TCP connections to the source? That
number is never higher than 10. ibrowse does pipeline requests on
those 10 connections, so there could be as many as 1000
simultaneous HTTP requests. However, those requests complete as
soon as the data reaches the ibrowse client process, so in fact the
number of outstanding requests during replication is usually very
small. We're not doing flow control at the TCP socket layer.
IIUC, given that no attachment bodies are consumed by the
replicator until the documents are checkpointed, it's possible for
the replicator to block if the number of pending attachments in a
checkpoint buffer is greater than the ibrowse concurrent request
limit. In a case like mine, with many attachments on very small
documents, this is very likely. Or am I still confused? :/
There's one key point that you're overlooking. From ibrowse's
perspective, _there is no checkpoint buffer_. ibrowse gets a request
to download an attachment, and it immediately starts that request,
sends the data in 1MB chunks to an attachment receiver process, and
completes the request. In theory, we could have 10,000 attachment
receiver processes holding binaries to be written to disk, and ibrowse
would be none the wiser.
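
To make that decoupling concrete, here is a toy Python model. It is not ibrowse or CouchDB code: the class and function names (ToyHttpClient, AttachmentReceiver, simulate) are made up for illustration, and the downloads run one after another rather than concurrently. The only things taken from the discussion above are the 1MB chunk size, the 1000-request ceiling, and the idea that a request is finished once its data has been handed to a receiver.

```python
# Toy model only: none of these names come from ibrowse or CouchDB.
CHUNK_SIZE = 1024 * 1024  # 1MB chunks, as described above


class AttachmentReceiver:
    """Holds attachment data until the replicator is ready to write it."""

    def __init__(self):
        self.chunks = []

    def receive(self, chunk):
        self.chunks.append(chunk)

    def size(self):
        return sum(len(c) for c in self.chunks)


class ToyHttpClient:
    """Stands in for the HTTP client; tracks requests still in flight."""

    def __init__(self, max_outstanding):
        self.max_outstanding = max_outstanding
        self.outstanding = 0
        self.peak_outstanding = 0

    def download(self, body, receiver):
        if self.outstanding >= self.max_outstanding:
            raise RuntimeError("request limit reached")
        self.outstanding += 1
        self.peak_outstanding = max(self.peak_outstanding, self.outstanding)
        # Hand the body to the receiver chunk by chunk.
        for start in range(0, len(body), CHUNK_SIZE):
            receiver.receive(body[start:start + CHUNK_SIZE])
        # The request is complete as soon as the data has been handed over,
        # whether or not the receiver has written anything to disk.
        self.outstanding -= 1


def simulate(num_attachments, max_outstanding=1000, body_size=10):
    """Download many small attachments without checkpointing any of them."""
    client = ToyHttpClient(max_outstanding)
    receivers = []
    for _ in range(num_attachments):
        receiver = AttachmentReceiver()
        client.download(b"x" * body_size, receiver)
        receivers.append(receiver)  # still buffered, nothing written yet
    return client, receivers
```

Calling simulate(10000) leaves 10,000 receivers holding data while the client's outstanding count is back at zero, so the request limit is never hit no matter how many attachments are waiting on a checkpoint.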
Best, Adam