On May 16, 2009, at 11:22 AM, Antony Blakey wrote:

On 16/05/2009, at 11:07 PM, Adam Kocoloski wrote:

No, I don't believe so. ibrowse accepts a {stream_to, pid()} option. It accumulates packets until it reaches a threshold configurable by {stream_chunk_size, integer()} (default 1MB), then sends the data to the Pid. I don't think ibrowse is writing to disk at any point in the process. We do see that when streaming really large attachments, ibrowse becomes the biggest memory user in the emulator.
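As a rough illustration of the buffering described above, here is a small Python model of a receiver that accumulates packets until a threshold is reached and only then forwards one chunk. This is not ibrowse code and not Erlang: the names `ChunkAccumulator`, `deliver`, and `stream` are hypothetical, the callback merely stands in for sending to the `stream_to` Pid, and the sketch ignores how ibrowse actually splits chunks.

```python
from typing import Callable, Iterable, List

DEFAULT_CHUNK_SIZE = 1024 * 1024  # assumed stand-in for the 1MB default mentioned above


class ChunkAccumulator:
    """Hypothetical model of threshold-based streaming; not the ibrowse implementation."""

    def __init__(self, deliver: Callable[[bytes], None],
                 chunk_size: int = DEFAULT_CHUNK_SIZE) -> None:
        self.deliver = deliver        # plays the role of sending to the stream_to Pid
        self.chunk_size = chunk_size  # plays the role of stream_chunk_size
        self.buffer: List[bytes] = []
        self.buffered = 0

    def on_packet(self, packet: bytes) -> None:
        # Hold packets in memory until the threshold is reached.
        self.buffer.append(packet)
        self.buffered += len(packet)
        if self.buffered >= self.chunk_size:
            self.flush()

    def flush(self) -> None:
        # Forward everything buffered so far as one chunk (also used at end of response).
        if self.buffered:
            self.deliver(b"".join(self.buffer))
            self.buffer, self.buffered = [], 0


def stream(packets: Iterable[bytes], chunk_size: int) -> List[int]:
    """Feed packets through the model and return the sizes of the delivered chunks."""
    sizes: List[int] = []
    acc = ChunkAccumulator(lambda chunk: sizes.append(len(chunk)), chunk_size)
    for packet in packets:
        acc.on_packet(packet)
    acc.flush()
    return sizes
```

In this model nothing is written to disk: data sits in the buffer and then in whatever the callback does with it, so a slow consumer means memory growth rather than disk I/O.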

This is what I thought was happening, which means that with small documents with many attachments (say > 1MB) you could potentially end up with masses of open connections representing data promises that are only forced at checkpoint time, so that's not scalable. I think the number of open ibrowse connections (which I see doesn't necessarily match the number of unforced promises) needs to be an input to the checkpoint decision.

So, I think there's still some confusion here. By "open connections" do you mean TCP connections to the source? That number is never higher than 10. ibrowse does pipeline requests on those 10 connections, so there could be as many as 1000 simultaneous HTTP requests. However, those requests complete as soon as the data reaches the ibrowse client process, so in fact the number of outstanding requests during replication is usually very small. We're not doing flow control at the TCP socket layer.

If by "open connections" you really mean "attachment receiver processes spawned by the couch_rep gen_server" I think you'd be closer to the mark. We can get an approximate handle on that just by counting the number of links to the gen_server.

I'm not sure I understand what part is "not scalable". I agree that ignoring the attachment receivers and their mailboxes when deciding whether to checkpoint is a big problem. I'm testing a fix for that right now. Is there something else you meant by that statement?

Best,

Adam

P.S. One issue in my mind is that we only do the checkpoint test after we receive a document. We could end up in a situation where a document request is sitting in a pipeline behind a huge attachment, and the checkpoint test won't execute until the entire attachment is downloaded into memory. There are ways around this, e.g. using ibrowse:spawn_link_worker_process/2 to bypass the default connection pool for attachment downloads.
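To make the pipelining concern concrete, here is a tiny Python sketch of head-of-line blocking. It relies only on the fact that HTTP/1.1 pipelined responses come back in request order; the function name `bytes_before_complete` and the sizes in the example are hypothetical, and nothing here models ibrowse internals.

```python
from typing import Dict, List, Tuple


def bytes_before_complete(pipeline: List[Tuple[str, int]]) -> Dict[str, int]:
    """pipeline: (name, response_size_in_bytes) pairs in request order on one connection.

    Returns, for each response, the total number of bytes that must arrive on
    that connection before the response is complete.
    """
    done: Dict[str, int] = {}
    total = 0
    for name, size in pipeline:
        total += size  # every earlier response has to be fully received first
        done[name] = total
    return done


# Shared pipeline: the small document is stuck behind the large attachment.
shared = bytes_before_complete([("attachment", 500_000_000), ("doc", 2_000)])

# Separate connection for the attachment: the document no longer waits on it.
dedicated = bytes_before_complete([("doc", 2_000)])
```

With the made-up sizes above, the document on the shared pipeline cannot complete until the whole attachment has been received, whereas on its own connection it completes after only its own bytes.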
