On May 16, 2009, at 11:22 AM, Antony Blakey wrote:
On 16/05/2009, at 11:07 PM, Adam Kocoloski wrote:
No, I don't believe so. ibrowse accepts a {stream_to, pid()}
option. It accumulates packets until it reaches a threshold
configurable by {stream_chunk_size, integer()} (default 1MB), then
sends the data to the Pid. I don't think ibrowse is writing to
disk at any point in the process. We do see that when streaming
really large attachments, ibrowse becomes the biggest memory user
in the emulator.
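For reference, the streaming behaviour described above looks roughly like this (a sketch, not code from CouchDB; fetch_streamed/1 and collect/2 are illustrative names):

```erlang
%% Sketch: stream a response to the calling process using the
%% ibrowse options discussed above. Requires the ibrowse library.
fetch_streamed(Url) ->
    Opts = [{stream_to, self()},
            {stream_chunk_size, 1024*1024}],  % 1MB threshold (the default)
    {ibrowse_req_id, ReqId} = ibrowse:send_req(Url, [], get, [], Opts),
    collect(ReqId, []).

collect(ReqId, Acc) ->
    receive
        {ibrowse_async_headers, ReqId, _Code, _Headers} ->
            collect(ReqId, Acc);
        {ibrowse_async_response, ReqId, Data} ->
            %% each message carries up to stream_chunk_size bytes,
            %% accumulated in ibrowse's memory until the threshold is hit
            collect(ReqId, [Data | Acc]);
        {ibrowse_async_response_end, ReqId} ->
            lists:reverse(Acc)
    end.
```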
This is what I thought was happening, which means that with small
documents that have many attachments (say > 1 MB each) you could
potentially end up with masses of open connections representing data
promises that are only forced at checkpoint time, so that's not
scalable. I think the number of open ibrowse connections (which I see
doesn't necessarily match the number of unforced promises) needs to be
an input to the checkpoint decision.
So, I think there's still some confusion here. By "open connections"
do you mean TCP connections to the source? That number is never
higher than 10. ibrowse does pipeline requests on those 10
connections, so there could be as many as 1000 simultaneous HTTP
requests. However, those requests complete as soon as the data
reaches the ibrowse client process, so in fact the number of
outstanding requests during replication is usually very small. We're
not doing flow control at the TCP socket layer.
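Those limits are per-request ibrowse options; a sketch of what the 10-connections / 1000-requests arithmetic corresponds to (the exact values here are illustrative, not necessarily what couch_rep passes):

```erlang
%% Sketch: ibrowse's connection-pool limits are ordinary send_req options.
%% 10 TCP connections x 100 pipelined requests each = up to 1000
%% simultaneous HTTP requests to the source.
Opts = [{max_sessions, 10},         % TCP connections to this host:port
        {max_pipeline_size, 100}],  % pipelined requests per connection
ibrowse:send_req("http://source:5984/db/doc", [], get, [], Opts).
```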
If by "open connections" you really mean "attachment receiver
processes spawned by the couch_rep gen_server" I think you'd be closer
to the mark. We can get an approximate handle on that just by
counting the number of links to the gen_server.
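That count is cheap to get with process_info/2 (a sketch; RepPid stands for the couch_rep gen_server's pid):

```erlang
%% Sketch: approximate the number of attachment receiver processes
%% by counting the processes linked to the replication gen_server.
count_links(RepPid) ->
    {links, Links} = erlang:process_info(RepPid, links),
    length(Links).
```

Note this is approximate because the link set also includes non-receiver processes (e.g. the supervisor).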
I'm not sure I understand what part is "not scalable". I agree that
ignoring the attachment receivers and their mailboxes when deciding
whether to checkpoint is a big problem. I'm testing a fix for that
right now. Is there something else you meant by that statement? Best,
Adam
P.S. One issue in my mind is that we only do the checkpoint test after
we receive a document. We could end up in a situation where a
document request is sitting in a pipeline behind a huge attachment,
and the checkpoint test won't execute until the entire attachment is
downloaded into memory. There are ways around this, e.g. using
ibrowse:spawn_link_worker_process/2 to bypass the default connection
pool for attachment downloads.
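A sketch of that workaround (illustrative, not the actual fix; send_req_direct/6 targets a specific worker instead of the shared pool):

```erlang
%% Sketch: give an attachment download its own connection process,
%% bypassing the default pool, so document requests are never queued
%% in a pipeline behind a huge attachment.
fetch_outside_pool(Host, Port, Url) ->
    {ok, Conn} = ibrowse:spawn_link_worker_process(Host, Port),
    %% the worker is linked to the caller, so it dies with us
    ibrowse:send_req_direct(Conn, Url, [], get, [],
                            [{stream_to, self()}]).
```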