Hi Damien, I'm not sure I follow. My worry was that, if I built a
replicator which only queried _changes to get the list of updates, I'd
have to be prepared to process a very large response. I thought one
smart way to process this response was to throttle the download at the
TCP level by putting the socket into passive mode.
I agree that the HTTP client seems to be at fault, because the option
that it exposes to switch to passive mode seems to be a no-op. What
exactly did you mean by "streams the data while not buffering the
data"?
Best,
Adam
On Jun 12, 2009, at 8:03 AM, Damien Katz wrote:
I don't think this is TCP's fault; it's the HTTP client. We need an
HTTP client that streams data while not buffering the data (low
level TCP already buffers some), instead of sending all the data
that comes in to the waiting process, essentially buffering
everything.
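As a rough illustration of the difference (a Python sketch, not Erlang and not the inets client; `bounded_pipeline` and `max_buffered` are made-up names), a reader that hands chunks to the consumer through a small bounded queue will block as soon as the consumer falls behind, instead of piling everything up in the consumer's mailbox. With a real socket, a blocked reader stops draining the kernel buffer, and TCP's own flow control then slows the sender.

```python
import queue
import threading


def bounded_pipeline(chunks, max_buffered=4):
    """Feed chunks to a consumer through a queue holding at most max_buffered items.

    Returns (chunks processed, largest queue size the consumer observed).
    """
    buf = queue.Queue(maxsize=max_buffered)
    done = object()  # sentinel marking end of stream

    def reader():
        for chunk in chunks:
            buf.put(chunk)  # blocks while the queue is full (backpressure)
        buf.put(done)

    worker = threading.Thread(target=reader, daemon=True)
    worker.start()

    processed = 0
    high_water = 0
    while True:
        high_water = max(high_water, buf.qsize())
        item = buf.get()
        if item is done:
            break
        processed += 1  # stand-in for real per-chunk work

    worker.join()
    return processed, high_water
```

The point of the sketch is only that the amount of buffered data is capped by `max_buffered` no matter how large the response is.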
-Damien
On Jun 11, 2009, at 4:14 PM, Adam Kocoloski wrote:
I had some time to work on a replicator that queries _changes
instead of _all_docs_by_seq today. The first question that came to
my mind was how to put a spigot on the firehose. If I call
_changes without a "since" query-string parameter on a 10M-document
DB, I'm going to get 10M chunks of output back.
I thought I might be able to control the flow at the TCP socket
level using the inets HTTP client's {stream,{self,once}} option. I
still think this would be an elegant option if I can get it to
work, but my early tests show that all the chunks still show up
immediately in the calling process regardless of whether I stream
to self or {self,once}.
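To make the behaviour I was hoping for concrete, here is a toy Python model (not the inets API; `OnceStream`, `stream_next`, and `changes_rows` are hypothetical names invented for this sketch). The client delivers exactly one chunk to the caller's mailbox each time the caller asks, so the mailbox never holds more than one chunk regardless of how many rows the feed contains.

```python
from collections import deque


class OnceStream:
    """Toy model of a client that delivers one chunk per explicit request."""

    def __init__(self, chunks):
        self._source = iter(chunks)
        self.mailbox = deque()   # what the calling process would see
        self.max_mailbox = 0     # largest mailbox size observed

    def stream_next(self):
        """Deliver one more chunk to the mailbox; return False at end of stream."""
        chunk = next(self._source, None)
        if chunk is None:
            return False
        self.mailbox.append(chunk)
        self.max_mailbox = max(self.max_mailbox, len(self.mailbox))
        return True


def changes_rows(n):
    """Generate n fake change rows lazily, standing in for a large feed."""
    for seq in range(1, n + 1):
        yield b'{"seq":%d}\n' % seq


def consume(stream):
    """Process the stream one chunk at a time, asking for the next only when ready."""
    processed = 0
    while stream.stream_next():
        stream.mailbox.popleft()  # stand-in for replicating one row
        processed += 1
    return processed
```

What my tests show instead corresponds to every chunk landing in the mailbox before `consume` has asked for any of them.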
All for now,
Adam