On 16/05/2009, at 8:27 AM, Chris Anderson wrote:
Thanks for reporting this. I'm not sure I can see the issue in the
last logfile you posted (I haven't gone through the diffs to see where
you added log statements...). It seems that the attachment size is not
the issue; it's the fact that there are many, many attachments on each
doc. This means it should be fairly easy to make a reproducible
JavaScript test case that causes a never-finishing replication. Once
we have that, I'd be happy to run it and bang on the code till I get
it to pass.
I've created a test case with many documents, but it doesn't cause the
problem, so it must be somewhat more subtle than it looks.
Specifically, it may have something to do with the replication state
up to that point.
I think the big problem is the architecture where attachments aren't
started streaming until the doc itself is written to disk. There's no
reason it should have to be this way, as we could set up a queue of
attachments (and docs that are waiting on them) and make its width
configurable, beginning the attachment transfer right away. I've
written code like this a few times, and it should be totally doable in
this context.
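
As a rough illustration of the queue idea described here, the following
Python sketch starts attachment transfers right away, keeps at most
`width` of them in flight, and writes each doc only once its own
attachments have arrived. It is not the replicator's actual code (which
is Erlang); `replicate_with_attachment_queue`, `fetch_attachment`,
`write_doc` and the `attachments` key are all hypothetical names chosen
for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def replicate_with_attachment_queue(docs, fetch_attachment, write_doc, width=4):
    """Fetch attachments with at most `width` transfers in flight.

    `fetch_attachment(doc_id, name)` and `write_doc(doc, bodies)` are
    hypothetical callbacks supplied by the caller.
    """
    with ThreadPoolExecutor(max_workers=width) as pool:
        # Queue every transfer up front; the worker count is the queue width.
        pending = [
            (doc, [pool.submit(fetch_attachment, doc["_id"], name)
                   for name in doc.get("attachments", [])])
            for doc in docs
        ]
        # Walk the docs in source order so writes are never reordered.
        for doc, futures in pending:
            bodies = [f.result() for f in futures]  # wait for this doc only
            write_doc(doc, bodies)
```

Note that this version holds every fetched body in memory until its doc
is written, so a real implementation would probably need to bound that
as well.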
That's what I was thinking, although I think it's a considerable
rewrite from what is currently there, and a significant issue is
avoiding out-of-order writes. A better option might be to trigger
checkpoints on the basis of the number of outstanding promises, in
combination with buffering attachment downloads to disk.
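
To make that alternative concrete, here is a minimal Python sketch under
some loud assumptions: each attachment download is buffered to a temp
file, docs are held in arrival order, and a checkpoint is forced once
the number of outstanding (downloaded but not yet written) attachments
reaches a threshold. `CheckpointingBuffer`, `write_doc`,
`record_checkpoint` and `max_outstanding` are hypothetical names, not
anything in the existing replicator.

```python
import os
import tempfile


class CheckpointingBuffer:
    """Buffer docs in order; checkpoint when too many attachments are outstanding."""

    def __init__(self, write_doc, record_checkpoint, max_outstanding=100):
        self.write_doc = write_doc                  # hypothetical: write_doc(doc, paths)
        self.record_checkpoint = record_checkpoint  # hypothetical: record_checkpoint(seq)
        self.max_outstanding = max_outstanding
        self.buffered = []    # (seq, doc, [temp file paths]) in arrival order
        self.outstanding = 0  # attachments on disk that are not yet written

    def add(self, seq, doc, attachment_streams):
        paths = []
        for chunks in attachment_streams:
            # Stream each download to disk instead of holding it in memory.
            fd, path = tempfile.mkstemp(prefix="att-")
            with os.fdopen(fd, "wb") as f:
                for chunk in chunks:
                    f.write(chunk)
            paths.append(path)
        self.buffered.append((seq, doc, paths))
        self.outstanding += len(paths)
        if self.outstanding >= self.max_outstanding:
            self.checkpoint()

    def checkpoint(self):
        if not self.buffered:
            return
        # Write in arrival order, which avoids out-of-order writes.
        for seq, doc, paths in self.buffered:
            self.write_doc(doc, paths)
            for path in paths:
                os.remove(path)
        self.record_checkpoint(self.buffered[-1][0])
        self.buffered = []
        self.outstanding = 0
```

The threshold counts attachments rather than docs, so a single doc with
many attachments can trigger a checkpoint on its own.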
Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787