On 16/05/2009, at 9:15 AM, Antony Blakey wrote:
On 16/05/2009, at 8:27 AM, Chris Anderson wrote:
Thanks for reporting this. I'm not sure I can see the issue in the
last logfile you posted (I haven't gone through the diffs to see where
you added log statements...). It seems that the attachment size is not
an issue; it's the fact that there are many, many attachments on each
doc. This means it should be fairly easy to make a reproducible
JavaScript test case that causes a never-finishing replication. Once
we have that, I'd be happy to run it and bang on the code till I get
it to pass.
I've created a test case with many documents, but it doesn't cause
the problem, so it must be somewhat more subtle than it looks.
Specifically, it may have something to do with the replication state
to that point.
To deal with the problem of outstanding promises I set
couch_util:should_flush(1) - that's a separate issue. The bug that
causes my replication to hang seems to be in ibrowse. The problem is
that ibrowse is returning 1 more byte of data than it should, and so
the following code in couch_db is failing because the case where
LenLeft - size(Bin) < 0 isn't being caught. This blocks replication.
When I wget the offending resource I get the correct length. The
problem is with the second attachment (Perceive.png) in
http://gist.github.com/112074
write_streamed_attachment(_Stream, _F, 0, SpAcc) ->
    {ok, SpAcc};
write_streamed_attachment(Stream, F, LenLeft, nil) ->
    Bin = F(),
    {ok, StreamPointer} = couch_stream:write(Stream, Bin),
    write_streamed_attachment(Stream, F, LenLeft - size(Bin), StreamPointer);
write_streamed_attachment(Stream, F, LenLeft, SpAcc) ->
    Bin = F(),
    {ok, _} = couch_stream:write(Stream, Bin),
    write_streamed_attachment(Stream, F, LenLeft - size(Bin), SpAcc).
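To illustrate why a single extra byte matters, here is a minimal sketch that only mimics the countdown performed by the clauses above. The module and function names (overshoot_sketch, remaining/2) are hypothetical and not part of couch_db, and a list of binaries stands in for successive calls to F(). It assumes, as described above, that the real code would simply keep asking for more data once the remainder has gone past zero.

```erlang
-module(overshoot_sketch).
-export([remaining/2]).

%% Hypothetical illustration only, not CouchDB code.
%% Subtract each chunk's size from the expected length and stop
%% only when the remainder is exactly 0, as the original clauses do.
remaining(_Chunks, 0) ->
    done;
remaining([], LenLeft) ->
    %% No chunks left but the remainder never hit 0. In the real code
    %% this is where F() would be called again.
    {stuck, LenLeft};
remaining([Bin | Rest], LenLeft) ->
    remaining(Rest, LenLeft - size(Bin)).
```

With chunks totalling exactly 4 bytes, remaining([<<1,2>>, <<3,4>>], 4) returns done. With one byte too many, remaining([<<1,2>>, <<3,4,5>>], 4) returns {stuck, -1}: the remainder steps from 2 to -1 and the 0 clause never matches.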
To enable replication to continue, a temporary fix is to replace the
first case with this:
write_streamed_attachment(_Stream, _F, LenLeft, SpAcc) when 1 > LenLeft ->
    {ok, SpAcc};
although maybe a better option is to *add* this case:
write_streamed_attachment(_Stream, _F, LenLeft, SpAcc) when 0 > LenLeft ->
    ?LOG_ERROR("write_streamed_attachment has written too much data", []),
    {ok, SpAcc};
and truncate the binary to the expected length. I'm not familiar with
ibrowse in terms of debugging this problem further.
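As a rough sketch of the truncation idea, the helper below clamps a chunk to the number of bytes still expected before it would be written. The module and function names (truncate_sketch, clamp/2) are hypothetical and do not exist in couch_db; this is one possible shape under the assumption that LenLeft is a non-negative integer, not a tested patch.

```erlang
-module(truncate_sketch).
-export([clamp/2]).

%% Hypothetical helper, not part of couch_db.
%% Return at most LenLeft bytes of Bin, dropping any excess bytes
%% from the end of the chunk.
clamp(Bin, LenLeft)
  when is_binary(Bin), is_integer(LenLeft), LenLeft >= 0,
       size(Bin) > LenLeft ->
    <<Keep:LenLeft/binary, _Extra/binary>> = Bin,
    Keep;
clamp(Bin, LenLeft)
  when is_binary(Bin), is_integer(LenLeft), LenLeft >= 0 ->
    %% Chunk already fits within the expected length.
    Bin.
```

For example, clamp(<<1,2,3,4>>, 3) gives <<1,2,3>>, while clamp(<<1,2>>, 3) returns the chunk unchanged. Applying something like this to the result of F() would keep LenLeft - size(Bin) from going negative, at the cost of silently discarding the extra byte, so logging the overshoot as well is probably still worthwhile.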
Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd