Hi Antony, thank you! I encountered this "one more byte" problem once before, but it wasn't 100% reproducible, so I wasn't really comfortable checking in a workaround. I've basically been waiting to see if it would ever show up for anyone else :-/

I think we should commit this change, but I'd still like to confirm that the attachment on the target is not corrupted by the chunk processing issue (i.e. the last chunk starts with a \r or something like that). Or even better, fix the chunk processing issue.

Now, on to the checkpointing conditions. I think there's some confusion about the attachment workflow. Attachments are downloaded _immediately_ and in their entirety by ibrowse, which then sends the data as 1MB binary chunks to the attachment receiver processes. The data sits in these processes' mailboxes until the next checkpoint. The flow control occurs entirely in Couch, not in ibrowse or the TCP layer. We shouldn't end up with too many open connections this way -- but if we do, we can tweak the max_connections and max_pipeline_size ibrowse parameters to throttle it.

You all appear to be right, the pending attachment data are not considered when deciding when to checkpoint. That's a major bug and a regression in my opinion. My bad.

In another thread Matt Goodall suggested checkpointing after a certain amount of time has passed. So we'd have a checkpointing algorithm that considers:

* memory utilization
* number of pending writes
* time elapsed
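
Any of those crossing its threshold would trigger a checkpoint. A rough sketch of the predicate (the record fields and threshold parameters here are hypothetical, not existing CouchDB options):

    should_checkpoint(#rep_state{mem_used = Mem,
                                 pending_writes = Pending,
                                 last_checkpoint = Last},
                      MaxMem, MaxPending, MaxIntervalMs) ->
        %% timer:now_diff/2 returns microseconds between two now() timestamps
        ElapsedMs = timer:now_diff(now(), Last) div 1000,
        (Mem >= MaxMem) orelse
        (Pending >= MaxPending) orelse
        (ElapsedMs >= MaxIntervalMs).
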

Anything else we ought to take as an input? I've got some time to hack on this today.

Adam

On May 16, 2009, at 5:08 AM, Antony Blakey wrote:


On 16/05/2009, at 12:59 PM, Antony Blakey wrote:

and truncate the binary to the expected length. I'm not familiar with ibrowse in terms of debugging this problem further.

The final mod I've ended up with is this, which deals with the ibrowse problem:

------------------------------------------------------------------------------

write_streamed_attachment(_Stream, _F, 0, SpAcc) ->
    {ok, SpAcc};
write_streamed_attachment(Stream, F, LenLeft, nil) ->
    Bin = F(),
    TruncatedBin = check_bin_length(LenLeft, Bin),
    {ok, StreamPointer} = couch_stream:write(Stream, TruncatedBin),
    write_streamed_attachment(Stream, F, LenLeft - size(TruncatedBin), StreamPointer);
write_streamed_attachment(Stream, F, LenLeft, SpAcc) ->
    Bin = F(),
    TruncatedBin = check_bin_length(LenLeft, Bin),
    {ok, _} = couch_stream:write(Stream, TruncatedBin),
    write_streamed_attachment(Stream, F, LenLeft - size(TruncatedBin), SpAcc).

%% If ibrowse hands us more data than the attachment should contain,
%% log the excess and keep only the expected prefix.
check_bin_length(LenLeft, Bin) when size(Bin) > LenLeft ->
    <<ValidData:LenLeft/binary, Crap/binary>> = Bin,
    ?LOG_ERROR("write_streamed_attachment has written too much, expected: ~p got: ~p tail: ~p",
               [LenLeft, size(Bin), Crap]),
    ValidData;
check_bin_length(_, Bin) -> Bin.

------------------------------------------------------------------------------

Interestingly, the problems occur at exactly the same points during replication, and in each case the excess tail is <<"\r">>, which suggests to me a boundary condition in processing a chunked response. It's probably not a problem creating the response, because direct access using wget returns the right amount of data.

My replication still fails near the end, this time silently killing couchdb, but it's getting closer.

Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Always have a vision. Why spend your life making other people’s dreams?
-- Orson Welles (1915-1985)

