Re: Replication of attachment is extremely slow

Paul Davis Fri, 24 Jan 2014 10:27:06 -0800

If you can reproduce this, the first thing I'd look at during a slow
replication is "sudo netstat -tanp tcp" to see if you're maybe bumping
up against open socket limits.
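
For anyone who wants to summarise that netstat output rather than read it by eye, here is a minimal sketch in Python. It assumes the usual six-column TCP rows (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State); the function name count_tcp_states is made up for illustration and is not part of any tool mentioned in this thread.

```python
from collections import Counter


def count_tcp_states(netstat_output: str) -> Counter:
    """Count TCP sockets per state in text captured from netstat.

    Assumption: each TCP row has at least six whitespace-separated
    columns and the sixth one is the connection state.
    """
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers and non-TCP rows; TCP rows start with tcp, tcp4 or tcp6.
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            states[fields[5]] += 1
    return states
```

A large and growing count in a single state (for example TIME_WAIT or ESTABLISHED) during a slow replication would be a hint worth following up. What the actual limit is depends on the system configuration.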


On Fri, Jan 24, 2014 at 7:40 AM, Scott Weber <[email protected]> wrote:
> I appreciate the digging, but in the case of the test file we were using, it 
> is some text that doesn't have dashes or newlines, mixed with image data 
> which are big binary blobs.
>
> So strings that look like mime boundaries aren't likely to be present.
>
> -Scott
>
>
>
>
> ----- Original Message -----
> From: Nick North <[email protected]>
> To: "[email protected]" <[email protected]>; 
> [email protected]
> Cc:
> Sent: Friday, January 24, 2014 9:28 AM
> Subject: Re: Replication of attachment is extremely slow
>
> On 24 January 2014 15:01, Jens Alfke <[email protected]> wrote:
>
>>
>> On Jan 24, 2014, at 5:06 AM, Nick North <[email protected]> wrote:
>>
>> > I'm not really expecting this problem to be the cause of the slowdown:
>> > the attachment needs to contain a lot of initial prefixes of the MIME
>> > boundary string for things to be really bad.
>>
>> This is on the reading side, where the MIME parser is looking for the
>> boundary string that signals the end of the attachment part?
>> But the boundary string has to appear after a CRLF, so the actual sequence
>> to search for starts with "\r\n--". I'd expect the slowdown to happen only
>> if the data contains a lot of those sequences, not just any old hyphens.
>>
>> (Also, that search is really slow enough to be noticeable?! Doesn't Erlang
>> have a native string-search primitive?)
>>
>> —Jens
>>
>> PS: Maybe we should move this thread to the new replication mailing list :)
>
>
> Copied to the replication list (though not with all the preceding posts
> included, with their top and bottom posting).
>
> I don't have the code in front of me, but what you say about the search
> string sounds right, so apologies for the error. However, that makes things
> worse: the current code searches each 4KB block of the attachment for any
> initial prefix of the boundary sequence. If it finds a prefix, but not the
> whole string, it passes the block up to that point through, and starts
> searching again from about the place where the prefix was found, on the
> remainder of the original block, plus the next 4KB appended to the end. So,
> if the boundary sequence begins with "\r", then every occurrence of "\r"
> will slow it down, by causing boundary sequence searching to start again
> from where it occurs, with a larger piece of attachment to search. "\r" is
> probably more common than "-", making the problem more likely to pop up.
>
> Nick
>
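
To make the rescanning concern above concrete, here is a minimal Python sketch of one way to search a chunked stream for a boundary without letting the buffer grow. It is not CouchDB's parser and not a proposed patch. The function name, the example boundary, and the chunking are assumptions for illustration only. The idea is to use a native substring search on each chunk and to hold back only the last len(boundary) - 1 bytes, since that is the longest boundary prefix that can straddle a chunk edge.

```python
from typing import Iterable, Iterator

# Hypothetical boundary, including the leading CRLF and "--" discussed above.
EXAMPLE_BOUNDARY = b"\r\n--example-boundary"


def body_before_boundary(chunks: Iterable[bytes], boundary: bytes) -> Iterator[bytes]:
    """Yield the bytes that precede the first occurrence of boundary."""
    if not boundary:
        raise ValueError("boundary must not be empty")
    keep = len(boundary) - 1  # longest prefix that can span two chunks
    tail = b""
    for chunk in chunks:
        buf = tail + chunk
        pos = buf.find(boundary)  # native substring search
        if pos != -1:
            if pos:
                yield buf[:pos]
            return
        # No full match yet: all but the last `keep` bytes can be passed on.
        cut = max(len(buf) - keep, 0)
        if cut:
            yield buf[:cut]
        tail = buf[cut:]
    # Stream ended without a boundary: flush whatever was held back.
    if tail:
        yield tail
```

With this shape a stray "\r" in the attachment costs nothing extra: the buffer searched on each step is at most one chunk plus len(boundary) - 1 bytes, however many partial prefixes the data contains.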
