On Thu, Jun 17, 2004 at 08:47:57PM -0700, Craig Barratt wrote: > > > But, the comment seems to have been right on. I have re-run the > > experiment with block sizes as small as 3000 (yes it took a long > > time to complete) all the way up to block sizes of 100000 with it > > working in reasonable times. But, when the block size approaches > > 170,000 or so, the performance degrades exponentially. > > > > I understand that I am testing at the very fringes of what we should > > expect rsync to do. File sizes of 25Gig and 55Gig are beyond what was > > originally envisioned (based on 64k hash buckets and a sliding window > > of 256k). > > Here's a patch to try. It basically ensures that the window is > at least 16 times the block size. Before I'd endorse this patch > for CVS we need to make sure there aren't cases where map_ptr is > called with a much bigger length, making the 16x a bit excessive. > > Perhaps I would be tempted to repeat the previous check that the > window start plus the window size doesn't exceed the file length, > although it must be at least offset + len - window_start as in > the original code.
hehe, I'm catching up to you guys. I'm kinda late. That's what I get for letting my email back up for >1 month. :) > > In any case, I'd be curious if this fixes the problem. > > Craig > > --- rsync-2.6.2/fileio.c Sun Jan 4 19:57:15 2004 > +++ ../rsync-2.6.2/fileio.c Thu Jun 17 19:33:26 2004 > @@ -193,8 +193,8 @@ > if (window_start + window_size > map->file_size) { > window_size = map->file_size - window_start; > } > - if (offset + len > window_start + window_size) { > - window_size = (offset+len) - window_start; > + if (offset + 16 * len > window_start + window_size) { > + window_size = (offset + 16 * len) - window_start; > } My initial reaction (having not actually read the code) is that it would be desirable make the window_size highly composite, and then ensure that the block size is an integer factor of the window_size. In other words, avoid the memmoves altogether. {/me thinks a bit} Actually, I don't think this removes the pathological case at all. It just reduces the frequency of the impact by a factor of 16. Consider when len = window_size/16 - 1. We'll still end up with a memmove of size len-16, which is expensive when len is large, despite window_size being 16 times larger. Besides making window_size mod len = 0, another solution could be to use a circular buffer. This is a pretty nasty bug you guys found. I hope we can fix it soon. -chris > > /* make sure we have allocated enough memory for the window */ > -- > To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync > Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html -- To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html