Re: [Bug 1463] New: poor performance with large block size
On Sat, Jul 17, 2004 at 03:54:31AM -0700, Wayne Davison wrote:
> On Fri, Jul 16, 2004 at 08:20:51PM -0400, Chris Shoemaker wrote:
> > On Thu, Jul 15, 2004 at 07:06:28PM -0700, Wayne Davison wrote:
> > > +	max_map_size = MIN(MAX_MAP_SIZE, blength * 32);
> >
> > This makes max_map_size a multiple (32) of blength
> > for a large range (blength*32 < MAX_MAP_SIZE),
>
> Oops, that was supposed to be MAX(), not MIN(), otherwise it doesn't
> help the problem of too-many memory moves for large block sizes.  I'll

Sure it does (did) - in the sense that it capped the map sizes and
therefore the memmoves.  I don't think MAX is better, because it forces
large reads even for small block sizes.  That might be ok for the case
where you will walk the whole file, but otherwise it's making a single
map_ptr() pretty expensive.

> go ahead and check that change in for now and look forward to your
> findings on improving the code.
>
> > ISTM, that the only reason to have the slight lag in window
> > advancement is you expected to frequently service requests where the
> > offset was decreasing just a little.  I didn't see that happening
> > anywhere.  Did I miss something?
>
> I think the only place where the calls might not advance are in the
> receiver where the map_ptr() calls can be to any block-size-multiple

Agreed, but the partial-stride window advancement in map_ptr only helps
if your subsequent calls are _only_a_little_bit_ behind the current
cursor.  I haven't wrapped my mind around hash_search() yet, so I can't
say for sure, but nothing jumps out as producing that type of access
pattern.

> in the basis file.  It seems strange to me to force a 256K read for
> every basis-file match, so maybe this is something else to look into
> optimizing.

Indeed.

	-chris

> ..wayne..

--
To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
Re: [Bug 1463] New: poor performance with large block size
On Fri, Jul 16, 2004 at 08:20:51PM -0400, Chris Shoemaker wrote:
> On Thu, Jul 15, 2004 at 07:06:28PM -0700, Wayne Davison wrote:
> > +	max_map_size = MIN(MAX_MAP_SIZE, blength * 32);
>
> This makes max_map_size a multiple (32) of blength
> for a large range (blength*32 < MAX_MAP_SIZE),

Oops, that was supposed to be MAX(), not MIN(), otherwise it doesn't
help the problem of too-many memory moves for large block sizes.  I'll
go ahead and check that change in for now and look forward to your
findings on improving the code.

> ISTM, that the only reason to have the slight lag in window
> advancement is you expected to frequently service requests where the
> offset was decreasing just a little.  I didn't see that happening
> anywhere.  Did I miss something?

I think the only place where the calls might not advance are in the
receiver where the map_ptr() calls can be to any block-size-multiple
in the basis file.  It seems strange to me to force a 256K read for
every basis-file match, so maybe this is something else to look into
optimizing.

..wayne..
Re: [Bug 1463] New: poor performance with large block size
On Thu, Jul 15, 2004 at 07:06:28PM -0700, Wayne Davison wrote:
> On Wed, Jul 14, 2004 at 06:27:45PM -0400, Chris Shoemaker wrote:
> > My initial reaction (having not actually read the code) is that it
> > would be desirable to make the window_size highly composite, and
> > then ensure that the block size is an integer factor of the
> > window_size.  In other words, avoid the memmoves altogether.
>
> I don't believe that this is possible since the offset to map_ptr() is
> often incremented by a single byte.

Ah, well, I did say I hadn't actually read the code. :)  (Now I have.)

Users of map_ptr():

  file_checksum() walks with non-overlapping stride of CSUM_CHUNK.
  generate_and_send_sums() walks with non-overlapping stride of blength.
  simple_send_token() walks with non-overlapping stride of CHUNK_SIZE.
  send_deflated_token() walks with non-overlapping stride of CHUNK_SIZE
    (I think).
  matched() walks with non-overlapping stride of CHUNK_SIZE.
  hash_search() walks a window of size blength (I think) with
    overlapping steps of 1.

You must be talking about hash_search().  I hadn't considered that type
of use, only the non-overlapping, constant-stride use.  I agree that
playing with window sizes can't prevent hash_search() from triggering
the memmove in map_ptr().  Nevertheless, it may help to optimize the
window sizes for the other five uses.  CSUM_CHUNK == 64, so any
left-over during the memmove case is probably inconsequential.
CHUNK_SIZE == 32k, so that makes me think that there's a real benefit
to making the window size a multiple of 32k.

> I'll attach my version of Craig's fix that increases the MAX_MAP_SIZE
> value for large block sizes.  It also caps the block size that can be
> requested or computed to 256KB (which would make the map buffer in my
> patched version max out at 8MB).
>
> ..wayne..
> --- fileio.c	16 Jul 2004 01:32:02 -	1.13
> +++ fileio.c	16 Jul 2004 01:58:34 -
> @@ -24,6 +24,8 @@
>
>  extern int sparse_files;
>
> +unsigned int max_map_size = MAX_MAP_SIZE;
> +
>  static char last_byte;
>  static int last_sparse;
>
> @@ -186,7 +188,7 @@ char *map_ptr(struct map_struct *map,OFF
>  	} else {
>  		window_start = 0;
>  	}
> -	window_size = MAX_MAP_SIZE;
> +	window_size = max_map_size;
>  	if (window_start + window_size > map->file_size) {
>  		window_size = map->file_size - window_start;
>  	}
> --- generator.c	15 Jul 2004 02:20:08 -	1.97
> +++ generator.c	16 Jul 2004 01:58:34 -
> @@ -52,6 +52,7 @@ extern int only_existing;
>  extern int orig_umask;
>  extern int safe_symlinks;
>  extern unsigned int block_size;
> +extern unsigned int max_map_size;
>
>  extern struct exclude_list_struct server_exclude_list;
>
> @@ -162,7 +163,9 @@ static void sum_sizes_sqroot(struct sum_
>  			c >>= 1;
>  		} while (c >= 8);	/* round to multiple of 8 */
>  		blength = MAX(blength, BLOCK_SIZE);
> +		blength = MIN(blength, MAX_MAP_SIZE);

When we would have blength > MAX_MAP_SIZE, this makes max_map_size a
multiple (1) of blength.  Good.

>  	}
> +	max_map_size = MIN(MAX_MAP_SIZE, blength * 32);

This makes max_map_size a multiple (32) of blength for a large range
(blength*32 < MAX_MAP_SIZE).  For those two cases, this is probably
optimal.  But what is max_map_size when MAX_MAP_SIZE/32 < blength <
MAX_MAP_SIZE?  Ans: max_map_size = MAX_MAP_SIZE, and this is still
problematic when e.g. blength = MAX_MAP_SIZE/2 + 1, resulting in
memmoves of size MAX_MAP_SIZE/2 - 1 in the common, non-overlapping
stride case.  (It is, however, problematic over a smaller range of
blength than the previous code was.)

So, while this certainly does a lot to improve the worst-case
performance, I think it leaves a large range of conditions for which
it is sub-optimal.  There is a bit of a problem of
too-many-free-variables here, because CHUNK_SIZE == 32k, and that's
not changing with blength.
Since 32k has no factor other than 2, if blength turns out to be odd
(doesn't really happen; blength is rounded to 3 bits, I think), then
no max_map_size less than blength*32k can avoid the memmove.
Obviously you would have to balance the desire not to need so much
memory with the desire to avoid the memmove.

So, you _could_ remove the extra free variable by making CHUNK_SIZE
depend on (be a multiple of) blength, just like max_map_size.  I don't
know what the full impact of that change would be.  At the other
extreme, you could constrain blength to be a power of two.  That makes
it a cofactor of CHUNK_SIZE, and max_map_size is optimal (and of
reasonable size) when it is the greater of the two.  In general,
max_map_size can't be optimal for reads both of size CHUNK_SIZE and
size blength, unless one is a factor of the other.

BTW, I think "max_map_size" is a rather misleading name, since the map
size is never less than this (except maybe at the end of the file),
and it can become greater than this.
Re: [Bug 1463] New: poor performance with large block size
On Wed, Jul 14, 2004 at 06:27:45PM -0400, Chris Shoemaker wrote:
> My initial reaction (having not actually read the code) is that it
> would be desirable to make the window_size highly composite, and then
> ensure that the block size is an integer factor of the window_size.
> In other words, avoid the memmoves altogether.

I don't believe that this is possible since the offset to map_ptr() is
often incremented by a single byte.

> Besides making window_size mod len = 0, another solution could be to
> use a circular buffer.

The callers of map_ptr() currently expect to see the entire requested
range in a single buffer, so that would be a pretty big change for
some of the callers.  We can contemplate this, though.

I'll attach my version of Craig's fix that increases the MAX_MAP_SIZE
value for large block sizes.  It also caps the block size that can be
requested or computed to 256KB (which would make the map buffer in my
patched version max out at 8MB).

..wayne..

--- fileio.c	16 Jul 2004 01:32:02 -	1.13
+++ fileio.c	16 Jul 2004 01:58:34 -
@@ -24,6 +24,8 @@

 extern int sparse_files;

+unsigned int max_map_size = MAX_MAP_SIZE;
+
 static char last_byte;
 static int last_sparse;

@@ -186,7 +188,7 @@ char *map_ptr(struct map_struct *map,OFF
 	} else {
 		window_start = 0;
 	}
-	window_size = MAX_MAP_SIZE;
+	window_size = max_map_size;
 	if (window_start + window_size > map->file_size) {
 		window_size = map->file_size - window_start;
 	}
--- generator.c	15 Jul 2004 02:20:08 -	1.97
+++ generator.c	16 Jul 2004 01:58:34 -
@@ -52,6 +52,7 @@ extern int only_existing;
 extern int orig_umask;
 extern int safe_symlinks;
 extern unsigned int block_size;
+extern unsigned int max_map_size;

 extern struct exclude_list_struct server_exclude_list;

@@ -162,7 +163,9 @@ static void sum_sizes_sqroot(struct sum_
 			c >>= 1;
 		} while (c >= 8);	/* round to multiple of 8 */
 		blength = MAX(blength, BLOCK_SIZE);
+		blength = MIN(blength, MAX_MAP_SIZE);
 	}
+	max_map_size = MIN(MAX_MAP_SIZE, blength * 32);
 	if (protocol_version < 27) {
 		s2length = csum_length;
--- options.c	15 Jul 2004 19:06:32 -	1.160
+++ options.c	16 Jul 2004 01:58:34 -
@@ -635,6 +635,12 @@ int parse_arguments(int *argc, const cha
 	}
 #endif

+	if (block_size > MAX_MAP_SIZE) {
+		rprintf(FINFO, "limiting block-size to %d bytes\n",
+			MAX_MAP_SIZE);
+		block_size = MAX_MAP_SIZE;
+	}
+
 	if (write_batch && read_batch) {
 		rprintf(FERROR,
 			"write-batch and read-batch can not be used together\n");
Re: [Bug 1463] New: poor performance with large block size
On Thu, Jun 17, 2004 at 08:47:57PM -0700, Craig Barratt wrote:
> > But, the comment seems to have been right on.  I have re-run the
> > experiment with block sizes as small as 3000 (yes it took a long
> > time to complete) all the way up to block sizes of 10 with it
> > working in reasonable times.  But, when the block size approaches
> > 170,000 or so, the performance degrades exponentially.
> >
> > I understand that I am testing at the very fringes of what we
> > should expect rsync to do.  File sizes of 25Gig and 55Gig are
> > beyond what was originally envisioned (based on 64k hash buckets
> > and a sliding window of 256k).
>
> Here's a patch to try.  It basically ensures that the window is
> at least 16 times the block size.  Before I'd endorse this patch
> for CVS we need to make sure there aren't cases where map_ptr is
> called with a much bigger length, making the 16x a bit excessive.
>
> Perhaps I would be tempted to repeat the previous check that the
> window start plus the window size doesn't exceed the file length,
> although it must be at least offset + len - window_start as in
> the original code.

hehe, I'm catching up to you guys.  I'm kinda late.  That's what I get
for letting my email back up for >1 month. :)

> In any case, I'd be curious if this fixes the problem.
>
> Craig
>
> --- rsync-2.6.2/fileio.c	Sun Jan  4 19:57:15 2004
> +++ ../rsync-2.6.2/fileio.c	Thu Jun 17 19:33:26 2004
> @@ -193,8 +193,8 @@
>  	if (window_start + window_size > map->file_size) {
>  		window_size = map->file_size - window_start;
>  	}
> -	if (offset + len > window_start + window_size) {
> -		window_size = (offset+len) - window_start;
> +	if (offset + 16 * len > window_start + window_size) {
> +		window_size = (offset + 16 * len) - window_start;
>  	}

My initial reaction (having not actually read the code) is that it
would be desirable to make the window_size highly composite, and then
ensure that the block size is an integer factor of the window_size.
In other words, avoid the memmoves altogether.

{/me thinks a bit}

Actually, I don't think this removes the pathological case at all.  It
just reduces the frequency of the impact by a factor of 16.  Consider
when len = window_size/16 - 1.  We'll still end up with a memmove of
size len-16, which is expensive when len is large, despite window_size
being 16 times larger.

Besides making window_size mod len = 0, another solution could be to
use a circular buffer.

This is a pretty nasty bug you guys found.  I hope we can fix it soon.

	-chris

>  	/* make sure we have allocated enough memory for the window */
RE: [Bug 1463] New: poor performance with large block size
I applied your patch and it has resolved the problem.

Thanks Craig

-----Original Message-----
From: Craig Barratt [mailto:[EMAIL PROTECTED]]
Sent: Thursday, June 17, 2004 11:48 PM
To: Wallace Matthews
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: [Bug 1463] New: poor performance with large block size

Wally writes:

> I apologize to Craig.  Chris is correct.

No problem.

> I had been reading so many of Chris's highly intelligent e-mails...

Same here.

> But, the comment seems to have been right on.  I have re-run the
> experiment with block sizes as small as 3000 (yes it took a long
> time to complete) all the way up to block sizes of 10 with it
> working in reasonable times.  But, when the block size approaches
> 170,000 or so, the performance degrades exponentially.
>
> I understand that I am testing at the very fringes of what we should
> expect rsync to do.  File sizes of 25Gig and 55Gig are beyond what
> was originally envisioned (based on 64k hash buckets and a sliding
> window of 256k).

Here's a patch to try.  It basically ensures that the window is
at least 16 times the block size.  Before I'd endorse this patch
for CVS we need to make sure there aren't cases where map_ptr is
called with a much bigger length, making the 16x a bit excessive.

Perhaps I would be tempted to repeat the previous check that the
window start plus the window size doesn't exceed the file length,
although it must be at least offset + len - window_start as in
the original code.

In any case, I'd be curious if this fixes the problem.
Craig

--- rsync-2.6.2/fileio.c	Sun Jan  4 19:57:15 2004
+++ ../rsync-2.6.2/fileio.c	Thu Jun 17 19:33:26 2004
@@ -193,8 +193,8 @@
 	if (window_start + window_size > map->file_size) {
 		window_size = map->file_size - window_start;
 	}
-	if (offset + len > window_start + window_size) {
-		window_size = (offset+len) - window_start;
+	if (offset + 16 * len > window_start + window_size) {
+		window_size = (offset + 16 * len) - window_start;
 	}

 	/* make sure we have allocated enough memory for the window */
Re: [Bug 1463] New: poor performance with large block size
Wally writes:

> I apologize to Craig.  Chris is correct.

No problem.

> I had been reading so many of Chris's highly intelligent e-mails...

Same here.

> But, the comment seems to have been right on.  I have re-run the
> experiment with block sizes as small as 3000 (yes it took a long
> time to complete) all the way up to block sizes of 10 with it
> working in reasonable times.  But, when the block size approaches
> 170,000 or so, the performance degrades exponentially.
>
> I understand that I am testing at the very fringes of what we should
> expect rsync to do.  File sizes of 25Gig and 55Gig are beyond what
> was originally envisioned (based on 64k hash buckets and a sliding
> window of 256k).

Here's a patch to try.  It basically ensures that the window is
at least 16 times the block size.  Before I'd endorse this patch
for CVS we need to make sure there aren't cases where map_ptr is
called with a much bigger length, making the 16x a bit excessive.

Perhaps I would be tempted to repeat the previous check that the
window start plus the window size doesn't exceed the file length,
although it must be at least offset + len - window_start as in
the original code.

In any case, I'd be curious if this fixes the problem.

Craig

--- rsync-2.6.2/fileio.c	Sun Jan  4 19:57:15 2004
+++ ../rsync-2.6.2/fileio.c	Thu Jun 17 19:33:26 2004
@@ -193,8 +193,8 @@
 	if (window_start + window_size > map->file_size) {
 		window_size = map->file_size - window_start;
 	}
-	if (offset + len > window_start + window_size) {
-		window_size = (offset+len) - window_start;
+	if (offset + 16 * len > window_start + window_size) {
+		window_size = (offset + 16 * len) - window_start;
 	}

 	/* make sure we have allocated enough memory for the window */
RE: [Bug 1463] New: poor performance with large block size
I apologize to Craig.  Chris is correct.  I had been reading so many
of Chris's highly intelligent e-mails that for some reason my brain
ascribed the comment to Chris.

But, the comment seems to have been right on.  I have re-run the
experiment with block sizes as small as 3000 (yes it took a long time
to complete) all the way up to block sizes of 10 with it working in
reasonable times.  But, when the block size approaches 170,000 or so,
the performance degrades exponentially.

I understand that I am testing at the very fringes of what we should
expect rsync to do.  File sizes of 25Gig and 55Gig are beyond what was
originally envisioned (based on 64k hash buckets and a sliding window
of 256k).  I am very impressed that rsync is functional at these file
sizes.  The developers are doing a great job.

wally

-----Original Message-----
From: Chris Shoemaker [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, June 16, 2004 1:45 PM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Re: [Bug 1463] New: poor performance with large block size

On Wed, Jun 16, 2004 at 06:21:15AM -0700, [EMAIL PROTECTED] wrote:
> https://bugzilla.samba.org/show_bug.cgi?id=1463
>
>            Summary: poor performance with large block size
>            Product: rsync
>            Version: 2.6.2
>           Platform: x86
>         OS/Version: other
>             Status: NEW
>           Severity: normal
>           Priority: P3
>          Component: core
>         AssignedTo: [EMAIL PROTECTED]
>         ReportedBy: [EMAIL PROTECTED]
>          QAContact: [EMAIL PROTECTED]
>
> I have a 29Gig file that is the previous version of a file and a 1.3
> Gig incremental backup of the file.  I did the transfer with no block
> size option and it takes about 6 minutes (GigEthernet).  I used
> --block-size = 90k and it took about 6 minutes.  I used
> --block-size=182000 (close to the square root of 29 Gig) and it only
> completed ~50 Meg of transfer of the 1.3 Gig in a couple of hours.
> Chris Shoemaker suggests this is a problem with the sliding window
> being a fixed size of 256k and that the block size will not allow
> multiple blocks in the window size.

Er, that suggestion sounds WAY too intelligent to have come from
me. :-)  Seriously, that was Craig Barratt.

	-chris

> Operating system is Redhat 9.1 for both systems.  Both systems have
> the 2.6.2 with only 1 patch and it is the one Wayne provided for
> --bwlimit being bi-modal
Re: [Bug 1463] New: poor performance with large block size
On Wed, Jun 16, 2004 at 06:21:15AM -0700, [EMAIL PROTECTED] wrote:
> https://bugzilla.samba.org/show_bug.cgi?id=1463
>
>            Summary: poor performance with large block size
>            Product: rsync
>            Version: 2.6.2
>           Platform: x86
>         OS/Version: other
>             Status: NEW
>           Severity: normal
>           Priority: P3
>          Component: core
>         AssignedTo: [EMAIL PROTECTED]
>         ReportedBy: [EMAIL PROTECTED]
>          QAContact: [EMAIL PROTECTED]
>
> I have a 29Gig file that is the previous version of a file and a 1.3
> Gig incremental backup of the file.  I did the transfer with no block
> size option and it takes about 6 minutes (GigEthernet).  I used
> --block-size = 90k and it took about 6 minutes.  I used
> --block-size=182000 (close to the square root of 29 Gig) and it only
> completed ~50 Meg of transfer of the 1.3 Gig in a couple of hours.
>
> Chris Shoemaker suggests this is a problem with the sliding window
> being a fixed size of 256k and that the block size will not allow
> multiple blocks in the window size.

Er, that suggestion sounds WAY too intelligent to have come from
me. :-)  Seriously, that was Craig Barratt.

	-chris

> Operating system is Redhat 9.1 for both systems.  Both systems have
> the 2.6.2 with only 1 patch and it is the one Wayne provided for
> --bwlimit being bi-modal
[Bug 1463] New: poor performance with large block size
https://bugzilla.samba.org/show_bug.cgi?id=1463

           Summary: poor performance with large block size
           Product: rsync
           Version: 2.6.2
          Platform: x86
        OS/Version: other
            Status: NEW
          Severity: normal
          Priority: P3
         Component: core
        AssignedTo: [EMAIL PROTECTED]
        ReportedBy: [EMAIL PROTECTED]
         QAContact: [EMAIL PROTECTED]

I have a 29Gig file that is the previous version of a file and a 1.3
Gig incremental backup of the file.  I did the transfer with no block
size option and it takes about 6 minutes (GigEthernet).  I used
--block-size = 90k and it took about 6 minutes.  I used
--block-size=182000 (close to the square root of 29 Gig) and it only
completed ~50 Meg of transfer of the 1.3 Gig in a couple of hours.

Chris Shoemaker suggests this is a problem with the sliding window
being a fixed size of 256k and that the block size will not allow
multiple blocks in the window size.

Operating system is Redhat 9.1 for both systems.  Both systems have
the 2.6.2 with only 1 patch and it is the one Wayne provided for
--bwlimit being bi-modal

--
Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the QA contact for the bug, or are watching the QA contact.