On Thu, Jul 15, 2004 at 07:06:28PM -0700, Wayne Davison wrote: > On Wed, Jul 14, 2004 at 06:27:45PM -0400, Chris Shoemaker wrote: > > My initial reaction (having not actually read the code) is that it would > > be desirable make the window_size highly composite, and then ensure that > > the block size is an integer factor of the window_size. In other words, > > avoid the memmoves altogether. > > I don't believe that this is possible since the offset to map_ptr() is > often incremented by a single byte.
Ah, well, I did say I hadn't actually read the code. :) (Now I have.) Users of map_ptr(): file_checksum() walks with non-overlapping stride of CSUM_CHUNK. generate_and_send_sums() walks with non-overlapping stride of blength. simple_send_token() walks with non-overlapping stride of CHUNK_SIZE. send_deflated_token() walks with non-overlapping stride of CHUNK_SIZE (I think.) matched() walks with non-overlapping stride of CHUNK_SIZE. hash_search() walks a window of size blength (I think) with overlapping steps of 1. You must be talking about hash_search(). I hadn't considered that type of use, only the non-overlapping, constant stride use. I agree that playing with window sizes can't prevent hash_search() from triggering the memmove in map_ptr(). Nevertheless, in may help to optimize the window sizes for the other five uses. CSUM_CHUNK == 64, so any left-over during the memmove case is probably inconsequential. CHUNK_SIZE == 32k, so that makes me think that there's a real benefit to making the window size a multiple of 32k. > I'll attach my version of Craig's fix that increases the MAX_MAP_SIZE > value for large block sizes. It also caps the block size that can be > requested or computed to 256KB (which would make the map buffer in my > patched version max out at 8MB). > > ..wayne.. > --- fileio.c 16 Jul 2004 01:32:02 -0000 1.13 > +++ fileio.c 16 Jul 2004 01:58:34 -0000 > @@ -24,6 +24,8 @@ > > extern int sparse_files; > > +unsigned int max_map_size = MAX_MAP_SIZE; > + > static char last_byte; > static int last_sparse; > > @@ -186,7 +188,7 @@ char *map_ptr(struct map_struct *map,OFF > } else { > window_start = 0; > } > - window_size = MAX_MAP_SIZE; > + window_size = max_map_size; > if (window_start + window_size > map->file_size) { > window_size = map->file_size - window_start; > } > --- generator.c 15 Jul 2004 02:20:08 -0000 1.97 > +++ generator.c 16 Jul 2004 01:58:34 -0000 > @@ -52,6 +52,7 @@ extern int only_existing; > extern int orig_umask; > extern int safe_symlinks; > extern unsigned int block_size; > +extern unsigned int max_map_size; > > extern struct exclude_list_struct server_exclude_list; > > @@ -162,7 +163,9 @@ static void sum_sizes_sqroot(struct sum_ > c >>= 1; > } while (c >= 8); /* round to multiple of 8 */ > blength = MAX(blength, BLOCK_SIZE); > + blength = MIN(blength, MAX_MAP_SIZE); When we would have blength > MAX_MAP_SIZE, this makes max_map_size a multiple (1) of blength. Good. > } > + max_map_size = MIN(MAX_MAP_SIZE, blength * 32); This makes max_map_size a multiple (32) of blength for a large range (blength*32 < MAX_MAP_SIZE), For those two cases, this is probably optimal. But what is max_map_size when MAX_MAP_SIZE/32 < blength < MAX_MAP_SIZE? Ans: max_map_size = MAX_MAP_SIZE, and this is still problematic when e.g. blength = MAX_MAP_SIZE/2 + 1, resulting in memmoves of size MAX_MAP_SIZE/2 - 1, in the common, non-overlapping stride case. (It is, however, problematic over a smaller range of blength then the previous code was.) So, while this certainly does a lot to improve the worst-case performance, I think it leaves a large range of of conditions for which it is sub-optimal. There is a bit of a problem of too-many-free-variables here, because CHUNK_SIZE==32k, and that's not changing with blength. Since 32k has no factor other than 2, if blength turns out to be odd (doesn't really happen, blength is rounded to 3 bits, I think), then no max_map_size less than blength*32k can avoid the memmove. Obviously you would have to balance the desire not to need so much memory with the desire to avoid the memmove. So, you _could_ remove the extra free varible by making CHUNK_SIZE depend on (be a multiple of) blength, just like max_map_length. I don't know what the full impact of that change would be. At the other extreme, you could constrain blength to be a power of two. That makes it a cofactor of CHUNK_SIZE, and max_map_size is optimal (and of reasonable size) when it is the greater of the two. In general, max_map_size can't be optimal for reads both of size CHUNK_SIZE and size blength, unless one is a factor of the other. BTW, I think "max_map_size" is a rather misleading name, since the map size is never less than this (except maybe at the end of the file), and it can become greater than this, depending on the len argument of map_ptr(). Maybe a better name would be "init_map_size"? BIG CAVEAT: Everything I've written has one underlying assumption which I know is false at the moment: that map_ptr's internal window advancements are non-overlapping, which they are not. Why? ISTM, that the only reason to have the slight lag in window advancement is you expected to frequently service requests where the offset was decreasing just a little. I didn't see that happening anywhere. Did I miss something? If not, I think map_ptr should advance the window w/o overlapping. Once I setup a way to measure performace, the first thing I will try is: 1) remove advancement overlap in map_ptr() 2) make the blength calc result in power of 2. 3) set max_map_ptr to MAX(CHUNK_SIZE, blength*32) It's still good to have that factor of 32 because of the overlaps in hash_search(). Any thoughts? -chris > > if (protocol_version < 27) { > s2length = csum_length; > --- options.c 15 Jul 2004 19:06:32 -0000 1.160 > +++ options.c 16 Jul 2004 01:58:34 -0000 > @@ -635,6 +635,12 @@ int parse_arguments(int *argc, const cha > } > #endif > > + if (block_size > MAX_MAP_SIZE) { > + rprintf(FINFO, "limiting block-size to %d bytes\n", > + MAX_MAP_SIZE); > + block_size = MAX_MAP_SIZE; > + } > + > if (write_batch && read_batch) { > rprintf(FERROR, > "write-batch and read-batch can not be used together\n"); -- To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html