On Tue, Jan 25, 2011 at 1:37 AM, Stefan Fuhrmann <eq...@web.de> wrote:
[ ... snip ...]
> And, as promised, here some ideas how to get more
> speed from the generic code. Your latest commit:
>
> +#if SVN_UNALIGNED_ACCESS_IS_OK
> +
> +  /* Skip quickly over the stuff between EOLs. */
> +  for (i = 0, can_read_word = TRUE; i < file_len; i++)
> +    can_read_word = can_read_word
> +                    && (file[i].curp + sizeof(apr_size_t) < file[i].endp);
> +  while (can_read_word)
> +    {
> +      for (i = 1, is_match = TRUE; i < file_len; i++)
> +        is_match = is_match
> +                   && (  *(const apr_size_t *)file[0].curp
> +                      == *(const apr_size_t *)file[i].curp);
> +
> +      if (!is_match || contains_eol(*(const apr_size_t *)file[0].curp))
> +        break;
> +
> +      for (i = 0; i < file_len; i++)
> +        file[i].curp += sizeof(apr_size_t);
> +      for (i = 0, can_read_word = TRUE; i < file_len; i++)
> +        can_read_word = can_read_word
> +                        && (file[i].curp + sizeof(apr_size_t) < file[i].endp);
> +    }
> +
> +#endif
>
> could be changed to something like the following.
> Please note that I haven't tested any of this:
Thanks. There was one error in your suggestion, which I found out after
testing. See below.

> /* Determine how far we may advance with chunky ops without reaching
>  * endp for any of the files.
>  * Signedness is important here if curp gets close to endp.
>  */
> apr_ssize_t max_delta = file[0].endp - file[0].curp - sizeof(apr_size_t);
> for (i = 1; i < file_len; i++)
>   {
>     apr_ssize_t delta = file[i].endp - file[i].curp - sizeof(apr_size_t);
>     if (delta < max_delta)
>       max_delta = delta;
>   }
>
> /* the former while() loop */
> is_match = TRUE;
> for (delta = 0; delta < max_delta && is_match; delta += sizeof(apr_size_t))
>   {
>     apr_size_t chunk = *(const apr_size_t *)(file[0].curp + delta);
>     if (contains_eol(chunk))
>       break;
>
>     for (i = 1; i < file_len; i++)
>       if (chunk != *(const apr_size_t *)(file[i].curp + delta))
>         {
>           is_match = FALSE;

Here, I inserted:

            delta -= sizeof(apr_size_t);

because otherwise delta would be increased too far: it still gets bumped
by the counting expression of the outer for-loop before that loop stops
on !is_match. Maybe there is a cleaner/clearer way to break out of the
outer for-loop here, without incrementing delta again, but for now I've
committed it with this change (r1063565).

>           break;
>         }
>   }
>
> /* We either found a mismatch or an EOL at or shortly behind curp+delta
>  * or we cannot proceed with chunky ops without exceeding endp.
>  * In any way, everything up to curp + delta is equal and not an EOL.
>  */
> for (i = 0; i < file_len; i++)
>   file[i].curp += delta;

Thanks. This gives another 15-20% performance increase on my
machine/example (datasources_open time going down from ~21 s to ~17 s).

We should probably do the same for suffix scanning, but I'm too tired
right now :-) (and suffix scanning is more difficult to grok, so not a
good idea to do at 3 am).

Cheers,
--
Johan