David Turner <dtur...@twopensource.com> writes:

> Optimize check_refname_component using SSE2 on x86_64.
>
> git rev-parse HEAD is a good test-case for this, since it does almost
> nothing except parse refs.  For one particular repo with about 60k
> refs, almost all packed, the timings are:
>
> Look up table: 29 ms
> SSE2:          23 ms
>
> This cuts about 20% off of the runtime.
>
> The configure.ac changes include code from the GNU C Library written
> by Joseph S. Myers <joseph at codesourcery dot com>.
>
> Ondřej Bílka <nel...@seznam.cz> suggested an SSE2 approach to the

One e-mail address is obfuscated while the other is not; is that intended?

> substring searches, which netted a speed boost over the SSE4.2 code I
> had initially written.
>
> Signed-off-by: David Turner <dtur...@twitter.com>
> ---
> diff --git a/git-compat-util.h b/git-compat-util.h
> index f6d3a46..291d46b 100644
> --- a/git-compat-util.h
> +++ b/git-compat-util.h
> @@ -668,6 +668,16 @@ void git_qsort(void *base, size_t nmemb, size_t size,
>  #endif
>  #endif
>  
> +#if defined(__GNUC__) && defined(__x86_64__)
> +#include <emmintrin.h>
> +/* This is the system memory page size; it's used so that we can read

Style (there are other instances of the same kind).

/*
 * This is the ...

> + * outside the bounds of an allocation without segfaulting.
> + */

> +static int check_refname_component_trailer(const char *cp, const char *refname, int flags)
> +{
> +     if (cp == refname)
> +             return 0; /* Component has zero length. */
> +     if (refname[0] == '.') {
> +             if (!(flags & REFNAME_DOT_COMPONENT))
> +                     return -1; /* Component starts with '.'. */
> +             /*
> +              * Even if leading dots are allowed, don't allow "."
> +              * as a component (".." is prevented by a rule above).
> +              */
> +             if (refname[1] == '\0')
> +                     return -1; /* Component equals ".". */
> +     }
> +     if (cp - refname >= 5 && !memcmp(cp - 5, ".lock", 5))
> +             return -1; /* Refname ends with ".lock". */

This is merely moved code that retained the same comment, but I
suspect the check really means "the current refname component ends
with .lock".  In other words, we do not allow "refs/heads/foo.lock/bar".
Am I reading the patch correctly?
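
To illustrate my reading, here is a minimal sketch.  It is not the
patch's code; component_ends_with_lock() and
any_component_ends_with_lock() are hypothetical helpers.  It applies
the ".lock" test to every slash-separated component rather than only
to the end of the full refname:

```c
#include <string.h>

/* Hypothetical helper: does the component [start, end) end with ".lock"? */
static int component_ends_with_lock(const char *start, const char *end)
{
	return end - start >= 5 && !memcmp(end - 5, ".lock", 5);
}

/*
 * Hypothetical helper: return 1 if any slash-separated component of
 * refname ends with ".lock", 0 otherwise.
 */
static int any_component_ends_with_lock(const char *refname)
{
	const char *start = refname;

	for (;;) {
		const char *end = strchr(start, '/');

		if (!end)
			end = start + strlen(start);
		if (component_ends_with_lock(start, end))
			return 1;
		if (!*end)
			return 0;
		start = end + 1;
	}
}
```

Under this reading, both "refs/heads/foo.lock" and
"refs/heads/foo.lock/bar" are flagged, while "refs/heads/foo" is not.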

> +#if defined(__GNUC__) && defined(__x86_64__)
> +#define SSE_VECTOR_BYTES 16
> +
> +/* Vectorized version of check_refname_format. */
> +int check_refname_format(const char *refname, int flags)
> +{
> +     const char *cp = refname;
> +
> +     const __m128i dot = _mm_set1_epi8 ('.');

Style (there are other instances of the same kind).  No SP between
function/macro name and opening parenthesis.

> +     if (refname[0] == '.') {
> +             if (refname[1] == '/' || refname[1] == '\0')
> +                     return -1;
> +             if (!(flags & REFNAME_DOT_COMPONENT))
> +                     return -1;
> +     }
> +     while(1) {
> +             __m128i tmp, tmp1, result;
> +             uint64_t mask;
> +
> +             if ((uintptr_t) cp % PAGE_SIZE > PAGE_SIZE - SSE_VECTOR_BYTES - 1)

OK, so we make sure we do not overrun by reading too much near the
end of the page, as the next page might be unmapped.
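
To spell out the arithmetic as I understand it, here is a small
sketch.  The names are hypothetical, and the 4096-byte page size is
an assumption for illustration; the value the patch uses is not
visible in the quoted hunk.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096	/* assumed page size, for illustration only */
#define SKETCH_VECTOR_BYTES 16	/* width of one SSE2 register */

/*
 * Return 1 if reading SKETCH_VECTOR_BYTES bytes starting at address
 * addr would touch the following page, 0 if the read stays within
 * the page that contains addr.
 */
static int load_crosses_page(uintptr_t addr)
{
	return addr % SKETCH_PAGE_SIZE > SKETCH_PAGE_SIZE - SKETCH_VECTOR_BYTES;
}
```

With these numbers, offset 4080 is the last one at which a 16-byte
read still fits in the page.  The condition in the patch also takes
the careful path at that offset, which errs on the safe side.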

I am showing my ignorance but does cp (i.e. refname) upon entry to
this function need to be aligned in some way?
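
The reason I ask: as far as I know, SSE2 has both an aligned load,
_mm_load_si128(), which requires a 16-byte-aligned address, and an
unaligned one, _mm_loadu_si128(), which does not.  Which one the
patch uses is not visible in the part quoted here.  A sketch of the
unaligned form, using a hypothetical helper dot_mask16() that is not
taken from the patch:

```c
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/*
 * Hypothetical helper: return a 16-bit mask with bit i set when
 * p[i] == '.', for the 16 bytes starting at p.  p may have any
 * alignment, but all 16 bytes must be readable.
 */
static unsigned dot_mask16(const char *p)
{
#if defined(__SSE2__)
	/* _mm_loadu_si128 accepts unaligned addresses; _mm_load_si128 does not. */
	__m128i chunk = _mm_loadu_si128((const __m128i *)p);
	__m128i eq = _mm_cmpeq_epi8(chunk, _mm_set1_epi8('.'));

	return (unsigned)_mm_movemask_epi8(eq);
#else
	/* Scalar fallback so that the sketch builds on non-SSE2 targets. */
	unsigned mask = 0;
	int i;

	for (i = 0; i < 16; i++)
		if (p[i] == '.')
			mask |= 1u << i;
	return mask;
#endif
}
```

If the loop is written along these lines, refname itself would not
need any particular alignment; only the page-boundary check above
would matter.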

Thanks.