On Sat, 2014-06-14 at 17:22 +0200, Ondřej Bílka wrote:
> On Thu, Jun 05, 2014 at 07:56:15PM -0400, David Turner wrote:
> > Optimize check_refname_component using SSE4.2, where available.
> > git rev-parse HEAD is a good test-case for this, since it does almost
> > nothing except parse refs. For one particular repo with about 60k
> > refs, almost all packed, the timings are:
> > Look up table: 29 ms
> > SSE4.2: 25 ms
> > This is about a 15% improvement.
> > The configure.ac changes include code from the GNU C Library written
> > by Joseph S. Myers <joseph at codesourcery dot com>.
> > Only supports GCC and Clang at present, because C interfaces to the
> > cpuid instruction are not well-standardized.
> Still, SSE4.2 is not that useful; in most cases SSE2 is faster. Here I
> think the difference will not be that big when correctly implemented.
> That will avoid a runtime check.
To my surprise, this is true, at least on my machine. Sadly, SSE2 is
guaranteed only on x86-64, so the only way to avoid a runtime check is
to exclude 32-bit machines (or to make the option non-default, which I
would prefer not to do).
> For parallelisation you need to take an extra step and parallelize the
> whole check rather than going component-by-component.
> For detecting sequences, a faster way is to construct bitmasks with SSE2 so
> you could combine these. It avoids needing special casing on 16-byte
That does seem to be faster.
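
For anyone following along, here is a minimal sketch of the bitmask idea
as I understand it. It is not the code from either patch. The helper
names (byte_mask16, has_bad_sequence) are made up for illustration, and
it only looks for the two-byte sequences ".." and "@{". Each 16-byte
chunk is turned into one bitmask per interesting byte, and a sequence is
found by shifting one mask and ANDing it with another. A carry bit from
the previous chunk covers sequences that straddle a chunk edge. On
machines without SSE2 it falls back to a scalar loop that builds the
same mask.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: bit i of the result is set when chunk[i] == c. */
static uint32_t byte_mask16(const unsigned char *chunk, char c)
{
#if defined(__SSE2__)
	__m128i v = _mm_loadu_si128((const __m128i *)chunk);
	__m128i eq = _mm_cmpeq_epi8(v, _mm_set1_epi8(c));
	return (uint32_t)_mm_movemask_epi8(eq);
#else
	uint32_t m = 0;
	int i;
	for (i = 0; i < 16; i++)
		if (chunk[i] == (unsigned char)c)
			m |= 1u << i;
	return m;
#endif
}

/* Hypothetical helper: returns 1 if s contains ".." or "@{", else 0. */
static int has_bad_sequence(const char *s, size_t len)
{
	/* Carry bits: was the last byte of the previous chunk '.' or '@'? */
	uint32_t prev_dot = 0, prev_at = 0;
	size_t off;

	assert(s != NULL);
	for (off = 0; off < len; off += 16) {
		/* Zero-padded copy so we never read past the end of s. */
		unsigned char buf[16] = { 0 };
		size_t n = len - off < 16 ? len - off : 16;
		uint32_t dot, at, brace;

		memcpy(buf, s + off, n);
		dot = byte_mask16(buf, '.');
		at = byte_mask16(buf, '@');
		brace = byte_mask16(buf, '{');

		/* After the shift, bit i means "byte i-1 matched". */
		if ((((dot << 1) | prev_dot) & dot) ||
		    (((at << 1) | prev_at) & brace))
			return 1;

		prev_dot = dot >> 15;
		prev_at = at >> 15;
	}
	return 0;
}
```

The copy into a padded buffer keeps the sketch simple and safe; a tuned
version would presumably load directly from the string and deal with
the tail some other way.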
> Below is an untested implementation where you could add a bad character
> check with SSE4.2 which would speed it up. Are refs mostly
> alphanumerical? If so we could speed this up with a parallelized alnum
> check, handling other characters in a slower path.
Twitter's refs are almost entirely in [-._/a-zA-Z0-9]; there are only a
handful of exceptions. So a method that has some bycatch outside of
this range is just as fast as the SSE4.2 bad character check (but
takes somewhat more code).
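
To make "bycatch" concrete, here is a rough scalar sketch, not the
patch itself. The function names are hypothetical, and the list of bad
characters is my reading of the git-check-ref-format documentation
(control characters, DEL, space, and ~ ^ : ? * [ \). Bytes in the
common class are accepted immediately; anything else, including
harmless characters such as '+', drops to the exact test. It only
covers the per-character check, so sequences such as ".." still need
their own test.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fast path: 1 if c is in [-._/a-zA-Z0-9], the class most refs stay in. */
static int is_common_ref_char(unsigned char c)
{
	return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
	       (c >= '0' && c <= '9') ||
	       c == '-' || c == '.' || c == '_' || c == '/';
}

/* Slow path: exact test for a byte that is never allowed in a ref. */
static int is_bad_ref_char(unsigned char c)
{
	return c < 0x20 || c == 0x7f || strchr(" ~^:?*[\\", c) != NULL;
}

/* Hypothetical helper: returns 1 if name contains a bad byte, else 0. */
static int has_bad_ref_char(const char *name, size_t len)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < len; i++) {
		unsigned char c = (unsigned char)name[i];

		if (is_common_ref_char(c))
			continue;	/* the overwhelmingly common case */
		if (is_bad_ref_char(c))	/* bycatch gets the exact test */
			return 1;
	}
	return 0;
}
```

A vectorized version would run the range checks on 16 bytes at once and
take the slow path only for chunks where some byte falls outside the
common class.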