Bernd Schmidt wrote:
> On 09/15/2016 03:38 PM, Wilco Dijkstra wrote:
> > __rawmemchr is not the fastest on any target I tried, including x86,
> Interesting. Care to share your test program? I just looked at the libc 
> sources and strlen/rawmemchr are practically identical code so I'd 
> expect any difference to be lost in the noise. Of course there might be 
> inlines interfering with the comparison.

It's glibc/benchtests/bench-strlen.c slightly modified to compare strlen,
rawmemchr and strchr. Even if they appear identical the inner loop of strlen
is much faster than strchr and rawmemchr at larger sizes:

                                strchr          rawmemchr       strlen
Length 4096, alignment 12:      3.35132e+06     2.39842e+06     1.88962e+06

> > So the only reasonable optimization is to always emit a + strlen (a).
> Not sure about "only reasonable" but on the whole I'd agree that it's 
> reasonable and we shouldn't let the perfect be the enemy of the good 
> here. I'm sure we can come up with lots of different ways to do this but 
> let's just pick one and if the one Wilco submitted looks decent let's 
> just put it in.
> Out of curiosity, is there real-world code that this is intended to 
> optimize?

I noticed rawmemchr taking non-trivial amounts of time in various profiles
despite no use of rawmemchr in any of the source code. It's apparently a
common idiom to use strchr (s, 0) to find the end of a string. Given strchr is
slower than strlen, it is changed to rawmemchr by GLIBC headers. However
this makes things even slower since few targets have an optimized rawmemchr,
and for targets that do, strlen is faster.
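
To make the equivalence concrete, here is a minimal sketch of the idiom and
of the form the optimization would emit. The helper names end_via_strchr and
end_via_strlen are hypothetical and exist only for illustration; they are not
part of the patch or of GLIBC.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the idiom as it commonly appears in source code.
   Searching for the terminating NUL returns a pointer to the string's end.  */
static const char *
end_via_strchr (const char *s)
{
  return strchr (s, '\0');
}

/* Hypothetical helper: the equivalent form, a + strlen (a).
   It yields the same pointer for any NUL-terminated string.  */
static const char *
end_via_strlen (const char *s)
{
  return s + strlen (s);
}
```

Both functions return a pointer to the terminating NUL, so the rewrite only
changes which library routine does the scan.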

So this is one of many improvements to ensure GCC/GLIBC by default do 
optimizations in a way that is best for most targets. If a particular target
wants to do something different that is always possible of course.

