On Friday, 30 October 2015 at 21:33:25 UTC, Andrei Alexandrescu
wrote:
Could you please take a look at GCC's generated code and
implementation of memchr? -- Andrei
A copy-and-paste of glibc's memchr (runGLibC) gave the results
below.
-----
Naive: 21.4 TickDuration(132485705)
SIMD: 3.17 TickDuration(19629892)
SIMDM: 2.49 TickDuration(15420462)
C: 1 TickDuration(6195504)
runGLibC: 4.32 TickDuration(26782585)
SIMDU: 1.8 TickDuration(11128618)
The ASM shows that memchr is really called. There is neither
compiler magic nor a local memchr implementation.
The aligned versions of memchr use aligned loads from memory and
the unaligned one uses unaligned loads, so on this point the
optimisation is done well.
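
For reference, here is a minimal sketch in C of what an unaligned-load SIMD memchr can look like. It is not glibc's implementation and not the code benchmarked above. The function name memchr_sse2_unaligned is made up for illustration, and the sketch assumes an x86 target with SSE2 and a GCC or Clang compiler (for __builtin_ctz).

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper for illustration only; not glibc's memchr. */
static const void *memchr_sse2_unaligned(const void *s, int c, size_t n)
{
    const unsigned char *p = (const unsigned char *)s;
    /* Broadcast the byte we are looking for into all 16 lanes. */
    const __m128i needle = _mm_set1_epi8((char)c);

    /* Main loop: 16 bytes per iteration, unaligned load (movdqu). */
    while (n >= 16) {
        __m128i chunk = _mm_loadu_si128((const __m128i *)p);
        /* One bit per byte lane: set where the lane equals the needle. */
        int mask = _mm_movemask_epi8(_mm_cmpeq_epi8(chunk, needle));
        if (mask != 0)
            /* Lowest set bit is the first match (GCC/Clang builtin). */
            return p + __builtin_ctz((unsigned)mask);
        p += 16;
        n -= 16;
    }

    /* Scalar tail for the remaining 0..15 bytes. */
    while (n--) {
        if (*p == (unsigned char)c)
            return p;
        p++;
    }
    return NULL;
}
```

An aligned variant would first handle the leading bytes up to a 16-byte boundary and then switch to _mm_load_si128 in the main loop. Whether that pays off depends on the CPU, so it would need to be measured rather than assumed.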