27-May-2013 01:04, Kiith-Sa wrote:
WRT the worse Linux64 case:
I recommend running it in an infinite loop and profiling it with perf top.



(If you're on Ubuntu or a derivative, or maybe Debian, just type "perf top";
it will tell you which package to install. Once it's installed, run "perf top"
again while the benchmark is running.)

You'll get a precise real-time line-wise (with ability to drill down to
ASM) profile (like "top", but for functions).

With some command-line options (google "linux perf"), you can also look
at cache misses, branch mispredictions, and so on. Compare that with the
original version and you might find why it's slower.
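A rough sketch of that workflow, for reference. The binary name ./benchmark and the three function names are hypothetical placeholders, and the event names assume a CPU and kernel that expose the generic cache-misses and branch-misses counters (check "perf list" on your machine):

```shell
# Hypothetical benchmark binary; substitute your own.
BENCH=./benchmark

# Terminal 1: rerun the benchmark forever so perf top always has samples.
loop_bench() {
    while true; do "$BENCH" > /dev/null; done
}

# Terminal 2: live per-function profile, like "top" for functions.
# Selecting a symbol in the UI lets you annotate it down to ASM.
live_profile() {
    perf top
}

# One-shot totals of cache misses and branch mispredictions for one run.
count_events() {
    perf stat -e cache-misses,branch-misses "$BENCH"
}
```

Running count_events once for each version of the code and comparing the two reports is probably the quickest way to see whether the slowdown shows up in either counter.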

(Don't have time to test anything right now)

Just tried it. Now I at least see that in 32-bit my version is faster, whereas in 64-bit it isn't (that is with DMD). One curiosity is that the code for the ASCII case is the same, yet even on English text the difference is about the same. Another is that neither function is even partially inlined.

--
Dmitry Olshansky
