On 05/30/2016 05:31 AM, qznc wrote:
On Sunday, 29 May 2016 at 21:07:21 UTC, qznc wrote:
When looking at the assembly I don't like the single-byte loads. Since
string (ubyte[] here) is of extraordinary importance, it should be
worthwhile to use word loads [0] instead. Really fancy would be SSE.
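For readers unfamiliar with the idea, a word load here means scanning eight haystack bytes per iteration instead of one. Below is my own C sketch of the standard SWAR zero-byte trick for locating the needle's first byte; it is only an illustration of the technique, not the code behind the numbers below, and the function name is made up:

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Scan for `needle` eight bytes at a time using the SWAR
   "does this word contain a zero byte" test.
   Returns the index of the first occurrence, or `len` if absent. */
static size_t find_byte_wordwise(const unsigned char *hay, size_t len,
                                 unsigned char needle)
{
    const uint64_t ones  = 0x0101010101010101ULL;
    const uint64_t highs = 0x8080808080808080ULL;
    const uint64_t pattern = ones * needle; /* needle in every byte lane */
    size_t i = 0;

    for (; i + 8 <= len; i += 8) {
        uint64_t word;
        memcpy(&word, hay + i, 8);     /* unaligned-safe word load */
        uint64_t x = word ^ pattern;   /* zero byte where needle occurs */
        if ((x - ones) & ~x & highs)   /* any zero byte in x? */
            break;                     /* pinpoint it bytewise below */
    }
    for (; i < len; i++)               /* tail and hit localization */
        if (hay[i] == needle)
            return i;
    return len;
}
```

Once the first byte is located this way, the full needle comparison proceeds as usual; the hope is that the word loop filters out most positions cheaply.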
So far, the results look disappointing. Andrei find does not get faster
with wordwise matching:
./benchmark.ldc
std find: 133 ±25 +38 (3384) -19 (6486)
manual find: 140 ±37 +64 (2953) -25 (6962)
qznc find: 114 ±17 +33 (2610) -11 (7262)
Chris find: 146 ±39 +66 (3045) -28 (6873)
Andrei find: 126 ±29 +54 (2720) -19 (7189)
Wordwise find: 130 ±30 +53 (2934) -21 (6980)
Interesting side note: on my laptop, Andrei find is faster than qznc find
(with LDC), but on my desktop it is the reverse (see above). Both are
Intel i7s. I need to find a simpler processor; maybe wordwise is faster
there. Alternatively, find is purely memory-bound and the L1 cache makes
every difference disappear.
Also, note how std find is faster than manual find! Finding a reliable
benchmark is hard. :/
Please throw this hat into the ring as well: it should improve average
search times on a large vocabulary dramatically.
https://dpaste.dzfl.pl/dc8dc6e1eb53
It uses a BM (Boyer-Moore)-inspired trick: once the last character has
matched, if the match subsequently fails it needn't resume from the very
next character in the haystack. The "skip" is computed lazily and in a
separate function so as to keep the loop tight. All in all, a routine
worth a look. I wanted to write this for a long time. -- Andrei
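Roughly, the trick as described reads like this in C. This is my own sketch of the idea, not the code in the paste; `compute_skip` and `find_lastchar` are hypothetical names:

```c
#include <stddef.h>
#include <string.h>

/* Distance from the needle's last character back to its previous
   occurrence inside the needle; nlen if there is none. Called lazily,
   and only once, so the hot loop below stays tight. */
static size_t compute_skip(const char *needle, size_t nlen)
{
    for (size_t j = nlen - 1; j > 0; j--)
        if (needle[j - 1] == needle[nlen - 1])
            return nlen - j;
    return nlen;
}

/* Returns the index of the first match, or hlen if absent
   (analogous to D's find returning an empty range). */
static size_t find_lastchar(const char *hay, size_t hlen,
                            const char *needle, size_t nlen)
{
    if (nlen == 0) return 0;
    if (nlen > hlen) return hlen;
    const char last = needle[nlen - 1];
    size_t skip = 0;                   /* 0 = not computed yet */
    size_t i = 0;
    while (i + nlen <= hlen) {
        if (hay[i + nlen - 1] != last) {
            i++;                       /* cheap test, tight loop */
            continue;
        }
        if (memcmp(hay + i, needle, nlen - 1) == 0)
            return i;
        /* Last char matched but the full match failed: any smaller
           shift would need another occurrence of `last` inside the
           needle, so we may jump by `skip`. */
        if (skip == 0)
            skip = compute_skip(needle, nlen);
        i += skip;
    }
    return hlen;
}
```

The point of laziness is that for needles that never get a last-character hit, the skip table work is never paid at all.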