grep 2.18 is slow for below, because always d == 1.

  $ env LANG=C src/grep jk k

Therefore, I wrote 
0001-grep-speed-up-by-using-memchr-in-Boyer-Moore-searchi.patch.

When `d' is small, speeds up.  memchr() is faster than or as fast as
delta1 search even when `d' is sufficiently large.

However, this patch can't apply to case-insensitive matching.
In fast, when `d' is large as following case, memchr_trans() imitated
memchr() will occur slowdown.

  $ env LANG=C src/grep -i kkkkkkkkkkkkkkkkkkkkl k

So my patch works for only case-sensitive matching.  By the way, for below
it's still slow with my patch.  Especially, on x86 machines, it's slower
than grep 2.18.

  $ env LANG=C src/grep -i jk k

I think that the master should be faster than original version (grep 2.18),
which uses not kwset but DFA for it.

Accordingly, I added 
0001-grep-speed-up-by-replacing-incr-to-add-in-x86-and-x8.patch
I noticed that DFA uses increments to proceed text pointer, and found
that the repeat of `++tp;' is faster than tp += d on x86 machines, when
`d' is small.  `++tp; ++tp; ++tp' is faster than `tp += d;' on it.
This patch inprements it.  I looked at the performance with below.

  $ env LANG=C src/grep -i jk k
  $ env LANG=C src/grep -i jkk k
  $ env LANG=C src/grep -i jkkk k
  $ env LANG=C src/grep -i jkkkk k
  $ env LANG=C src/grep -i jkkkkk k
  $ env LANG=C src/grep -i jkkkkkk k
  $ env LANG=C src/grep -i jkkkkkkk k
  $ env LANG=C src/grep -i jkkkkkkkk k
  $ env LANG=C src/grep -i jkkkkkkkkk k

When `d' is greater than 8, It uses tp += d so that it's as fast as
repeat of `++tp;'.  OTOH, when `d' is less than or equal to 8, uses the
repeat of `++tp', so that it's as fast as or faster than `tp += d'.

I used some macros in my patches, because I don't know how to replace it
to other expression without slowdown.

Norihiro




Reply via email to