On 05/24/2016 06:44 AM, qznc wrote:
On Tuesday, 24 May 2016 at 09:34:30 UTC, Chris wrote:
On Tuesday, 24 May 2016 at 07:54:38 UTC, qznc wrote:
Apart from advanced algorithms, find should not be slower than a
naive nested-for-loop implementation.
[0] https://issues.dlang.org/show_bug.cgi?id=16066
From Phobos [1]:
/* Preprocess #2: init skip[] */
/* Note: This step could be made a lot faster.
* A simple implementation is shown here. */
this.skip = new size_t[needle.length];
foreach (a; 0 .. needle.length)
{
size_t value = 0;
while (value < needle.length
&& !needlematch(needle, a, value))
{
++value;
}
this.skip[needle.length - a - 1] = value;
}
[1]
https://github.com/dlang/phobos/blob/master/std/algorithm/searching.d#L335
Unfortunately, it is slower. My current measurements [0]:
$ ./benchmark 10000000 10 # large haystack, few iterations
std find took 753837231
manual find took 129561980
$ ./benchmark 10 10000000 # small haystack, many iterations
std find took 721676721
manual find took 216418870
The nested-for-loop implementation is roughly 4 times faster!
Caveat: I'm not completely sure my benchmark is fair yet.
[0] https://github.com/qznc/find-is-too-slow
One somewhat unrelated thing that can be done with Boyer-Moore: use a
constant-length skip array instead of dynamic allocation. Upon
saturation, continue with brute force. -- Andrei