On Friday, 27 May 2016 at 19:17:44 UTC, Chris wrote:
Oops, I've been found out. :-) Thanks. You're right of course, and I've already noticed that bug as it fails on not found. I got the bounds wrong.
I had the same "bug" when I wrote my search function on the project at work. I also found out that using the simplest loop is the generally the best way to go. The processors nowaday have loop-caches, fast level1 caches etc, all the fancy search algorithms like Boyer-Moore etc. can only overcome their memory and initialisation overhead for really huge haystacks.
Here's the C code of my memmem() function that replaced the Boyer-Moore I had before. This implementation outran the BM search for sizes up to several hundreds megaytes on SPARC64 VII+ machine. On x86-64 I'm sure it would even be worse.
void * memmem(const void *haystack, size_t haystack_len, const void *needle, size_t needle_len)
{ if(!haystack_len || !needle_len || needle_len > haystack_len) return NULL; uint8_t u8 = *((uint8_t *)needle); const uint8_t *p = (const uint8_t *) haystack; for(size_t i = haystack_len - needle_len + 1; i; --i) { if(*p == u8 && !memcmp(p, needle, needle_len)) return (void *)p; else p++; } return NULL; } (sorry for posting C code, as I'm only beginning to learn D).