[PATCHES] strpos() && KMP

2007-08-01 Thread Pavel Ajtkulov
Hello, this patch allow to use Knuth-Morrison-Pratt algorithm for strpos() function (see Cormen et al. Introduction to Algorithms, MIT Press, 2001). It also works with multibyte wchar. In worst case current brute force strpos() takes O(n * m) (n && m is length of strings) time (example: 'aaa.

Re: [PATCHES] strpos() && KMP

2007-08-07 Thread Pavel Ajtkulov
> Do you have any performance test results for this? I describe the worst case in first message (search 'aaa..aab' in 'aa..aa', "complete" N^2). It works some msec instead of several sec (in current version). Patch intends for artificial language (for example DNA, or language with small alphabet

Re: [PATCHES] strpos() && KMP

2007-08-08 Thread Pavel Ajtkulov
Tom Lane writes: > I wonder why you didn't propose Boyer-Moore instead, as that would have > some advantage for natural language text as well. > The difficulty with B-M is the need for a table indexed by character > code, which at first glance looks impractical for wchars. But it seems > to me t

Re: [PATCHES] strpos() && KMP

2007-08-10 Thread Pavel Ajtkulov
Tom Lane writes: >> hash table? > I'd think the cost of hashing would render it impractical. Most of the > papers I've seen on this topic worry about getting single instructions > out of the search loop --- a hash lookup will cost lots more than that. > Moreover, you'd lose the guarantee of not-