Hi,
Sorry for taking so long to answer, but I'm trying to catch up last stuff.
It's known that usually to optimize things for longer inputs you usually end
up making things for short inputs worst. So IMHO, I think you should have
the len==1 optimization and then use the KMP algorithm. Your implementation
can be further optimized (at least on 32 bits machines), but seems ok for
now.
I suggest you to produce a final patch, send it here, and then move on to
optimize other things (like strtr() that I'm sure wikipedia guys will
appreciate).
Nuno
----- Original Message -----
Hello again
I have setup small page where I am publishing all the profiling and
evaluation results. It is here: http://83.168.205.202/~michal/ standard15/
So far I have put there function usage profile, zend_memnstr analysis,
stripos and strrpos analysis including some charts etc. CVS diffs where
applicable are also posted along with comparison of original and patched
code.
Michal
On 2008-06-11, at 09:47, Stanislav Malyshev wrote:
Hi!
Here are some statistics:
- average haystack length: 624.2
- average needle length: 1.9 ! -> 63% of needles of length 1
- avg length of haystacks shorter than avg: 41.0 -> 85% of all
haystacks
- avg length of haystacks longer than avg: 5685.11
I think it would be interesting to see same excluding 1-char needles
since in this case it should do one-char lookup (btw, if we don't do it
on C level, it might be a good idea to).
Although strpos implements fix for that, some other functions don't. My
idea
is than to implement ZEND_MEMNSTR once again in shape:
if (needle_len = 1)
here just linear sweep
else if haystack_len < 5000 (5000 is arbitrary - maybe some more tests
needed to choose good value)
original implementation (as it is the best one in this case)
else
BM/KMP (i think BM will be better in this case, as some people
suggested)
I'm not sure very big haystacks really worth the trouble - how many of
them are used? It may be interesting to see medians instead of averages
for that. But len=1 I think worth having special case.
--
Stanislav Malyshev, Zend Software Architect
[EMAIL PROTECTED] http://www.zend.com/
(408)253-8829 MSN: [EMAIL PROTECTED]
--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php