Jim Andreou wrote:
Answering my own question: most (all?) faster algorithms seem to need memory proportional to the size of the alphabet, which is huge for Unicode, so that could be the reason.
No, see:
http://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string_search_algorithm

And this algorithm is currently implemented in the regex package, java.util.regex.
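
To illustrate why the skip table need not grow with the alphabet, here is a minimal sketch of the Horspool simplification of Boyer-Moore. It is not the code from java.util.regex; the class name HorspoolSketch and the 128-entry table size are assumptions made for this example. Characters are folded into the table by their low 7 bits, and characters that collide simply share the smaller shift, which costs speed but never skips a match.

```java
import java.util.Arrays;

final class HorspoolSketch {
    // Assumed table size for this sketch: fixed, independent of the alphabet.
    private static final int TABLE_SIZE = 128;

    /** Returns the index of the first occurrence of pattern in text, or -1. */
    static int indexOf(String text, String pattern) {
        int n = text.length();
        int m = pattern.length();
        if (m == 0) return 0;
        if (m > n) return -1;

        // shift[h] = how far the window may safely move when the text char
        // under the pattern's last position folds to slot h.
        int[] shift = new int[TABLE_SIZE];
        Arrays.fill(shift, m);
        for (int i = 0; i < m - 1; i++) {
            // Later (rightmost) occurrences overwrite earlier ones, so each
            // slot ends up with the smallest shift of all chars that map to it.
            shift[pattern.charAt(i) & (TABLE_SIZE - 1)] = m - 1 - i;
        }

        int pos = 0;
        while (pos <= n - m) {
            // Compare right to left, as in Boyer-Moore.
            int j = m - 1;
            while (j >= 0 && text.charAt(pos + j) == pattern.charAt(j)) {
                j--;
            }
            if (j < 0) return pos;
            // Mismatch: skip ahead based on the last char of the window.
            pos += shift[text.charAt(pos + m - 1) & (TABLE_SIZE - 1)];
        }
        return -1;
    }
}
```

For a pattern with no repeated slots, a mismatch usually moves the window by the full pattern length instead of by one character, while the table stays at 128 ints for any Unicode input.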

Rémi

2009/4/28 Jim Andreou <[email protected]>

    Hi,

    I wonder why String#indexOf(String) is implemented as it is.
    Apparently, when a character mismatch with the searched pattern is
    found, the pattern is only shifted by one character, but there are
    faster algorithms; for example, see
    http://www.cs.utexas.edu/users/moore/best-ideas/string-searching/index.html.
    Was anything smarter tried out but had significant disadvantages
    for general use? What advantages does the current implementation
    have? It looks very pessimistic.

    Regards,
    Dimitris Andreou


