Jim Andreou wrote:
Answering my own question: most (all?) faster algorithms seem to need memory proportional to the size of the alphabet, which is huge for Unicode, so that could be the reason.
No, see: http://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string_search_algorithm
This algorithm is currently implemented in the regex package, java.util.regex.
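
To illustrate why the table does not have to grow with the alphabet, here is a minimal sketch of the Horspool simplification of Boyer-Moore (not the full algorithm, and not a copy of the java.util.regex code). The class name HorspoolSketch and the 256-slot table size are my own assumptions. Characters are folded into the table by masking their low bits, and characters that collide in one slot simply share the smaller shift. That is always safe, because a shift that is too small can only cost speed, never skip a match.

```java
final class HorspoolSketch {
    // Assumption: a fixed table of 256 slots, whatever the alphabet size.
    // Any power of two works; smaller tables just mean more collisions.
    private static final int TABLE_SIZE = 256;

    /** Returns the index of the first occurrence of pattern in text, or -1. */
    static int indexOf(String text, String pattern) {
        int n = text.length();
        int m = pattern.length();
        if (m == 0) return 0;
        if (m > n) return -1;

        // Default shift: the whole pattern length.
        int[] shift = new int[TABLE_SIZE];
        java.util.Arrays.fill(shift, m);

        // For every pattern char except the last, record its distance
        // to the end of the pattern. Colliding chars keep the minimum.
        for (int i = 0; i < m - 1; i++) {
            int slot = pattern.charAt(i) & (TABLE_SIZE - 1);
            shift[slot] = Math.min(shift[slot], m - 1 - i);
        }

        int pos = 0;
        while (pos <= n - m) {
            // Compare right to left at the current alignment.
            int j = m - 1;
            while (j >= 0 && text.charAt(pos + j) == pattern.charAt(j)) {
                j--;
            }
            if (j < 0) return pos;
            // Shift by the table entry for the text char that is aligned
            // with the last pattern position (always at least 1).
            pos += shift[text.charAt(pos + m - 1) & (TABLE_SIZE - 1)];
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(indexOf("hello world", "world"));
        System.out.println(indexOf("hello world", "xyz"));
    }
}
```

The extra memory is one int array of fixed size per search, so Unicode does not make it worse. The price is that shifts get shorter when many pattern characters fall into the same slot.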
Rémi
2009/4/28 Jim Andreou <[email protected]>:

Hi,

I wonder why String#indexOf(String) is implemented as it is. Apparently, when a character mismatch with the searched pattern is found, the pattern is only shifted by one character, but there are faster algorithms; for example, see http://www.cs.utexas.edu/users/moore/best-ideas/string-searching/index.html.

Was anything smarter tried and found to have significant disadvantages for general use? What advantages does the current implementation have? It looks very pessimistic.

Regards,
Dimitris Andreou
