Re: Question on String#indexOf(String)

2009-04-28 Thread Jim Andreou
Many thanks Mike! All this information is very interesting. Yes, I'm sure that even a small degradation in performance in such a common method would be considered a performance bug (I wouldn't like it either), at least it's good to know that the alternatives have been analyzed. For some people out

Re: Question on String#indexOf(String)

2009-04-28 Thread madbot
The overhead of setting up one of the faster algorithms was only worth it when the skip ahead is large and/or the data to search is large. So you need heuristics to determine when the overhead is worth it. Those heuristics would also have to be fast. Of course you could also leave that decision to

Re: Question on String#indexOf(String)

2009-04-28 Thread Joshua Bloch
I vaguely recall that madbot (Mike McCloskey) did some performance work on this method, and came to the conclusion that more sophisticated algorithms didn't actually pay for themselves in the common cases. I'm copying him so he can confirm or deny. Josh On Tue, Apr 28, 2009 at 3:50 AM,

Re: Question on String#indexOf(String)

2009-04-28 Thread Jim Andreou
2009/4/28 Rémi Forax > Jim Andreou a écrit : > >> Answering my own question, probably most (all?) faster algorithms seem to >> need memory proportional to the size of the alphabet, which is kind of huge >> for Unicode, so that could be the reason. >> > No see: > http://en.wikipedia.org/wiki/Boyer

Re: Question on String#indexOf(String)

2009-04-28 Thread Rémi Forax
Jim Andreou a écrit : Answering my own question, probably most (all?) faster algorithms seem to need memory proportional to the size of the alphabet, which is kind of huge for Unicode, so that could be the reason. No see: http://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string_search_algorithm

Re: Question on String#indexOf(String)

2009-04-28 Thread Jim Andreou
Answering my own question, probably most (all?) faster algorithms seem to need memory proportional to the size of the alphabet, which is kind of huge for Unicode, so that could be the reason. 2009/4/28 Jim Andreou > Hi, > I wonder why String#indexOf(String) is implemented as it is. Apparently, >

Question on String#indexOf(String)

2009-04-28 Thread Jim Andreou
Hi, I wonder why String#indexOf(String) is implemented as it is. Apparently, when a character mismatch with the searched pattern is found, the pattern is only shifted by one character, but there are faster algorithms, for example see http://www.cs.utexas.edu/users/moore/best-ideas/string-searching/