Tor Erik <[EMAIL PROTECTED]> wrote:

> Alex Martelli wrote:
> > Tor Erik <[EMAIL PROTECTED]> wrote:
> >
> >> I would be surprised if it is the naive:
> >
> > Yep -- it's "a mix between Boyer-Moore and Horspool with a few more
> > bells and whistles on the top", as documented and implemented in
> > Objects/stringlib/fastsearch.h in the Python sources and well discussed
> > and explained at http://effbot.org/zone/stringlib.htm .
> >
> >
> > Alex
>
> Ok. Two questions:
>
> 1. Is "a in b" simply an alias for "b.find(a)"?

The 'in' operator can be slightly better optimized, but the two share the
underlying algorithm (in 2.5).

> 2. Is this algorithm exclusive to Python 2.5, or is it contained in 2.4
> as well?

It's a 2.5 novelty.  Look at the performance on the same machine (my
2.0 GHz MBP, MacOSX 10.4.7):

brain:~ alex$ python2.4 -mtimeit -s'x="foo";y="bar"*99+x+"baz"*77' 'x in y'
100000 loops, best of 3: 9.04 usec per loop
brain:~ alex$ python2.4 -mtimeit -s'x="foo";y="bar"*99+x+"baz"*77' 'y.find(x)!=-1'
100000 loops, best of 3: 2.01 usec per loop
brain:~ alex$ python2.5 -mtimeit -s'x="foo";y="bar"*99+x+"baz"*77' 'x in y'
1000000 loops, best of 3: 0.452 usec per loop
brain:~ alex$ python2.5 -mtimeit -s'x="foo";y="bar"*99+x+"baz"*77' 'y.find(x)!=-1'
1000000 loops, best of 3: 0.842 usec per loop

In 2.4, find was way faster than 'in' -- now they share algorithms, and
'in' can be more optimized (no need to track ``where'' it finds a match,
so to speak;-).  So, in this example, find is over twice as fast as it
used to be, and 'in' is about 20 times as fast as it used to be.  It gets
even better if you look at larger and larger strings, e.g...:

brain:~ alex$ python2.4 -mtimeit -s'x="foo"*123;y="bar"*999+x+"baz"*777' 'x in y'
10000 loops, best of 3: 91.9 usec per loop
brain:~ alex$ python2.5 -mtimeit -s'x="foo"*123;y="bar"*999+x+"baz"*777' 'x in y'
100000 loops, best of 3: 3.01 usec per loop

Here, we're going _30_ times as fast, not "just" 20;-).


Alex
-- 
http://mail.python.org/mailman/listinfo/python-list
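
For anyone who wants to repeat the 'x in y' versus 'y.find(x)!=-1'
comparison from a script instead of the -mtimeit command line, here is a
minimal sketch.  It assumes a much newer interpreter than the ones timed
above (Python 3.5 or later, for the globals argument of timeit.repeat), and
compare_membership is a hypothetical helper name, not anything from the
standard library.  The absolute numbers will differ by machine and version;
only the relative comparison is of interest.

```python
import timeit


def compare_membership(needle, haystack, number=100000, repeat=3):
    """Time 'x in y' and 'y.find(x) != -1' for the given strings.

    Returns a dict mapping a label to the best-of-`repeat` time per
    loop, in microseconds (the same unit -mtimeit reports above).
    """
    # The timed statements refer to the strings as x and y, matching
    # the names used in the command-line examples.
    env = {"x": needle, "y": haystack}
    results = {}
    for label, stmt in (("in", "x in y"), ("find", "y.find(x) != -1")):
        # timeit.repeat returns total seconds for `number` loops, once
        # per repetition; take the minimum, as -mtimeit does.
        best = min(timeit.repeat(stmt, globals=env, number=number, repeat=repeat))
        results[label] = best / number * 1e6
    return results


if __name__ == "__main__":
    x = "foo"
    y = "bar" * 99 + x + "baz" * 77
    for label, usec in compare_membership(x, y).items():
        print("%-4s %.3f usec per loop" % (label, usec))
```

Passing x = "foo" * 123 and y = "bar" * 999 + x + "baz" * 777 instead
reproduces the setup of the larger-string case.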