On Jul 8, 2:51 am, Henning Thornblad <[EMAIL PROTECTED]> wrote: > When trying to find an alternative way of solving my problem i found > that running this script: > > #!/usr/bin/env python > > import re > > row="" > for a in range(156000): > row+="a" > print "How many, dude?" > print re.search('/[^ "=]*',row) (the / has moved) > > wouldn't take even a second (The re.search part of course) > > This is for me very strange since this, > in theory, is the same problem as before.
In theory. In practice, it is the same problem *iff* you start at the end of the text and work backwards. Your original pattern can be characterised as X*Y, where X and Y are character classes; your new one is YX*. You are asking each to search a text that contains many Xs and no Ys. The second pattern will cause a naive search engine to examine each character in the text (is it a Y?) as a candidate for the start of a successful match; each test fails, and the whole expedition has cost O(N). OTOH, the first pattern will start at offset 0, cheerfully cruising past all those lovely Xs, but failing (no Y) at the end of the text. It will then think that it's been too greedy, reduce the matched span of X* from N characters to N-1, fail to find Y, N-2, fail, ... so that's O(N) just for the first trial starting from offset 0. Being naive, it will try again starting from offset 1, 2, 3, ... an O(N**2) process. If Y were instead something more complicated, it would be worse than O(N**2). If you were to tell us what you are actually trying to do, we could probably help you with a faster method. BTW, if the text is 'aaa///bbb///ccc', your original search will find 'aaa///bbb///'; if you want it to find 'aaa/', the pattern should be: '[^ "=/]*/' Cheers, John -- http://mail.python.org/mailman/listinfo/python-list