Paddy <[EMAIL PROTECTED]>: > On Jul 4, 1:36 pm, Peter Otten <[EMAIL PROTECTED]> wrote: >> Henning_Thornblad wrote: >> > What can be the cause of the large difference between re.search and >> > grep? >> >> grep uses a smarter algorithm ;) >> >> >> >> > This script takes about 5 min to run on my computer: >> > #!/usr/bin/env python >> > import re >> >> > row="" >> > for a in range(156000): >> > row+="a" >> > print re.search('[^ "=]*/',row) >> >> > While doing a simple grep: >> > grep '[^ "=]*/' input (input contains 156.000 a in >> > one row) >> > doesn't even take a second. >> >> > Is this a bug in python? >> >> You could call this a performance bug, but it's not common enough in real >> code to get the necessary brain cycles from the core developers. >> So you can either write a patch yourself or use a workaround. >> >> re.search('[^ "=]*/', row) if "/" in row else None >> >> might be good enough. >> >> Peter > > It is not a smarter algorithm that is used in grep. Python RE's have > more capabilities than grep RE's which need a slower, more complex > algorithm.
FWIW, grep itself can confirm this statement. The following command roughly takes as long as Python's re.search: # grep -P '[^ "=]*/' input -P tells grep to use real perl-compatible regular expressions. -- Freedom is always the freedom of dissenters. (Rosa Luxemburg) -- http://mail.python.org/mailman/listinfo/python-list