New submission from Caleb Rouleau:
Version info: 2.7.1 (r271:86832, Feb 7 2011, 11:33:02) [MSC v.1500 64 bit
(AMD64)]
The program included never prints done because it never returns from
re.match().
-- Caleb Rouleau
--
components: Regular Expressions
files: RegexBug.py
messages:
Matthew Barnett added the comment:
That's because it uses a pathological regular expression (catastrophic
backtracking).
The problem lies here: (\\?[\w\.\-]+)+
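
A minimal sketch of why that fragment is pathological. The full regexp and test string live in RegexBug.py, which is not reproduced in this thread, so the anchored pattern, the helper name time_failed_match and the subject strings below are illustrative assumptions, not the reporter's code. The idea is that a run of word characters followed by a character nothing can match can be split between repetitions of the group in exponentially many ways, and the engine tries them all before giving up.

```python
import re
import time

# Assumption: only the nested-quantifier fragment quoted above, anchored
# with ^ and $ so that a trailing space forces the overall match to fail.
NESTED = re.compile(r'^(\\?[\w\.\-]+)+$')


def time_failed_match(n):
    """Match n word characters followed by a space; return (result, seconds)."""
    subject = 'a' * n + ' '          # the space cannot be matched by the group
    start = time.perf_counter()
    result = NESTED.match(subject)   # fails only after exhaustive backtracking
    elapsed = time.perf_counter() - start
    return result, elapsed


# Keep n small: each extra character roughly doubles the work.
for n in (8, 12, 16):
    result, elapsed = time_failed_match(n)
    print(n, result, '%.4f s' % elapsed)
```

On a string that does match (for example 'a' * 16 with no trailing space) the same pattern returns immediately, which is why the problem only shows up on non-matching input.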
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue15515
Tim Peters added the comment:
Matthew is right: the nested quantifiers can cause this to take a very long
time when the regexp doesn't match. Note that the example cannot match,
because nothing in the regexp can match the space before warning in the
example string. But the nested
Caleb Rouleau added the comment:
Thanks for the help. Apologies for the poor understanding of regular
expressions. Closing this issue.
--
status: open -> closed
Serhiy Storchaka added the comment:
Distinguish between a large number and infinity. You have a bad regexp: the
matching time depends exponentially on the length of the string. Try with
short strings. Use the regexp
r'(\w:)(\\?[\w\.\-]+)((\\[\w\.\-]+)*)(\.[\w ]+):'.
It's not a bug.
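
A rough sketch of how the suggested pattern behaves. The raw-string quoting, the name REWRITTEN and both subject strings are assumptions made for illustration, since the original input from RegexBug.py is not shown in this thread. The point of the rewrite is that every repetition after the first path component must start with a literal backslash, so a run of word characters can no longer be divided between repetitions in many different ways.

```python
import re

# Assumption: the suggested regexp, written as a Python raw string.
REWRITTEN = re.compile(r'(\w:)(\\?[\w\.\-]+)((\\[\w\.\-]+)*)(\.[\w ]+):')

# Hypothetical compiler-style message that the pattern is meant to accept.
good = r'C:\dir\sub\file.txt: warning'
m = REWRITTEN.match(good)
print(m.groups() if m else None)

# Hypothetical non-matching input: a long run of word characters followed
# by a space, the kind of subject that stalls the nested-quantifier form.
bad = 'C:\\' + 'a' * 5000 + ' warning'
print(REWRITTEN.match(bad))   # fails without exponential backtracking
```

The rewritten pattern can still backtrack, but the amount of work grows polynomially with the input length rather than doubling with each extra character.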
Matthew Barnett added the comment:
It's probably inappropriate for me to mention that the alternative 'regex'
module on PyPI completes promptly, so I won't. :-)
Tim Peters added the comment:
Matthew, yes, PyPI's regex module implements regular expressions in the
"computer science" (as opposed to POSIX) sense. See Friedl's book for a full
explanation. Short course is that regex's flavor of regexp matching is
linear-time, but cannot support advanced