New submission from Ben Spiller <spiller....@gmail.com>:

These work fine and return instantly:
python -c "import re;  re.compile('.*x').match('y'*(1000*100))"
python -c "import re;  re.compile('x').search('y'*(1000*100))"
python -c "import re;  re.compile('.*x').search('y'*(1000*10))"

This hangs / freezes / livelocks indefinitely, with lots of CPU usage:
python -c "import re;  re.compile('.*x').search('y'*(1000*100))"

Admittedly performing a search() with a pattern starting .* isn't useful, 
however it's worth fixing as:
- it's easily done by inexperienced developers, or users interacting with code 
that's far removed from the actual regex call
- the failure mode of hanging forever (with the GIL held, of course) is quite 
severe (took us a lot of debugging with gdb before we figured out where our 
complex multi-threaded python program was hanging!), and 
- the fact that the behaviour is different based on the length of the string 
being matched suggests there is some kind of underlying bug in how the buffer 
is handled which might also affect other, more reasonable regex use cases
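
To help tell a true hang apart from very slow growth, the same search can be 
timed at smaller string lengths. The following is only a sketch and was not 
part of the original testing: the helper name time_search and the chosen sizes 
are illustrative. If the elapsed time roughly quadruples each time the length 
doubles, that would point towards quadratic backtracking (search() retrying 
.* from every start position) rather than a buffer problem.

```python
import re
from timeit import default_timer  # available on both Python 2.7 and 3.x

# Same pattern as in the report.
PATTERN = re.compile('.*x')


def time_search(n):
    """Search a string of n 'y' characters; return (match, seconds taken)."""
    text = 'y' * n
    start = default_timer()
    result = PATTERN.search(text)
    return result, default_timer() - start


# Illustrative sizes only, kept small so the script finishes quickly.
for n in (1000, 2000, 4000, 8000):
    result, elapsed = time_search(n)
    print('%6d chars: match=%r, %.4f s' % (n, result, elapsed))
```

No match is expected at any size, since the string never contains 'x'; only 
the timings should change.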

----------
components: Regular Expressions
messages: 334949
nosy: benspiller, ezio.melotti, mrabarnett
priority: normal
severity: normal
status: open
title: re.search livelock/hang, searching for patterns starting .* in a large 
string
type: crash
versions: Python 2.7, Python 3.6

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue35915>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
