New submission from Evgeny Kapun:
This pattern matches:
re.match('(?:()|(?(1)()|z)){2}(?(2)a|z)', 'a')
But this doesn't:
re.match('(?:()|(?(1)()|z)){0,2}(?(2)a|z)', 'a')
The only difference is that {2} is replaced by {0,2}. Since {0,2} allows every
repetition count that {2} allows, this change shouldn't prevent the pattern
from matching anywhere it matched before.
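A minimal sketch for reproducing the comparison above, in case it helps anyone checking other interpreter versions. The names FIXED, RANGED and check are illustrative only and are not part of the re module. The result may differ between Python versions, so the script only reports what it observes.

```python
import re

# The two patterns from the report; they differ only in the quantifier.
FIXED = r'(?:()|(?(1)()|z)){2}(?(2)a|z)'
RANGED = r'(?:()|(?(1)()|z)){0,2}(?(2)a|z)'


def check(pattern, text='a'):
    """Return True if `pattern` matches at the start of `text`."""
    return re.match(pattern, text) is not None


# Map each quantifier to the observed outcome on the input 'a'.
results = {
    '{2}': check(FIXED),
    '{0,2}': check(RANGED),
}

for quantifier, matched in results.items():
    print(quantifier, 'matches' if matched else 'does not match')
```

On the version in this report (Python 3.4), the {2} variant matches and the {0,2} variant does not.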
The reason for this misbehavior is a feature designed to protect the re
engine from infinite loops, which in fact sometimes prevents patterns from
matching where they should. I think this feature should at least be properly
documented. By "properly" I mean that it should be possible to reconstruct the
exact behavior from the documentation, as the implementation is not
particularly easy to understand.
----------
components: Regular Expressions
messages: 238330
nosy: abacabadabacaba, ezio.melotti, mrabarnett
priority: normal
severity: normal
status: open
title: Undocumented feature prevents re module from finding certain matches
type: behavior
versions: Python 3.4
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue23692>
_______________________________________