On 12/4/2017 6:21 PM, MRAB wrote:
> I've finally come to a conclusion as to what the "correct" behaviour of
> zero-width matches should be: """always return the first match, but
> never a zero-width match that is joined to a previous zero-width match""".

Is this different from current re or regex?

> If it's about to return a zero-width match that's joined to a previous
> zero-width match, then backtrack and keep on looking for a match.
>
> Example:
>
> >>> print([m.span() for m in re.finditer(r'|.', 'a')])
> [(0, 0), (0, 1), (1, 1)]
>
> re.findall, re.split and re.sub should work accordingly.
>
> If re.finditer finds n matches, then re.split should return a list of
> n+1 strings and re.sub should make n replacements (excepting maxsplit,
> etc.).
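
The n / n+1 relationship can be checked directly. The following is only a
sketch: the helper name zero_width_summary is made up for illustration, the
'-' replacement string is arbitrary, and it assumes a Python version whose
re module follows the proposed rule (3.7 or later, as far as I can tell) and
a pattern without capturing groups, since groups add extra items to the
re.split result.

```python
import re

def zero_width_summary(pattern, text):
    """Collect what finditer, split and subn each report for one input.

    Hypothetical helper for illustration; not part of the re module.
    """
    spans = [m.span() for m in re.finditer(pattern, text)]
    pieces = re.split(pattern, text)
    result, count = re.subn(pattern, '-', text)
    return spans, pieces, result, count

# The first pair is the example from the message; the second is an
# ordinary pattern that can also match the empty string.
for pattern, text in [(r'|.', 'a'), (r'x*', 'abxd')]:
    spans, pieces, result, count = zero_width_summary(pattern, text)
    print(repr(pattern), repr(text))
    print('  matches:     ', len(spans), spans)
    print('  split pieces:', len(pieces), pieces)
    print('  replacements:', count, repr(result))
```

If the three functions agree, the number of split pieces is one more than
the number of matches and the replacement count equals it, with no maxsplit
or count limit given.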
--
Terry Jan Reedy