Serhiy Storchaka <[email protected]> added the comment:
There was a bug in the regular expression engine that caused re.split() to
work incorrectly with zero-width patterns. Note that in your example
_DIGIT_BOUNDARY_RE.split("10.0.0") returns ['10.0.0'] on Python 2.7, which is
probably not the result you expected.
It was impossible to fix that bug without changing the behavior of other
functions in corner cases and breaking existing code. So we first made
re.split() raise an exception instead of returning a nonsensical result, and
added warnings for some other cases to help users catch potential bugs in
their code and avoid ambiguous patterns. This is what you see in 3.6. In 3.7
we fixed the underlying bug. That broke some user code, but it made regular
expressions more consistent in the long run and made zero-width patterns more
usable.
In your particular case, if you still need to support Python 2.7 and 3.6, try
using re.split() with the pattern r'(\D+)' or r'(\d+)'. The parentheses are
significant here: they keep the matched separators in the result. This gives
almost the same result, except for possible leading and trailing empty
strings.
----------
nosy: +serhiy.storchaka
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue43222>
_______________________________________