New submission from dkreeft <[email protected]>:
Steps to reproduce (Windows/Python 3.7.7):
1. Define a replacement string that starts with a digit:
REPLACEMENT = '12345'
2. Use re.sub() as follows:
re.sub(r'([a-z]+)', fr"\1{REPLACEMENT}", 'something')
3. The outcome is not 'something12345' as expected, but 'J345'.
Note that I am using the group in the replacement argument, which is a raw
f-string.
A quick investigation with other replacement strings shows similarly
unexpected behavior:
REPLACEMENT = '1': leads to re.error (invalid group reference 11 at position 1)
REPLACEMENT = '13': 'K'
etc.
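
The outcomes listed above can be reproduced with a short sketch. It assumes
the documented rule for numeric escapes in re replacement templates: a
backslash followed by three octal digits is read as a character code, and a
backslash followed by one or two digits as a group number. The names PATTERN
and substitute are illustrative only and are not part of the original report.

```python
import re

PATTERN = r"([a-z]+)"


def substitute(replacement):
    # The f-string is expanded before re.sub() sees it, so the template
    # is just the two characters '\1' followed by the interpolated digits.
    template = fr"\1{replacement}"
    return re.sub(PATTERN, template, "something")


# Template '\112345': octal escape \112 (chr(0o112) == 'J'), then '345'.
result_12345 = substitute("12345")

# Template '\113': octal escape \113 (chr(0o113) == 'K').
result_13 = substitute("13")

# Template '\11': reference to group 11, which the pattern does not define.
try:
    substitute("1")
    error_message = None
except re.error as exc:
    error_message = str(exc)

print(result_12345, result_13, error_message)
```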
So it seems that the f-string is evaluated first, yielding a replacement
string in which '\1' is immediately followed by digits. The re module then
reads '\1' together with the digits that follow as a single escape, rather
than as group 1 followed by literal text, which leads to the behavior
described above. Even if this is by design, it is confusing and makes using
groups with re.sub() cumbersome when the interpolated value starts with a
digit.
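
One way to avoid the ambiguity, sketched here on the assumption that the
intent is "group 1 followed by the literal text", is the \g<1> form, which the
re documentation describes as equivalent to \1 but unambiguous when digits
follow. A function replacement is another option; it bypasses template parsing
entirely, so backslashes in the interpolated value are not reinterpreted
either. The variable names below are illustrative.

```python
import re

REPLACEMENT = "12345"

# \g<1> ends the group reference at '>', so the digits that follow
# are treated as literal text rather than as part of the escape.
with_named_ref = re.sub(r"([a-z]+)", fr"\g<1>{REPLACEMENT}", "something")

# A callable replacement returns the final text directly; no template
# escapes are processed in its return value.
with_function = re.sub(
    r"([a-z]+)",
    lambda match: match.group(1) + REPLACEMENT,
    "something",
)

print(with_named_ref)
print(with_function)
```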
----------
components: Regular Expressions, Windows
messages: 377669
nosy: dkreeft, ezio.melotti, mrabarnett, paul.moore, steve.dower, tim.golden,
zach.ware
priority: normal
severity: normal
status: open
title: Unexpected behavior re.sub() with raw f-strings
type: behavior
versions: Python 3.7
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue41885>
_______________________________________