New submission from Ryan Westlund <rlwestl...@gmail.com>:

```
>>> import re
>>> re.sub('a*', '-', 'a')
'--'
>>> re.sub('a*', '-', 'aa')
'--'
>>> re.sub('a*', '-', 'aaa')
'--'
```

Shouldn't it be returning one dash, not two, since the greedy quantifier will 
match all the a's? I understand why substituting on 'b' returns '-a-', but 
shouldn't this constitute only one match? In Python 2.7, it behaves as I expect:

```
>>> re.sub('a*', '-', 'a')
'-'
>>> re.sub('a*', '-', 'aa')
'-'
>>> re.sub('a*', '-', 'aaa')
'-'
```
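
One way to see where the second dash might be coming from is to list the individual matches rather than the substituted result. This is only a diagnostic sketch using `re.finditer`; the variable name `matches` is mine and not part of the report.

```python
import re

# Collect every match of 'a*' in 'aaa' as (start, end, matched text).
matches = [(m.start(), m.end(), m.group()) for m in re.finditer('a*', 'aaa')]

# On Python 3.7 and later this shows the greedy match over all the a's,
# followed by a separate empty match at the end of the string.
print(matches)  # [(0, 3, 'aaa'), (3, 3, '')]
```

If both matches are substituted, that would account for the two dashes.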

I ran into this while trying to normalize a path so that it ends in exactly 
one slash. I used `re.sub('/*$', '/', path)`, but any path that already ended 
in one or more slashes came out ending in two.
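
For the path case, a possible workaround is to avoid a pattern that can match the empty string: strip the trailing slashes first, then append exactly one. The sketch below assumes plain `/`-separated strings; both helper names are hypothetical and chosen only for illustration.

```python
import re

def ensure_one_trailing_slash(path):
    # Hypothetical helper: drop all trailing slashes, then add a single one.
    return path.rstrip('/') + '/'

def ensure_one_trailing_slash_re(path):
    # Same idea with re: '/+$' needs at least one slash, so it never
    # produces an empty match at the end of the string.
    return re.sub('/+$', '', path) + '/'

for p in ['foo', 'foo/', 'foo///']:
    print(ensure_one_trailing_slash(p), ensure_one_trailing_slash_re(p))
```

Both versions turn `'foo'`, `'foo/'` and `'foo///'` into `'foo/'`, and leave a bare `'/'` as `'/'`.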

----------
components: Regular Expressions
messages: 372104
nosy: Yujiri, ezio.melotti, mrabarnett
priority: normal
severity: normal
status: open
title: re.sub treats * incorrectly?
type: behavior
versions: Python 3.10, Python 3.7, Python 3.8

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue41080>
_______________________________________