[issue32308] Replace empty matches adjacent to a previous non-empty match in re.sub()

2020-04-16 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: If the behavior is obviously wrong (like in issue25054), we can fix it without warnings, and even backport the fix to older versions, because we do not expect that anybody depends on such weird behavior. If we are going to change the behavior, but expect

[issue32308] Replace empty matches adjacent to a previous non-empty match in re.sub()

2020-04-16 Thread Mark Borgerding
Mark Borgerding added the comment: @serhiy.storchaka Thanks for the link to issue25054 to clarify this change was not done solely for aesthetics. Hopefully that will mollify others like me who find their way to this discussion as they try to figure out why their code broke with a new

[issue32308] Replace empty matches adjacent to a previous non-empty match in re.sub()

2020-04-16 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: The former implementation was wrong. See issue25054, which contains more obvious examples of that bug:
    >>> re.sub(r"\b|:+", "-", "a::bc")
    '-a-:-bc-'
Not all colons were replaced despite the fact that the pattern matches all colons.
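For comparison, the same call can be re-run on a current interpreter. This is a minimal sketch, assuming Python 3.7 or later; the variable names are only illustrative. It lists the matches that re.findall() reports and checks that re.sub() now replaces each of them, so no colon survives.

```python
import re

# Pattern and input taken from the comment above.
PATTERN = r"\b|:+"
TEXT = "a::bc"

# Matches as reported by findall(): empty matches at the word
# boundaries plus one non-empty match covering both colons.
matches = re.findall(PATTERN, TEXT)

# On Python 3.7+, sub() replaces every one of those matches.
result = re.sub(PATTERN, "-", TEXT)

print(matches)  # ['', '', '::', '', '']
print(result)   # '-a---bc-'
```

The result contains five dashes, one per match, instead of the four dashes and one leftover colon shown for the older implementation.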

[issue32308] Replace empty matches adjacent to a previous non-empty match in re.sub()

2020-04-16 Thread Mark Borgerding
Mark Borgerding added the comment: So third-party code was knowingly broken to satisfy an aesthetic notion that substitution should be more like iteration. Would not a FutureWarning have been a kinder way to stage this implementation? A foolish consistency, indeed.

[issue32308] Replace empty matches adjacent to a previous non-empty match in re.sub()

2020-01-20 Thread Anders Hovmöller
Anders Hovmöller added the comment: We were also bitten by this. In fact we still run a compatibility shim in production where we log if the new and old behavior are different. We also didn't think this "bug fix" made sense or was treated with the appropriate gravity in the release notes.
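The shim mentioned in this comment is not shown in the thread. As a rough illustration of the idea only, the sketch below flags calls where an empty match directly follows a non-empty one, which is the case the change affects. It assumes Python 3.7 or later; the names has_adjacent_empty_match, checked_sub and the logger name are hypothetical, and the check is a heuristic, not an exact emulation of the pre-3.7 engine.

```python
import logging
import re

# Hypothetical logger name, chosen for this example only.
logger = logging.getLogger("resub_compat")


def has_adjacent_empty_match(pattern, text):
    """Return True if an empty match directly follows a non-empty match."""
    prev = None
    for m in re.finditer(pattern, text):
        if (
            prev is not None
            and m.start() == m.end()        # current match is empty
            and prev.end() > prev.start()   # previous match is non-empty
            and m.start() == prev.end()     # and they touch
        ):
            return True
        prev = m
    return False


def checked_sub(pattern, repl, text):
    """Call re.sub(), logging when the 3.7 change is likely to matter."""
    if has_adjacent_empty_match(pattern, text):
        logger.warning(
            "re.sub(%r, ...) on %r may differ from pre-3.7 behavior",
            pattern, text,
        )
    return re.sub(pattern, repl, text)
```

A call such as checked_sub(r"(.*)", "foo", "asd") would log a warning, while checked_sub(r"(.+)", "foo", "asd") would not.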

[issue32308] Replace empty matches adjacent to a previous non-empty match in re.sub()

2020-01-20 Thread David Barnett
David Barnett added the comment: We were also bitten by this behavior change in https://github.com/google/vroom/issues/110. I'm kinda baffled by the new behavior and assumed it had to be an accidental regression, but I guess not. If you have any other context on the BDFL conversation and

[issue32308] Replace empty matches adjacent to a previous non-empty match in re.sub()

2019-04-12 Thread Matthew Barnett
Matthew Barnett added the comment: Consider re.findall(r'.{0,2}', 'abcde'). It finds 'ab', then continues where it left off to find 'cd', then 'e'. It can also find ''; re.match(r'.*', '') does match, after all. It could, in fact, find an infinite number of ''. And what about re.match(r'()*',
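The findall() behavior described here is easy to check directly. A small sketch with illustrative variable names: each search resumes where the previous match ended, and exactly one empty match is reported at the end of the string, because the engine does not report a second empty match at the same position.

```python
import re

# Two characters at a time, then the final empty match at the end.
chunks = re.findall(r".{0,2}", "abcde")

# The whole string, then one empty match at the end.
whole = re.findall(r".*", "abcde")

print(chunks)  # ['ab', 'cd', 'e', '']
print(whole)   # ['abcde', '']
```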

[issue32308] Replace empty matches adjacent to a previous non-empty match in re.sub()

2019-04-12 Thread Anders Hovmöller
Anders Hovmöller added the comment: That might be true, but that seems like a weak argument. If anything, it means those others are broken. What is the logic behind "(.*)" returning the entire string (which is what you asked for) and exactly one empty string? Why not two empty strings? 3? 4?

[issue32308] Replace empty matches adjacent to a previous non-empty match in re.sub()

2019-04-11 Thread Matthew Barnett
Matthew Barnett added the comment: It's now consistent with Perl, PCRE and .Net (C#), as well as re.split(), re.sub(), re.findall() and re.finditer().

[issue32308] Replace empty matches adjacent to a previous non-empty match in re.sub()

2019-04-11 Thread Anders Hovmöller
Anders Hovmöller added the comment: Just as a comparison, sed does the 3.6 thing:
    > echo foo | sed 's/\(.*\)/x\1y/g'
    xfooy

[issue32308] Replace empty matches adjacent to a previous non-empty match in re.sub()

2019-04-11 Thread Anders Hovmöller
Anders Hovmöller added the comment: This was a really bad idea in my opinion. We just found this and we have no way to know how this will impact production. It's really absurd that re.sub('(.*)', r'foo', 'asd') is 'foo' in Python 1 to 3.6 but 'foofoo' in Python 3.7.
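For code affected by this particular call, two ways to get a single replacement on Python 3.7 and later are sketched below. Neither is proposed in the thread; both are offered as possible adjustments, and they differ on empty input.

```python
import re

text = "asd"

# Python 3.7+: '(.*)' matches 'asd' and then the empty string at the end.
doubled = re.sub(r"(.*)", "foo", text)

# Option 1: require at least one character, so there is no empty match.
with_plus = re.sub(r"(.+)", "foo", text)

# Option 2: keep the pattern but allow only one replacement.
with_count = re.sub(r"(.*)", "foo", text, count=1)

# The two options disagree when the input is empty.
plus_on_empty = re.sub(r"(.+)", "foo", "")
count_on_empty = re.sub(r"(.*)", "foo", "", count=1)

print(doubled, with_plus, with_count)          # foofoo foo foo
print(repr(plus_on_empty), repr(count_on_empty))  # '' 'foo'
```

Which option fits depends on whether an empty input should still produce the replacement text.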

[issue32308] Replace empty matches adjacent to a previous non-empty match in re.sub()

2018-01-04 Thread Serhiy Storchaka
Change by Serhiy Storchaka: resolution: -> fixed; stage: patch review -> resolved; status: open -> closed

[issue32308] Replace empty matches adjacent to a previous non-empty match in re.sub()

2018-01-04 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: New changeset fbb490fd2f38bd817d99c20c05121ad0168a38ee by Serhiy Storchaka in branch 'master': bpo-32308: Replace empty matches adjacent to a previous non-empty match in re.sub(). (#4846)

[issue32308] Replace empty matches adjacent to a previous non-empty match in re.sub()

2017-12-26 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Could anybody please make a review of at least the documentation part?

[issue32308] Replace empty matches adjacent to a previous non-empty match in re.sub()

2017-12-13 Thread Serhiy Storchaka
Change by Serhiy Storchaka: keywords: +patch; pull_requests: +4734; stage: -> patch review

[issue32308] Replace empty matches adjacent to a previous non-empty match in re.sub()

2017-12-13 Thread Serhiy Storchaka
New submission from Serhiy Storchaka: Currently re.sub() replaces empty matches only when not adjacent to a previous match. This makes it inconsistent with re.findall() and re.finditer() which find empty matches adjacent to a previous non-empty match and with
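The consistency this submission asks for can be observed on Python 3.7 and later, where the change has landed. A minimal sketch with an illustrative pattern: the spans reported by finditer() line up one-to-one with the replacements counted by subn(), including the empty match at position 3 that directly follows the non-empty match 'x'.

```python
import re

pattern = re.compile(r"x*")
text = "abxd"

# Every match finditer() reports, as (start, end) pairs.
spans = [m.span() for m in pattern.finditer(text)]

# subn() returns the new string and the number of replacements made.
replaced, count = pattern.subn("-", text)

print(spans)     # [(0, 0), (1, 1), (2, 3), (3, 3), (4, 4)]
print(replaced)  # '-a-b--d-'
print(count)     # 5
```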