[issue23532] regex | behavior differs from documentation

2015-02-26 Thread Rick Otten
Changes by Rick Otten rottenwindf...@gmail.com: -- components: Regular Expressions nosy: Rick Otten, ezio.melotti, mrabarnett priority: normal severity: normal status: open title: regex | behavior differs from documentation type: behavior versions: Python 2.7

[issue23532] regex | behavior differs from documentation

2015-02-26 Thread Mark Shannon
Mark Shannon added the comment: This looks like the expected behaviour to me. re.sub matches the leftmost occurrence and the regular expression is greedy so (x|xy) will always match xy if it can. -- nosy: +Mark.Shannon
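
A minimal sketch of the leftmost rule described in this comment, using Python's standard re module. The shortened pattern and the x/xy strings are illustrative assumptions, not taken from the report. It also shows that, at one and the same position, re tries the alternatives in the order they are written, so it is the earlier start position that decides the reported case.

```python
import re

# Leftmost start position wins: ' inc' can start at index 3 (the space),
# while 'incorporated' could only start at index 4.
m = re.search(r'incorporated| inc', 'rwo incorporated')
print(m.start(), repr(m.group()))

# At a single position, alternatives are tried in written order
# and the first one that matches is accepted.
first_listed = re.search(r'x|xy', 'xy').group()
longer_first = re.search(r'xy|x', 'xy').group()
print(first_listed, longer_first)
```

Listing the longer alternative first (xy|x) is the usual way to make it win when both can match at the same position.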

[issue23532] regex | behavior differs from documentation

2015-02-26 Thread Rick Otten
Rick Otten added the comment: Can the documentation be updated to make this clearer? I see now where the clause "As the target string is scanned, ..." is describing what you have listed here. A coworker and I both read the description several times and missed that. I thought it first tried

[issue23532] regex | behavior differs from documentation

2015-02-26 Thread Matthew Barnett
Matthew Barnett added the comment: @Mark is correct; it's not a bug. In the first example:
It tries to match each alternative at position 0. Failure.
It tries to match each alternative at position 1. Failure.
It tries to match each alternative at position 2. Failure.
It tries to match each
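
The position-by-position scan described in this comment can be sketched as follows. The helper trace_alternation is hypothetical and only imitates the behaviour for literal alternatives; it is not how the re engine is implemented. The string and pattern are the ones from the original report, and passing mystring as the third argument to re.sub is an assumption, since the reported call is cut off.

```python
import re

def trace_alternation(alternatives, text):
    """Return (position, alternative) for the first literal alternative
    that matches, trying every alternative at each position in turn."""
    compiled = [(alt, re.compile(re.escape(alt))) for alt in alternatives]
    for pos in range(len(text) + 1):
        for alt, rx in compiled:
            # Pattern.match(string, pos) anchors the attempt at index pos.
            if rx.match(text, pos):
                return pos, alt
    return None

alternatives = ['incorporated', ' inc', 'llc', 'corporation', 'corp', ' co']
mystring = 'rwo incorporated'

# Positions 0, 1 and 2 fail for every alternative; at position 3 (the
# space) 'incorporated' fails but ' inc' succeeds.
first = trace_alternation(alternatives, mystring)
print(first)

# Assumed completion of the reported call: substitute in mystring.
result = re.sub('|'.join(alternatives), '', mystring)
print(result)
```

Moving 'incorporated' earlier in the pattern does not help here, because it can never match at position 3; one option is to let the pattern consume the optional leading space itself, for example with ' ?incorporated'.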

[issue23532] regex | behavior differs from documentation

2015-02-26 Thread Rick Otten
New submission from Rick Otten: The documentation states that | parsing goes from left to right. This doesn't seem to be true when spaces (or \s) are involved. Example:
In [40]: mystring
Out[40]: 'rwo incorporated'
In [41]: re.sub('incorporated| inc|llc|corporation|corp| co', '',