New submission from Rick Otten:
The documentation states that "|" parsing goes from left to right. This
doesn't seem to be true when spaces (or \s) are involved.
Example:
In [40]: mystring
Out[40]: 'rwo incorporated'
In [41]: re.sub('incorporated| inc|llc|corporation|corp| co', '', mystring)
Out[41]: 'rwoorporated'
In this case " inc" was processed before incorporated.
If I take the space out:
In [42]: re.sub('incorporated|inc|llc|corporation|corp| co', '', mystring)
Out[42]: 'rwo '
Here "incorporated" is processed first.
If I put a space with each, then " incorporated" is processed first:
In [43]: re.sub(' incorporated| inc|llc|corporation|corp| co', '', mystring)
Out[43]: 'rwo'
And if I use \s instead of a space, the \s alternative is processed first:
In [44]: re.sub('incorporated|\sinc|llc|corporation|corp| co', '', mystring)
Out[44]: 'rwoorporated'
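One possible reading of these results, offered as a sketch and not as a statement from the tracker: the regex engine scans the string from left to right and, at each starting position, tries the "|" alternatives in order. An alternative that begins with a space can therefore match one character earlier than "incorporated" can. The snippet below uses re.search to show where the first match starts for two of the patterns above; the variable names are only illustrative.

```python
import re

mystring = 'rwo incorporated'

# Same patterns as In [41] and In [42] above.
pattern_with_space = 'incorporated| inc|llc|corporation|corp| co'
pattern_no_space = 'incorporated|inc|llc|corporation|corp| co'

# With ' inc' in the pattern, a match is possible at index 3 (the space),
# which comes before index 4, where 'incorporated' starts.
m1 = re.search(pattern_with_space, mystring)
print(m1.start(), repr(m1.group()))

# Without the leading space, nothing matches at index 3, so the first
# match starts at index 4 and 'incorporated' (listed first) wins there.
m2 = re.search(pattern_no_space, mystring)
print(m2.start(), repr(m2.group()))
```

If that reading is right, the left-to-right ordering of alternatives only decides between alternatives that can match at the same starting position.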
----------
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue23532>
_______________________________________