R. David Murray added the comment:
Only group the stuff you want to see in the result:
>>> re.split(r'(^>.*$)', '>Homo sapiens catenin (cadherin-associated)')
['', '>Homo sapiens catenin (cadherin-associated)', '']
>>> re.split(r'^(>.*)$', '>Homo sapiens catenin (cadherin-associated)')
['', '>Homo sapiens catenin (cadherin-associated)', '']
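To make the effect of grouping concrete, here is a small sketch on a made-up string (the string and variable names are only for illustration). Each capturing group in the pattern adds one item per match to the result, and a group that did not take part in a given match shows up as None:

```python
import re

# Two capturing groups: every match contributes two items,
# one per group, with None for the group that did not match.
with_two_groups = re.split(r'(a)|(b)', 'xaybz')
print(with_two_groups)   # ['x', 'a', None, 'y', None, 'b', 'z']

# One group around the whole alternation: one item per match.
with_one_group = re.split(r'(a|b)', 'xaybz')
print(with_one_group)    # ['x', 'a', 'y', 'b', 'z']

# No capturing groups: the separators are dropped entirely.
without_groups = re.split(r'a|b', 'xaybz')
print(without_groups)    # ['x', 'y', 'z']
```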
If you are using grouping to get alternatives, you can use a non-capturing
group:
>>> re.split(r'(ca(?:t|d))', '>Homo sapiens catenin (cadherin-associated)')
['>Homo sapiens ', 'cat', 'enin (', 'cad', 'herin-associated)']
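For contrast, this is what I would expect from the same split when the inner alternation is left as a capturing group, and when no group is used at all (variable names here are just for the example):

```python
import re

text = '>Homo sapiens catenin (cadherin-associated)'

# Inner group is capturing: its text ('t' or 'd') is added
# to the result after each full separator.
capturing = re.split(r'(ca(t|d))', text)
print(capturing)
# ['>Homo sapiens ', 'cat', 't', 'enin (', 'cad', 'd', 'herin-associated)']

# No capturing groups at all: the separators are not returned.
no_groups = re.split(r'ca(?:t|d)', text)
print(no_groups)
# ['>Homo sapiens ', 'enin (', 'herin-associated)']
```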
(By the way, I'm a bit confused about what exactly you are splitting in your
original example, since the pattern seems to match the whole string, and only
when it is the whole string. On the other hand, regular expressions regularly
confuse me... :)
I also do not think it is worth complicating the interface to handle the
unusual case of accepting and applying unknown regexes. The one change I could
see as a possibility would be an option to have all of the groups matched by
the split regex appear as a single sublist. But I'm not the maintainer of this
module either :)
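For anyone who does need to apply unknown regexes, both ideas can be approximated outside the module. This is only a rough sketch: the helper names split_ignoring_groups and split_with_group_sublists are made up for illustration and are not part of re. Both are built on re.finditer and make no attempt to support maxsplit or flags.

```python
import re

def split_ignoring_groups(pattern, text):
    """Split text on pattern, never returning group contents.

    Hypothetical helper, not part of the re module.
    """
    pieces = []
    last = 0
    for m in re.finditer(pattern, text):
        pieces.append(text[last:m.start()])  # text before this match
        last = m.end()
    pieces.append(text[last:])               # text after the last match
    return pieces

def split_with_group_sublists(pattern, text):
    """Split text on pattern, returning each match's groups as one sublist.

    Hypothetical helper, not part of the re module.
    """
    result = []
    last = 0
    for m in re.finditer(pattern, text):
        result.append(text[last:m.start()])
        result.append(list(m.groups()))      # all groups for this match
        last = m.end()
    result.append(text[last:])
    return result

text = '>Homo sapiens catenin (cadherin-associated)'
print(split_ignoring_groups(r'(ca(t|d))', text))
# ['>Homo sapiens ', 'enin (', 'herin-associated)']
print(split_with_group_sublists(r'(ca(t|d))', text))
# ['>Homo sapiens ', ['cat', 't'], 'enin (', ['cad', 'd'], 'herin-associated)']
```

With the sublist form the plain pieces always sit at the even indexes, however many groups the pattern happens to contain.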
----------
_______________________________________
Python tracker
<http://bugs.python.org/issue17668>
_______________________________________