Changes by Serhiy Storchaka storch...@gmail.com:
--
resolution: -> not a bug
stage: needs patch -> resolved
status: pending -> closed
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17668
___
Changes by Serhiy Storchaka storch...@gmail.com:
--
status: open -> pending
___
Changes by Mike Hoy mho...@gmail.com:
--
nosy: +mikehoy
___
Tomasz J. Kotarba added the comment:
The example I gave was the simplest possible to illustrate my point but yes,
you are correct, I often match the whole string as I do recursive matches. I
do use non-capturing groups but they would not solve the problem I talked
about. Anyway, I had
R. David Murray added the comment:
Only group the stuff you want to see in the result:
re.split(r'(^>.*$)', '>Homo sapiens catenin (cadherin-associated)')
['', '>Homo sapiens catenin (cadherin-associated)', '']
re.split(r'^>(.*)$', '>Homo sapiens catenin (cadherin-associated)')
['', 'Homo sapiens
New submission from Tomasz J. Kotarba:
Tested in 2.7 but possibly affects the other versions as well.
A real life example (note the first character '>' being lost):
import re
re.split(r'^>(.*)$', '>Homo sapiens catenin (cadherin-associated)')
produces:
['', 'Homo sapiens catenin
Matthew Barnett added the comment:
It's not a bug.
The documentation says "Split string by the occurrences of pattern. If
capturing parentheses are used in pattern, then the text of all groups in the
pattern are also returned as part of the resulting list."
You're splitting on r'^>(.*)$', but
R. David Murray added the comment:
Thanks for the report, but as Matt said it doesn't look like there is any bug
here. The behavior you report is what the docs say it is, and it seems to me
that your most useful suggestion would discard the information about the
group match, making
Tomasz J. Kotarba added the comment:
Hi Matthew,
Thanks for such a quick reply. I know I can get the '>' by putting it in
grouping parentheses. That's not the issue here. The documentation you quoted
says that it splits the string by the occurrences _OF_PATTERN_ and that texts
of all groups
Tomasz J. Kotarba added the comment:
Hi R. David Murray,
Thanks for your reply. I just explained in my previous message to Matthew that
the documentation does actually support my view (i.e. it is an issue according
to the documentation). Re. the issue you mentioned (discarding information
Tomasz J. Kotarba added the comment:
Marking as open till I get your response. I hope you reconsider.
--
resolution: invalid ->
status: closed -> open
R. David Murray added the comment:
re.split('-', 'abc-def-jlk')
['abc', 'def', 'jlk']
re.split('(-)', 'abc-def-jlk')
['abc', '-', 'def', '-', 'jlk']
Does that make it a bit clearer? Maybe we need an actual example in the docs.
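
As a sketch of the kind of example the docs might carry, the snippet below runs the two calls above and adds a third, hypothetical case (the `\s*(-)\s*` pattern and the `spaced` input are not from the thread) in which only part of the separator is grouped. It suggests that re.split keeps just the grouped text and drops the rest of each match:

```python
import re

text = 'abc-def-jlk'

# No capturing group: the separators are dropped from the result.
plain = re.split('-', text)

# Capturing group around the whole separator: each separator is kept.
with_sep = re.split('(-)', text)

# Hypothetical extra case: the separator is ' - ' but only '-' is grouped,
# so the surrounding whitespace is matched and then discarded.
spaced = 'abc - def - jlk'
partial = re.split(r'\s*(-)\s*', spaced)

print(plain)     # ['abc', 'def', 'jlk']
print(with_sep)  # ['abc', '-', 'def', '-', 'jlk']
print(partial)   # ['abc', '-', 'def', '-', 'jlk']
```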
--
assignee: -> docs@python
components: +Documentation
Tomasz J. Kotarba added the comment:
I agree that introducing an example like that plus making some slight changes
in wording would be a welcome change to the docs to clearly explain the current
behaviour. Still, I maintain it would be useful to give users the option I
described to allow
R. David Murray added the comment:
As you pointed out, you can already get that behavior by enclosing the entire
split expression in a group. I don't see that there is any functionality
missing here.
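
A minimal sketch of that point, assuming the reporter's input is a FASTA-style header beginning with '>' (the `header` value below is an assumption used for illustration). Grouping only part of the pattern drops the ungrouped '>'; grouping the entire pattern keeps it:

```python
import re

# Assumed input: a header line whose first character is '>'.
header = '>Homo sapiens catenin (cadherin-associated)'

# The group covers only the text after '>', so '>' is part of the
# separator match but not of any group, and it is not returned.
partial = re.split(r'^>(.*)$', header)

# The group covers the entire match, so the whole line is returned.
whole = re.split(r'(^>.*$)', header)

print(partial)  # ['', 'Homo sapiens catenin (cadherin-associated)', '']
print(whole)    # ['', '>Homo sapiens catenin (cadherin-associated)', '']
```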
Tomasz J. Kotarba added the comment:
Hi,
I can still see one piece of functionality I have mentioned missing. Using my
first example, even when one uses '^(>(.*))$' one cannot get ['', 'Homo
sapiens catenin (cadherin-associated)', ''] as one will get a four-element list
and need to deal with
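
The four-element result can be reproduced with a short sketch, assuming the nested pattern wraps a leading '>' and the input is the same '>'-prefixed header (both are assumptions, since the preview is truncated). Each capturing group contributes its own element, so nesting one group inside another yields two entries between the empty strings:

```python
import re

# Assumed input and pattern: an outer group around '>' plus an inner group.
header = '>Homo sapiens catenin (cadherin-associated)'
nested = re.split(r'^(>(.*))$', header)

# One element per capturing group, framed by the empty strings that
# precede and follow the single full-string match.
print(len(nested))  # 4
print(nested[1])    # '>Homo sapiens catenin (cadherin-associated)'
print(nested[2])    # 'Homo sapiens catenin (cadherin-associated)'
```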