Chris Jerdonek added the comment:
> I wasn't really happy with the addition of that sentence about split in the
> first place.
I think the instinct to put that sentence in there is a good one. It is a key,
if perhaps subtle, difference.
> I don't understand what your splitlines examples are trying to say, they all
> look clear to me based on the fact that we are splitting *lines*.
I perhaps included too many examples and so clouded my point. :) I just needed
one. The examples were simply to show why the existing language is not
correct. The current language says, "if the string ends with line boundary
characters the returned list does not have an empty last element."
However, the examples are of strings that do end with line boundary characters
but for which the returned list *does* have an empty last element.
The point is that splitlines() does not count a terminal line break as an
additional line, while split('\n') (for example) does. But this is different
from whether the returned list *has* an empty last element, which is what the
current language says.
The returned list can have empty last elements because of line breaks at the
end. It's just that the line break at the *very* end doesn't contribute one --
unlike the case for split():
>>> 'a'.splitlines()
['a']
>>> 'a\n'.splitlines()
['a']
>>> 'a\n\n'.splitlines()
['a', '']
>>> 'a\n\n\n'.splitlines()
['a', '', '']
>>> 'a\n\n\n'.split('\n') # counts terminal line break as an extra line
['a', '', '', '']
I'm open to improving the language. Maybe "does not count a terminal line
break as an additional line" instead of the original "a terminal line break
does not delimit an additional empty line"?
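
As a rough illustration of the same point, here is a small sketch that counts
the empty strings at the end of each result. The helper name trailing_empty is
made up for this example and is not part of any library; it only tallies the
trailing '' elements so the two methods can be compared side by side.

```python
def trailing_empty(parts):
    """Count the empty strings at the end of a list (hypothetical helper)."""
    count = 0
    for part in reversed(parts):
        if part != '':
            break
        count += 1
    return count

# Compare both methods on 'a' followed by 0..3 line breaks.
for n in range(4):
    s = 'a' + '\n' * n
    print(n, trailing_empty(s.splitlines()), trailing_empty(s.split('\n')))
```

For n terminal line breaks (n >= 1), splitlines() yields n - 1 trailing empty
strings while split('\n') yields n, which is the "does not count a terminal
line break as an additional line" behavior in a nutshell.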
> There's another issue for creating a central description of universal-newline
> parsing, perhaps this entry could link to that discussion (and that
> discussion could perhaps mention splitlines).
I created that issue (issue 15543), and a patch is in the works along the lines
you suggest. ;)
> The split behavior without a specified separator might actually be a bug (if
> so, it is not a fixable one), but in any case you are right that that
> clarification should be added if the existing sentence is kept.
Perhaps, but at least split() documents the behavior. :)
"runs of consecutive whitespace are regarded as a single separator, and the
result will contain no empty strings at the start or end if the string has
leading or trailing whitespace."
(from http://docs.python.org/dev/library/stdtypes.html#str.split )
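
For comparison, a minimal sketch of the documented split() behavior quoted
above, next to split() with an explicit separator. The sample string and the
variable names are chosen only for illustration.

```python
s = ' a  b '

# No separator: runs of whitespace act as one separator, and leading or
# trailing whitespace produces no empty strings.
no_sep = s.split()
print(no_sep)

# Explicit separator: every single space delimits, so empty strings appear
# at the start, in the middle, and at the end.
with_sep = s.split(' ')
print(with_sep)
```

The first call prints ['a', 'b'] and the second prints ['', 'a', '', 'b', ''],
so only the explicit-separator form treats a terminal separator as delimiting
an extra (empty) element.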
----------
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue15554>
_______________________________________