Chris Jerdonek added the comment:

> I wasn't really happy with the addition of that sentence about split in the 
> first place.

I think the instinct to put that sentence in there is a good one.  It is a key, 
if perhaps subtle, difference.

> I don't understand what your splitlines examples are trying to say, they all 
> look clear to me based on the fact that we are splitting *lines*.  

I perhaps included too many examples and so clouded my point. :)  I just needed 
one.  The examples were simply to show why the existing language is not 
correct.  The current language says, "if the string ends with line boundary 
characters the returned list does not have an empty last element."

However, the examples are of strings that do end with line boundary characters 
but that *do* have an empty last element.

The point is that splitlines() does not count a terminal line break as an 
additional line, while split('\n') (for example) does.  But this is different 
from whether the returned list *has* an empty last element, which is what the 
current language says.

The returned list can still end with empty elements when the string ends with 
several line breaks.  It's just that the one at the *very* end doesn't 
contribute an element -- unlike the case for split():

>>> 'a'.splitlines()
['a']
>>> 'a\n'.splitlines()
['a']
>>> 'a\n\n'.splitlines()
['a', '']
>>> 'a\n\n\n'.splitlines()
['a', '', '']
>>> 'a\n\n\n'.split('\n')  # counts terminal line break as an extra line
['a', '', '', '']
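
Another way to put the relationship, as a rough sketch: if '\n' is the only 
line boundary in the string (an assumption made for this illustration), 
splitlines() behaves like split('\n') with the one trailing empty string 
dropped.  The helper name below is made up for the example and is not part of 
any API.

```python
def splitlines_via_split(s):
    # Hypothetical helper: mimic str.splitlines() using str.split().
    # Assumes '\n' is the only line boundary character present in s.
    parts = s.split('\n')
    # split('\n') treats the terminal line break as starting one more
    # (empty) line; splitlines() does not, so drop that single element.
    if parts and parts[-1] == '':
        parts.pop()
    return parts


# The two agree on the cases discussed above.
for s in ['', 'a', 'a\n', 'a\n\n', 'a\n\n\n', '\n']:
    assert splitlines_via_split(s) == s.splitlines()
```

Only one element is dropped, which is why 'a\n\n' still yields an empty last 
element.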

I'm open to improving the language.  Maybe "does not count a terminal line 
break as an additional line" instead of the original "a terminal line break 
does not delimit an additional empty line"?

> There's another issue for creating a central description of universal-newline 
> parsing, perhaps this entry could link to that discussion (and that 
> discussion could perhaps mention splitlines).

I created that issue (issue 15543), and a patch is in the works along the lines 
you suggest. ;)

> The split behavior without a specified separator might actually be a bug (if 
> so, it is not a fixable one), but in any case you are right that that 
> clarification should be added if the existing sentence is kept.

Perhaps, but at least split() documents the behavior. :)

"runs of consecutive whitespace are regarded as a single separator, and the 
result will contain no empty strings at the start or end if the string has 
leading or trailing whitespace."

(from http://docs.python.org/dev/library/stdtypes.html#str.split )
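
A small sketch of the documented difference, using made-up sample strings: 
with no separator, leading and trailing whitespace produce no empty strings, 
while an explicit separator keeps them.

```python
s = '  a   b  '

# No separator: runs of whitespace act as a single separator, and there
# are no empty strings at the start or end of the result.
no_sep = s.split()

# Explicit separator: every single space delimits, so empty strings
# appear at both ends (and between consecutive spaces).
with_sep = s.split(' ')

assert no_sep == ['a', 'b']
assert with_sep[0] == '' and with_sep[-1] == ''

# The same contrast applies to trailing line breaks.
trailing = 'a\n\n'
assert trailing.split() == ['a']
assert trailing.split('\n') == ['a', '', '']
```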

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue15554>
_______________________________________