Martin v. Löwis wrote:
> Walter Dörwald wrote:
>
>>I think a maxsplit argument (just as for unicode.split()) would help too.
>
> Correct - that would allow us to get rid of the quadratic part.
OK, such a patch should be rather simple. I'll give it a try.
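
To make the idea concrete, here is a rough sketch of what a maxsplit of 1 would give us: only the first line is cut off, and the rest of the buffer is returned unsplit. This is not the patch itself. split_first_line() and LINEBREAKS are hypothetical names, the code uses Python 3 spelling, and the character set is assumed to match what str.splitlines() treats as line boundaries. A "\r" at the very end of the data whose "\n" arrives in the next chunk is not handled here.

```python
# Assumption: the characters str.splitlines() treats as line boundaries.
LINEBREAKS = "\n\r\x0b\x0c\x1c\x1d\x1e\x85\u2028\u2029"


def split_first_line(data, keepends=False):
    """Return (first_line, rest) without splitting the rest of data."""
    for i, ch in enumerate(data):
        if ch in LINEBREAKS:
            end = i + 1
            # Treat "\r\n" as a single line ending.
            if ch == "\r" and data[end:end + 1] == "\n":
                end += 1
            line = data[:end] if keepends else data[:i]
            return line, data[end:]
    # No line ending found: everything is one (possibly incomplete) line.
    return data, ""
```

With this, a readline() would only pay for the first line instead of splitting the whole buffer on every call.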
> We should also strive to avoid the second copy of the line,
> if the user requested keepends.
Your suggested unicode method islinebreak() would help with that. Then
we could add the following to the string module:
    unicodelinebreaks = u"".join(unichr(c) for c in xrange(sys.maxunicode + 1)
                                 if unichr(c).islinebreak())
Then

    if line and not keepends:
        line = line.splitlines(False)[0]

could be

    if line and not keepends:
        line = line.rstrip(string.unicodelinebreaks)
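
Since islinebreak() does not exist yet, the following is only a sketch of the same idea in Python 3 spelling. It builds the constant by asking splitlines() which single characters count as a line boundary. UNICODE_LINEBREAKS and strip_line_ending() are hypothetical names. It also assumes that line holds at most one line ending, as a readline() result would, because rstrip() removes every trailing line break character and not just one ending.

```python
import sys

# A single character is a line break if splitlines() turns it into one
# empty line. This stands in for the proposed (not yet existing)
# islinebreak() method.
UNICODE_LINEBREAKS = "".join(
    ch for ch in map(chr, range(sys.maxunicode + 1))
    if ch.splitlines() == [""]
)


def strip_line_ending(line, keepends=False):
    """Drop the trailing line ending of a single line unless keepends is set."""
    if line and not keepends:
        # Assumes line contains at most one line ending (e.g. "\n" or "\r\n").
        line = line.rstrip(UNICODE_LINEBREAKS)
    return line
```

For a line with a single ending this gives the same result as line.splitlines(False)[0], without building the intermediate list.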
> I wonder whether it would be worthwhile to cache the .splitlines result.
> An application that has just invoked .readline() will likely invoke
> .readline() again. If there is more than one line left, we could return
> the first line right away (potentially trimming the line ending if
> necessary). Only when a single line is left, we would attempt to
> read more data. In a plain .read(), we would first join the lines
> back.
OK, this would mean we'd have to distinguish between a direct call to
read() and one done by readline() (which we do anyway through the
firstline argument).
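
Roughly, the caching could look like the sketch below (Python 3 spelling). This is not the actual StreamReader code. CachingLineReader, its linebuffer attribute and the chunksize default are made up for illustration, and the wrapped stream is assumed to be any object with a text read([size]) method.

```python
import io
from collections import deque


class CachingLineReader:
    """Illustrative reader that caches the result of splitlines()."""

    def __init__(self, stream, chunksize=72):
        self.stream = stream
        self.chunksize = chunksize
        self.linebuffer = deque()  # cached lines, line endings kept

    def readline(self, keepends=True):
        # Only when at most one (possibly incomplete) line is left do we
        # read more data; otherwise the first cached line is returned
        # right away.
        while len(self.linebuffer) <= 1:
            data = self.stream.read(self.chunksize)
            if not data:
                break  # end of input
            pending = "".join(self.linebuffer)
            self.linebuffer = deque((pending + data).splitlines(True))
        if not self.linebuffer:
            return ""
        line = self.linebuffer.popleft()
        if not keepends:
            line = line.splitlines(False)[0]
        return line

    def read(self):
        # A plain read() first joins the cached lines back together.
        data = "".join(self.linebuffer) + self.stream.read()
        self.linebuffer.clear()
        return data
```

Because a lone last line is always re-joined with the next chunk before splitting, a "\r\n" that straddles two chunks still comes out as one line ending in this sketch.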
Bye,
Walter Dörwald