Re: [Python-Dev] file() vs open(), round 7

Martin v. Löwis Tue, 27 Dec 2005 09:54:39 -0800

M.-A. Lemburg wrote:
>>Here's a rough draft:
>>
>>    def textopen(name, mode="r", encoding=None):
>>        if "U" not in mode:
>>            mode += "U"
> 
> 
> The "U" is not needed when opening files using codecs -
> these always break lines using .splitlines() which
> breaks lines according to the Unicode rules and also
> knows about the various line break variants on different
> platforms.


Still, codecs typically don't implement universal newlines
correctly. If you specify 'U', then do .read(), you deserve
to get \n (U+000A) as the line separator; with most codecs,
you get whatever line breaks were in the file.
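
To illustrate the point about .read(), here is a minimal sketch,
written for current Python 3 rather than the Python of this thread.
The sample data and the helper name normalize_newlines are
hypothetical. It only shows that plain decoding keeps the line
breaks found in the bytes, and that translation to \n has to happen
on the decoded text.

```python
# Hypothetical sample data: three lines, each with a different line ending.
raw = "one\r\ntwo\rthree\n".encode("utf-8")

# Plain decoding leaves the line endings exactly as they were in the bytes.
decoded = raw.decode("utf-8")


def normalize_newlines(text):
    # Translate CR LF pairs first, then any remaining lone CR, to LF.
    # This runs on decoded text, not on the encoded bytes.
    return text.replace("\r\n", "\n").replace("\r", "\n")


normalized = normalize_newlines(decoded)
print(repr(decoded))
print(repr(normalized))
```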

Passing 'U' to the underlying stream is wrong, as well:
if the stream is double-byte oriented (e.g. UTF-16),
the 'U' filtering will rarely do anything, and when it does
do something, the result will be wrong.
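
A small sketch of why byte-level filtering goes wrong on UTF-16,
again in current Python 3. The function
byte_level_universal_newlines is a hypothetical stand-in for 'U'
filtering applied to the raw byte stream, not the actual
implementation; the two characters are chosen only because their
UTF-16-BE encodings happen to contain the bytes 0x0D and 0x0A.

```python
def byte_level_universal_newlines(data):
    # Hypothetical stand-in for 'U' filtering on raw bytes:
    # CR LF becomes LF, then any remaining lone CR becomes LF.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


# U+010D encodes as the bytes 01 0D in UTF-16-BE; the second byte
# looks like a carriage return to a byte-level filter.
encoded = "\u010d".encode("utf-16-be")
filtered = byte_level_universal_newlines(encoded)
corrupted = filtered.decode("utf-16-be")  # now a different character

# U+0D0A encodes as the bytes 0D 0A, which looks like CR LF.
# Filtering drops one byte, leaving an odd-length, undecodable stream.
malayalam = "\u0d0a".encode("utf-16-be")
truncated = byte_level_universal_newlines(malayalam)

print(repr(corrupted), len(truncated))
```

In both cases no line break was present in the text at all, yet
the filter changed the data.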

I agree that it would be desirable to have textopen always
default to universal newlines; however, this is difficult
to implement.

Regards,
Martin