M.-A. Lemburg wrote:
>> Here's a rough draft:
>>
>> def textopen(name, mode="r", encoding=None):
>>     if "U" not in mode:
>>         mode += "U"
>
> The "U" is not needed when opening files using codecs -
> these always break lines using .splitlines() which
> breaks lines according to the Unicode rules and also
> knows about the various line break variants on different
> platforms.

Still, codecs typically don't implement universal newlines correctly.
If you specify 'U' and then call .read(), you deserve to get \n (U+000A)
as the line separator; with most codecs, you get whatever line breaks
were in the file.

Passing 'U' to the underlying stream is wrong as well: if the stream
is double-byte oriented (e.g. UTF-16), the 'U' filtering will rarely
do anything, and when it does do something, it will be wrong.

I agree that it would be desirable to have textopen always default to
universal newlines; however, this is difficult to implement.

Regards,
Martin
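
To illustrate the UTF-16 point, here is a minimal sketch in present-day Python 3. The helper names (translate_newlines_bytes, translate_newlines_text, decode_or_none) are hypothetical, and the byte-level function is only a simplified stand-in for 'U' filtering on the raw stream, not the actual file implementation.

```python
def translate_newlines_bytes(data: bytes) -> bytes:
    """Hypothetical stand-in for 'U' filtering applied to the raw byte stream."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def translate_newlines_text(text: str) -> str:
    """Newline translation applied after decoding, on characters."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def decode_or_none(data: bytes, encoding: str):
    """Decode data, returning None if the bytes are no longer valid."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# Case 1: CRLF in UTF-16-LE is the bytes 0D 00 0A 00, so the byte pair
# \r\n never occurs; only the lone-\r rule fires, and one line break
# turns into two.
raw = "a\r\nb".encode("utf-16-le")
broken = decode_or_none(translate_newlines_bytes(raw), "utf-16-le")

# Case 2: the single character U+0D0A encodes in UTF-16-BE as the bytes
# 0D 0A, which the byte filter mistakes for CRLF and cuts down to one byte.
raw2 = "\u0d0a".encode("utf-16-be")
truncated = decode_or_none(translate_newlines_bytes(raw2), "utf-16-be")

# Translating after decoding gives the intended result.
correct = translate_newlines_text(raw.decode("utf-16-le"))

print(repr(broken), repr(truncated), repr(correct))
```

In the first case the text comes back with a doubled line break, in the second the data can no longer be decoded at all, and only the decode-first variant yields a single \n. This is why the translation would have to happen on the decoded text rather than on the underlying stream.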