[issue691291] codecs.open(filename, 'U', 'UTF-16') corrupts text
Florent Xicluna la...@yahoo.fr added the comment:

Slight update.

--
stage: -> patch review
type: feature request -> behavior
versions: +Python 2.7
Added file: http://bugs.python.org/file15697/issue691291_v2.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue691291
___
___
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue691291] codecs.open(filename, 'U', 'UTF-16') corrupts text
Changes by Florent Xicluna la...@yahoo.fr: Removed file: http://bugs.python.org/file15435/issue691291.diff
[issue691291] codecs.open(filename, 'U', 'UTF-16') corrupts text
Changes by flox la...@yahoo.fr: Added file: http://bugs.python.org/file15435/issue691291.diff
[issue691291] codecs.open(filename, 'U', 'UTF-16') corrupts text
Changes by flox la...@yahoo.fr: Removed file: http://bugs.python.org/file15422/issue691291.diff
[issue691291] codecs.open(filename, 'U', 'UTF-16') corrupts text
flox la...@yahoo.fr added the comment:

Proposed patch, following the suggestion of And Clover. It brings the behavior in line with the documentation: «Files are always opened in binary mode, even if no binary mode was specified. This is done to avoid data loss due to encodings using 8-bit values. This means that no automatic conversion of '\n' is done on reading and writing.»

--
keywords: +patch
nosy: +flox
Added file: http://bugs.python.org/file15422/issue691291.diff
[issue691291] codecs.open(filename, 'U', 'UTF-16') corrupts text
Changes by flox la...@yahoo.fr: Added file: http://bugs.python.org/file15423/issue691291_py3k.diff
[issue691291] codecs.open(filename, 'U', 'UTF-16') corrupts text
And Clover a...@doxdesk.com added the comment:

> The problem is that codecs.open() forces binary mode on the underlying file object, and this defeats the U mode.

Actually, the problem is that it doesn't defeat it! The function is documented to force binary mode, but all it really does is mode = mode + 'b', which can leave you with a mode of 'rUb'. This mode should be invalid, but in practice the 'U' wins out, so newline translation is applied to the raw bytes before decoding. That causes the expected problems for UTF-16 and some East Asian codecs.

Until text/universal mode is supported at the level of the overlying decoded stream, I suggest that 'U' be .replace()d out of the mode in addition to 'b' being added, as the documentation would imply.

--
nosy: +aclover
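
To illustrate the comment above, here is a minimal sketch, not the attached patch. The helper names normalize_mode and translate_newlines are hypothetical, and the byte-level replace is only an approximation of what universal-newline mode does to undecoded data. It shows the suggested mode cleanup and why translating newlines before decoding damages UTF-16 text.

```python
def normalize_mode(mode):
    """Hypothetical helper: drop 'U' and force binary mode.

    This mirrors the suggestion in the comment (remove 'U', add 'b');
    it is an assumption about the shape of a fix, not the actual patch.
    """
    mode = mode.strip().replace('U', '')
    # A bare 'U' used to mean "read with universal newlines",
    # so fall back to 'r' when no primary mode letter is left.
    if mode[:1] not in ('r', 'w', 'a'):
        mode = 'r' + mode
    if 'b' not in mode:
        mode = mode + 'b'
    return mode


def translate_newlines(raw):
    """Approximate universal-newline translation applied to raw bytes."""
    return raw.replace(b'\r\n', b'\n').replace(b'\r', b'\n')


# U+010D is encoded in UTF-16-LE as the bytes 0D 01. The 0D byte looks
# like a carriage return, so byte-level translation rewrites it to 0A
# and the character silently becomes U+010A.
raw = u'\u010d'.encode('utf-16-le')
damaged = translate_newlines(raw).decode('utf-16-le')

# A real CR LF pair is 0D 00 0A 00 in UTF-16-LE. The two bytes are not
# adjacent, so the lone 0D is turned into a second line feed.
doubled = translate_newlines(u'a\r\nb'.encode('utf-16-le')).decode('utf-16-le')
```

With the mode normalized to plain binary, the underlying file hands the bytes to the decoder untouched, which is what the documentation quoted elsewhere in this thread describes.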
[issue691291] codecs.open(filename, 'U', 'UTF-16') corrupts text
Christian Heimes added the comment:

Check this for 2.6.

--
components: +Library (Lib) -None
nosy: +tiran
versions: +Python 2.6