[issue2131] "codecs" module on Windows uses incorrect end-of-line, writing broken Unicode (UTF-8) files

2008-02-17 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Thanks, Georg.

[issue2131] "codecs" module on Windows uses incorrect end-of-line, writing broken Unicode (UTF-8) files

2008-02-17 Thread Georg Brandl
Georg Brandl added the comment: The note in the docstring wasn't in the documentation. Fixed this in r60873. -- nosy: +georg.brandl

[issue2131] "codecs" module on Windows uses incorrect end-of-line, writing broken Unicode (UTF-8) files

2008-02-17 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: As Antoine already pointed out: the codecs.open() function does not support the C lib's text mode. As a result, no magical conversion of a single newline to a CRLF takes place. Closing as invalid. -- resolution: -> invalid status: open -> closed
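
The behaviour described in this comment can be checked with a short script. This is only a minimal sketch, not code from the tracker: the helper names `size_via_codecs` and `size_via_open` are made up for illustration, and it assumes Python 3, where the built-in open() accepts `encoding` and `newline` arguments (the original report concerned Python 2.5.1).

```python
import codecs
import os
import tempfile
import warnings


def size_via_codecs(text):
    """Write text with codecs.open() and return the file size in bytes."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with warnings.catch_warnings():
            # Recent Python versions may flag codecs.open() as deprecated.
            warnings.simplefilter("ignore", DeprecationWarning)
            # The underlying file is opened in binary mode, so "\n" is
            # written as a single byte on every platform.
            f = codecs.open(path, "w", encoding="utf-8")
        with f:
            f.write(text)
        return os.path.getsize(path)
    finally:
        os.remove(path)


def size_via_open(text, newline):
    """Write text with the built-in open() and an explicit newline policy."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # newline="\r\n" translates every "\n" written into CRLF.
        with open(path, "w", encoding="utf-8", newline=newline) as f:
            f.write(text)
        return os.path.getsize(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(size_via_codecs("abc\nabc"))        # no newline translation
    print(size_via_codecs("abc\r\nabc"))      # CRLF inserted by hand
    print(size_via_open("abc\nabc", "\r\n"))  # CRLF produced by open()
```

With codecs.open() the caller has to write "\r\n" explicitly to get Windows line endings. In Python 3 the built-in open() with an explicit `newline` argument is one way to get the same result without doing the translation by hand.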

[issue2131] "codecs" module on Windows uses incorrect end-of-line, writing broken Unicode (UTF-8) files

2008-02-16 Thread Antoine Pitrou
Antoine Pitrou added the comment: As stated in the codecs.open() docstring: """Files are always opened in binary mode, even if no binary mode was specified. This is done to avoid data loss due to encodings using 8-bit values""". This certainly means you have to insert "\r\n" yourself (instead of

[issue2131] "codecs" module on Windows uses incorrect end-of-line, writing broken Unicode (UTF-8) files

2008-02-16 Thread Technologov
Technologov added the comment: OK: try filewr.write("abc"+"\n"+"abc"). The file will be generated with 7 bytes in it (it must be 8, because Windows has a two-byte line ending). Without using the "codecs" module, everything works fine, and the file will have 8 bytes in it (see 2nd example). Plus, the text

[issue2131] "codecs" module on Windows uses incorrect end-of-line, writing broken Unicode (UTF-8) files

2008-02-16 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Could you explain what exactly is wrong with the end-of-line on Windows? Note that "Unicode text files" on Windows are generally interpreted as UTF-16 encoded files. Perhaps that's what makes you think there's a bug. -- nosy: +lemburg
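
To make the UTF-8 versus UTF-16 distinction concrete, the same text can be encoded both ways and the byte strings compared. This is an illustrative sketch only, assuming Python 3 string semantics; the sample text and variable names are not taken from the report.

```python
import codecs

# Sample text with an explicit Windows line ending (8 characters).
text = "abc\r\nabc"

utf8_bytes = text.encode("utf-8")    # one byte per ASCII character
utf16_bytes = text.encode("utf-16")  # 2-byte BOM, then two bytes per character

# The "utf-16" codec prefixes a byte order mark in the platform's byte order.
has_bom = utf16_bytes[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)

if __name__ == "__main__":
    print(len(utf8_bytes), len(utf16_bytes), has_bom)
```

A tool that expects UTF-16 would look for that byte order mark and two-byte code units, so a plain UTF-8 file can look wrong to it even when its line endings are fine.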

[issue2131] "codecs" module on Windows uses incorrect end-of-line, writing broken Unicode (UTF-8) files

2008-02-16 Thread Technologov
New submission from Technologov: "codecs" module on Windows writes incorrect end-of-line, making it impossible to write Unicode files. See below for how to reproduce the bug (Python 2.5.1 on Windows XP) === #buggy unicode support module: