[issue18291] codecs.open interprets FS, RS, GS as line ends

Marc-Andre Lemburg Fri, 05 Oct 2018 04:40:04 -0700


Marc-Andre Lemburg <m...@egenix.com> added the comment:


Sorry, I probably wasn't clear: the codecs interface is a direct 
interface to the Unicode codecs and thus has to work according to 
what Unicode defines.

Your PR changes this to be non-compliant and does this for all codecs.
That's a major backwards-incompatible and Unicode-incompatible change,
and I'm -1 on such a change for the stated reasons.

If people want to have ASCII-only line break handling, they should
use the io module, which only uses the codecs and can apply different
logic (as it does).
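
To illustrate the difference, here is a minimal sketch (not taken from
the PR under discussion) that reads the same bytes once through a codecs
StreamReader and once through io.TextIOWrapper. The sample data and the
variable names are made up for the example; in-memory streams stand in
for real files.

```python
import codecs
import io

# Sample bytes containing FS (0x1C), GS (0x1D) and RS (0x1E), ending in "\n".
data = "a\x1cb\x1dc\x1ed\n".encode("utf-8")

# codecs layer: the StreamReader splits lines the way str.splitlines() does,
# so FS, GS and RS each end a line.
codecs_reader = codecs.getreader("utf-8")(io.BytesIO(data))
codecs_lines = codecs_reader.readlines()

# io layer: TextIOWrapper only treats "\n", "\r" and "\r\n" as line ends.
io_reader = io.TextIOWrapper(io.BytesIO(data), encoding="utf-8")
io_lines = io_reader.readlines()

print(len(codecs_lines), codecs_lines)
print(len(io_lines), io_lines)
```

The codecs reader should report four lines and the io reader a single
line, which is the behaviour the issue title describes.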

Please note that many file formats were not defined for Unicode,
and it's only natural that using Unicode codecs on them will
result in some differences compared to the ASCII world. Line breaks
are one of those differences, but there are plenty of others as well,
e.g. potentially breaking combining characters or bidi sections,
different ideas about upper and lower case handling, different
interpretations of control characters, etc.
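
A few of these differences can be seen directly on Python strings. The
following is only a small sketch with example values chosen for
illustration; it is not an exhaustive list of the cases mentioned above.

```python
import unicodedata

# Combining characters: "e" followed by COMBINING ACUTE ACCENT is two code
# points, so cutting a string between them breaks the visible character.
decomposed = "e\u0301"
composed = unicodedata.normalize("NFC", decomposed)
lengths = (len(decomposed), len(composed))

# Case handling: Unicode upper-casing can change the length of a string,
# which never happens with ASCII-only rules.
upper_sharp_s = "\u00df".upper()

# Control characters: str follows the Unicode line boundary rules, while
# bytes only knows the ASCII line ends.
str_lines = "a\x1cb".splitlines()
bytes_lines = b"a\x1cb".splitlines()

print(lengths, upper_sharp_s, str_lines, bytes_lines)
```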

The approach to this has to be left with the applications dealing
with these formats. The stdlib has to stick to standards and
clear documentation.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue18291>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
