Re: [Python-ideas] Fix default encodings on Windows

Stephen J. Turnbull Tue, 16 Aug 2016 06:50:00 -0700

Nick Coghlan writes:

 > At an ecosystem level, that means we're faced with a choice between
 > implicitly encouraging folks to make their code *nix only, and
 > finding a way to provide a more *nix like experience when running
 > on Windows (where UTF-8 encoded binary data just works, and either
 > other encodings lead to mojibake or else you use chardet to figure
 > things out).


Most of the time we do know what the encoding is: we can just ask
Windows (although Steve proposes to make Python fib about that, we
could add other APIs).
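
As a rough illustration of "just asking", here is a minimal sketch that
uses only standard-library calls.  The helper name report_encodings is
made up for this example, and the values it returns depend on the
platform, the locale settings and the Python version in use:

```python
import locale
import sys


def report_encodings():
    """Collect the encodings Python currently reports for this process."""
    return {
        # Encoding Python uses to convert between str and bytes file names.
        "filesystem": sys.getfilesystemencoding(),
        # Locale-derived default for text data; False means "do not call
        # setlocale() as a side effect of asking".
        "preferred": locale.getpreferredencoding(False),
    }


print(report_encodings())
```

If Python started reporting UTF-8 here regardless of the system code
page, code that needs the real code page would have to get it from
some other API.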

This change means that programs that until now could be encoding-
agnostic and just pass around bytes on Windows, counting on Python to
consistently convert those to the appropriate form for the API, can't
do that any more.  They have to find out what the encoding is, and
transcode to UTF-8, or rewrite to do their processing as text.  This
is a potential burden on existing user code.
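
To make the burden concrete, a sketch of the transcoding step such a
program would have to add.  The helper name to_utf8 is hypothetical,
and cp1252 merely stands in for whatever code page the bytes really
came from:

```python
def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Re-encode raw from source_encoding to UTF-8 by way of str.

    Raises UnicodeDecodeError if raw is not valid in source_encoding.
    """
    return raw.decode(source_encoding).encode("utf-8")


# Example only: a file name as bytes in an assumed legacy code page.
legacy_name = "café.txt".encode("cp1252")
utf8_name = to_utf8(legacy_name, "cp1252")
```

The hard part is not the two method calls but knowing the correct
source_encoding at every point where bytes enter the program.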

I suppose that there are such programs, for the same reasons that
networking programs tend to use bytes I/O: ports from Python 2, a
(misplaced?) emphasis on performance, etc.

 > Steve is suggesting that the latter option is preferable, a view I
 > agree with since it lowers barriers to entry for Windows based
 > developers to contribute to primarily *nix focused projects.

Sure, but do you have any idea what the costs might be?  Aside from
the code burden mentioned above, there's a reputational issue.  Just
yesterday I was having a (good-natured) Perl vs. Python discussion on
my LUG ML, and two developers volunteered that they avoid Python
because "the Python developers frequently break backward
compatibility".  These memes tend to go off on their own anyway, but
this change will really feed that one.

 > Promoting cross-platform consistency often leads to enabling
 > patterns that are considered a bad idea from a native platform
 > perspective, and this strikes me as an example of that (just as the
 > binary/text separation itself is a case where Python 3 diverged
 > from the POSIX text model to improve consistency across *nix,
 > Windows, JVM and CLR environments).

I would say rather Python 3 chose an across-the-board better, more
robust model supporting internationalization and multilingualization
properly.  POSIX's text model is suitable at best for a fragile
localization.

This change, OTOH, is a step backward we wouldn't consider except for
the intended effect on ease of writing networking code.  That's
important, but I really don't think that's going to be the only major
effect, and I fear it won't be the most important effect.

Of course that's FUD -- I have no data on potential burden to existing
use cases, or harm to reputation.  But neither do you and Steve. :-(


_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
