Georg Brandl wrote:
> Well, subject says it all. While 2.5 sets sys.std*.encoding correctly to
> UTF-8, 3k sets it to 'latin-1', breaking output of Unicode strings.

And not surprisingly so: io.py says

        if encoding is None:
            # XXX This is questionable
            encoding = sys.getfilesystemencoding() or "latin-1"

First, at the point where this call is made, sys.getfilesystemencoding()
still returns None. Second, the code is broken anyway, because the
filesystem encoding is not the correct value for sys.stdout.encoding.
Instead, it should be computed as follows:

1. On Unix, use the same value that sys.getfilesystemencoding will get,
   namely the result of nl_langinfo(CODESET); if that is not available,
   fall back - to anything, but the most logical choices are UTF-8
   (if you want output to always succeed) and ASCII (if you don't want
   to risk mojibake).
2. On Windows, if output is to a terminal, use GetConsoleOutputCP.
   Otherwise fall back, probably to CP_ACP (i.e. "mbcs").
3. On OSX, I don't know. If output is to a terminal, UTF-8 may be
   a good bet (although some people operate their Terminal.apps
   not in UTF-8; there is no way to find out). Otherwise, use the
   locale's encoding - not sure how to find out what that is.
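
The three cases above could be sketched roughly as follows. This is only
an illustration, not what io.py does: the function name
guess_stream_encoding and its fallback parameter are made up for this
example, the "cp%d" codec naming for the console code page is an
assumption, and the OSX branch simply bets on UTF-8 for terminals as
described in point 3.

```python
import locale
import sys


def guess_stream_encoding(stream, fallback="utf-8"):
    """Guess an encoding for a text stream such as sys.stdout.

    Hypothetical helper; 'fallback' is used when the locale gives no answer.
    """
    is_tty = hasattr(stream, "isatty") and stream.isatty()

    if sys.platform == "win32":
        if is_tty:
            import ctypes
            # GetConsoleOutputCP returns 0 on failure.
            cp = ctypes.windll.kernel32.GetConsoleOutputCP()
            if cp:
                # Assumption: the code page maps to a codec named "cpNNN".
                return "cp%d" % cp
        # Not a console, or the call failed: use the ANSI code page (CP_ACP).
        return "mbcs"

    if sys.platform == "darwin" and is_tty:
        # A bet, not a fact: Terminal.app may be configured differently.
        return "utf-8"

    # Unix, and OSX when not writing to a terminal: ask the locale.
    # This only reflects the user's settings if LC_CTYPE has been set
    # from the environment, e.g. via locale.setlocale(locale.LC_CTYPE, "").
    try:
        codeset = locale.nl_langinfo(locale.CODESET)
    except (AttributeError, ValueError):
        # nl_langinfo or CODESET is not available on this platform.
        codeset = ""
    return codeset or fallback
```

Calling guess_stream_encoding(sys.stdout, fallback="ascii") would pick
the more conservative fallback from point 1, trading possible encoding
errors for no mojibake.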

Regards,
Martin