On 8/9/07, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> Georg Brandl wrote:
> > Well, subject says it all. While 2.5 sets sys.std*.encoding correctly to
> > UTF-8, 3k sets it to 'latin-1', breaking output of Unicode strings.
>
> And not surprisingly so: io.py says
>
>         if encoding is None:
>             # XXX This is questionable
>             encoding = sys.getfilesystemencoding() or "latin-1"

Guilty as charged. Alas, I don't know much about the machinery of console
and filesystem encodings, so I need help!

> First, at the point where this call is made, sys.getfilesystemencoding
> is still None,

What can we do about this? Set it earlier? It should really be set by
the time site.py is imported (which sets sys.stdin/out/err), as this
is the first time a lot of Python code is run that touches the
filesystem (e.g. sys.path mangling).

> plus the code is broken as getfilesystemencoding is not
> the correct value for sys.stdout.encoding. Instead, the way it should
> be computed is:
>
> 1. On Unix, use the same value that sys.getfilesystemencoding will get,
>    namely the result of nl_langinfo(CODESET); if that is not available,
>    fall back - to anything, but the most logical choices are UTF-8
>    (if you want output to always succeed) and ASCII (if you don't want
>    to risk mojibake).
> 2. On Windows, if output is to a terminal, use GetConsoleOutputCP.
>    Else fall back, probably to CP_ACP (ie. "mbcs")
> 3. On OSX, I don't know. If output is to a terminal, UTF-8 may be
>    a good bet (although some people operate their Terminal.apps
>    not in UTF-8; there is no way to find out). Otherwise, use the
>    locale's encoding - not sure how to find out what that is.

Feel free to add code that implements this. I suppose it would be a
good idea to have a separate function io.guess_console_encoding(...)
which takes some argument (perhaps a raw file?) and returns an
encoding name, never None. This could then be implemented by switching
on the platform into platform-specific functions and a default.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)