2014-03-18 10:48 GMT+01:00 Nick Coghlan <ncogh...@gmail.com>:
> Well, the concern has always been the risk of silently generating bad
> data if there is a mismatch between the OS encoding and the stream
> encodings.
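
To make the concern concrete, here is a minimal sketch of such a mismatch. The file name and its ISO-8859-1 encoding are made up for illustration; only the built-in str/bytes codecs are used, not the real stdin/stdout objects.

```python
# Hypothetical file name as raw bytes from the OS, encoded in ISO-8859-1.
raw = b"caf\xe9.txt"

# Decode it the way an ASCII "filesystem" encoding with surrogateescape
# would: the undecodable byte 0xE9 becomes the lone surrogate U+DCE9.
name = raw.decode("ascii", "surrogateescape")

# Same encoding and same error handler: the original bytes come back.
roundtrip = name.encode("ascii", "surrogateescape")

# A strict UTF-8 stream refuses the lone surrogate instead of writing it.
try:
    name.encode("utf-8", "strict")
except UnicodeEncodeError:
    strict_failed = True
else:
    strict_failed = False

# A UTF-8 stream using surrogateescape writes the raw byte 0xE9 without
# complaint, and the result is not valid UTF-8: this is the silent bad data.
leaked = name.encode("utf-8", "surrogateescape")
try:
    leaked.decode("utf-8", "strict")
except UnicodeDecodeError:
    leaked_is_valid_utf8 = False
else:
    leaked_is_valid_utf8 = True
```

With the strict handler the mismatch is reported as an error; with surrogateescape the bytes pass through unchanged, which is only correct if the reader of the stream expects the same encoding as the OS side.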
Data can be loaded from OS functions, from files and from stdin. These 3 sources may use different and incompatible encodings.

surrogateescape is used by OS functions, and now also by stdin when the POSIX locale is used. When the POSIX locale is used, OS functions and stdin can use different encodings if the PYTHONIOENCODING environment variable is set. Since we are consenting adults, I guess that you understand what you are doing when you set PYTHONIOENCODING.

On Windows, the encoding of standard streams is the OEM code page, or the ANSI code page if a stream is redirected; it's unrelated to the LC_CTYPE locale. So surrogateescape could be used even if the encoding of standard streams is not ASCII. We may handle Windows differently, to use strict even if the LC_CTYPE locale is "C".

Note: On FreeBSD, Solaris and OpenIndiana, nl_langinfo(CODESET) announces an alias of the ASCII encoding when the LC_CTYPE locale is POSIX, whereas the mbstowcs() and wcstombs() functions use the ISO-8859-1 encoding. On these platforms, Python 3 now uses the ASCII encoding for its "filesystem" (OS) encoding.

Victor