On Fri, Jan 10, 2014 at 4:35 PM, Nick Coghlan <ncogh...@gmail.com> wrote:
> On 10 January 2014 13:32, Lennart Regebro <rege...@gmail.com> wrote:
>> No, because your environment have a default language. And Python has a
>> default encoding. You only get problems when some file doesn't use the
>> default encoding.
>
> The reason Python 3 currently tries to rely on the POSIX locale
> encoding is that during the Python 3 development process it was
> pointed out that ShiftJIS, ISO-2022 and various CJK codec are in
> widespread use in Asia, since Asian users needed solutions to the
> problem of representing kana, ideographs and other non-Latin
> characters long before the Unicode Consortium existed.
>
> This creates a problem for Python 3, as assuming utf-8 means we have a
> high risk of corrupting user's data at least in Asian locales, as well
> as anywhere else where non-UTF-8 encodings are common (especially when
> encodings that aren't ASCII compatible are involved).
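
A minimal Python 3 sketch of the risk described in the quoted text. The
sample string and all variable names are my own illustrative assumptions,
not anything from the thread. It shows two failure modes of assuming
UTF-8: Shift JIS data fails to decode (or is replaced lossily), while
ISO-2022-JP data, which is not ASCII compatible, decodes "successfully"
into garbage.

```python
# Illustrative sketch only: the sample text and names are assumptions.
text = "日本語"

# Case 1: Shift JIS bytes read under a UTF-8 assumption.
sjis_bytes = text.encode("shift_jis")
try:
    sjis_bytes.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    # Strict decoding refuses the data outright.
    strict_ok = False

# With errors="replace" the data is accepted but damaged (U+FFFD inserted).
lossy = sjis_bytes.decode("utf-8", errors="replace")

# Only the correct codec recovers the original text.
restored = sjis_bytes.decode("shift_jis")

# Case 2: ISO-2022-JP uses 7-bit escape sequences, so a UTF-8 decode
# raises no error at all and silently yields the wrong characters.
iso_bytes = text.encode("iso2022_jp")
wrong = iso_bytes.decode("utf-8")

print("strict UTF-8 decode succeeded:", strict_ok)
print("lossy decode equals original:", lossy == text)
print("correct codec round-trips:", restored == text)
print("ISO-2022-JP read as UTF-8 equals original:", wrong == text)
```

The second case is the nastier one, since nothing signals that the
result is wrong.
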
In my experience, the concept of a default locale is deeply flawed.
What if I log into a (Linux) machine using an old Latin-1 PuTTY from
the Windows XP era, and most file names and contents are in UTF-8,
except for one directory where people from Eastern Europe upload files
via FTP in whatever encoding they choose? What should the "default"
encoding be now?

That's why I make it a principle to always unset all LC_* and LANG
variables, except when working locally, which happens rather rarely.
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev