On Mon, Mar 13, 2017 at 8:01 PM, Nick Coghlan <ncogh...@gmail.com> wrote:
> On 13 March 2017 at 18:37, INADA Naoki <songofaca...@gmail.com> wrote:
>>
>> But locale coercing works nice on platforms like android.
>> So how about simplified version of PEP 538? Just adding configure
>> option for locale coercing
>> which is disabled by default. No envvar options and no warnings.
>
>
> That doesn't solve my original Linux distro problem, where locale
> misconfiguration problems show up as "Python 2 works, Python 3 doesn't work"
> behaviour and bug reports.

Sorry, I meant "PEP 540 + simplified PEP 538 (coercion enabled by a
configure option)".
Distros can enable the configure option, of course.

>
> The problem is that where Python 2 was largely locale-independent by default
> (just passing raw bytes through) such that you'd only get immediate encoding
> or decoding errors if you had a Unicode literal or a decode() call somewhere
> in your code and would otherwise pass data corruption problems further down
> the chain, Python 3 is locale-*aware* by default, and eagerly decodes:
>
> - command line parameters
> - environment variables
> - responses from operating system API calls
> - standard stream input
> - file contents
>
> You *can* still write locale-independent Python 3 applications, but they
> involve sprinkling liberal doses of "b" prefixes and suffixes and mode
> settings and "surrogateescape" error handler declarations in various places
> - you can't just run python-modernize over a pre-existing Python 2
> application and expect it to behave the same way in the C locale as it did
> before.
>
> Once implemented, PEP 540 will partially solve the problem by introducing a
> locale independent UTF-8 mode, but that still leaves the inconsistency with
> other locale-aware components that are needing to deal with Python 3 API
> calls that accept or return Unicode objects where Python 2 allowed the use
> of 8-bit strings.

I feel that the problems PEP 538 solves but PEP 540 doesn't are
relatively small compared with the complexity PEP 538 introduces.

As I understand it, PEP 538 solves problems only when:

* The python executable is used. (GUI applications that link Python
  for plugins are not affected.)
* One of C.UTF-8, C.utf8 or UTF-8 is accepted for LC_CTYPE.
* The "locale-aware components" use something other than ASCII or
  UTF-8 in the C locale, but use UTF-8 in a UTF-8 locale.

Can't we reduce the number of options from 3 (2 configure, 1 envvar)
when PEP 540 is accepted too?
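
For reference, here is a minimal sketch of what the "surrogateescape" error handler mentioned in the quoted text does when bytes are not valid UTF-8. The byte string is a made-up example, not something from this thread, and the snippet only illustrates the codec behaviour, not either PEP's implementation:

```python
# Hypothetical input: a Latin-1 encoded file name, which is not valid UTF-8.
raw = b"caf\xe9.txt"

# Strict decoding raises, which is the kind of eager-decode failure
# that shows up when Python 3 meets unexpected bytes.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# surrogateescape maps each undecodable byte to a lone surrogate
# in the range U+DC80..U+DCFF instead of raising.
text = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
round_tripped = text.encode("utf-8", errors="surrogateescape")

print(strict_ok, ascii(text), round_tripped == raw)
```

The round trip is lossless, which is why this handler lets bytes-oriented data pass through str-based APIs without corruption.
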
>
> Folks that really want the old behaviour back will be able to set
> PYTHONCOERCECLOCALE=0 (as that no longer emits any warnings), or else build
> their own CPython from source using `--without-c-locale-coercion` and
> `--without-c-locale-warning`. However, they'll also get the explicit
> support notification from PEP 11 that any Unicode handling bugs they run
> into in those configurations are entirely their own problem - we won't fix
> them, because we consider those configurations unsupportable in the general
> case.
>
> That puts the additional self-support burden on folks doing something
> unusual (i.e. insisting on running an ASCII-only environment in 2017),
> rather than on those with a more conventional use case (i.e. running an up
> to date *nix OS using UTF-8 or another universal encoding for both local
> and remote interfaces).
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia