On 6 May 2017 at 18:33, Nick Coghlan <ncogh...@gmail.com> wrote:
> On 6 May 2017 at 18:00, Nick Coghlan <ncogh...@gmail.com> wrote:
>> On 5 March 2017 at 17:50, Nick Coghlan <ncogh...@gmail.com> wrote:
>>> Hi folks,
>>>
>>> Late last year I started working on a change to the CPython CLI (*not* the
>>> shared library) to get it to coerce the legacy C locale to something based
>>> on UTF-8 when a suitable locale is available.
>>>
>>> After a couple of rounds of iteration on linux-sig and python-ideas, I'm now
>>> bringing it to python-dev as a concrete proposal for Python 3.7.
>>>
>>> For most folks, reading the Abstract plus the draft docs updates in the
>>> reference implementation will tell you everything you need to know (if the
>>> C.UTF-8, C.utf8 or UTF-8 locales are available, the CLI will automatically
>>> attempt to coerce the legacy C locale to one of those rather than persisting
>>> with the latter's default assumption of ASCII as the preferred text
>>> encoding).
>>
>> I've just pushed a significant update to the PEP based on the
>> discussions in this thread:
>> https://github.com/python/peps/commit/2fb53e7c1bbb04e1321bca11cc0112aec69f6398
>>
>> The main change at the technical level is to modify the handling of
>> the coercion target locales such that they *always* lead to
>> "surrogateescape" being used by default on the standard streams. That
>> means we don't need to call "Py_SetStandardStreamEncoding" during
>> startup, that subprocesses will behave the same way as their parent
>> processes, and that Python in Linux containers will behave
>> consistently regardless of whether the container locale is set to
>> "C.UTF-8" explicitly, or is set to "C" and then coerced to "C.UTF-8"
>> by CPython.
>
> Working on the revised implementation for this, I've ended up
> refactoring it so that all the heavy lifting is done by a single
> function exported from the shared library: "_Py_CoerceLegacyLocale()".
>
> The CLI code then just contains the check that says "Are we running in
> the legacy C locale? If so, call _Py_CoerceLegacyLocale()", with all
> the details of how the coercion actually works being hidden away
> inside pylifecycle.c.
>
> That seems like a potential opportunity to make the 3.7 version of
> this a public API, using the following pattern:
>
>     if (Py_LegacyLocaleDetected()) {
>         Py_CoerceLegacyLocale();
>     }
>
> That way applications embedding CPython that wanted to implement the
> same locale coercion logic would have an easy way to do so.
OK, the reference implementation has been updated to match the latest
version of the PEP:
https://github.com/ncoghlan/cpython/commit/188e7807b6d9e49377aacbb287c074e5cabf70c5

For now, the implementation in the standalone CLI looks like this:

    /* [snip] */
    extern int _Py_LegacyLocaleDetected(void);
    extern void _Py_CoerceLegacyLocale(void);
    /* [snip] */
    if (_Py_LegacyLocaleDetected()) {
        _Py_CoerceLegacyLocale();
    }

If we decide to make this a public API for 3.7, the necessary changes
would be:

- remove the leading underscore from the function names
- add the function prototypes to the pylifecycle.h header
- add the APIs to the C API documentation in the configuration &
  initialization section
- define the APIs in the PEP
- adjust the backport note in the PEP to say that backports should NOT
  expose the public C API, but keep it private

Cheers,
Nick.

--
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
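For anyone who wants a feel for the detect-then-coerce pattern without
building CPython, here is a rough standalone sketch in plain C. It is
not the reference implementation: the names is_legacy_c_locale(),
legacy_locale_detected(), coerce_legacy_locale() and configure_locale()
are hypothetical stand-ins, and the only details taken from the
proposal are the check for the legacy C locale and the candidate
targets (C.UTF-8, C.utf8, UTF-8). Trying them in that order is an
assumption of the sketch.

```c
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: does this locale name denote the legacy C locale? */
static int is_legacy_c_locale(const char *name)
{
    return name != NULL && strcmp(name, "C") == 0;
}

/* Stand-in for the detection step: inspect the current LC_CTYPE setting. */
static int legacy_locale_detected(void)
{
    return is_legacy_c_locale(setlocale(LC_CTYPE, NULL));
}

/* Stand-in for the coercion step: try each candidate target in turn.
 * Returns the name of the locale that was applied, or NULL if none of
 * the candidates is available on this system. */
static const char *coerce_legacy_locale(void)
{
    static const char *const targets[] = {"C.UTF-8", "C.utf8", "UTF-8"};
    size_t i;

    for (i = 0; i < sizeof(targets) / sizeof(targets[0]); i++) {
        if (setlocale(LC_CTYPE, targets[i]) != NULL) {
            return targets[i];
        }
    }
    return NULL;
}

/* The calling pattern an embedding application might use at startup. */
static const char *configure_locale(void)
{
    /* Adopt whatever locale the environment asks for. */
    setlocale(LC_ALL, "");

    if (legacy_locale_detected()) {
        return coerce_legacy_locale();
    }
    return NULL;  /* nothing to coerce */
}
```

Note that this sketch only changes the locale of the current process.
It does nothing about the environment seen by subprocesses, so it
should not be read as covering the subprocess consistency described
in the PEP update.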