Eryk Sun <[email protected]> added the comment:
On most platforms, unless UTF-8 mode is enabled,
locale.getpreferredencoding(False) returns the LC_CTYPE encoding of the current
locale. For example, in Linux:
>>> locale.setlocale(locale.LC_CTYPE, 'en_US.UTF-8')
'en_US.UTF-8'
>>> locale.getpreferredencoding(False)
'UTF-8'
>>> locale.setlocale(locale.LC_CTYPE, 'en_US.iso-88591')
'en_US.iso-88591'
>>> locale.getpreferredencoding(False)
'ISO-8859-1'
If the designers of the io module had wanted the preferred encoding to always
be the default encoding from setlocale(LC_CTYPE, ""), they would have used and
documented locale.getpreferredencoding(True).
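
As a rough way to inspect this on a given machine, the sketch below collects the current LC_CTYPE setting alongside what locale.getpreferredencoding(False) reports. The helper name describe_encodings is made up for illustration. locale.nl_langinfo() is only available on POSIX builds, so it is guarded.

```python
import locale


def describe_encodings():
    """Collect the encodings reported for the current LC_CTYPE locale.

    Hypothetical helper for illustration; not part of the locale module.
    """
    info = {
        # Calling setlocale() without a second argument only queries the
        # current setting; it does not change the locale.
        "lc_ctype": locale.setlocale(locale.LC_CTYPE),
        "preferred": locale.getpreferredencoding(False),
    }
    # nl_langinfo() does not exist on Windows, so guard the lookup.
    if hasattr(locale, "nl_langinfo"):
        info["codeset"] = locale.nl_langinfo(locale.CODESET)
    return info


print(describe_encodings())
```

On a POSIX system without UTF-8 mode, "preferred" and "codeset" would normally name the same encoding. With UTF-8 mode enabled, "preferred" is UTF-8 regardless of the locale.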
---
In Windows, locale.getpreferredencoding(False) always returns the default
encoding from locale.getdefaultlocale(), which is the process's active (ANSI)
code page. Changing it to track the LC_CTYPE locale would be convenient for
applications and scripts running in Windows 10, for which the CRT's POSIX
locale implementation has supported UTF-8 since the spring of 2018.
The base behavior can't be changed at this point, but a -X option and/or
environment variable could enable locale.getpreferredencoding(False) -- i.e.
locale._get_locale_encoding() -- to return the current LC_CTYPE encoding in
Windows, as it does in POSIX.
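
To illustrate what tracking the LC_CTYPE locale could mean, here is a minimal sketch that derives an encoding name from the string returned by setlocale(LC_CTYPE). It is an assumption-laden illustration, not how CPython implements locale._get_locale_encoding(). The function name lc_ctype_encoding is hypothetical, and it assumes the locale string has the form "language_region.codeset", where a purely numeric codeset is a Windows code page.

```python
import locale


def lc_ctype_encoding(locale_name=None):
    """Guess the encoding named by an LC_CTYPE locale string.

    Hypothetical helper for illustration only. Returns None when the
    locale string carries no codeset (for example, the "C" locale).
    """
    if locale_name is None:
        # Query the current LC_CTYPE locale without changing it.
        locale_name = locale.setlocale(locale.LC_CTYPE)
    if locale_name in ("C", "POSIX"):
        return None
    _, sep, codeset = locale_name.partition(".")
    if not sep:
        return None
    # Drop a trailing modifier such as "@euro".
    codeset = codeset.partition("@")[0]
    # Assumption: an all-digit codeset is a Windows code page number.
    if codeset.isdigit():
        return "cp" + codeset
    return codeset


print(lc_ctype_encoding("English_United States.1252"))
print(lc_ctype_encoding("en_US.UTF-8"))
```

A real implementation would also need to normalize the result and decide what to return for the "C" locale, which this sketch leaves as None.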
----------
nosy: +eryksun
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue43140>
_______________________________________