On Thu, Aug 18, 2016 at 3:25 PM, Steve Dower <steve.do...@python.org> wrote:
> allow us to change locale.getpreferredencoding() to utf-8 on Windows

_bootlocale.getpreferredencoding would need to be hard coded to return
'utf-8' on Windows. _locale._getdefaultlocale() itself shouldn't
return 'utf-8' as the encoding because the CRT doesn't allow it as a
locale encoding.

site.aliasmbcs() uses getpreferredencoding, so it will need to be
modified. The codecs module could add get_acp and get_oemcp functions
based on GetACP and GetOEMCP, returning for example 'cp1252' and
'cp850'. Then aliasmbcs could call get_acp.
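
As a rough sketch of what those two functions might look like (written
here with ctypes for illustration; the names get_acp and get_oemcp are
the ones proposed above, while codepage_name is a hypothetical helper,
and none of this exists in the codecs module today):

```python
import sys


def codepage_name(codepage):
    """Format a Windows code page number as a codec name, e.g. 1252 -> 'cp1252'."""
    return 'cp%d' % codepage


def get_acp():
    """Return the ANSI code page as a codec name, or None when not on Windows."""
    if sys.platform != 'win32':
        return None
    import ctypes  # imported lazily; windll only exists on Windows
    return codepage_name(ctypes.windll.kernel32.GetACP())


def get_oemcp():
    """Return the OEM code page as a codec name, or None when not on Windows."""
    if sys.platform != 'win32':
        return None
    import ctypes
    return codepage_name(ctypes.windll.kernel32.GetOEMCP())
```

A real implementation would presumably live in C next to the existing
code page support; returning None off Windows is just a choice made for
this sketch.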

Adding get_oemcp would also help with decoding output from
subprocess.Popen. There's been discussion about adding encoding and
errors options to Popen, and what the default should be. When writing
to a pipe or file, some programs use OEM, some use ANSI, some use the
console codepage if available, and far fewer use Unicode encodings.
Obviously it's better to specify the encoding in each case if you know
it.
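
To illustrate why the choice matters, here is a small sketch that reads
a child's output as bytes and decodes it with an explicitly chosen
encoding. The helper run_and_decode is hypothetical, and the child is
just a Python one-liner standing in for a program that writes OEM
(cp850) bytes to a pipe:

```python
import subprocess
import sys


def run_and_decode(args, encoding, errors='strict'):
    """Run a command, capture stdout as bytes, and decode it explicitly."""
    proc = subprocess.Popen(args, stdout=subprocess.PIPE)
    out, _ = proc.communicate()
    return out.decode(encoding, errors)


# Stand-in child process: writes the raw byte 0x82, which is 'é' in cp850.
child = [sys.executable, '-c',
         "import sys; sys.stdout.buffer.write(b'caf\\x82')"]

as_oem = run_and_decode(child, 'cp850')    # decoded with the OEM code page
as_ansi = run_and_decode(child, 'cp1252')  # same bytes, ANSI code page
```

The same byte decodes to a different character under cp1252, which is
the kind of mojibake a wrong default would produce.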

Regarding the locale module, how about modernizing
_locale._getdefaultlocale to return the Windows locale name [1] from
GetUserDefaultLocaleName? For example, it could return a tuple such as
('en-GB', None) or ('uz-Latn-UZ', None) -- always with the encoding
set to None. The CRT accepts the new locale names, but it hasn't fully
caught up: it still sets a legacy locale when the locale string is
empty. In that case the high-level setlocale could call
_getdefaultlocale to get the name. Also _parse_localename, which is
called by getlocale, needs to return a tuple with the encoding as None.
Currently it raises a ValueError for Windows locale names as defined
by [1].
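
A minimal sketch of the parsing side, assuming a deliberately loose
pattern of language[-Script][-REGION]. The function name
parse_windows_localename and the regex are my own for illustration, not
the actual locale module code, and the pattern ignores things like
alternate sort suffixes:

```python
import re

# Loose approximation of a Windows locale name: a 2-3 letter language,
# an optional 4-letter script, and an optional 2-letter or 3-digit region.
_WINDOWS_LOCALE_NAME = re.compile(
    r'^[A-Za-z]{2,3}(-[A-Za-z]{4})?(-([A-Za-z]{2}|[0-9]{3}))?$')


def parse_windows_localename(localename):
    """Return (localename, None) for names that look like Windows locale names."""
    if _WINDOWS_LOCALE_NAME.match(localename):
        # The encoding slot is always None; the name carries no code page.
        return localename, None
    raise ValueError('unknown locale: %s' % localename)
```

With something like this, 'en-GB' and 'uz-Latn-UZ' would come back as
('en-GB', None) and ('uz-Latn-UZ', None) instead of raising.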

[1]: https://msdn.microsoft.com/en-us/library/dd373814