On 10.02.2021 23:10, Eryk Sun wrote:
> On 2/10/21, M.-A. Lemburg <m...@egenix.com> wrote:
>>
>> setx PYTHONUTF8 1
>>
>> does the trick in an admin command shell on Windows globally.
> 
> The above command sets the variable only for the current user, which
> I'd recommend anyway. It does not require administrator access. To set
> a machine value, run `setx /M PYTHONUTF8 1`, which of course requires
> administrator access. Also, run `set PYTHONUTF8=1` in CMD or
> `$env:PYTHONUTF8=1` in PowerShell to set the variable in the current
> shell.

Thanks for the correction.
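
For anyone following along, here is a minimal sketch of how to check from
inside Python whether the variable actually took effect. The helper name
utf8_mode_status() is made up for illustration; sys.flags.utf8_mode and
os.environ are the only real APIs used.

```python
import os
import sys


def utf8_mode_status():
    """Report the PYTHONUTF8 setting seen by this process and the resulting mode."""
    return {
        # Raw value of the environment variable, or None if it is not set.
        "env": os.environ.get("PYTHONUTF8"),
        # True when the interpreter was started in UTF-8 mode
        # (via PYTHONUTF8=1 or -X utf8).
        "enabled": bool(sys.flags.utf8_mode),
    }


if __name__ == "__main__":
    print(utf8_mode_status())
```

Note that a value written with setx is only picked up by processes started
afterwards, so the check needs to run in a new shell.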

> Unrelated to UTF-8 mode and long-term plans to make UTF-8 the
> preferred encoding, what I want, from the perspective of writing
> applications and scripts (not libraries), is a -X option and/or
> environment variable to make locale._get_locale_encoding() behave like
> it does in POSIX. It should return the LC_CTYPE codeset of the current
> locale, not just the default locale.

That's what getlocale(LC_CTYPE) is intended for, unless I'm
missing something.

getdefaultlocale(), which uses _locale._getdefaultlocale() on
Windows, is meant to determine the locale settings that
setlocale(locale.LC_ALL, '') would set for the current
process, without actually calling setlocale().

We have this API because setlocale() is not thread-safe: calling
setlocale(locale.LC_ALL, '') just to inspect the result, and then
resetting the locale afterwards, could cause problems in other
threads.
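
To illustrate the set-and-reset pattern that getdefaultlocale() lets you
avoid, here is a rough sketch. The temporary_locale() helper and the
module-level lock are hypothetical names, not part of the locale module,
and the lock only serializes callers that go through this helper; other
threads still see the changed locale while the block runs.

```python
import locale
import threading
from contextlib import contextmanager

# Hypothetical lock: it only protects code that cooperates by using it.
_locale_lock = threading.Lock()


@contextmanager
def temporary_locale(category, value):
    """Switch the given locale category to `value`, then restore it."""
    with _locale_lock:
        # Calling setlocale() without a value queries the current setting.
        saved = locale.setlocale(category)
        try:
            locale.setlocale(category, value)
            yield
        finally:
            # Restore the previous setting even if the body raised.
            locale.setlocale(category, saved)


if __name__ == "__main__":
    with temporary_locale(locale.LC_CTYPE, "C"):
        print(locale.setlocale(locale.LC_CTYPE))
```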

> This would allow setlocale() in
> Windows to change the default for encoding=None, just as it does in
> POSIX. Technically it's not hard to implement in a way that's as
> reliable as nl_langinfo(CODESET) in POSIX. The code page of the
> current CRT locale is a public field. In Windows 10 the CRT has
> supported UTF-8 for 3 years -- regardless of the process active code
> page returned by GetACP(). Just call setlocale(LC_CTYPE, ".UTF-8") or
> setlocale(LC_CTYPE, (getdefaultlocale()[0], 'UTF-8')).

I think the main problem here is that open() doesn't use
locale.getlocale()[1] as default for the encoding parameter,
but instead locale.getpreferredencoding(False).

The latter doesn't change when you adjust the locale for the
current process on Windows:

>>> import locale
>>> locale.getdefaultlocale()
('de_DE', 'cp1252')
>>> locale.getlocale()
('de_DE', 'cp1252')
>>> locale.setlocale(locale.LC_CTYPE, ('de_DE', 'UTF-8'))
'de_DE.UTF-8'
>>> locale.getpreferredencoding(False)
'cp1252'
>>> f = open(r'some-file.txt')
>>> f.encoding
'cp1252'

On Linux, locale.getpreferredencoding(False) does reflect
changes made using setlocale().
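
Until that changes, an application can pass the encoding explicitly. The
following is only a sketch of one way to do that: current_locale_encoding()
and open_text() are hypothetical helpers, and falling back to
getpreferredencoding(False) when the locale reports no usable codeset is my
assumption about what a sensible default would be.

```python
import codecs
import locale


def current_locale_encoding():
    """Return the LC_CTYPE codeset of the current locale, if Python knows it."""
    try:
        encoding = locale.getlocale(locale.LC_CTYPE)[1]
    except ValueError:
        # getlocale() raises ValueError for locale names it cannot parse.
        encoding = None
    if encoding:
        try:
            codecs.lookup(encoding)  # make sure a codec exists for this name
            return encoding
        except LookupError:
            pass
    # Assumed fallback: the same value open() uses when encoding=None.
    return locale.getpreferredencoding(False)


def open_text(path, mode="r", **kwargs):
    """Like open(), but defaults encoding to the current LC_CTYPE codeset."""
    kwargs.setdefault("encoding", current_locale_encoding())
    return open(path, mode, **kwargs)
```

With this, a file opened via open_text() after a setlocale() call picks up
the new codeset, while an explicit encoding= argument still takes precedence.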

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Experts (#1, Feb 11 2021)
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/5HX2GW6KTFEWZFRDCWG2CDBAMXC663HY/