On Tue, Jan 26, 2021 at 4:36 PM Eryk Sun <eryk...@gmail.com> wrote:
>
> One concern is what to do for the special "ansi" and "oem" encodings.
> If scripts rely on them for IPC, such as with subprocess.Popen(), then
> it could be frustrating if they're just synonyms for UTF-8 (code page
> 65001). I've tested that it's possible for Python to peg "ansi" and
> "oem" to the system ANSI and OEM code pages via GetLocaleInfoEx() with
> LOCALE_NAME_SYSTEM_DEFAULT and the LCType constants
> LOCALE_IDEFAULTANSICODEPAGE and LOCALE_IDEFAULTCODEPAGE (OEM). But
> then they're no longer accurate within the current process, for which
> ANSI and OEM are UTF-8.

You are right. That's why I didn't change the default encoding of
subprocess in PEP 597.
The UTF-8 version of Python should change only the default text
encoding, so it shouldn't use the UTF-8 code page.

The current UTF-8 mode has the same problem: it affects the PIPE
encoding too. But we can change its behavior on Windows to:

* The default encoding of TextIOWrapper and most wrappers (e.g.
open(), Path.open(), Path.read_text(), gzip.open(), ...) becomes
"utf-8".
* locale.getpreferredencoding(False) returns the code page encoding
(e.g. "cp932").
* The subprocess module uses `locale.getpreferredencoding(False)` for the
default PIPE encoding.
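
To illustrate the split above, here is a minimal sketch that spells out
both defaults explicitly, which already works today. The helper names
`default_text_encoding` and `default_pipe_encoding` are hypothetical and
exist only for this example; they are not part of any Python API. Note
that under the current UTF-8 mode `locale.getpreferredencoding(False)`
reports UTF-8, so the code page result is what the proposal would change.

```python
import locale
import os
import subprocess
import sys
import tempfile


def default_text_encoding():
    # Hypothetical helper: what open() and friends would default to.
    return "utf-8"


def default_pipe_encoding():
    # Hypothetical helper: what subprocess would use for PIPE.
    # On a Japanese Windows system this would be e.g. "cp932" under the
    # proposal; the value depends on the platform and the active mode.
    return locale.getpreferredencoding(False)


# Files: always UTF-8, independent of the code page.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "note.txt")
    with open(path, "w", encoding=default_text_encoding()) as f:
        f.write("こんにちは")
    with open(path, encoding=default_text_encoding()) as f:
        text = f.read()

# Pipes: decode child output with the locale (code page) encoding.
result = subprocess.run(
    [sys.executable, "-c", "print('hello')"],
    stdout=subprocess.PIPE,
    encoding=default_pipe_encoding(),
    check=True,
)
```

With this split, scripts that talk to legacy console programs keep
decoding their output with the code page, while file I/O moves to UTF-8.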

And we can provide two versions of Python for Windows.

* "Python (UTF-8 version)" will enable the UTF-8 mode by default.
* "Python (ANSI version)" will disable the UTF-8 mode by default.

Users can override the default with the `-Xutf8` option and the
`PYTHONUTF8` environment variable.
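
As a rough sketch of how the override can be observed, the snippet below
starts a child interpreter with the option or the environment variable
set and reads back `sys.flags.utf8_mode`. The helper `child_utf8_mode`
is a hypothetical name used only here; the sketch shows the existing
switches, not the proposed installer variants.

```python
import os
import subprocess
import sys


def child_utf8_mode(args=(), env_override=None):
    """Return sys.flags.utf8_mode as seen by a freshly started child."""
    env = dict(os.environ)
    env.pop("PYTHONUTF8", None)  # start from a known state
    if env_override:
        env.update(env_override)
    out = subprocess.run(
        [sys.executable, *args, "-c",
         "import sys; print(sys.flags.utf8_mode)"],
        stdout=subprocess.PIPE,
        env=env,
        encoding="ascii",
        check=True,
    )
    return int(out.stdout.strip())


enabled_by_option = child_utf8_mode(args=["-X", "utf8"])
disabled_by_option = child_utf8_mode(args=["-X", "utf8=0"])
enabled_by_env = child_utf8_mode(env_override={"PYTHONUTF8": "1"})
disabled_by_env = child_utf8_mode(env_override={"PYTHONUTF8": "0"})
```

Either switch would let a user of the "UTF-8 version" fall back to the
legacy behavior, or a user of the "ANSI version" opt in.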

Does this idea make sense?

-- 
Inada Naoki  <songofaca...@gmail.com>
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/KRT45DM5NJMLM22BHYHSWVYLPAXDK23A/
Code of Conduct: http://python.org/psf/codeofconduct/
