[issue42261] Windows legacy I/O mode mistakenly ignores the device encoding

2020-11-04 Thread Eryk Sun


Eryk Sun added the comment:

> It would be nice to get a unit test for this case.

The process code page from GetACP() is either an ANSI code page or CP_UTF8 
(65001); it should never be a Western OEM code page such as 850. A reliable 
unit test can therefore check that the configured encoding is a particular 
OEM code page. For example, spawn a new interpreter in a windowless console 
session (i.e. creationflags=CREATE_NO_WINDOW). In that interpreter, set the 
session's input code page to 850 via 
ctypes.WinDLL('kernel32').SetConsoleCP(850), and set 
os.environ['PYTHONLEGACYWINDOWSSTDIO'] = '1'. Then spawn [sys.executable, '-c', 
'import sys; print(sys.stdin.encoding)'], and verify that the output is 'cp850'.
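
The steps above could be sketched roughly as follows. This is an untested 
outline, not an actual test from the CPython test suite. The helper names 
build_child_script and legacy_stdin_encoding are made up for illustration, and 
passing CONIN$ explicitly as the innermost interpreter's stdin is an assumption 
added here so that its fd 0 is a console input handle of the new session.

```python
import subprocess
import sys
import textwrap


def build_child_script(codepage):
    """Return source for the intermediate interpreter (hypothetical helper).

    That interpreter owns the windowless console session: it sets the input
    code page, enables legacy stdio for its own child, and relays the output.
    """
    return textwrap.dedent(f"""\
        import ctypes, os, subprocess, sys
        kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)
        if not kernel32.SetConsoleCP({codepage:d}):
            raise ctypes.WinError(ctypes.get_last_error())
        os.environ['PYTHONLEGACYWINDOWSSTDIO'] = '1'
        # Assumption: give the child the new console's input buffer as stdin.
        with open('CONIN$', 'r') as conin:
            out = subprocess.check_output(
                [sys.executable, '-c', 'import sys; print(sys.stdin.encoding)'],
                stdin=conin)
        sys.stdout.write(out.decode('ascii').strip())
        """)


def legacy_stdin_encoding(codepage):
    """Return sys.stdin.encoding as seen in legacy mode, or None off Windows."""
    if sys.platform != 'win32':
        return None
    out = subprocess.check_output(
        [sys.executable, '-c', build_child_script(codepage)],
        stdin=subprocess.DEVNULL,
        creationflags=subprocess.CREATE_NO_WINDOW)
    return out.decode('ascii').strip()


if __name__ == '__main__':
    # The thread expects 'cp850' here once the regression is fixed.
    print(legacy_stdin_encoding(850))
```

A real test would compare the returned value with 'cp850' and skip on 
platforms other than Windows.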

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42261] Windows legacy I/O mode mistakenly ignores the device encoding

2020-11-04 Thread Eryk Sun


Eryk Sun added the comment:

> The solution here is to fix config_init_stdio_encoding() to use 
> GetConsoleCP() and GetConsoleOutputCP() to build a "cpXXX" string.

But, as I mentioned, that's only possible by replacing config->stdio_encoding 
with three separate settings: config->stdin_encoding, config->stdout_encoding, 
and config->stderr_encoding.
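
To illustrate why one setting is not enough: the console's input and output 
code pages are set independently (SetConsoleCP versus SetConsoleOutputCP), so 
stdin can legitimately differ from stdout and stderr. The following is only a 
sketch in Python via ctypes, not the C code in initconfig.c. The names 
codepage_name and console_stdio_encodings are hypothetical, and the sketch 
assumes all three streams are attached to the console (it ignores redirection).

```python
import sys


def codepage_name(cp):
    """Format a code page number as a "cpXXX" name; 0 means no console."""
    return f'cp{cp:d}' if cp else None


def console_stdio_encodings():
    """Return (stdin, stdout, stderr) console encodings, or None off Windows.

    Hypothetical helper. Each entry is None when the process has no console,
    because GetConsoleCP and GetConsoleOutputCP return 0 in that case.
    """
    if sys.platform != 'win32':
        return None
    import ctypes
    kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)
    input_name = codepage_name(kernel32.GetConsoleCP())
    output_name = codepage_name(kernel32.GetConsoleOutputCP())
    # stdout and stderr both write to the screen buffer, so they share the
    # output code page; stdin uses the input code page.
    return input_name, output_name, output_name


if __name__ == '__main__':
    print(console_stdio_encodings())
```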

--




[issue42261] Windows legacy I/O mode mistakenly ignores the device encoding

2020-11-04 Thread STINNER Victor


STINNER Victor added the comment:

> This is based on config_init_stdio_encoding() in Python/initconfig.c, which 
> sets config->stdio_encoding via config_get_locale_encoding(). Cannot 
> config->stdio_encoding be set to NULL for default behavior?

I would like the PyConfig structure to be fully populated, to make Python 
initialization more deterministic and reliable. That way, PyConfig fully 
controls which encodings are used.

The solution here is to fix config_init_stdio_encoding() to use GetConsoleCP() 
and GetConsoleOutputCP() to build a "cpXXX" string.

This issue seems to be a regression that I introduced in Python 3.8 with 
PEP 587 (PyConfig). I didn't notice this subtle case during my refactoring. 
Relying on os.device_encoding() when the encoding is NULL is not obvious. 
That's why I prefer to have PyConfig fully populated ;-)

It would be nice to get a unit test for this case.

--
nosy: +vstinner




[issue42261] Windows legacy I/O mode mistakenly ignores the device encoding

2020-11-04 Thread Eryk Sun


Eryk Sun added the comment:

There's a related issue that affects opening duplicated file descriptors and 
opening "CON", "CONIN$", and "CONOUT$" in legacy I/O mode, but that case has 
always been broken. On Windows, _Py_device_encoding needs to be generalized to 
use _get_osfhandle and GetNumberOfConsoleInputEvents to detect console files 
and differentiate input from output, instead of using isatty() and 
hard-coding file descriptors 0-2.
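
A rough Python sketch of that generalization, using ctypes rather than C, 
might look like the following. The names console_kind and encoding_for are 
hypothetical. The GetConsoleMode fallback used to recognize a screen buffer is 
an assumption added here; the message above only names _get_osfhandle and 
GetNumberOfConsoleInputEvents.

```python
import sys


def console_kind(fd):
    """Classify fd as 'input', 'output', or None (not a console)."""
    if sys.platform != 'win32':
        return None
    import ctypes
    import msvcrt
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)
    kernel32.GetNumberOfConsoleInputEvents.argtypes = (
        wintypes.HANDLE, wintypes.LPDWORD)
    kernel32.GetConsoleMode.argtypes = (wintypes.HANDLE, wintypes.LPDWORD)

    try:
        handle = msvcrt.get_osfhandle(fd)
    except OSError:
        return None  # fd is closed or invalid
    value = wintypes.DWORD()
    # This call succeeds only for a console input buffer.
    if kernel32.GetNumberOfConsoleInputEvents(handle, ctypes.byref(value)):
        return 'input'
    # Assumption: any other handle that has a console mode is a screen buffer.
    if kernel32.GetConsoleMode(handle, ctypes.byref(value)):
        return 'output'
    return None


def encoding_for(kind, input_cp, output_cp):
    """Pick the code page matching the console kind and format it as cpXXX."""
    cp = {'input': input_cp, 'output': output_cp}.get(kind, 0)
    return f'cp{cp:d}' if cp else None


if __name__ == '__main__':
    for fd in (0, 1, 2):
        print(fd, console_kind(fd))
```

Because the classification depends on the handle rather than the descriptor 
number, it would also cover a duplicated descriptor or a file opened as 
"CONIN$" or "CONOUT$".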

--




[issue42261] Windows legacy I/O mode mistakenly ignores the device encoding

2020-11-04 Thread Eryk Sun


New submission from Eryk Sun:

In Python 3.8+, legacy standard I/O mode uses the process code page from GetACP 
instead of the correct device encoding from GetConsoleCP and 
GetConsoleOutputCP. For example:

C:\>chcp 850
Active code page: 850
C:\>set PYTHONLEGACYWINDOWSSTDIO=1

C:\>py -3.7 -c "import sys; print(sys.stdin.encoding)"
cp850
C:\>py -3.8 -c "import sys; print(sys.stdin.encoding)"
cp1252
C:\>py -3.9 -c "import sys; print(sys.stdin.encoding)"
cp1252

This is based on config_init_stdio_encoding() in Python/initconfig.c, which 
sets config->stdio_encoding via config_get_locale_encoding(). Cannot 
config->stdio_encoding be set to NULL for default behavior?

Computing this ahead of time would require separate encodings 
config->stdin_encoding, config->stdout_encoding, and config->stderr_encoding. 
And _Py_device_encoding would have to be duplicated as something like 
config_get_device_encoding(PyConfig *config, int fd, wchar_t **device_encoding).

--
components: IO, Interpreter Core, Windows
messages: 380329
nosy: eryksun, paul.moore, steve.dower, tim.golden, zach.ware
priority: normal
severity: normal
stage: needs patch
status: open
title: Windows legacy I/O mode mistakenly ignores the device encoding
type: behavior
versions: Python 3.10, Python 3.8, Python 3.9
