Dāvis added the comment:
Of course I agree that the proper solution is to use the Unicode/wide API, but
that is much more work to implement, and I would rather have this half-fix now,
which works in most cases, than nothing until Unicode support is implemented,
which might not be anytime soon.
> IMO, it makes more sense for programs to use UTF-8, or even UTF-16. Codepages
> are a legacy that we need to move beyond. Internally the console uses
> UTF-16LE.
Yes, that's true, but we can't do anything about existing programs, so
defaulting to UTF-8 would be even worse than defaulting to ANSI: very few
programs on Windows use UTF-8, because, as you mentioned, the console itself
doesn't even have good UTF-8 support. Also, I'm talking here only about ANSI
WinAPI programs and their console/pipe encoding, not their internal or file
encoding, which we don't really care about here.
> Note that patch 3 requires setting `encoding` for even python.exe as a child
> process, because sys.std* default to ANSI when isatty(fd) isn't true.
I think Python is a bit broken here. IMO it should also use the console's
encoding, not ANSI, when its output goes to a console pipe, and use ANSI only
if the output really is a file.
On Windows 10 with Python 3.5.1:
>chcp
Active code page: 775
>python -c "print('ā')"
ā
>python -c "print('ā')" | echo
ECHO is on.
Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w'
encoding='cp1257'>
OSError: [Errno 22] Invalid argument
>chcp 1257
Active code page: 1257
>python -c "print('ā')" | echo
ECHO is on.
Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w'
encoding='cp1257'>
OSError: [Errno 22] Invalid argument
In PowerShell:
>[Console]::OutputEncoding.CodePage
775
>python -c "print('ā')" | Out-String
Ō
>[Console]::OutputEncoding = [System.Text.Encoding]::UTF8
>python -c "print('ā')" | Out-String
�
>[Console]::OutputEncoding = [System.Text.Encoding]::GetEncoding(1257)
>python -c "print('ā')" | Out-String
ā
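The PowerShell results above are consistent with the child writing 'ā' in the
ANSI code page (cp1257) while the reader decodes with whatever the console
output encoding is set to. A minimal sketch that reproduces this with Python's
codecs alone; the helper name decode_as is only illustrative, and the byte
value is what the cp1257 codec produces, not something captured from the pipe:

```python
# 'ā' (U+0101) as a piped child would write it when it encodes with cp1257.
raw = "\u0101".encode("cp1257")


def decode_as(data, encoding):
    """Decode bytes the way a reader assuming `encoding` would see them."""
    return data.decode(encoding, errors="replace")


as_oem = decode_as(raw, "cp775")    # reader assumes the OEM code page 775
as_utf8 = decode_as(raw, "utf-8")   # reader assumes UTF-8
as_ansi = decode_as(raw, "cp1257")  # reader assumes the ANSI code page 1257

# as_oem is 'Ō', as_utf8 is the replacement character, as_ansi is 'ā',
# which matches the three Out-String results shown in the transcript.
```

Only the last case round-trips, because it is the only one where writer and
reader agree on the encoding.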
> I proposed using the "/u" switch for shell=True only to facilitate getting
> results back from cmd's internal commands such as `set`. But it doesn't
> change anything if you're using the shell to run other programs.
But you can only do that if you know that the command you execute is one of
cmd's internal commands. If the command is passed in by the user, there isn't
really a reliable way to detect whether it will execute inside cmd or not. For
example, "cmd /u /c chcp.exe" will return its result in UTF-16, because no such
program exists and cmd's own error message is output. Also, if the user has a
set.exe in %System32%, then "cmd /u /c set" won't be in UTF-16, because it will
execute that program.
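To illustrate why this matters for the caller, here is a small sketch of what
happens when the reader guesses wrong. The sample text and the choice of cp1257
as the external program's encoding are assumptions for illustration; no cmd
process is actually started:

```python
text = "ECHO is on.\r\n"

# Bytes as cmd /u would emit them for its own messages (UTF-16LE).
from_cmd = text.encode("utf-16-le")
# Bytes as an external ANSI program might emit them (assumed cp1257 here).
from_program = text.encode("cp1257")

# Correct guess: the text survives.
ok = from_cmd.decode("utf-16-le")

# Wrong guesses: ANSI bytes read as UTF-16LE turn into unrelated characters,
# and UTF-16LE bytes read as ANSI are interleaved with NUL characters.
garbled = from_program.decode("utf-16-le", errors="replace")
padded = from_cmd.decode("cp1257")
```

Since the same command line can produce either byte stream depending on
whether cmd or an external program handles it, the caller cannot pick one
decoder up front.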
>> by calling GetConsoleOutputCP inside child process with CreateRemoteThread
> That's not the only way. You can also start a detached Python process (via
> pythonw.exe or DETACHED_PROCESS) to run a script that calls AttachConsole and
> returns the result of calling GetConsoleOutputCP:
While that's useful to know, it's still messy: I think you would need to
prevent your target process from exiting before you've called AttachConsole.
Also, you most likely want the GetConsoleOutputCP result just before the program
exits and not at its start (say, with CREATE_SUSPENDED), as the program might
have changed the code page somewhere in the middle of its execution. So it looks
like this route isn't worth pursuing.
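For reference, querying the output code page of the console attached to the
current process is simple; it is only the child's console at the right moment
that is hard to get at. A minimal sketch using ctypes, where the function name
console_output_encoding is hypothetical and the "cpNNN" naming assumes Python
has a codec under that name for the returned code page:

```python
import sys


def console_output_encoding():
    """Return e.g. 'cp775' for this process's console, or None if unavailable."""
    if sys.platform != "win32":
        return None  # the Windows console API does not exist here
    import ctypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    codepage = kernel32.GetConsoleOutputCP()  # returns 0 if no console is attached
    return "cp%d" % codepage if codepage else None


encoding = console_output_encoding()
```

This only reflects the code page at the time of the call, so it has the same
timing problem described above if the child changes it later.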
----------
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue27179>
_______________________________________