Dāvis added the comment:

Of course I agree that the proper solution is to use the Unicode/wide API, but that is much more work to implement. I would rather have this half-fix now, which works for most cases, than nothing until Unicode support is implemented, which might not be anytime soon.

> IMO, it makes more sense for programs to use UTF-8, or even UTF-16. Codepages
> are a legacy that we need to move beyond. Internally the console uses
> UTF-16LE.

Yes, that's true, but we can't do anything about existing programs. If we default to UTF-8 it will be even worse than defaulting to ANSI, because few programs on Windows use UTF-8. In fact it's quite rare, because the console itself doesn't even have good UTF-8 support, as you mentioned. Also, I'm talking here only about ANSI WinAPI programs and their console/pipe encoding, not internal or file encoding, which we don't really care about here.

> Note that patch 3 requires setting `encoding` for even python.exe as a child
> process, because sys.std* default to ANSI when isatty(fd) isn't true.

I think Python is a bit broken here. IMO it should also use the console's encoding, not ANSI, when outputting to a console pipe, and use ANSI only if the output really is a file.

On Windows 10 with Python 3.5.1:

    >chcp
    Active code page: 775

    >python -c "print('ā')"
    ā

    >python -c "print('ā')" | echo
    ECHO is on.
    Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='cp1257'>
    OSError: [Errno 22] Invalid argument

    >chcp 1257
    Active code page: 1257

    >python -c "print('ā')" | echo
    ECHO is on.
    Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='cp1257'>
    OSError: [Errno 22] Invalid argument

In PowerShell:

    >[Console]::OutputEncoding.CodePage
    775

    >python -c "print('ā')" | Out-String
    Ō

    >[Console]::OutputEncoding = [System.Text.Encoding]::UTF8
    >python -c "print('ā')" | Out-String
    �

    >[Console]::OutputEncoding = [System.Text.Encoding]::GetEncoding(1257)
    >python -c "print('ā')" | Out-String
    ā

> I proposed using the "/u" switch for shell=True only to facilitate getting
> results back from cmd's internal commands such as `set`. But it doesn't
> change anything if you're using the shell to run other programs.
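
As an aside on the PowerShell results above, here is a minimal sketch of what I believe is happening: python.exe writes the byte for 'ā' in the ANSI code page (cp1257), and the reader then decodes that byte with whatever console output code page is active. The helper name `decode_pipe_output` is hypothetical and only for illustration; it is not part of any patch. The sketch uses only Python's built-in codecs, so it doesn't need Windows to run.

```python
# Sketch only: simulate a reader decoding Python's ANSI (cp1257) pipe output
# with each of the three console code pages used in the PowerShell session.

def decode_pipe_output(raw: bytes, codepage: str) -> str:
    """Decode bytes read from a child's stdout pipe (hypothetical helper)."""
    # errors="replace" substitutes U+FFFD for undecodable bytes instead of raising.
    return raw.decode(codepage, errors="replace")

# The single byte python.exe would write for 'ā' when the ANSI code page is 1257.
raw = "ā".encode("cp1257")

# Decode the same byte as the reader would under each console code page.
results = {
    codepage: decode_pipe_output(raw, codepage)
    for codepage in ("cp775", "utf-8", "cp1257")
}

for codepage, text in results.items():
    print(codepage, "->", text)
```

Only the reader that uses the same code page as the writer gets 'ā' back, which is why I think the child and the parent have to agree on the console's code page rather than each picking its own default.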

But you can only do that if you know that the command you execute is a cmd internal command. If the command is passed in by the user, there isn't really a reliable way to detect whether it will execute inside cmd or not. For example, "cmd /u /c chcp.exe" will return its result in UTF-16, because no such program exists and cmd's own error message will be output. Also, if the user has a set.exe in %System32%, then the output of "cmd /u /c set" won't be in UTF-16, because cmd will execute that program instead.

>> by calling GetConsoleOutputCP inside child process with CreateRemoteThread
> That's not the only way. You can also start a detached Python process (via
> pythonw.exe or DETACHED_PROCESS) to run a script that calls AttachConsole and
> returns the result of calling GetConsoleOutputCP:

While that's useful to know, it's still messy, because I think you would need to prevent the target process from exiting before you've called AttachConsole. Also, you most likely want the GetConsoleOutputCP value just before the program exits and not at its start (say, with CREATE_SUSPENDED), because the program might have changed the code page somewhere in the middle of its execution. So it looks like this route isn't worth pursuing.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue27179>
_______________________________________