Dāvis added the comment:

> qprocess.exe (also to console)
> quser.exe (also to console)

These are broken (http://i.imgur.com/0zIhHrv.png):

    >chcp 1257
    >quser
     USERNAME              SESSIONNAME
     dƒvis                 console

    > chcp 775
    > quser
     USERNAME              SESSIONNAME
     dāvis                 console
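
The difference between the two runs can be reproduced in Python. This is only a minimal sketch, assuming quser writes its output in the OEM code page (775 here); the byte string is built by hand, not captured from quser.

```python
# Assumption: the child wrote the user name in the OEM code page (cp775).
raw = "d\u0101vis".encode("cp775")

# Decoding with the same code page gives the name back.
right = raw.decode("cp775")

# Decoding with the ANSI code page misreads the byte for "ā";
# errors="replace" keeps this from raising if that byte is unmapped.
wrong = raw.decode("cp1257", errors="replace")

print(right == "d\u0101vis", wrong == right)
```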


We have to decide which code page to use as the default, and it should cover 
most cases, not some minority of programs. So I would say using the console's 
code page when it's available makes the most sense, falling back to the ANSI 
code page when it isn't.
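
A rough sketch of that rule, not the code from the patch: guess_child_encoding is a hypothetical helper name, and it assumes that locale.getpreferredencoding(False) reports the ANSI code page on Windows (which may not hold if Python's UTF-8 mode is enabled).

```python
import codecs
import locale
import sys


def guess_child_encoding():
    """Hypothetical helper: console output code page if any, else ANSI."""
    if sys.platform == "win32":
        import ctypes

        # GetConsoleOutputCP returns 0 when no console is attached.
        cp = ctypes.windll.kernel32.GetConsoleOutputCP()
        if cp:
            name = "cp%d" % cp
            try:
                codecs.lookup(name)  # make sure Python knows this codec
                return name
            except LookupError:
                pass
    # Fallback: the locale's preferred encoding (assumed ANSI on Windows).
    return locale.getpreferredencoding(False)


print(guess_child_encoding())
```

On non-Windows platforms the helper simply returns the locale's preferred encoding.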

For the special cases where our guess is wrong, only the user can know which 
encoding is right, so they must specify it.


I also checked the cmd /u flag, and it is totally useless here: it applies only 
to cmd itself, not to any other programs. To use it we would need to check 
whether the returned output is actually UTF-16 or some other encoding that 
might even pass as valid UTF-16.

For example,
    cmd /u /c "echo ā"
returns ā encoded as UTF-16, but for
    cmd /u /c "sc query"
the result is encoded in the OEM code page (775 for me), with no sign of UTF-16.
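
To illustrate why a "does it decode as UTF-16?" check is unreliable, here is a small sketch. The sample text is an assumption chosen for illustration, not real sc output.

```python
# Assumption: a child wrote plain OEM-encoded text such as "SERVICE_NAME".
oem_bytes = "SERVICE_NAME".encode("cp775")

# An even number of bytes decodes as UTF-16-LE without any error...
as_utf16 = oem_bytes.decode("utf-16-le")

# ...but the result is unrelated text, so a successful decode proves nothing.
print(len(oem_bytes), len(as_utf16))
print(as_utf16 == "SERVICE_NAME")
```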


I looked for a function to get the encoding used by a child process, but there 
isn't one; I would have expected something like GetConsoleOutputCP(hThread). 
So the only way to get it is to call GetConsoleOutputCP inside the child 
process via CreateRemoteThread. It's not pretty and quite hacky, but it does 
work; I tested it.

Anyway, even with that we would need to change something about TextIOWrapper, 
because we create it before the process is even started and its encoding can't 
be changed afterwards.
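
One way around that ordering problem, sketched here from the caller's side rather than inside subprocess itself, is to leave the pipe in binary mode and create the TextIOWrapper only once the encoding is known. The child below is a stand-in that writes cp775 bytes; it is an assumption for the demo, not a real console program.

```python
import io
import subprocess
import sys

# Stand-in child (assumption): writes "dāvis" encoded in cp775.
child_code = r"import sys; sys.stdout.buffer.write('d\u0101vis'.encode('cp775'))"

# No encoding is chosen here, so proc.stdout is a binary pipe.
proc = subprocess.Popen([sys.executable, "-c", child_code], stdout=subprocess.PIPE)

# The process is already running; the encoding is picked only now.
reader = io.TextIOWrapper(proc.stdout, encoding="cp775")
text = reader.read()
reader.close()
proc.wait()

print(text == "d\u0101vis")
```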




I updated the patch: it fixes the issues with creationflags and also adds an 
option to change the encoding, based on subprocess3.patch (from #6135).

So now, with my patch, it really works for most cases.

    >python -c "import subprocess; subprocess.getstatusoutput('ā')"

This works correctly for me, with the correct encoding, when the console's code 
page is set to any of 775 (OEM), 1257 (ANSI) or 65001 (UTF-8).

It also works correctly with any of DETACHED_PROCESS, CREATE_NEW_CONSOLE and 
CREATE_NO_WINDOW:

    >python -c "import subprocess; subprocess.getstatusoutput('ā', creationflags=0x00000008)"


This also works correctly with the console code pages 775, 1257 and 65001:

    >python -c "from distutils import _msvccompiler; _msvccompiler._get_vc_env('')"



And finally:

    > chcp 1257
    > python -c "import subprocess; print(subprocess.check_output('quser', encoding='cp775'))"
    USERNAME              SESSIONNAME
    dāvis                 console

This also works correctly with any console code page, even when cmd itself 
didn't display the output in the correct encoding.
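
For anyone who wants to try the explicit-encoding idea without quser: as far as I know, stock Python 3.6 and later accept an encoding argument in subprocess.check_output. The sketch below uses a stand-in child that emits cp775 bytes (an assumption for the demo), so it does not depend on the console's code page.

```python
import subprocess
import sys

# Stand-in child (assumption): writes "dāvis" encoded in cp775.
child_code = r"import sys; sys.stdout.buffer.write('d\u0101vis'.encode('cp775'))"

# The caller states the encoding explicitly instead of relying on a guess.
out = subprocess.check_output([sys.executable, "-c", child_code], encoding="cp775")

print(out == "d\u0101vis")
```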

----------
Added file: http://bugs.python.org/file43183/subprocess_fix_encoding_v3.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue27179>
_______________________________________