[issue27179] subprocess uses wrong encoding on Windows

2016-09-05 Thread Steve Dower
Steve Dower added the comment: Chatting about this with Victor we've decided to close this as a duplicate of issue6135 and continue the discussion there, and also focus mainly on exposing the parameter rather than trying to guess the correct encoding. I'll post more details on issue6135.
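To illustrate what "exposing the parameter" looks like from the caller's side, here is a minimal sketch. It assumes Python 3.6 or later, where subprocess.run() and Popen accept `encoding` and `errors` keyword arguments. The child command and the name `child_code` are made up for the example and do not come from the thread.

```python
import subprocess
import sys

# Hypothetical child: writes UTF-8 bytes to stdout regardless of the
# platform's console or ANSI codepage.
child_code = "import sys; sys.stdout.buffer.write('\u0101'.encode('utf-8'))"

# The caller states the encoding explicitly instead of letting
# subprocess guess it from the locale.
result = subprocess.run(
    [sys.executable, "-c", child_code],
    stdout=subprocess.PIPE,
    encoding="utf-8",
    errors="strict",
)

# ascii() keeps the demo printable on any terminal encoding.
print(ascii(result.stdout))
```

The caller has to know (or agree on) the child's encoding; nothing here detects it.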

[issue27179] subprocess uses wrong encoding on Windows

2016-09-03 Thread Steve Dower
Steve Dower added the comment: I'll take a look during the week. I like parts of the patch but not all of it, but while we're inevitably discussing my PEPs it's sure to come up. -- assignee: -> steve.dower

[issue27179] subprocess uses wrong encoding on Windows

2016-09-03 Thread Dāvis
Dāvis added the comment: That is a great PEP, but it will take a lot of time to be implemented and it doesn't really solve this issue. This is a different issue from filename/path encoding. Here we need to decode binary output from other applications and that for a lot of applications will be

[issue27179] subprocess uses wrong encoding on Windows

2016-09-03 Thread STINNER Victor
STINNER Victor added the comment: You should take a look at the recent PEP 529 "Change Windows filesystem encoding to UTF-8": https://www.python.org/dev/peps/pep-0529/

[issue27179] subprocess uses wrong encoding on Windows

2016-09-03 Thread Dāvis
Dāvis added the comment: ping? Could someone review my patch?

[issue27179] subprocess uses wrong encoding on Windows

2016-06-08 Thread Dāvis
Changes by Dāvis: Added file: http://bugs.python.org/file43316/subprocess_fix_encoding_v4fixed.patch

[issue27179] subprocess uses wrong encoding on Windows

2016-06-08 Thread Dāvis
Changes by Dāvis: Removed file: http://bugs.python.org/file43315/subprocess_fix_encoding_v4.patch

[issue27179] subprocess uses wrong encoding on Windows

2016-06-08 Thread Dāvis
Dāvis added the comment:
> Note that patch 3 requires setting `encoding` for even python.exe as a child
> process, because sys.std* default to ANSI when isatty(fd) isn't true.
I've updated my patch so that Python outputs in the console's encoding for pipes too. So now in PowerShell

[issue27179] subprocess uses wrong encoding on Windows

2016-06-05 Thread Eryk Sun
Eryk Sun added the comment:
> so if we default to UTF-8 it will be even worse than
> defaulting to ANSI because there aren't many programs
> on Windows which would use UTF-8
I didn't say subprocess should default to UTF-8. What I wish is for Python to default to using UTF-8 for its own pipe

[issue27179] subprocess uses wrong encoding on Windows

2016-06-04 Thread Dāvis
Dāvis added the comment: Of course I agree that the proper solution is to use the Unicode/Wide API, but that's much more work to implement, and I'd rather have this half-fix now, which works in most cases, than nothing until Unicode support is implemented, which might not happen anytime soon. > IMO,

[issue27179] subprocess uses wrong encoding on Windows

2016-06-04 Thread Eryk Sun
Eryk Sun added the comment: Another set of counterexamples are the utilities in the GnuWin32 collection, which use ANSI in a pipe:
    >>> call('chcp.com')
    Active code page: 437
    0
    >>> '¡'.encode('1252')
    b'\xa1'
    >>> '\xa1'.encode('437')
    b'\xad'
    >>> os.listdir('.')
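The byte values in that session can be checked without a Windows console. The sketch below only uses Python's built-in cp1252 and cp437 codecs; the variable names are illustrative. It shows why reading ANSI (1252) pipe output as OEM (437) produces the wrong character.

```python
text = "\u00a1"  # INVERTED EXCLAMATION MARK, as in the session above

# The same character maps to different bytes in the two codepages.
ansi_bytes = text.encode("cp1252")  # ANSI codepage 1252
oem_bytes = text.encode("cp437")    # OEM/console codepage 437

# Decoding the ANSI bytes with the console codepage does not fail,
# it silently yields a different character.
misread = ansi_bytes.decode("cp437")

print(ansi_bytes, oem_bytes, ascii(misread))
```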

[issue27179] subprocess uses wrong encoding on Windows

2016-06-04 Thread Eryk Sun
Eryk Sun added the comment:
>> so ANSI is the natural default for a detached process
>
> To clarify - ANSI is the natural default *for programs that
> don't support Unicode*.
By natural, I meant in the context of using GetConsoleOutputCP(), since WideCharToMultiByte(0, ...) encodes text as

[issue27179] subprocess uses wrong encoding on Windows

2016-06-04 Thread Dāvis
Dāvis added the comment: It makes no sense not to use a better default encoding in some cases, as my patch does. Most programs use the console's encoding, not the ANSI codepage, so by limiting the default to the ANSI codepage we force basically everyone to always specify the encoding. This is current

[issue27179] subprocess uses wrong encoding on Windows

2016-06-04 Thread STINNER Victor
STINNER Victor added the comment: "To clarify - ANSI is the natural default *for programs that don't support Unicode*." Exactly. For this reason, I am strongly opposed to changing the default encoding. I'm ok to add helper functions or new flags. It looks like it took more than five years to

[issue27179] subprocess uses wrong encoding on Windows

2016-06-03 Thread Dāvis
Dāvis added the comment:
> qprocess.exe (also to console)
> quser.exe (also to console)
these are broken (http://i.imgur.com/0zIhHrv.png)
    >chcp 1257
    >quser
    USERNAME SESSIONNAME
    dƒvis console
    > chcp 775
    > quser
    USERNAME

[issue27179] subprocess uses wrong encoding on Windows

2016-06-03 Thread Steve Dower
Steve Dower added the comment:
> so ANSI is the natural default for a detached process
To clarify - ANSI is the natural default *for programs that don't support Unicode*. Unfortunately, since "Unicode" on Windows is an incompatible data type (wchar_t rather than char), targeting Unicode

[issue27179] subprocess uses wrong encoding on Windows

2016-06-03 Thread Eryk Sun
Eryk Sun added the comment:
> I would say almost all Windows console programs does use
> console's encoding for input/output because otherwise
> user wouldn't be able to read it.
While some programs do use the console codepage, even when writing to a pipe or disk file -- such as more.com,

[issue27179] subprocess uses wrong encoding on Windows

2016-06-03 Thread Eryk Sun
Changes by Eryk Sun: Removed message: http://bugs.python.org/msg267090

[issue27179] subprocess uses wrong encoding on Windows

2016-06-03 Thread Eryk Sun
Eryk Sun added the comment:
> I would say almost all Windows console programs does use
> console's encoding for input/output because otherwise
> user wouldn't be able to read it.
While some programs do use the console codepage, even when writing to a pipe or disk file -- such as more.com,

[issue27179] subprocess uses wrong encoding on Windows

2016-06-02 Thread Dāvis
Dāvis added the comment: If there's no console then os.device_encoding won't fail; it will just return None, which means that the ANSI codepage will be used as it currently is, so nothing changes here and the current behavior stays. Also, as I showed, TextIOWrapper already calls
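The fallback described in this comment can be sketched as follows. This is not the code from the patch; `guess_pipe_encoding` is a hypothetical helper name, and locale.getpreferredencoding(False) stands in for "the ANSI codepage" (which is what it reports on Windows). It relies only on the documented behaviour that os.device_encoding(fd) returns None when fd is not connected to a terminal.

```python
import locale
import os


def guess_pipe_encoding(fd):
    """Hypothetical helper: console encoding if fd is a terminal,
    otherwise the locale's preferred (ANSI on Windows) encoding."""
    encoding = os.device_encoding(fd)
    if encoding is None:
        # Not attached to a console/terminal: keep the current default.
        encoding = locale.getpreferredencoding(False)
    return encoding


# A pipe is never a terminal, so device_encoding reports None for it.
read_fd, write_fd = os.pipe()
try:
    pipe_device_encoding = os.device_encoding(read_fd)
    fallback = guess_pipe_encoding(read_fd)
finally:
    os.close(read_fd)
    os.close(write_fd)

print(pipe_device_encoding, fallback)
```

Whether the console's encoding is the right guess for a child process is exactly what the rest of the thread disputes; the sketch only shows the mechanics.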

[issue27179] subprocess uses wrong encoding on Windows

2016-06-02 Thread Steve Dower
Steve Dower added the comment:
> There is right encoding, it's encoding that's actually used.
This is true, but it puts the decision entirely in the hands of the developer(s) of the two processes involved. All IPC on Windows uses bytes, and encodings _always_ need to be negotiated by the

[issue27179] subprocess uses wrong encoding on Windows

2016-06-02 Thread Martin Panter
Martin Panter added the comment: Patch B changes _Py_device_encoding() to accept a file descriptor of 3, which seems wrong to me. Patch A is like the earlier patch, but calls os.device_encoding(1) instead of relying on sys.stdout, etc. I think this will still fail when the Python parent’s

[issue27179] subprocess uses wrong encoding on Windows

2016-06-02 Thread Dāvis
Dāvis added the comment: There is a right encoding: it's the encoding that's actually used. Here we're inside subprocess.Popen, which does the actual winapi.CreateProcess call, and thus we can check for any creationflags and adjust the encoding logic accordingly. I would say almost all Windows console

[issue27179] subprocess uses wrong encoding on Windows

2016-06-02 Thread Dāvis
Changes by Dāvis: Added file: http://bugs.python.org/file43101/subprocess_fix_encoding_v2_b.patch

[issue27179] subprocess uses wrong encoding on Windows

2016-06-02 Thread Dāvis
Changes by Dāvis: Removed file: http://bugs.python.org/file43094/subprocess_fix_encoding.patch

[issue27179] subprocess uses wrong encoding on Windows

2016-06-01 Thread Martin Panter
Martin Panter added the comment: I think Issue 6135 has a bit of discussion on adding encoding and error parameters to subprocess.Popen etc.

[issue27179] subprocess uses wrong encoding on Windows

2016-06-01 Thread Eryk Sun
Eryk Sun added the comment: There is no right encoding as far as I can see. If it's attached to a console (i.e. conhost.exe), then cmd.exe uses the console's output codepage when writing to a pipe or file, which is the scenario that your patch attempts to address. But if you pass

[issue27179] subprocess uses wrong encoding on Windows

2016-06-01 Thread Steve Dower
Steve Dower added the comment: Even sys.__stdout__ can be missing. In this context, falling back on the default encoding is probably fine, but for 3.6 I'd like to make everything default to UTF-8 on Windows, and force the console mode on startup (restore on finalize) - apart from the input()

[issue27179] subprocess uses wrong encoding on Windows

2016-06-01 Thread Martin Panter
Martin Panter added the comment: I don’t know much about the conventions for stdout etc encoding on Windows. But in general, the patch does not seem robust. Does it work if sys.stdout is a pipe or file (not a console)? I doubt it will work when sys.stdout has been replaced by e.g. StringIO,

[issue27179] subprocess uses wrong encoding on Windows

2016-06-01 Thread Dāvis
Dāvis added the comment: I looked at #27048 and indeed it's affected by this bug. It happens to me too (I have non-ASCII symbols in %PATH%) and my patch fixes that. On my system without the patch:
    > python -c "from distutils import _msvccompiler; _msvccompiler._get_vc_env('')"
    Traceback (most

[issue27179] subprocess uses wrong encoding on Windows

2016-06-01 Thread Dāvis
Dāvis added the comment: There's no such "ā" command; it's just used to get non-ASCII output. cmd will return:
    'ā' is not recognized as an internal or external command, operable program or batch file.
and this will be encoded in the console's encoding (UTF8 in my example or whatever chcp is set

[issue27179] subprocess uses wrong encoding on Windows

2016-06-01 Thread STINNER Victor
STINNER Victor added the comment: What is the ā command? Where does it come from? What is its output? Do you know the encoding of its output? It reminds me of the "set" issue in distutils: issue #27048.

[issue27179] subprocess uses wrong encoding on Windows

2016-06-01 Thread Dāvis
New submission from Dāvis: subprocess uses wrong encoding on Windows. On Windows 10 with Python 3.5.1 from Command Prompt (cmd.exe):
    > chcp 65001
    > python -c "import subprocess; subprocess.getstatusoutput('ā')"
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File