On 22 May 2017 at 12:28, Thomas Kluyver <tho...@kluyver.me.uk> wrote:
> What if it wants to send a character which can't be encoded in the
> locale encoding? It's quite easy on Windows to end up with a character
> that you can't encode as cp1252. If the build tool uses .encode(loc_enc,
> 'replace'), then you've lost information even before it gets to the
> install tool.
>
> It's 2017, I really don't want to go down the 'locale specified
> encoding' route again. UTF-8 everywhere!

Hang on. Can we take a step back here? I just re-read the PEP and
remembered (!) that hooks are *in-process* Python entry points (I've
been working with pip's current backend-as-subprocess model, and mixed
up in my mind the original 2 proposals here).

I think this encoding debate may be a red herring. If a hook is being
called as a Python method call, then it can print what it likes to
stdout and stderr. And it's the backend's responsibility to ensure that
it never fails when printing - so the *backend* has to deal with the
fact that anything it wants to print must be representable in
sys.stdout.encoding, with the default (raise an exception) error
handling.

Given this fact, and the fact that sys.stdout and sys.stderr are *text*
output streams, build frontends like pip can reasonably just replace
sys.std{out,err} (for example with a StringIO object) to get hook
output. There's no encoding issue for frontends, they just capture the
text sent to the stdio streams.

The rules needed for *backends* are then:

1. Backends MUST NOT write to raw IO channels, all output MUST go via
   sys.stdout and sys.stderr. Build frontends MAY redirect these streams
   to post-process them, but are not required to do so.

As a consequence:

1a. Backends MUST be prepared to deal with the possibility that those IO
    streams have the limitations of the platform IO streams (e.g.,
    limited subset of Unicode allowed, fails with an exception when
    invalid characters are written).
1b. Backends MUST capture and manage the output from any subprocesses
    they spawn (so that they can follow the other rules).
1c. Backends cannot assume that they can write output that the user will
    see - frontends may suppress or modify any output passed on stdout.
    Conversely, backends should not bypass the ability of frontends to
    capture stdout, as frontends are responsible for user interaction.

Some of those MUSTs could be replaced by SHOULD, if we want to allow
backends to write directly to the screen.
But that is likely to corrupt the UI of the frontend, so I'm inclined
to say that we don't allow that.

Paul
_______________________________________________
Distutils-SIG maillist - Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig
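
For illustration, the capture approach described above (a frontend
swapping sys.stdout and sys.stderr for StringIO objects around an
in-process hook call) might look roughly like the sketch below. This is
only a sketch under assumptions: call_hook_captured, safe_text and
toy_build_wheel are made-up names, not part of the PEP or of pip, and
the toy hook takes a simplified argument list. safe_text shows one
possible way for a backend to cope with rule 1a when the stream it is
given has a limited encoding.

```python
import contextlib
import io
import sys


def call_hook_captured(hook, *args, **kwargs):
    """Frontend side (hypothetical helper): call an in-process hook and
    return (result, stdout_text, stderr_text)."""
    out, err = io.StringIO(), io.StringIO()
    # Temporarily replace sys.stdout / sys.stderr with text buffers.
    with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
        result = hook(*args, **kwargs)
    return result, out.getvalue(), err.getvalue()


def safe_text(text, stream):
    """Backend side (hypothetical helper): return a version of text that
    can be written to stream without raising UnicodeEncodeError."""
    encoding = getattr(stream, "encoding", None)
    if not encoding:
        # e.g. a StringIO buffer, which accepts any str unchanged
        return text
    # Characters the stream cannot encode become backslash escapes.
    return text.encode(encoding, "backslashreplace").decode(encoding)


def toy_build_wheel(wheel_directory):
    """Stand-in for a backend hook; all output goes via sys.stdout/stderr."""
    print(safe_text("building caf\u00e9-1.0 \u2713", sys.stdout))
    print(safe_text("warning: no README", sys.stderr), file=sys.stderr)
    return "cafe-1.0-py3-none-any.whl"


if __name__ == "__main__":
    result, out, err = call_hook_captured(toy_build_wheel, "dist")
    # The frontend now decides what, if anything, to show the user.
    print("hook returned:", result)
    print("captured stdout:", ascii(out))
    print("captured stderr:", ascii(err))
```

Note that the frontend only ever sees str objects here, so no encoding
decision is needed on its side. Output written by a subprocess, or
directly to file descriptor 1, would bypass this kind of redirection,
which is the situation rule 1b is meant to cover.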