Good point that the Windows 16-bit (wide-character) APIs only come
into play for interactive console sessions (even in 3.6+), whereas for
PEP 517 we're specifically interested in the 8-bit pipes used to
communicate with build subprocesses launched by an installation tool.

On 20 May 2017 at 19:11, Paul Moore <p.f.mo...@gmail.com> wrote:
> The bigger question, though, is to what extent we want to mandate that
> build tools that run external tools such as compilers take
> responsibility for the encoding of the output of those tools (rather
> than simply passing the output through to the output stream
> unmodified). And if we do want to, whether we want to allow an
> exception for setuptools/distutils.
>
> Also, a question regarding Unix - do we really want to mandate UTF-8
> even if the system locale is set to something else? Won't that mean
> that build tools have the same problem with compilers generating
> output in the encoding the tool wants that we already have on Windows?

Yeah, I think that problem was starting to occur to me, hence the
reference to handling RPM and DEB build environments.

At least for non-Windows systems, I see two possible recommendations:

1. We advise installation tools to use binary streams to communicate
with build tools, and treat the results as opaque binary data. If it
needs to be written out to the installation tool's own streams, then
use the binary level APIs for those interfaces to inject the build
tool output directly, rather than decoding and re-encoding it first.

2. We advise installation tools to adopt a PEP 538 style solution,
where they mostly just trust the result of
locale.getpreferredencoding() *unless*
"codecs.lookup(locale.getpreferredencoding()).name == 'ascii'". In the
latter case, we'd advise them to set LC_CTYPE (and potentially LANG)
appropriately for the running OS. Regardless of whether or not that
locale coercion was needed, we'd recommend using "replace" or
"backslashreplace" as the error handler when decoding the stream
output from the subprocess.

At the specification level, I think option 1 probably makes the most
sense - we'd be advising installation tools that they're free to kick
any mojibake problems further down the automation pipeline if they
don't want to worry about it. It's also the only one of the two
recommendations we can readily make cross platform.
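
As a rough sketch of what option 1 could look like in an installation
tool (the function name and the "sink" parameter are purely
illustrative, not part of any proposed API), the subprocess output is
read as bytes and written to the binary layer of the tool's own
stdout, with no decode/re-encode step in between:

```python
import subprocess
import sys


def relay_build_output(cmd, sink=None):
    """Run cmd and copy its combined stdout/stderr to sink as raw bytes.

    sink defaults to the binary buffer underneath sys.stdout, so the
    build tool's output is passed through without being decoded.
    """
    if sink is None:
        sink = sys.stdout.buffer
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
    )
    with proc.stdout:
        # read1() returns whatever is available, up to the given size,
        # and returns b"" at end-of-file.
        for chunk in iter(lambda: proc.stdout.read1(4096), b""):
            sink.write(chunk)
    sink.flush()
    return proc.wait()
```

Any bytes that aren't valid in the caller's encoding simply pass
through untouched, which is exactly the "kick it down the pipeline"
behaviour described above.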

At a quality-of-implementation level, there's a lot of potential value
in option 2 (at least on non-Windows systems) - we just wouldn't
require or recommend it at the level of the interoperability
specifications.
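
For illustration only, option 2 might be sketched along the following
lines. The helper names, the "preferred" override, and the "C.UTF-8"
target locale are all assumptions on my part (the right locale name
varies by platform), so treat this as a sketch rather than a
recommended implementation:

```python
import codecs
import locale
import os
import subprocess


def build_env_and_encoding(environ=None, preferred=None,
                           fallback_locale="C.UTF-8"):
    """Return (env, encoding) to use for a build subprocess.

    Trust the preferred encoding unless it resolves to ASCII, in which
    case set LC_CTYPE in the child environment and assume UTF-8.
    """
    env = dict(os.environ if environ is None else environ)
    encoding = preferred or locale.getpreferredencoding()
    if codecs.lookup(encoding).name == "ascii":
        # Assumed target locale; not available under this name on
        # every OS. Note that LC_ALL, if set, still overrides LC_CTYPE.
        env["LC_CTYPE"] = fallback_locale
        encoding = "utf-8"
    return env, encoding


def run_and_decode(cmd, environ=None, preferred=None):
    """Run cmd and decode its output without ever raising on bad bytes."""
    env, encoding = build_env_and_encoding(environ, preferred)
    result = subprocess.run(
        cmd, env=env, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
    )
    # "backslashreplace" keeps undecodable bytes visible as \xNN escapes.
    text = result.stdout.decode(encoding, errors="backslashreplace")
    return result.returncode, text
```

Using "backslashreplace" rather than "replace" keeps the offending
byte values visible in the decoded text, which tends to be more
useful when debugging a misbehaving compiler.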

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
_______________________________________________
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig
