On Mon, May 22, 2017, at 11:36 PM, Steve Dower wrote:
> IMHO, #2 is definitely the right way to go. Yes, the platform specific 
> code now has to worry about the encoding, but... the encoding is 
> platform specific? So... that seems exactly right? :) Maybe I'm still 
> missing something here, but I'm totally happy to leave it to Thomas to 
> decide (which I think he has, but I haven't gotten to looking at that PR 
> yet).

I think I broadly agree with this as well. My reservation is that the
build backend might be running a subprocess which produces output in an
*unknown* encoding, especially if it allows the package author or the
end user to configure a command to run. If it doesn't know the encoding,
I'd rather get the raw bytes from the subprocess in the log (e.g. dumped
to a file) than have it attempt to transcode them to UTF-8. The
conversion risks losing information, and even if it doesn't, it makes it
harder to work out what was really meant.
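
To illustrate both failure modes, here is a small sketch. The sample
strings and the encodings guessed are made up for illustration; nothing
here comes from the PR or from any real backend.

```python
# Case 1: the conversion loses information.
# Two different Latin-1 outputs from a hypothetical subprocess.
raw_a = "café".encode("latin-1")   # ends in byte 0xE9
raw_b = "cafè".encode("latin-1")   # ends in byte 0xE8

# A backend that forces the bytes to UTF-8 with errors="replace"
# turns each invalid byte into U+FFFD, so both outputs collapse
# into the same text and the original byte cannot be recovered.
forced_a = raw_a.decode("utf-8", errors="replace")
forced_b = raw_b.decode("utf-8", errors="replace")
lossy = forced_a == forced_b

# Case 2: nothing is lost, but the result is harder to read.
# The subprocess output is already UTF-8, but the backend guesses
# Latin-1 and "transcodes" it, producing double-encoded mojibake.
raw_utf8 = "café".encode("utf-8")
mojibake = raw_utf8.decode("latin-1")       # wrong guess, never raises
transcoded = mojibake.encode("utf-8")       # what ends up in the log

# The original bytes can still be dug out, but only by someone who
# works out which wrong guess was made.
recovered = transcoded.decode("utf-8").encode("latin-1")
```

In the first case the log cannot tell the two outputs apart; in the
second the bytes are recoverable but only after undoing the bad guess.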

I feel like we're spending a lot of energy on a point that's not really
central to the PEP, though. I think we've established that there's a
potential for bugs and mojibake whatever we put in the spec. So I'd like
to put in something relatively simple and move on. I still stand by my PR,
which amounts to "backends try to make it UTF-8, frontends don't crash
if it isn't". I might be persuaded to add a recommendation that
frontends dump the bytes to a file if they're not UTF-8, so the user can
pull it apart if necessary.
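
For concreteness, a frontend following that recommendation might look
roughly like the sketch below. The function name, the file prefix and
the choice of "backslashreplace" for display are my assumptions here,
not anything the PEP or the PR specifies.

```python
import tempfile


def handle_backend_output(raw, dump_dir=None):
    """Hypothetical frontend helper: return (text, dump_path).

    If the backend output is valid UTF-8, return it decoded and no
    dump path. Otherwise write the untouched bytes to a file and
    return a display-only decoding that cannot raise.
    """
    try:
        return raw.decode("utf-8"), None
    except UnicodeDecodeError:
        # Not UTF-8: keep the raw bytes so the user can pull them apart.
        with tempfile.NamedTemporaryFile(
            prefix="backend-output-", suffix=".bin",
            dir=dump_dir, delete=False,
        ) as f:
            f.write(raw)
            dump_path = f.name
        # Invalid bytes are shown as \xNN escapes instead of crashing.
        text = raw.decode("utf-8", errors="backslashreplace")
        return text, dump_path
```

The frontend would then show the escaped text and mention the dump
path, so that the log stays readable and nothing is thrown away.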

Thomas
_______________________________________________
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig
