On Mon, May 22, 2017, at 11:36 PM, Steve Dower wrote:
> IMHO, #2 is definitely the right way to go. Yes, the platform specific
> code now has to worry about the encoding, but... the encoding is
> platform specific? So... that seems exactly right? :) Maybe I'm still
> missing something here, but I'm totally happy to leave it to Thomas to
> decide (which I think he has, but I haven't gotten to looking at that PR
> yet).

I think I broadly agree with this as well. My reservation is that the build backend might be running a subprocess which produces output in an *unknown* encoding, especially if it allows the package author or the end user to configure a command to run. If it doesn't know the encoding, I'd rather get the raw bytes from the subprocess in the log (e.g. dumped to a file) than have it attempt to transcode them to UTF-8. The conversion risks losing information, and even if it doesn't, it makes it harder to work out what was really meant.

I feel like we're spending a lot of energy on a point that's not really central to the PEP, though. I think we've established that there's a potential for bugs and mojibake whatever we put in the spec. So I'd like to put in something relatively simple and move on.

I still stand by my PR, which amounts to "backends try to make it UTF-8, frontends don't crash if it isn't". I might be persuaded to add a recommendation that frontends dump the bytes to a file if they're not UTF-8, so the user can pull it apart if necessary.

Thomas
_______________________________________________
Distutils-SIG maillist - Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig
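
For illustration only, the frontend side of "don't crash if it isn't UTF-8, and dump the bytes to a file" could look roughly like the sketch below. It is not taken from the PR or from any existing frontend: the function name `report_backend_output` and the `dump_path` parameter are hypothetical, and it assumes the frontend has already captured the backend's output as bytes.

```python
from pathlib import Path


def report_backend_output(raw: bytes, dump_path: Path) -> str:
    """Return displayable text for captured backend output (hypothetical helper).

    If the bytes are valid UTF-8 they are decoded as-is. Otherwise the
    untouched bytes are written to dump_path so the user can inspect them,
    and a lossy rendering is returned instead of raising.
    """
    try:
        # Expected case: the backend managed to produce UTF-8.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Preserve the original bytes exactly; no transcoding is attempted.
        dump_path.write_bytes(raw)
        # Show something readable: undecodable bytes appear as \xNN escapes.
        return raw.decode("utf-8", errors="backslashreplace")
```

The point of the design is that the frontend never guesses the encoding: valid UTF-8 passes through unchanged, and anything else stays available byte-for-byte in the dump file.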