> On 25 Jul 2018, at 13:39, Damien Pollet <damien.pollet+ph...@gmail.com> wrote:
>
> On Tue, 24 Jul 2018 at 11:39, Alistair Grant <akgrant0...@gmail.com> wrote:
> > On 23 Jul 2018, at 12:07, Sven Van Caekenberghe <s...@stfx.eu> wrote:
> So,
>
> Stdio stdout
>
> should return a character write stream with UTF-8 encoding while
>
> Stdio binaryStdout
>
> should be the lower level binary one.
> This would be more in line with the other streams.
> A non-UTF-8 encoding can be used as per Pavel's example.
>
> +1
>
> I didn't suggest this earlier because it isn't backward compatible. But I do
> think it is the better solution.
>
> +2
>
> I had a look at Stdio recently for Clap. The current implementation with
> Stdio stdout returning the binary stream is a bit confusing, but at least you
> can wrap it.
> The above proposition with an explicit binaryStdout for the lower level
> uncommon case would be much clearer indeed.
>
> Related issue: command line arguments come from VM system attributes as
> ByteStrings… and thus interpreted as iso-8859-1, which is incorrect in most
> cases nowadays, even though it seems to work as long as you only use ASCII.
> Decoding them is easy enough, but it requires two copies (asByteString
> utf8Decoded)
Yes, this is a really big issue. Anything coming in as a command line arg or
environment variable (or from the clipboard) is in a basically unknown,
OS-determined encoding. I would assume/hope that UTF-8 is the sensible default
today, but apparently not. And it is hard to find a cross-platform solution.
We've had serious issues already with this, like $HOME set to a non-ASCII path
that then breaks almost everything.
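
To make the failure mode concrete, here is a minimal playground sketch. It
assumes the OS really did hand over UTF-8 (which, as said above, is not
guaranteed), it simulates the incoming argument with a literal instead of
reading a real VM attribute, and it assumes Stdio stdout is still the binary
stream, as in the current implementation discussed in this thread. The
variable names are made up for the illustration.

```smalltalk
| raw decoded out |
"Simulate an argument whose UTF-8 bytes arrived as a ByteString,
 one character per byte, i.e. read as if it were iso-8859-1."
raw := 'café' utf8Encoded asString.

"raw now has 5 characters instead of 4: the two bytes of é show up
 as two separate Latin-1 characters."

"Recover the intended text: back to bytes (first copy), then decode
 those bytes as UTF-8 (second copy)."
decoded := raw asByteArray utf8Decoded.

"Wrap the binary stdout in a UTF-8 character stream before writing.
 Assumption: Stdio stdout answers a binary stream here."
out := ZnCharacterWriteStream on: Stdio stdout encoding: 'utf8'.
out nextPutAll: decoded; nextPut: Character lf; flush.
```

If the bytes are not valid UTF-8 (a Latin-1 or CP1252 locale, say), the
decoding step will not give the intended text, so this is only a workaround
under the UTF-8 assumption, not the cross-platform answer.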