> On 25 Jul 2018, at 13:39, Damien Pollet <damien.pollet+ph...@gmail.com> wrote:
> 
> On Tue, 24 Jul 2018 at 11:39, Alistair Grant <akgrant0...@gmail.com> wrote:
> > On 23 Jul 2018, at 12:07, Sven Van Caekenberghe <s...@stfx.eu> wrote:
> So,
> 
>   Stdio stdout
> 
> should return a character write stream with UTF-8 encoding while
> 
>   Stdio binaryStdout
> 
> should be the lower level binary one.
> This would be more in line with the other streams.
> A non-UTF-8 encoding can be used as per Pavel's example.
> 
> +1
> 
> I didn't suggest this earlier because it isn't backward compatible.  But I do 
> think it is the better solution.
> 
> +2
> 
> I had a look at Stdio recently for Clap. The current implementation with 
> Stdio stdout returning the binary stream is a bit confusing, but at least you 
> can wrap it.
> The above proposition with an explicit binaryStdout for the lower level 
> uncommon case would be much clearer indeed.
> 
> Related issue: command line arguments come from VM system attributes as 
> ByteStrings… and thus interpreted as iso-8859-1, which is incorrect in most 
> cases nowadays, even though it seems to work as long as you only use ASCII. 
> Decoding them is easy enough, but it requires two copies (asByteString 
> utf8Decoded)

Yes, this is a really big issue. Anything coming in as a command line arg or 
environment variable (or clipboard) is in a basically unknown, OS-determined 
encoding. I would assume/hope that UTF-8 is the sensible default today, but 
apparently not. And it is hard to find a cross-platform solution.

We've had serious issues already with this, like $HOME set to a non-ASCII path 
that then breaks almost everything. 
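
To make the decoding problem concrete, here is a minimal playground sketch. It 
assumes the incoming value really is UTF-8 and that the image received it as a 
ByteString with one Character per byte, which is exactly the assumption that 
cannot be made in general. The variable names are illustrative only, and the 
sketch goes through asByteArray on the assumption that utf8Decoded is defined 
on ByteArray.

```smalltalk
| original misread decoded |
original := 'café'.

"Simulate what the image might receive: the UTF-8 bytes of the value,
 each one turned into a Character of a ByteString (assumption)."
misread := original utf8Encoded asString.

"misread now has one Character per byte, so it is longer than original
 and shows mojibake when printed."

"Reinterpret the characters as raw bytes, then decode them as UTF-8.
 These are the two copies mentioned above."
decoded := misread asByteArray utf8Decoded.

"If the input was in fact UTF-8, this should answer true."
decoded = original
```

If the OS encoding is something other than UTF-8, the last step would either 
fail or produce the wrong characters, so this is a workaround rather than a 
cross-platform fix.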
