On Tuesday, 7 April 2015 at 09:04:09 UTC, Walter Bright wrote:
> On 4/7/2015 1:19 AM, Dicebot wrote:
>> I have doubts about it similar to Vladimir. Main problem is that I have no idea what actually happens if replacement characters appear in some unicode text my program processes.
>
> It's much like floating point NaN values, which are 'sticky'.

Yes, but std.conv doesn't return NaN if you try to convert "banana" to a double.
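
To make the contrast concrete, here is a minimal sketch. It is written in Python rather than D purely for illustration, and the helper names `sticky_sum` and `parse_or_error` are hypothetical. It shows a NaN propagating silently through arithmetic, while converting "banana" to a float fails right at the conversion.

```python
import math

def sticky_sum(values):
    """Sum floats; a single NaN silently poisons the whole result."""
    total = 0.0
    for v in values:
        total += v
    return total

def parse_or_error(text):
    """Convert text to a float, reporting failure instead of returning NaN."""
    try:
        return float(text), None
    except ValueError as exc:
        return None, str(exc)

# The NaN "sticks": no error is raised anywhere along the way.
result = sticky_sum([1.0, float("nan"), 2.0])

# The conversion does not hand back a NaN; it fails at the point of conversion.
value, error = parse_or_error("banana")
```

The analogy in the thread is about D's std.conv, which this sketch does not call; it only mirrors the same split between a sticky value and a failing conversion.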

> With UTF strings, if you care about invalid UTF (a surprisingly large number of operations done on strings simply don't care about invalid UTF), the validation can be done as a separate step.

So can converting invalid UTF to replacement characters.

>> Also it is worrying to see so much effort put into `nothrow` in a language which endorses exceptions as its main error reporting mechanism.
>
> There is definitely a tug of war going on there. Exceptions are great, except they aren't free.
>
> What I've tried to do is design things so that erroneous input is not possible - that all possible input has straightforward output. In other words, try to define the problem out of existence. Then there are no errors.

I think the correct solution to that is to kill auto-decoding :) Then all decoding is explicit, and since it is explicit, it is trivial to allow specifying the desired behavior upon encountering invalid UTF-8.
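
As a rough sketch of what "explicit decoding with a caller-chosen policy" could look like, here is an illustration in Python, not D. The names `OnInvalid`, `decode_utf8` and `is_valid_utf8` are hypothetical and are not part of Phobos; the only library call used is Python's built-in `bytes.decode` with its `errors` argument.

```python
from enum import Enum

class OnInvalid(Enum):
    """Hypothetical policy for invalid UTF-8, chosen by the caller."""
    THROW = "strict"     # raise on the first invalid sequence
    REPLACE = "replace"  # substitute U+FFFD for invalid input

def decode_utf8(data: bytes, on_invalid: OnInvalid = OnInvalid.THROW) -> str:
    """Decode explicitly; the caller decides what invalid input means."""
    return data.decode("utf-8", errors=on_invalid.value)

def is_valid_utf8(data: bytes) -> bool:
    """Validation as a separate step, for code that cares about validity."""
    try:
        data.decode("utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False

bad = b"caf\xff"  # 0xFF can never appear in well-formed UTF-8

replaced = decode_utf8(bad, OnInvalid.REPLACE)  # "caf\ufffd"

try:
    decode_utf8(bad, OnInvalid.THROW)
    raised = False
except UnicodeDecodeError:
    raised = True
```

Because the decode call is written out at the use site, both behaviors discussed above (throwing and substituting replacement characters) are available without either being forced on code that does not decode at all.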
