On Tuesday, 7 April 2015 at 07:50:40 UTC, Vladimir Panteleev wrote:
On Tuesday, 7 April 2015 at 07:42:02 UTC, w0rp wrote:
Maybe autodecoding could throw an Error (No 'new' allowed)
when debug mode is on, and use replacement characters in
release mode. I haven't thought it through, but that's an idea.
No no no, terrible idea. This means your program will pass your
test suite in debug mode (which, of course, is never going to
test behavior with bad UTF in all the relevant places), but
silently corrupt real-world data in release mode. Errors and
asserts are for logic errors, not for validating user input!
I'd say that invalid UTF-8 in `string`s _is_ a logic error,
because these are defined to be valid UTF-8. If they aren't,
someone didn't correctly validate their inputs.
Unfortunately, not even the runtime cares about UTF correctness:
void main(string[] args) {
    import std.utf;
    // args[1] is handed to main unchecked by the runtime
    args[1].validate; // throws UTFException on invalid UTF-8
}
# ./testutf8 `echo 'äöü' | recode utf8..latin1`
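For what it's worth, here is a minimal sketch of what validating at the boundary could look like, so that everything past `main` can rely on the `string` invariant. `isValidUtf8` is a made-up helper name, not something in Phobos; only `std.utf.validate` and `UTFException` are real library names here.

```d
import std.utf : validate, UTFException;

/// Hypothetical helper: true if `s` is well-formed UTF-8.
bool isValidUtf8(const(char)[] s)
{
    try
    {
        validate(s); // throws UTFException on malformed input
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

int main(string[] args)
{
    import std.stdio : stderr;

    // Check every command-line argument before using it as a string.
    foreach (i, arg; args[1 .. $])
    {
        if (!isValidUtf8(arg))
        {
            stderr.writefln("argument %s is not valid UTF-8", i + 1);
            return 1;
        }
    }
    return 0;
}
```

Run with the Latin-1 argument from above, this should exit with status 1 instead of letting the bad bytes travel further into the program.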