On Saturday, 22 August 2015 at 13:41:49 UTC, Sönke Ludwig wrote:
There is more than the actual call to validate(), such as
writing tests and making sure the surroundings work, adjusting
the interface and writing documentation. It's not *that* much
work, but nonetheless wasted work.
I also still think that this hasn't been a bad idea at all,
because it speeds up the most important use case, parsing JSON
from a non-memory source that has not yet been validated. I
also very much like the idea of making it a programming error
to have invalid UTF stored in a string, i.e. forcing the
validation to happen before the cast from bytes to chars.
Also see "utf/unicode should only be validated once"
https://issues.dlang.org/show_bug.cgi?id=14919
If combining lexing and validation is faster (why?), then a
ubyte-consuming interface should be available, though why couldn't
that be done by adding a lazy ubyte->char validator range to std.utf?
In any case, during lexing we should avoid autodecoding narrow
strings, since that only validates the input a second time.
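
To make the lazy ubyte->char validator idea a bit more concrete, here is a rough sketch of what such a range might look like. The names ValidatedChars and collectValidated are made up for illustration and are not part of std.utf. The sketch assumes that std.utf.decode is an acceptable way to check one sequence at a time; a real implementation would probably validate without building the dchar.

```d
import std.utf : decode, UTFException;

// Hypothetical sketch, not an existing std.utf type: wraps raw bytes and
// yields chars, checking each UTF-8 sequence when its first byte is reached.
struct ValidatedChars
{
    private const(ubyte)[] data;
    private size_t pending; // code units of the current sequence still to be popped

    this(const(ubyte)[] bytes)
    {
        data = bytes;
        checkNext();
    }

    @property bool empty() const { return data.length == 0; }

    // Handing out a char is fine here: the enclosing sequence was already checked.
    @property char front() const { return cast(char) data[0]; }

    void popFront()
    {
        data = data[1 .. $];
        if (--pending == 0)
            checkNext();
    }

    private void checkNext()
    {
        if (data.length == 0)
            return;
        auto s = cast(const(char)[]) data;
        size_t i = 0;
        decode(s, i); // throws UTFException on an invalid or truncated sequence
        pending = i;  // length of the sequence that was just validated
    }
}

// Hypothetical helper: drains the range into a char[] that is known to be valid UTF-8.
char[] collectValidated(const(ubyte)[] bytes)
{
    char[] buf;
    foreach (c; ValidatedChars(bytes))
        buf ~= c;
    return buf;
}
```

Since the element type is char rather than dchar, a lexer consuming this range works on code units and never triggers autodecoding, and each sequence is validated exactly once, at the point where the bytes become chars.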