https://issues.dlang.org/show_bug.cgi?id=14519

Marc Schütz <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[email protected]

--- Comment #18 from Marc Schütz <[email protected]> ---
(In reply to Walter Bright from comment #15)
> If you have a pipeline A.B.C.D, then A throws on invalid UTF, and B.C.D
> never are executed. But if A does not throw, then B.C.D guaranteed to be
> getting valid UTF, but they still pay the penalty of the compiler thinking
> they can allocate memory and throw.

When `assert()` is used instead of throwing, whatever cost there is will of
course disappear with `-release`.
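
A minimal sketch of that approach, not taken from Phobos: `isValidUTF8` and
`countSpaces` are hypothetical names made up for illustration. Because a
failing `assert` raises an `Error` rather than an `Exception`, the pipeline
stage can still be marked `nothrow @nogc`, and the check is compiled out with
`-release`.

```d
/// Hypothetical helper (not part of Phobos): true if `s` is well-formed UTF-8.
bool isValidUTF8(const(char)[] s) pure nothrow @nogc @safe
{
    size_t i = 0;
    while (i < s.length)
    {
        immutable uint lead = s[i];
        if (lead < 0x80) { ++i; continue; }   // plain ASCII

        size_t n;   // number of continuation bytes expected
        uint cp;    // code point being assembled
        if      ((lead & 0xE0) == 0xC0) { n = 1; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { n = 2; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { n = 3; cp = lead & 0x07; }
        else return false;                    // stray continuation or invalid lead byte

        if (s.length - i <= n) return false;  // sequence is truncated
        foreach (k; 1 .. n + 1)
        {
            immutable uint c = s[i + k];
            if ((c & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (c & 0x3F);
        }
        // Reject overlong encodings, surrogates, and values above U+10FFFF.
        immutable uint minCp = n == 1 ? 0x80 : n == 2 ? 0x800 : 0x10000;
        if (cp < minCp || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;
        i += n + 1;
    }
    return true;
}

/// Hypothetical pipeline stage: treats invalid UTF-8 as a programming bug.
size_t countSpaces(const(char)[] s) pure nothrow @nogc @safe
{
    assert(isValidUTF8(s), "string must contain valid UTF-8"); // gone with -release
    size_t n;
    foreach (char c; s)
        if (c == ' ') ++n;
    return n;
}
```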

And IMO asserting is the right thing to do. Quoting the spec [1]:

"char[] strings are in UTF-8 format. wchar[] strings are in UTF-16 format.
dchar[] strings are in UTF-32 format."

Note how it says "are in UTF-x format", not "should be". Therefore, a `string`
that does not contain valid UTF-8 is by definition a bug.

Data with other (or unknown) encodings needs to be stored in `ubyte[]`.
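
One way this could look in practice, as a sketch only: raw input stays in
`ubyte[]` and is checked once at the boundary before it is ever typed as
`string`. `toValidatedString` is a hypothetical name; `std.utf.validate` is
the existing Phobos function that throws `UTFException` on malformed input.

```d
import std.utf : validate, UTFException;

/// Hypothetical boundary function: checks raw bytes once, then hands out a `string`.
string toValidatedString(immutable(ubyte)[] raw)
{
    auto s = cast(string) raw; // reinterprets the bytes, no copy
    validate(s);               // throws UTFException if not valid UTF-8
    return s;
}
```

Everything downstream of such a function can then assume valid UTF-8 and
assert on it rather than throw.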

[1] http://dlang.org/arrays.html#strings
