https://issues.dlang.org/show_bug.cgi?id=14519
Marc Schütz <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[email protected]

--- Comment #18 from Marc Schütz <[email protected]> ---
(In reply to Walter Bright from comment #15)
> If you have a pipeline A.B.C.D, then A throws on invalid UTF, and B.C.D
> never are executed. But if A does not throw, then B.C.D guaranteed to be
> getting valid UTF, but they still pay the penalty of the compiler thinking
> they can allocate memory and throw.

When `assert()` is used, whatever cost there is will of course disappear with
`-release`. And IMO asserting is the right thing to do. Quoting the spec [1]:

"char[] strings are in UTF-8 format. wchar[] strings are in UTF-16 format.
dchar[] strings are in UTF-32 format."

Note how it says "are in UTF-x format", not "should be". Therefore, a `string`
not containing UTF8 is by definition a bug. Data with other (or unknown)
encodings needs to be stored in `ubyte[]`.

[1] http://dlang.org/arrays.html#strings
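
To illustrate the split argued for above, here is a rough sketch: data of unknown encoding stays in `ubyte[]` until it has been validated once, and later stages only `assert` that their `string` input is UTF-8. The helper names `isValidUtf8`, `toValidatedString` and `countAsciiLetters` are made up for this example and are not part of Phobos; only `std.utf.validate`, `UTFException` and `std.exception.assertThrown` are existing library symbols.

```d
import std.utf : validate, UTFException;

/// Hypothetical helper: true if `s` is well-formed UTF-8.
bool isValidUtf8(const(char)[] s) pure @safe nothrow
{
    try
    {
        validate(s); // std.utf.validate throws UTFException on malformed input
        return true;
    }
    catch (Exception)
    {
        return false;
    }
}

/// Hypothetical stage A: the only place that may throw on bad input.
/// Data of unknown encoding arrives as ubyte[], not as string.
string toValidatedString(const(ubyte)[] raw)
{
    auto text = cast(const(char)[]) raw;
    validate(text); // throws UTFException if raw is not valid UTF-8
    return text.idup;
}

/// Hypothetical later stage (B, C or D): trusts its input and stays nothrow.
size_t countAsciiLetters(string s) pure @safe nothrow
{
    // Contract check only; asserts are compiled out with -release.
    assert(isValidUtf8(s), "string must hold valid UTF-8");

    size_t n;
    foreach (char c; s) // iterates code units, so no decoding and no throwing
    {
        if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'))
            ++n;
    }
    return n;
}

void main()
{
    import std.exception : assertThrown;

    ubyte[] good = [0x44, 0xC3, 0xBC]; // "Dü" encoded as UTF-8
    ubyte[] bad = [0xFF, 0xFE];        // not valid UTF-8

    string s = toValidatedString(good);
    assert(countAsciiLetters(s) == 1);

    // Invalid input is rejected at the boundary, before any later stage runs.
    assertThrown!UTFException(toValidatedString(bad));
}
```

With this layout only the boundary function can throw; the later stage can be marked `nothrow`, and its check costs nothing in a `-release` build.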
