On Friday, May 13, 2016 11:00:19 Marc Schütz via Digitalmars-d wrote:
> On Friday, 13 May 2016 at 10:38:09 UTC, Jonathan M Davis wrote:
> > Ideally, algorithms would be Unicode aware as appropriate, but
> > the default would be to operate on code units with wrappers to
> > handle decoding by code point or grapheme. Then it's easy to
> > write fast code while still allowing for full correctness.
> > Granted, it's not necessarily easy to get correct code that
> > way, but anyone who wants full correctness without caring
> > about efficiency can just use ranges of graphemes. Ranges of
> > code points are rare regardless.
>
> char[], wchar[] etc. can simply be made non-ranges, so that the
> user has to choose between .byCodePoint, .byCodeUnit (or
> .representation as it already exists), .byGrapheme, or even
> higher-level units like .byLine or .byWord. Ranges of char, wchar
> however stay as they are today. That way it's harder to
> accidentally get it wrong.
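To make the trade-off concrete, here is a minimal sketch using the Phobos
adapters named above (byCodeUnit from std.utf, byGrapheme from std.uni).
The combining-accent string is chosen so that all three element counts
differ:

import std.range : walkLength;
import std.uni : byGrapheme;
import std.utf : byCodeUnit;

void main()
{
    // 'e' followed by U+0301 (combining acute accent): one visible
    // character, two code points, three UTF-8 code units.
    string s = "e\u0301";

    assert(s.byCodeUnit.walkLength == 3);  // code units: no decoding, fastest
    assert(s.walkLength == 2);             // code points: today's auto-decoding default
    assert(s.byGrapheme.walkLength == 1);  // graphemes: fully correct, slowest
}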
It also means yet more special cases. You have arrays which aren't treated
as ranges when every other type of array out there is treated as a range.
And even if that's what we want to do, there isn't really a clean
deprecation path.

> There is a simple deprecation path that's already been suggested.
> `isInputRange` and friends can output a helpful deprecation
> warning when they're called with a range that currently triggers
> auto-decoding.

How would you put a deprecation message inside of an eponymous template
like isInputRange? Deprecation messages are triggered when a symbol is
used, not when it passes or fails a static if inside of a template. And
even if we did something like put a pragma in isInputRange, you'd get a
_flood_ of messages in any program that does much of anything with ranges
and strings. It's a possible path, but it sure isn't a pretty one.
Honestly, I'd have to wonder whether just outright breaking code would be
better.

- Jonathan M Davis
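To see why a warning inside isInputRange is awkward, consider a sketch of
what it would have to look like. isInputRange is an eponymous template
(shown below roughly as Phobos defined it at the time), so there is no
separate symbol that could be marked `deprecated` for only the string
case; the only available hook is a pragma(msg), which fires at each
instantiation. The isNarrowString check and the message text here are
illustrative, not actual Phobos code:

import std.range.primitives : empty, front, popFront;
import std.traits : isNarrowString;

template isInputRange(R)
{
    // Hypothetical warning. pragma(msg, ...) fires when the template
    // is instantiated, not at the use sites of the eponymous result,
    // and it fires for every program that instantiates the template
    // with char[]/wchar[]/string -- the "flood" described above.
    static if (isNarrowString!R)
        pragma(msg, "note: " ~ R.stringof ~
                " may stop being a range of code points in the future");

    // Roughly the Phobos definition of the time. Because the template
    // is eponymous, there is no inner symbol on which `deprecated`
    // could be placed for the auto-decoding case only.
    enum bool isInputRange = is(typeof(
        (inout int = 0)
        {
            R r = R.init;     // can define a range object
            if (r.empty) {}   // can test for empty
            r.popFront();     // can invoke popFront()
            auto h = r.front; // can get the front of the range
        }));
}

void main()
{
    static assert(isInputRange!(int[]));
    static assert(isInputRange!string); // triggers the pragma(msg) note
}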
