On Tuesday, July 17, 2018 17:28:19 Seb via Digitalmars-d wrote:
> On Tuesday, 17 July 2018 at 16:58:37 UTC, Jonathan M Davis wrote:
> > On Tuesday, July 17, 2018 15:21:30 Seb via Digitalmars-d wrote:
> >> [...]
> >
> > If it's not a range by default, why would you expect _anything_
> > which operates on ranges to work with rcstring directly? IMHO,
> > if it's not a range, then range-based functions shouldn't work
> > with it, and I don't see how they even _can_ work with it
> > unless you assume code units, or code points, or graphemes as
> > the default. If it's designed to not be a range, then it should
> > be up to the programmer to call the appropriate function on it
> > to get the appropriate range type for a particular use case, in
> > which case, you really shouldn't need to add much of any
> > overloads for it.
> >
> > - Jonathan M Davis
>
> Well, there are few cases where the range type doesn't matter and
> one can simply compare bytes, e.g.
>
> equal (e.g. "ä" == "ä" <=> [195, 164] == [195, 164])
> commonPrefix
> find
> ...

That effectively means treating rcstring as a range of char by default rather
than not treating it as a range by default. And if we then do that only with
functions that overload on rcstring rather than making rcstring actually a
range of char, then why aren't we just treating it as a range of char in
general?

IMHO, the fact that so many algorithms currently special-case on arrays of
characters is one reason that auto-decoding has been a disaster, and adding a
bunch of overloads for rcstring is just compounding the problem. Algorithms
should properly support arbitrary ranges of characters, and then rcstring can
be passed to them by calling one of the functions on it that returns an actual
range of code units, code points, or graphemes - either that, or rcstring
should default to being a range of char. Going halfway and making it work with
some functions via overloads really doesn't make sense.

Now, if we're talking about functions that really operate on strings and not
ranges of characters (and thus do stuff like append), then that becomes a
different question, but we've mostly been trying to move away from functions
like that in Phobos.

> Of course this assumes that there's no normalization necessary,
> but the current auto-decoding assumes this too.

You can still normalize with auto-decoding (the code units - and thus code
points - are in a specific order even when encoded, and that order can be
normalized), and really, anyone who wants fully correct string comparisons
needs to be normalizing their strings. With that in mind, rcstring probably
should support normalization of its internal representation.

- Jonathan M Davis
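
To make the code unit / code point / grapheme distinction and the normalization point concrete, here is a rough sketch. It uses plain `string` rather than rcstring (whose API isn't settled in this thread), and `sameText` is a hypothetical helper named only for this example. The range functions themselves (`byCodeUnit`, `byDchar`, `byGrapheme`, `normalize`) are the existing ones in std.utf and std.uni.

```d
import std.algorithm.comparison : equal;
import std.range : walkLength;
import std.string : representation;
import std.uni : byGrapheme, normalize, NFC;
import std.utf : byCodeUnit, byDchar;

// Hypothetical helper: compares two strings after normalizing both to NFC.
bool sameText(string a, string b)
{
    return normalize!NFC(a) == normalize!NFC(b);
}

void main()
{
    // "ä" as a single precomposed code point (U+00E4) ...
    string precomposed = "\u00E4";
    // ... and as 'a' followed by a combining diaeresis (U+0308).
    string decomposed = "a\u0308";

    // The caller picks the range type explicitly.
    assert(precomposed.byCodeUnit.walkLength == 2); // UTF-8 code units
    assert(decomposed.byCodeUnit.walkLength == 3);
    assert(precomposed.byDchar.walkLength == 1);    // code points
    assert(decomposed.byDchar.walkLength == 2);
    assert(precomposed.byGrapheme.walkLength == 1); // graphemes
    assert(decomposed.byGrapheme.walkLength == 1);

    // Comparing bytes works when both sides are encoded identically ...
    assert(equal(precomposed.representation, [195, 164]));
    // ... but fails for text that is equivalent without being byte-identical.
    assert(!equal(precomposed.byCodeUnit, decomposed.byCodeUnit));

    // Normalizing first makes the comparison succeed.
    assert(sameText(precomposed, decomposed));
}
```

The byte-level comparison only gives the right answer here once both strings are in the same normalization form, which is the reason for suggesting that rcstring support normalizing its internal representation.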
