Andrei Alexandrescu wrote:
> ...
>
> What can be done about that? I see a number of solutions:
>
> (a) Do not operate the change at all.
>
> (b) Operate the change and mention that in range algorithms you should
> check hasLength and only then use "length" under the assumption that it
> really means "elements count".
>
> (c) Deprecate the name .length for UTF-8 and UTF-16 strings, and define
> a different name for that. Any other name (codeUnits, codes etc.) would
> do. The entire point is to not make algorithms believe strings have a
> .length property.
>
> (d) Have std.range define a distinct property called e.g. "count" and
> then specialize it appropriately. Then change all references to .length
> in std.algorithm and elsewhere to .count.
>
> What would you do? Any ideas are welcome.
>
>
> Andrei
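Before I get to my preference: option (b) in practice means guarding every use of .length in an algorithm behind a trait check. A rough sketch, assuming std.range's hasLength/isInputRange traits (walkCount is just a made-up name):

    import std.range;

    // Only trust .length where hasLength!R promises it is the element count;
    // otherwise count by walking the range.
    size_t walkCount(R)(R r) if (isInputRange!R)
    {
        static if (hasLength!R)
            return r.length;
        else
        {
            size_t n;
            foreach (e; r) ++n;
            return n;
        }
    }

That works, but it relies on every algorithm author remembering to do it.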
I'm leaning towards (c) here.

To me the .length on char[] and wchar[] is kinda like doing this:

    struct SomePOD { int a, b; double y; }

    SomePOD pod;
    auto len = pod.length;  // pretend structs had a .length returning their size
    assert(len == 16);      // true

I'll admit it's not a perfect analogy. What I'm playing on here is that .length on char[] and wchar[] returns the /size of/ the string in code units (bytes, in the char[] case) rather than the /length/ of the string in number of (well-formed) characters.

Unfortunately .sizeof is supposed to return the size of the string's reference (8 bytes on x86 systems) and not the size of the string itself, IIRC. So that's taken.

So perhaps a .bytes or .nbytes property. Maybe make it work for arrays of structs and things like that too. A tuple (or any container) of non-homogeneous elements could probably benefit from this property as well; see the sketch below.

Given such a property being available, I wouldn't miss .length at all. It's quite misleading.
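To make that concrete, here's a minimal sketch of what I have in mind. The name nbytes and the exact overload set are just me guessing at an interface:

    import std.typecons : tuple;

    struct SomePOD { int a, b; double y; }

    // Dynamic arrays (char[]/wchar[] strings, arrays of structs, ...):
    // bytes occupied by the referenced data.
    size_t nbytes(T)(T[] arr)
    {
        return arr.length * T.sizeof;
    }

    // Structs -- and std.typecons tuples, which are structs under the hood:
    // size of the value itself.
    size_t nbytes(T)(T val) if (is(T == struct))
    {
        return T.sizeof;
    }

    unittest
    {
        SomePOD pod;
        assert(nbytes(pod) == 16);
        assert(nbytes([pod, pod]) == 32);
        assert(nbytes(tuple(1, 2.5)) == 16);

        assert(nbytes("héllo") == 6);    // 6 UTF-8 code units, 1 byte each
        assert(nbytes("héllo"w) == 10);  // 5 UTF-16 code units, 2 bytes each
        assert("héllo"w.length == 5);    // .length counts code units, not bytes
    }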