On Saturday, 8 March 2014 at 01:23:27 UTC, Andrei Alexandrescu wrote:
> Yup, the grapheme issue. This should work.

No. It does not work, because grapheme segmentation is not the same as normalization. Even if you fix the code (it should be: assert(s.byGrapheme.canFind!"a[] == b"("é"))), it will still fail, because byGrapheme does not normalize (and not every grapheme can be normalized to a single code point anyway). There is also more than one normalization form, and you have to pick the one that fits what you're trying to achieve.
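To make the distinction concrete, here is a minimal sketch using std.uni. The helper names graphemeCount and codePointCount and the sample strings are mine, made up for illustration; it assumes a Phobos version that ships byGrapheme and normalize.

```d
import std.range : walkLength;
import std.uni : byGrapheme, normalize, NFC, NFD, NFKC;

/// Hypothetical helper: number of grapheme clusters in s.
size_t graphemeCount(string s)
{
    return s.byGrapheme.walkLength;
}

/// Hypothetical helper: number of code points in s
/// (strings are iterated by dchar, so walkLength counts code points).
size_t codePointCount(string s)
{
    return s.walkLength;
}

void main()
{
    string precomposed = "\u00E9";  // é as a single code point
    string decomposed  = "e\u0301"; // 'e' followed by COMBINING ACUTE ACCENT

    // Segmentation sees one grapheme in both spellings...
    assert(graphemeCount(precomposed) == 1);
    assert(graphemeCount(decomposed) == 1);
    // ...but it does not make the two spellings compare equal.
    assert(precomposed != decomposed);

    // Normalization is the step that maps one spelling onto the other.
    assert(normalize!NFC(decomposed) == precomposed);
    assert(normalize!NFD(precomposed) == decomposed);

    // Not every grapheme has a single-code-point form:
    // 'x' + acute is still two code points after NFC.
    string noPrecomposed = "x\u0301";
    assert(graphemeCount(noPrecomposed) == 1);
    assert(codePointCount(normalize!NFC(noPrecomposed)) == 2);

    // The form matters: NFC keeps the "fi" ligature, NFKC expands it.
    string ligature = "\uFB01";
    assert(normalize!NFC(ligature) == ligature);
    assert(normalize!NFKC(ligature) == "fi");
}
```

So a search that is supposed to treat both spellings of "é" as the same text has to normalize both sides to the same form first; iterating by grapheme alone does not do that.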

> Graphemes are the next level of Nirvana above code points, but that doesn't mean it's graphemes or nothing.

It's not about types, it's about algorithms. It's never "X or nothing" - unless X is "actually understanding Unicode". Everything else is a compromise.

Compromises are acceptable, but not when they are built into the language as the standard way of working with text, thus hiding the problems that come with them.
