On Thursday, 2 June 2016 at 20:13:14 UTC, Andrei Alexandrescu wrote:
On 06/02/2016 03:34 PM, tsbockman wrote:
Your 'ö' examples will NOT work reliably with auto-decoded code points, and for nearly the same reason that they won't work with code units; you would have to use byGrapheme.

They do work per spec: find this code point. It would be surprising if 'ö' were found but the string were positioned at a different code point.

Your examples will pass or fail depending on how (and whether) the 'ö' grapheme is normalized. They only ever succeed because 'ö' happens to be one of the privileged graphemes that *can* be (but often isn't!) represented as a single code point. Many other graphemes have no such representation.
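To make the normalization point concrete, here is a minimal sketch. It is written in Python rather than D purely for illustration, on the assumption that iterating a Python `str` (which yields code points) is a fair stand-in for an auto-decoded range; `contains_code_point` is a hypothetical helper, not a Phobos function.

```python
import unicodedata

# Two encodings of the same user-perceived character 'ö'.
composed = "\u00f6"      # single code point U+00F6
decomposed = "o\u0308"   # 'o' followed by U+0308 COMBINING DIAERESIS


def contains_code_point(text, cp):
    """Hypothetical helper: code-point-level search, one code point at a time."""
    return any(ch == cp for ch in text)


# The search succeeds only when the text happens to be precomposed.
found_in_composed = contains_code_point(composed, "\u00f6")
found_in_decomposed = contains_code_point(decomposed, "\u00f6")

# Normalizing to NFC first makes this particular search succeed...
found_after_nfc = contains_code_point(
    unicodedata.normalize("NFC", decomposed), "\u00f6"
)

# ...but some graphemes have no precomposed form at all, so NFC still
# leaves them as more than one code point ('q' + combining diaeresis).
q_diaeresis_nfc_length = len(unicodedata.normalize("NFC", "q\u0308"))
```

Under these assumptions the search finds 'ö' in the composed string, misses it in the decomposed one, and finds it again only after NFC normalization. The last line is the case normalization cannot help with, which is where grapheme-level iteration comes in.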

Working directly with code points is sometimes useful anyway - but then, working with code units can be, also. Neither will lead to inherently "correct" Unicode processing, and in the absence of a compelling context, your examples fall completely flat as an argument for the inherent superiority of processing at the code point level.

The fact that you still don't get that, even after a dozen plus attempts by the community to explain the difference, makes you unfit to direct Phobos' Unicode support.

Well there's gotta be a reason why my basic comprehension is under constant scrutiny whereas yours is safe.

Who said mine is safe? I *know* that I'm not qualified to be in charge of this.

Your comprehension is under greater scrutiny because you are proposing to overrule nearly all other active contributors combined.

Please, either go study Unicode until you really understand it, or delegate this issue to someone else.

Would be happy to. To whom would I delegate?

If you're serious, I would suggest Dmitry Olshansky. He seems to be our top Unicode expert, based on his contributions to `std.uni` and `std.regex`. But if he is unwilling or unsuitable for some reason, there are other candidates participating in this thread (not me).
