On 05/27/2016 08:42 PM, Andrei Alexandrescu wrote:
> Which languages are covered by code points, and which languages require
> graphemes consisting of multiple code points? How does normalization
> play into this? -- Andrei
I don't think there is value in distinguishing by language. The point of
Unicode is that you shouldn't need to do that.
I think there are scripts that use combining characters extensively, but
Unicode also has general-purpose combining marks, such as combining
arrows. Those can make sense in otherwise plain English text.
For example: 'a' + U+20D7 (COMBINING RIGHT ARROW ABOVE) = a⃗.
There is no precomposed character for that, so normalization can't
collapse it into a single code point.
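
To make that concrete, here is a quick sketch. It is in Python only because its standard unicodedata module makes this easy to check. The helper name nfc_code_points is made up for this example. It compares a sequence that does have a precomposed form ('e' + U+0301) with the arrow case above.

```python
import unicodedata

def nfc_code_points(text):
    """Return the code points of `text` after NFC normalization, as U+XXXX strings.

    Hypothetical helper, named for this example only.
    """
    composed = unicodedata.normalize("NFC", text)
    return ["U+%04X" % ord(ch) for ch in composed]

# 'e' + COMBINING ACUTE ACCENT has a precomposed form (U+00E9),
# so NFC collapses the pair into a single code point.
e_acute = nfc_code_points("e\u0301")

# 'a' + COMBINING RIGHT ARROW ABOVE has no precomposed form,
# so NFC leaves both code points in place.
a_arrow = nfc_code_points("a\u20d7")

print(e_acute)  # ['U+00E9']
print(a_arrow)  # ['U+0061', 'U+20D7']
```

So even after NFC, the arrow case is still one user-perceived character made of two code points, and code that works per code point will split it.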