/[One consequence of the string policy is that ]/we can no longer
    encode new precomposed characters for grapheme clusters that are
    already encoded in any existing standard form/[.]/

And you've truncated the end of my sentence
Well, I have not, unless you really want to count that little period. While we're at it, I have a habit of trying to quote linguistic constituents whenever it is possible to do so without too much contortion, and the period has scope over the whole sentence in a way, so if I omit the first part I want to omit the period as well, but that's really just me ;-)

Okay. So I had truncated away /supplementary content/
yes technically they could still be encoded [...]
In other words, this duplicate encoding (without canonical equivalence) is not needed and not desirable.
which is great to know, but still not part of the issue. Maybe it was a mistake to precompose only {ᾱ,ῑ,ῡ}; or they were part of some preexisting character set (and then Unicode policy prevented us from adding more – a policy whose content I'm not exploring and whose sense I'm not debating). It will also depend on the reason for precomposing Greek in the first place. The question remains: If someone was thinking far enough to be wanting to have these in some set, that person should have known they're likely gonna be used in an environment where additional diacritics may be added. Pitch accent depends on vowel length, so it's unlikely someone would want these three characters only in an otherwise diacritic-free string or only for say parenthetical comment after a word or phrase. Here's another idea: perhaps a keyboard layout didn't want to bother with a macron dead key but already provided dead keys for the breathings and accent marks (but that'd waste 2 keys) – and an encoding reflected that. But this is all speculation. It's a little oddity, and I'm asking whether someone knows /something/ about this topic.

S

Reply via email to