From: "Peter Kirk" <[EMAIL PROTECTED]>

> Thanks for the clarification. In principle we might be able to go a 
> little further: we could define both <c, CCO> and <CCO, c> as 
> canonically equivalent to c for all c in combining class zero. This 
> would have to be some kind of decomposition exception so that c is never 
> decomposed by adding CCO before or after it. This would not remove CCO 
> between two combining characters, so, if 0<c1<c2, <c1, c2> and <c1, CCO, 
> c2> would remain not canonically equivalent while logically equivalent. 
> In practice this would be a small price to pay as it is relevant only in 
> the almost unique case of two vowels on one consonant which actually 
> happen to be in canonical order.

Why that?

As CCO is not defined in any past version, the stability pact does
not say that we must forbid its _removal_ when computing the NFC, NFD,
NFKC or NFKD forms. It just says that we must _not insert_ it in a
source string <c1, c2> where c1 and c2 are already assigned.

So we are fine: we can define a canonical equivalence between
<c1, CCO, c2> and <c1, c2>, where the latter is simultaneously in
NFC, NFD, NFKC and NFKD forms, for every pair (c1, c2) such that
CC(c1)<=CC(c2) or CC(c2)=0.

However, we cannot define it within the UCD; it has to be defined
algorithmically, as is done for Hangul syllables/jamos...
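
To make the rule concrete, here is a rough sketch in Python of what such
an algorithmic step might look like. It is only an illustration, not a
proposed implementation: CCO has no assigned code point, so the private-use
character U+E000 stands in for it, and the function name
strip_removable_cco is made up for this example. Combining classes come
from Python's standard unicodedata module.

```python
import unicodedata

# ASSUMPTION: CCO is not an assigned character; U+E000 (private use)
# is a hypothetical stand-in used only for this illustration.
CCO = "\uE000"


def strip_removable_cco(s: str) -> str:
    """Drop each CCO found in <c1, CCO, c2> when CC(c1) <= CC(c2) or CC(c2) == 0.

    A CCO at the start or end of the string, or next to another CCO,
    is left alone, since the rule is only stated for a (c1, c2) pair.
    """
    out = []
    i = 0
    n = len(s)
    while i < n:
        ch = s[i]
        has_pair = (
            ch == CCO
            and out
            and out[-1] != CCO
            and i + 1 < n
            and s[i + 1] != CCO
        )
        if has_pair:
            cc1 = unicodedata.combining(out[-1])
            cc2 = unicodedata.combining(s[i + 1])
            if cc1 <= cc2 or cc2 == 0:
                i += 1  # skip the CCO: <c1, c2> is already in canonical order
                continue
        out.append(ch)
        i += 1
    return "".join(out)
```

With lamed (U+05DC, class 0), hiriq (U+05B4, class 14) and patah
(U+05B7, class 17), the CCO in <lamed, hiriq, CCO, patah> would be
dropped, while the one in <lamed, patah, CCO, hiriq> would be kept,
because removing it would let normalization reorder the two vowels.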

