From: "Peter Kirk" <[EMAIL PROTECTED]>
> Thanks for the clarification. In principle we might be able to go a
> little further: we could define both <c, CCO> and <CCO, c> as
> canonically equivalent to c for all c in combining class zero. This
> would have to be some kind of decomposition exception so that c is never
> decomposed by adding CCO before or after it. This would not remove CCO
> between two combining characters, so, if 0<c1<c2, <c1, c2> and <c1, CCO,
> c2> would remain not canonically equivalent while logically equivalent.
> In practice this would be a small price to pay as it is relevant only in
> the almost unique case of two vowels on one consonant which actually
> happen to be in canonical order.

Why is that? Since CCO is not defined in any past version, the stability pact does not forbid its _removal_ when computing the NFC, NFD, NFKC or NFKD forms. It only says that we must _not insert_ it into a source string <c1, c2> where c1 and c2 are already assigned.

So we are fine: we can define a canonical equivalence between <c1, CCO, c2> and <c1, c2>, where the latter is simultaneously in NFC, NFD, NFKC and NFKD forms, for every pair (c1, c2) such that CC(c1)<=CC(c2) or CC(c2)=0. However, we cannot define this within the UCD data; it would have to be defined algorithmically, as is done for Hangul syllables and jamos...
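To make the rule concrete, here is a rough sketch of what such an algorithmic step could look like. It is only an illustration: CCO is not an assigned character, so the private-use code point U+E000 stands in for it here, and the function names removable() and strip_cco() are made up for this example. Combining classes come from Python's standard unicodedata module.

```python
import unicodedata

# ASSUMPTION: CCO is only a proposal and has no code point.
# U+E000 (private use, combining class 0) is a stand-in for illustration.
CCO = "\uE000"


def removable(c1: str, c2: str) -> bool:
    """True if <c1, CCO, c2> may be reduced to <c1, c2> under the rule
    CC(c1) <= CC(c2) or CC(c2) = 0."""
    cc1 = unicodedata.combining(c1)
    cc2 = unicodedata.combining(c2)
    return cc1 <= cc2 or cc2 == 0


def strip_cco(text: str) -> str:
    """Drop each CCO that sits between two characters satisfying the rule.

    A CCO at the very start or end of the string has no pair (c1, c2)
    around it and is left alone in this sketch.
    """
    out = []
    for i, ch in enumerate(text):
        if ch == CCO and out and i + 1 < len(text):
            if removable(out[-1], text[i + 1]):
                continue  # CCO is redundant here: remove it
        out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    acute = "\u0301"      # COMBINING ACUTE ACCENT, class 230
    dot_below = "\u0323"  # COMBINING DOT BELOW, class 220
    # 220 <= 230, already in canonical order: CCO is removed
    print(strip_cco("a" + dot_below + CCO + acute) == "a" + dot_below + acute)
    # 230 > 220: CCO is what prevents reordering, so it stays
    print(strip_cco("a" + acute + CCO + dot_below) == "a" + acute + CCO + dot_below)
```

The CCO is kept only where CC(c1) > CC(c2) > 0, which is exactly the case where it is needed to block canonical reordering. A real definition would have to be a step of the normalization algorithm itself rather than a post-processing pass like this one.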

