On 27/10/2003 16:16, Philippe Verdy wrote:

...


So, all we can do is define a compatibility equivalence between <c1, CCO, c2> and <c1, c2> if and only if CC(c1) > CC(c2) > 0.

This won't affect the NFC and NFD conversion algorithms, but it can affect
the NFKC and NFKD conversion algorithms. This means that XML, SGML and
HTML are not affected by this change [ and the W3C is happy :-> ].



Thanks for the clarification. In principle we might be able to go a little further: we could define both <c, CCO> and <CCO, c> as canonically equivalent to c for every c of combining class zero. This would have to be handled as some kind of decomposition exception, so that c is never decomposed by adding a CCO before or after it. It would not remove a CCO between two combining characters, so if 0 < CC(c1) < CC(c2), then <c1, c2> and <c1, CCO, c2> would remain canonically distinct even though they are logically equivalent. In practice this would be a small price to pay, as it matters only in the rare case of two vowels on one consonant which happen to be in canonical order already.
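
To make the suggested rule concrete, here is a minimal Python sketch of it. This is only an illustration under stated assumptions, not a proposed algorithm: CCO has no assigned code point, so a private-use character stands in for it, and the names CCO, is_base and strip_redundant_cco are invented for this example. The Hebrew letter and points are just convenient test characters. Combining classes come from Python's standard unicodedata module.

```python
import unicodedata

# Hypothetical stand-in for the proposed CCO character. No such code point
# is assigned; a private-use character is used purely for illustration.
CCO = "\uE000"

LAMED = "\u05DC"  # Hebrew letter, combining class zero
HIRIQ = "\u05B4"  # Hebrew point, non-zero combining class
PATAH = "\u05B7"  # Hebrew point, higher combining class than HIRIQ


def is_base(ch: str) -> bool:
    """True for a real character of combining class zero (never for CCO)."""
    return ch != CCO and unicodedata.combining(ch) == 0


def strip_redundant_cco(text: str) -> str:
    """Apply <c, CCO> == c and <CCO, c> == c for every class-zero c.

    A CCO survives only when the nearest real characters on both sides
    are combining marks (or absent), as in <c1, CCO, c2>.
    """
    out = []
    for i, ch in enumerate(text):
        if ch == CCO:
            # Nearest real character before: dropped CCOs never reach `out`,
            # so a base character stays visible across a run of CCOs.
            prev_is_base = bool(out) and is_base(out[-1])
            # Nearest real character after: skip over any run of CCOs.
            rest = text[i + 1:].lstrip(CCO)
            next_is_base = bool(rest) and is_base(rest[0])
            if prev_is_base or next_is_base:
                continue  # redundant next to a class-zero character
        out.append(ch)
    return "".join(out)


# CCO next to a class-zero letter is dropped ...
after_base = strip_redundant_cco(LAMED + CCO + HIRIQ)
# ... but CCO between two combining marks is left alone.
between_marks = strip_redundant_cco(LAMED + HIRIQ + CCO + PATAH)
```

Here after_base comes out the same as LAMED + HIRIQ, while between_marks keeps its CCO even though the two points are already in canonical order. That is exactly the leftover case described above.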

--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/




