Soobok Lee <[EMAIL PROTECTED]> wrote:

> Even though 0049 0307 === 0130 (modulo NFC), two have different
> output labels.

Oh dear, that is just wrong. It violates the Unicode principle that canonically equivalent strings should always be treated the same.

I think this points to a larger problem: The Unicode Consortium has provided a normalization algorithm that squashes equivalent variations, and a folding algorithm that squashes case differences, but they haven't provided an algorithm that squashes both equivalent variations and case differences. So the IETF has tried to build one, and we've gotten it wrong.

I suggest that the Unicode Consortium should define two new algorithms: one that is like NFC but also squashes case, and one that is like NFKC but also squashes case. Then nameprep can simply refer to one of those (and also specify a set of prohibited characters).

AMC
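
To make the example above concrete, here is a rough Python sketch of what an "NFC plus case squashing" operation might look like. It is only an illustration under stated assumptions: it relies on the standard library's unicodedata.normalize and str.casefold, the function name fold_canonical is hypothetical, and it is not the nameprep mapping nor any algorithm defined by the Unicode Consortium. A single pass like this is also not guaranteed to be a fixed point for every possible string.

```python
import unicodedata


def fold_canonical(s: str) -> str:
    """Hypothetical helper: squash canonical variants and case together."""
    # Decompose first, so precomposed and combining-sequence spellings
    # enter case folding in the same form.
    decomposed = unicodedata.normalize("NFD", s)
    # str.casefold() applies Unicode full case folding.
    folded = decomposed.casefold()
    # Recompose so the result is in NFC.
    return unicodedata.normalize("NFC", folded)


a = "\u0049\u0307"  # LATIN CAPITAL LETTER I + COMBINING DOT ABOVE
b = "\u0130"        # LATIN CAPITAL LETTER I WITH DOT ABOVE

# The two spellings are canonically equivalent: NFC maps both to U+0130.
print(unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b))

# Canonically equivalent inputs should therefore give the same folded output.
print(fold_canonical(a) == fold_canonical(b))
```

The point of decomposing before folding is that the folding step then never sees two different spellings of the same canonically equivalent string, which is the property the quoted example shows being violated.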
