The draft update to LDML for collation is at http://unicode.org/repos/cldr/trunk/specs/ldml/tr35-collation.html.
Bugs or requests can be filed at http://unicode.org/cldr/trac/newticket .

Mark <https://plus.google.com/114199149796022210033>

Il meglio è l’inimico del bene

On Mon, Feb 11, 2013 at 9:35 AM, Richard Wordingham <richard.wording...@ntlworld.com> wrote:

> On Mon, 11 Feb 2013 02:45:27 +0100
> Philippe Verdy <verd...@wanadoo.fr> wrote:
>
> > 2013/2/10 Richard Wordingham <richard.wording...@ntlworld.com>:
>
> > The term "pathological" could apply to these cases where a "naive"
> > implementation may in fact break the expectations. How then can a
> > collator become a "conforming" process if it has to differentiate
> > canonically equivalent input strings?
>
> There is a UCA collation option, 'normalization' set to 'off', which
> allows such incorrect operation if strings are not FCD. (Both NFC and
> NFD strings are FCD.) The UCA and LDML definitions *still* together
> falsely claim that omitting normalisation will give the correct result
> on FCD strings; counter-examples include default collation <U+0F71
> TIBETAN VOWEL SIGN AA, U+0F73 TIBETAN VOWEL SIGN II> and Danish (still
> at CLDR Version 22.1) <U+0061 LATIN SMALL LETTER A, U+00E5 LATIN SMALL
> LETTER A WITH RING ABOVE>.
>
> Richard.
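
For readers who want to check the two counter-examples quoted above, here is a minimal Python sketch using only the standard `unicodedata` module. The helper name `is_fcd` is hypothetical, and the test it applies is the usual FCD condition as I understand it (the leading combining class of each character's canonical decomposition must not be lower than the trailing class of the previous character's, unless it is zero); it is not taken from any collation library. It only shows that both strings pass the FCD test while differing from their NFD forms; it does not run a collator.

```python
import unicodedata


def is_fcd(text: str) -> bool:
    """Rough FCD test (hypothetical helper, not a library API).

    For each character, take its full canonical decomposition and compare
    the combining class of its first code point with the combining class
    of the last code point of the previous character's decomposition.
    """
    prev_trail = 0
    for ch in text:
        decomposed = unicodedata.normalize("NFD", ch)
        lead = unicodedata.combining(decomposed[0])
        trail = unicodedata.combining(decomposed[-1])
        # A non-zero lead class lower than the previous trail class means
        # canonical reordering would be needed, so the text is not FCD.
        if lead != 0 and prev_trail > lead:
            return False
        prev_trail = trail
    return True


# The two counter-examples from the quoted message.
tibetan = "\u0F71\u0F73"  # TIBETAN VOWEL SIGN AA, TIBETAN VOWEL SIGN II
danish = "\u0061\u00E5"   # LATIN SMALL LETTER A, A WITH RING ABOVE

for label, s in (("Tibetan", tibetan), ("Danish", danish)):
    nfd = unicodedata.normalize("NFD", s)
    print(label, "FCD:", is_fcd(s), "already NFD:", nfd == s,
          "NFD code points:", [f"U+{ord(c):04X}" for c in nfd])
```

Both strings are FCD yet change under NFD, so a collator running with normalization off sees different code point sequences for canonically equivalent input, which is the situation the quoted message describes.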