Re: Text in composed normalized form is king, right? Does anyone generate text in decomposed normalized form?

Mark Davis ☕ Mon, 11 Feb 2013 01:35:25 -0800

The draft update to LDML for collation is at
http://unicode.org/repos/cldr/trunk/specs/ldml/tr35-collation.html.


Bugs or requests can be filed at http://unicode.org/cldr/trac/newticket .

Mark <https://plus.google.com/114199149796022210033>
*— Il meglio è l’inimico del bene —*


On Mon, Feb 11, 2013 at 9:35 AM, Richard Wordingham <
[email protected]> wrote:

> On Mon, 11 Feb 2013 02:45:27 +0100
> Philippe Verdy <[email protected]> wrote:
>
> > 2013/2/10 Richard Wordingham <[email protected]>:
>
> > The term "pathological" could apply to these cases where a "naive"
> > implementation may in fact break the expectations. How then can a
> > collator become a "conforming" process if it has to differentiate
> > canonically equivalent input strings?
>
> There is a UCA collation option, 'normalization' set to 'off', which
> allows such incorrect operation if strings are not FCD.  (Both NFC and
> NFD strings are FCD.) The UCA and LDML definitions *still* together
> falsely claim that omitting normalisation will give the correct result
> on FCD strings; counter-examples include default collation <U+0F71
> TIBETAN VOWEL SIGN AA, U+0F73 TIBETAN VOWEL SIGN II> and Danish (still
> at CLDR Version 22.1) <U+0061 LATIN SMALL LETTER A, U+00E5 LATIN SMALL
> LETTER A WITH RING ABOVE>.
>
> Richard.
>
>
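
For readers who want to check the FCD claims about the two counter-examples quoted above, here is a minimal Python sketch. It assumes the usual description of FCD: a string is FCD if decomposing each code point on its own, with no reordering across code points, already yields the NFD form. The helper names (raw_decompose, is_fcd, canonically_equivalent) are made up for this illustration and are not part of any library. The sketch only tests normalization properties; it does not run a collator, so it does not by itself show the differing sort results.

```python
import unicodedata


def raw_decompose(s):
    """Canonically decompose each code point separately, without reordering."""
    return "".join(unicodedata.normalize("NFD", ch) for ch in s)


def is_fcd(s):
    """Assumed FCD test: the raw decomposition is already in NFD."""
    return raw_decompose(s) == unicodedata.normalize("NFD", s)


def canonically_equivalent(a, b):
    """Two strings are canonically equivalent if their NFD forms match."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# <U+0F71 TIBETAN VOWEL SIGN AA, U+0F73 TIBETAN VOWEL SIGN II>
tibetan = "\u0f71\u0f73"
# The same two code points in the opposite order (hypothetical comparison string).
tibetan_swapped = "\u0f73\u0f71"
# <U+0061 LATIN SMALL LETTER A, U+00E5 LATIN SMALL LETTER A WITH RING ABOVE>
danish = "a\u00e5"

if __name__ == "__main__":
    for label, text in [("tibetan", tibetan),
                        ("tibetan_swapped", tibetan_swapped),
                        ("danish", danish)]:
        codepoints = " ".join(f"U+{ord(ch):04X}" for ch in text)
        print(f"{label}: {codepoints} FCD={is_fcd(text)}")
    print("tibetan ~ tibetan_swapped:",
          canonically_equivalent(tibetan, tibetan_swapped))
```

Under that assumed definition, both quoted sequences pass the FCD test, while the swapped Tibetan sequence is canonically equivalent to the first but is not FCD. That is the situation the quoted paragraph describes: a collator running with normalization off sees different code point sequences for canonically equivalent input.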
