On Tue, 6 Aug 2013 19:27:56 +0200 Philippe Verdy <[email protected]> wrote:
> But there's an admitted exception : sorting with UCA may change the
> relative order between the source strings, simply because sort
> stability is not always wanted (it has a cost), and binary sorting
> the results using the code point values as an additional collation
> level is not always wanted, and normalization remains optional in
> UCA.

No, unless you are observing that the relative order of canonically
equivalent strings in a sorted list is undefined. (Strings that compare
'equal' may appear in either order.) An implementation of the UCA may
neglect to normalise, but then it should only be used on input for which
normalisation is unnecessary.

There is another, obscure type of conforming process: one that does
casing operations by the rules. The rules fail to preserve canonical
equivalence, though in some such cases it is arguable that neither
result is linguistically correct. For example, I think the proper
upper-casing of <U+1FB3 GREEK SMALL LETTER ALPHA WITH YPOGEGRAMMENI,
U+0359 COMBINING ASTERISK BELOW> is <U+0391 GREEK CAPITAL LETTER ALPHA,
U+0359, U+0399 GREEK CAPITAL LETTER IOTA, U+0359>. I don't expect this
ever to be captured by the default casing.

Richard.
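
The casing point above can be checked with a short Python sketch. It assumes that Python's built-in str.upper() is an acceptable stand-in for the default casing rules and that the interpreter's Unicode data gives U+0359 a lower combining class than U+0345, so that NFD moves the asterisk in front of the ypogegrammeni. The helper names (nfd, canonically_equivalent, code_points) are made up for this illustration.

```python
import unicodedata


def nfd(s: str) -> str:
    """Return the canonical decomposition (NFD) of s."""
    return unicodedata.normalize("NFD", s)


def canonically_equivalent(a: str, b: str) -> bool:
    """Two strings are canonically equivalent iff their NFD forms match."""
    return nfd(a) == nfd(b)


def code_points(s: str) -> str:
    """Format a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in s)


# ALPHA WITH YPOGEGRAMMENI followed by COMBINING ASTERISK BELOW.
composed = "\u1FB3\u0359"
# The same text in NFD; canonical reordering changes the mark order.
decomposed = nfd(composed)

# Upper-case each spelling with Python's built-in casing
# (an assumed stand-in for the default casing rules).
upper_composed = composed.upper()
upper_decomposed = decomposed.upper()

print("input 1:", code_points(composed))
print("input 2:", code_points(decomposed))
print("inputs equivalent: ", canonically_equivalent(composed, decomposed))
print("upper 1:", code_points(upper_composed))
print("upper 2:", code_points(upper_decomposed))
print("outputs equivalent:",
      canonically_equivalent(upper_composed, upper_decomposed))
```

The two inputs are canonically equivalent, but the two upper-cased results are not: in one the asterisk ends up after the capital iota, in the other after the capital alpha. Neither matches the linguistically motivated result suggested above, which would carry the asterisk on both letters.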

