On Wed, 23 May 2012 17:47:09 -0700 Markus Scherer <markus....@gmail.com> wrote:
> Also, I just saw that
> http://www.unicode.org/Public/UCA/latest/CollationAuxiliary.zip contains
> allkeys_CLDR.txt which should correspond 1:1 with the
> FractionalUCA*.txt in the same .zip file.

> One format difference:
<snip>

Flicking through the tail end of the diff between the two files, I spotted two differences.

The first: DUCET allkeys.txt gives the same 4th level weight to U+2FA6, U+328E and U+F90A, although U+91D1 is only a compatibility decomposition of the first two (it is the canonical decomposition of U+F90A alone). By contrast, allkeys_CLDR.txt follows the documented process of setting the 4th level weight according to the canonical decomposition. This pattern seems to be repeated throughout the CJK characters.

The second difference is again at the 4th level: allkeys_CLDR.txt gives different 4th level weights to the canonically equivalent U+1B40 and <U+1B3E, U+1B35>, which is wrong, since canonically equivalent strings should get identical weights.

It's as though DUCET and the root locale collation were generated from imperfectly aligned programs rather than one being derived from the other.

As ICU does not load the 4th level weight, it will be shielded from these issues.

Richard.
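
P.S. For anyone who wants to check the decomposition types mentioned above, here is a minimal sketch using only Python's standard unicodedata module. The helper names decomposition_kind and canonically_equivalent are made up for illustration. It reads the decomposition mappings from the Unicode Character Database; it does not read allkeys.txt or allkeys_CLDR.txt, so it says nothing about the 4th level weights themselves.

```python
import unicodedata


def decomposition_kind(ch):
    """Classify a character's decomposition mapping.

    Returns (kind, code_points), where kind is 'canonical',
    'compatibility' or 'none'. Hypothetical helper, not part of any
    collation tool.
    """
    raw = unicodedata.decomposition(ch)
    if not raw:
        return "none", []
    parts = raw.split()
    # A leading tag such as <compat> or <circle> marks a compatibility mapping.
    if parts[0].startswith("<"):
        return "compatibility", [int(p, 16) for p in parts[1:]]
    return "canonical", [int(p, 16) for p in parts]


def canonically_equivalent(a, b):
    """Two strings are canonically equivalent if their NFD forms match."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


if __name__ == "__main__":
    for cp in (0x2FA6, 0x328E, 0xF90A, 0x1B40):
        kind, target = decomposition_kind(chr(cp))
        mapped = " ".join(f"U+{t:04X}" for t in target)
        print(f"U+{cp:04X}: {kind} -> {mapped}")
```

Only U+F90A and U+1B40 should come out as canonical; canonically_equivalent("\u1b40", "\u1b3e\u1b35") confirms that the two Balinese forms are the same string under NFD, which is why they ought to share all weights.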