CLDR doesn't modify anything but primaries in the root ordering. Individual
language tailorings may modify any of the levels, but I don't think they
typically change anything beyond the primary and secondary levels (Japanese,
which is quite complicated, is the exception).
Mark
*— Il meglio è l’inimico del bene —*
[https://plus.google.com/114199149796022210033]
On Fri, Jan 27, 2012 at 13:51, Ken Whistler k...@sybase.com wrote:
On 1/27/2012 1:16 PM, Matt Ma wrote:
Hi,
There are a few characters that have no decomposition type defined in
UnicodeData.txt, but they were assigned tertiary weights in
allkeys.txt as if they had a decomposition type. Here are a
few examples (version 6.0.0):
...
U+A733, U+A732, and U+1F1E6 were given tertiary weights as if they were
compat, while U+31B4 was weighted as if it were final.
Yep, that is all done deliberately, to make the default sorting a bit more
consistent.
The normative decompositions in UnicodeData.txt are only the starting point
for attempting to give more consistent default weights for collation.
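
For anyone who wants to check this kind of mismatch themselves, here is a minimal Python sketch, not anything taken from the UCA tooling. It assumes the semicolon-separated field layout of UnicodeData.txt (decomposition in the sixth field) and the bracketed collation-element syntax of allkeys.txt. The function names are hypothetical, and the allkeys.txt sample line uses placeholder weights rather than real 6.0.0 data.

```python
import re

# One collation element such as [.1234.0020.0004.A733] or [*0209.0020.0002];
# the fourth field is optional because not every allkeys.txt version has it.
CE_PATTERN = re.compile(
    r"\[[.*]([0-9A-F]{4,6})\.([0-9A-F]{4})\.([0-9A-F]{4})(?:\.[0-9A-F]{4,6})?\]"
)

def decomposition_type(unicode_data_line):
    """Return the decomposition tag ('compat', 'final', ...), 'canonical'
    for an untagged mapping, or None when the field is empty."""
    fields = unicode_data_line.strip().split(";")
    mapping = fields[5]
    if not mapping:
        return None
    if mapping.startswith("<"):
        return mapping[1:mapping.index(">")]
    return "canonical"

def tertiary_weights(allkeys_line):
    """Return the tertiary weight of each collation element on the line."""
    entry = allkeys_line.split("#", 1)[0]      # drop the trailing comment
    _, _, elements = entry.partition(";")      # keep what follows the code point(s)
    return [int(m.group(3), 16) for m in CE_PATTERN.finditer(elements)]

# UnicodeData.txt-style lines; the second has an empty decomposition field.
micro_sign = "00B5;MICRO SIGN;Ll;0;L;<compat> 03BC;;;;N;;;039C;;039C"
small_aa = "A733;LATIN SMALL LETTER AA;Ll;0;L;;;;;N;;;A732;;A732"

# Placeholder allkeys.txt-style line: the weights are made up for illustration.
placeholder_keys = "A733 ; [.1234.0020.0004.A733][.1234.0020.0004.A733] # placeholder"

print(decomposition_type(micro_sign))
print(decomposition_type(small_aa))
print(tertiary_weights(placeholder_keys))
```

Running both functions over the real files and listing the code points where `decomposition_type` returns None but the tertiary weights differ from the common default would surface the cases Matt describes.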
Is this something documented outside of UCA?
No, because it is only relevant *to* UCA. At least as far as documentation
written by the UTC is concerned.
Well, I suppose it is also relevant to CLDR, because CLDR bases its collation
tables on a tailoring of allkeys.txt from UCA. I don't know what documentation
there may or may not be about the default treatment for tertiary weights
in CLDR. Somebody involved in the details of CLDR collation will have
to answer that one.
--Ken