On Thu, 14 Mar 2013 00:19:15 +0000 "Whistler, Ken" <[email protected]> wrote:
> What is being corrected in the current text of the standard is
> separating the description of the format of DUCET, which *does* use 3
> 16-bit fields to record the 3 weights for each entry, from the
> logical description of tables and the algorithm, which does not
> depend on any particular bit size for the weight values.

Actually, there is a subtle and nasty difference, though probably one that will very rarely affect practical use. Its most obvious manifestation is in the application of the UCA parametric tailoring topVariable="u2FD5". U+2FD5 KANGXI RADICAL FLUTE is the last symbol in UnicodeData.txt by collating order. It has a compatibility decomposition to U+9FA0 and therefore the same primary weights.

Although I can't find a clear official definition of the semantics of 'topVariable', I do remember being told that it simply uses the first positive primary in the collation key as the maximum variable weight. Now in allkeys.txt, U+2FD5 expands to two collation elements. However, in FractionalUCA.txt, which specifies 32-bit (fractional) weights, it has a single collation element. Consequently, the effect of this tailoring will be different depending on how the collation elements are expressed!

For what it is worth, I think the interpretation based on 32-bit weights is more natural. The natural solution is to treat 'large weights' as being composed of an integer part and a fractional part for the purposes of variable weighting.

Richard.
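
P.S. A rough sketch of the discrepancy, in Python. It assumes the implicit-weight derivation from UTS #10 for unified ideographs (base FB40) and the "first positive primary" reading of topVariable described above, which is not an official definition. All function names are hypothetical, and simply concatenating the two 16-bit primaries stands in for the single large weight; the actual values in FractionalUCA.txt are different.

```python
def implicit_weights(cp, base=0xFB40):
    """Implicit primary weight pair (AAAA, BBBB) per UTS #10.

    base=0xFB40 is the value used for ideographs in the main
    CJK Unified Ideographs block, which contains U+9FA0.
    """
    aaaa = base + (cp >> 15)
    bbbb = (cp & 0x7FFF) | 0x8000
    return aaaa, bbbb


def primaries_16(cp):
    """Two collation elements, one 16-bit primary each (allkeys.txt style)."""
    return list(implicit_weights(cp))


def primaries_32(cp):
    """One collation element with a single large primary.

    Assumption: plain concatenation, used here only for illustration.
    """
    aaaa, bbbb = implicit_weights(cp)
    return [(aaaa << 16) | bbbb]


def top_variable(primaries):
    """Assumed semantics: the first positive primary is the maximum variable weight."""
    return next(p for p in primaries if p > 0)


def is_variable(primaries, top):
    """True if every collation element's primary falls in the variable range."""
    return all(0 < p <= top for p in primaries)


# U+2FD5 has the same primaries as U+9FA0, so use U+9FA0 to set the threshold.
top16 = top_variable(primaries_16(0x9FA0))
top32 = top_variable(primaries_32(0x9FA0))

# U+9FA5 sorts after U+9FA0; check it under both representations.
later = 0x9FA5
print(hex(top16), is_variable(primaries_16(later), top16))
print(hex(top32), is_variable(primaries_32(later), top32))
```

Under these assumptions the 16-bit threshold is only the lead weight FB41, so U+9FA5 still counts as variable even though it sorts after U+9FA0, whereas with the single large weight it does not.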

