On Fri, 15 Mar 2013 13:52:39 -0700 Markus Scherer <[email protected]> wrote:
> On Fri, Mar 15, 2013 at 12:50 PM, Richard Wordingham
> <[email protected]> wrote:
>
> > Not quite.  The characterisation of variable weights knows nothing
> > of the concept, and that is the problem.
>
> That's a problem in some implementations, but not a problem with the
> concept. Nothing prevents you from defining a variableTop that
> contains a string of "large weights", and comparing that lexically
> with the string of "large weights" that you look up for a character
> or substring. In fact, that's really what ICU does, except the
> current code is limited to one-or-two units (bytes).

I would say that the UCA Section 6.2 stops me.  It clearly says that
the generic example '[(X+1).zzzz.wwww], [yyyy.0000.0000]' is two
collation elements, not one.

Now, if I used allkeys_CLDR.txt as a convenient expression of
FractionalUCA.txt rather than in its own right, I might now be able to
argue that the large weights were just a convenient internal
representation of a 32-bit weight.

A possible argument is that although a tailoring has to be defined by a
'well-defined syntax' (What syntax defines FractionalUCA.txt?  Is it
'Use this instead'?), there is no requirement that this syntax has to
have well-defined semantics.  So, if the string specified by
variableTop has a primary starting with a 'large weight', I could
interpret that to mean that the 2-element large weights are to be
converted to 32-bit weights.  Does anyone accept this argument?

> In CLDR/ICU's FractionalUCA.txt, all but 40 or so of the primary
> weights (and many of the secondary weights) use the "large weights"
> mechanism.

No, they're 32-bit weights expressed by omitting trailing zero bytes.
More precisely, are they not defined to be fractional weights?

Richard.
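
To illustrate the reading of "32-bit weights expressed by omitting
trailing zero bytes", here is a minimal Python sketch.  The byte values
are made up for illustration and are not taken from FractionalUCA.txt;
the helper name pad_to_32_bits is likewise hypothetical.  It assumes
that no weight contains a zero byte, so that padding cannot make two
distinct weights equal.

```python
def pad_to_32_bits(weight_bytes: bytes) -> int:
    """Interpret a 1-4 byte fractional weight as a 32-bit integer
    by restoring the omitted trailing zero bytes."""
    if not 1 <= len(weight_bytes) <= 4:
        raise ValueError("expected 1 to 4 weight bytes")
    return int.from_bytes(weight_bytes.ljust(4, b"\x00"), "big")


# Hypothetical primary weights of differing lengths (illustrative only).
weights = [
    bytes([0x2A]),
    bytes([0x29, 0x30]),
    bytes([0x29, 0x30, 0x05]),
    bytes([0x29]),
]

# Ordering the raw byte strings lexically ...
by_bytes = sorted(weights)
# ... gives the same order as comparing the padded 32-bit values.
by_int = sorted(weights, key=pad_to_32_bits)
```

Under that assumption, lexical comparison of the byte strings and
numeric comparison of the padded 32-bit values agree, which is why the
two descriptions of the same data can be hard to tell apart.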

