On Wed, Jul 4, 2012 at 6:12 PM, Richard Wordingham
richard.wording...@ntlworld.com wrote:
I've automated the check and have something like a 6-page list
of anomalies in level 4 weights, with anomalies for DUCET and for the
CLDR root locale.
From what I have heard, the level-4 weights in
On Fri, 25 May 2012 12:34:01 -0700
Markus Scherer markus@gmail.com wrote:
On Thu, May 24, 2012 at 5:36 PM, Richard Wordingham
richard.wording...@ntlworld.com wrote:
I spotted two differences flicking through the end of the
differences -
Nice work! Please submit your findings via
On Thu, May 24, 2012 at 5:36 PM, Richard Wordingham
richard.wording...@ntlworld.com wrote:
I spotted two differences flicking through the end of the differences -
Nice work! Please submit your findings via the Unicode reporting
form http://www.unicode.org/reporting.html.
As ICU does not load
On Wed, 23 May 2012 17:47:09 -0700
Markus Scherer markus@gmail.com wrote:
Also, I just saw that
http://www.unicode.org/Public/UCA/latest/CollationAuxiliary.zip contains
allkeys_CLDR.txt which should correspond 1:1 with the
FractionalUCA*.txt in the same .zip file.
One format difference:
On Tue, May 22, 2012 at 2:22 PM, Richard Wordingham
richard.wording...@ntlworld.com wrote:
I can dig up the ICU code that computes the
collation case bits for a string.
It would be helpful. I can't see well enough how the data gets in.
I found the code that computes the case bits (2
On Wed, 23 May 2012 10:35:46 -0700
Markus Scherer markus@gmail.com wrote:
On Tue, May 22, 2012 at 2:22 PM, Richard Wordingham
richard.wording...@ntlworld.com wrote:
I found the code that computes the case bits (2 bits for
lower/mixed/upper) for building ICU tailorings. Search for
On Wed, May 23, 2012 at 2:01 PM, Richard Wordingham
richard.wording...@ntlworld.com wrote:
While we're picking on that poor routine - it looks as though it could
come unstuck with kana in the supplementary planes - the Kana
Supplement, and possibly also the Enclosed Ideographic Supplement.
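To illustrate the concern above, here is a minimal sketch. It assumes, purely for illustration, a check that tests individual UTF-16 code units against the BMP kana range; the thread does not say that the ICU routine works this way, and the helper names (utf16_units, is_bmp_kana_unit) are hypothetical. It shows only why a per-code-unit test would miss a Kana Supplement character.

```python
def utf16_units(ch):
    """Return the UTF-16 code units of a one-character string."""
    data = ch.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def is_bmp_kana_unit(unit):
    # Hypothetical test: Hiragana and Katakana blocks in the BMP only.
    return 0x3040 <= unit <= 0x30FF


bmp_kana = "\u30AB"       # KATAKANA LETTER KA (BMP)
supp_kana = "\U0001B000"  # KATAKANA LETTER ARCHAIC E (Kana Supplement)

for ch in (bmp_kana, supp_kana):
    units = utf16_units(ch)
    found = any(is_bmp_kana_unit(u) for u in units)
    print("U+%04X" % ord(ch), ["%04X" % u for u in units], found)
```

The supplementary-plane character is stored as a surrogate pair, so neither code unit falls in the BMP kana range; a check of this kind would need to work on whole code points.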
On Wed, 23 May 2012 15:50:24 -0700
Markus Scherer markus@gmail.com wrote:
On Wed, May 23, 2012 at 2:01 PM, Richard Wordingham
richard.wording...@ntlworld.com wrote:
While we're picking on that poor routine - it looks as though it
could come unstuck with kana in the supplementary
On Wed, May 23, 2012 at 5:17 PM, Richard Wordingham
richard.wording...@ntlworld.com wrote:
Is there a definition of the precise
relationship between DUCET and FractionalUCA.txt, or does
FractionalUCA.txt define the relationship?
See
On Wed, 23 May 2012 17:47:09 -0700
Markus Scherer markus@gmail.com wrote:
On Wed, May 23, 2012 at 5:17 PM, Richard Wordingham
richard.wording...@ntlworld.com wrote:
The order of code points and contractions as listed in
FractionalUCA.txt and allkeys.txt should be the same, except for
On Wed, 23 May 2012 15:50:24 -0700
Markus Scherer markus@gmail.com wrote:
On Wed, May 23, 2012 at 2:01 PM, Richard Wordingham
richard.wording...@ntlworld.com wrote:
Is there a definition of the precise
relationship between DUCET and FractionalUCA.txt, or does
FractionalUCA.txt
On Wed, May 23, 2012 at 7:19 PM, Richard Wordingham
richard.wording...@ntlworld.com wrote:
A practical example is the four contractions I've proposed to restore
the collation of Tibetan vowels following a subscript RA. If they're
added to DUCET, will they automatically be included in the
On Mon, 21 May 2012 17:07:33 -0700
Markus Scherer markus@gmail.com wrote:
In principle, it's straightforward: Lowercase and uppercase follow
Unicode (UCD) case properties. We distinguish an intermediate mixed
case for titlecase characters and mixed-case contractions. I believe
we also
On Tue, May 22, 2012 at 1:09 AM, Richard Wordingham
richard.wording...@ntlworld.com wrote:
On Mon, 21 May 2012 17:07:33 -0700
Markus Scherer markus@gmail.com wrote:
In principle, it's straightforward: Lowercase and uppercase follow
Unicode (UCD) case properties. We distinguish an
On Tue, 22 May 2012 08:33:43 -0700
Markus Scherer markus@gmail.com wrote:
On Tue, May 22, 2012 at 1:09 AM, Richard Wordingham
richard.wording...@ntlworld.com wrote:
On Mon, 21 May 2012 17:07:33 -0700
Markus Scherer markus@gmail.com wrote:
I can dig up the ICU code that
What are the definitions of upper and lower case for the caseFirst
tailoring for the UCA and for LDML? I can't find any obvious
definition.
My suspicion is that they are defined by assignment of the DUCET
tertiary weights, UTS#10 Issue 23 (Version 6.1.0) Section 7.2.
Although these largely
On Mon, May 21, 2012 at 4:37 PM, Richard Wordingham
richard.wording...@ntlworld.com wrote:
What are the definitions of upper and lower case for the caseFirst
tailoring for the UCA and for LDML? I can't find any obvious
definition.
I am having trouble finding a published definition too. I
On 5/21/2012 4:37 PM, Richard Wordingham wrote:
Again, even the interpretation of uppercase in terms of weights is not
certain, for the ISO/IEC 14651:2007 example of a tailoring for
uppercase first does not adjust the collation elements with a tertiary
weight of 1C, although they are listed as
On Mon, 21 May 2012 17:43:27 -0700
Ken Whistler k...@sybase.com wrote:
For example, when caseFirst is set to
uppercase, ICU orders U+1D34 MODIFIER LETTER CAPITAL H before
U+0068 LATIN SMALL LETTER H, but anomalously orders U+A7F8 MODIFIER
LETTER CAPITAL H WITH STROKE *after* U+0127 LATIN