Re: CaseFirst and CaseLevel Tailorings of UCA and LDML

Richard Wordingham Tue, 22 May 2012 01:20:24 -0700

On Mon, 21 May 2012 17:07:33 -0700
Markus Scherer <markus....@gmail.com> wrote:


> In principle, it's straightforward: Lowercase and uppercase follow
> Unicode (UCD) case properties. We distinguish an intermediate "mixed
> case" for titlecase characters and mixed-case contractions. I believe
> we also distinguish small/normal Kana as lowercase/uppercase. I can
> dig up the ICU code that computes the collation case bits for a
> string.

Is this code in ICU 4.4.2 (the version for the Linux I run), or should
I be looking at ICU 49?

> I don't know whether CLDR/LDML should require all of the details, but
> there should at least be informative documentation.

If they are to define collation, they have to define how the order
results from the tailoring. Of course, it can be done by reference,
but while saying 'as in UCA' is entirely appropriate where the UCA is
adequately defined (some tailorings clearly are not, and work is under
way to fix some of these shortfalls), I am uneasy about 'as in ICU'.

Richard.
