On Mon, 21 May 2012 17:07:33 -0700 Markus Scherer <markus....@gmail.com> wrote:
> In principle, it's straightforward: Lowercase and uppercase follow
> Unicode (UCD) case properties. We distinguish an intermediate "mixed
> case" for titlecase characters and mixed-case contractions. I believe
> we also distinguish small/normal Kana as lowercase/uppercase. I can
> dig up the ICU code that computes the collation case bits for a
> string.

Is this code in ICU 4.4.2 (the version for the Linux I run), or should
I be looking at ICU 49?

> I don't know whether CLDR/LDML should require all of the details, but
> there should at least be informative documentation.

If they are to define collation, they have to define how the order
results from the tailoring. Of course, it can be done by reference,
but while saying 'as in UCA' is entirely appropriate where the UCA is
adequately defined (some tailorings clearly are not, and work is under
way to fix some of these shortfalls), I am uneasy at 'as in ICU'.

Richard.