2014-06-25 10:52 GMT+02:00 Daniel Bünzli <[email protected]>:
> On Wednesday, 25 June 2014 at 09:10, Richard Wordingham wrote:
> > Yes - with the caveat that the uppercase mapping of U+0345 is too
> > complicated to defined formally.
> >
> > On the other hand, the Lowercase_Mapping property seems to be inadequate
> > for the default lowercase mapping - Greek final sigma is the
> > complication.
>
> So what you seem to imply is that Unicode’s default full casing are
> defined by applying
>
> 1) The unconditional mappings of SpecialCasing.txt
> 2) The conditional mappings of SpecialCasing.txt (there’s only one, the
>    final sigma case).

There's also the Turkic i or j (problems related to letters that are
usually soft-dotted in the Latin script except in Turkic languages, and
whose case mapping is context-dependent: the right-hand side has to be
checked to see whether a combining dot above must be added).

We could insist that Turkish texts use an explicit combining dot above
after the dotless i (or j), but most Turkish texts just use the plain
ASCII letter, reinterpreting its soft dot as a hard dot that needs to be
added when converting to uppercase and removed when converting to
lowercase.

Note also that the dotless i and the dotless j are not part of any
round-trip case pair. For Turkish readers, a dotless i followed by an
explicit combining dot above (hard dot) is not recommended; they use the
standard ASCII letter directly, as if it were a precomposed but
decomposable letter. In Turkish texts, a dotless i without diacritic
pairs directly with the capital ASCII letter I (this mapping to
uppercase is *not* contextual, but the reverse conversion to lowercase
*is* contextual).
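
To make the asymmetry concrete, here is a rough Python sketch of
Turkish-tailored casing layered on top of the default mappings. The
function names turkic_upper and turkic_lower are hypothetical, and the
context test is deliberately simplified: it only looks at the character
immediately after a capital I, whereas the conditions in
SpecialCasing.txt also allow some other combining marks to intervene
before the dot above. Treat it as an illustration under those
assumptions, not as a conformant implementation.

```python
DOT_ABOVE = "\u0307"  # COMBINING DOT ABOVE (the "hard dot")
CAP_I_DOT = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE
DOTLESS_I = "\u0131"  # LATIN SMALL LETTER DOTLESS I


def turkic_upper(text):
    """Uppercase with the Turkish tailoring (hypothetical helper).

    Not contextual: every 'i' becomes U+0130 regardless of its
    neighbours. The dotless i already maps to 'I' in the default
    mappings, so str.upper() handles it.
    """
    return text.replace("i", CAP_I_DOT).upper()


def turkic_lower(text):
    """Lowercase with the Turkish tailoring (hypothetical helper).

    Contextual: a capital 'I' followed by a combining dot above becomes
    a dotted 'i' and the dot is dropped; any other capital 'I' becomes
    the dotless i. Simplification: only the immediately following
    character is inspected.
    """
    text = text.replace("I" + DOT_ABOVE, "i")  # I + hard dot -> i
    text = text.replace(CAP_I_DOT, "i")        # precomposed form -> i
    text = text.replace("I", DOTLESS_I)        # remaining I -> dotless i
    return text.lower()                        # default mappings for the rest


if __name__ == "__main__":
    for word in ("ISI", "\u0130K\u0130", "I" + DOT_ABOVE):
        print(repr(word), "->", repr(turkic_lower(word)))
    for word in ("\u0131s\u0131", "iki"):
        print(repr(word), "->", repr(turkic_upper(word)))
```

Running the replacements before the final call to lower() keeps the
default behaviour (final sigma included) for everything that is not an
i, which is about the smallest tailoring that shows why only the
lowercase direction needs to look to the right.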
_______________________________________________
Unicode mailing list
[email protected]
http://unicode.org/mailman/listinfo/unicode

