On 23.06.2011 at 13:06, Taco Hoekwater wrote:

> On 06/23/11 13:02, Philipp Stephani wrote:
>>
>> On 23.06.2011 at 12:35, Taco Hoekwater wrote:
>>
>>>> No, and even without those locale-dependent cases, it would still be
>>>> impossible to build a correct lccode/uccode table since
>>>> lowercasing/uppercasing one character is context-dependent and can
>>>> result in more than one character: the uppercase of ß is SS.
>>>
>>> Well, in this particular example, there is ... U+1E9E ! :) :)
>>
>> No, the capital ß is not the uppercase version of ß. SpecialCasing.txt has
>
> I think you missed the smileys.
Those weren't ;-) smileys, right? ;-)

>>> You do not happen to be bored and looking for something interesting
>>> to do in the coming few weeks, by any chance?
>>
>> The casing algorithms are already implemented in the ICU library (and
>> probably in other libraries as well).
>
> Sure, but that does not help luatex, does it? I, for one, am not going
> to wade through the ICU sources trying to extract an algorithm that
> is fairly easy to implement based on the actual specification.

Where do you actually want these algorithms? I think that for case
conversion, users should just ignore \lowercase etc. and use a
(to-be-written) Lua wrapper around ICU (icu4lua seems to be dormant and
covers only a tiny fraction of ICU). Hyphenation is another story, but I
think it has to be rethought anyway (multistage hyphenation, compound
words, dictionary-based algorithms, locale-tailored algorithms, etc.).
Unfortunately, I'm not aware of a good free hyphenation library.
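
To make the point about lccode/uccode tables concrete, here is a minimal sketch in Python rather than Lua. It assumes only Python's built-in string methods, which apply the full Unicode case mappings; it is not LuaTeX or ICU code. The `ONE_TO_ONE` table and `simple_table_upper` are hypothetical names standing in for a per-character table of the uccode kind.

```python
# Sketch only: contrasts a one-to-one per-character table with the full
# Unicode case mappings built into Python's str methods.

def simple_table_upper(text, table):
    """Uppercase with a one-to-one table (hypothetical stand-in for uccode)."""
    return "".join(table.get(ch, ch) for ch in text)

# Hypothetical table: every character maps to exactly one character,
# so the sharp s can only map to itself (or to some other single character).
ONE_TO_ONE = {"s": "S", "t": "T", "r": "R", "a": "A", "e": "E", "\u00df": "\u00df"}

word = "stra\u00dfe"                              # "straße"
full = word.upper()                               # full mapping expands ß to SS
simple = simple_table_upper(word, ONE_TO_ONE)     # the table cannot expand

# U+1E9E (capital sharp s) lowercases to ß, but ß does not uppercase to it.
capital_sharp_s_lower = "\u1e9e".lower()
sharp_s_upper = "\u00df".upper()

# Context dependence: capital sigma lowercases differently at the end of a word.
sigma_alone = "\u03a3".lower()                    # U+03C3, non-final form
sigma_final = "\u039f\u0394\u039f\u03a3".lower()  # ends in U+03C2, final form

print(full, simple, len(word), len(full))
print(capital_sharp_s_lower, sharp_s_upper)
print(sigma_alone, sigma_final)
```

The length of the string changes under the full mapping, and the result for sigma depends on its neighbours, which is why no fixed per-character table can reproduce either behaviour.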
