On 23.06.2011 at 11:46, Patrick Gundlach wrote:

> Another question regarding hyphenation (thanks Paul and Taco for the answer 
> to the first one).
> 
> Hyphenation is only done when the lccode of each char is not 0. Now most 
> languages have chars beyond a-z, such as Ä or é or Л. Now how do I set these 
> lccodes?
> 
> Currently I do something like:
> 
> for i in string.utfvalues("äÄöÖüÜß") do
>  tex.lccode[i] = i
> end
> 
> but this has two disadvantages I can see:
> 
> 1) I have to manually pick the foreign characters and set the lccode manually
> 2) What is the lowercase of I (LATIN CAPITAL LETTER I)? Is it i or ı (LATIN 
> SMALL LETTER DOTLESS I)?
> 
> 
> I guess that I should use a unicode data table for the characters. But that 
> is still not 100% correct for languages like Turkish and Azeri, right? Since 
> the lccodes are not language-local, we cannot achieve a 100% correct 
> solution, correct?

No, we cannot. Even without those locale-dependent cases, it would still be 
impossible to build a correct lccode/uccode table, since lowercasing or 
uppercasing a character is context-dependent and can yield more than one 
character: the uppercase of ß is SS. \lccode/\uccode (and by extension 
\lowercase/\uppercase) are just not usable in the Unicode world. LuaTeX might 
implement the casing algorithms (with tailoring) described in section 3.13 of 
the Unicode Standard. This includes
- Locale-dependent mappings
- Context-dependent mappings
- Length-changing mappings
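
To make the first and third points concrete, here is a minimal sketch in plain 
Lua (it assumes Lua 5.3 or later for the utf8 library, and it does not touch 
tex.lccode at all). The tables upper_default and upper_tailored and the 
function to_upper are hypothetical names with a handful of hand-picked 
entries; a real implementation would presumably load the mappings from the 
Unicode data files instead. Context-dependent mappings are not covered here.

```lua
-- Sketch only: a tiny hand-written subset of Unicode uppercase mappings.
-- Values are strings, not code points, because one character may map
-- to several (which a one-code-point-per-slot lccode/uccode table cannot hold).
local upper_default = {
  [0x00DF] = "SS",               -- ß -> SS (length-changing)
  [0x00E4] = utf8.char(0x00C4),  -- ä -> Ä
  [0x00F6] = utf8.char(0x00D6),  -- ö -> Ö
  [0x00FC] = utf8.char(0x00DC),  -- ü -> Ü
  [0x0131] = "I",                -- ı -> I
}

-- Language tailorings take precedence over the defaults (locale-dependent).
local upper_tailored = {
  tr = { [0x0069] = utf8.char(0x0130) },  -- i -> İ in Turkish
  az = { [0x0069] = utf8.char(0x0130) },  -- i -> İ in Azeri
}

-- Uppercase the UTF-8 string s; lang is an optional language tag.
local function to_upper(s, lang)
  local tailoring = upper_tailored[lang] or {}
  local out = {}
  for _, cp in utf8.codes(s) do
    local mapped = tailoring[cp] or upper_default[cp]
    if not mapped then
      if cp >= 0x61 and cp <= 0x7A then
        mapped = utf8.char(cp - 32)  -- plain ASCII a-z
      else
        mapped = utf8.char(cp)       -- no mapping known: keep as is
      end
    end
    out[#out + 1] = mapped
  end
  return table.concat(out)
end

print(to_upper("straße"))          --> STRASSE
print(to_upper("istanbul"))        --> ISTANBUL
print(to_upper("istanbul", "tr"))  --> İSTANBUL
```

The same input gives different results depending on the language, and the 
output can be longer than the input. Neither is expressible with a single 
global table that maps one code point to one code point.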
