[XeTeX] \hyphenation{} and combining diacritics

Joshua and Amy Fri, 08 Jul 2011 13:04:08 -0700

I'm creating some hyphenation rules for Jarai texts that I'm
interlinearizing. Here's the problem: In various texts, a complex character
such as LATIN SMALL LETTER A WITH BREVE might be encoded as a single code
point (U+0103) or as a combination of code points (LATIN SMALL LETTER A:
U+0061 plus COMBINING BREVE: U+0306). The \hyphenation{} command does not
treat the two things as the same, meaning that I have to create two versions
of a word if it has one accented character, four versions if it has two
accented characters, eight versions if it has three, etc. For example:


\hyphenation{hơ-nuă hơ-nuă hơ-nuă hơ-nuă}

(because O WITH HORN can be two code points or one)
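
To make the blow-up concrete, here is a minimal Python sketch, separate from my XeLaTeX setup. It only uses the standard unicodedata module; the helper name encoding_variants is made up for illustration. It assumes each character is written either fully precomposed (NFC) or fully decomposed (NFD) and ignores partially composed spellings, so the count is a lower bound for characters that carry more than one mark.

```python
import itertools
import unicodedata


def encoding_variants(word):
    """Return every mix of precomposed/decomposed spellings of `word`.

    Hypothetical helper: each code point of the NFC form is kept as-is or
    replaced by its NFD expansion, whenever the two differ.
    """
    choices = []
    for ch in unicodedata.normalize("NFC", word):
        decomposed = unicodedata.normalize("NFD", ch)
        # Characters with no decomposition contribute only one choice.
        choices.append((ch,) if decomposed == ch else (ch, decomposed))
    return ["".join(parts) for parts in itertools.product(*choices)]


precomposed = "\u0103"   # LATIN SMALL LETTER A WITH BREVE
decomposed = "a\u0306"   # LATIN SMALL LETTER A + COMBINING BREVE

# The two spellings differ as code point sequences ...
print(precomposed == decomposed)
# ... but are canonically equivalent once normalized.
print(unicodedata.normalize("NFC", decomposed) == precomposed)

# One entry per spelling that \hyphenation{} would need for this word.
variants = encoding_variants("h\u01a1-nu\u0103")
print(len(variants))
```

With two accented characters the list has four entries, matching the four spellings in the \hyphenation{} line above; each further accented character doubles it.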

Is there a simple way to tell (Xe)LaTeX to treat precomposed and uncomposed
characters identically without having to put in all the possibilities?


--------------------------------------------------
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex
