Dear TeX hyphenators,
On 27.07.14, Mojca Miklavec wrote:

...

> > Is setting the lccode of a character to itself the "normal" way for small
> > letters?

> Yes. That always needs to be done. Usually you don't need to do it for
> latin scripts since LaTeX (and probably also plain TeX) already does
> that for you, at least for the ascii range. XeTeX also sets the codes
> for more or less the whole Unicode, I think.

BTW: This is done by polyglossia (and since version 1.5 also by babel)
via an excerpt from Apostolos' "xgreek" package, the file
xgreek-fixes.def. However, this file is derived from an older version of
xgreek.sty and misses some fixes done in Version 2.1 of package xgreek:

   I have introduced some new \lccode-\uccode pairs that reflect
   current changes in Unicode 5.2 while I have corrected the values
   for an existing pair.

It would be good if "polyglossia" could ship an updated
xgreek-fixes.def.

...

> > * The hyph-utf8 package has conversion rules for several 8-bit TeX font
> > encodings. Currently not for LGR but this could/should be changed.

> I would be happy to accept patches. I'm not competent enough in TeX
> (as the Turing complete programming language) to write the conversion
> myself.

Would it be sufficient to provide a data file similar to the ones in
hyph-utf8/source/generic/hyph-utf8/data/encodings ?

It should be relatively easy to produce a file data/encodings/lgr.dat
from the CB-Fonts' CB.enc. Is the format of the *.dat files documented?

A problem might be that some pre-composed Unicode characters (accented
capital Greek letters) are represented by two characters in LGR.

> > The hyph-utf8 package shows that an automatic transcoding of the
> > hyphenation pattern files is possible. I hope a cooperation between
> > Dimitrios and Mojca will be able to overcome obstacles.

> See above. I'm not saying it isn't possible, but I don't think it's
> worth the effort (and it's awfully ugly code, for whoever is willing
> to come up with it). In particular there's not much point in doing
> on-the-fly conversion because we ended up doing external conversion
> for the sake of pTeX anyway.

So, it may be better to provide a conversion script from Greek Unicode
hyphenation patterns to LGR-encoded ones in a modern scripting language.

Alternatively, Claudio is working on a "hand conversion" of the
patterns.

Any thoughts?

Günter
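
P.S. To make the "conversion script" idea a bit more concrete, here is a
minimal sketch in Python. It is only an illustration under several
assumptions: the names GREEK_TO_LGR and convert_pattern are made up, the
letter table is the LGR Latin transliteration as I remember it and should
be checked against CB.enc, and it covers only the unaccented lowercase
letters. Accented and pre-composed characters, which are the hard part,
are deliberately rejected rather than guessed at.

```python
# Hypothetical sketch: map unaccented lowercase Greek letters in a
# hyphenation pattern to the ASCII letters occupying the same slots in LGR.
# The table is an assumption to be verified against the CB fonts' CB.enc.
GREEK_TO_LGR = {
    "α": "a", "β": "b", "γ": "g", "δ": "d", "ε": "e", "ζ": "z",
    "η": "h", "θ": "j", "ι": "i", "κ": "k", "λ": "l", "μ": "m",
    "ν": "n", "ξ": "x", "ο": "o", "π": "p", "ρ": "r", "σ": "s",
    "ς": "c", "τ": "t", "υ": "u", "φ": "f", "χ": "q", "ψ": "y",
    "ω": "w",
}


def convert_pattern(pattern: str) -> str:
    """Convert one Unicode pattern (e.g. '4α1β') to its LGR form."""
    out = []
    for ch in pattern:
        if ch in "0123456789.":
            # Hyphenation weights and word-boundary dots pass through as-is.
            out.append(ch)
        elif ch in GREEK_TO_LGR:
            out.append(GREEK_TO_LGR[ch])
        else:
            # Accented letters etc. need a real slot table; do not guess.
            raise ValueError(f"no LGR mapping for {ch!r} in {pattern!r}")
    return "".join(out)


if __name__ == "__main__":
    for pat in ["4α1β", ".α2"]:
        print(pat, "->", convert_pattern(pat))
```

Raising an error on unmapped characters is intentional: it would list
exactly those patterns that still need a decision on how the two-character
LGR representations should be handled.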
