Re: [tex-hyphen] missing hyphen points in Greek

Claudio Beccari Mon, 28 Jul 2014 14:06:07 -0700

Dear hyphenators,

I am addressing this mail to the mailing list, but I highlight the original participants in order to be sure that they receive the continuation of the discussion we started some weeks ago. Maybe Mojca receives double messages; in that case let me know, and I will know that you are receiving the message as a member of the mailing list.

I just wanted to tell you that I manually upgraded the pattern file for the LGR-encoded patterns for polytonic Greek.

When Dimitrios returns home, would you please send me a short but significant text in modern polytonic Greek, written with UTF-8 encoding, because I have no access to such texts? I tested the upgraded polytonic patterns; I actually used a polytonic stretch of ancient Greek text, but of course ancient Greek does not contain any neologisms, modern names, or Greek renderings of foreign words.


In a day or two I will start upgrading the ancient Greek patterns.

Cheers
Claudio



On 27/07/2014 22:51, Guenter Milde wrote:

Dear TeX hyphenators,


On 27.07.14, Mojca Miklavec wrote:

...

Is setting the lccode of a character to itself the "normal" way for small
letters?

Yes. That always needs to be done. Usually you don't need to do it for
Latin scripts since LaTeX (and probably also plain TeX) already does
that for you, at least for the ASCII range. XeTeX also sets the codes
for more or less the whole of Unicode, I think.

BTW: This is done by polyglossia (and since version 1.5 also by babel)
via an excerpt from Apostolos' "xgreek" package, the file
xgreek-fixes.def.
However, this file is derived from an older version of xgreek.sty and
misses some fixes done in

   Version 2.1 of package xgreek

   I have introduced some new \lccode-\uccode pairs that
   reflect current changes in Unicode 5.2 while I have corrected the
   values for an existing pair.

It would be good if "polyglossia" could ship an updated xgreek-fixes.def.

...

* The hyph-utf8 package has conversion rules for several 8-bit TeX font
  encodings. Currently not for LGR but this could/should be changed.

I would be happy to accept patches. I'm not competent enough in TeX
(as the Turing complete programming language) to write the conversion
myself.

Would it be sufficient to provide a data file similar to
the ones in  hyph-utf8/source/generic/hyph-utf8/data/encodings ?

It should be relatively easy to produce a file data/encodings/lgr.dat
from the CB-Fonts' CB.enc.

Is the format of the *.dat files documented?

A problem might be that some pre-composed Unicode characters (accented
capital Greek letters) are represented by two characters in LGR.
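
To illustrate the point about pre-composed characters, here is a minimal
Python sketch. It is not part of hyph-utf8, and the helper name
`decompose` is made up for this example. It only uses the standard
`unicodedata` module to show that an accented capital Greek letter splits
into a base letter plus a combining accent under canonical decomposition.
How those two parts would be written in LGR is not covered here.

```python
import unicodedata


def decompose(ch):
    """Return the canonical (NFD) decomposition of one character
    as a list of 'U+XXXX NAME' strings. Hypothetical helper."""
    parts = unicodedata.normalize("NFD", ch)
    return [f"U+{ord(c):04X} {unicodedata.name(c)}" for c in parts]


# U+0386 GREEK CAPITAL LETTER ALPHA WITH TONOS is a single code point,
# but it decomposes into two: the capital alpha and a combining acute.
for line in decompose("\u0386"):
    print(line)
```

A one-to-one table in the style of the existing encoding data files would
presumably not be enough for such characters, since one Unicode code point
has to become a sequence.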

The hyph-utf8 package shows that an automatic transcoding of the
hyphenation pattern files is possible. I hope a cooperation between
Dimitrios and Mojca will be able to overcome obstacles.

See above. I'm not saying it isn't possible, but I don't think it's
worth the effort (and it's awfully ugly code, for whoever is willing
to come up with it). In particular there's not much point in doing
on-the-fly conversion because we ended up doing external conversion
for the sake of pTeX anyway.

So, it may be better to provide a conversion script from Greek Unicode
hyphenation patterns to LGR-encoded ones in a modern script language.
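
A rough sketch of what such a script might look like in Python follows.
Several things here are assumptions and not taken from hyph-utf8 or from
Claudio's files: the function name `to_lgr_translit` is made up; the output
is the ASCII transliteration commonly typed with LGR fonts (Latin letters,
with accents and breathings written before the letter and the iota
subscript after it), not the 8-bit slot numbers; final sigma is mapped to
`c`; and only lower-case letters are handled. Anything that is not a Greek
letter (pattern digits, dots) is passed through unchanged.

```python
import unicodedata

# Assumed ASCII transliteration for lower-case Greek letters.
BASE = dict(zip("αβγδεζηθικλμνξοπρσςτυφχψω",
                "abgdezhjiklmnxoprsctufqyw"))

# Combining marks that are assumed to be written BEFORE the letter.
PREFIX = {
    "\u0313": ">",   # smooth breathing (psili)
    "\u0314": "<",   # rough breathing (dasia)
    "\u0308": '"',   # dialytika
    "\u0301": "'",   # acute (tonos / oxia)
    "\u0300": "`",   # grave (varia)
    "\u0342": "~",   # circumflex (perispomeni)
}
# Combining marks that are assumed to be written AFTER the letter.
SUFFIX = {"\u0345": "|"}  # iota subscript (ypogegrammeni)


def to_lgr_translit(text):
    """Convert a UTF-8 Greek pattern string to the assumed ASCII form."""
    decomposed = unicodedata.normalize("NFD", text)
    out = []
    i = 0
    while i < len(decomposed):
        base = decomposed[i]
        i += 1
        pre, post = "", ""
        # Collect every combining mark attached to this base character.
        while i < len(decomposed) and unicodedata.combining(decomposed[i]):
            mark = decomposed[i]
            if mark in PREFIX:
                pre += PREFIX[mark]
            elif mark in SUFFIX:
                post += SUFFIX[mark]
            else:
                raise ValueError(f"unsupported mark U+{ord(mark):04X}")
            i += 1
        # Non-Greek characters (digits, dots) are copied as they are.
        out.append(pre + BASE.get(base, base) + post)
    return "".join(out)


print(to_lgr_translit("λόγος"))
```

Working on the decomposed form keeps the table small, since every
pre-composed polytonic letter reduces to a base letter plus marks. A real
converter would still have to be checked against the conventions actually
used in the LGR pattern files, in particular for capitals and for
combinations of dialytika with an accent.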

Alternatively, Claudio is working on a "hand conversion" of the patterns.
Any thoughts?

Günter
