On 4/10/17, Barbara Beeton <[email protected]> wrote:
> the basic ("knuthian") tex hyphenation algorithm
> does not handle any words with diacritics, and
> that is what the us list is based on.

I see. Werner (or anyone else familiar with the groff side of things), is this limitation also present in groff? Or could groff's version of tmac/hyphenex.us be put into Latin-9 encoding to accommodate these words? The Latin-1 and Latin-2 encodings of groff's other two tmac/hyphenex.* files suggest that the groff framework can handle such words.

> i'm surprised that the
> encoding is (still?) listed as latin-* -- there
> has been an effort to support utf8, so i (perhaps
> rashly) assumed that would be the base encoding.

To clarify, I'm only looking at the groff code base, and these groff files haven't been touched in a decade:

http://git.savannah.gnu.org/gitweb/?p=groff.git;a=history;f=tmac/hyphenex.cs;h=dc8a711a872959ab41d30edd5cea3cec82229fdd;hb=HEAD
http://git.savannah.gnu.org/gitweb/?p=groff.git;a=history;f=tmac/hyphenex.det;h=c74eebabff8e35353fdfb176a5c98df56c3e4ea0;hb=HEAD

Their encodings on the TeX side may have been updated, and the changes never pulled into groff.

In contrast (and probably because of this thread), groff's tmac/hyphenex.us was updated from TeX four days ago:

http://git.savannah.gnu.org/gitweb/?p=groff.git;a=history;f=tmac/hyphenex.us;h=3f3075cbe9944d38cc6c1355cfdc1e18dbb07ed5;hb=HEAD

This file does not specify any encoding, but its entire contents fall within 7-bit ASCII.
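
For anyone who wants to repeat the 7-bit check on a local checkout, here is a minimal sketch in Python. The helper names (non_ascii_lines, is_seven_bit_ascii) are made up for illustration and are not part of groff or TeX; the file path in the usage comment assumes you are running from the top of a groff source tree.

```python
def non_ascii_lines(data: bytes):
    """Return (line_number, line_bytes) pairs for lines holding any byte >= 0x80."""
    hits = []
    for lineno, line in enumerate(data.splitlines(), start=1):
        # Iterating over bytes yields integers; anything above 0x7F is not 7-bit ASCII.
        if any(b > 0x7F for b in line):
            hits.append((lineno, line))
    return hits


def is_seven_bit_ascii(data: bytes) -> bool:
    """True when every byte of the input is in the 7-bit ASCII range."""
    return not non_ascii_lines(data)


# Usage (path is an assumption about the local checkout):
#
#     with open("tmac/hyphenex.us", "rb") as f:
#         for lineno, line in non_ascii_lines(f.read()):
#             print(lineno, line)
```

Reading the file in binary mode avoids guessing an encoding up front, which matters here since the file declares none.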
