On 4/10/17, Barbara Beeton <[email protected]> wrote:
> the basic ("knuthian") tex hyphenation algorithm
> does not handle any words with diacritics, and
> that is what the us list is based on.

I see.  Werner (or anyone else familiar with the groff side of
things), is this limitation also present in groff?  Or could groff's
version of tmac/hyphenex.us be put into Latin-9 encoding to
accommodate these words?  The Latin-1 and -2 encodings of groff's
other two tmac/hyphenex.* files suggest that the groff framework can
handle such words.

> i'm surprised that the
> encoding is (still?) listed as latin-* -- there
> has been an effort to support utf8, so i (perhaps
> rashly) assumed that would be the base encoding.

To clarify, I'm only looking at the groff code base, and these groff
files haven't been touched in a decade:

http://git.savannah.gnu.org/gitweb/?p=groff.git;a=history;f=tmac/hyphenex.cs;h=dc8a711a872959ab41d30edd5cea3cec82229fdd;hb=HEAD

http://git.savannah.gnu.org/gitweb/?p=groff.git;a=history;f=tmac/hyphenex.det;h=c74eebabff8e35353fdfb176a5c98df56c3e4ea0;hb=HEAD

Their encodings on the TeX side may have been updated, and the changes
never pulled to groff.  In contrast (and probably because of this
thread), groff's tmac/hyphenex.us was updated from TeX four days ago:

http://git.savannah.gnu.org/gitweb/?p=groff.git;a=history;f=tmac/hyphenex.us;h=3f3075cbe9944d38cc6c1355cfdc1e18dbb07ed5;hb=HEAD

This file does not specify any encoding, but its entire contents fall
into 7-bit ASCII.
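
For anyone who wants to repeat that check on a local copy, here is a
minimal sketch in Python.  The function names are my own and purely
illustrative; nothing here comes from groff or TeX.  It only tests
whether every byte is below 0x80 and reports where the first
exception occurs.

```python
def is_seven_bit_ascii(data: bytes) -> bool:
    """Return True when every byte in data is below 0x80."""
    return all(byte < 0x80 for byte in data)


def first_non_ascii(data: bytes):
    """Return (line_number, byte_offset) of the first byte >= 0x80.

    Line numbers are 1-based and byte offsets are 0-based.
    Returns None when the data is pure 7-bit ASCII.
    """
    for offset, byte in enumerate(data):
        if byte >= 0x80:
            # Count the newlines before the offending byte to find its line.
            line_number = data.count(b"\n", 0, offset) + 1
            return line_number, offset
    return None
```

Reading the file in binary mode, e.g. open("hyphenex.us", "rb").read()
(assuming a local copy under that name), and passing the bytes to
either function avoids guessing at an encoding first.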

_______________________________________________
bug-groff mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/bug-groff
