On 5/21/17, Barbara Beeton <[email protected]> wrote:
> not a legal requirement that would hold up in court,
> but a courtesy to knuth, which I suspect would be
> backed by a large segment of the computer science
> community,
I've no interest in trying to unseat tradition. What I wondered was
whether it's practical to create a superset file that, when processed
to remove non-ASCII lines, generates the historical Knuthian pattern
file. This allows unchanged historical functionality while not
impeding modern relevancy.

But Karl Berry points to perhaps a better way forward for groff:

On 5/22/17, Karl Berry <[email protected]> wrote:
> 1) Gerard Kuikens created, years ago, a huge set of additional patterns
> for US English. As I recall, they covered all known exceptions at the
> time he made them. They have been available in TeX Live as language
> "usenglishmax" (among other names). As far as I know, he is still
> willing to maintain it, if anyone had bugs/requests. The patterns are
> (nowadays) in TL's file
> texmf-dist/tex/generic/hyph-utf8/patterns/txt/hyph-en-us.pat.txt

Does it make sense for groff to use a pattern list that can be updated
as needed, rather than one frozen by tradition? Is the one cited above
a good choice?

On my system,
texmf-dist/tex/generic/hyph-utf8/patterns/txt/hyph-en-us.pat.txt
contains only ASCII, while many other files in this directory have
UTF-8 characters. This implies to me that there's no technical
limitation to adding non-ASCII patterns to hyph-en-us.pat.txt -- is
that accurate?

> 2) Although we certainly aren't going to change the default typesetting
> done by "tex" (or "latex" or "pdflatex", or, I suppose, "groff"), I see
> nothing in principle that stops the addition of UTF-8/Latin-N/whatever
> patterns, to be enabled in a given document. The frozenness of Knuth's
> patterns, while certainly true, is not a block to moving forward.

In groff, I think a better design decision is to break all English
words correctly by default rather than requiring an option or request
to enable such behavior. But it's a hypothetical decision until base
groff knows how to handle words with accented characters at all.
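As a rough illustration of the superset-file idea raised at the top of
this message, here is a minimal Python sketch. It assumes a
hypothetical superset file with one pattern per line; the function
name and the sample patterns are invented for illustration and are not
taken from any real pattern file or from groff itself.

```python
def strip_non_ascii_lines(lines):
    """Return only the lines made up entirely of ASCII characters."""
    # str.isascii() requires Python 3.7 or later.
    return [line for line in lines if line.isascii()]


# Hypothetical superset: the first two entries stand in for historical
# ASCII-only patterns, the last two for newer patterns containing
# accented letters. None of these are real hyphenation patterns.
superset = ["a1b", "1na", "é1c", "ï1v"]

# Dropping the non-ASCII lines recovers the ASCII-only subset unchanged.
historical = strip_non_ascii_lines(superset)
print(historical)
```

Under this assumption the historical file is never edited by hand; it
is always regenerated from the superset, so the two cannot drift apart.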
> 3) Besides Liang's thesis, you may be interested in the
> information/links about the current state of TeX hyphenation at
> http://tug.org/tex-hyphen. Also Mojca and Arthur's paper (they are the
> instigators and principal maintainers of hyph-utf8) last year about it:
> http://tug.org/TUGboat/tb37-2/tb116miklavec.pdf
>
> best,
> karl

_______________________________________________
bug-groff mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/bug-groff
