So, if I understand the situation correctly, groff gets its hyphenation information from TeX. TeX isn't accommodating any English words with non-ASCII characters because of its hyphenation algorithm's limitations, and Werner is reluctant to have groff accommodate them because of the maintenance complexity of modifying or augmenting the TeX rules. Is this a fair summation?

Can TeX's list of patterns be expanded to include letters with diacritics without breaking TeX's English hyphenation algorithm? That is, if Latin-9 characters are included, will the algorithm simply ignore them, or fail?

On 4/12/17, Werner LEMBERG <[email protected]> wrote:
> The very issue is rather that *users* are not accommodated to select an
> input and/or font encoding while typesetting US English texts.

Probably true in general. However, those English users who write about résumés or Blue Öyster Cult -- and who care enough to get details correct -- will either learn how to produce Latin-1 characters (which groff accepts), or learn the escape sequences in groff (and I presume TeX has an equivalent mechanism) that allow these characters to be represented with ASCII input.

The user can, of course, use .hw to correctly break the occasional such word in predominantly ASCII English text. However, it's far from intuitive that such accommodation is the user's responsibility, when all other hyphenation Just Works without the user having to think about it. It would be nice if these sorts of words worked out of the box.

Side note: groff does, I observe, correctly break "öyster" (which is technically not even a real English word) but not "résumé" (which is not only a real word, but needs the accents to distinguish it from the unrelated word "resume"). I assume this is because no hyphenation point of öyster is adjacent to the non-ASCII letter.
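
To make the "ignore or fail" question and the öyster/résumé guess concrete, here is a minimal sketch of Liang-style pattern matching, the general technique behind TeX's hyphenation patterns. It is a toy model only: the function names and the patterns "s1t", "e1s", and "é1s" are made up for illustration and are not taken from TeX's or groff's pattern files, and the sketch ignores the per-character letter/hyphenation codes that the real programs also consult, so it cannot settle how either program actually behaves.

```python
import re
import unicodedata


def parse_pattern(pattern):
    """Split a pattern such as 'e1s' into its letters ('es') and the
    scores for the gaps before, between, and after them ([0, 1, 0])."""
    pattern = unicodedata.normalize("NFC", pattern)
    letters = re.sub(r"\d", "", pattern)
    scores = [0] * (len(letters) + 1)
    pos = 0
    for ch in pattern:
        if ch.isdigit():
            scores[pos] = int(ch)
        else:
            pos += 1
    return letters, scores


def hyphenate(word, patterns, left_min=2, right_min=3):
    """Insert '-' at every gap whose highest matching score is odd."""
    word = unicodedata.normalize("NFC", word)
    wrapped = "." + word.lower() + "."   # dots mark the word boundaries
    gaps = [0] * (len(wrapped) + 1)      # gaps[k] is the gap before wrapped[k]
    for pattern in patterns:
        letters, scores = parse_pattern(pattern)
        start = wrapped.find(letters)
        while start != -1:
            for offset, value in enumerate(scores):
                gaps[start + offset] = max(gaps[start + offset], value)
            start = wrapped.find(letters, start + 1)
    pieces, last = [], 0
    # A break before word[i] leaves i letters on the left, len(word) - i on the right.
    for i in range(left_min, len(word) - right_min + 1):
        if gaps[i + 1] % 2 == 1:
            pieces.append(word[last:i])
            last = i
    pieces.append(word[last:])
    return "-".join(pieces)


# Hypothetical ASCII-only patterns, invented for this example.
TOY_PATTERNS = ["s1t", "e1s"]

print(hyphenate("öyster", TOY_PATTERNS))
print(hyphenate("resume", TOY_PATTERNS))
print(hyphenate("résumé", TOY_PATTERNS))
print(hyphenate("résumé", TOY_PATTERNS + ["é1s"]))
```

In this toy model a letter that appears in no pattern does not make anything fail; it just never matches. So "öyster" still breaks at a point governed by ASCII-only neighbours, while "résumé" gets no break until a pattern mentioning "é" is added. Whether TeX's real pattern set and groff's reader behave the same way is exactly the open question.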
