URL:
  <https://savannah.gnu.org/bugs/?67310>

                 Summary: [troff] want a "hyphenation barrier" node type
                   Group: GNU roff
               Submitter: gbranden
               Submitted: Sat 12 Jul 2025 06:03:41 PM GMT
                Category: Core
                Severity: 1 - Wish
              Item Group: Feature change
                  Status: Postponed
                 Privacy: Public
             Assigned to: None
             Open/Closed: Open
         Discussion Lock: Any
         Planned Release: None


    _______________________________________________________

Follow-up Comments:


-------------------------------------------------------
Date: Sat 12 Jul 2025 06:03:41 PM GMT By: G. Branden Robinson <gbranden>
Our "sboxes.pdf" document exhibits a general problem with hyphenation control
in GNU _troff_.

Its presentation of the "BOXSTART" macro reads in part:


latter for a border with no fill.  The specified WEIGHT is used if the box is
OUT-
LINED.


That word "OUTLINED" is in bold, and because it is literal, it would be nice
if it weren't hyphenated.

The document quotes its own source code, so we can gain some insight into the
intended formatting, and diagnose carelessness if present.


.\" Define a macro for code literals; use bold and disable hyphenation. 
.de Lt
.  ft B
.  nh
.  nop \&\\$1\c
.  hy \\n[HY]
.  ft
.  nop \&\\$2
..


Later, we have...


The specified
.Lt WEIGHT
is used if the box is
.Lt OUTLINED .


We can see that hyphenation was disabled with `nh` prior to setting the term,
and then reënabled with `hy` for the "trailer" text set abutting the term.  In
this case, that's punctuation.

"OUTLINED" was clearly formatted with `nh` in effect.  The period was exposed
to hyphenation, but who cares?  The formatter won't hyphenate around a period
by default anyway; both its "cflags" and its hyphenation code of zero prevent
that.

So what went wrong?

The problem appears to be deeply internal: the only hyphenation mode that
applies to breaking decisions is the one in effect when the formatter decides
to break the output line, not the one that was in effect when each word was
set.  No node is emitted into the output line to
indicate a change of hyphenation mode.  That was thought unnecessary, I
surmise, because the hyphenation mode is a property of the environment.

Here's what I think happened.

* The formatter added "OUTLINED." to the pending output line.
* It hit a (breakable) space character (a newline--same thing when filling is
enabled), so fired up the line-breaking machine.
* The line-breaking machine noticed with alarm that the pending output line
length exceeded the environment's configured line length.
* That machine furthermore noticed that hyphenation was enabled.  (All that
matters for our purposes is that the mode was nonzero.)
* It therefore started marching backwards along the output line, looking for
characters with hyphenation codes permitting hyphenation breaks to be
imposed.
* It found one.
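
The steps above can be sketched as a toy model.  This is not GNU _troff_'s
implementation; every name here (`Glyph`, `Barrier`, `make_line`,
`find_hyphen_break`, `render`) is hypothetical, the hyphenation point in
"OUTLINED" is supplied by hand rather than computed from patterns, and the
semantics given to the "barrier" node are only one guess at what such a node
type might do.  The point it illustrates is that the breaker consults a single
hyphenation mode, the one current at break time, no matter how the word was
set.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    ch: str
    hyphen_ok_after: bool = False  # assumed: a hyphenation break may follow


class Barrier:
    """Hypothetical node: glyphs to its left were set with hyphenation off."""


def make_line(text):
    """Build a pending output line. '-' marks a permitted hyphenation point
    after the previous glyph; '|' inserts a Barrier node."""
    nodes = []
    for ch in text:
        if ch == "-":
            nodes[-1].hyphen_ok_after = True
        elif ch == "|":
            nodes.append(Barrier())
        else:
            nodes.append(Glyph(ch))
    return nodes


def find_hyphen_break(nodes, width, hyphenation_mode, honor_barriers=False):
    """March backward through the last word looking for a hyphenation point
    that fits in `width`.  Only the mode current at break time is consulted."""
    if hyphenation_mode == 0:
        return None
    for i in range(len(nodes) - 1, -1, -1):
        node = nodes[i]
        if isinstance(node, Barrier):
            if honor_barriers:
                return None  # the proposed behavior: stop searching here
            continue  # today's behavior: nothing on the line records the mode
        if node.ch == " ":
            return None  # reached the previous word
        if node.hyphen_ok_after:
            used = sum(isinstance(n, Glyph) for n in nodes[: i + 1]) + 1
            if used <= width:  # the +1 is the hyphen itself
                return i
    return None


def render(nodes, brk):
    """Return the output lines produced by breaking after nodes[brk]."""
    def text(ns):
        return "".join(n.ch for n in ns if isinstance(n, Glyph))
    if brk is None:
        return [text(nodes)]
    return [text(nodes[: brk + 1]) + "-", text(nodes[brk + 1:])]


# "OUTLINED" was set under `nh`; the mode was restored before the period.
line = make_line("box is OUT-LINED|.")
print(render(line, find_hyphen_break(line, 12, hyphenation_mode=1)))
print(render(line, find_hyphen_break(line, 12, hyphenation_mode=1,
                                     honor_barriers=True)))
```

In this model the first call yields "box is OUT-" followed by "LINED.", which
mirrors the misbehavior reported above, while the second leaves the word
intact because the backward march stops at the barrier.  A real formatter
would then have to break at the preceding space instead; the sketch does not
model that step.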

Here's another demonstration.


$ printf '.ll 40n\n.nh\ndonthyphenatemebro donthyphenatemebro
donthyphenatemebro\\c\n.hy\n\\&.\\|.\\|.\n' | groff -a
<beginning of page>
donthyphenatemebro donthyphenatemebro don<hy>
thyphenatemebro...


I don't aim to tackle this before the _groff_ 1.25 development cycle opens.







