Follow-up Comment #16, bug #67363 (group groff):

[comment #14 comment #14:]
>> I think Russ went with option (2), which is good because sure enough
>> _groff_ 1.24.0 is not available for inclusion in Debian trixie
>> (scheduled for 9 August 2025).
> 
> Correct. podlators v6.0.0 and later now add ".if n .ds AD l" to every
> generated page.

Cool.

You might want to add


.if n .nr HY 0


to podlators-generated man pages.

>> My plan is for (1) _groff_'s _man_ (and _mdoc_) packages to check at
>> package initialization time for defined `AD` string (and defined
>> `HY` register). **If these exist**, they necessarily represent the
>> user's preferences (since they weren't read from a document) and are
>> each copied to a shared private name (prefixed with "andoc*" to
>> communicate this shared status; (2) upon encountering any new _man_
>> document at a `TH` macro call, or new _mdoc_ document at a `Dd`
>> macro call, these saved user preferences are reasserted to configure
>> the document's rendering, and existing `AD` and `HY` objects
>> removed; (3) page-local assignments of `AD` and `HY` continue to be
>> honored as before, and (4) at every new section, subsection, and
>> paragraph, adjustment and hyphenation modes are reset to the page's
>> preference if configured and the user's otherwise.
> 
> I like the combination of (1) and (2) as a solution for this problem
> and agree that this should work with podlators.

Good to hear.
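
As a rough sketch of what "user's preferences" means in (1): the `AD` string and `HY` register can be defined on the command line rather than in a document, for instance with _groff_'s `-d` and `-r` options.  The file name `page.1` and its contents are placeholders invented for illustration.

```shell
# Placeholder page; its name and contents are illustrative only.
cat > page.1 <<'EOF'
.TH page 1 2025-08-09 "page 0.1"
.SH Name
page \- placeholder document for a command-line demonstration
.SH Description
An uncommonly long paragraph containing polysyllabic terminology
gives the formatter opportunities for hyphenation and adjustment.
EOF

# -dAD=l defines the AD string and -rHY=0 the HY register on the
# command line, so neither value is read from the document itself.
# Render only if groff happens to be installed.
if command -v groff >/dev/null 2>&1; then
    groff -man -Tascii -dAD=l -rHY=0 page.1
fi
```

Note that the page itself says nothing about adjustment or hyphenation; the preference comes entirely from the person running the formatter.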

> I think the important part for podlators going forward, given this
> behavior, is to make sure that the ".if n .ds AD l" line stays in the
> preamble (in other words, before .TH) so that groff can reassert user
> preferences at .TH and not be confused with (3). That's easy enough
> (and natural) to do.

That should not be necessary, and in fact I expect such an early
operation to be ignored.

The `TH` macro now (in recent _groff_ Git) does this:


.  \" When rendering multiple documents, we want to clear any page-local
.  \" manipulation of hyphenation and adjustment modes from the previous
.  \" document.
.  rr HY
.  rm AD
.
.  an*reset-hyphenation-mode \\n[andoc*HY]
.  an*reset-adjustment-mode \\*[andoc*AD]
.  an*reset-section-parameters
.  an*reset-paragraph-parameters
.  ll \\n[LL]u
.  in 0 \" Well-formed documents call `SH` after `TH`.
.  an*reset-tab-stops
.  an*reset-paragraph-spacing


So, what you want to do is keep that


.if n .ds AD l


right where it is.  I have automated tests that rely on this.


.ds AD l


is how, under the design proposed in this ticket and now implemented in
the Git trunk, a _man_(7) document tells the _groff_ _man_
implementation that it wants adjustment turned off ("aligned left").
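
For concreteness, here is a minimal sketch of a page that states both preferences for nroff mode only.  The page name `demo`, its text, and the file name `demo.1` are made up.  Placing the two lines after the `TH` call is my reading of the macro excerpt above, which removes `AD` and `HY` at `TH`; treat that placement as an assumption rather than a description of what any particular generator emits.

```shell
# Write a hypothetical page; every name in it is illustrative only.
cat > demo.1 <<'EOF'
.TH demo 1 2025-08-09 "demo 0.1"
.\" Page-local preferences, applied only in nroff mode (terminals).
.if n .ds AD l
.if n .nr HY 0
.SH Name
demo \- illustrate page-local adjustment and hyphenation preferences
.SH Description
This paragraph is long enough to span more than one output line,
so the effect of the adjustment mode on its right margin is visible.
EOF

# Render only if groff happens to be installed.
if command -v groff >/dev/null 2>&1; then
    groff -man -Tascii demo.1
fi
```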

[comment #15 comment #15:]
> My main remaining concern about (4) is that it sounds like you're
> planning on re-enabling hyphenation in the middle of the page. I am
> trying fairly hard to turn off hyphenation since in my experience the
> results of hyphenation are confusing and bad for technical documents.
> (Among other things, the hyphens are easy to confuse with ASCII
> hyphen-minus and interfere with cut and paste in some situations, and
> I also have no way of knowing the language the page is written in and
> English hyphenation rules may be completely wrong.) If the user really
> wants it and wants to override the page, that is, of course, fine.
> 
> Do I also need to start doing something with the HY register in order
> to set a preference for no hyphenation going forward?

Yes, as noted above,


.if n .nr HY 0


is probably what you want, and moreover should work with every _groff_
in the world even in old deployments.  It's an old feature.[2]

_groff_man_(7):

Authors
     James Clark wrote the initial GNU implementation of the man macro
     package.  Later, Werner Lemberg ⟨w...@gnu.org⟩ supplied the S, LT,
     and cR registers, the last a 4.3BSD‐Reno mdoc(7) feature.  Larry
     Kollar ⟨kol...@alltel.net⟩ added the FT, HY, and SN registers; the
     HF string; and the PT and BT macros in groff 1.19 (2003). ...


(You don't have to worry about _mandoc_(1) because it never adjusts
output lines and never automatically hyphenates.  Like _groff_, though,
it does break at explicit hyphens.)


$ { echo '.Pp'; for n in $(seq 10); do echo foo-bar; done; } | mandoc -man
()
()

foo-bar foo-bar foo-bar foo-bar foo-bar foo-bar foo-bar foo-bar foo-bar foo-
bar


()


>> I'm uncertain how prominently I want to document the last fact.  My
>> own preference regarding man page authorship practices is that
>> documents refrain from trying to manipulate these formatting
>> parameters.
> 
> And of course my preference is that software intended for output in
> terminals not use typesetting techniques like justification and
> hyphenation that were designed for typeset pages with fine-grained
> control over interword spacing and which, in the case of hyphenation,
> were intended as input to a human process that adjusted the
> hyphenation rules for the specific needs of that static layout of the
> document, something that will never apply to man pages formatted on
> wildly varying devices with wildly varying widths.

That is why modern typesetting systems like _TeX_ and _groff_ offer a
variety of configuration knobs for automatic hyphenation.

_groff_(7):

     The places within a word that are eligible for hyphenation are
     determined by language‐specific data (.hla, .hpf, and .hpfa) and
     lettercase relationships (.hcode and .hpfcode).  Furthermore,
     hyphenation of a word might be suppressed due to a limit on
     consecutive hyphenated lines (.hlm), a minimum line length
     threshold (.hym), or because the line can instead be adjusted with
     additional inter‐word space (.hys).


However, such fine-tuning is not really the point of this extension/
reform to the _man_ and _mdoc_ macro languages.  The point is expressed
in
[https://cgit.git.savannah.gnu.org/cgit/groff.git/tree/tmac/an.tmac?id=ffcb765b4526b16c4ddc940ddd24d9c924af4c9e#n155
the code comments].


.\" Resetting the adjustment mode is a complicated dance.
.\"   1.  Man pages sometimes disable adjustment--when they do, they
.\"       often forget to put it back the way it was.
.\"   2.  When they do remember to put it back, they often fail to do
.\"       so correctly because of the `ad` request's quirky semantics
.\"       starting from Seventh Edition Unix troff/nroff.  Briefly, the
.\"       `ad` request without arguments turns adjustment back on after
.\"       an `na` even if the previous adjustment mode was `l` (align to
.\"       the left with NO adjustment).
.\"   3.  The default adjustment mode historically has not been
.\"       predictable; it can depend on nroff vs. troff mode and on the
.\"       vendor of the *roff system in use.
.\"   4.  It's possible (and portable) to obtain the previous adjustment
.\"       mode via the `.j` register so that it can be saved prior to
.\"       meddling and restored later, but in practice man page authors
.\"       neglect to do so.
.\"   5.  groff man(7)'s `AD` string isn't supported everywhere.
.\"   6.  We want user preferences, if expressed, to override the page
.\"       author's.
.\"   7.  Even if we didn't want (6), one page author's can override
.\"       another's when formatting multiple man(7) documents in
.\"       sequence--we thus keep track of the initial adjustment mode.

.\" Resetting the hyphenation mode is a complicated dance.
.\"   1.  Man pages sometimes disable automatic hyphenation--when they
.\"       do, they nearly always forget to put it back the way it was.
.\"   2.  In AT&T troff there was no register exposing the hyphenation
.\"       mode (nor the enablement status of automatic hyphenation), so
.\"       no idioms for performing such restoration have arisen.
.\"   3.  groff man(7)'s `HY` register isn't supported everywhere.
.\"   4.  We want user preferences, if expressed, to override the page
.\"       author's.
.\"   5.  Even if we didn't want (4), one page author's can override
.\"       another's when formatting multiple man(7) documents in
.\"       sequence--we thus keep track of the initial hyphenation mode.


You may perceive that the problem here is that the _man_ macro language
had inadequate mechanisms for manipulating adjustment and hyphenation,
even on a temporary basis, and **even for typeset documents**.
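
The save-and-restore idiom mentioned in item 4 of the adjustment comments can be sketched as follows.  It uses plain formatter requests and no macro package.  The register name `aJ` and the file name `restore.roff` are arbitrary choices made for this illustration.

```shell
# Hypothetical input file; names are illustrative only.
cat > restore.roff <<'EOF'
.\" Save the current adjustment mode in a register.
.nr aJ \n(.j
.na
This text is set without adjustment.
.\" Flush the pending line before changing the mode again.
.br
.\" Restore the saved mode instead of guessing with a bare ".ad".
.ad \n(aJ
This text is set in the mode that was in effect before.
EOF

# Format only if groff happens to be installed.
if command -v groff >/dev/null 2>&1; then
    groff -Tascii restore.roff
fi
```

The point of passing the saved value back to `ad` is to avoid the quirk in item 2, where `ad` without arguments turns adjustment on regardless of what the earlier mode was.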

With this extension, I serve two sets of users:

* Those who, like _podlators_, want to just shut these things off and
  keep them off until the page is done rendering.  (They have no right
  to dictate how subsequent pages get formatted.)
* Those who favor or don't mind adjustment and hyphenation, but are
  fine-tuning their typeset _man_ document, and wish to disable these
  features at the granularity of a paragraph.

I concede that my novelty here is inadequate to cope with situations
where adjustment and automatic hyphenation need to come and go _within a
paragraph_, but those are just about taken care of by other _man_ and
*roff features.

1.  The automatic hyphenation of any particular word can be disabled or
    overridden with the `\%` escape sequence, which is universally
    portable.

2.  Temporary alteration of adjustment _within a paragraph_ is almost
    never done.

    2a.  One might want a semi-displayed (that is, with line breaks
    before and after, but not vertically spaced) code example embedded
    within a paragraph.  For that, one can use `EX`/`EE`, optionally
    within `RS`/`RE`.

    2b.  Theoretically, one might want to toggle adjustment within a
    paragraph with no other change to rendering.  I have never seen an
    example of this, cannot imagine a use case, and see no point adding
    features to the _man_ or _mdoc_ packages to support it.  In such
    bizarre circumstances, I'd recommend punching through the floor to
    formatter requests.  Since the semantic content of the page should
    not change consequent to this sort of formatting alteration, I think
    we can't expect non-_roff_ formatters to support such contrivances
    and I would advise man page authors of that fact.
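
Points 1 and 2a can be sketched in a small _man_ fragment.  The page name `tune`, the words chosen, and the file name `tune.1` are invented for illustration; this is a sketch of the techniques, not an excerpt from any real page.

```shell
# Hypothetical page; every name in it is illustrative only.
cat > tune.1 <<'EOF'
.TH tune 1 2025-08-09 "tune 0.1"
.SH Name
tune \- illustrate per-word and per-example formatting controls
.SH Description
.\" Point 1: a leading \% suppresses automatic hyphenation of a word.
The next word is never hyphenated automatically:
\%configuration.
.\" Point 1: \% inside a word marks explicit hyphenation points.
This one carries explicit hyphenation points:
con\%fig\%u\%ra\%tion.
.\" Point 2a: a semi-displayed example using EX/EE within RS/RE.
An example can interrupt the running text:
.RS
.EX
tune input.txt
.EE
.RE
The text then resumes.
EOF

# Render only if groff happens to be installed.
if command -v groff >/dev/null 2>&1; then
    groff -man -Tascii tune.1
fi
```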

To elaborate slightly on the last point, when fine-tuning the
typesetting of _groff_'s own man pages for our PDF compilation thereof,
I make occasional recourse to formatter features, mainly to achieve more
pleasant page breaks.  (In commit messages and *roff comments, I term
these "cheats", but I understand them to be standard practices employed
by human typographers, with centuries of precedent.[1])  I feel no guilt
about this because if a non-_roff_ formatter discards these, no
information is lost.  (The human reader is probably employing a
character-cell terminal with an "infinite" page length anyway.)

Here's a recent example.


commit fa8b882f032d385efe295d89477b654234c58f8a
Author: G. Branden Robinson <g.branden.robin...@gmail.com>
Date:   Sat Aug 2 19:36:22 2025 -0500

    groff_man*(7): Improve page breaks.

    ...using techniques upright (economizing prose), questionable (poor
    man's keeps [`ne` requests]), and shady (manipulations of vertical
    spacing and inter-paragraph space).


> But the art of living in a civilization is finding useful compromises
> with people who don't realize that they're wrong about everything they
> disagree with us on. :)

Another aspect is not surrendering to despair and futility when
presented with technical challenges.  ;-)

Regards,
Branden

[1] W. Richard Stevens wrote in depth about applying these storied
    practices to book composition with *roff.

    http://kohala.com/start/pagelayout.html

[2] The code comments concede that the `HY` register isn't supported
    everywhere.  That's true, but Solaris 10, DWB 3.3, and Seventh
    Edition Unix _troff_s are all historical implementations, and
    Plan 9 from User Space is something of a niche exhibit--which
    might nevertheless accept a patch from me if I submitted one.[3]
    _mandoc_ doesn't support it, but doesn't need to.

[3]



commit 10564b11755ff2d48d0f5073c46571e806fa6fb4
Author: Dmitri Vereshchagin <dmitri.vereshcha...@gmail.com>
Date:   Wed Jan 31 20:47:13 2024 +0300

    tmac/tmac.an: define .MR in a groff compatible way
    
    groff 1.23.0 added .MR to its -man macro package.  The NEWS file states
    that the inclusion of the macro "was prompted by its introduction to
    Plan 9 from User Space's troff in August 2020."  From d32deab it seems
    that the name for Plan 9 from User Space's implementation was suggested
    by groff maintainer G. Brandon Robinson.
    
    Not sure if the intention was to make these definitions compatible, but
    it would be nice if they were.
    
    Currently, Plan 9 from User Space's .MR expects its second argument to
    be parenthesized.  groff's .MR does not.  This results in extra
    parentheses appearing in manual references when viewing Plan 9 from User
    Space's manual pages on a system using groff.

commit a5d6857a3b912b43c88ef298c28d13d4623f9ef0
Author: Anthony Sorace <a...@9srv.net>
Date:   Wed Jan 29 10:27:03 2025 -0800

    man: don't paginate when using nroff
    
    This tells bin/man to set the register L to very high to avoid pagination
    and updates tmac/tmac.an to use that value, if it's set, to set the page
    length. This is per Plan 9's rc/bin/man and sys/lib/tmac/tmac.an.

commit d32deab17bfffa5bffc5fab3e6577558e40888c5
Author: Russ Cox <r...@swtch.com>
Date:   Sat Aug 15 20:07:38 2020 -0400

    tmac: rename IM (italic manual) to MR (manual reference)
    
    Suggested by G. Brandon Robinson.




    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?67363>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/
