Hi Gavin,

At 2026-01-11T16:15:25+0000, Gavin Smith wrote:
> On Sun, Jan 11, 2026 at 08:01:29AM -0600, G. Branden Robinson wrote:
> > If info cannot interpret these escape sequences, it should discard
> > them.
> > 
> > If info cannot parse these sequences well enough to reliably discard
> > them, it should ask man(1) or {g,n}roff(1) not to generate them.
> 
> It is probably easy enough to discard them.  We could discard all OSC
> sequences.

That would be consistent with the expectations of ECMA-48, I think.[0]
I can't find the exact sentence in the 5th edition (1991) of that
standard that mandates the discard of unsupported control sequences, but
I'm willing to research the issue more deeply if you maintain that
info(1) is behaving conformantly with it.

> I'd never seen the problem before, but today I ran "info grotty" on
> my system and saw the misdisplayed sequences.
> 
> Viewing the manpage for "groff" via info gives very deformed output;
> it is practically unusable.  You don't just output these sequences
> for web URLs, but also use "man:*" URLs for any references to other
> manpages.

Yes.  Those are man page hyperlinks, a new feature of groff 1.23.0.[1]
They're supported by many applications, as noted in the "OSC 8 Adoption"
link I shared previously, including the gnome-terminal emulator program,
and by the less(1) pager, which in recent versions binds key sequences
starting with ^O to hyperlink navigation features.[2]  That's useful on
terminal emulators that don't support OSC 8 (but correctly ignore
sequences they don't support), like xterm.

> Fortunately, it seems that not too many manpages are generated with
> these sequences, except groff's own manpages.  I suggest you do not
> start outputting these sequences by default for any manpage
> cross-references, otherwise there are too many.

On the contrary, the plan is for wider adoption.  Alejandro Colomar of
the Linux man-pages project has been waiting on me for a while to finish
submitting a series of patches that would convert the 3,100 or so man
pages that project distributes to use of groff man(7)'s `MR` macro,
introduced in groff 1.23.0, which enables production of the hyperlinks.

> The occasional web URL is probably ok.
> 
> This change to groff output also breaks any other program that would
> use the output from "man".

It breaks programs that don't correctly support ECMA-48.  Unsupported or
malformed escape sequences must be discarded, not emitted literally.
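
To make "discard" concrete, here is a minimal sketch in Python (chosen
for brevity; it is not info's code, and the names `OSC_RE` and
`strip_osc` are made up for illustration).  It assumes the 7-bit
encoding only: an OSC sequence starts with ESC ] and ends at ST (ESC \)
or BEL.  Other escape sequences, such as SGR, are left alone.

```python
import re

# 7-bit OSC: ESC ] <payload> terminated by ST (ESC \) or BEL.
# The payload excludes ESC and BEL, so an unterminated sequence cannot
# swallow text that follows a later, unrelated escape sequence.
OSC_RE = re.compile(r"\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)")


def strip_osc(text: str) -> str:
    """Return text with every complete 7-bit OSC sequence removed."""
    return OSC_RE.sub("", text)


# An OSC 8 hyperlink wrapped around the link text "grotty(1)".
sample = "see \x1b]8;;man:grotty(1)\x1b\\grotty(1)\x1b]8;;\x1b\\ for details"
print(strip_osc(sample))  # prints: see grotty(1) for details
```

This sketch leaves an unterminated OSC sequence in place; a real
implementation would have to decide how to treat that case.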

At 2026-01-11T16:26:57+0000, Gavin Smith wrote:
> I should add that Info runs on a wide variety of Unix-like systems,

...as does groff.

> not just those using groff or particular versions of "man", so using
> particular command-line invocations or setting particular environment
> variables is unlikely to be reliable.

Only somewhat true, which is one reason I proposed two mechanisms for
the "info" program to collect man page text, since I don't know
precisely which technique it uses.

> For example, you suggested setting MANROFFOPT=-rU0, which looks like
> an option to be passed to a "roff" program by "man", but the program
> might not recognise that option, and then you might not get a manpage
> at all.

Your objection is premised on incomplete information.  First, setting the
"MANROFFOPT" environment variable will indeed have no effect with man(1)
programs other than man-db man(1).  Brouwer/Lucifredi man, formerly used
by Red Hat, has been defunct for over 10 years.[3]  Other man(1)
programs still in use include Solaris's, which runs System V nroff (or
troff) on Solaris 10, and an old version of groff on Solaris 11;[4]
FreeBSD's "man" shell script; and mandoc(1)'s man program--all of which
will ignore it harmlessly, like any other unrecognized environment
variable.[5]

Second, a "-rU0" command-line option will in fact be recognized by any
"roff" program except the one actually called "roff", which to the best
of my knowledge last shipped in 2.9BSD in 1983.[6]

All lineages of nroff/troff since then support the '-r' command-line
option.[7]  What '-rU0' does is direct the formatter to assign the
register named 'U' the value '0'.  In *roff formatters, this is the same
result as not specifying it at all, since registers don't have to be
declared before use.  The formatter automatically assigns registers
values of zero if they are dereferenced before being defined.

However, when using groff man(7), this command-line option overrides
any existing register assignment, as might be done in the "troffrc" file
or a macro package.  Since *roffs process command-line string and
register definitions before loading macro packages specified with the
'-m' option, what groff man(7) actually does is use a GNU troff
extension to check if the 'U' register is defined at all; if it is, it
must have been at the command line, so it does not override the value
the user specified.[8]
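
The ordering described above can be modeled with a toy sketch in Python.
This is not troff code: the `Registers` class, `load_man_package`, and
the package default of 1 for 'U' are assumptions made for illustration.
It shows only the two rules at issue: undefined registers read as zero,
and the package sets 'U' only if the command line has not already
defined it.

```python
class Registers:
    """Toy model of *roff number registers (illustration only)."""

    def __init__(self, cmdline=None):
        # '-r' assignments are processed before any macro package loads.
        self._regs = dict(cmdline or {})

    def is_defined(self, name):
        return name in self._regs

    def read(self, name):
        # A register that was never defined reads as zero.
        return self._regs.get(name, 0)

    def set_default(self, name, value):
        # Assign only if nothing defined the register earlier.
        if not self.is_defined(name):
            self._regs[name] = value


PACKAGE_DEFAULT_U = 1  # assumed default, for illustration


def load_man_package(regs):
    """Model a macro package that respects a command-line 'U'."""
    regs.set_default("U", PACKAGE_DEFAULT_U)


plain = Registers()               # no -rU0, package loaded
load_man_package(plain)
overridden = Registers({"U": 0})  # -rU0, package loaded
load_man_package(overridden)
other_roff = Registers({"U": 0})  # -rU0, no package that knows 'U'

print(plain.read("U"), overridden.read("U"), other_roff.read("U"))
# prints: 1 0 0
```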

The net result is the same.  Passing '-rU0' to a *roff will not cause a
document to fail to render unless the document programs itself to fail
in that circumstance.  There is a remote possibility that a man page employs
a 'U' register for its own purposes, but this is vanishingly unlikely;
I've reviewed hundreds of man pages and grepped thousands.  The
extremely few man(7) authors who define registers at all attempt nothing
so dramatic, and they also tend to avoid use of single-letter register
names.  (The lone exception I'm aware of is the use of an 'F'
register to control index entry emission by perlpod, a man(7) document
_generator_.)  man(7) document authors are generally not sophisticated
in their exercise of formatter features, which sometimes frustrates them
but also makes man(7) document composition simpler than it would
otherwise be.
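
As a sketch of how a caller might apply this, here is one way to run
man(1) with '-rU0' in MANROFFOPT, in Python for brevity.  The helper
names `man_env` and `render_man_page` are hypothetical, and appending to
an existing MANROFFOPT value assumes that man-db splits the variable on
white space; I have not verified that here.

```python
import os
import subprocess


def man_env(base=None):
    """Return a copy of the environment with -rU0 added to MANROFFOPT."""
    env = dict(os.environ if base is None else base)
    existing = env.get("MANROFFOPT", "")
    # Assumption: options in MANROFFOPT are separated by white space.
    env["MANROFFOPT"] = (existing + " -rU0").strip()
    return env


def render_man_page(topic):
    """Run man(1) on topic and return its standard output as text."""
    result = subprocess.run(
        ["man", topic],
        env=man_env(),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


# Example call (requires a man(1) program on the PATH):
#     text = render_man_page("grotty")
```

A man(1) that does not consult MANROFFOPT simply never looks at the
variable, so the call degrades to an ordinary "man topic".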

grotty's OSC 8 feature was planned and implemented with substantial
consideration, field trials, and user consultation.  The root of the
problem observed is info(1)'s poor conformance with ECMA-48.

If these problems have gone unraised by Texinfo users for a long time,
my surmise is that users of info(1), and of GNU Emacs's WoMan man
browser, have such low expectations of their rendering that they
disregard any formatting errors they see.  I observe that, for example,
WoMan, which apparently attempts to parse man(7) document input for
itself instead of entrusting it to a man(1) program or to the nroff
command, misrenders `\c` and `\:` escape sequences.[9]

Regards,
Branden

[0] http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-048.pdf
[1] https://cgit.git.savannah.gnu.org/cgit/groff.git/tree/NEWS?h=1.23.0#n223
[2] https://www.greenwoodsoftware.com/less/news.661.html
[3] 
https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/7/html/migration_planning_guide/chap-red_hat_enterprise_linux-migration_planning_guide-changes_to_packages_functionality_and_support

[4] Explicit documentation of these facts seems scarce.  With an
    account, one can confirm this first-hand on gcc210.fsffrance.org and
    gcc211.fsffrance.org.

[5] https://cgit.freebsd.org/src/tree/usr.bin/man/man.sh

[6] https://minnie.tuhs.org/cgi-bin/utree.pl?file=2.9BSD/usr/man/cat1/roff.1

    You can use the search form on the parent page,
    <https://minnie.tuhs.org/cgi-bin/utree.pl>,
    to look for other occurrences of "/roff.1".  You will observe that
    it's missing from most descendants of Seventh Edition Unix: (a)
    Eighth Edition Unix [1985]; (b) Unix System III [1980]; and (c) 3BSD
    [1980] and the VAX port of Seventh Edition, 32/V [1980].

[7] https://www.tuhs.org/cgi-bin/utree.pl?file=V10/vol2/troff/cstr.54
    https://www.troff.org/54.pdf (rendered form)

[8] 
https://cgit.git.savannah.gnu.org/cgit/groff.git/tree/tmac/an.tmac?h=1.23.0#n1495

[9] My installed version of Emacs is pretty old, though (27.1); maybe a
    newer release has fixed these defects.

    I would direct the WoMan author/maintainer to the "Portability"
    section of groff_man_style(7), a joint effort by mandoc(1)
    maintainer Ingo Schwarze and myself to describe a subset of
    man(7)+troff that developers of standalone man page formatters
    should support.
