Hi Colin,

At 2025-08-26T18:33:23+0100, Colin Watson wrote:
> On Tue, Aug 26, 2025 at 07:05:49AM +0200, Alejandro Colomar wrote:
> > Hmmmm, that sounds not good at all.  How about moving this to
> > man(1)?  That is, man(1) knows whether it is being piped or not, and
> > thus can tell groff(1) to do OSC8 or not.

This point rouses a grievance I've had for a couple of decades.  Those
not interested may prefer to skip down to "MANROFFOPT" below, or even to
"stopgap measure".

Whether man(1) is involved in a pipeline need not have an impact on the
matter.  (The only reason for it to is the feebleness of pager programs,
whose authors conveniently forget that the purpose of their utility is
to _intercept the data stream a host transmits to a video terminal and
paginate it_, as soon as competently doing so requires more than minimal
interpretation of said data stream.  The Teletype Model 37 was
simple,[0] so pager authors decided that it was all they should have to
mess with.  That languor is what led to Debian's `GROFF_SGR`...episode
in the first place.  Pagers' Theodosian walls show a lot of damage these
days, but have not yet crumbled.  You still have to tell less(1) `-R`.)

I think GNU coreutils (then known as fileutils, textutils, and
shellutils) took a wrong turn in the 1990s when GNU ls(1) started
almost-blindly blasting SGR escape sequences to the standard output, and
compounded the error by inviting the user to customize the feature using
the original ANSI X3.64 numeric indexing regime.  That decision gave a
hostage to fortune in the event terminals (or their emulators) ever
developed a greater color depth, which was definitely foreseeable, and
definitely happened, as an entire cottage industry grew up around the
"skinning" of desktop environments.  Why endure the angry fruit salad of
a 3- or 4-bit RGB(I) color space when your video card and monitor could
pump out 16- or 24-bit color and you could survey your domain in
modestly differentiated shades of salmon, chartreuse, and cerulean?

To clarify, why should an ls(1) user have to know that "35" means
"magenta foreground" and "47" means "white background"?  Why is this
implementation detail exposed to the user?  Can I play the tech bro's
"layering violation" card here?

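To make the complaint concrete, here is a minimal sketch in shell of the
decoding a user has to do in their head.  The function names and the
sample entry "ln=35;47" are mine, made up for illustration; only the
eight basic colors in the 30-37 and 40-47 ranges are handled.

```sh
# Hypothetical helper: translate one SGR parameter into words.
# Only basic foreground (30-37) and background (40-47) colors are
# recognized; anything else is echoed back as "SGR <n>".
sgr_name() {
    case $1 in
        3[0-7]) layer=foreground ;;
        4[0-7]) layer=background ;;
        *) printf 'SGR %s\n' "$1"; return ;;
    esac
    # The last digit selects the color.
    case ${1#?} in
        0) color=black ;;
        1) color=red ;;
        2) color=green ;;
        3) color=yellow ;;
        4) color=blue ;;
        5) color=magenta ;;
        6) color=cyan ;;
        7) color=white ;;
    esac
    printf '%s %s\n' "$color" "$layer"
}

# Hypothetical helper: decode the value of one LS_COLORS entry, such
# as "ln=35;47".  Runs in a subshell so IFS and globbing changes do
# not leak into the caller.
decode_ls_colors_entry() (
    set -f
    IFS=';'
    for n in ${1#*=}; do
        sgr_name "$n"
    done
)
```

Calling `decode_ls_colors_entry 'ln=35;47'` prints "magenta foreground"
and "white background" on separate lines, which is the lookup table the
user is expected to carry around.
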
Apparently I can blame Ulrich Drepper for this.

[coreutils Git]
commit c65e1fe89f81eaf82ecbff92efbc924cdca541cf
Author:     Jim Meyering <j...@meyering.net>
AuthorDate: Mon Nov 28 04:25:31 1994 +0000
Commit:     Jim Meyering <j...@meyering.net>
CommitDate: Mon Nov 28 04:25:31 1994 +0000

    `colorize' patch from Drepper.
...
+/* Nonzero means use colors to mark types.  Also define the different
+   colors as well as the stuff for the LS_COLORS environment variable.
+   The LS_COLORS variable is now in a termcap-like format.  -o or
+   --color-if-tty. */
...
+/* Parse the LS_COLORS/LS_COLOURS variable */
+
+static void
+parse_ls_color ()
+{
+  register char *p;           /* Pointer to character being parsed */
+  char *whichvar;             /* LS_COLORS or LS_COLOURS? */
+  int state;                  /* State of parser */
+  int ind_no;                 /* Indicator number */
+  int ccount;                 /* Character count */
+  int num;                    /* Escape char numeral */
+  char label[3] = "??";               /* Indicator label */
+
+  if ( (p = getenv(whichvar = "LS_COLORS")) ||
+       (p = getenv(whichvar = "LS_COLOURS")) )

Good to know.  Aware of his reputation, if I ever need to get into a war
of invective over this blunder, I can expect him to join the battle
adequately armed.

Anyway, this feature established a precedent in GNU tools and a herd of
other programs thundered through it.  I'm glad groff didn't!

> > And even for the case of the terminal, it is in a better position to
> > pass the information to groff(1); we'd still need points 1 (modified
> > for man(1)) and 2, but not 3, which is very ugly.
> 
> Doesn't man(1) have most of the same problem?  It needs to know
> whether the terminal emulator supports OSC 8, and I'm not aware of a
> way that it could discover that at the moment; it's not just a
> question of whether it's piped.  I don't think that "put it in
> man-db's configuration file" or "require a command-line option" would
> be particularly friendly solutions to that problem.

I agree, and the environment variable MANROFFOPT already exists anyway.
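
As a sketch of what that already permits, a user could wrap man(1) like
this.  The wrapper name is hypothetical, and I'm assuming a groff whose
man package honors the `U` register, as the `-rU0` from your message
does.

```sh
# Hypothetical wrapper: run man(1) with -rU0 passed to the formatter
# through MANROFFOPT, for this one invocation only.
man_nolinks() {
    MANROFFOPT=-rU0 man "$@"
}
```

Something like `man_nolinks 1 ls` would then render the page without
hyperlinks, no change to man-db's configuration file required.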

I've quoted Anton Shepelev before.

"`grotty' is not an appendix to a pager, but a program for printing
direct to the terminal.  Most terminals support those basic ANSI
control sequences, and many console programs freely use them.  If a
pager cannot transparently forward them to the terminal, it is a
problem of the pager, not of `grotty', and having a broken -man
configuration by default to just to appease `less' is stupid."

> If points 1 and 2 were handled in groff, then I wouldn't be
> necessarily opposed to having man(1) tell the formatter that rendered
> hyperlinks are acceptable, but it's not an area I'm all that familiar
> with.  I'd be happy to review patches provided that they retain
> compatibility with reasonably old groff versions (man-db currently
> supports groff >= 1.21).

Frustratingly, I think point 3 is the _easiest_ to do.  Point 1 is next,
it being a Simple Matter of Feature Addition with Autoconfery and
Extensive Writing of Regression Tests.

Point 2 is the hardest because it depends on other people, starting with
Thomas Dickey--who finds OSC 8 a dubious idea in the first place--
agreeing on a nomenclature.

Further points occur to me regarding the terminfo side of things:
calling the capability `o8` or `O8` may not be the best idea.  The
semantic of a terminal capability is the availability of a feature or
behavior of interest, not the mechanism by which it is obtained (or
avoided, as with the "glitch" capabilities).  terminfo's "sgr" is not
ill-chosen in this respect--it actually does stand for "select graphic
rendition(s)", even if almost no one remembers this fact.  So the
capability name should reflect what it's _for_: embedding a hyperlink.

Further, terminfo's string capability parameterization syntax _might_
not be powerful enough to support OSC 8 in the first place.  (I'll have
to spend time digesting a description of it[1] to decide.)  If true,
then there is no point selecting a short, termcap-compatible capability
name for it in the first place.  We can call the thing `url` or `hlink`
or `hyplk` or `hotsp` (PDF: "hotspot") or whatever.
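
For reference, here is the shape of what any such capability would have
to produce, as a minimal shell sketch.  The function name is
hypothetical; the optional "params" field of the sequence is left empty.

```sh
# Hypothetical helper: emit TEXT ($2) as an OSC 8 hyperlink to URI ($1).
# Layout:  ESC ] 8 ; params ; URI ST   TEXT   ESC ] 8 ; ; ST
# where ST (string terminator) is ESC \.  The URI and text are passed
# as %s arguments so that "%" characters in a URI are not interpreted
# by printf.
osc8_link() {
    printf '\033]8;;%s\033\\%s\033]8;;\033\\' "$1" "$2"
}
```

Usage might look like `osc8_link https://www.gnu.org/software/groff/
groff; echo`.  Note that the opening sequence takes string-valued
arguments and the link is closed by a second sequence with an empty URI.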

> I could of course have man(1) unconditionally pass -rU0 to groff until
> the problem is resolved properly, which would at least preserve
> existing behaviour for users of unreleased groff 1.24.  I'm not sure
> whether that would be considered as playing Core War with the manual
> page system ...

A better stopgap measure can probably happen in "an.tmac" and "doc.tmac"
themselves.

groff has the `\V` escape sequence to interpolate an environment
variable's contents.  The packages could whitelist a set of `TERM`
terminal type names as commodious of OSC 8 hyperlinking.

Here's a sketch, interpolated into some existing logic.

.\" For most purposes, we treat the nroff devices equivalently.
.nr an*is-output-terminal 0
.if '\*(.T'ascii'  .nr an*is-output-terminal 1
.if '\*(.T'latin1' .nr an*is-output-terminal 1
.if '\*(.T'utf8'   .nr an*is-output-terminal 1
.
.nr an*can-hyperlink 0
.if \n[an*is-output-html] \
.  nr an*can-hyperlink 1
.
.if \n[an*is-output-terminal] \{\
.  if '\?\V[TERM]\?'gnome-terminal'       .nr an*can-hyperlink 1
.  if '\?\V[TERM]\?'some-other-terminal'  .nr an*can-hyperlink 1
.  if '\?\V[TERM]\?'yet-another-terminal' .nr an*can-hyperlink 1
.\}
.
.if '\*[.T]'pdf' \
.  nr an*can-hyperlink 1
.
. \" Later...
.\" hyperlinked text desired
.if !r U \
.  nr U 1
.
.nr an*do-hyperlink 0
.if (\n[U] & \n[an*can-hyperlink]) .nr an*do-hyperlink 1

(I don't, off the top of my head, actually know of any terminal
emulators that implement OSC 8 besides gnome-terminal(1).[2])

I think the foregoing would relieve man-db of having to make any changes
to accommodate the groff 1.24 news item quoted earlier.[3]  (I _think_
that's the correct link, but lists.gnu.org is down. :-/ )

I'm not in love with this; I think it solves the wrong problem--a
terminal's _type name_ is not what determines whether it has a given
capability.  It's terminfo's (and the dead-but-unburied termcap's) job
to maintain knowledge of each terminal type's capability repertoire.
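
To illustrate the difference, here is a rough shell sketch that asks
about a capability instead of matching on a type name.  It reads an
infocmp(1)-style description on standard input.  The function name is
hypothetical, and the parsing is naive: it would be fooled by an escaped
comma inside a string capability's value.

```sh
# Hypothetical helper: succeed (exit status 0) if capability $1 is
# listed in the infocmp(1)-style description read from stdin.
has_capability() {
    # Drop comment lines and the unindented header line, split the
    # remaining capability lines on commas, trim leading blanks and
    # any "=value", "#number", or "@" suffix, then look for an exact
    # match on the capability name.
    sed -e '/^#/d' -e '/^[^[:space:]]/d' |
    tr ',' '\n' |
    sed -e 's/^[[:space:]]*//' -e 's/[=#@].*$//' |
    grep -Fqx -- "$1"
}
```

With this, `infocmp tty37 | has_capability hc` succeeds and
`infocmp tty37 | has_capability sgr` fails, going by the entry in
footnote [0].  There is nothing to ask about hyperlinks yet, since no
capability name for OSC 8 has been agreed upon.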

But for getting over a hump while the community sorts out its direction
on this matter, it's good enough.

Do you agree?  Am I missing something?

Regards,
Branden

[0]

$ infocmp tty37
#       Reconstructed via infocmp from file: /home/branden/ncurses-HEAD/share/terminfo.db
tty37|model 37 teletype,
        hc, os, xon,
        bel=^G, cr=\r, cub1=^H, cud1=\n, cuu1=\E7, hd=\E9, hu=\E8,
        ind=\n,

The Model 37 was not a video terminal at all, but a teletypewriter.
But, it seems, supporting it looked easy whereas supporting ANSI X3.64
(and later ECMA-48) looked hard, so pager developers decided, initially
at least, that terminals and their emulators really should just be
"glass TTYs".  We've all paid in frustration for this recalcitrance.
Naturally, these same pager programs sprouted options and features in an
arms race with each other, doubtless surpassing the virtual memory that
would have been required to implement the 1979 version of ANSI X3.64 in
the first place.

The people at the Bell Labs CSRC were shrewd.  They looked at the
problem of support for the then-vast variety of hardware terminals in
existence, perceived the mine field of irregular conformance with ANSI
X3.64 or any other standard, and noped directly to the Jerq/Blit/DMD
5620, which integrated the pager with the terminal emulator itself.

The rest is nearly forgotten history.[4]

[1] https://www.gnu.org/software/termutils/manual/termcap-1.3/html_mono/termcap.html#SEC19

[2] John Gardner keeps track of them, if someone would like to submit a
    patch.  :)

    https://github.com/Alhadis/OSC8-Adoption

[3] https://lists.gnu.org/archive/html/groff/2025-08/msg00051.html
[4] https://en.wikipedia.org/wiki/AT%26T_Computer_Systems
