Follow-up Comment #5, bug #67372 (group groff):

At 2025-07-28T14:04:24-0400, G. Branden Robinson wrote:
> commit ad4fa80a3f2d66ed7e9d4342fbc58d2be07984ea
> Author: G. Branden Robinson <g.branden.robin...@gmail.com>
> Date:   Sat Oct 1 04:18:47 2022 -0500
>
> [troff]: Refactor to parallelize logic.
>
> * src/roff/troff/input.cpp: Refactor to parallelize logic in similar
> routines; namely, those handling escape sequences that accept newlines
> as argument delimiters.

There were six escape sequences at issue: \A, \B, \b, \o, \w, and \X.

Of these, only three are portable to all of the _troff_s available to me:
\b, \o, and \w.  (\X was a Kernighan _troff_ innovation, not available
in Seventh Edition Unix _troff_.)

So I thought I'd see what happens with an input using this interpolated
delimiter trick with the V7, DWB 3.3, and Heirloom Doctools _troff_s, and
with _groff_ 1.22.{3,4}, 1.23.0, and Git HEAD.

Exhibit:


$ cat ATTIC/escape-delimiter-fun.roff
.sp 1i \" space to accommodate bracket-building
.ds D abc
\b\*D+|+\*D
.br
\o\*D+|+\*D
.br
\w\*D+|+\*D


V7 Unix (using a shorter filename because it was 1979 and 14 characters
should be enough for anyone--you can be sure Ken Thompson was never
going to type anything that long):


$ pdp11 ./v7.simh

PDP-11 simulator V3.8-1
Disabling XQ
@boot
New Boot, known devices are hp ht rk rl rp tm vt
: rl(0,0)rl2unix
mem = 177856
# Restricted rights: Use, duplication, or disclosure
is subject to restrictions stated in your contract with
Western Electric Company, Inc.
Thu Sep 22 23:35:05 EDT 1988

login: dmr
$ cat > escfun.roff
.sp 1i \" space to accommodate bracket-building
.ds D abc
\b\*D+|+\*D
.br
\o\*D+|+\*D
.br
\w\*D+|+\*D
$ nroff escfun.roff | sed '/^$/d'
b
c
+
|
+bc
+bc
120bc
$ sync
$ sync
$ sync
$
login:
Simulation stopped, PC: 002306 (MOV (SP)+,177776)
sim> quit
Goodbye


Next up, DWB 3.3 _troff_, a cousin of Kernighan _troff_:


$ DWBHOME=. ./bin/nroff escape-delimiter-fun.roff | cat -s

b
c
+
|
+bc
+bc
120bc



Next, Heirloom Doctools _troff_, a descendant of DWB 2.0 (whence also
came Solaris _troff_, to the best of my knowledge):


$ ./bin/nroff escape-delimiter-fun.roff | cat -s

b
c
+
|
+bc
+bc
120bc



_groff_ 1.22.3:


$ ~/groff-1.22.3/bin/nroff -ww ATTIC/escape-delimiter-fun.roff | cat -s
ATTIC/escape-delimiter-fun.roff:5: warning: missing closing delimiter

b
c
+bc
c
120bc



_groff_ 1.22.4:


$ ~/groff-1.22.4/bin/nroff -ww ATTIC/escape-delimiter-fun.roff | cat -s
troff: ATTIC/escape-delimiter-fun.roff:5: warning: missing closing delimiter

b
c
+bc
c
120bc



_groff_ 1.23.0:


$ ~/groff-1.23.0/bin/nroff -ww ATTIC/escape-delimiter-fun.roff | cat -s
troff:ATTIC/escape-delimiter-fun.roff:7: warning: missing closing delimiter in width computation escape sequence (got a newline)

b
c
+
|
+
c
192
c



_groff_ Git HEAD:


$ ~/groff-HEAD/bin/nroff -ww ATTIC/escape-delimiter-fun.roff | cat -s
troff:ATTIC/escape-delimiter-fun.roff:3: warning: missing closing delimiter in bracket-building escape sequence; expected character 'a', got a newline
troff:ATTIC/escape-delimiter-fun.roff:5: warning: missing closing delimiter in overstrike escape sequence; expected character 'a', got a newline
troff:ATTIC/escape-delimiter-fun.roff:7: warning: missing closing delimiter in width computation escape sequence; expected character 'a', got a newline

b
c
+
|
+.br c.br 192
a
b
c



I concede that I've destabilized _groff_ for this input.

I also observe that _groff_ has apparently never been consistent with
AT&T _troff_ in this respect: contrast _groff_ 1.22.{3,4} output with
V7, DWB 3.3, and Heirloom.
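
To make that contrast concrete, here is a small sketch that diffs the two
sets of output lines recorded above, blank lines removed.  It is only a
line-level comparison of the transcripts; it cannot say which escape
sequence produced which line, and the variable names are mine.

```python
import difflib

# Output lines copied from the transcripts above (V7, DWB 3.3, and
# Heirloom agree with each other; so do groff 1.22.3 and 1.22.4).
att = ["b", "c", "+", "|", "+bc", "+bc", "120bc"]
groff_1_22 = ["b", "c", "+bc", "c", "120bc"]

diff = list(difflib.unified_diff(att, groff_1_22,
                                 "AT&T", "groff 1.22.x", lineterm=""))
for line in diff:
    print(line)
```

Lines prefixed with `-` appear only in the AT&T output and lines prefixed
with `+` only in the groff 1.22.x output.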

But _groff_ has another trick up its sleeve.  Recall comment #2:


   Normally, GNU 'troff' keeps track of delimited arguments'
   interpolation depth.  In compatibility mode, it does not.

     .ds xx '
     \w'abc\*(xxdef'
         => 168 (normal mode on a terminal device)
         => 72def' (compatibility mode on a terminal device)
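
A toy model may help picture what the quoted passage describes.  The
sketch below is not groff's implementation: the function names are
hypothetical, it understands only the two-character \*(xx string syntax,
and it assumes every character is 24 basic units wide on a terminal
device (which is what 168/7 and 72/3 in the example suggest).  Each input
character is tagged with the interpolation depth it came from; in normal
mode the closing delimiter must match the opening one in both character
and depth, while in compatibility mode the character alone suffices.

```python
# Toy model only; this is not how groff is implemented.
UNITS_PER_CHAR = 24  # assumption: fixed glyph width on a terminal device


def expand(text, strings, level=0):
    """Yield (char, depth) pairs, interpolating \\*(xx string references."""
    i = 0
    while i < len(text):
        if text.startswith("\\*(", i):
            name = text[i + 3:i + 5]
            # Characters from the string body sit one level deeper.
            yield from expand(strings[name], strings, level + 1)
            i += 5
        else:
            yield text[i], level
            i += 1


def width_escape(text, strings, compat=False):
    """Model \\w; `text` is the input immediately following the '\\w'.

    Returns (width in basic units, text left over after the escape).
    """
    tokens = list(expand(text, strings))
    delim, delim_level = tokens[0]
    for n, (ch, level) in enumerate(tokens[1:], start=1):
        if ch == delim and (compat or level == delim_level):
            arg = tokens[1:n]
            rest = "".join(c for c, _ in tokens[n + 1:])
            return len(arg) * UNITS_PER_CHAR, rest
    raise ValueError("missing closing delimiter")


strings = {"xx": "'"}
print(width_escape(r"'abc\*(xxdef'", strings))               # (168, '')
print(width_escape(r"'abc\*(xxdef'", strings, compat=True))  # (72, "def'")
```

The two results correspond to the "168" and "72def'" lines of the quoted
example.  The model deliberately says nothing about how a delimiter that
itself comes from an interpolation is treated, which is the case at issue
in this report.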


What happens if we turn on _groff_'s AT&T compatibility mode?


$ ~/groff-HEAD/bin/nroff -Cww ATTIC/escape-delimiter-fun.roff | cat -s

b
c
+bc
+bc
120bc



We get _almost_ AT&T-compatible behavior.  The pipe and one of the plus
signs in the bracket-building escape sequence have gone missing.
_groff_ appears to have been mislaying them for many years,
compatibility mode or no.

But let's check that.  Let's travel back in time and see if/how I've
perturbed the treatment of this input in compatibility mode.


$ ~/groff-1.23.0/bin/nroff -Cww ATTIC/escape-delimiter-fun.roff | cat -s
/home/branden/groff-1.23.0/bin/nroff: usage error: invalid option '-Cww'
usage: /home/branden/groff-1.23.0/bin/nroff [-bcCEhikpRStUVz] [-d ctext] [-d string=text] [-K fallback-encoding] [-m macro-package] [-M macro-directory] [-n page-number] [-o page-list] [-P postprocessor-argument] [-r cnumeric-expression] [-r register=numeric-expression] [-T output-device] [-w warning-category] [-W warning-category] [file ...]
usage: /home/branden/groff-1.23.0/bin/nroff {-v | --version}
usage: /home/branden/groff-1.23.0/bin/nroff --help


Oh, bother.


$ ~/groff-1.23.0/bin/groff -Tascii -Cww ATTIC/escape-delimiter-fun.roff | cat -s

b
c
+bc
+bc
120bc

$ ~/groff-1.22.4/bin/groff -Tascii -Cww ATTIC/escape-delimiter-fun.roff | cat -s

b
c
+bc
+bc
120bc

$ ~/groff-1.22.3/bin/groff -Tascii -Cww ATTIC/escape-delimiter-fun.roff | cat -s

b
c
+bc
+bc
120bc



So compatibility mode has been stable, and close to AT&T behavior, but
not an exact match.

I think three tasks arise from this exploration.

1.  I should document that if one wants AT&T-compatible treatment of
    interpolated delimiters, one should use compatibility mode.  The
    language of our documentation should be broadened: _groff_'s concept
    of "input level" (or "interpolation depth" as I prefer to term it)
    applies not just to the _arguments_ of a delimited escape sequence,
    _but to the delimiters themselves_.

2.  I should see if I can make _groff_ perfectly AT&T-compatible with
    this input, in compatibility mode.

3.  I should add test cases that force me to nail down the formatter's
    behavior when using string interpolations to construct escape
    sequences.
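
As a rough illustration of the third item, a regression check for the
exhibit could look something like the sketch below.  It is written in
Python purely for illustration and is not meant to reflect how _groff_'s
own test suite is organized; the function names are hypothetical, and it
assumes a `groff` executable on the PATH.  The expected lines are the
AT&T output recorded above with blank lines removed, and the flags are
the ones used in the transcripts (-Tascii, -C, -ww).

```python
import subprocess

# The exhibit from this comment, fed to the formatter on standard input.
EXHIBIT = r""".sp 1i \" space to accommodate bracket-building
.ds D abc
\b\*D+|+\*D
.br
\o\*D+|+\*D
.br
\w\*D+|+\*D
"""

# Output recorded above for V7, DWB 3.3, and Heirloom nroff.
ATT_EXPECTED = ["b", "c", "+", "|", "+bc", "+bc", "120bc"]


def run_groff(source, compat=True):
    """Format `source` for the ascii device; assumes groff is on PATH."""
    cmd = ["groff", "-Tascii", "-ww"] + (["-C"] if compat else [])
    done = subprocess.run(cmd, input=source, capture_output=True,
                          text=True, check=False)
    return done.stdout


def nonblank_lines(output):
    """Drop blank lines, much as `sed '/^$/d'` does in the V7 session."""
    return [line.rstrip() for line in output.splitlines() if line.strip()]


def matches_att(run=run_groff):
    """True if the formatter's output equals the recorded AT&T output."""
    return nonblank_lines(run(EXHIBIT)) == ATT_EXPECTED
```

Going by the transcripts above, `matches_att()` would currently be false
even in compatibility mode, because the `+` and `|` lines are absent from
_groff_'s output.  The `run` parameter exists only so the comparison can
be exercised without a formatter installed.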

How does that sound?



    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?67372>
