Follow-up Comment #5, bug #67372 (group groff):

At 2025-07-28T14:04:24-0400, G. Branden Robinson wrote:
> commit ad4fa80a3f2d66ed7e9d4342fbc58d2be07984ea
> Author: G. Branden Robinson <g.branden.robin...@gmail.com>
> Date: Sat Oct 1 04:18:47 2022 -0500
>
> [troff]: Refactor to parallelize logic.
>
> * src/roff/troff/input.cpp: Refactor to parallelize logic in similar
> routines; namely, those handling escape sequences that accept newlines
> as argument delimiters.

There were six escape sequences at issue: \A, \B, \b, \o, \w, and \X.

Of these, only 3 are portable to all of the _troff_s available to me:
\b, \o, and \w. (`\X` was a Kernighan _troff_ innovation, not available
in Seventh Edition Unix _troff_.)

So I thought I'd see what happens with an input using this interpolated
delimiter trick with V7, DWB 3.3, and Heirloom Doctools troffs, and
_groff_ 1.22.{3,4}, 1.23.0, and Git HEAD.

Exhibit:

$ cat ATTIC/escape-delimiter-fun.roff
.sp 1i \" space to accommodate bracket-building
.ds D abc
\b\*D+|+\*D
.br
\o\*D+|+\*D
.br
\w\*D+|+\*D

V7 Unix (using a shorter filename because it was 1979 and 14 characters
should be enough for anyone--you can be sure Ken Thompson wasn't going
to ever type anything that long):

$ pdp11 ./v7.simh

PDP-11 simulator V3.8-1
Disabling XQ
@boot
New Boot, known devices are hp ht rk rl rp tm vt
: rl(0,0)rl2unix
mem = 177856
# Restricted rights: Use, duplication, or disclosure
is subject to restrictions stated in your contract with
Western Electric Company, Inc.
Thu Sep 22 23:35:05 EDT 1988

login: dmr
$ cat > escfun.roff
.sp 1i \" space to accommodate bracket-building
.ds D abc
\b\*D+|+\*D
.br
\o\*D+|+\*D
.br
\w\*D+|+\*D
$ nroff escfun.roff | sed '/^$/d'
b c + | +bc +bc 120bc
$ sync
$ sync
$ sync
$ login:
Simulation stopped, PC: 002306 (MOV (SP)+,177776)
sim> quit
Goodbye

Next up, DWB 3.3 _troff_, a cousin of Kernighan _troff_:

$ DWBHOME=. ./bin/nroff escape-delimiter-fun.roff | cat -s
b c + | +bc +bc 120bc

Next, Heirloom Doctools _troff_, a descendant of DWB 2.0 (whence also
came Solaris _troff_, to the best of my knowledge):

$ ./bin/nroff escape-delimiter-fun.roff | cat -s
b c + | +bc +bc 120bc

_groff_ 1.22.3:

$ ~/groff-1.22.3/bin/nroff -ww ATTIC/escape-delimiter-fun.roff | cat -s
ATTIC/escape-delimiter-fun.roff:5: warning: missing closing delimiter
b c +bc c 120bc

_groff_ 1.22.4:

$ ~/groff-1.22.4/bin/nroff -ww ATTIC/escape-delimiter-fun.roff | cat -s
troff: ATTIC/escape-delimiter-fun.roff:5: warning: missing closing delimiter
b c +bc c 120bc

_groff_ 1.23.0:

$ ~/groff-1.23.0/bin/nroff -ww ATTIC/escape-delimiter-fun.roff | cat -s
troff:ATTIC/escape-delimiter-fun.roff:7: warning: missing closing delimiter in width computation escape sequence (got a newline)
b c + | + c 192 c

_groff_ Git HEAD:

$ ~/groff-HEAD/bin/nroff -ww ATTIC/escape-delimiter-fun.roff | cat -s
troff:ATTIC/escape-delimiter-fun.roff:3: warning: missing closing delimiter in bracket-building escape sequence; expected character 'a', got a newline
troff:ATTIC/escape-delimiter-fun.roff:5: warning: missing closing delimiter in overstrike escape sequence; expected character 'a', got a newline
troff:ATTIC/escape-delimiter-fun.roff:7: warning: missing closing delimiter in width computation escape sequence; expected character 'a', got a newline
b c + | +.br c.br 192 a b c

I concede that I've destabilized _groff_ for this input. I also observe
that _groff_ has apparently never been consistent with AT&T _troff_ in
this respect: contrast _groff_ 1.22.{3,4} output with V7, DWB 3.3, and
Heirloom.

But _groff_ has another trick up its sleeve. Recall comment #2:

    Normally, GNU 'troff' keeps track of delimited arguments'
    interpolation depth. In compatibility mode, it does not.

    .ds xx '
    \w'abc\*(xxdef'
        => 168 (normal mode on a terminal device)
        => 72def' (compatibility mode on a terminal device)

What happens if we turn on _groff_'s AT&T compatibility mode?

$ ~/groff-HEAD/bin/nroff -Cww ATTIC/escape-delimiter-fun.roff | cat -s
b c +bc +bc 120bc

We get _almost_ AT&T-compatible behavior. The pipe in the
bracket-building escape sequence has gone missing. _groff_ appears to
have been mislaying it for many years, compatibility mode or no.

But let's check that. Let's travel back in time and see if/how I've
perturbed the treatment of this input in compatibility mode.

$ ~/groff-1.23.0/bin/nroff -Cww ATTIC/escape-delimiter-fun.roff | cat -s
/home/branden/groff-1.23.0/bin/nroff: usage error: invalid option '-Cww'
usage: /home/branden/groff-1.23.0/bin/nroff [-bcCEhikpRStUVz] [-d ctext] [-d string=text] [-K fallback-encoding] [-m macro-package] [-M macro-directory] [-n page-number] [-o page-list] [-P postprocessor-argument] [-r cnumeric-expression] [-r register=numeric-expression] [-T output-device] [-w warning-category] [-W warning-category] [file ...]
usage: /home/branden/groff-1.23.0/bin/nroff {-v | --version}
usage: /home/branden/groff-1.23.0/bin/nroff --help

Oh, bother.

$ ~/groff-1.23.0/bin/groff -Tascii -Cww ATTIC/escape-delimiter-fun.roff | cat -s
b c +bc +bc 120bc
$ ~/groff-1.22.4/bin/groff -Tascii -Cww ATTIC/escape-delimiter-fun.roff | cat -s
b c +bc +bc 120bc
$ ~/groff-1.22.3/bin/groff -Tascii -Cww ATTIC/escape-delimiter-fun.roff | cat -s
b c +bc +bc 120bc

So compatibility mode has been stable, and close to AT&T behavior, but
not an exact match.

I think three tasks arise from this exploration.

1. I should document that if one wants AT&T-compatible treatment of
   interpolated delimiters, one should use compatibility mode. The
   language of our documentation should be broadened: _groff_'s concept
   of "input level" (or "interpolation depth" as I prefer to term it)
   applies not just to the _arguments_ of a delimited escape sequence,
   _but to the delimiters themselves_.

2. I should see if I can make _groff_ perfectly AT&T-compatible with
   this input, in compatibility mode.

3. I should add test cases that force me to nail down the formatter's
   behavior when using string interpolations to construct escape
   sequences.

How does that sound?

_______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?67372>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/
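
As a possible starting point for task 3, the manual runs above could be
collected into a small script. What follows is only a sketch under
stated assumptions: it supposes the groff builds live under
~/groff-VERSION as in the transcripts, it writes the exhibit to a
temporary location (the `input` variable and the use of TMPDIR are my
own choices), and it merely prints each formatter's output side by side
in normal and compatibility mode. It asserts nothing about what the
correct output is; that would still have to be decided.

```shell
#!/bin/sh
# Sketch only: regenerate the exhibit and feed it to each groff build found.
# The install prefixes mirror those used in the transcripts above; adjust
# them for other systems.

input="${TMPDIR:-/tmp}/escape-delimiter-fun.roff"

# Quoted 'EOF' keeps the shell from touching the backslashes.
cat > "$input" <<'EOF'
.sp 1i \" space to accommodate bracket-building
.ds D abc
\b\*D+|+\*D
.br
\o\*D+|+\*D
.br
\w\*D+|+\*D
EOF

for version in 1.22.3 1.22.4 1.23.0 HEAD; do
    groff="$HOME/groff-$version/bin/groff"
    if [ ! -x "$groff" ]; then
        echo "groff $version: not found at $groff; skipping"
        continue
    fi
    # Empty string = normal mode; -C = AT&T compatibility mode.
    for mode in "" -C; do
        echo "== groff $version ${mode:-(normal mode)}"
        # $mode is deliberately unquoted so the empty case adds no argument.
        "$groff" -Tascii $mode -ww "$input" 2>&1 | cat -s
    done
done
```

The script calls groff rather than nroff because, as seen above, the
1.23.0 nroff wrapper rejects the clustered '-Cww' option; passing -C and
-ww separately to groff sidesteps the question.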