Bug#312935: groff-base: grotty -c should be - no it MUST be - made the default
At 2022-09-26T15:40:39-0500, G. Branden Robinson wrote:
> > less interprets b^Hbo^Hol^Hld^Hd by outputting the correct escape
> > sequences for the terminal in use.
>
> No, it doesn't, and cannot, because that representation form is
> ambiguous when the character to be overstruck is an underscore. This
> actually comes up in man pages. Somewhere in the Debian BTS there is
> an exhibit from a real page about this, but I don't recall right now
> where.

I found it. Mark Wooding pointed this out in 2020.

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=963490#5

I was wrong about Mark supplying a specimen from a real-world man page, but he did offer a reproducer which enabled me to find a few.

Here's a more full reproducer (it still doesn't take much). Remember, for this to behave badly you _have_ to use the celebrated SGR disablement feature.

.TH demo 1 2022-09-26 "a demonstration"
.IB x _mid_ y

In the output you will see that the first underscore is underlined ("italicized") while the second is boldfaced. Obviously from the macro arguments they should both be bold.

On my Debian bullseye system, ul(1) seems to guess better with this input than less 581.2 does. (I use a more recent version of less because I needed it to test grotty's OSC 8 support.)

With the overstriking approach there is simply no way around this guesswork. One could of course propose some sort of in-band signaling protocol utterly alien to the fluttering standard of the Teletype Model 37 around which opponents of grotty's SGR output have rallied.

The cowpoke(1) man page in Debian offers several exhibits of misrendering arising from this ambiguity. Here's one paragraph.

   SIGN_KEYID
       If this option is set, it is expected to contain the gpg key ID
       to pass to debsign(1) if the packages are to be remotely signed.
       You will be prompted to confirm whether you wish to sign the
       packages after all builds are complete. If this option is unset
       or an empty string, no attempt to sign packages will be made.
       It may be overridden on an arch and dist specific basis us‐
       ing the arch_dist_SIGN_KEYID option described below, or
       per‐invocation with the --sign command line option.

This email strips typeface changes, of course, but anyone wanting to continue to prosecute their mistaken and erroneous position regarding this groff feature can observe that the underscores after "arch" and "dist" are underlined ("italicized") when they should be bold.

Observe the page source.

made.
It may be overridden on an \fIarch\fP and \fIdist\fP specific basis using the
.IB arch _ dist _SIGN_KEYID
option described below, or per-invocation with the \fB\-\-sign\fP
command line option.

Here are some other pages worth investigating.

/usr/share/man/man1/pamdice.1.gz
/usr/share/man/man1/winicontoppm.1.gz
/usr/share/man/man7/lvmraid.7.gz

I emphasize that there's nothing incorrect about the man page source code above. Pages misrender because overstriking for font changes is an inherently limited and ambiguous convention that simply cannot reliably produce correct output in some cases.

If you want correctness, enable SGR.

> As much as I'd like to think that some grief could have been avoided
> by naming grotty something like "groansi" instead, I suspect the
> amount would be vanishingly small. From what I've seen, users
> scandalized by the violence that less(1) does to grotty's output do
> not, as a first recourse, research the problem. Instead they file
> bugs like this one.
>
> Nevertheless, I have given serious consideration to making grotty(1)
> use the terminfo library to determine terminal capabilities; its
> current approach is admittedly crude. It seems like it would be
> friendly to users to _ask_ their terminals, or at least their $TERM
> variables, what they claim to be capable of.

I should add that even if I do this, I don't expect _any_ observable reduction in the number of complaints. xterm (and ncurses) maintainer Thomas Dickey's website regales the reader with story after story of users (and GNU/Linux distributions) who changed TERM environment variable settings and terminal database descriptions, practicing their marksmanship, if not blindly, then in terribly low light.

The problem is that the set of people who hate the fact that grotty produces SGR considerably overlaps the set of people who clamored for 256-color (and subsequently "true color") support in xterm, because they wanted skinnable color schemes for Vim and similar. (Admittedly, some of the pressure in this area was relieved when the Atom IDE came along and skimmed a lot of these people off.) The people in the large intersection of these sets want simultaneous fidelity to ultra-modern xterm on the one hand and the Teletype Model 37 on the other. The possibility that this is fundamentally impossible they meet with QAnon le
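The underscore ambiguity described in this message can be illustrated with a small Python sketch. It assumes the usual overstriking conventions (bold is the character, a backspace, then the same character again; underline is an underscore, a backspace, then the character). The names `encode` and `candidate_styles` are hypothetical and are not part of grotty, less, or ul; this is only a model of the guesswork a pager faces, not how any of those programs is implemented.

```python
BS = "\b"  # backspace, written ^H elsewhere in this thread


def encode(ch: str, style: str) -> str:
    """Encode one character using the overstriking convention (assumed)."""
    if style == "bold":
        return ch + BS + ch       # the character struck twice
    if style == "underline":
        return "_" + BS + ch      # an underscore with the character on top
    return ch                     # plain text, no overstrike


def candidate_styles(cell: str) -> set:
    """Return every style that could have produced a 3-character cell."""
    first, middle, second = cell  # raises ValueError unless len(cell) == 3
    if middle != BS:
        raise ValueError("not an overstrike cell")
    styles = set()
    if first == second:
        styles.add("bold")
    if first == "_":
        styles.add("underline")
    return styles


if __name__ == "__main__":
    samples = [("m", "bold"), ("x", "underline"),
               ("_", "bold"), ("_", "underline")]
    for ch, style in samples:
        cell = encode(ch, style)
        print(repr(ch), style, "->", sorted(candidate_styles(cell)))
```

A bold underscore and an underlined underscore encode to the same three characters, so `candidate_styles` returns both styles for that cell. A consumer of this format has to pick one by heuristic, which is why different pagers can disagree on the same input.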
Bug#312935: groff-base: grotty -c should be - no it MUST be - made the default
After 15 years, I recommend closing this bug as "wontfix".

The traditional output mode only makes sense for overstriking terminals, of which there are vanishingly few that aren't paper terminals, which are themselves a dead breed.

The virtue of "ANSI escapes" is that they're actually standardized in ISO 6429. Moreover, they are known to be widely supported in many hardware terminals and emulators.

For those with truly "dumb" terminals, "grotty -cbou" is probably a good invocation to learn. It can be passed through from groff with the option "-P-cbou". For example:

  groff -Tascii -t -man -P-cbou ./build/tmac/groff_man.7 | more

The grotty(1) man page in groff 1.22.4 is significantly improved over past versions, and the one forthcoming in groff 1.23.0 will be even better[1]. I would appreciate knowing if the "-cbou" trick is worth explicitly calling out there. I use it for groff regression tests, but I don't know how many other people would have any use for it.

Regards,
Branden

[1] https://man7.org/linux/man-pages/man1/grotty.1.html
Bug#312935: groff-base: grotty -c should be - no it MUST be - made the default
Package: groff-base
Version: 1.18.1.1-7
Severity: normal

Let's begin with some quotes from grotty(1):

  By default, grotty emits SGR escape sequences (from ISO 6429, also called ANSI color escapes) ... For SGR support, it is necessary to use the -R option of less(1) to disable the interpretation of grotty's old output format. Consequently, all programs which use less as the pager program have to pass this option to it.

Instructions of the form "all programs ... have to ..." should be taken as a warning that whatever's being documented is a disruptive misfeature, and this is a perfect example.

In previous versions of groff (for example the last version of groff in woody, and as far as I know every version of groff in every version of Debian prior to sarge), this worked:

  ( echo .ft B ; echo bold ; echo .ft R ; echo not bold ) > foo
  nroff foo | less

Now it doesn't. Instead it shows ESC[1m and ESC[22m.

grotty(1) documents a splendid variety of ways to avoid this outcome: by setting an environment variable, adding an option to the grotty command line, or sticking a directive into the intermediate [di]troff output. There are so many ways to disable this new behavior, you can't avoid thinking that the perpetrator must have known he was about to cause a lot of trouble.

The burden of tracking down scripts that need to be rewritten, changing options, and setting environment variables rightly belongs on the few who want the new functionality, not the many who were happy with nroff as it has always been. Enabling a new, incompatible output format by default and then boldly declaring that "all programs ... have to" adapt to it is arrogant and rude.

The advice to use less -R is helpful to some people, I suppose, but it is basically a layering violation. Decades of UNIX tradition have established the basic rules to be followed when the PAGER environment variable is used. It can be assumed that the PAGER behaves reasonably on sequences like b^Hbo^Hol^Hld^Hd, showing it as the word "bold" in bold if possible, but it cannot be assumed that the PAGER is less, or that the PAGER has a less-ish -R option or that it can interpret sequences like ESC[1m. PAGER is a protocol. One side does not get to unilaterally dictate changes in the other. grotty is violating the Robustness Principle here: be conservative in what you send.

Aside from that, the use of ANSI escape sequences without regard to the TERM environment variable is an unwarranted assumption. less interprets b^Hbo^Hol^Hld^Hd by outputting the correct escape sequences for the terminal in use. less -R just passes through the ESC[1m, which does no good on a Wyse terminal. All the world's not a VT100.

Apparently, some of these problems have already been noticed, because /etc/groff/man.local contains the magic code to enable the sensible, backward-compatible behavior. But if it was correct to make that decision for man pages, why shouldn't it be done for all other uses of nroff too?

Please, if upstream will not reverse the default, do it in the Debian package, for all of the reasons listed above. This ill-conceived new behavior must not stand.

-- System Information:
Debian Release: 3.1
Architecture: i386 (i686)
Kernel: Linux 2.4.29
Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968)

Versions of packages groff-base depends on:
ii  libc6       2.3.2.ds1-22  GNU C Library: Shared libraries an
ii  libgcc1     1:3.4.3-13    GCC support library
ii  libstdc++5  1:3.3.5-13    The GNU Standard C++ Library v3

-- no debconf information
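For readers unfamiliar with the two output conventions contrasted in this report, the following Python sketch builds both encodings of the word "bold". The function names are invented for this illustration; the byte sequences themselves are the ones quoted in the report (b^Hbo^Hol^Hld^Hd for overstriking, ESC[1m and ESC[22m for SGR). It is a minimal model of the formats, not of grotty's actual code.

```python
BS = "\b"     # backspace, written ^H in the report
ESC = "\x1b"  # escape, written ESC in the report


def overstrike_bold(text: str) -> str:
    """Legacy convention: strike each character twice via a backspace."""
    return "".join(ch + BS + ch for ch in text)


def sgr_bold(text: str) -> str:
    """SGR convention: switch bold on (1) before and off (22) after."""
    return f"{ESC}[1m{text}{ESC}[22m"


if __name__ == "__main__":
    # repr() makes the control characters visible instead of sending
    # them to the terminal.
    print(repr(overstrike_bold("bold")))
    print(repr(sgr_bold("bold")))
```

The overstrike form marks every character individually, while the SGR form brackets the whole run with a start and an end sequence. That difference is why a pager has to either interpret or pass through each format deliberately.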