Our docs build system handles manpages written directly in groff as well as manpages written in asciidoc. The asciidoc manpages get built into regular groff manpages using the asciidoc tools.

All manpages, both the native groff ones and the ones built from asciidoc, are converted to HTML using the HTML output mode of the groff tools.

There's a bit of complexity in how groff handles hyphens and minus signs. Both the documentation and the implementation are evolving fairly actively, with non-trivial differences between groff 1.21 (in Wheezy) and groff 1.22.4 (in Buster, and currently the latest released version).

groff 1.21 in Wheezy renders '-' (ascii/utf8 0x2d, the character named "hyphen-minus") in the input to '-' (ascii/utf8 0x2d) in the HTML output, and renders '\-' in the input as '&minus;' in the HTML output. My web browser renders '&minus;' as '−' (u2212, named "minus"), so copy/paste doesn't work.

groff 1.22.4 in Buster renders both '-' (2d) and '\-' (escaped 2d) in the input to '-' (ascii/utf8 0x2d, which is what we want) in the HTML output.
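
To see whether a built HTML page has this problem, one option is to search it for the minus sign, either as the literal U+2212 character or as an HTML entity. The following is only a rough sketch: the helper name find_minus_signs and the sample strings are made up for illustration, and nothing like this exists in our build today.

```python
# Hypothetical helper, not part of the docs build.
import re

# Ways a minus sign (U+2212) can appear in HTML: the literal character,
# the named entity, or a decimal/hex numeric character reference.
MINUS_RE = re.compile("\u2212|&minus;|&#8722;|&#[xX]2212;")


def find_minus_signs(html):
    """Return (line_number, line) pairs for lines containing a minus sign."""
    hits = []
    for number, line in enumerate(html.splitlines(), start=1):
        if MINUS_RE.search(line):
            hits.append((number, line))
    return hits


# Made-up sample: the first line is what breaks copy/paste,
# the second line has a plain hyphen-minus (0x2d) and is fine.
sample = "<p>halcmd &minus;f foo.hal</p>\n<p>halcmd -f foo.hal</p>"
print(find_minus_signs(sample))
```

With the sample above, only the first line is reported; the line with the plain 0x2d character passes.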

The documentation for the groff 1.23 release candidate has this to say:

https://www.man7.org/linux/man-pages/man7/groff_char.7.html

The hyphen-minus is a particularly unfortunate case of
overloading.  Its awkward name in ISO 8859 and later standards
reflects the many conflicting purposes to which it had already
been put in the 1980s, including a hyphen, a minus sign, and
(alone or in repetition) dashes of varying widths.  For best
results in groff, use the character in input without an escape
only to mean a hyphen, as in the phrase “long-term”.  For a minus
sign in running prose or a Unix command-line option dash, use \-
(or \[-] in groff if you find it helps the clarity of the source
document).

The groff(1) manpages in both Wheezy and Buster, themselves written in groff of course, use '\-' (escaped 0x2d) for the dashes that precede command-line arguments. I figure this gives us a pretty good idea of how the groff people (or at least the people who wrote the manpage) think about it.

And finally it's worth noting that when asciidoc writes groff, it escapes the hyphen-minus character.

So based on all this I believe our groff manpages should use "-" (ascii 0x2d, u2d, "hyphen-minus") when we want a hyphen (probably rarely) and "\-" (ascii/utf8 sequence 0x5c 0x2d, an escaped hyphen-minus) when we want the actual hyphen-minus character in the output (like for command-line arguments and HAL pins).
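
As a rough illustration of that convention, here is a sketch of a check that flags option-style dashes written without the backslash in groff source. The function name unescaped_option_dashes and the sample lines are hypothetical, and the regex is a simple heuristic: it only looks at dashes at the start of a line or after whitespace, so it would miss something like '\fB-f\fR' and it knows nothing about groff requests or comments.

```python
# Hypothetical lint, not an existing tool in our tree.
import re

# A hyphen-minus at the start of a line or right after whitespace, followed
# by an optional second dash and a letter (e.g. -f or --help). An escaped
# dash is preceded by a backslash rather than whitespace, so it won't match.
UNESCAPED_OPTION_RE = re.compile(r"(?:^|(?<=\s))-{1,2}[A-Za-z]")


def unescaped_option_dashes(groff_source):
    """Return line numbers where an option dash appears without a backslash."""
    flagged = []
    for number, line in enumerate(groff_source.splitlines(), start=1):
        if UNESCAPED_OPTION_RE.search(line):
            flagged.append(number)
    return flagged


# Made-up sample lines: escaped option dash, unescaped option dash,
# and an ordinary hyphen inside a word.
sample = "\n".join([
    r".B halcmd \-f foo.hal",
    r".B halcmd -f foo.hal",
    "a long-term plan",
])
print(unescaped_option_dashes(sample))  # [2]
```

Only the second sample line is flagged: the first already uses '\-', and the hyphen in "long-term" is a real hyphen and should stay unescaped.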

We should switch our official docs build from Wheezy (which gets this all wrong) to Buster (which gets it right).

This will make the docs on wlo copy-paste-able, but it won't help anyone who wants to build our docs on Wheezy. I can live with that.


--
Sebastian Kuzminsky

