Re: [Groff] ASCII Minus Sign in man Pages.

Ingo Schwarze Wed, 26 Apr 2017 09:55:36 -0700

Hi Ralph,

i think it is fair to say that our priorities differ slightly here;
i value simplicity for writers slightly higher than you seem to,
and you seem to value typeset (e.g., PDF) output slightly higher
than i tend to.  No doubt, both priorities have some merit.

I don't deny that writing good manual pages requires some understanding
of the roff(7) language, no matter which macro set you use, even mdoc(7),
and the required understanding goes beyond understanding that request
and macro lines start with a dot and text lines don't.  The mandoc mdoc(7)
manual says as much near the top, in the fifth paragraph of the
description.

As a consequence of the slightly different priorities described above,
we might (or might not) disagree about whether certain low-level roff
best practices should be recommended for manual pages, too, but i don't
see evidence of substantial disagreement in that respect yet.
I certainly don't object if people write "e.g.\&", though i might
not try to train less prolific authors to pay attention to that.
I claim that \c is not needed (and should not be used) in mdoc(7),
but you didn't deny that.  I claim that man(7) requires more low-level
roff constructs than mdoc(7), and you didn't deny that.  So the fact
that \c is occasionally used in man(7), in particular in pages
written by unusually skilled typesetters, as in the groff manuals
themselves, does not seem all that surprising to me, even though i
don't like it.  The fact that docbook autogeneration spews it
surprises me even less.  No matter how you judge \c usage in man(7),
docbook is no doubt notorious for being the man(7) autogenerator that
produces the lowest quality and least portable man(7) output available
anywhere on the market, even though generators producing acceptable
quality do exist, for example pod2man(1) - not all of that is pretty,
either, but it works quite reliably.

Ralph Corderoy wrote on Wed, Apr 26, 2017 at 03:50:26PM +0100:
> Ingo wrote:

>> INSIDE manual pages, - for \(hy or \- for \(mi is a terrible idea
>> already now because the three main implementations (including groff)
>> don't do that in the quite important -Tutf8 device.

> This is because of the bodge to map `-' onto ASCII 45, by Debian
> originally, was it?  Rather than stand firm and map just `\-' and
> tell complainants that the upstream man page needed fixing.

That is an argument to be taken very seriously.  *If* we value high
quality manual page output enough to worry about hyphens in normal
English words being rendered as U+002D HYPHEN-MINUS in text lines,
then we have a second problem to solve (in -Tutf8) in addition to
not having a good way to request U+002D HYPHEN-MINUS where that is
really the character we want.

That leads to a natural suggestion solving *both* of these problems:

 - Make sure *all* output devices (even UTF-8) render - as a
   typographic hyphen (U+2010 HYPHEN) even in manual pages.
   That solves the second problem brought up in this mail.
   Yes, it will require fixing lots of manual pages, but if
   we decide that way, i think i'm willing to try and convince
   the BSD camp and help fix OpenBSD manuals, even though
   i anticipate that will require some work.  That has good
   chances to work out because quite a few people are already
   at least half-aware that the input - is not quite the same
   as an output ASCII HYPHEN-MINUS.

 - Make sure *all* output devices (even PDF) render \- as
   U+002D HYPHEN-MINUS.  That is admittedly a deviation from current
   practice, but it thoroughly solves Ralph's original problem for
   the future, and i think it is close to what \- was originally
   intended to mean in historical roff implementations, so i don't
   view it as trampling on roff traditions.  Also, the visual
   appearance of U+002D HYPHEN-MINUS and U+2212 MINUS SIGN is so
   similar in most fonts that this change is unlikely to cause
   much outrage.

 - Educate people to always use \(mi rather than \- if they want a
   typographic minus sign U+2212 MINUS SIGN (say, in mathematical
   formulas) as opposed to an ASCII minus sign (say, in sample C code).

 - Nothing changes for \(en and \(em.
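
To make the four rules above concrete, here is a minimal sketch of the
proposed input-to-output mapping, written in Python purely as an
illustration.  It is not code from groff, Heirloom, or mandoc; the names
PROPOSED and render are hypothetical, and the sketch ignores everything
a real roff parser has to handle (requests, macros, other escapes).
The code points for \(hy, \(en and \(em are the usual Unicode ones for
those glyph names.

```python
import re

# Proposed rendering on *all* output devices (illustration only).
PROPOSED = {
    "-":    "\u2010",  # plain input hyphen  -> U+2010 HYPHEN
    "\\-":  "\u002d",  # \-                  -> U+002D HYPHEN-MINUS
    "\\(hy": "\u2010", # \(hy                -> U+2010 HYPHEN
    "\\(mi": "\u2212", # \(mi                -> U+2212 MINUS SIGN
    "\\(en": "\u2013", # \(en, unchanged     -> U+2013 EN DASH
    "\\(em": "\u2014", # \(em, unchanged     -> U+2014 EM DASH
}

# Match the named escapes first, then \-, then a bare hyphen.
_TOKEN = re.compile(r"\\\((?:hy|mi|en|em)|\\-|-")


def render(text: str) -> str:
    """Replace each hyphen-like input token by its proposed output glyph."""
    return _TOKEN.sub(lambda m: PROPOSED[m.group(0)], text)


if __name__ == "__main__":
    # A command-line option written with \- stays copy-pasteable ASCII,
    # while the hyphen in an English word becomes a typographic hyphen.
    print(render(r"the well-known option \-l"))
```

Under these rules an author writes \- wherever the reader must be able
to copy and paste an ASCII minus (options, code), \(mi in mathematics,
and a plain - only for hyphens in ordinary words.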

After the above code changes in groff, Heirloom, mandoc, and the
related documentation, and until manual pages adapt, people may
see slightly wrong typography in practice - but none of that will
harm understanding.  The same goes for people using legacy roff
implementations other than the three mentioned, or old versions
of groff.  So incompatibility costs are minor and the transition
paths are more or less benign.

Thoughts?

If people like that direction, i'm even willing to draft patches.

This route is slightly more complicated for authors than my first
simplistic proposal, but i cannot deny Ralph's point that my proposal
harms manual page PDF output, and i can understand if people object
to that: the ability to produce uncompromisingly good typeset output
is among the most famous assets of the system of manual pages as a
whole.

This route is also technically cleaner and easier to implement than
my first simplistic proposal.

In any case, I'd love to have clear, easy-to-understand rules about
what - \- \(hy \(mi mean.  Lots of people have repeatedly asked me
about that, and i have never been able to answer in a very convincing
way.  The current macro set dependency and the absence of U+002D
output in typeset output make this a deplorable mess.

Yours,
  Ingo
