At 2025-08-24T15:50:04+1000, Martin D Kealey wrote:
> I note that much of the documentation still uses a quoting style that
> pretends that characters U+0060 and U+0027 are matching opening and
> closing quotes, and that new documentation is still being added that
> follows this style. For extra credit, they're sometimes redoubled as
> `` and '' to be fake double quotes.

Yes, because that's idiomatic *roff input.  See below.  Input is not
output.  TeX uses the same input convention.  Do you plan to shift the
world's entire corpus of TeX documents as well?

> The use of the grave accent symbol as if it were a quote mark is
> visually asymmetric (ugly!),

Yes.

> has semantic conflicts (including with its use as a shell
> metacharacter),

Yes.

> is in the wrong character class (for line wrapping and hyphenation),

No idea what you're talking about here--maybe Bash's Texinfo manuals,
since this claim is false for groff.

GNU troff by default assigns no more special properties to "`" than it
does the grave accent per se (accessible via the special character
escape sequence `\[ga]`).  (And none of its stock macro files use
`cflags` requests to override this character's defaults, either.)

https://cgit.git.savannah.gnu.org/cgit/groff.git/tree/src/roff/troff/input.cpp?h=1.23.0#n6960

> disregards all formal specifications (Unicode-16.0.0
> (2024) still says "grave accent"),

And most people call "\" a "backslash" rather than a "reverse solidus".
Big deal.  The "formal specifications" upon which you're loading so much
rhetorical freight and frenzied gesticulation are not applicable here.

> and is extremely outdated (ASA X3.4-1963 said "diacritic" 62 years
> ago).

It was used for constructive overstriking on typewriters (and
teleprinting terminals); in that sense, it was indeed a diacritic.  The
ASCII standard of 1963/1968 expected certain code points to do
double-duty as spacing and combining ("diacritic") characters, and the
glyphs of the typeface of the Teletype Corporation Model 37 exhibited
conformance with that expectation.

See, for example, the ascii(7) man page of early Unix manuals.

From what I've seen, the high-flown neutral double quote is a dead
giveaway for Model 37 output.  You see it often in early Unix papers.

http://bitsavers.informatik.uni-stuttgart.de/pdf/att/unix/2nd_Edition/UNIX_Programmers_Manual_2ed_Jun72.pdf

> A more thorough analysis is provided by Markus
> Kuhn <https://www.cl.cam.ac.uk/~mgk25/ucs/quotes.html>.

Markus has long hosted several excellent resources.  Sadly for your
readers, you reflect his erudition poorly.

> GNU is the last serious hold-out, and "this is how we've always done
> it" won't wash any more.

As Sam James noted, you simply don't know what you're talking about
here.  Consider gathering facts before writing intemperate screeds.

> I propose, at minimum, that the U+0060 grave accent be replaced
> wherever it's been misused as an opening quote, but a better result
> would be to replace both, using paired Unicode ‘typographic’ quotes
> where possible.  Wherever redoubled `` '' pairs have been used, they
> should be replaced by the corresponding double quote characters.

I counter-propose that *roff and TeX documents should use the
conventions for eliciting typographer's quotes from an output device
that are supported by the formatting system's documentation.

> Whether to use Unicode ‘quote’ style, or just stick with ASCII 'quote'
> style, depends on context:

It's more than context; these are different problem domains.

> * In HTML documentation, not using typographic quotes lacks any
> reasonable defence: any program that can show HTML can also cope with
> Unicode. Any editor whose keyboard doesn't have typographic quotes can
> type HTML entities instead.

A typesetter or formatter takes care of this.

> * Strings that are compiled into Bash have to be displayable on
> terminals that lack unicode support. Either they need to be written in
> pure ASCII, or the output function needs to replace typographic quotes
> with ASCII ones.  (Consider augmenting gettext() to do the latter as
> its fallback.)

I'll defer to a more seasoned Bash developer to answer this, but GCC,
for instance and contra your implication, does actually use
typographer's quotes in diagnostic messages.  Do you pay any attention
to the behavior of the programs you complain about?  For example, I just
introduced a syntax error in a C++ source file to provoke error
diagnostics from the compiler.

../src/roff/troff/input.cpp: In function ‘void init_hpf_code_table()’:
../src/roff/troff/input.cpp:7975:35: error: expected ‘;’ before ‘}’ token

$ g++ --version | head -n 1
g++ (Debian 10.2.1-6) 10.2.1 20210110
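
A script that prints its own diagnostics could make a comparable choice
from the locale's character map.  The following is only a minimal
sketch: quote_text is a hypothetical helper, not anything GCC or Bash
provides, and treating any non-UTF-8 character map as ASCII-only is a
simplifying assumption.

```shell
# Hypothetical helper: wrap TEXT in quotation marks suited to CHARMAP.
# Usage: quote_text CHARMAP TEXT
quote_text() {
    case $1 in
        UTF-8|utf-8|UTF8|utf8)
            # The character map can represent U+2018 and U+2019.
            printf '‘%s’\n' "$2" ;;
        *)
            # Assumption: anything else gets plain ASCII apostrophes.
            printf "'%s'\n" "$2" ;;
    esac
}

# Typical call: ask the current locale for its character map.
quote_text "$(locale charmap 2>/dev/null)" "foobar"
```

The result of the last line depends on the environment it runs in,
which is the point.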

> * Man pages, info files, and other stuff that gets locale handling can
> use en.UTF-8 as the primary version, and generate C/POSIX (ASCII-only)
> from that.

In the contemporary world, man pages get formatted either by groff or by
OpenBSD's mandoc(1), a tool with a much smaller charter than groff.  At
least in its currently released version, mandoc doesn't appear to honor
your desire.  You can charge at Ingo Schwarze with your jeremiad; I
expect his response would be entertaining.

groff already takes the character repertoire of the output device into
account.

$ printf \`foobar\' | nroff -T ascii | cat -s
`foobar'

$ printf \`foobar\' | nroff -T latin1 | cat -s
`foobar'

$ printf \`foobar\' | nroff -T utf8 | cat -s
‘foobar’
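
To confirm which form a device emitted without trusting how a terminal
font draws the glyphs, the output can be piped through a small
classifier.  The function below is a hypothetical helper, not part of
groff; it only distinguishes the directional quotes U+2018/U+2019 from
the ASCII characters ` and '.

```shell
# Hypothetical helper: report which kind of single quotes appear on stdin.
classify_quotes() {
    input=$(cat)
    case $input in
        *‘*|*’*)   echo "typographic" ;;  # U+2018 or U+2019 present
        *\`*|*\'*) echo "ascii" ;;        # only U+0060 or U+0027 present
        *)         echo "none" ;;
    esac
}

# Intended use (requires groff to be installed):
#   printf \`foobar\' | nroff -T utf8 | classify_quotes
```

On a groff that has not been patched to remap these characters, the
utf8 device would be expected to land in the first branch and the ascii
device in the second, matching the transcripts above.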

Now, that said, Debian and some other distributors of groff
unfortunately cripple man page rendering in this precise respect,[1]
because, at least in Debian's case, the package maintainer gets too many
harangues from people like you who go off half-cocked.  The
groff_man_style(7) page in the forthcoming groff 1.24 release will have
(something similar to) the following Q&A.

groff_man_style(7):
     • When and how should I use quotation marks?

       As noted above in subsection “Font style macros”, apply quotation
       marks to “brief specimens of literal text, such as article
       titles, inline examples, mentions of individual characters or
       short strings, and (sub)section headings in man pages”.  Multi‐
       word literals, such as Unix commands with arguments, when set
       inline (as opposed to displayed between EX and EE), should be
       quoted to ensure that the boundaries of the literal are clear
       even when the material is stripped of font styling by, for
       example, copy‐and‐paste operations.  groff, Heirloom Doctools
       troff, neatroff, and mandoc support all of the special characters
       \[oq], \[cq], \[lq], \[rq], \[aq], and \[dq] described in
       subsection “Portability” above.  DWB, Plan 9, and Solaris 10
       troffs do not.

       Historically, man pages used ` and ' exclusively for directional
       single quotation marks.  However, in recent years, some
       distributors of groff have chosen to override the meanings of
       these characters in man pages, remapping them to their Unicode
       Basic Latin code points.  Unfortunately, ` and ' are the only
       reliable means of obtaining directional single quotation marks in
       AT&T troff; in that implementation, often no special character
       escape sequences exist to obtain them.  Further, AT&T troff’s
       special character identifiers, like its font names, were device‐
       specific.  To achieve quotation portably in man pages rendered
       both by AT&T and more modern troffs, consider adding a preamble
       to your page after the TH call as follows.

              .ie \n(.g \{\
              .  ds oq \[oq]\"
              .  ds cq \[cq]\"
              .\}
              .el \{\
              .  ds oq `\"
              .  ds cq '\"
              .\}

       You must then use the \* escape sequence to interpolate the
       quotation mark strings.

              The command
              .RB \*(oq "while !\& git pull; do sleep 10; done" \*(cq
              retries an update from the repository until it succeeds.

       If this procedure seems complex, petition your distributor to
       revert their remapping of the ` and ' characters.

> * Translations should be encouraged to use their respective
> typographic quoting style: „DE“, »DK«, «FR», ”HE„, „HU”, 『JP』 etc.
> (See https://en.wikipedia.org/wiki/Quotation_mark#Summary_table)
> * Files with monospaced plaintext (CHANGES, HISTORY, etc) - either
> 'ASCII' quotes or ‘Unicode’ quotes depending on what Chet can type.
> * LICENCE/LICENSE - ask the respective licence-holders to provide
> updated versions, or to ratify our "translation" (especially GNU &
> BSD).

I decline to address how Chet maintains plain text files in his
distribution.

> * m4 (aka “where did I put my seppuku blade?”) Add
> changequote(,)changequote(`,')dnl to the start of all documents that
> tacitly assume `', so that this assumption can eventually be
> deprecated.

This request is off-topic for Bash mailing lists; GNU M4 is a separate
project.  You might check out the bug...@gnu.org mailing list.

> * Other stuff - what have I missed?

Good question.  Why don't you go conduct research and only then come
back and _then_ tell us of your findings?

> What do others think?
> 
> -Martin
> 
> PS: arguably I should have started this in coreutils or gnu-policy,

Inarguably you should have gotten your facts in order before making an
aggressively worded proposal.

> but I'm starting here because ` is syntactically significant to Bash,
> so there's extra damage.

See above.

At 2025-08-24T09:03:31+0300, Oğuz wrote:
> How? Is there a portable roff macro that produces them when supported
> and fall back to regular double quotes otherwise?

There's a portable way, yes, but some distributors of groff have broken
it because when people like Martin D. Kealey turn their hand to writing
man pages, they choose to scream rather than learn.  So they produce bad
man pages, and then people like Martin D. Kealey scream.

At 2025-08-24T07:10:30+0100, Sam James wrote:
> Martin D Kealey <mar...@kurahaupo.gen.nz> writes:
> > GNU is the last serious hold-out, and "this is how we've always done
> > it" won't wash any more.
> 
> It is, in fact, not the last serious hold-out at all:
> https://www.gnu.org/prep/standards/standards.html#Quote-Characters.
> 
> I don't recall when it changed other than it being in the last few
> years.

At least a decade.

https://web.archive.org/web/20150104113840/https://www.gnu.org/prep/standards/standards.html#Quote-Characters

Regards,
Branden

[1] https://salsa.debian.org/debian/groff/-/commit/d5394c68d70e6c5199b01d2522e094c8fd52e64e
