On 2026-05-29 16:34, Pádraig Brady wrote:
> On 29/05/2026 23:51, Bruno Haible via Gnulib discussion list wrote:
>> Other purposes that I missed?
>
> Avoid printing shell meta-characters,
> which may cause side effects if pasted back to a shell command.
Also, avoid over-quoting, i.e., quoting that is unnecessary and
therefore makes strings harder for humans to interpret.
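
To make the tension between these two goals concrete, here is a minimal Python sketch. It uses the standard library's shlex.quote as a stand-in for a shell-quoting routine (it is not Gnulib's quoting code), and the helper name needs_quoting is made up for illustration. Strings made only of characters that shlex considers safe are left alone, while strings with shell meta-characters get single-quoted.

```python
import shlex

def needs_quoting(arg: str) -> bool:
    # Hypothetical helper: shlex.quote returns its argument unchanged
    # when every character is in its "safe" set, so a changed result
    # means quoting was required.
    return shlex.quote(arg) != arg

for arg in ["report.txt", "a b", "$(reboot)", "x;y"]:
    # Safe names print as-is; anything with meta-characters is quoted.
    print(f"{arg!r:14} -> {shlex.quote(arg)}")
```

A quoting routine that wrapped "report.txt" in quotes as well would still be safe to paste, but it would be over-quoting in the sense above.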
Also, quote correctly, i.e., so that a parser for a particular notation
will recover the original string unchanged, byte-for-byte.
As a corollary, do so portably, i.e., cater to idiosyncrasies of target
parsers for the notation, even if they may have bugs. (In some cases, I
expect, it's impossible to quote correctly and portably.)
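
The correctness goal can be phrased as a round-trip property: parse(quote(s)) must give back s. The sketch below checks that property in Python, assuming shlex.split is an acceptable approximation of a POSIX shell's word parser (it is not a real shell, so it says nothing about the idiosyncrasies or bugs of actual target parsers). The function name round_trips and the sample strings are illustrative only.

```python
import shlex

def round_trips(s: str) -> bool:
    # Quote the string, parse the quoted form back, and require that
    # exactly one word comes out and that it equals the original.
    return shlex.split(shlex.quote(s)) == [s]

samples = [
    "plain",
    "two words",
    "it's",            # embedded single quote
    'say "hi"',        # embedded double quotes
    "tab\tand\nnewline",
    "back\\slash",
    "$HOME `id` *",    # expansion and glob characters
]

for s in samples:
    print(f"{s!r:24} round-trips: {round_trips(s)}")
```

A portability check would run the same property against each real parser of interest rather than against a single reference implementation.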
Also, be able to quote strings correctly even if they contain NUL bytes
or encoding errors.
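
As an illustration of why this goal depends on the notation: shell arguments are NUL-terminated C strings, so plain single-quoting cannot carry a NUL byte, whereas an escape-based notation can. The following Python sketch uses a deliberately simple C-like notation (backslash doubled, every other byte outside printable ASCII written as a three-digit octal escape). The names c_escape and c_unescape and the notation itself are assumptions made for this example, not any existing quoting style.

```python
def c_escape(data: bytes) -> str:
    # Work on bytes, not decoded text, so NUL bytes and invalid UTF-8
    # sequences are handled like any other byte.
    out = []
    for b in data:
        if b == 0x5C:                 # backslash
            out.append("\\\\")
        elif 0x20 <= b < 0x7F:        # printable ASCII
            out.append(chr(b))
        else:                         # fixed-width octal escape
            out.append(f"\\{b:03o}")
    return "".join(out)

def c_unescape(text: str) -> bytes:
    # Inverse of c_escape; assumes well-formed input from c_escape.
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] != "\\":
            out.append(ord(text[i]))
            i += 1
        elif text[i + 1] == "\\":
            out.append(0x5C)
            i += 2
        else:
            out.append(int(text[i + 1:i + 4], 8))
            i += 4
    return bytes(out)

raw = b"a\x00b\xff\\n"                # NUL byte, invalid UTF-8, backslash
quoted = c_escape(raw)
print(quoted)
print(c_unescape(quoted) == raw)
```

The fixed three-digit width is what keeps an escape from being misread when the next byte happens to be a digit.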
It'd also be helpful to mention which goals are the most important,
e.g., correctness is more important than avoiding over-quoting.
I've probably forgotten some goals....
>> * Avoid printing characters that are "unprintable" in the terminal's
>> encoding. If that encoding is not UTF-8, they would typically leave
>> no traces on screen, thus end up presenting misleading results.
Although this is nice to have, it's not possible to avoid misleading
displays in general even if we assume UTF-8, because many terminals
swallow some Unicode characters with no visible effects (which they are
required to do for characters like U+200B ZERO WIDTH SPACE). And even if
they didn't silently swallow characters, unprintable characters are just
one part of the more general problem of misleading display, which also
includes confusables (distinct characters that look alike).
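
A small Python sketch of this limitation, assuming str.isprintable is used as the test for "unprintable" (the function name escape_invisible is made up, and the output is meant for display only, not for parsing back). Escaping catches U+200B ZERO WIDTH SPACE, but a string with a Cyrillic look-alike letter passes through untouched.

```python
import unicodedata

def escape_invisible(text: str) -> str:
    # Replace characters that Python classifies as non-printable
    # (Unicode categories "Other" and "Separator", except ASCII space)
    # with a visible \uXXXX or \UXXXXXXXX escape.
    out = []
    for ch in text:
        if ch.isprintable():
            out.append(ch)
        elif ord(ch) <= 0xFFFF:
            out.append(f"\\u{ord(ch):04x}")
        else:
            out.append(f"\\U{ord(ch):08x}")
    return "".join(out)

zero_width = "ab\u200bc"          # contains ZERO WIDTH SPACE
confusable = "p\u0430ypal"        # U+0430 CYRILLIC SMALL LETTER A

print(unicodedata.category("\u200b"))   # format character category
print(escape_invisible(zero_width))     # the hidden character is exposed
print(escape_invisible(confusable) == confusable)  # nothing is escaped
print(confusable == "paypal")           # yet it is a different string
```

Detecting the second case needs a confusables table or a script-mixing check, which is a separate mechanism from escaping unprintable characters.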