On 30/05/2026 01:30, Paul Eggert wrote:
> On 2026-05-29 16:34, Pádraig Brady wrote:
>> On 29/05/2026 23:51, Bruno Haible via Gnulib discussion list wrote:
>>> * Avoid printing characters that are "unprintable" in the terminal's
>>>   encoding. If that encoding is not UTF-8, they would typically leave
>>>   no traces on screen, thus end up presenting misleading results.
>>
>> Related to the above is terminals adjusting encodings,
>> like xterm always converting to unicode composed form,
>> thus causing copy and paste issues when referencing files.
>
> Although this is nice to have, it's not possible to avoid misleading
> displays in general even if we assume UTF-8, because many terminals
> swallow some Unicode characters with no visible effects (which they are
> required to do for characters like U+200B ZERO WIDTH SPACE). And even if
> they didn't silently swallow characters, this is just a subset of the
> more-general problem, a problem that also includes confusables.

Yes, I'd probably avoid mentioning UTF-8 at all,
and instead state that with the C locale, output should be unambiguous
given the various encoding incompatibilities (mentioned above).
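
To illustrate what "unambiguous" could mean here, a minimal sketch in
Python. This is only an assumption for discussion, not what any existing
tool does: the function name escape_for_c_locale and the choice of octal
escapes are hypothetical. It passes printable ASCII through and escapes
every other byte, so invisible characters and composed/decomposed
variants can no longer look identical on screen.

```python
def escape_for_c_locale(name: bytes) -> str:
    """Render a byte string using only printable ASCII (hypothetical helper)."""
    out = []
    for b in name:
        if b == 0x5C:
            # Double the backslash so the escaping stays reversible.
            out.append("\\\\")
        elif 0x20 <= b <= 0x7E:
            # Printable ASCII passes through unchanged.
            out.append(chr(b))
        else:
            # Control bytes and all non-ASCII bytes become octal escapes.
            out.append(f"\\{b:03o}")
    return "".join(out)


# U+200B ZERO WIDTH SPACE is normally invisible; escaped, it shows up.
print(escape_for_c_locale("a\u200bb".encode("utf-8")))   # a\342\200\213b

# Composed and decomposed forms of the same letter become distinguishable.
print(escape_for_c_locale("\u00e9".encode("utf-8")))     # \303\251
print(escape_for_c_locale("e\u0301".encode("utf-8")))    # e\314\201
```

Working on bytes rather than decoded characters sidesteps the question of
which encoding the terminal uses, at the cost of less readable output for
non-ASCII names.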
cheers,
Padraig