Update of bug #55042 (project groff):
Status: None => Invalid
Assigned to: None => gbranden
Open/Closed: Open => Closed
_______________________________________________________
Follow-up Comment #1:
I don't believe this report is correct.
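preconv in groff 1.22.4 appears to map the no-break space correctly. In a
UTF-8 environment, however, it is important to remember that this character
does not have a single-byte representation; it is encoded as the two bytes
0xC2 0xA0 (octal 302 240), whereas in Latin-1 it is the single byte 0xA0.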
$ printf '\302\240\n' | preconv
.lf 1 -
\[u00A0]
$ printf '\240\n' | preconv -e latin-1
.lf 1 -
\[u00A0]
Notice what happens if we try to feed a Latin-1-encoded no-break space to
preconv without giving the program a hint about the input encoding:
$ printf '\240\n' | preconv
.lf 1 -
\[uFFFD]
We get the Unicode replacement character, as documented in the man page.
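To see why the unhinted case fails, it may help to look at the bytes
themselves. The following is only a sketch using printf(1), od(1), and tr(1);
the variable names are mine and purely illustrative.

```shell
# U+00A0 encoded as UTF-8: two bytes, 0xC2 0xA0.
utf8_nbsp=$(printf '\302\240' | od -An -tx1 | tr -d ' \n')

# U+00A0 encoded as Latin-1: one byte, 0xA0.
latin1_nbsp=$(printf '\240' | od -An -tx1 | tr -d ' \n')

echo "UTF-8:   $utf8_nbsp"
echo "Latin-1: $latin1_nbsp"
```

A lone 0xA0 byte is a UTF-8 continuation byte and is not valid on its own,
which is presumably why preconv substitutes U+FFFD when no encoding hint is
given.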
I could not make sense of the example input. Every character in the following
is US-ASCII, so preconv performs no transformations on it. Further, the
purpose of the commented-out line is not clear to me.
.pl 3v
A\[char161]B\[char160]C
.br
.\".tr \[u00A0]\~
A\[u00A1]B\[u00A0]C
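One way to check that the input above is pure US-ASCII is to delete every
byte in the range 0-127 and count what is left. This is a sketch that assumes
the five lines are exactly as they appear in this message, which may not match
what was originally typed; the variable name is illustrative.

```shell
# Emit the five example lines verbatim (%s does not interpret backslashes),
# strip all 7-bit bytes, and count whatever remains.
non_ascii=$(printf '%s\n' \
    '.pl 3v' \
    'A\[char161]B\[char160]C' \
    '.br' \
    '.\".tr \[u00A0]\~' \
    'A\[u00A1]B\[u00A0]C' |
    LC_ALL=C tr -d '\000-\177' | wc -c | tr -d ' ')

echo "non-ASCII bytes: $non_ascii"
```

A count of zero means there is nothing in the stream for preconv to convert.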
Closing, as preconv appears to be working as documented.
Feel free to reopen if a minimal reproducing case of erroneous behavior can be
established. Please use printf(1) or similar so that it is precisely clear
what the input stream consists of; I don't trust Savannah's web interface to
not mangle whitespace characters, especially exotic ones outside of US-ASCII.
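For instance, a report could show the bytes of its input alongside the command
that was run. The input below is hypothetical, not taken from the original
report, and the variable name is illustrative only.

```shell
# Hypothetical reproducer input: "A", Latin-1 inverted exclamation mark (0xA1),
# "B", Latin-1 no-break space (0xA0), "C", newline.
input_hex=$(printf 'A\241B\240C\n' | od -An -tx1 | tr -d ' \n')

echo "$input_hex"
```

The same printf command can then be piped to "preconv -e latin-1" in place of
od, so that the input and the observed output are both unambiguous.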
_______________________________________________________
Reply to this item at:
<https://savannah.gnu.org/bugs/?55042>