Package: groff
Version: 1.22.4-8
Severity: normal

When using -k on a file which contains a single non-ASCII UTF-8 character, preconv misdetects the text as some other encoding, even though the locale in use is UTF-8. Since UTF-8 is nearly universally used for text files on Unix, this leads to bizarre behaviour and misencodings.

For example, given the first file below, groff prints a warning and then proceeds to insert an incorrect character. However, when a second UTF-8 character is included, the file works.

My recommendation here is that when detecting character sets, if the data is valid UTF-8, then UTF-8 be used as the encoding. The uchardet detection of "MAC-CENTRALEUROPE" may be acceptable for some web pages, where encoding can be specified explicitly at the HTTP level, but it is not a prudent choice for documents on Debian (which has never supported this as a valid system encoding) in 2022. I very much doubt this would be a prudent encoding on macOS in 2022, either, which, as I understand it, has used UTF-8 exclusively since 10.0, released over two decades ago.

Command line:

LC_ALL=fr_CA.UTF-8 groff -Tps -dpaper=com10l -P-pcom10 -P-l -k envelope.me >envelope.ps

broken
----
.nf
.po 0.5c
.sp 0.5c
.ft P
Toronto City Hall
100 Queen Street W
Toronto ON M5H 2N2
Canada
.sp 2c
.in 8.5c
New York City Hall
1 City Hall
New York NY 10007-1298
États-Unis
----

working
----
.nf
.po 0.5c
.sp 0.5c
.ft P
Hôtel de Ville de Toronto
100 Rue Queen O
Toronto ON M5H 2N2
Canada
.sp 2c
.in 8.5c
New York City Hall
1 City Hall
New York NY 10007-1298
États-Unis
----

-- System Information:
Debian Release: bookworm/sid
  APT prefers stable-security
  APT policy: (500, 'stable-security'), (500, 'unstable'), (500, 'stable'), (500, 'oldstable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 5.15.0-3-amd64 (SMP w/8 CPU threads)
Kernel taint flags: TAINT_WARN
Locale: LANG=fr_FR.UTF-8, LC_CTYPE=en_CA.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages groff depends on:
ii  groff-base    1.22.4-8
ii  libc6         2.33-7
ii  libgcc-s1     12-20220319-1
ii  libstdc++6    12-20220319-1
ii  libx11-6      2:1.7.5-1
ii  libxaw7       2:1.0.14-1
ii  libxmu6       2:1.1.3-3
ii  libxt6        1:1.2.1-1

Versions of packages groff recommends:
ii  ghostscript                     9.56.0~dfsg-1
ii  imagemagick                     8:6.9.11.60+dfsg-1.3+b2
ii  imagemagick-6.q16 [imagemagick] 8:6.9.11.60+dfsg-1.3+b2
ii  libpaper1                       1.1.28+b1
ii  netpbm                          2:10.97.00-2
ii  perl                            5.34.0-3
ii  psutils                         1.17.dfsg-4

groff suggests no packages.

-- no debconf information

--
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA
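
For illustration only, the recommendation above (prefer UTF-8 whenever the input is valid UTF-8, and only otherwise trust the detector) could be sketched roughly as follows. This is a hypothetical Python sketch, not preconv's actual code: the names is_valid_utf8 and choose_encoding and the detector_guess parameter are invented here, and detector_guess merely stands in for whatever uchardet would report.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as strict UTF-8."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


def choose_encoding(data: bytes, detector_guess: str) -> str:
    """Hypothetical policy: valid UTF-8 always wins over the detector's guess.

    detector_guess is an assumed stand-in for the name a statistical
    detector (such as uchardet) would return for the same bytes.
    """
    if is_valid_utf8(data):
        return "UTF-8"
    return detector_guess


# The "broken" file: a single non-ASCII character, encoded as UTF-8.
broken = "New York NY 10007-1298\nÉtats-Unis\n".encode("utf-8")
print(choose_encoding(broken, "MAC-CENTRALEUROPE"))

# The same text in Latin-1 is not valid UTF-8, so the guess is kept.
latin1 = "États-Unis\n".encode("latin-1")
print(choose_encoding(latin1, "ISO-8859-1"))
```

Under this policy the detector's guess is only consulted for input that cannot be UTF-8, so a file with a single accented character would no longer be reinterpreted as a legacy encoding.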