Package: ghostscript
Version: 9.55.0~~rc1~dfsg-1
Severity: normal
Tags: upstream fixed-upstream
Ghostscript, e.g. via the ps2pdf wrapper, emits incorrect ToUnicode CMap
entries. This makes text non-searchable and partly unreadable via
pdftotext, and it affects copy-paste too.

A bug was fixed upstream by commit
b4e8434defb8e05ea05bb130b92217290efd2fba (2021-10-25), but due to an
incorrect test, Ghostscript still generates a ToUnicode CMap in the
presence of ligatures such as "fi" and "fl", which TeX Live 2021 maps to
several code points, and this is not supported by Ghostscript.
See discussions:

  https://bugs.ghostscript.com/show_bug.cgi?id=704478
  https://bugs.ghostscript.com/show_bug.cgi?id=704674

In such a case, Ghostscript should not generate a ToUnicode CMap.
The absence of such a CMap is not guaranteed to be correct either,
but in practice, with PDF files generated by TeX Live, the common
PDF consumers can recognize the characters in such files (I am not
aware of any issue).

Testcase chartest5a-tl2021.pdf attached. It was generated with
TeX Live 2021 from the following LaTeX source:

\documentclass[12pt]{article}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{lmodern}
\begin{document}
\thispagestyle{empty}
Test: « don't finite float offer affine ».
\end{document}

This chartest5a-tl2021.pdf file contains the following text:

  Test: « don’t finite float offer affine ».

But when it is converted with ps2pdf (or gs directly), one gets:

  Test: ń donŠt Ąnite Ćoat offer affine ż.

This bug is fixed upstream by commit
8f62213019bc682eeb0ed9467d8841f3770cfda6 (2021-10-29).

Note: this does not fix all pdfwrite bugs concerning the ToUnicode CMap.
For additional issues, see

  https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=995392

Note also that the Š above happens to be due to such an additional
issue. In this particular case, however, the fix no longer generates
a ToUnicode CMap, which has the effect of making this issue disappear
here.
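To illustrate what "mapped to several code points" means in a ToUnicode
CMap, here is a minimal Python sketch. It assumes that the CMap stream has
already been extracted from the PDF file and decompressed into text. The
sample CMap, the character codes 0x1C and 0x1D, and the function name
multi_codepoint_entries are made up for illustration; they are not taken
from the attached test file or from Ghostscript's code. Only bfchar
sections with well-formed, even-length hex strings are handled.

```python
import re

# A "beginbfchar ... endbfchar" section holds "<src> <dst>" pairs, where
# <dst> is the target Unicode text encoded as UTF-16BE in hexadecimal.
_BFCHAR_SECTION = re.compile(r"beginbfchar(.*?)endbfchar", re.DOTALL)
_PAIR = re.compile(r"<([0-9A-Fa-f]+)>\s*<([0-9A-Fa-f]+)>")


def multi_codepoint_entries(cmap_text):
    """Return (source_code, unicode_text) for every bfchar entry whose
    destination decodes to more than one Unicode code point."""
    found = []
    for section in _BFCHAR_SECTION.findall(cmap_text):
        for src, dst in _PAIR.findall(section):
            text = bytes.fromhex(dst).decode("utf-16-be")
            # len() counts code points, so a surrogate pair counts as one.
            if len(text) > 1:
                found.append((int(src, 16), text))
    return found


# Hypothetical CMap fragment (assumption, not from chartest5a-tl2021.pdf):
# two ligature glyphs mapped to two code points each, one plain letter.
SAMPLE_CMAP = """\
3 beginbfchar
<1C> <00660069>
<1D> <0066006C>
<54> <0054>
endbfchar
"""

if __name__ == "__main__":
    for code, text in multi_codepoint_entries(SAMPLE_CMAP):
        print(f"0x{code:02X} -> {text!r}")
```

On the sample, this reports the entries for 0x1C ('fi') and 0x1D ('fl'),
i.e. the two entries whose destination is a sequence of two code points
rather than a single one.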
-- System Information:
Debian Release: bookworm/sid
  APT prefers unstable-debug
  APT policy: (500, 'unstable-debug'), (500, 'stable-updates'), (500, 'stable-security'), (500, 'unstable'), (500, 'testing'), (500, 'stable'), (1, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 5.14.0-3-amd64 (SMP w/8 CPU threads)
Kernel taint flags: TAINT_PROPRIETARY_MODULE, TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
Locale: LANG=POSIX, LC_CTYPE=C.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages ghostscript depends on:
ii  libc6   2.32-4
ii  libgs9  9.55.0~~rc1~dfsg-1

ghostscript recommends no packages.

Versions of packages ghostscript suggests:
ii  ghostscript-x  9.55.0~~rc1~dfsg-1

-- no debconf information

--
Vincent Lefèvre <vinc...@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
chartest5a-tl2021.pdf
Description: Adobe PDF document