Package: ghostscript
Version: 10.04.0~dfsg-1
Severity: important
X-Debbugs-Cc: [email protected]
Dear Maintainer,
The txtwrite device in the most recent upload of ghostscript is broken;
this causes src:plastex to FTBFS as it uses the device in its test
suite.
The following is a simple reproducer based on a unit test from plastex.
(The necessary test3.pdf is also attached).
/-----
$ cat test3.tex
\documentclass{article}
\begin{document}
\thispagestyle{empty}
a.b
\end{document}
$ lualatex test3
[...]
Output written on test3.pdf (1 page, 2719 bytes).
### In bookworm
$ gs --version
10.00.0
$ gs -q -sDEVICE=txtwrite -o %stdout% test3.pdf
a.b
### In sid
$ gs --version
10.04.0
$ gs -q -sDEVICE=txtwrite -o %stdout% test3.pdf
X#
-----/
The tool pdftotext from poppler-utils also correctly extracts the text
from the test file.
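The comparison above can be automated. The following is only a minimal sketch, assuming that gs and pdftotext are both on PATH and that comparing whitespace-normalized text is good enough; the helper names are made up for illustration and are not part of plastex or ghostscript.

```python
import subprocess
import sys


def gs_txtwrite_cmd(pdf_path):
    """Build the same gs invocation as in the reproducer."""
    return ["gs", "-q", "-sDEVICE=txtwrite", "-o", "%stdout%", pdf_path]


def pdftotext_cmd(pdf_path):
    """pdftotext writes to stdout when the output file is '-'."""
    return ["pdftotext", pdf_path, "-"]


def normalize(text):
    """Collapse all whitespace (including form feeds) so layout is ignored."""
    return " ".join(text.split())


def outputs_agree(gs_text, reference_text):
    """True if both extractors produced the same text, ignoring whitespace."""
    return normalize(gs_text) == normalize(reference_text)


def extract(cmd):
    """Run an extractor and return its stdout as text."""
    result = subprocess.run(
        cmd,
        capture_output=True,
        encoding="utf-8",
        errors="replace",  # tolerate undecodable bytes in broken output
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    pdf = sys.argv[1] if len(sys.argv) > 1 else "test3.pdf"
    gs_text = extract(gs_txtwrite_cmd(pdf))
    ref_text = extract(pdftotext_cmd(pdf))
    ok = outputs_agree(gs_text, ref_text)
    print("txtwrite: ", repr(normalize(gs_text)))
    print("pdftotext:", repr(normalize(ref_text)))
    sys.exit(0 if ok else 1)
```

Run against the attached test3.pdf, a script like this should exit non-zero on an affected ghostscript, since "X#" and "a.b" differ even after normalization.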
This problem does not extend to all PDFs; in fact it seems to be
confined to PDFs generated by lualatex while pdflatex is OK.
Unfortunately, for modern fonts and UTF-8, users are encouraged to use
lualatex these days, and the plastex test suite does so. As seen below,
lualatex picks different fonts and encodes them differently - that seems
to be what ghostscript is getting wrong.
$ pdffonts test3-lualatex.pdf
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
VFSMBO+LMRoman10-Regular             CID Type 0C       Identity-H       yes yes yes      4  0
$ pdffonts test3-pdftex.pdf
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
ZKXRNQ+CMR10                         Type 1            Builtin          yes yes yes      4  0
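To check which other PDFs might be affected, pdffonts rows like the two above can be screened for CID fonts with an Identity encoding. This is a rough sketch, not a confirmed diagnosis: it assumes each row is on a single line, that the font name contains no spaces, and that the CID/Identity-H combination is what triggers the problem, which is only my guess from the two files compared here. The function names are hypothetical.

```python
def parse_pdffonts_row(line):
    """Split one pdffonts data row into named fields.

    Columns are read from the right, because the 'type' column
    itself contains spaces (e.g. 'CID Type 0C', 'Type 1').
    """
    tokens = line.split()
    if len(tokens) < 8:
        raise ValueError("not a pdffonts data row: %r" % line)
    return {
        "name": tokens[0],
        "type": " ".join(tokens[1:-6]),
        "encoding": tokens[-6],
        "emb": tokens[-5],
        "sub": tokens[-4],
        "uni": tokens[-3],
        "object_id": (int(tokens[-2]), int(tokens[-1])),
    }


def looks_like_identity_cid(row):
    """Heuristic only: CID font using an Identity-H/Identity-V encoding."""
    return row["type"].startswith("CID") and row["encoding"].startswith("Identity-")


if __name__ == "__main__":
    rows = [
        "VFSMBO+LMRoman10-Regular CID Type 0C Identity-H yes yes yes 4 0",
        "ZKXRNQ+CMR10 Type 1 Builtin yes yes yes 4 0",
    ]
    for raw in rows:
        row = parse_pdffonts_row(raw)
        flag = "suspect" if looks_like_identity_cid(row) else "ok"
        print(row["name"], "->", flag)
```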
regards
Stuart
test3.pdf
Description: Adobe PDF document
