Package: xpdf
Version: 3.04+git20220601-1+b2
Severity: normal

Text search misses hyphenated words. For instance, on the attached PDF
file, search for "thatmay" or "verylong".
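
To illustrate the expected matching, here is a minimal Python sketch. It is not xpdf's or poppler's code: the helper names and the sample text are made up for illustration, and it assumes the hyphens are ordinary "-" characters at line ends.

```python
import re

# Assumption: a word split across lines appears as "<letters>-\n<letters>".
# Lookarounds are used so that consecutive splits are all handled.
_LINE_END_HYPHEN = re.compile(r"(?<=\w)-\n(?=\w)")


def dehyphenate(text: str) -> str:
    """Join words that were split by a hyphen at the end of a line."""
    return _LINE_END_HYPHEN.sub("", text)


def search(text: str, needle: str) -> bool:
    """Hypothetical search helper: match against the dehyphenated text."""
    return needle in dehyphenate(text)


# Made-up sample text, not the content of hyphens.pdf.
sample = "This is a very-\nlong word split across two lines."

print("verylong" in sample)        # False: the raw text still has "-\n"
print(search(sample, "verylong"))  # True: found once the hyphen is removed
```

Note that such a naive join would also merge genuine compounds that happen to break at a line end (e.g. "well-" / "known"), so a real implementation would probably need to match both forms.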
This issue doesn't occur with the text produced by "pdftotext": the
hyphens do not appear in the output.

-- System Information:
Debian Release: trixie/sid
  APT prefers unstable-debug
  APT policy: (500, 'unstable-debug'), (500, 'stable-updates'), (500, 'stable-security'), (500, 'stable-debug'), (500, 'proposed-updates-debug'), (500, 'unstable'), (500, 'testing'), (500, 'stable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 6.5.0-5-amd64 (SMP w/12 CPU threads; PREEMPT)
Kernel taint flags: TAINT_PROPRIETARY_MODULE, TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
Locale: LANG=POSIX, LC_CTYPE=C.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages xpdf depends on:
ii  libc6          2.37-13
ii  libgcc-s1      13.2.0-8
ii  libpaper1      1.1.29
ii  libpoppler126  22.12.0-2+local1
ii  libstdc++6     13.2.0-8
ii  libx11-6       2:1.8.7-1
ii  libxm4         2.3.8-3
ii  libxt6         1:1.2.1-1.1

Versions of packages xpdf recommends:
ii  cups-bsd        2.4.7-1
pn  gsfonts-x11     <none>
ii  poppler-data    0.4.12-1
ii  poppler-utils   22.12.0-2+local1
ii  sensible-utils  0.0.20

xpdf suggests no packages.

-- no debconf information

-- 
Vincent Lefèvre <[email protected]> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
hyphens.pdf
Description: Adobe PDF document

