Package: xpdf
Version: 3.04+git20220601-1+b2
Severity: normal

Text search misses hyphenated words. For instance, in the attached
PDF file, a search for "thatmay" or "verylong" fails.

This issue doesn't occur with the text produced by "pdftotext": the
hyphens do not appear in the output.
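
To illustrate the expected behaviour, here is a minimal sketch of a
search that joins hyphenated line breaks before matching. It is not the
actual xpdf or poppler code. The dehyphenate() helper and the sample
text are assumptions made for illustration, and the real contents of
hyphens.pdf are not reproduced here.

```python
import re


def dehyphenate(text: str) -> str:
    """Join words split by a hyphen at a line end: 'very-\\nlong' -> 'verylong'.

    Hypothetical helper, not part of xpdf or poppler.
    """
    # Drop a hyphen followed by a newline when it sits between two
    # word characters.
    return re.sub(r"(\w)-\n(\w)", r"\1\2", text)


# Hypothetical page text, standing in for the extracted text of a PDF page.
page = "a very-\nlong word that-\nmay be split"

print("verylong" in page)               # False: the raw text still has the hyphen
print("verylong" in dehyphenate(page))  # True: the joined word is found
```

This naive rule also joins compounds that legitimately keep their hyphen
at a line end, so a real implementation would likely need a more careful
heuristic.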

-- System Information:
Debian Release: trixie/sid
  APT prefers unstable-debug
  APT policy: (500, 'unstable-debug'), (500, 'stable-updates'), (500, 'stable-security'), (500, 'stable-debug'), (500, 'proposed-updates-debug'), (500, 'unstable'), (500, 'testing'), (500, 'stable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 6.5.0-5-amd64 (SMP w/12 CPU threads; PREEMPT)
Kernel taint flags: TAINT_PROPRIETARY_MODULE, TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
Locale: LANG=POSIX, LC_CTYPE=C.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages xpdf depends on:
ii  libc6          2.37-13
ii  libgcc-s1      13.2.0-8
ii  libpaper1      1.1.29
ii  libpoppler126  22.12.0-2+local1
ii  libstdc++6     13.2.0-8
ii  libx11-6       2:1.8.7-1
ii  libxm4         2.3.8-3
ii  libxt6         1:1.2.1-1.1

Versions of packages xpdf recommends:
ii  cups-bsd        2.4.7-1
pn  gsfonts-x11     <none>
ii  poppler-data    0.4.12-1
ii  poppler-utils   22.12.0-2+local1
ii  sensible-utils  0.0.20

xpdf suggests no packages.

-- no debconf information

-- 
Vincent Lefèvre <[email protected]> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

Attachment: hyphens.pdf
Description: Adobe PDF document
