On 11/3/25 4:10 PM, Jared Hall via users wrote:
On 11/2/2025 9:14 AM, Matus UHLAR - fantomas wrote:

extracttext_external    pdftotext       /usr/bin/pdftotext -nopgbrk -layout -enc UTF-8 {} -
extracttext_use         pdftotext       .pdf application/pdf

Yes. Using that syntax exactly.
I am not sure whether a PDF can carry a link the way HTML does, or what it
would extract:

<a href="http://example.com">see this link</a>
You are correct.  ExtractText works well with visible text.  It does not seem to
pick up anchored URL references.

fwiw Mail::SpamAssassin::Plugin::PDFInfo has some code to extract anchored URL 
references.
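
For illustration only: a PDF link annotation keeps its target in a /URI entry
of an action dictionary, separate from the page's visible text, which is why
text extraction alone does not surface it. Below is a rough Python sketch of
the idea. It is not PDFInfo's actual code; the function name and the sample
bytes are made up, and it assumes uncompressed objects with simple
literal-string URIs (no escaped or nested parentheses).

```python
import re

# Matches "/URI (...)" entries as found in uncompressed PDF action dictionaries.
# Assumption: the URI is a literal string without escaped or nested parentheses.
URI_RE = re.compile(rb"/URI\s*\(([^()\\]*)\)")

def find_link_uris(pdf_bytes):
    """Return link targets found in raw, uncompressed PDF bytes."""
    return [m.group(1).decode("latin-1") for m in URI_RE.finditer(pdf_bytes)]

# Hypothetical fragment of a link annotation, not taken from a real file.
sample = b"<< /Type /Annot /Subtype /Link /A << /S /URI /URI (http://example.com) >> >>"
print(find_link_uris(sample))
```

Objects stored inside compressed streams would have to be inflated first, so
a scan like this can miss links in many real-world PDFs.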

 Cheers
  Giovanni
