On 11/3/25 4:10 PM, Jared Hall via users wrote:
> On 11/2/2025 9:14 AM, Matus UHLAR - fantomas wrote:
>> extracttext_external pdftotext /usr/bin/pdftotext -nopgbrk -layout -enc UTF-8 {} -
>> extracttext_use pdftotext .pdf application/pdf
>
> Yes. Using that syntax exactly.
>
>> I am not sure if it's possible to have a PDF link like in HTML and what it
>> would extract:
>>
>> <a href="http://example.com">see this link</a>
>
> You are correct. ExtractText works well with visible text. It does not
> seem to pick up anchored URL references.
FWIW, Mail::SpamAssassin::Plugin::PDFInfo has some code to extract anchored URL references.

Cheers
 Giovanni
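To illustrate what an "anchored URL reference" looks like inside a PDF: link annotations usually carry a URI action of the form `/A << /S /URI /URI (http://example.com) >>`, which is separate from the page text that pdftotext prints. The Python sketch below is only an illustration of that idea and is not taken from PDFInfo. The function name `extract_link_uris` is hypothetical, and the sketch assumes the annotation objects are stored uncompressed; URIs inside compressed object streams, hex-encoded strings, or strings with unescaped nested parentheses would be missed.

```python
import re

# Matches a URI action entry such as: /URI (http://example.com)
# The group accepts backslash-escaped bytes inside the PDF literal string.
URI_RE = re.compile(rb"/URI\s*\(((?:\\.|[^\\()])*)\)")


def extract_link_uris(pdf_bytes: bytes) -> list[str]:
    """Return URI targets found in uncompressed link annotations.

    Hypothetical helper for illustration; assumes the /URI entries are
    visible in the raw bytes (not inside compressed object streams).
    """
    uris = []
    for match in URI_RE.finditer(pdf_bytes):
        raw = match.group(1)
        # Undo the escapes PDF uses for parentheses and backslashes.
        raw = re.sub(rb"\\([()\\])", rb"\1", raw)
        uris.append(raw.decode("latin-1"))
    return uris


if __name__ == "__main__":
    sample = (
        b"<< /Type /Annot /Subtype /Link "
        b"/A << /S /URI /URI (http://example.com) >> >>"
    )
    print(extract_link_uris(sample))
```

A scan like this only sees what is readable in the raw file, so a real implementation would need to decode the PDF's streams first.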
