On 11/3/2025 11:35 AM, [email protected] wrote:
On 11/3/25 4:10 PM, Jared Hall via users wrote:
On 11/2/2025 9:14 AM, Matus UHLAR - fantomas wrote:
extracttext_external pdftotext /usr/bin/pdftotext -nopgbrk -layout -enc UTF-8 {} -
extracttext_use pdftotext .pdf application/pdf
Yes. Using that syntax exactly.
I am not sure if it's possible to have a PDF link like in HTML, or
what it would extract from it:
<a href="http://example.com">see this link</a>
You are correct. ExtractText works well with visible text. It does
not seem to pick up anchored URL references.
fwiw Mail::SpamAssassin::Plugin::PDFInfo has some code to extract
anchored URL references.
Yes, I see that $pms->add_uri_detail_list($location); runs by default in
PDFInfo.pm.
Just loaded the plugin in v341.pre. Works fine now.
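For anyone following along, the change amounts to roughly the line below. This is a sketch that assumes a stock v341.pre, where the PDFInfo loadplugin line ships commented out; paths and packaging may differ on your system.

```
# v341.pre (assumed stock layout: this line is present but commented out)
# Removing the leading "#" loads the plugin.
loadplugin Mail::SpamAssassin::Plugin::PDFInfo
```

Running "spamassassin --lint" afterwards and restarting whatever daemon calls SpamAssassin should confirm the config still parses.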
Thanks,
-- Jared Hall
[email protected]