>I need a rule that can detect urls inside a PDF file. Can you guide me?

do you have SA 4.0 installed and the plugin
Mail::SpamAssassin::Plugin::ExtractText
enabled?

    This is a production server running CentOS7 with spamassassin-3.4.0-6.

On 05.07.22 11:00, Indunil Jayasooriya wrote:
   According to your statement, spamassassin-3.4 will NOT be able to fulfill this job.

hardly. SA 4 should be released soon AFAIK; you may test it, as I do.
I was unable to find any spam with a PDF containing a URL in my archive, but other rules do catch text contained in PDF attachments.

   spamassassin current version is release 3.4.6 (Stable).
   Am I right? Where can I find Spam assassin 4?

   I have NOT installed Mail::SpamAssassin::Plugin::ExtractText.
   I could NOT find any RPM for the above package.

it's in the SA 4 distribution. I have no idea where you can get a working SA 4 package for CentOS, though.


did you install pdftotext and configure SA to use it?

       I installed it with the below command.
           yum install poppler-utils

     How to configure spamassassin to use it?
              Is just a file called rule.cf enough?

no, you must have SA 4 with ExtractText enabled and configure it to use pdftotext, e.g. according to the ExtractText docs:

       loadplugin Mail::SpamAssassin::Plugin::ExtractText

       ifplugin Mail::SpamAssassin::Plugin::ExtractText

         extracttext_external  pdftotext  /usr/bin/pdftotext -nopgbrk -layout -enc UTF-8 {} -
         extracttext_use       pdftotext  .pdf application/pdf

       endif
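
To check what text (and which URLs) that extracttext_external command would produce for a given PDF, you can run the same pdftotext invocation by hand. Below is a minimal Python sketch, assuming pdftotext is at /usr/bin/pdftotext. The helper names and the URL regex are made up for illustration; they are not part of SA or ExtractText, and SA's own URI parsing is more thorough.

```python
import re
import subprocess

# Same flags as in the extracttext_external line above. In the SA config,
# "{}" stands for the attachment's file path and the trailing "-" makes
# pdftotext write the extracted text to stdout.
PDFTOTEXT_FLAGS = ["-nopgbrk", "-layout", "-enc", "UTF-8"]

# Deliberately simple pattern (an assumption for this sketch only).
URL_RE = re.compile(r"https?://[^\s<>()\[\]]+", re.IGNORECASE)


def pdftotext_command(pdf_path, binary="/usr/bin/pdftotext"):
    """Build the argument list equivalent to the configured command."""
    return [binary, *PDFTOTEXT_FLAGS, pdf_path, "-"]


def find_urls(text):
    """Return http/https URLs found in text, without trailing punctuation."""
    return [url.rstrip(".,;:") for url in URL_RE.findall(text)]


def urls_in_pdf(pdf_path, binary="/usr/bin/pdftotext"):
    """Run pdftotext on pdf_path and list the URLs in its text output."""
    result = subprocess.run(
        pdftotext_command(pdf_path, binary),
        capture_output=True,
        check=True,   # raise if pdftotext exits with an error
        timeout=30,
    )
    return find_urls(result.stdout.decode("utf-8", errors="replace"))
```

Calling urls_in_pdf("/path/to/sample.pdf") on a test attachment shows whether pdftotext sees the URL as text at all. If the URL exists only as an image or a link annotation, it may not appear in the output.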


           If so, could you please give me an example to catch a URL inside a PDF attachment?

    I googled it. But could not find a way for spamassassin to use it.

SA 4 is the way, but note that this is the amavis mailing list.
While amavis can use SA for spam scanning, it's not the best list for this.
--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Emacs is a complicated operating system without good text editor.
