Re: PDFInfo plugin with SA 3.1.7
Hello John,

On Thu, Jul 12, 2007 at 08:19:04AM -0700, John Rudd wrote:
>> I have this in /var/lib/clamav at the moment:
>>
>> drwxr-xr-x 2 clamav clamav    4096 2007-07-12 14:22 clamav-29a2fe02977a1d4c26abf3fd199d1e70
>> -rw-r--r-- 1 clamav clamav  995915 2007-07-11 22:48 daily.cvd
>> -rwxrwxr-- 1 clamav clamav       0 2007-07-12 14:15 .dbLock
>> -rw-r--r-- 1 clamav clamav 9351789 2007-07-11 22:48 main.cvd
>> -rw-r--r-- 1 clamav clamav  294979 2007-07-12 15:05 MSRBL-Images.hdb
>> -rw-r--r-- 1 clamav clamav  228436 2007-07-12 15:05 MSRBL-SPAM.ndb
>> -rw-r--r-- 1 clamav clamav  180868 2007-07-12 10:26 phish.ndb.gz
>> -rw-r--r-- 1 clamav clamav  115449 2007-07-12 10:26 scam.ndb.gz
>
> Those are the ones you're getting from Sanesecurity. They're gzipped.
> In order to actually have ClamAV _USE_ them, you need to gunzip them.

Thanks. That is what I was not sure of.

> This also makes me wonder if you're actually testing the files before
> you put them into production. If you're not, that's a rather bad idea.
> At 2am this morning, I had a non-usable phish.ndb come through. If
> you're using clamd, that could have caused clamd to crash.
>
> Here's the script I use for importing from MSRBL and Sanesecurity. I
> run it out of cron with -all, on the hour. You'll probably need to
> modify some bits of the first few lines (down to the rsync binary
> location):

The script I downloaded also does some testing. I think the reason those files were not unzipped is that the script was looking for the unzipped files before it had finished its task.

It is working now and I like it.

Regards
Johann
-- 
Johann Spies Telefoon: 021-808 4036
Informasietegnologie, Universiteit van Stellenbosch

Let your character be free from the love of money, being content with what you have; for He Himself has said, I will never desert you, nor will I ever forsake you. Hebrews 13:5
Re: PDFInfo plugin with SA 3.1.7
I'm running PDFInfo 0.3 with SA 3.1.8 and it works fine for me - and I'm even running it on Windows! :)

Cheers,
Jeremy

Suhas Ingale wrote:
> Hello, I am trying to run PDFInfo plugin with SA 3.1.7. SA registers
> the plugin successfully but does not scan the PDFs in the emails.
> According to Dallas Engelken (creator of PDFInfo), the MIME parser in
> SA is not seeing a PDF attachment on this message. Has anyone tried
> running PDFInfo plugin with 3.1.7 version?
Re: PDFInfo plugin with SA 3.1.7
Jeremy Fairbrass wrote:
> I'm running PDFInfo 0.3 with SA 3.1.8 and it works fine for me - and
> I'm even running it on Windows! :)

Hi,

after good results with PDFInfo in the beginning, none of the PDF spam that followed was caught or marked. I am now using ClamAV and the Sanesecurity signatures to eliminate PDF spam.

It seems that PDF spam mutates so fast that the current concept of the PDFInfo plugin isn't enough to catch or mark it, so I think a solution like FuzzyOcr is needed here. Spam in other attachment types (doc etc.) may follow soon.

-- 
Best Regards
Robert Schetterer
https://www.schetterer.org
Germany
RE: PDFInfo plugin with SA 3.1.7
Is there any other way or plugin to catch this PDF spam that is compatible with SA 3.1.7?

-Original Message-
From: Robert Schetterer
Sent: Thursday, July 12, 2007 3:25 PM
To: Jeremy Fairbrass
Cc: users@spamassassin.apache.org
Subject: Re: PDFInfo plugin with SA 3.1.7

> It seems that PDF spam mutates so fast that the current concept of the
> PDFInfo plugin isn't enough to catch or mark it, so I think a solution
> like FuzzyOcr is needed here.
Re: PDFInfo plugin with SA 3.1.7
On 7/12/2007 2:20 PM, Suhas Ingale wrote:
> Is there any other way or plugin to catch this PDF spam that is
> compatible with SA 3.1.7?

It is compatible with SA 3.1.7, on both *nix and Windows.
Re: PDFInfo plugin with SA 3.1.7
Suhas Ingale wrote:
> Is there any other way or plugin to catch this PDF spam that is
> compatible with SA 3.1.7?

As I wrote before, ClamAV with the Sanesecurity signatures works nicely. I noticed that there is a new version of the PDFInfo plugin; I will mail results to the list once I have the upgrade installed.

And just a small tip: avoid top-posting.

-- 
Best Regards
Robert Schetterer
https://www.schetterer.org
Germany
Re: PDFInfo plugin with SA 3.1.7
Robert Schetterer wrote:
> It seems that PDF spam mutates so fast that the current concept of the
> PDFInfo plugin isn't enough to catch or mark it, so I think a solution
> like FuzzyOcr is needed here. Spam in other attachment types (doc
> etc.) may follow soon.

That is what concerns me more. If I start opening up every attached file, parsing out text, parsing out images (so I can parse out the text), and then running SA against everything I parse out, it's gonna get hairy real fast. I can absolutely see doc, rtf, etc. coming next.

Considering we deal with more PDF and Doc ham than PDF and Doc spam, it will become a major resource hog fast. I don't want to be running plugins extracting text and images from docs, then text from images from docs, on 40,000+ messages a day.

There has to be a better way.

DAve
-- 
Three years now I've asked Google why they don't have a logo change for Memorial Day. Why do they choose to do logos for other non-international holidays, but nothing for Veterans? Maybe they forgot who made that choice possible.
Re: PDFInfo plugin with SA 3.1.7
Johann Spies wrote:
> On Thu, Jul 12, 2007 at 11:54:51AM +0200, Robert Schetterer wrote:
>> I am now using ClamAV and the Sanesecurity signatures to eliminate
>> PDF spam.
>
> I have tried that, but ClamAV did not detect any when scanning a batch
> of the PDF spam. I used one of the download scripts from Sanesecurity.
> Do I have to do some other configuration to activate the databases for
> ClamAV?
>
> I have this in /var/lib/clamav at the moment:
>
> drwxr-xr-x 2 clamav clamav    4096 2007-07-12 14:22 clamav-29a2fe02977a1d4c26abf3fd199d1e70
> -rw-r--r-- 1 clamav clamav  995915 2007-07-11 22:48 daily.cvd
> -rwxrwxr-- 1 clamav clamav       0 2007-07-12 14:15 .dbLock
> -rw-r--r-- 1 clamav clamav 9351789 2007-07-11 22:48 main.cvd
> -rw-r--r-- 1 clamav clamav  294979 2007-07-12 15:05 MSRBL-Images.hdb
> -rw-r--r-- 1 clamav clamav  228436 2007-07-12 15:05 MSRBL-SPAM.ndb
> -rw-r--r-- 1 clamav clamav  180868 2007-07-12 10:26 phish.ndb.gz
> -rw-r--r-- 1 clamav clamav  115449 2007-07-12 10:26 scam.ndb.gz

Hi Johann,

your ClamAV database directory looks fine, but it is always the same with antivirus signatures: it may take time until they catch the newest variant. Today I had 3 new PDF spams, so the mutation rate is very high, and a new variant can't be caught at once if you receive it at an early stage of its spread.

The Sanesecurity download scripts work for me. If I have trouble with curl, I use wget to fetch the new signatures and restart ClamAV, which has also worked for me. I also had no trouble with today's ClamAV update to 0.91.

Maybe you could ask the Sanesecurity developers for help.

-- 
Best Regards
Robert Schetterer
https://www.schetterer.org
Germany
Re: PDFInfo plugin with SA 3.1.7
On Thu, 2007-07-12 at 16:03 +0200, Johann Spies wrote:
> On Thu, Jul 12, 2007 at 11:54:51AM +0200, Robert Schetterer wrote:
>> I am now using ClamAV and the Sanesecurity signatures to eliminate
>> PDF spam.
>
> I have tried that, but ClamAV did not detect any when scanning a batch
> of the PDF spam. I used one of the download scripts from Sanesecurity.
> Do I have to do some other configuration to activate the databases for
> ClamAV?

Do you have ScanPDF enabled in your /etc/clamd.conf file? It's typically disabled by default.

-Bill
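Bill's question can be checked mechanically rather than by eye. The following is only a minimal Python sketch, assuming clamd.conf uses plain `Option value` lines with `#` comments and that `yes`, `true`, and `1` all mean "enabled"; the helper names `option_value` and `scan_pdf_enabled` are made up for illustration, and the value returned when the option is absent is an assumption to verify against your ClamAV version.

```python
def option_value(conf_text, name):
    """Return the last value set for `name` in clamd.conf-style text, or None."""
    value = None
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        parts = line.split(None, 1)
        if parts[0] == name and len(parts) == 2:
            value = parts[1].strip()
    return value


def scan_pdf_enabled(conf_text, default=False):
    """True if ScanPDF is switched on; `default` (an assumption) if it is not set."""
    value = option_value(conf_text, "ScanPDF")
    if value is None:
        return default
    return value.lower() in ("yes", "true", "1")


if __name__ == "__main__":
    # Hypothetical path taken from Bill's message; adjust for your system.
    with open("/etc/clamd.conf") as handle:
        print("ScanPDF enabled:", scan_pdf_enabled(handle.read()))
```

A commented-out line such as `#ScanPDF yes` counts as "not set", which is the usual reason the option turns out to be off.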
Re: PDFInfo plugin with SA 3.1.7
On Thu, Jul 12, 2007 at 11:54:51AM +0200, Robert Schetterer wrote:
> Hi, after good results with PDFInfo in the beginning, none of the PDF
> spam that followed was caught or marked. I am now using ClamAV and the
> Sanesecurity signatures to eliminate PDF spam.

I have tried that, but ClamAV did not detect any when scanning a batch of the PDF spam. I used one of the download scripts from Sanesecurity. Do I have to do some other configuration to activate the databases for ClamAV?

I have this in /var/lib/clamav at the moment:

drwxr-xr-x 2 clamav clamav    4096 2007-07-12 14:22 clamav-29a2fe02977a1d4c26abf3fd199d1e70
-rw-r--r-- 1 clamav clamav  995915 2007-07-11 22:48 daily.cvd
-rwxrwxr-- 1 clamav clamav       0 2007-07-12 14:15 .dbLock
-rw-r--r-- 1 clamav clamav 9351789 2007-07-11 22:48 main.cvd
-rw-r--r-- 1 clamav clamav  294979 2007-07-12 15:05 MSRBL-Images.hdb
-rw-r--r-- 1 clamav clamav  228436 2007-07-12 15:05 MSRBL-SPAM.ndb
-rw-r--r-- 1 clamav clamav  180868 2007-07-12 10:26 phish.ndb.gz
-rw-r--r-- 1 clamav clamav  115449 2007-07-12 10:26 scam.ndb.gz

Regards
Johann
-- 
Johann Spies Telefoon: 021-808 4036
Informasietegnologie, Universiteit van Stellenbosch

Many, O LORD my God, are thy wonderful works which thou hast done, and thy thoughts which are for us... Psalms 40:5
Re: PDFInfo plugin with SA 3.1.7
Bill Randle wrote:
> Do you have ScanPDF enabled in your /etc/clamd.conf file? It's
> typically disabled by default.

For ClamAV with the Sanesecurity signatures, it does not matter whether ScanPDF is enabled in your /etc/clamd.conf.

-- 
Best Regards
Robert Schetterer
https://www.schetterer.org
Germany
Re: PDFInfo plugin with SA 3.1.7
On Thu, Jul 12, 2007 at 05:50:37PM +0530, Suhas Ingale wrote:
> Is there any other way or plugin to catch this PDF spam that is
> compatible with SA 3.1.7?

Use sa-update.

-- 
Randomly Selected Tagline:
A two party system is kind-of like a two stroke engine. A lot is wasted in the form of noise and hot, noxious gasses. - Jim Flanagan
Re: PDFInfo plugin with SA 3.1.7
Johann Spies wrote:
> On Thu, Jul 12, 2007 at 11:54:51AM +0200, Robert Schetterer wrote:
>> I am now using ClamAV and the Sanesecurity signatures to eliminate
>> PDF spam.
>
> I have tried that, but ClamAV did not detect any when scanning a batch
> of the PDF spam. I used one of the download scripts from Sanesecurity.
> Do I have to do some other configuration to activate the databases for
> ClamAV?
>
> I have this in /var/lib/clamav at the moment:
>
> drwxr-xr-x 2 clamav clamav    4096 2007-07-12 14:22 clamav-29a2fe02977a1d4c26abf3fd199d1e70
> -rw-r--r-- 1 clamav clamav  995915 2007-07-11 22:48 daily.cvd
> -rwxrwxr-- 1 clamav clamav       0 2007-07-12 14:15 .dbLock
> -rw-r--r-- 1 clamav clamav 9351789 2007-07-11 22:48 main.cvd
> -rw-r--r-- 1 clamav clamav  294979 2007-07-12 15:05 MSRBL-Images.hdb
> -rw-r--r-- 1 clamav clamav  228436 2007-07-12 15:05 MSRBL-SPAM.ndb
> -rw-r--r-- 1 clamav clamav  180868 2007-07-12 10:26 phish.ndb.gz
> -rw-r--r-- 1 clamav clamav  115449 2007-07-12 10:26 scam.ndb.gz

The problem is with those last 2 files. Those are the ones you're getting from Sanesecurity. They're gzipped. In order to actually have ClamAV _USE_ them, you need to gunzip them.

This also makes me wonder if you're actually testing the files before you put them into production. If you're not, that's a rather bad idea. At 2am this morning, I had a non-usable phish.ndb come through. If you're using clamd, that could have caused clamd to crash.

Here's the script I use for importing from MSRBL and Sanesecurity. I run it out of cron with -all, on the hour. You'll probably need to modify some bits of the first few lines (down to the rsync binary location):

#!/usr/local/bin/perl

# Paths to external tools; adjust for your system.
my $chmod    = "/bin/chmod";
my $mv       = "/bin/mv";
my $gunzip   = "/usr/bin/gunzip";
my $clamscan = "/usr/local/bin/clamscan";
my $testfile = "/bin/sh";    # known-clean file scanned to test each new database
my $diff     = "/usr/bin/diff";
my %methods  = ("http"  => "/usr/local/bin/wget -q",
                "rsync" => "/usr/bin/rsync -q");

my %urls = ("msrbl-spam" => "rsync://rsync.mirror.msrbl.com/msrbl/MSRBL-SPAM.ndb",
            "msrbl-imgs" => "rsync://rsync.mirror.msrbl.com/msrbl/MSRBL-Images.hdb",
            "sane-phish" => "http://www.sanesecurity.com/clamav/phishsigs/phish.ndb.gz",
            "sane-scam"  => "http://www.sanesecurity.com/clamav/scamsigs/scam.ndb.gz");

my %tmpdirs = ("msrbl-spam" => "/tmp/msrbl",
               "msrbl-imgs" => "/tmp/msrbl",
               "sane-phish" => "/tmp/sanecomputing",
               "sane-scam"  => "/tmp/sanecomputing");

my %destdirs = ("msrbl-spam" => "/var/lib/clamav",
                "msrbl-imgs" => "/var/lib/clamav",
                "sane-phish" => "/var/lib/clamav",
                "sane-scam"  => "/var/lib/clamav");

my $getall = 0;
my (@distros, $dist, $tmpdir, $proto, $method, $file, $retcode);
my ($url, $ufile, $diffout, $destdir);

# "-all" (or any abbreviation of it) fetches every configured database.
if (defined $ARGV[0] && $ARGV[0] =~ /--?al?l?/) {
    $getall  = 1;
    @distros = keys(%urls);
} else {
    @distros = @ARGV;
}

foreach $dist (sort (@distros)) {
    $tmpdir  = $tmpdirs{$dist};
    $destdir = $destdirs{$dist};
    $url     = $urls{$dist};
    $proto   = $url;
    $proto   =~ s/:.*$//;               # protocol: "http" or "rsync"
    $method  = $methods{$proto};
    $file    = $url;
    $file    =~ s{^.*/([^/]*)$}{$1};    # basename of the URL
    $ufile   = $file;
    $ufile   =~ s/\.gz$//;              # name after gunzip

    # Make sure the temporary directory exists and is a directory.
    if ((-e $tmpdir) && (!(-d $tmpdir))) {
        rename ($tmpdir, ($tmpdir . ".bad"))
            || die "tmpdir $tmpdir isn't a directory, can't rename it";
        mkdir ($tmpdir) || die "can't make tmpdir $tmpdir";
    } elsif (! (-e $tmpdir)) {
        mkdir ($tmpdir) || die "can't make tmpdir $tmpdir";
    }
    system ("$chmod 700 $tmpdir");

    # Same check for the destination directory.
    if ((-e $destdir) && (!(-d $destdir))) {
        rename ($destdir, ($destdir . ".bad"))
            || die "destdir $destdir isn't a directory, can't rename it";
        mkdir ($destdir) || die "can't make destdir $destdir";
    } elsif (! (-e $destdir)) {
        mkdir ($destdir) || die "can't make destdir $destdir";
    }
    system ("$chmod 775 $destdir");

    chdir ($tmpdir);
    if (-e $file)  { unlink ($file); }
    if (-e $ufile) { unlink ($ufile); }

    # download the file
    if ($proto eq "rsync") {
        system("$method $url $file");
    } elsif ($proto eq "http") {
        system("$method $url");
    }
    unless (-e $file) {
        print "didn't get download file $file\n";
        last;
    }

    if ($file =~ /\.gz$/) {
        system("$gunzip $file");
        $file = $ufile;
    }

    # test against clamav
    $retcode = system("$clamscan --database=$tmpdir $testfile > /dev/null 2>&1") / 256;
    if ($retcode == 0) {
        # clamscan of testfile worked and didn't find a virus
        # lets see if it's the same file we already have/had
        $diffout = (system("$diff --brief --speed-large-files $tmpdir/$file $destdir/$file > /dev/null 2>/dev/null")) / 256;
        if ($diffout == 0) {
            # file hasn't changed
            unlink ($file);
        } else {
            print "$file appears to have changed, moving to destination\n";
            system("$mv $tmpdir/$file $destdir/$file");
            system("$chmod 644 $destdir/$file");
        }
    }
}
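The core of John's script is the sequence "gunzip, test, compare, install". As a rough illustration only, here is a Python sketch of that sequence for a file that has already been downloaded. The function name `stage_signature` and the `validate` callback are hypothetical; in practice `validate` would wrap a `clamscan --database=...` run against a known-clean file, as the script above does, and report whether it exited with status 0.

```python
import filecmp
import gzip
import os
import shutil


def stage_signature(downloaded, dest_dir, validate):
    """Unpack `downloaded` if gzipped, validate it, and install it only if changed.

    Returns "invalid", "unchanged", or "installed".
    """
    candidate = downloaded
    if downloaded.endswith(".gz"):
        candidate = downloaded[:-3]
        # gzip.open raises OSError on a corrupt archive, so a bad download stops here.
        with gzip.open(downloaded, "rb") as src, open(candidate, "wb") as dst:
            shutil.copyfileobj(src, dst)
        os.remove(downloaded)

    # Never install a database that the validator rejects.
    if not validate(candidate):
        return "invalid"

    target = os.path.join(dest_dir, os.path.basename(candidate))
    if os.path.exists(target) and filecmp.cmp(candidate, target, shallow=False):
        os.remove(candidate)  # identical to what is already installed
        return "unchanged"

    shutil.move(candidate, target)
    os.chmod(target, 0o644)
    return "installed"
```

Validating in a temporary directory before the move means a broken phish.ndb, like the one mentioned above, never reaches the directory that clamd reads.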
PDFInfo plugin with SA 3.1.7
Hello,

I am trying to run the PDFInfo plugin with SA 3.1.7. SA registers the plugin successfully but does not scan the PDFs in the emails. According to Dallas Engelken (creator of PDFInfo), the MIME parser in SA is not seeing a PDF attachment on this message.

Has anyone tried running the PDFInfo plugin with version 3.1.7?
Re: PDFInfo plugin with SA 3.1.7
On Wed, 2007-07-11 at 14:49 +0530, Suhas Ingale wrote:
> Has anyone tried running PDFInfo plugin with 3.1.7 version?

No. I finally got it working yesterday evening using 3.2.1, but the initial results are underwhelming: almost 100% overlap with TVD_SPACE_RATIO. Only one miss:

$ sudo grep GMD_PDF /var/log/mail/info | grep -v TVD_SPACE_RATIO
Jul 11 03:26:15 sa amavis[25324]: (25324-17) SPAM, [EMAIL PROTECTED] - [EMAIL PROTECTED], Yes, score=25.456 tag=-99 tag2=4.5 kill=6.31 tests=[BODY_8BITS=1.5, BOTNET_CLIENT=0.01, BOTNET_CLIENTWORDS=0, BOTNET_IPINHOSTNAME=0, BOTNET_W=2, DKIM_POLICY_SIGNSOME=0, FH_HELO_EQ_D_D_D_D=0.498, GMD_PDF_BAD_FUZZY=3.75, GMD_PDF_HORIZ=0.25, GMD_PDF_STOX=1, HELO_DYNAMIC_DHCP=1.52, HELO_DYNAMIC_IPADDR=2.935, L_P0F_W=1, RAZOR2_CF_RANGE_51_100=0.5, RAZOR2_CF_RANGE_E4_51_100=1.5, RAZOR2_CHECK=0.5, RCVD_IN_BL_SPAMCOP_NET=2.188, RCVD_IN_PBL=0.509, RCVD_IN_XBL=2.896, RDNS_DYNAMIC=0.1, UNWANTED_LANGUAGE_BODY=2.8], autolearn=disabled

That's out of:

$ sudo grep -o -P 'GMD_PDF.+?=' /var/log/mail/info | sort | uniq -c
    684 GMD_PDF_BAD_FUZZY=
     43 GMD_PDF_HORIZ=
     67 GMD_PDF_STOX=
     24 GMD_PDF_VERT=

-- 
Daniel J McDonald, CCIE # 2495, CISSP # 78281, CNX
Austin Energy
http://www.austinenergy.com
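For anyone who wants to repeat Daniel's overlap check without the grep pipeline, here is a rough Python equivalent. It assumes amavis log lines with a `tests=[NAME=score, ...]` list like the one shown above; the function names `count_rule_hits` and `misses` are made up for this sketch.

```python
import re
from collections import Counter

# A GMD_PDF rule name that is directly followed by "=score".
RULE = re.compile(r"GMD_PDF\w*(?==)")


def count_rule_hits(lines):
    """Count how often each GMD_PDF_* rule appears, like grep -o | sort | uniq -c."""
    counts = Counter()
    for line in lines:
        counts.update(RULE.findall(line))
    return counts


def misses(lines):
    """Lines where a GMD_PDF rule hit but TVD_SPACE_RATIO did not."""
    return [l for l in lines if "GMD_PDF" in l and "TVD_SPACE_RATIO" not in l]


if __name__ == "__main__":
    # Log path taken from the message above; adjust for your system.
    with open("/var/log/mail/info") as handle:
        log = handle.readlines()
    for rule, hits in sorted(count_rule_hits(log).items()):
        print(f"{hits:7d} {rule}")
    print(len(misses(log)), "message(s) hit GMD_PDF without TVD_SPACE_RATIO")
```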
Re: PDFInfo plugin with SA 3.1.7
Daniel J McDonald wrote:
> On Wed, 2007-07-11 at 14:49 +0530, Suhas Ingale wrote:
>> Has anyone tried running PDFInfo plugin with 3.1.7 version?
>
> No. I finally got it working yesterday evening using 3.2.1, but the
> initial results are underwhelming: almost 100% overlap with
> TVD_SPACE_RATIO. Only one miss:

First of all, TVD_SPACE_RATIO only applies for those running v3.2, whereas PDFInfo.pm can be used with any 3.x version.

Secondly, TVD_SPACE_RATIO can fire almost at will without a body.

$ echo | spamassassin
2.9 TVD_SPACE_RATIO BODY: TVD_SPACE_RATIO

Take the basic MIME part from a PDF stock spam. It looks similar to this:

--050701020003040207010006
Content-Type: text/plain; charset=iso-8859-2; format=flowed
Content-Transfer-Encoding: 7bit

--050701020003040207010006

and it fires on TVD_SPACE_RATIO fine.

$ cat /root/sample2.txt | spamassassin -D 2>&1 | grep -i tvd
[26686] dbg: tvd: word [SPAM-8.3]- Re: warning_6042146166.pdf
[26686] dbg: tvd: len=39
[26686] dbg: tvd: spaces 2 nonspaces 37
[26686] dbg: tvd: pct = 5
[26686] dbg: tvd: final = 5
[26686] dbg: rules: ran eval rule TVD_SPACE_RATIO ======> got hit (1)

Change the MIME part to:

--050701020003040207010006
Content-Type: text/plain; charset=iso-8859-2; format=flowed
Content-Transfer-Encoding: 7bit

tvd no longer fires now
--050701020003040207010006

$ cat /root/sample2.txt | spamassassin -D 2>&1 | grep -i tvd
[26739] dbg: tvd: word [SPAM-8.3]- Re: warning_6042146166.pdf
[26739] dbg: tvd: len=39
[26739] dbg: tvd: spaces 2 nonspaces 37
[26739] dbg: tvd: pct = 5
[26739] dbg: tvd: word tvd no longer fires now
[26739] dbg: tvd: len=24
[26739] dbg: tvd: spaces 4 nonspaces 20
[26739] dbg: tvd: pct = 20
[26739] dbg: tvd: final = 20

... and 20 isn't between tvd_vertical_words('0','10'). Easy for a spammer to avoid that.

What's more, this rule has a good chance of false positives. I emailed myself a PNG from webalizer without any body text.

# cat test | spamassassin -D 2>&1 | grep -i tvd
[27390] dbg: tvd: word hourly_usage_200706.png
[27390] dbg: tvd: len=24
[27390] dbg: tvd: spaces 0 nonspaces 24
[27390] dbg: tvd: pct = 0
[27390] dbg: tvd: final = 0
[27390] dbg: rules: ran eval rule TVD_SPACE_RATIO ======> got hit (1)

The fact is, email is FTP for Dummies... and IMHO, TVD_SPACE_RATIO may be a bit high at 2.9.

BTW, v0.3 of PDFInfo.pm is now posted, so those who have it already might want to sync up.

# counts GMD_PDF_HORIZ 135s/0h of 6132 corpus (4555s/1577h AxB-MANUAL) 07/11/07
# counts GMD_PDF_HORIZ 31s/0h of 11773 corpus (10988s/785h AxB2-TRAPS) 07/11/07
# counts GMD_PDF_SQUARE 36s/0h of 6132 corpus (4555s/1577h AxB-MANUAL) 07/11/07
# counts GMD_PDF_SQUARE 11s/0h of 11773 corpus (10988s/785h AxB2-TRAPS) 07/11/07
# counts GMD_PDF_VERT 24s/0h of 6132 corpus (4555s/1577h AxB-MANUAL) 07/11/07
# counts GMD_PDF_VERT 10s/0h of 11773 corpus (10988s/785h AxB2-TRAPS) 07/11/07
# counts GMD_PDF_FUZZY1_T1 591s/0h of 6132 corpus (555s/1577h AxB-MANUAL) 07/11/07
# counts GMD_PDF_FUZZY1_T1 199s/0h of 11773 corpus (10988s/785h AxB2-TRAPS) 07/11/07
# counts GMD_PDF_FUZZY2_T1 199s/0h of 11773 corpus (10988s/785h AxB2-TRAPS) 07/11/07
# counts GMD_PDF_FUZZY2_T1 591s/0h of 6132 corpus (555s/1577h AxB-MANUAL) 07/11/07
# counts GMD_PDF_FUZZY2_T2 118s/0h of 6132 corpus (555s/1577h AxB-MANUAL) 07/11/07
# counts GMD_PDF_FUZZY2_T2 1s/0h of 10767 corpus (9986s/781h AxB2-TRAPS) 07/11/07
# counts GMD_PDF_FUZZY2_T3 0s/0h of 10767 corpus (9986s/781h AxB2-TRAPS) 07/11/07
# counts GMD_PDF_FUZZY2_T3 25s/0h of 5641 corpus (4064s/1577h AxB-MANUAL) 07/11/07
# counts GMD_PDF_FUZZY2_T4 105s/0h of 6132 corpus (555s/1577h AxB-MANUAL) 07/11/07
# counts GMD_PDF_FUZZY2_T4 28s/0h of 10767 corpus (9986s/781h AxB2-TRAPS) 07/11/07
# counts GMD_AUTHOR_COLET 1s/0h of 10767 corpus (9986s/781h AxB2-TRAPS) 07/11/07
# counts GMD_AUTHOR_COLET 2s/0h of 6132 corpus (555s/1577h AxB-MANUAL) 07/11/07
# counts GMD_AUTHOR_MOBILE 2s/0h of 6132 corpus (555s/1577h AxB-MANUAL) 07/11/07
# counts GMD_AUTHOR_MOBILE 55s/0h of 10767 corpus (9986s/781h AxB2-TRAPS) 07/11/07
# counts GMD_AUTHOR_OOO 1s/0h of 10767 corpus (9986s/781h AxB2-TRAPS) 07/11/07
# counts GMD_AUTHOR_OOO 118s/0h of 6132 corpus (555s/1577h AxB-MANUAL) 07/11/07
# counts GMD_AUTHOR_HPADMIN 105s/0h of 6132 corpus (4555s/1577h AxB-MANUAL) 07/11/07
# counts GMD_AUTHOR_HPADMIN 27s/0h of 11773 corpus (10988s/785h AxB2-TRAPS) 07/11/07
# counts GMD_PRODUCER_GPL 227s/0h of 6132 corpus (555s/1577h AxB-MANUAL) 07/11/07
# counts GMD_PRODUCER_GPL 85s/0h of 10767 corpus (9986s/781h AxB2-TRAPS)
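The `pct` values in the TVD_SPACE_RATIO debug output earlier in this message are consistent with an integer percentage of spaces relative to non-space characters (2/37 gives 5, 4/20 gives 20, 0/24 gives 0). The Python sketch below reproduces only that arithmetic. It is inferred from those three debug traces and is not the actual SpamAssassin code; the function names are made up, treating the 0 to 10 range as inclusive at the upper end is an assumption, and how `final` is derived from several words is not modelled.

```python
def space_pct(spaces, nonspaces):
    """Integer percentage of spaces relative to non-space characters."""
    if nonspaces == 0:
        return None  # nothing to measure
    return (100 * spaces) // nonspaces


def would_hit(pct, low=0, high=10):
    """True if pct falls in the range passed to tvd_vertical_words('0','10').

    A pct of 0 hit in the trace above, so the lower bound is inclusive;
    treating the upper bound as inclusive is an assumption.
    """
    return pct is not None and low <= pct <= high


# The three cases from the debug output above.
for spaces, nonspaces in [(2, 37), (4, 20), (0, 24)]:
    pct = space_pct(spaces, nonspaces)
    print(spaces, nonspaces, pct, would_hit(pct))
```

This makes the two complaints concrete: adding a few ordinary words pushes the ratio to 20 and out of range, while a bare filename with no spaces scores 0 and hits.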