Re: PDFInfo plugin with SA 3.1.7

2007-07-13 Thread Johann Spies
Hallo John,

On Thu, Jul 12, 2007 at 08:19:04AM -0700, John Rudd wrote:
 
 I have this in /var/lib/clamav at the moment:
 
   drwxr-xr-x  2 clamav clamav    4096 2007-07-12 14:22 clamav-29a2fe02977a1d4c26abf3fd199d1e70
   -rw-r--r--  1 clamav clamav  995915 2007-07-11 22:48 daily.cvd
   -rwxrwxr--  1 clamav clamav       0 2007-07-12 14:15 .dbLock
   -rw-r--r--  1 clamav clamav 9351789 2007-07-11 22:48 main.cvd
   -rw-r--r--  1 clamav clamav  294979 2007-07-12 15:05 MSRBL-Images.hdb
   -rw-r--r--  1 clamav clamav  228436 2007-07-12 15:05 MSRBL-SPAM.ndb
   -rw-r--r--  1 clamav clamav  180868 2007-07-12 10:26 phish.ndb.gz
   -rw-r--r--  1 clamav clamav  115449 2007-07-12 10:26 scam.ndb.gz
 
 
 Those are the ones you're getting from Sanesecurity.  They're gzipped. 
 In order to actually have ClamAV _USE_ them, you need to gunzip them.

Thanks. That is what I was not sure of.
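
A minimal shell sketch of that unpacking step, assuming gzip is on the PATH and the database directory from the listing above. The function name unpack_sigs is made up for illustration; it is not part of ClamAV or of the Sanesecurity download scripts.

```shell
# Hypothetical helper: unpack every gzipped .ndb signature file in a directory.
# Assumes gzip is installed; the default path is the one from the listing.
unpack_sigs() {
    dir=${1:-/var/lib/clamav}
    for gz in "$dir"/*.ndb.gz; do
        [ -e "$gz" ] || continue        # the glob matched nothing
        # Only unpack archives that pass gzip's own integrity check.
        if gzip -t "$gz" 2>/dev/null; then
            gzip -df "$gz"              # phish.ndb.gz becomes phish.ndb
        else
            echo "skipping corrupt archive: $gz" >&2
        fi
    done
}
```

Called as `unpack_sigs /var/lib/clamav`, this would leave phish.ndb and scam.ndb next to the other databases. clamd may still need a reload or restart before it picks them up.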

 
 This also makes me wonder if you're actually testing the files before you 
 put them into production.  If you're not, that's a rather bad idea.  At 
 2am this morning, I had a non-usable phish.ndb come through.  If you're 
 using clamd, that could have caused clamd to crash.
 
 
 Here's the script I use for importing from MSRBL and Sanesecurity.  I 
 run it out of cron with -all, on the hour.  You'll probably need to 
 modify some bits of the first few lines (down to the rsync binary location):

The script I downloaded also does some testing.  I think
the reason those files were not unzipped was that the script was
looking for the unzipped files before finishing its task.

It is working now and I like it.

Regards
Johann
-- 
Johann Spies  Telefoon: 021-808 4036
Informasietegnologie, Universiteit van Stellenbosch

 Let your character be free from the love of money,
  being content with what you have; for He Himself has
  said, I will never desert you, nor will I ever
  forsake you.
  Hebrews 13:5


Re: PDFInfo plugin with SA 3.1.7

2007-07-12 Thread Jeremy Fairbrass
I'm running PDFInfo 0.3 with SA 3.1.8 and it works fine for me - and I'm 
even running it on Windows! :)

Cheers,
Jeremy


  Suhas Ingale wrote:
  Hello,



  I am trying to run the PDFInfo plugin with SA 3.1.7. SA registers the
  plugin successfully but does not scan the PDFs in the emails. According
  to Dallas Engelken (creator of PDFInfo), the MIME parser in SA is not
  seeing a PDF attachment on this message.

  Has anyone tried running the PDFInfo plugin with version 3.1.7?








Re: PDFInfo plugin with SA 3.1.7

2007-07-12 Thread Robert Schetterer
Jeremy Fairbrass wrote:
 I'm running PDFInfo 0.3 with SA 3.1.8 and it works fine for me - and I'm
 even running it on Windows! :)
  
 Cheers,
 Jeremy
  
  
 
 Suhas Ingale wrote:
 
 Hello,
 
  
 
 I am trying to run the PDFInfo plugin with SA 3.1.7. SA registers the
 plugin successfully but does not scan the PDFs in the emails.
 According to Dallas Engelken (creator of PDFInfo), the MIME parser
 in SA is not seeing a PDF attachment on this message.
 
 Has anyone tried running the PDFInfo plugin with version 3.1.7?
 
  
 
  
 
  
 
Hi, after getting good results with PDFInfo at the beginning,
none of the PDF spam that followed was caught or marked.

I am now using ClamAV with the Sanesecurity signatures to
eliminate PDF spam.

PDF spam seems to mutate so fast that the current approach of the
PDFInfo plugin isn't enough to catch or mark it, so I think a
solution like fuzzy OCR is needed here.

Spam in other attachment file types (doc etc.) may follow soon.

-- 
Best regards

Robert Schetterer

https://www.schetterer.org
Germany



RE: PDFInfo plugin with SA 3.1.7

2007-07-12 Thread Suhas Ingale
Is there any other way/plugin to catch these pdf spam? And which is
compatible with SA 3.1.7




Re: PDFInfo plugin with SA 3.1.7

2007-07-12 Thread Yet Another Ninja

On 7/12/2007 2:20 PM, Suhas Ingale wrote:

Is there any other way/plugin to catch these pdf spam? And which is
compatible with SA 3.1.7


It is compatible with SA 3.1.7 - both *nix & Windows










Re: PDFInfo plugin with SA 3.1.7

2007-07-12 Thread Robert Schetterer
Suhas Ingale wrote:
 Is there any other way/plugin to catch these pdf spam? And which is
 compatible with SA 3.1.7
 

As I wrote before, ClamAV with the Sanesecurity signatures works nicely.

I noticed that there is a new version of the PDFInfo plugin; I will mail
the results to the list once I have the upgrade installed.

And just a small tip: avoid top-posting.

-- 
Best regards

Robert Schetterer

https://www.schetterer.org
Germany



Re: PDFInfo plugin with SA 3.1.7

2007-07-12 Thread DAve

Robert Schetterer wrote:

Hi, after getting good results with PDFInfo at the beginning,
none of the PDF spam that followed was caught or marked.

I am now using ClamAV with the Sanesecurity signatures to
eliminate PDF spam.

PDF spam seems to mutate so fast that the current approach of the
PDFInfo plugin isn't enough to catch or mark it, so I think a
solution like fuzzy OCR is needed here.

Spam in other attachment file types (doc etc.) may follow soon.


That is what concerns me more. If I start opening up every attached 
file, parsing out text, parsing out images (so I can parse out the text), 
and then running SA against everything I parse out, it's gonna get hairy 
real fast. I can absolutely see doc, rtf, etc. coming next.


Considering we deal with more PDF and Doc ham than PDF and Doc spam, it 
will become a major resource hog fast. I don't want to be running 
plugins extracting text and images from docs, then text from images from 
docs, on 40,000+ messages a day.


There has to be a better way.

DAve


--
Three years now I've asked Google why they don't have a
logo change for Memorial Day. Why do they choose to do logos
for other non-international holidays, but nothing for
Veterans?

Maybe they forgot who made that choice possible.


Re: PDFInfo plugin with SA 3.1.7

2007-07-12 Thread Robert Schetterer
Johann Spies wrote:
 On Thu, Jul 12, 2007 at 11:54:51AM +0200, Robert Schetterer wrote:
 Hi, after getting good results with PDFInfo at the beginning,
 none of the PDF spam that followed was caught or marked.

 I am now using ClamAV with the Sanesecurity signatures to
 eliminate PDF spam.
 
 I have tried that, but ClamAV did not pick up a single one when scanning
 a bunch of the PDF spam.  I have used one of the download scripts from
 Sanesecurity.  Do I have to do some other configuration to activate
 the databases for ClamAV?
 
 I have this in /var/lib/clamav at the moment:
 
   drwxr-xr-x  2 clamav clamav    4096 2007-07-12 14:22 clamav-29a2fe02977a1d4c26abf3fd199d1e70
   -rw-r--r--  1 clamav clamav  995915 2007-07-11 22:48 daily.cvd
   -rwxrwxr--  1 clamav clamav       0 2007-07-12 14:15 .dbLock
   -rw-r--r--  1 clamav clamav 9351789 2007-07-11 22:48 main.cvd
   -rw-r--r--  1 clamav clamav  294979 2007-07-12 15:05 MSRBL-Images.hdb
   -rw-r--r--  1 clamav clamav  228436 2007-07-12 15:05 MSRBL-SPAM.ndb
   -rw-r--r--  1 clamav clamav  180868 2007-07-12 10:26 phish.ndb.gz
   -rw-r--r--  1 clamav clamav  115449 2007-07-12 10:26 scam.ndb.gz
 
 
 Regards
 Johann
Hi Johann,
your ClamAV database directory looks fine, but it is always the same with
antivirus signatures: it can take time until they catch the newest
variants. Today I had 3 new PDF spams, so the mutation rate is very high,
and they can't all be caught at once if you receive them at an early
stage of spreading.
The Sanesecurity download scripts work for me. If I run into trouble with
curl, I use wget to fetch the new signatures and restart ClamAV; that has
worked for me too. I also had no trouble with today's update of ClamAV to
0.91. Maybe you could ask the Sanesecurity developers for help.

-- 
Best regards

Robert Schetterer

https://www.schetterer.org
Germany



Re: PDFInfo plugin with SA 3.1.7

2007-07-12 Thread Bill Randle
On Thu, 2007-07-12 at 16:03 +0200, Johann Spies wrote:
 On Thu, Jul 12, 2007 at 11:54:51AM +0200, Robert Schetterer wrote:
   
  I am now using ClamAV with the Sanesecurity signatures to
  eliminate PDF spam.
 
 I have tried that, but ClamAV did not pick up a single one when scanning
 a bunch of the PDF spam.  I have used one of the download scripts from
 Sanesecurity.  Do I have to do some other configuration to activate
 the databases for ClamAV?

Do you have ScanPDF enabled in your /etc/clamd.conf file? It's
typically disabled by default.
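
One way to check that setting from the shell. This is only a sketch: it assumes the config lives at /etc/clamd.conf as above and that the option appears as `ScanPDF yes` on an uncommented line. The function name scanpdf_enabled is hypothetical.

```shell
# Hypothetical check: succeed only if the config file contains an
# uncommented "ScanPDF yes" line. The default path is an assumption.
scanpdf_enabled() {
    conf=${1:-/etc/clamd.conf}
    grep -Eq '^[[:space:]]*ScanPDF[[:space:]]+yes[[:space:]]*$' "$conf"
}
```

For example, `scanpdf_enabled /etc/clamd.conf || echo "ScanPDF is not enabled"`.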

-Bill




Re: PDFInfo plugin with SA 3.1.7

2007-07-12 Thread Johann Spies
On Thu, Jul 12, 2007 at 11:54:51AM +0200, Robert Schetterer wrote:
  
 Hi, after getting good results with PDFInfo at the beginning,
 none of the PDF spam that followed was caught or marked.
 
 I am now using ClamAV with the Sanesecurity signatures to
 eliminate PDF spam.

I have tried that, but ClamAV did not pick up a single one when scanning
a bunch of the PDF spam.  I have used one of the download scripts from
Sanesecurity.  Do I have to do some other configuration to activate
the databases for ClamAV?

I have this in /var/lib/clamav at the moment:

  drwxr-xr-x  2 clamav clamav    4096 2007-07-12 14:22 clamav-29a2fe02977a1d4c26abf3fd199d1e70
  -rw-r--r--  1 clamav clamav  995915 2007-07-11 22:48 daily.cvd
  -rwxrwxr--  1 clamav clamav       0 2007-07-12 14:15 .dbLock
  -rw-r--r--  1 clamav clamav 9351789 2007-07-11 22:48 main.cvd
  -rw-r--r--  1 clamav clamav  294979 2007-07-12 15:05 MSRBL-Images.hdb
  -rw-r--r--  1 clamav clamav  228436 2007-07-12 15:05 MSRBL-SPAM.ndb
  -rw-r--r--  1 clamav clamav  180868 2007-07-12 10:26 phish.ndb.gz
  -rw-r--r--  1 clamav clamav  115449 2007-07-12 10:26 scam.ndb.gz


Regards
Johann
-- 
Johann Spies  Telefoon: 021-808 4036
Informasietegnologie, Universiteit van Stellenbosch

 Many, O LORD my God, are thy wonderful works which 
  thou hast done, and thy thoughts which are for us... 
 Psalms 40:5 


Re: PDFInfo plugin with SA 3.1.7

2007-07-12 Thread Robert Schetterer
Bill Randle wrote:
 On Thu, 2007-07-12 at 16:03 +0200, Johann Spies wrote:
 On Thu, Jul 12, 2007 at 11:54:51AM +0200, Robert Schetterer wrote:
 I am now using ClamAV with the Sanesecurity signatures to
 eliminate PDF spam.
 I have tried that, but ClamAV did not pick up a single one when scanning
 a bunch of the PDF spam.  I have used one of the download scripts from
 Sanesecurity.  Do I have to do some other configuration to activate
 the databases for ClamAV?
 
 Do you have ScanPDF enabled in your /etc/clamd.conf file? It's
 typically disabled by default.
 
   -Bill
 
Using ClamAV with the Sanesecurity signatures does not depend on
ScanPDF being enabled in your /etc/clamd.conf.

-- 
Best regards

Robert Schetterer

https://www.schetterer.org
Germany



Re: PDFInfo plugin with SA 3.1.7

2007-07-12 Thread Theo Van Dinter
On Thu, Jul 12, 2007 at 05:50:37PM +0530, Suhas Ingale wrote:
 Is there any other way/plugin to catch these pdf spam? And which is
 compatible with SA 3.1.7

Use sa-update.
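
A rough sketch of what that could look like in practice, assuming the installed SA ships the sa-update tool. The wrapper name refresh_rules and the SA_UPDATE / SPAMASSASSIN override variables are made up for illustration.

```shell
# Hypothetical wrapper: fetch rule updates, then lint the resulting rule set.
# SA_UPDATE and SPAMASSASSIN are illustrative overrides, not SA settings.
refresh_rules() {
    "${SA_UPDATE:-sa-update}" || return $?      # stop if sa-update exits non-zero
    "${SPAMASSASSIN:-spamassassin}" --lint      # sanity-check the rules
}
```

On versions where sa-update exits non-zero when nothing new is available, the lint step runs only after an update was actually installed.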

-- 
Randomly Selected Tagline:
A two party system is kind-of like a two stroke engine. Alot is wasted
 in the form of noise and hot, noxious gasses.
 - Jim Flanagan in [EMAIL PROTECTED]




Re: PDFInfo plugin with SA 3.1.7

2007-07-12 Thread John Rudd

Johann Spies wrote:

On Thu, Jul 12, 2007 at 11:54:51AM +0200, Robert Schetterer wrote:

Hi, after getting good results with PDFInfo at the beginning,
none of the PDF spam that followed was caught or marked.

I am now using ClamAV with the Sanesecurity signatures to
eliminate PDF spam.


I have tried that, but ClamAV did not pick up a single one when scanning
a bunch of the PDF spam.  I have used one of the download scripts from
Sanesecurity.  Do I have to do some other configuration to activate
the databases for ClamAV?

I have this in /var/lib/clamav at the moment:

  drwxr-xr-x  2 clamav clamav    4096 2007-07-12 14:22 clamav-29a2fe02977a1d4c26abf3fd199d1e70
  -rw-r--r--  1 clamav clamav  995915 2007-07-11 22:48 daily.cvd
  -rwxrwxr--  1 clamav clamav       0 2007-07-12 14:15 .dbLock
  -rw-r--r--  1 clamav clamav 9351789 2007-07-11 22:48 main.cvd
  -rw-r--r--  1 clamav clamav  294979 2007-07-12 15:05 MSRBL-Images.hdb
  -rw-r--r--  1 clamav clamav  228436 2007-07-12 15:05 MSRBL-SPAM.ndb
  -rw-r--r--  1 clamav clamav  180868 2007-07-12 10:26 phish.ndb.gz
  -rw-r--r--  1 clamav clamav  115449 2007-07-12 10:26 scam.ndb.gz



The problem is with those last 2 files.

Those are the ones you're getting from Sanesecurity.  They're gzipped. 
In order to actually have ClamAV _USE_ them, you need to gunzip them.


This also makes me wonder if you're actually testing the files before you 
put them into production.  If you're not, that's a rather bad idea.  At 
2am this morning, I had a non-usable phish.ndb come through.  If you're 
using clamd, that could have caused clamd to crash.
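
The check described here boils down to a single clamscan run against a staging directory. A minimal sketch, assuming clamscan is on the PATH; the function name sigs_load_ok and the CLAMSCAN / TESTFILE override variables are hypothetical.

```shell
# Hypothetical helper: return 0 only if clamscan can load the databases in
# directory $1 and scans a small known-clean file without complaint.
# CLAMSCAN and TESTFILE are illustrative overrides, not ClamAV settings.
sigs_load_ok() {
    "${CLAMSCAN:-clamscan}" --database="$1" "${TESTFILE:-/bin/sh}" >/dev/null 2>&1
}
```

A staging step could then read `sigs_load_ok /tmp/sanecomputing && mv /tmp/sanecomputing/phish.ndb /var/lib/clamav/`, so that a broken download never reaches the live database directory.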



Here's the script I use for importing from MSRBL and Sanesecurity.  I 
run it out of cron with -all, on the hour.  You'll probably need to 
modify some bits of the first few lines (down to the rsync binary location):


#!/usr/local/bin/perl

# Paths to external tools; adjust these for your system.
my $chmod = "/bin/chmod";
my $mv = "/bin/mv";
my $gunzip = "/usr/bin/gunzip";
my $clamscan = "/usr/local/bin/clamscan";
my $testfile = "/bin/sh";
my $diff = "/usr/bin/diff";

# Download command per URL scheme.
my %methods =
   ("http"  => "/usr/local/bin/wget -q",
    "rsync" => "/usr/bin/rsync -q");

my %urls =
   ("msrbl-spam" => "rsync://rsync.mirror.msrbl.com/msrbl/MSRBL-SPAM.ndb",
    "msrbl-imgs" => "rsync://rsync.mirror.msrbl.com/msrbl/MSRBL-Images.hdb",
    "sane-phish" => "http://www.sanesecurity.com/clamav/phishsigs/phish.ndb.gz",
    "sane-scam"  => "http://www.sanesecurity.com/clamav/scamsigs/scam.ndb.gz");


# Staging directory per distribution.
my %tmpdirs =
   ("msrbl-spam" => "/tmp/msrbl",
    "msrbl-imgs" => "/tmp/msrbl",
    "sane-phish" => "/tmp/sanecomputing",
    "sane-scam"  => "/tmp/sanecomputing");


# Live ClamAV database directory per distribution.
my %destdirs =
   ("msrbl-spam" => "/var/lib/clamav",
    "msrbl-imgs" => "/var/lib/clamav",
    "sane-phish" => "/var/lib/clamav",
    "sane-scam"  => "/var/lib/clamav");


my $getall = 0;
my (@distros, $dist, $tmpdir, $proto, $method, $file, $retcode);
my ($ufile, $diffout, $destdir, $url);

# -a, -al, -all (or with two dashes) selects every distribution.
if ($ARGV[0] =~ /--?al?l?/) {
   $getall = 1;
   @distros = keys(%urls);
   }
else {
   @distros = @ARGV;
   }

foreach $dist (sort (@distros)) {
   $tmpdir = $tmpdirs{$dist};
   $destdir = $destdirs{$dist};
   $url = $urls{$dist};
   $proto = $url; $proto =~ s/:.*$//;             # scheme: rsync or http
   $method = $methods{$proto};
   $file = $url; $file =~ s{^.*/([^/]*)$}{$1};    # basename of the URL
   $ufile = $file; $ufile =~ s/\.gz$//;           # name after gunzip

   if ((-e $tmpdir) && (!(-d $tmpdir))) {
      rename ($tmpdir, ($tmpdir . ".bad"))
         || die "tmpdir $tmpdir isn't a directory, can't rename it";
      mkdir ($tmpdir) || die "can't make tmpdir $tmpdir";
      }
   elsif (! (-e $tmpdir)) {
      mkdir ($tmpdir) || die "can't make tmpdir $tmpdir";
      }
   system ("$chmod 700 $tmpdir");

   if ((-e $destdir) && (!(-d $destdir))) {
      rename ($destdir, ($destdir . ".bad"))
         || die "destdir $destdir isn't a directory, can't rename it";
      mkdir ($destdir) || die "can't make destdir $destdir";
      }
   elsif (! (-e $destdir)) {
      mkdir ($destdir) || die "can't make destdir $destdir";
      }
   system ("$chmod 775 $destdir");

   chdir ($tmpdir);

   # remove leftovers from a previous run
   if (-e $file) {
      unlink ($file);
      }

   if (-e $ufile) {
      unlink ($ufile);
      }

   # download the file
   if ($proto eq "rsync") {
      system("$method $url $file");
      }
   elsif ($proto eq "http") {
      system("$method $url");
      }

   unless (-e $file) {
      print "didn't get download file $file\n";
      last;
      }

   if ($file =~ /\.gz$/) {
      system("$gunzip $file");
      $file = $ufile;
      }

   # test against clamav
   $retcode =
      system("$clamscan --database=$tmpdir $testfile > /dev/null 2>&1") / 256;


   if ($retcode == 0) {
      # clamscan of testfile worked and didn't find a virus
      # lets see if it's the same file we already have/had
      $diffout = (system("$diff --brief --speed-large-files $tmpdir/$file $destdir/$file > /dev/null 2>/dev/null")) / 256;

      if ($diffout == 0) {
         # file hasn't changed
         unlink ($file);
         }
      else {
         print "$file appears to have changed, moving to destination\n";
         system("$mv $tmpdir/$file $destdir/$file");
         system("$chmod 644 $destdir/$file");
         }
      }
   # the archived copy of the script breaks off here; closing the loop
   }
   

PDFInfo plugin with SA 3.1.7

2007-07-11 Thread Suhas Ingale
Hello,

 

I am trying to run the PDFInfo plugin with SA 3.1.7. SA registers the plugin
successfully but does not scan the PDFs in the emails. According to Dallas
Engelken (creator of PDFInfo), the MIME parser in SA is not seeing a PDF
attachment on this message.

Has anyone tried running the PDFInfo plugin with version 3.1.7?

 

 

 



Re: PDFInfo plugin with SA 3.1.7

2007-07-11 Thread Daniel J McDonald
On Wed, 2007-07-11 at 14:49 +0530, Suhas Ingale wrote:

 Has anyone tried running PDFInfo plugin with 3.1.7 version?
 

No, finally got it working yesterday evening using 3.2.1, but the
initial results are underwhelming.  Almost 100% overlap with
TVD_SPACE_RATIO.  Only one miss:
sudo grep GMD_PDF /var/log/mail/info | grep -v TVD_SPACE_RATIO
Jul 11 03:26:15 sa amavis[25324]: (25324-17) SPAM, [EMAIL PROTECTED] -
[EMAIL PROTECTED], Yes, score=25.456 tag=-99 tag2=4.5
kill=6.31 tests=[BODY_8BITS=1.5, BOTNET_CLIENT=0.01,
BOTNET_CLIENTWORDS=0, BOTNET_IPINHOSTNAME=0, BOTNET_W=2,
DKIM_POLICY_SIGNSOME=0, FH_HELO_EQ_D_D_D_D=0.498,
GMD_PDF_BAD_FUZZY=3.75, GMD_PDF_HORIZ=0.25, GMD_PDF_STOX=1,
HELO_DYNAMIC_DHCP=1.52, HELO_DYNAMIC_IPADDR=2.935, L_P0F_W=1,
RAZOR2_CF_RANGE_51_100=0.5, RAZOR2_CF_RANGE_E4_51_100=1.5,
RAZOR2_CHECK=0.5, RCVD_IN_BL_SPAMCOP_NET=2.188, RCVD_IN_PBL=0.509,
RCVD_IN_XBL=2.896, RDNS_DYNAMIC=0.1, UNWANTED_LANGUAGE_BODY=2.8],
autolearn=disabled

That's out of
[EMAIL PROTECTED] ~]$ sudo grep -o -P 'GMD_PDF.+?=' /var/log/mail/info | sort | uniq -c
684 GMD_PDF_BAD_FUZZY=
 43 GMD_PDF_HORIZ=
 67 GMD_PDF_STOX=
 24 GMD_PDF_VERT=


-- 
Daniel J McDonald, CCIE # 2495, CISSP # 78281, CNX
Austin Energy
http://www.austinenergy.com


Re: PDFInfo plugin with SA 3.1.7

2007-07-11 Thread Dallas Engelken

Daniel J McDonald wrote:

On Wed, 2007-07-11 at 14:49 +0530, Suhas Ingale wrote:

  

Has anyone tried running PDFInfo plugin with 3.1.7 version?




No, finally got it working yesterday evening using 3.2.1, but the
initial results are underwhelming.  Almost 100% overlap with
TVD_SPACE_RATIO.  Only one miss:
  


First of all,  TVD_SPACE_RATIO only applies for those running v3.2, 
whereas PDFInfo.pm can be used with any 3.x version..


Secondly, TVD_SPACE_RATIO can fire almost at will without a body.

$ echo "" | spamassassin
2.9 TVD_SPACE_RATIO        BODY: TVD_SPACE_RATIO


Take the basic mime part from a pdf stock spam... it looks similar to this

--050701020003040207010006
Content-Type: text/plain; charset=iso-8859-2; format=flowed
Content-Transfer-Encoding: 7bit


--050701020003040207010006

and it fires on TVD_SPACE_RATIO fine.

$ cat /root/sample2.txt | spamassassin -D 2>&1 | grep -i tvd
[26686] dbg: tvd: word [SPAM-8.3]- Re: warning_6042146166.pdf
[26686] dbg: tvd: len=39
[26686] dbg: tvd: spaces 2 nonspaces 37
[26686] dbg: tvd: pct = 5
[26686] dbg: tvd: final = 5
[26686] dbg: rules: ran eval rule TVD_SPACE_RATIO ======> got hit (1)


change the mime part to

--050701020003040207010006
Content-Type: text/plain; charset=iso-8859-2; format=flowed
Content-Transfer-Encoding: 7bit

tvd no longer fires now

--050701020003040207010006

$ cat /root/sample2.txt | spamassassin -D 2>&1 | grep -i tvd
[26739] dbg: tvd: word [SPAM-8.3]- Re: warning_6042146166.pdf
[26739] dbg: tvd: len=39
[26739] dbg: tvd: spaces 2 nonspaces 37
[26739] dbg: tvd: pct = 5
[26739] dbg: tvd: word tvd no longer fires now
[26739] dbg: tvd: len=24
[26739] dbg: tvd: spaces 4 nonspaces 20
[26739] dbg: tvd: pct = 20
[26739] dbg: tvd: final = 20

... and 20 isn't between tvd_vertical_words('0','10')
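
The arithmetic behind those debug lines can be sketched in shell. This is inferred from the output above (2 spaces / 37 nonspaces gives 5, 4 / 20 gives 20), not copied from the SpamAssassin source. The function names, the divide-by-zero fallback and the half-open range are assumptions.

```shell
# Percentage of spaces relative to non-space characters, truncated to an
# integer, matching the "pct" values in the debug output above.
tvd_pct() {
    spaces=$1
    nonspaces=$2
    if [ "$nonspaces" -eq 0 ]; then
        echo 100                        # assumption: avoid dividing by zero
    else
        echo $(( 100 * $spaces / $nonspaces ))
    fi
}

# Succeed when min <= pct < max (the half-open range is an assumption).
tvd_in_range() {
    [ "$1" -ge "$2" ] && [ "$1" -lt "$3" ]
}
```

With these, `tvd_pct 2 37` prints 5 and `tvd_in_range 5 0 10` succeeds, while `tvd_pct 4 20` prints 20 and `tvd_in_range 20 0 10` fails, which matches the two runs above.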

Easy for spammy to avoid that.  Even more, this rule has a good chance 
of falsing.  I emailed myself a png from webalizer without any body text.


# cat test | spamassassin -D 2>&1 | grep -i tvd
[27390] dbg: tvd: word hourly_usage_200706.png
[27390] dbg: tvd: len=24
[27390] dbg: tvd: spaces 0 nonspaces 24
[27390] dbg: tvd: pct = 0
[27390] dbg: tvd: final = 0
[27390] dbg: rules: ran eval rule TVD_SPACE_RATIO ======> got hit (1)

The fact is, email is FTP for Dummies...  and IMHO,  TVD_SPACE_RATIO 
may be a bit high at 2.9.


BTW,   v0.3 of PDFInfo.pm is now posted - so for those that have it 
already, you might want to sync up


# counts  GMD_PDF_HORIZ       135s/0h of 6132 corpus (4555s/1577h AxB-MANUAL) 07/11/07
# counts  GMD_PDF_HORIZ       31s/0h of 11773 corpus (10988s/785h AxB2-TRAPS) 07/11/07
# counts  GMD_PDF_SQUARE      36s/0h of 6132 corpus (4555s/1577h AxB-MANUAL) 07/11/07
# counts  GMD_PDF_SQUARE      11s/0h of 11773 corpus (10988s/785h AxB2-TRAPS) 07/11/07
# counts  GMD_PDF_VERT        24s/0h of 6132 corpus (4555s/1577h AxB-MANUAL) 07/11/07
# counts  GMD_PDF_VERT        10s/0h of 11773 corpus (10988s/785h AxB2-TRAPS) 07/11/07
# counts  GMD_PDF_FUZZY1_T1   591s/0h of 6132 corpus (555s/1577h AxB-MANUAL) 07/11/07
# counts  GMD_PDF_FUZZY1_T1   199s/0h of 11773 corpus (10988s/785h AxB2-TRAPS) 07/11/07
# counts  GMD_PDF_FUZZY2_T1   199s/0h of 11773 corpus (10988s/785h AxB2-TRAPS) 07/11/07
# counts  GMD_PDF_FUZZY2_T1   591s/0h of 6132 corpus (555s/1577h AxB-MANUAL) 07/11/07
# counts  GMD_PDF_FUZZY2_T2   118s/0h of 6132 corpus (555s/1577h AxB-MANUAL) 07/11/07
# counts  GMD_PDF_FUZZY2_T2   1s/0h of 10767 corpus (9986s/781h AxB2-TRAPS) 07/11/07
# counts  GMD_PDF_FUZZY2_T3   0s/0h of 10767 corpus (9986s/781h AxB2-TRAPS) 07/11/07
# counts  GMD_PDF_FUZZY2_T3   25s/0h of 5641 corpus (4064s/1577h AxB-MANUAL) 07/11/07
# counts  GMD_PDF_FUZZY2_T4   105s/0h of 6132 corpus (555s/1577h AxB-MANUAL) 07/11/07
# counts  GMD_PDF_FUZZY2_T4   28s/0h of 10767 corpus (9986s/781h AxB2-TRAPS) 07/11/07
# counts  GMD_AUTHOR_COLET    1s/0h of 10767 corpus (9986s/781h AxB2-TRAPS) 07/11/07
# counts  GMD_AUTHOR_COLET    2s/0h of 6132 corpus (555s/1577h AxB-MANUAL) 07/11/07
# counts  GMD_AUTHOR_MOBILE   2s/0h of 6132 corpus (555s/1577h AxB-MANUAL) 07/11/07
# counts  GMD_AUTHOR_MOBILE   55s/0h of 10767 corpus (9986s/781h AxB2-TRAPS) 07/11/07
# counts  GMD_AUTHOR_OOO      1s/0h of 10767 corpus (9986s/781h AxB2-TRAPS) 07/11/07
# counts  GMD_AUTHOR_OOO      118s/0h of 6132 corpus (555s/1577h AxB-MANUAL) 07/11/07
# counts  GMD_AUTHOR_HPADMIN  105s/0h of 6132 corpus (4555s/1577h AxB-MANUAL) 07/11/07
# counts  GMD_AUTHOR_HPADMIN  27s/0h of 11773 corpus (10988s/785h AxB2-TRAPS) 07/11/07
# counts  GMD_PRODUCER_GPL    227s/0h of 6132 corpus (555s/1577h AxB-MANUAL) 07/11/07
# counts  GMD_PRODUCER_GPL    85s/0h of 10767 corpus (9986s/781h AxB2-TRAPS)