Re: [spamdyke-users] Spam Stats

2009-09-03 Thread Mirko Buffoni
Sergio, Eric,

It's nothing really worth worldwide attention. It's a simple PHP
script that collects data from various sources and aggregates it.
Here is the relevant part:

 // Arguments follow the same order as the format lines.
 // $spamdyke is the pre-rendered breakdown printed under SPAMMER.
 $res = sprintf( "Antispam Statistics for: " . date('d/m/Y', time()-86400) .
     "\n\n" .
     "     Good : % 6d = %6.2f %%\n" .
     "   Unsure : % 6d = %6.2f %%\n" .
     "    Virus : % 6d = %6.2f %%\n" .
     "BAD Sender: % 6d = %6.2f %%\n" .
     "BAD  Rcpt : % 6d = %6.2f %%\n" .
     "Pure SPAM : % 6d = %6.2f %%\n" .
     "  SPAMMER : % 6d = %6.2f %%\n%s" .
     "--\n" .
     "    Total : % 6d = 100.00 %%\n\n",
     $pure_good,  100.0 * $pure_good / $total_mails,
     $unsure,     100.0 * $unsure / $total_mails,
     $virus,      100.0 * $virus / $total_mails,
     $bad_sender, 100.0 * $bad_sender / $total_mails,
     $bad_rcpt,   100.0 * $bad_rcpt / $total_mails,
     $pure_spam,  100.0 * $pure_spam / $total_mails,
     $intrusion,  100.0 * $intrusion / $total_mails,
     $spamdyke,
     $total_mails );

It's not based on any other statistics script, as it only needs to serve
my own purposes.  Virus stats are collected through clamav, bad_sender/rcpt
are greps for chkuser log entries, and so on.
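
For readers who do not use PHP, the same kind of report can be sketched in Python. This is only an illustration of the count-and-percentage layout, not Mirko's script: the function name `format_report` and the dictionary layout are assumptions, and the numbers are the figures quoted in this thread.

```python
def format_report(counts, total_label="Total"):
    """Render one 'label : count = percent %' line per category."""
    total = sum(counts.values())
    lines = []
    for label, count in counts.items():
        # Guard against an empty log so we never divide by zero.
        pct = 100.0 * count / total if total else 0.0
        lines.append(f"{label:>10} : {count:6d} = {pct:6.2f} %")
    lines.append("--")
    lines.append(f"{total_label:>10} : {total:6d} = 100.00 %")
    return "\n".join(lines)


# Category counts taken from the stats posted in this thread.
counts = {
    "Good": 1025,
    "Unsure": 183,
    "Virus": 62,
    "BAD Sender": 5114,
    "BAD Rcpt": 212,
    "Pure SPAM": 45997,
    "SPAMMER": 97940,
}
report = format_report(counts)
print(report)
```

Deriving the total from the category counts, rather than passing it in separately, keeps the percentages consistent with the lines that are printed.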

Mirko

At 16:10 02/09/2009 -0700, you wrote:
> Sergio Minini (NETKEY) wrote:
> > Mirko Buffoni wrote:
> > > Goods average between 500 and 2000 daily.  Figures are however
> > > pretty standard.  Spamdyke filters out about 60k attempts daily.
> > > Here are yesterday's stats:
> > >
> > >      Good :   1025 =   0.68 %
> > >    Unsure :    183 =   0.12 %
> > >     Virus :     62 =   0.04 %
> > > BAD Sender:   5114 =   3.40 %
> > > BAD  Rcpt :    212 =   0.14 %
> > > Pure SPAM :  45997 =  30.56 %
> > >   SPAMMER :  97940 =  65.06 %
> > >    |
> > >    \.........BLACKLISTED_KEYWORD :  29608 =  30.23 %
> > >    \..........DENIED_EARLYTALKER :      3 =   0.00 %
> > >    \...........DENIED_IP_IN_RDNS :  30447 =  31.09 %
> > >    \............DENIED_RBL_MATCH :  23268 =  23.76 %
> > >    \.........DENIED_SENDER_NO_MX :  13070 =  13.34 %
> > >    \..DENIED_TOO_MANY_RECIPIENTS :      1 =   0.00 %
> > >    \DENIED_UNQUALIFIED_RECIPIENT :      1 =   0.00 %
> > >    \.....................TIMEOUT :   1542 =   1.57 %
> > >
> > > --
> > >     Total : 150533 = 100.00 %
> >
> > Mirko, nice layout of stats.
> > Could you please share the script you are using to get them?
> > Thanks!
> > -Sergio
>
> Ditto! Somebody did a nice job!
> (I wonder if this is based on the spamdyke-stats.pl script that
> Felix Buenemann did last October)
>
> Pleeeze Mirko? I'd like to include it with the qmailtoaster-plus scripts.
>
> --
> -Eric 'shubes'
___
spamdyke-users mailing list
spamdyke-users@spamdyke.org
http://www.spamdyke.org/mailman/listinfo/spamdyke-users


Re: [spamdyke-users] Spam Stats

2009-09-03 Thread Eric Shubert
Mirko,

That answers the 'pretty formatting' part, but the meat of the sandwich
is collecting the stats. I'm afraid that "Virus stats are collected
through clamav, bad_sender/rcpt are chkuser GREPs, and so on" leaves us
hanging. :(

The data collection code is what I'm most interested in. Are the stats
gathered continually and stored, or are they gathered dynamically on
demand? The $spamdyke part is particularly mysterious. If the code is a
bit disjointed, that's ok. I'm sure that we can work with it.

Thanks again.

-- 
-Eric 'shubes'


Re: [spamdyke-users] Spam Stats

2009-09-03 Thread Mirko Buffoni
Hi Eric,

At 06:50 03/09/2009 -0700, you wrote:
> Mirko,
>
> That answers the 'pretty formatting' part, but the meat of the sandwich
> is collecting the stats. I'm afraid that "Virus stats are collected
> through clamav, bad_sender/rcpt are chkuser GREPs, and so on" leaves us
> hanging. :(

You can collect the data in various ways.  For continuous collection
I suggest using the collectd package, although for spam/mail statistics
I'm afraid you'll have to write your own plugins.
To count the entries in a daily rotated log file, a simple

    grep "VIRUS FOUND" clamav/current.1 | wc -l

is enough.  The same applies to other patterns in the log file.
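
The grep-and-count approach described above extends naturally to several patterns in one pass. The following is a minimal Python sketch of that idea; the function name `count_patterns` is an assumption, and the sample lines are invented stand-ins for a rotated log file, not real clamav output.

```python
from collections import Counter


def count_patterns(lines, patterns):
    """Count how many lines contain each substring (like grep ... | wc -l)."""
    totals = Counter({p: 0 for p in patterns})
    for line in lines:
        for p in patterns:
            if p in line:
                totals[p] += 1
    return dict(totals)


# Made-up sample lines standing in for yesterday's rotated log.
sample = [
    "2009-09-02 10:00:01 message 1: VIRUS FOUND",
    "2009-09-02 10:00:05 message 2: clean",
    "2009-09-02 10:01:12 message 3: VIRUS FOUND",
]
virus_counts = count_patterns(sample, ["VIRUS FOUND"])
print(virus_counts)
```

Because the function accepts any iterable of lines, an open file object for a log such as clamav/current.1 can be passed in place of `sample`.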

> The data collection code is what I'm most interested in. Are the stats
> gathered continually and stored, or are they gathered dynamically on

Since this is a daily statistic, the data is collected after logfile
rotation and then stored/processed.

Mirko


Re: [spamdyke-users] Spam Stats

2009-09-03 Thread Eric Shubert
Mirko Buffoni wrote:
> Hi Eric,
>
> At 06:50 03/09/2009 -0700, you wrote:
> > Mirko,
> >
> > That answers the 'pretty formatting' part, but the meat of the sandwich
> > is collecting the stats. I'm afraid that "Virus stats are collected
> > through clamav, bad_sender/rcpt are chkuser GREPs, and so on" leaves us
> > hanging. :(
>
> You can collect the data in various ways.  For continuous collection
> I suggest using the collectd package, although for spam/mail statistics
> I'm afraid you'll have to write your own plugins.
> To count the entries in a daily rotated log file, a simple
>
>     grep "VIRUS FOUND" clamav/current.1 | wc -l
>
> is enough.  The same applies to other patterns in the log file.

I'm very familiar with this sort of thing.

> > The data collection code is what I'm most interested in. Are the stats
> > gathered continually and stored, or are they gathered dynamically on
>
> Since this is a daily statistic, the data is collected after logfile
> rotation and then stored/processed.

Can you share the code that does this collecting and storing?

> Mirko


-- 
-Eric 'shubes'


Re: [spamdyke-users] Spam Stats

2009-09-03 Thread Sebastian Grewe
Hey list,

I just looked at those stats and compared the output to what I am seeing
on our boxes, and I started wondering:

When I check the log files, spamdyke logs the following:

FILTER_RBL_MATCH : When listed in the RDNS
DENIED_RBL_MATCH : For each recipient address in the mail

So basically one mail results in 1 FILTER match but 1 DENIED match for
each recipient address.

Doesn't that mean that counting the DENIED matches gives not the number
of mails actually denied but a much higher number? I am currently
counting both the FILTER_ and DENIED_ flags and summing them to find out
how many mails I rejected, but I am guessing that counting FILTER_ alone
would make more sense.
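
The difference between the two counts can be illustrated with a short Python sketch. The sample log lines below are invented for illustration, and real spamdyke log lines may be laid out differently; only the FILTER_ and DENIED_ code names come from this thread. The helper name `tally` is an assumption.

```python
import re
from collections import Counter

# Matches result codes such as FILTER_RBL_MATCH or DENIED_RBL_MATCH.
CODE_RE = re.compile(r"\b(FILTER|DENIED)_([A-Z_]+)\b")


def tally(lines):
    """Return one Counter of reasons per prefix ('FILTER' and 'DENIED')."""
    tallies = {"FILTER": Counter(), "DENIED": Counter()}
    for line in lines:
        match = CODE_RE.search(line)
        if match:
            prefix, reason = match.groups()
            tallies[prefix][reason] += 1
    return tallies


# Invented lines: one connection addressed to three recipients.
sample = [
    "spamdyke[100]: FILTER_RBL_MATCH ip: 192.0.2.1",
    "spamdyke[100]: DENIED_RBL_MATCH to: x@example.net",
    "spamdyke[100]: DENIED_RBL_MATCH to: y@example.net",
    "spamdyke[100]: DENIED_RBL_MATCH to: z@example.net",
]
tallies = tally(sample)
print(tallies["FILTER"]["RBL_MATCH"], tallies["DENIED"]["RBL_MATCH"])
```

Keeping the two prefixes in separate counters avoids adding per-connection and per-recipient figures together, which is the double counting the question above is worried about.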

Here is my output. I wrote the script today; Mirko's output inspired me :)
It's tailored to work for our environment, though.

Total      : 1571 (100.0000%)
Legitimate : 123 (7.8200%)
   |
   |-FILTER_WHITELIST : 61 (49.5900%)
      |
      |-_RECIPIENT_WHITELIST : 61 (100.0000%)

Rejected   : 1448 (92.1700%)
   |
   |-FILTER : 539 (37.2200%)
   |  |
   |  |-  _RDNS_MISSING : 192 (35.6200%)
   |  |-  _OTHER        : 12 (2.2200%)
   |  |-  _RBL_MATCH    : 297 (55.1000%)
   |       |
   |       |- _RBL_MATCH_SPAMHAUS: 171 (57.5700%)
   |       |- _RBL_MATCH_SPAMCOP : 126 (42.4200%)
   |
   |-DENIED : 905 (62.5000%)
   |  |
   |  |-  _RDNS_MISSING         : 415 (45.8500%)
   |  |-  _RBL_MATCH            : 446 (49.2800%)
   |  |-  _EARLYTALKER          : 0 (0%)
   |  |-  _SENDER_NO_MX         : 14 (1.5400%)
   |  |-  _TOO_MANY_RECIPIENTS  : 0 (0%)
   |  |-  _UNQUALIFIED_RECIPIENT: 0 (0%)
   |
   |-Clamav : 4 (.2700%)
      |
      |-  Phishing  : 4 (100.0000%)
      |-  Trojan    : 0 (0%)


-- 
Sebastian Grewe
Jammicron | Experts in Powering Online Sales
Phone 604.331.0586 x 104
Fax 604.331.0587
www.jammicron.com | www.qwik.ca



Re: [spamdyke-users] Spam Stats

2009-09-03 Thread Eric Shubert
I don't have any FILTER_RBL messages. I'm using log-level=2.
What log level are you using?

I think that it's appropriate to count each recipient as a separate 
email. If the message came from a qmail server, it would be that way 
anyhow. And after all, that's how many messages end up being delivered.

-- 
-Eric 'shubes'


Re: [spamdyke-users] Spam Stats

2009-09-03 Thread Sebastian Grewe
Hey Eric,

Yeah, my log level is higher - didn't think about that.

I was thinking more of a statistic for incoming connections. If
you look at it as a counter for mails being delivered, then yes, DENIED
makes way more sense.

I will just keep the counters like they are now, they still give me a
pretty good idea of what's going on.

Thanks Eric, as fast as usual!

Sebastian

-- 
Sebastian Grewe
Jammicron | Experts in Powering Online Sales
Phone 604.331.0586 x 104
Fax 604.331.0587
www.jammicron.com | www.qwik.ca



Re: [spamdyke-users] Qmail + spamdyke + chkuser

2009-09-03 Thread Sam Clippinger
spamdyke does not change the way incoming emails are received or 
processed on a qmail server.  Without spamdyke, an incoming connection 
is accepted by a daemon called tcpserver, which starts a program 
called qmail-smtpd and exits.  qmail-smtpd communicates with the 
remote server and accepts or rejects the message.

With spamdyke, incoming connections are still accepted by tcpserver, but 
it starts a copy of spamdyke instead of qmail-smtpd.  spamdyke then 
starts a copy of qmail-smtpd.  When the remote server sends any data, 
spamdyke passes it to qmail-smtpd.  When qmail-smtpd produces any 
output, spamdyke passes it to the remote server.  That's why spamdyke is 
often described as a "filter" or a "pipe" -- it just passes the data 
back and forth between qmail-smtpd and the remote server.  If none of 
spamdyke's filters are triggered, neither qmail-smtpd nor the remote 
server can tell spamdyke was even running.  When spamdyke wants to block 
a message, it disconnects qmail-smtpd and terminates the process, so 
qmail-smtpd believes the remote server just disconnected without sending 
a complete message.  At the same time, spamdyke continues responding to 
the remote server, imitating qmail-smtpd but sending errors and 
rejection codes instead of accepting the message.
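
The pass-through behaviour described above can be modelled with a toy Python function. This is an in-memory sketch only, not spamdyke's implementation: `relay_session`, `fake_backend`, the blocking rule and the "554 Refused" reply text are all assumptions made for illustration.

```python
def relay_session(client_lines, backend, should_block):
    """Toy model of a pass-through SMTP filter.

    client_lines: commands sent by the remote server, in order.
    backend: callable standing in for qmail-smtpd (command -> reply).
    should_block: callable deciding whether a command triggers a filter.
    Returns (replies sent to the remote server, commands the backend saw).
    """
    replies, forwarded = [], []
    blocked = False
    for line in client_lines:
        if not blocked and should_block(line):
            # From here on the backend is cut off and the filter answers.
            blocked = True
        if blocked:
            replies.append("554 Refused")  # illustrative rejection text
        else:
            forwarded.append(line)
            replies.append(backend(line))
    return replies, forwarded


def fake_backend(command):
    """Stand-in for qmail-smtpd that accepts every command."""
    return "250 ok"


session = [
    "HELO mail.example.com",
    "MAIL FROM:<a@example.com>",
    "RCPT TO:<b@example.net>",
    "DATA",
]
replies, forwarded = relay_session(
    session, fake_backend, lambda cmd: cmd.startswith("RCPT TO:")
)
print(replies)
print(forwarded)
```

When `should_block` never fires, every command reaches the backend and every reply comes from it, which mirrors the point that neither side can tell the filter is present.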

So, chkuser should install and function the same way whether spamdyke is 
present or not.  If it does not, you may have found a bug in spamdyke -- 
please send more details about your setup and the errors so I can get it 
fixed.

-- Sam Clippinger

Youri V. Kravatsky wrote:
> Hello Eric,
>
> > chkuser is implemented via a patch to qmail.
>
> Well, BEFORE adding spamdyke, my chkuser was working perfectly (rejecting
> mails to nonexistent and over-quota users). Now, as far as I understand,
> spamdyke injects received mails directly into the qmail queue, or sends
> them through local smtp, so they are always accepted, and the queue of my
> server is full of autoresponses. Sam said that there is no problem in
> using spamdyke and chkuser together, so I am asking again: how can I do
> it? I didn't find anything about it in the docs/FAQs.

   