Re: Bad performance of Bayes with MySQL cluster

2009-08-15 Thread Henrik K
On Fri, Aug 14, 2009 at 07:43:37PM +0200, Jorn Argelo wrote:
 Hi All,

 I'm running spamassassin 3.2.5 on RHEL 5.3 x86_64. We have three boxes,  
 and all three of them are sharing the same bayes DB using a MySQL  
 cluster, version 7.0.6 (based on 5.1.34). The cluster has 2 datanodes  
 with a quadcore and 4 GB of memory. Everything is working fine, even the  
 AWL in SQL, except for Bayes. The bayes database currently houses a bit  
 less than 500k tokens and the database size is not very big either, as  
 the datanodes have less than 1 GB of storage in use. I've followed the  
 instructions from the Spamassassin wiki, and I also used the supplied  
 bayes_mysql.sql file to create my tables. In case anyone is interested,  
 you can find the cluster.ini and the my.cnf used on the SQL nodes here:

 http://www.wcborstel.com/web/mysql/my.cnf

skip-innodb

That's pretty much the reason. You _need_ to use InnoDB as it has row level
locking. MyISAM just kills Bayes.

 Now the problem at the first glance seems to be, from my perspective  
 (please correct me if I'm wrong), the actual queries being done. For  
 every mail being scanned by spamassassin, it seems to be doing the  
 SELECT RPAD(token, 5, ' '), spam_count, ham_count, atime FROM  
 bayes_token query every time. This is effectively requesting the entire
 bayes_token table

What you are seeing are expiry runs.

Since you are using MyISAM right now, the whole table is locked for such
operations, so you are pretty much hosed.

In any case, you should set bayes_auto_expire 0 and run the expiry yourself,
for example once every night when traffic is lower.

 It seems that the query cache is either not suitable for this or I am
 doing something majorly wrong :)

You are right. Better to disable it completely if nothing else running
uses it, and save a little CPU.



Re: Bad performance of Bayes with MySQL cluster

2009-08-15 Thread Jorn Argelo

Henrik K wrote:

skip-innodb

That's pretty much the reason. You _need_ to use InnoDB as it has row level
locking. MyISAM just kills Bayes.
  
Actually I'm using NDB and not MyISAM. I need a clustered storage 
engine, otherwise the bayes DB can't really be shared. If I create an 
InnoDB table on one SQL node, it doesn't show up at the other SQL node, 
while this is the case with an NDB storage engine.


What I can do however, is point all mailservers to one SQL node. I just 
need to synchronize the bayes_token table to the other SQL node I guess. 
Do you have an idea about this?
  
Now the problem at the first glance seems to be, from my perspective  
(please correct me if I'm wrong), the actual queries being done. For  
every mail being scanned by spamassassin, it seems to be doing the  
SELECT RPAD(token, 5, ' '), spam_count, ham_count, atime FROM  
bayes_token query every time. This is effectively requesting the entire
bayes_token table



What you are seeing are expiry runs.

Since you are using MyISAM right now, the whole table is locked for such
operations, so you are pretty much hosed.

In any case, you should set bayes_auto_expire 0 and run the expiry yourself,
for example once every night when traffic is lower.
  
Thanks for this, I was not aware of it. Running expiry runs manually is 
done by sa-learn --force-expiry, correct?
  

It seems that the query cache is either not suitable for this or I am
doing something majorly wrong :)



You are right. Better to disable it completely if nothing else running
uses it, and save a little CPU.
  
Good to know. There will be other applications running on it as well, so
I'll reduce the size of the query cache by a good bit.


Thanks a lot for your feedback.

Jorn






Re: Bad performance of Bayes with MySQL cluster

2009-08-15 Thread Henrik K
On Sat, Aug 15, 2009 at 09:50:41AM +0200, Jorn Argelo wrote:
 Actually I'm using NDB and not MyISAM. I need a clustered storage  
 engine, otherwise the bayes DB can't really be shared. If I create an  
 InnoDB table on one SQL node, it doesn't show up at the other SQL node,  
 while this is the case with an NDB storage engine.

Ah right, sorry. I have no idea about NDB and how it performs for SA.

 What I can do however, is point all mailservers to one SQL node. I just  
 need to synchronize the bayes_token table to the other SQL node I guess.  
 Do you have an idea about this?

MySQL replication? Maybe search on spamassassin-users archives to find
experiences.

 Thanks for this, I was not aware of it. Running expiry runs manually is  
 done by sa-learn --force-expiry, correct?

Yep.



Re: Barracuda RBL in first place

2009-08-15 Thread --[ UxBoD ]--
- Marc Perkel m...@perkel.com wrote: 
 
 
 Aaron Wolfe wrote: 

On Fri, Aug 14, 2009 at 11:24 AM, Chris Owen ow...@hubris.net wrote: 

On Aug 14, 2009, at 10:13 AM, Mike Cardwell wrote: 

The comparisons on that page are useless. What matters is list policy,
reliability and reputation.

SpamHaus is hands down the best dnsbl.

While I certainly agree that SpamHaus is very good, I would argue that
Invaluement is currently better.  It certainly stops a lot more spam here and
I think false positives are still extremely low.

Invaluement lists are also the top performers at my site:

Total messages: 273235355
Total blocked:  227710956 83.34%

                            Unknown user 32.00% (32.00%)            87427696
                              Greylisted 24.88% (16.92%)            46225401
                               Throttled 11.03% (5.64%)             15399444
                     Relay access denied 0.01%  (0.00%)                 7034
                   Bogus DNS (Broadcast) 0.01%  (0.00%)                11692
              Bogus DNS (RFC 1918 space) 0.07%  (0.03%)                82135
                         Spoofed Address 0.26%  (0.12%)               319551
                      Unclassified Event 0.77%  (0.35%)               949388
                 Temporary Local Problem 0.01%  (0.00%)                 8165
             Require FQDN sender address 0.04%  (0.02%)                51022
          Require FQDN for HELO hostname 8.97%  (4.02%)             10988455
         Require DNS for sender's domain 0.78%  (0.32%)               870643
                     Require Reverse DNS 23.83% (9.65%)             26372877
           Require DNS for HELO hostname 0.20%  (0.06%)               165157
                 The Spamhaus Block List 21.87% (6.74%)             18405091
          The Invaluement SIP Block List 22.14% (5.33%)             14557404
                   The SIP/24 Block List 3.84%  (0.72%)              1965510
     The Barracuda Reputation Block List 3.89%  (0.70%)              1915628
(several RBLs not widely used snipped)

We have several hundred domains and each can use its own filtering
options, so not all RBLs/checks are used on all mail.  Checks are
listed in the order applied, so a message dropped by unknown user, for
instance, is never seen by the greylisting check.

Invaluement lists block over 25% of all messages that make it past all
the checks in front of them, including Spamhaus.  That's massive.
Barracuda is not used by a majority of clients and is used after the
others, so the low number is not an indication of poor performance.
I've actually had pretty good luck with it.

-Aaron 
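
The two percentage columns in Aaron's table can be reproduced with a little arithmetic. The sketch below assumes that the first column is the share of messages that actually reached a check and that the parenthesised column is the share of all messages received; that reading is inferred from the numbers, not stated by Aaron. The function and variable names are made up for illustration, and the counts are copied from the first three rows of the table.

```python
# Sketch only: reproduces the two percentage columns for the first three
# checks, assuming every message passes through the checks in listed order.

def chain_stats(total, checks):
    """Return (name, pct_of_reaching, pct_of_total) for checks applied in order.

    pct_of_reaching: share of the messages that reached this check,
                     i.e. were not already dropped by an earlier one.
    pct_of_total:    share of all messages received.
    """
    remaining = total
    rows = []
    for name, blocked in checks:
        rows.append((
            name,
            round(100.0 * blocked / remaining, 2),
            round(100.0 * blocked / total, 2),
        ))
        remaining -= blocked  # later checks never see these messages
    return rows


TOTAL = 273235355  # "Total messages" from the table
CHECKS = [
    ("Unknown user", 87427696),
    ("Greylisted", 46225401),
    ("Throttled", 15399444),
]

for name, reached_pct, total_pct in chain_stats(TOTAL, CHECKS):
    print(f"{name:>15} {reached_pct:6.2f}% ({total_pct:.2f}%)")
```

For these three rows the computed values match the table (32.00/32.00, 24.88/16.92, 11.03/5.64). Further down the list the simple model is only an approximation, because, as Aaron notes, each domain can enable its own subset of checks.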

--
RANK    RULE NAME                       COUNT  %OFMAIL %OFSPAM  %OFHAM
--
 1     URIBL_INVALUEMENT               27029    47.58   85.13    0.60
 2     RCVD_IN_INVALUEMENT             26116    45.81   82.26    0.22
 3     HTML_MESSAGE                    25184    79.83   79.32   80.48
 4     BAYES_99                        23445    41.09   73.84    0.12
 5     RCVD_IN_INVALUEMENT24           23290    40.85   73.35    0.18
 6     URIBL_BLACK                     22372    39.49   70.46    0.74
 7     RCVD_IN_JMF_BL                  16845    30.70   53.06    2.74
 8     URIBL_JP_SURBL                  15962    27.99   50.27    0.12
 9     DKIM_SIGNED                     12137    37.32   38.23   36.18
10     DKIM_VERIFIED                   11051    33.93   34.81   32.84

Chris

-
Chris Owen         - Garden City (620) 275-1900 -  Lottery (noun):
President          - Wichita     (316) 858-3000 -    A stupidity tax
Hubris Communications Inc www.hubris.net 
- 
 
 Yep, Invaluement is a good list. But there's no public option to compare it.
 
What log script do you good people use to generate the list above? Is it
home-brewed, or one we can download so we can compare our own hits?


-- 
SplatNIX IT Services :: Innovation through collaboration



Re: Barracuda RBL in first place

2009-08-15 Thread Yet Another Ninja

On 8/15/2009 11:02 AM, --[ UxBoD ]-- wrote:

--
RANK    RULE NAME                       COUNT  %OFMAIL %OFSPAM  %OFHAM
--
 1     URIBL_INVALUEMENT               27029    47.58   85.13    0.60
 2     RCVD_IN_INVALUEMENT             26116    45.81   82.26    0.22
 3     HTML_MESSAGE                    25184    79.83   79.32   80.48
 4     BAYES_99                        23445    41.09   73.84    0.12
 5     RCVD_IN_INVALUEMENT24           23290    40.85   73.35    0.18
 6     URIBL_BLACK                     22372    39.49   70.46    0.74
 7     RCVD_IN_JMF_BL                  16845    30.70   53.06    2.74
 8     URIBL_JP_SURBL                  15962    27.99   50.27    0.12
 9     DKIM_SIGNED                     12137    37.32   38.23   36.18
10     DKIM_VERIFIED                   11051    33.93   34.81   32.84

Chris

-
Chris Owen         - Garden City (620) 275-1900 -  Lottery (noun):
President          - Wichita     (316) 858-3000 -    A stupidity tax
Hubris Communications Inc www.hubris.net
-

Yep, Invaluement is a good list. But there's no public option to compare it.


What log script do you good people use to generate the list above? Is it
home-brewed, or one we can download so we can compare our own hits?


http://www.rulesemporium.com/programs/sa-stats.txt


OT - my eyes hurt (Re: Barracuda RBL in first place)

2009-08-15 Thread Henrik K

A bit OT, but please don't post HTML (Marc!) or send incomprehensible,
fully quoted messages like this. It takes a good while to scroll through and
understand all this using mutt.



Re: giftcardsurveys.us.com

2009-08-15 Thread Ted Mittelstaedt

John Hardin wrote:

On Thu, 13 Aug 2009, Johnson, S wrote:

When I put in the email address of the user that was being sent these 
survey offers for gift cards I got a message stating please allow 10 
days for removal which makes me think they are not legit.


That's not necessarily the case. One legitimate reason for claiming a
delay like that is that, if a marketing promotion is already underway,
materials may already be in the pipeline.


Granted, that's more true of physical mail than email, but the 
procedures in place for electronic marketing may have the same latency. 
It doesn't automatically mean they're lying about unsubscribing you as 
quickly as they practically can.


However, I agree it's annoying.



But it's so easy to check if they are lying.  Just set up a fake
e-mail address that feeds right into your Bayes filter for spam,
then after keying in the user's e-mail address that you want to
unsubscribe, submit your feeder address for unsubscribing.

If they are a bona fide spammer, when they see the virgin
(to their database) feeder address punched into their unsubscribe
link, they will immediately start spamming it.

Ted



Re: (no report template found)

2009-08-15 Thread SpamScoreChecker.com


Matt Kettler-3 wrote:
 
 Loren Wilton wrote:
 There is a standard template that gives the form of the report in the
 mail message.  I don't recall which cf file this is normally in, but
 it sounds like that file is not being included in the cf files in your
 configuration.

 I would check include paths and possibly permissions and the like, as
 well as any special configuration files or options that may be
 included by the process you are following.
 
 It normally lives in 10_misc.cf...
 
 However, it is also possible there's a clear_report_template command,
 with no new template declared after it.
 
 

I too had this problem and found the solution.

Running sa-compile created a folder like
/var/lib/spamassassin/3xxx
which is being picked up by the spamassassin config and causing the
'template not found' error.

Solution:
spamassassin --lint -D
sa-compile
rm -rf /var/lib/spamassassin/3xx
/etc/rc.d/init.d/spamassassin restart

I hope this helps you guys, it took me a day to figure this one out.




Subject starts Re: but no References/In-Reply-To

2009-08-15 Thread Mike Cardwell
How would I create a rule to match when a subject line begins /^Re: /i 
but the message contains no References or In-Reply-To headers?


--
Mike Cardwell - IT Consultant and LAMP developer
Cardwell IT Ltd. (UK Reg'd Company #06920226) http://cardwellit.com/


Re: [Solved] Bad performance of Bayes with MySQL cluster

2009-08-15 Thread Jorn Argelo

In case anybody else comes across the same problem: I've kicked out the MySQL
cluster and am now using MySQL with multi-master replication. There we can
use InnoDB, and this definitely solved all of the problems I had with
Bayes. Scan times are now below 1 second. I don't have much load as of
yet, so I expect this to increase somewhat during business hours, but
all in all things look a lot more promising. I've used this howto:
http://capttofu.livejournal.com/1752.html


Thanks for the pointers, Henrik.

Regards,
Jorn







Re: Subject starts Re: but no References/In-Reply-To

2009-08-15 Thread Evan Platt

At 02:56 AM 8/15/2009, you wrote:
How would I create a rule to match when a subject line begins /^Re: 
/i but the message contains no References or In-Reply-To headers?


Just FYI, I'm on a number of lists where different people insist on 
starting their subject with RE: ... YMMV. :) 



SV: Subject starts SV: but no References/In-Reply-To

2009-08-15 Thread Benny Pedersen
On Sat, 15 Aug 2009 07:12:18 -0700, Evan Platt e...@espphotography.com
wrote:
 At 02:56 AM 8/15/2009, you wrote:
How would I create a rule to match when a subject line begins /^Re: 
/i but the message contains no References or In-Reply-To headers?
 
 Just FYI, I'm on a number of lists where different people insist on 
 starting their subject with RE: ... YMMV. :)

so let's change the stats :)

-- 
Benny Pedersen


Re: Subject starts Re: but no References/In-Reply-To

2009-08-15 Thread Henrik K
On Sat, Aug 15, 2009 at 07:12:18AM -0700, Evan Platt wrote:
 At 02:56 AM 8/15/2009, you wrote:
 How would I create a rule to match when a subject line begins /^Re: /i 
 but the message contains no References or In-Reply-To headers?

 Just FYI, I'm on a number of lists where different people insist on  
 starting their subject with RE: ... YMMV. :) 

Something here..

http://ruleqa.spamassassin.org/?rule=%2FFAKE_REPLY
http://svn.apache.org/repos/asf/spamassassin/trunk/rulesrc/sandbox/fredt/99_zFVGT_FakeReply.cf
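
For readers who want to see the condition spelled out, here is a minimal sketch of the check Mike describes, written as standalone Python with the standard `email` module. It is not the FAKE_REPLY rule linked above and not SpamAssassin code; inside SpamAssassin the same logic would normally be a `meta` rule combining a `header` test on Subject with `exists:` tests for the two threading headers. The function name and the sample messages are invented for illustration.

```python
import re
from email import message_from_string

# Same pattern as in the question: subject begins with "Re: ", any case.
RE_PREFIX = re.compile(r"^Re: ", re.IGNORECASE)


def looks_like_fake_reply(raw_message: str) -> bool:
    """True if the Subject starts with 'Re: ' but no threading header exists."""
    msg = message_from_string(raw_message)
    subject = msg.get("Subject", "")
    has_thread_headers = (
        msg.get("References") is not None
        or msg.get("In-Reply-To") is not None
    )
    return bool(RE_PREFIX.match(subject)) and not has_thread_headers


# Hypothetical sample messages (headers, blank line, body).
FAKE = "From: a@example.com\nSubject: Re: your order\n\nbody\n"
REAL = ("From: b@example.com\nSubject: RE: meeting\n"
        "In-Reply-To: <1@example.com>\n\nbody\n")
NEW = "From: c@example.com\nSubject: Hello\n\nbody\n"

for label, raw in (("fake", FAKE), ("real", REAL), ("new", NEW)):
    print(label, looks_like_fake_reply(raw))
```

As Evan points out, some legitimate senders type "RE:" into fresh messages, so a match is a hint rather than proof and would deserve only a modest score.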



Re: Barracuda RBL in first place

2009-08-15 Thread MySQL Student
Hi,

                            Unknown user 32.00% (32.00%)            87427696
                              Greylisted 24.88% (16.92%)            46225401
                               Throttled 11.03% (5.64%)             15399444
                     Relay access denied 0.01%  (0.00%)                 7034
                   Bogus DNS (Broadcast) 0.01%  (0.00%)                11692
              Bogus DNS (RFC 1918 space) 0.07%  (0.03%)                82135
                         Spoofed Address 0.26%  (0.12%)               319551
                      Unclassified Event 0.77%  (0.35%)               949388
                 Temporary Local Problem 0.01%  (0.00%)                 8165
             Require FQDN sender address 0.04%  (0.02%)                51022
          Require FQDN for HELO hostname 8.97%  (4.02%)             10988455

[...]

Can I ask how you produced those stats? They look very helpful.

Thanks,
Alex


Re: Barracuda RBL in first place

2009-08-15 Thread MySQL Student
Hi,

 What log script do you good people use to generate the list above ? Is it
 a home brew or one we can download so we can compare our own hits ?

 http://www.rulesemporium.com/programs/sa-stats.txt

Any chance someone knows where there is a compatible one that parses
amavisd instead of spamd? I've tried, but guess I don't know enough
perl to get it right.

Any chance someone has a bit of time to hack on it on this lazy
Saturday afternoon? :-)

Thanks,
Alex


Counting RAZOR2 hits

2009-08-15 Thread MySQL Student
Hi,

I thought grep -c RAZOR2_CHECK through my mail logs would give me a
good approximation of the number of times RAZOR2 was consulted, but
that doesn't seem to be the case. There are some mails that don't have
it listed in the tests= section.

I've also tried the razor-* commands, and they don't appear to be able
to help here either. What am I missing?

Does RAZOR2_CHECK mean that it was found in the RAZOR2 db, or that it
merely consulted the db?

Thanks,
Alex


Re: Barracuda RBL in first place

2009-08-15 Thread Benny Pedersen
On Sat, 15 Aug 2009 13:28:01 -0400, MySQL Student mysqlstud...@gmail.com
wrote:

 Any chance someone has a bit of time to hack on it on this lazy
 Saturday afternoon? :-)

http://www.mikecappella.com/logwatch/

-- 
Benny Pedersen


RE: DKIM-Reputation list

2009-08-15 Thread R-Elists

Is this DKIM-Reputation list set up for any *general* current SpamAssassin
deployment, or does it only work with certain MTA setups?

I am asking because, as far as I could see, Amavis was mentioned and
nothing else.

TIA

 - rh



Re: Counting RAZOR2 hits

2009-08-15 Thread Matt Kettler
MySQL Student wrote:
 Hi,

 I thought grep -c RAZOR2_CHECK through my mail logs would give me a
 good approximation of the number of times RAZOR2 was consulted, but
 that doesn't seem to be the case. There are some mails that don't have
 it listed in the tests= section.

 I've also tried the razor-* commands, and they don't appear to be able
 to help here either. What am I missing?

 Does RAZOR2_CHECK mean that it was found in the RAZOR2 db, or that it
 merely consulted the db?
   
That means it was found and was above your min_cf, i.e. Razor believes
it is spam.
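
Given that answer, counting RAZOR2_CHECK in the logs gives the number of messages Razor flagged, not the number of lookups; a message that was checked but not listed leaves no RAZOR2_CHECK entry. The sketch below counts flagged messages and, for comparison, all log lines that carry a tests= list. The two line shapes it accepts (a bracketed `tests=[NAME=score,...]` list and a bare `tests=NAME,NAME` list) and the sample lines are simplified assumptions, so the pattern will probably need adjusting to your real logs.

```python
import re

# Accepts either tests=[A=1.0,B=2.0] or tests=A,B (assumed, simplified shapes).
TESTS_RE = re.compile(r"tests=(?:\[([^\]]*)\]|(\S*))")


def rule_names(line):
    """Return the set of rule names in a log line, or None if it has no tests=."""
    m = TESTS_RE.search(line)
    if not m:
        return None
    raw = m.group(1) if m.group(1) is not None else m.group(2)
    names = set()
    for item in raw.split(","):
        name = item.strip().split("=", 1)[0]  # drop an optional "=score" part
        if name:
            names.add(name)
    return names


def count_rule(lines, rule="RAZOR2_CHECK"):
    """Return (messages that hit the rule, messages with a tests= list)."""
    scanned = hits = 0
    for line in lines:
        names = rule_names(line)
        if names is None:
            continue
        scanned += 1
        if rule in names:
            hits += 1
    return hits, scanned


# Invented sample lines, for illustration only.
SAMPLE = [
    "amavis[1]: (1-01) SPAM, tests=[BAYES_99=3.5,RAZOR2_CHECK=0.5]",
    "amavis[1]: (1-02) Passed CLEAN, tests=[HTML_MESSAGE=0.001]",
    "amavis[1]: (1-03) starting",
    "X-Spam-Status: Yes, score=9.1 tests=RAZOR2_CF_RANGE_51_100,RAZOR2_CHECK autolearn=no",
]

hits, scanned = count_rule(SAMPLE)
print(f"{hits} of {scanned} scanned messages hit RAZOR2_CHECK")
```

The ratio of hits to scanned messages is usually more informative than the raw grep count, since it shows how often Razor fires relative to everything that was scanned.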




Re: Barracuda RBL in first place

2009-08-15 Thread Roger Marquis

well, you have half of it, as any hit shown here by invaluement was
missed by spamhaus.  I can't give you the data for other cases because
it's a short circuit - 550 type of thing.


That's not an ideal metric.  You really need to test every incoming message
against each RBL (up to 4 or so, to avoid DNS timeouts).  Postfix supports
this with warn_if_reject before doing the actual 5XX reject.  It's the
warnings that yield valid data, or at least they do with large and
representative samples (which IME = 100K msgs/day).

Roger Marquis
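
The approach Roger describes, logging a warning for every list before rejecting, makes it possible to compare lists on the same traffic. The sketch below tallies, per list, how many clients it listed in total and how many it alone listed. It assumes Postfix warning lines of the general form "reject_warning: RCPT from host[ip]: ... blocked using LIST;" and groups by client IP as a rough stand-in for "per message". The list names, addresses and sample lines are invented, and the pattern may need adapting to your log format.

```python
import re
from collections import defaultdict

# Assumed shape of a warn_if_reject log line; adjust to your own logs.
WARN_RE = re.compile(
    r"reject_warning: RCPT from \S+\[([0-9a-fA-F.:]+)\]: "
    r".*blocked using (\S+?);"
)


def rbl_overlap(lines):
    """Return (totals, unique): clients listed per RBL, and by that RBL alone."""
    lists_by_client = defaultdict(set)
    for line in lines:
        m = WARN_RE.search(line)
        if m:
            client, rbl = m.groups()
            lists_by_client[client].add(rbl)

    totals = defaultdict(int)
    unique = defaultdict(int)
    for rbls in lists_by_client.values():
        for rbl in rbls:
            totals[rbl] += 1
        if len(rbls) == 1:  # no other list caught this client
            unique[next(iter(rbls))] += 1
    return dict(totals), dict(unique)


def warn_line(ip, rbl):
    """Build an invented sample line in the assumed format."""
    return (f"postfix/smtpd[1]: NOQUEUE: reject_warning: RCPT from "
            f"unknown[{ip}]: 554 5.7.1 Service unavailable; "
            f"Client host [{ip}] blocked using {rbl}; from=<x@example.com>")


SAMPLE = [
    warn_line("192.0.2.1", "rbl-a.example"),
    warn_line("192.0.2.1", "rbl-b.example"),
    warn_line("192.0.2.2", "rbl-b.example"),
    warn_line("192.0.2.3", "rbl-a.example"),
    warn_line("192.0.2.4", "rbl-b.example"),
]

totals, unique = rbl_overlap(SAMPLE)
print("total: ", totals)
print("unique:", unique)
```

Unlike the short-circuit numbers discussed earlier in the thread, the totals here do not depend on the order in which the lists are consulted, which is the point of testing every message against every list.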