Re: Spamcheck and how it affects bayes question

2009-07-22 Thread Bowie Bailey


Matt Kettler wrote:
> Gary Smith wrote:
>> We have a process in place that uses the Perl CPAN module to invoke SA.
>> This is outside the scope of the normal mail system. Basically, we use it
>> to see what scores emails would generate, for statistical purposes. The
>> spam engine this calls is set to use -100 as the score so that everything
>> is considered spam. Our production spam engine is set to 7. We look at the
>> score that the Perl module returns and log it (rather than the isspam
>> flag). To complicate things a little more, we are using MySQL for the
>> Bayes store. This store is also used by our production boxes. This isn't
>> the problem, just what we are doing.
>>
>> The CPAN module has this as the description:
>> public instance (\%) process (String $msg, Boolean $is_check_p)
>> Description:
>> This method makes a call to the spamd server and depending on the value of
>> $is_check_p either calls PROCESS or CHECK.
>>
>> Given that the Perl call has a boolean option for PROCESS and CHECK, I
>> would assume that they make some difference, but it really doesn't say
>> what the difference is. Currently our code calls it with a false value,
>> which executes the PROCESS command.
>>
>> What I'm wondering is: will this throw off Bayes if we keep doing this,
>> since everything that SA returns is considered spam? I'm just worried that
>> these continued tests will cause Bayes to get wacky. Also, should we be
>> using PROCESS or CHECK when doing this type of check?
>>
>> Gary
>
> The bayes auto-learning system does not care what your required_score
> is set to, and does not care if messages are tagged as spam or not. It
> uses its own thresholds, and its own additional criteria for learning.
>
> So, feeding it lots of mail with the threshold set to -100 shouldn't
> matter at all.
  


If you're worried about it, set bayes_auto_learn 0 in whatever conf
file you use for your statistical setup.  That way, you can take
advantage of Bayes for scoring, but nothing you do on that system will
affect the db.
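
As a rough sketch of that suggestion, the config for the statistics-only
instance might contain something like the lines below. The file name is an
assumption; use whichever conf file that instance actually reads.

```
# local.cf for the statistics-only instance (file name is hypothetical)

# Keep using the shared Bayes database when scoring messages.
use_bayes         1

# Do not train the Bayes database from messages scanned by this instance.
bayes_auto_learn  0
```

The production boxes would keep their own setting, so auto-learning there
carries on as before.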


--
Bowie

Spamcheck and how it affects bayes question

2009-07-21 Thread Gary Smith

We have a process in place that uses the Perl CPAN module to invoke SA. This
is outside the scope of the normal mail system. Basically, we use it to see
what scores emails would generate, for statistical purposes. The spam engine
this calls is set to use -100 as the score so that everything is considered
spam. Our production spam engine is set to 7. We look at the score that the
Perl module returns and log it (rather than the isspam flag). To complicate
things a little more, we are using MySQL for the Bayes store. This store is
also used by our production boxes. This isn't the problem, just what we are
doing.

The CPAN module has this as the description:
public instance (\%) process (String $msg, Boolean $is_check_p)
Description:
This method makes a call to the spamd server and depending on the value of
$is_check_p either calls PROCESS or CHECK.

Given that the Perl call has a boolean option for PROCESS and CHECK, I would
assume that they make some difference, but it really doesn't say what the
difference is. Currently our code calls it with a false value, which
executes the PROCESS command.

What I'm wondering is: will this throw off Bayes if we keep doing this, since
everything that SA returns is considered spam? I'm just worried that these
continued tests will cause Bayes to get wacky. Also, should we be using
PROCESS or CHECK when doing this type of check?

Gary

Re: Spamcheck and how it affects bayes question

2009-07-21 Thread Matt Kettler

Gary Smith wrote:
> We have a process in place that uses the Perl CPAN module to invoke SA.
> This is outside the scope of the normal mail system. Basically, we use it
> to see what scores emails would generate, for statistical purposes. The
> spam engine this calls is set to use -100 as the score so that everything
> is considered spam. Our production spam engine is set to 7. We look at the
> score that the Perl module returns and log it (rather than the isspam
> flag). To complicate things a little more, we are using MySQL for the
> Bayes store. This store is also used by our production boxes. This isn't
> the problem, just what we are doing.
>
> The CPAN module has this as the description:
> public instance (\%) process (String $msg, Boolean $is_check_p)
> Description:
> This method makes a call to the spamd server and depending on the value of
> $is_check_p either calls PROCESS or CHECK.
>
> Given that the Perl call has a boolean option for PROCESS and CHECK, I
> would assume that they make some difference, but it really doesn't say
> what the difference is. Currently our code calls it with a false value,
> which executes the PROCESS command.
>
> What I'm wondering is: will this throw off Bayes if we keep doing this,
> since everything that SA returns is considered spam? I'm just worried that
> these continued tests will cause Bayes to get wacky. Also, should we be
> using PROCESS or CHECK when doing this type of check?
>
> Gary

The bayes auto-learning system does not care what your required_score
is set to, and does not care if messages are tagged as spam or not. It
uses its own thresholds, and its own additional criteria for learning.

So, feeding it lots of mail with the threshold set to -100 shouldn't
matter at all.
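
To illustrate the separation described above, here is a minimal config
sketch. The option names are standard SpamAssassin settings; the two
auto-learn threshold values are the stock defaults as far as I recall, so
check the Mail::SpamAssassin::Conf documentation for your version before
relying on them.

```
# Tagging threshold: decides whether a message is reported as spam.
# -100 is the value from the statistical setup in this thread.
required_score                      -100

# Auto-learning is governed by its own settings, independent of
# required_score. Values shown are believed to be the defaults.
bayes_auto_learn                    1
bayes_auto_learn_threshold_nonspam  0.1
bayes_auto_learn_threshold_spam     12.0
```

With settings like these, a message that scores 5 is tagged as spam because
5 is above -100, but 5 is well below the spam learning threshold, so that
tag alone does not cause it to be learned as spam.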

RE: Spamcheck and how it affects bayes question

2009-07-21 Thread Gary Smith

> The bayes auto-learning system does not care what your required_score
> is set to, and does not care if messages are tagged as spam or not. It
> uses its own thresholds, and its own additional criteria for learning.
>
> So, feeding it lots of mail with the threshold set to -100 shouldn't
> matter at all.

I can live with that answer.  That's what I was looking for.

Thanks, 

Gary
