Re: Scoring for DATE_IN_FUTURE_96_XX

2009-12-01 Thread Matt Kettler
Thomas Harold wrote:
 On 11/30/2009 9:27 PM, Thomas Harold wrote:
 While looking at the scores in 50_scores.cf, I noticed the following:

 score DATE_IN_FUTURE_03_06 2.303 0.416 1.461 0.274
 score DATE_IN_FUTURE_06_12 3.099 3.099 2.136 1.897
 score DATE_IN_FUTURE_12_24 3.300 3.299 3.000 2.189
 score DATE_IN_FUTURE_24_48 3.599 2.800 3.599 3.196
 score DATE_IN_FUTURE_48_96 3.199 3.182 3.199 3.199
 score DATE_IN_FUTURE_96_XX 3.899 3.899 2.598 1.439

 Why does the 96+ hour rule score so much lower than the 48-96 hour test
 for the last two entries?

 (I'm also wondering if there should be an even higher score rule for
 stuff over 168 hours in the future or past.)

 I did dig up the following thread from back in Oct '06...

 http://mail-archives.apache.org/mod_mbox/spamassassin-users/200611.mbox/browser


 I'm guessing that what it boils down to is contained in the wiki page?
 The spam is better off caught by another rule once network tests are
 allowed?
Yep. Since SA's rules are scored together as a set, score stealing between
rules is pretty common when there's a lot of overlap between two rules and
one performs slightly better than the other. It's also possible for there to
be more complicated cascades where one rule affects another, which in
turn affects a third, which affects a fourth...

Also, looking at the above scores, there are likely no spam network tests
that cover the same mail as 48_96, because its score is pretty much the
same in all four score sets.

On average, the scores of all non-network spam rules should go down a
little bit when the network tests are enabled, because there are more rules
in the set competing for score. However, since the distribution of hits
across rules is distinctly not random, you'll see a lot of non-average
cases, which means some rules will be:
- staying the same because they cover mail the network tests don't
- going down radically due to heavy overlap
- going up because they correct false negatives in some of the
  non-spam network tests.
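
To make the comparison concrete, here is a rough Python sketch that parses the score lines quoted in this thread and prints how each rule's score changes when network tests are switched on. It assumes the four-column layout described in the SpamAssassin configuration documentation (set 0: no Bayes, no network; set 1: no Bayes, network; set 2: Bayes, no network; set 3: Bayes and network). The helper names parse_scores and network_delta are made up for illustration and are not part of SpamAssassin.

```python
# Score lines copied from the 50_scores.cf excerpt quoted in this thread.
SCORES_CF = """\
score DATE_IN_FUTURE_03_06 2.303 0.416 1.461 0.274
score DATE_IN_FUTURE_06_12 3.099 3.099 2.136 1.897
score DATE_IN_FUTURE_12_24 3.300 3.299 3.000 2.189
score DATE_IN_FUTURE_24_48 3.599 2.800 3.599 3.196
score DATE_IN_FUTURE_48_96 3.199 3.182 3.199 3.199
score DATE_IN_FUTURE_96_XX 3.899 3.899 2.598 1.439
"""


def parse_scores(text):
    """Return {rule_name: (set0, set1, set2, set3)} for four-value score lines.

    Hypothetical helper: it only handles the simple 'score NAME a b c d' form.
    """
    rules = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 6 and parts[0] == "score":
            rules[parts[1]] = tuple(float(p) for p in parts[2:])
    return rules


def network_delta(scores):
    """Score change when network tests are enabled: (Bayes off, Bayes on)."""
    set0, set1, set2, set3 = scores
    return round(set1 - set0, 3), round(set3 - set2, 3)


rules = parse_scores(SCORES_CF)
for name, scores in rules.items():
    bayes_off, bayes_on = network_delta(scores)
    print(f"{name}: {bayes_off:+.3f} (Bayes off), {bayes_on:+.3f} (Bayes on)")
```

With these numbers, 48_96 barely moves in either case, while 96_XX is unchanged without Bayes and drops by more than a point with Bayes once network tests are on.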

 http://wiki.apache.org/spamassassin/HowScoresAreAssigned




Scoring for DATE_IN_FUTURE_96_XX

2009-11-30 Thread Thomas Harold

While looking at the scores in 50_scores.cf, I noticed the following:

score DATE_IN_FUTURE_03_06 2.303 0.416 1.461 0.274
score DATE_IN_FUTURE_06_12 3.099 3.099 2.136 1.897
score DATE_IN_FUTURE_12_24 3.300 3.299 3.000 2.189
score DATE_IN_FUTURE_24_48 3.599 2.800 3.599 3.196
score DATE_IN_FUTURE_48_96 3.199 3.182 3.199 3.199
score DATE_IN_FUTURE_96_XX 3.899 3.899 2.598 1.439

Why does the 96+ hour rule score so much lower than the 48-96 hour test
for the last two entries?


(I'm also wondering if there should be an even higher score rule for 
stuff over 168 hours in the future or past.)


Re: Scoring for DATE_IN_FUTURE_96_XX

2009-11-30 Thread Thomas Harold

On 11/30/2009 9:27 PM, Thomas Harold wrote:

While looking at the scores in 50_scores.cf, I noticed the following:

score DATE_IN_FUTURE_03_06 2.303 0.416 1.461 0.274
score DATE_IN_FUTURE_06_12 3.099 3.099 2.136 1.897
score DATE_IN_FUTURE_12_24 3.300 3.299 3.000 2.189
score DATE_IN_FUTURE_24_48 3.599 2.800 3.599 3.196
score DATE_IN_FUTURE_48_96 3.199 3.182 3.199 3.199
score DATE_IN_FUTURE_96_XX 3.899 3.899 2.598 1.439

Why does the 96+ hour rule score so much lower than the 48-96 hour test
for the last two entries?

(I'm also wondering if there should be an even higher score rule for
stuff over 168 hours in the future or past.)


I did dig up the following thread from back in Oct '06...

http://mail-archives.apache.org/mod_mbox/spamassassin-users/200611.mbox/browser

I'm guessing that what it boils down to is contained in the wiki page? 
The spam is better off caught by another rule once network tests are 
allowed?


http://wiki.apache.org/spamassassin/HowScoresAreAssigned