From: "Chris Santerre" <[EMAIL PROTECTED]>
-----Original Message-----
From: jo3 [mailto:[EMAIL PROTECTED]]
Hi,
This is an observation; please take it in the spirit in which it is intended. It is not meant to be flame bait.
After using SpamAssassin for six solid months, it seems to me that the Bayes process (sa-learn [--spam | --ham]) has only very limited success in learning about new spam. Regardless of how many spams and hams are submitted, the effectiveness never goes above the default level, which, in our case here, is somewhere around 2 out of 3 spams correctly identified. By the same token, after adding the "third party" rule, airmax.cf, the effectiveness went up to 99 out of 100 spams correctly identified.
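(For reference, the setup being described boils down to the usual sa-learn training plus dropping a third-party rule file into the site config directory. A rough sketch; the corpus paths and config directory are only examples, not the actual layout here:

    sa-learn --spam --mbox ~/corpus/spam     # train on hand-sorted spam
    sa-learn --ham  --mbox ~/corpus/ham      # train on hand-sorted ham
    sa-learn --dump magic                    # show how much Bayes has learned so far
    cp airmax.cf /etc/mail/spamassassin/     # third-party rules sit alongside local.cf

)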
I have long said that, IMHO, Bayes is not worth it. Left unattended, it isn't as good. A simple rule can take out a lot of spam. Some may say rule writing is more complicated than training Bayes. Maybe. Not so much the rule writing, but figuring out what to look for and testing for FPs.
I do not run Bayes for our company. Obviously I'm partial to URIBL.com and SARE rules ;) I get about 98% of spam caught, and few FPs.
This is going to sound like tooting our own horn, but so be it. Before SARE,
Bayes was cool. After SARE, I see no need.
Autolearning Bayes is not really very good, based on what people here seem to say. I do note that I raised my BAYES_99 score to 5. If BAYES_99 hits, the odds that the message is spam are so high that it's silly to give BAYES_99 a low score, theoretical nonsense notwithstanding.
If you apply the wrong statistical theory with the wrong conceptual criteria, the math or theory may be good but the results are trash. For an existing spam database, the rule setup that exists is probably quite good. If BAYES_99 hits, then other rules probably hit as well. This leads to artificially lowering the BAYES_99 score. Then, when a new technique comes along that Bayes can recognize but nothing else does, the message floats on through. At least on this system, BAYES_99 misses once in 2,000 to 10,000 messages. Most of those times, other very light whitelisting rules let the messages come through. Probably the right score for more general use would be 4.95 or something, such that if any other rule hits, the message is dinged as spam. It depends on your spam tolerance compared to your tolerance for sorting spam by score and looking at the few that are marginal.
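(In local.cf terms, that is just a one-line score override; the 4.95 variant assumes the stock required_score of 5.0, so any additional rule hit tips the message into spam:

    # raise BAYES_99 so it can tag a message on its own
    score BAYES_99 5.0
    # or keep it just under a required_score of 5.0, so one more rule
    # hit is needed before the message is marked as spam
    #score BAYES_99 4.95

)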
Anyway, making that ONE change made the already good results I was getting with SARE and BAYES combined quite a bit better. Missed spam went down by almost a factor of 10, and tagged ham went up by about 1 in 10,000 or less. (I can't remember the last time I got a ham marked as spam on
the sole basis of BAYES_99 with a score of 5 that I had to fetch out of
the spam folder.) I take this as a proof of concept that penalizing a
rule for being too good is ridiculous on its face, statistical theories
notwithstanding. I maintain this is a positive indication that either
the criteria, the chosen statistical approach, or both are wrong.
It might be entertaining to set up "stock" BAYES on your system, Chris, with all the BAYES scores being very, very low, 0.01 or something. Then run the SARE version of sa_stats.pl to see what the "goodness" of each BAYES level really is. From that you can guesstimate some scores that might improve your system. I'd be really interested to see how the autolearned BAYES really performs when it's used in your sort of environment. I know for my environment it's silly to use it, due to the automated mis-learning on marginal messages. (Either it learns wrong or not at all on the most critical portion of the email load, the marginal messages.)
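(Something like this in local.cf would do it; these are the stock Bayes rule names, and 0.01 keeps each hit visible in the message report without letting it affect the verdict:

    score BAYES_00 0.01
    score BAYES_05 0.01
    score BAYES_20 0.01
    score BAYES_40 0.01
    score BAYES_50 0.01
    score BAYES_60 0.01
    score BAYES_80 0.01
    score BAYES_95 0.01
    score BAYES_99 0.01

)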
{^_^} Joanne steps down off her soapbox yet again.