John D. Hardin wrote:
> I've never trusted automatic learning. Why let your Bayes database be
> (even partially) under the control of a third party, particularly
> when that third party is the attacker?
Because there's no other (practical and/or ethical) way of getting
enough ham to make it useful?
Anyone using SA in an ISP environment will run into this problem; about
the only way I can see to legitimately get any real volume of ham is to
send customers' outbound mail into a learning queue somewhere. Even
that has its limits and issues - for instance, any ISP larger than a
few thousand customers will likely have completely separate paths for
inbound and outbound mail, which *will* affect the usefulness of the
learning. :/
I've been running the same Bayes databases on one system and my personal
email since I upgraded from SA 2.44 to 2.54 and started using Bayes; I'd
be running the original Bayes DB on another system if I had figured out
at the time that I *could* just keep using the exact same files when
upgrading 2.64->3.1.7.
Accuracy on the continuous-use databases hasn't suffered from the
autolearning, so far as I can tell... but the more out-of-date SA
itself got, the worse it was at tagging spam.
I *do* regularly feed back both my own missed-spams (my account, and
three role accounts), as well as customer-submitted missed-spam. Lately
there have only been four or five (reported) FNs per day, across the
whole system.
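
A minimal sketch of that kind of feedback step, assuming the reported
misses are collected into mbox files. The function name and the path are
hypothetical and only for illustration; `--spam`, `--ham` and `--mbox`
are standard sa-learn options. It only builds and prints the command, it
does not run it.

```python
import shlex


def build_sa_learn_cmd(kind, mbox_path):
    """Return the argv list that trains Bayes from one mbox file.

    kind is "spam" for missed spam (false negatives) or "ham" for
    known-good mail. The helper itself is hypothetical; only the
    sa-learn options it emits are real.
    """
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    return ["sa-learn", "--" + kind, "--mbox", mbox_path]


# Hypothetical location for user-reported missed spam.
cmd = build_sa_learn_cmd("spam", "/var/spool/missed-spam/reports.mbox")

# Print the shell-quoted command; to actually train, pass the list to
# subprocess.run(cmd, check=True) as the user that owns the Bayes DB.
print(shlex.join(cmd))
```

The same helper with "ham" would cover a ham learning queue, with the
caveats about where that ham comes from still applying.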
-kgd
- Re: Is Bayes Dead? Have the spammers won? Kris Deugau