John D. Hardin wrote:
I've never trusted automatic learning. Why let your Bayes database be (even partially) under the control of a third party, particularly when that third party is the attacker?

Because there's no other (practical and/or ethical) way of getting enough ham to make it useful?

Anyone using SA in an ISP environment will run into this problem; about the only way I can see to legitimately get any real volume of ham is to send customers' outbound mail into a learning queue somewhere. Even that has its limits and issues: for instance, any ISP larger than a few thousand customers will likely have completely separate paths for inbound and outbound mail, which *will* affect the usefulness of the learning. :/

I've been running the same Bayes databases on one system and for my personal email since I upgraded from SA 2.44 to 2.54 and started using Bayes; I'd still be running the original Bayes DB on another system too, if I had figured out at the time that I *could* just keep using the exact same files when upgrading 2.64->3.1.7.

Accuracy on the continuous-use databases hasn't suffered from the autolearning, so far as I can tell... but the more out-of-date SA itself got, the worse it was at tagging spam.

I *do* regularly feed back both my own missed spam (my account, and three role accounts) and customer-submitted missed spam. Lately there have only been four or five (reported) FNs per day across the whole system.
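
For anyone wanting to script that kind of feedback, here is a minimal sketch in Python. It assumes the missed-spam reports get collected into a single directory (the `missed-spam` path and the helper names are hypothetical) and that `sa-learn` is on the PATH for whichever user owns the Bayes DB. It is only an illustration, not the setup described above.

```python
import subprocess
from pathlib import Path


def sa_learn_command(kind: str, maildir: Path) -> list[str]:
    """Build the sa-learn command line for a directory of messages.

    kind is "spam" or "ham"; maildir is a directory holding one message
    per file (hypothetical location, adjust for the local setup).
    """
    if kind not in ("spam", "ham"):
        raise ValueError(f"kind must be 'spam' or 'ham', got {kind!r}")
    return ["sa-learn", f"--{kind}", str(maildir)]


def learn(kind: str, maildir: Path, dry_run: bool = True) -> int:
    """Run sa-learn, or just print the command when dry_run is set."""
    cmd = sa_learn_command(kind, maildir)
    if dry_run:
        print(" ".join(cmd))
        return 0
    # Assumes sa-learn is installed; returns its exit status unchanged.
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    # Dry run only: shows what would be executed for reported FNs.
    learn("spam", Path("missed-spam"))
```

With `dry_run=True` nothing touches the Bayes DB, so the command can be checked before it goes into cron.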

-kgd
