Paul Boven wrote:
Hi everyone,

Here are some observations on using Bayes and autolearning I would like to share, and have your input on.

Autolearning is turning out to be more trouble than it's worth. Although it helps the system get to know the ham we send and receive, and learn some of the spam on its own, it also tends to 'reward' the 'best' spammers out there. Spams that hit none of the rules (e.g. the current deluge of stock spams) get autolearned as ham, which drives the score for all kinds of misspelled words towards the 'hammy' side of the curve. That makes it possible for more of that kind of junk to slip through, even if it hits SURBLs or other rules.



<SNIP>


Bayes is a very powerful system, especially for recognising site-specific ham. But at this moment, approx. 30% of the spam that slips through my filter has 'autolearn=ham' set, and another 60% of the spam slipping through has a negative Bayes score to help it along. For the moment, I've disabled autolearning in my Bayes system.


Regards, Paul Boven.



If your system is autolearning 30% of the spam as ham, it is seriously screwed up. It only autolearns when it's pretty damn sure of its classification of the message in question. A bad Bayes database will only continue to get worse if left alone. The trick is starting out good with the learning, and it's cake from there.

On some systems it's even less of an issue. I've maybe manually sa-learn'ed 20-30 messages ever in a little over a year using SA. Everything else has been autolearned. It's rare that I see Bayes scores other than _00 and _99. I'd say my Bayes db is pretty damn accurate at this point, and it's done most of it on its own.

Now keep in mind that I've altered the scores of some rules (Bayes mostly) and I've also adjusted the autolearn thresholds for my system. I've upped the spam and lowered the ham numbers so nothing will be autolearned unless SA is REALLY sure it knows what it's doing. I'd tend to think it's easier to tweak the system a bit than to change the way Bayes/autolearning works... but hey, that's just me.
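
For anyone wanting to try the same kind of tweak, here is a rough sketch of what it could look like in local.cf. The option names are standard SpamAssassin settings, but the numbers are illustrative assumptions only, not the values actually in use on any system mentioned in this thread. Pick your own after looking at what your mail scores.

```
# Keep autolearning on, but make it stricter.
bayes_auto_learn 1

# Only autolearn as ham when the score is well below zero
# (stock default is 0.1; -1.0 is an illustrative value).
bayes_auto_learn_threshold_nonspam -1.0

# Only autolearn as spam when the score is very high
# (stock default is 12.0; 15.0 is an illustrative value).
bayes_auto_learn_threshold_spam 15.0

# Optionally give the extreme Bayes rules more weight
# (illustrative scores, tune to your own required_score).
score BAYES_99 4.5
score BAYES_00 -3.0
```

With a ham threshold below zero, a spam that hits no rules at all scores around 0 and is no longer autolearned as ham. Run spamassassin --lint afterwards to check that the file still parses.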


-Jim
