On Tuesday, 15 February 2005 22:08, Chris Santerre wrote:
> >I have autolearn disabled in my SpamAssassin config.
> >
> >I get certain e-mail accounts that are old and JUST GET SPAM
> >(no question about it). I set up a script that takes e-mails from
> >these accounts and feeds them in to sa-learn as SPAM.
> >
> >I have no HAMs right now, however I have plans to add at least a
> >couple hundred to bayes (that is the bare minimum, I believe).
> >
> >My question is: Is there anything wrong with doing this? I've seen some
> >posts about ratios. I figured the more SPAM you feed it, the smarter it
> >will get. Keep in mind I am not trying to use bayes scoring right now,
> >but I thought this setup was better instead of using auto-learn to try
> >to guess which were spam (they are ALL spam!)
>
> When taking a survey on abstinence, is it good to only go and ask college
> kids? :)
>
> A proper Bayes Diet consists of 50% ham and 50% spam. This would be the
> optimum. Drastic differences can skew the results. Remember Bayes doesn't
> just look for spam, it also looks for ham just as much.

The 1:1 ratio is a mistake based on a misinterpretation of Bayes' theorem.
I have a ham:spam ratio of 1:40.

> And YES, Ninja Chris has just answered a Bayes question. I know, I know,
> don't panic! ;)
>
> --Chris (I don't usually answer Bayes questions because I don't think Bayes
> is a good solution.)

I think Bayes is a very good addition to individual rules, and when it is
trained properly it works fine.

Thomas
-- 
icq:133073900
http://www.t-arend.de
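
To illustrate the point about ratios, here is a minimal sketch of a per-token estimate that is normalized by the number of trained messages in each class. It is not SpamAssassin's actual Bayes code; the function name `token_spam_probability` and all counts are made up for illustration. Under that assumption, a 1:40 ham:spam corpus gives the same estimate as a 1:1 corpus when the token shows up at the same rate in each class. It also shows why some ham is still required: with zero ham messages there is nothing to compare against.

```python
def token_spam_probability(spam_hits, ham_hits, n_spam, n_ham):
    """Estimate P(spam | token) from per-class token frequencies.

    spam_hits / ham_hits: trained messages of each class containing the token.
    n_spam / n_ham: total trained messages of each class.
    Hypothetical helper, not part of SpamAssassin.
    """
    if n_spam <= 0 or n_ham <= 0:
        raise ValueError("need at least one trained message of each class")
    spam_rate = spam_hits / n_spam  # fraction of spam containing the token
    ham_rate = ham_hits / n_ham     # fraction of ham containing the token
    if spam_rate + ham_rate == 0:
        return 0.5                  # token never seen: no evidence either way
    return spam_rate / (spam_rate + ham_rate)


# Same per-class token rates (30% of spam, 1% of ham), different corpus ratios.
balanced = token_spam_probability(300, 10, n_spam=1000, n_ham=1000)    # 1:1
skewed = token_spam_probability(12000, 10, n_spam=40000, n_ham=1000)   # 1:40

print(f"1:1  corpus: {balanced:.4f}")
print(f"1:40 corpus: {skewed:.4f}")
```

Both calls print the same value because each class is divided by its own message count before the two rates are compared. An estimate built from raw, unnormalized counts would behave differently, which may be where the 1:1 advice comes from.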