I only train SA on its mistakes:
1) Spam which got through (false negatives).
2) Ham which was mistakenly marked as spam (false positives).
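For what it's worth, here is a rough sketch of that train-on-error idea in Python. It is a toy naive Bayes filter, not SA's actual algorithm, and the names ToyBayes, spam_log_odds and train_on_error are made up for illustration. The point is only that a message gets fed to the trainer when the filter's verdict disagrees with the true label.

```python
import math
from collections import Counter


class ToyBayes:
    """Toy naive Bayes token filter (illustrative only, not SpamAssassin's algorithm)."""

    def __init__(self):
        self.tokens = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def train(self, text, label):
        # Count each distinct token once per message.
        self.messages[label] += 1
        self.tokens[label].update(set(text.lower().split()))

    def spam_log_odds(self, text):
        # Sum of per-token log likelihood ratios with add-one smoothing.
        # Class priors are left out, i.e. assumed equal.
        log_odds = 0.0
        for tok in set(text.lower().split()):
            p_spam = (self.tokens["spam"][tok] + 1) / (self.messages["spam"] + 2)
            p_ham = (self.tokens["ham"][tok] + 1) / (self.messages["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        return log_odds


def train_on_error(clf, text, true_label):
    """Train only when the filter's verdict disagrees with the true label."""
    predicted = "spam" if clf.spam_log_odds(text) > 0 else "ham"
    if predicted != true_label:
        clf.train(text, true_label)
        return True
    return False
```

Note that the ham counts sit in the denominator of every token's ratio, so the ham side still matters even if you only ever add spam. Training on errors from both directions keeps the two sides from drifting apart.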
As I understand it, SA automatically trains itself on the spam it traps, which keeps it tuned to whatever spam techniques are au courant. I haven't seen much in the way of false positives out of SA in quite a while. The gmail spam filter, on the other hand, seems to produce a few false positives every week.

On Mon, 24 Jan 2005 08:46:17 -0500, Brian Henning <[EMAIL PROTECTED]> wrote:
> Hi Guys,
> As I'm sure the postmasters among us are well-aware, in order to keep
> my Bayesian filter working efficiently, it's occasionally necessary to
> retrain it for new tacks the spammers have invented. My question is
> thus: Is it "safe" and/or wise to train with only -spam input? Our
> non-spam profile hasn't changed much, so the previous -ham training
> ought to still be valid...right? Or do I need to somehow balance the
> training "sessions" with equal parts -spam and -ham even though the -ham
> still looks about the same?
>
> Thanks as always,
> ~Brian
> --
> TriLUG mailing list : http://www.trilug.org/mailman/listinfo/trilug
> TriLUG Organizational FAQ : http://trilug.org/faq/
> TriLUG Member Services FAQ : http://members.trilug.org/services_faq/
> TriLUG PGP Keyring : http://trilug.org/~chrish/trilug.asc
