Good afternoon folks,

I've been playing around with the new SpamAssassin, version 2.50, which includes Bayesian filtering (see http://www.paulgraham.com/spam.html for the paper about this, mentioned at ESR's talk, and see the man page for the "sa-learn" command).

As per the sa-learn man page, the default in SA 2.50 is to operate in unsupervised auto-learning mode. This means that mail is added to the "ham" and "spam" databases based on whether SpamAssassin's other rules mark it as spam or not. The man page mentions that this "should be supplemented with some supervised training in addition, if possible."

How do I go about "supplementing" the auto-learning mode? One problem I can see with auto-learning is that missed spams get learned as "ham" (non-spam) and could mess up the database. So I'm collecting these mistakes, but how do I properly adjust the database? Do I need to make it "forget" the mistaken emails first, and then run them through sa-learn with --ham? Or is running them through with --ham enough?

Does anyone know of resources/HOWTOs/examples with actual commands, instead of generalized statements like "supplement with supervised training"?

====

If anyone else is interested in testing SpamAssassin, it is installed on the TriLUG mail server now. Just put something like this in your .procmailrc:

    :0fw
    | /usr/bin/spamc

Then your mail will be marked with the X-Spam-Status header, which you can filter on if you like.

Regards,
Jeremy

--
/=====================================================================\
| Jeremy Portzer         [EMAIL PROTECTED]         trilug.org/~jeremy |
| GPG Fingerprint: 712D 77C7 AB2D 2130 989F  E135 6F9F F7BC CC1A 7B92 |
\=====================================================================/
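
P.S. For anyone who wants to act on the X-Spam-Status header mentioned above, here is a minimal sketch of a follow-up recipe to place after the spamc one in .procmailrc. It assumes SpamAssassin's usual "X-Spam-Status: Yes" marking on flagged mail, and the "caughtspam" mailbox name is just a placeholder, so substitute your own folder.

```procmail
# Deliver anything SpamAssassin flagged into a separate mbox file.
# "caughtspam" is a hypothetical folder name, relative to $MAILDIR.
# The trailing colon in ":0:" asks procmail to use a local lockfile.
:0:
* ^X-Spam-Status: Yes
caughtspam
```

Mail that is not flagged falls through to whatever recipes follow, or to your normal inbox.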
