Hi Jon,

The Bayesian filtering that I've been working with uses per-user databases in ~/.spamassassin, not a site-wide database. So the site-wide administrator need not worry about it (if the users are savvy). But you make some good points about how to make it easier if there is a site-wide database.

My question was more along the lines of, what's the proper way to
"submit it back via the sa-learn command" (Specifically for missed
spams; I haven't seen any false positives yet.)  Do I use the --forget
option because the message would have been counted as non-spam earlier?
Or do I just use the --spam option?  None of the documentation is
specific on this.

--Jeremy

On Thu, 2003-03-13 at 14:00, Jon Carnes wrote:
> The training looks like a pain in the a**, but I think you could make it
> easier on the folks by setting up some scripts to accept forwarded
> messages from your local users.
>
> Local users would forward mistakenly tagged messages to one of two
> addresses:
>   sa_nospam - indicating that this shouldn't have been marked as spam
>   sa_spam - indicating that this spam message slipped through
>
> It would be up to you Jeremy to have your script reshape the message to
> its original form and then submit it back via the sa-learn command.
>
> Just an idea, but it may work (and be a good contrib back into the
> community).
>
> Jon
>
> On Thu, 2003-03-13 at 13:11, Jeremy Portzer wrote:
> > Good afternoon folks,
> >
> > I've been playing around with the new Spamassassin, version 2.50, which
> > includes Bayesian filtering (see http://www.paulgraham.com/spam.html for
> > the paper about this, mentioned at ESR's talk, and see the man page for
> > the "sa-learn" command).
> >
> > As per the sa-learn man page, the default in SA 2.50 is to operate in
> > Unsupervised auto-learning.  This means that mail is populated in the
> > "ham/spam" databases based on whether SpamAssassin marks it as spam or
> > not, from the other rules.  The man page mentions that this "should be
> > supplemented with some supervised training in addition, if possible."
> >
> > How do I go about "supplementing" the auto-learning mode?  One problem I
> > can see with auto-learning is that missed spams become marked as "ham"
> > (non-spam) and could mess up the database.  So I'm collecting these
> > mistakes, but how do I properly adjust the database?  Do I need to make
> > it "forget" the mistaken emails first, and then run them through
> > sa-learn with --ham?  Or is running them through with --ham enough?
> >
> > Anyone know of resources/HOWTOs/examples with actual commands, instead
> > of generalized statements like "supplement with supervised training" ?
> >
> > ====
> >
> > If anyone else is interested in testing SpamAssassin, it is installed on
> > the TriLUG mail server now.  Just put something like this in your
> > .procmailrc :
> >
> > :0fw
> > | /usr/bin/spamc
> >
> > Then your spam will be marked with the X-Spam-Status header, which you
> > can filter on if you like.
> >
> > Regards,
> > Jeremy
> >
> > --
> > /=====================================================================\
> > | Jeremy Portzer [EMAIL PROTECTED] trilug.org/~jeremy |
> > | GPG Fingerprint: 712D 77C7 AB2D 2130 989F E135 6F9F F7BC CC1A 7B92 |
> > \=====================================================================/
>
> _______________________________________________
> TriLUG mailing list
> http://www.trilug.org/mailman/listinfo/trilug
> TriLUG Organizational FAQ:
> http://www.trilug.org/~lovelace/faq/TriLUG-faq.html

--
/=====================================================================\
| Jeremy Portzer [EMAIL PROTECTED] trilug.org/~jeremy |
| GPG Fingerprint: 712D 77C7 AB2D 2130 989F E135 6F9F F7BC CC1A 7B92 |
\=====================================================================/
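
For anyone who wants the two options from the question spelled out as commands, here is a rough sketch and not a confirmed answer. It takes the cautious route: run --forget on each missed spam first, then learn it with --spam. Assumptions: the helper name relearn_as_spam and the SA_LEARN variable are made up for this example, the collected messages sit one per file in a directory, and sa-learn accepts a plain file path as its target. The 2.50 man page may want the message passed a different way, so check "man sa-learn" before relying on this. Whether --spam on its own already undoes the earlier auto-learned "ham" entry in 2.50 is exactly the open question, so the explicit --forget is only a precaution.

```shell
#!/bin/sh
# Sketch only: relearn missed spams that auto-learning may have filed as ham.
# SA_LEARN and relearn_as_spam are hypothetical names for this example.
# Assumption: sa-learn takes a plain message file path as its target.

SA_LEARN="${SA_LEARN:-sa-learn}"

relearn_as_spam() {
    dir="$1"
    for msg in "$dir"/*; do
        # Skip anything that is not a regular file (e.g. an empty directory glob).
        [ -f "$msg" ] || continue
        # Drop whatever the database learned from this message earlier ...
        "$SA_LEARN" --forget "$msg"
        # ... then learn it again, this time as spam.
        "$SA_LEARN" --spam "$msg"
    done
}
```

Usage would be something like "relearn_as_spam ~/missed-spam", where ~/missed-spam is a hypothetical directory holding the saved messages in their original form (full headers, not forwarded copies). For false positives the same loop would use --ham in place of --spam.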
