Re: Why shouldn't I set the score for SPAM_99 and SPAM_999 higher?

Dave Wreski Thu, 05 May 2022 12:29:39 -0700

That's a great call, thanks. I grepped my mail files and didn't find any SPAM_99 headers in any of them.
You should be looking for BAYES_99 and BAYES_999 in your corpus.
Thanks, Dave. I use my various mailboxes (sa-learn --ham --mbox /home/thomas.cameron/mail/INBOX/[mailbox file] and then sa-learn --spam --mbox /home/thomas.cameron/mail/INBOX/spam) to train SA. Doesn't that mean that I've already checked my corpora?

No, that's how you train on your corpora. If you manually look through the headers of mail that's already been processed by your mail system, the ham should be as close to BAYES_00 as possible, and the spam should be at BAYES_99. If that's not the case, then it's been trained incorrectly.
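A rough way to do that header check in bulk, rather than eyeballing each message, is to count the BAYES_* rule names that show up in an mbox. This is only a sketch: count_bayes_hits is a made-up helper name, it assumes a grep that supports -o and -h (GNU and BSD grep both do), and it counts every occurrence in the file, so a rule name that also appears in a report header or a message body gets counted too.

```shell
# Hypothetical helper: tally BAYES_* rule names found in one or more mbox files.
# Counts are approximate, since body text mentioning a rule is counted as well.
count_bayes_hits() {
    # -o prints only the matched text, -h suppresses file names,
    # -E enables the + quantifier.
    grep -ohE 'BAYES_[0-9]+' "$@" | sort | uniq -c | sort -rn
}
```

Running count_bayes_hits /home/thomas.cameron/mail/INBOX/spam should be dominated by BAYES_99 and BAYES_999 if the training is sound, and a ham mailbox should be dominated by BAYES_00.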


/etc/mail/spamassassin/local.cf:
bayes_auto_learn  0
bayes_auto_expire 0

I'd also recommend disabling auto-learn, if you have that enabled.

If you've gone through your corpus manually, and are certain the ham is all good mail and the spam emails are all bad mail, then it might be worth it to dump the existing bayes database and just retrain it with the corresponding mboxes.


I also typically add --progress to sa-learn.
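For reference, the dump-and-retrain sequence might look roughly like the following. Treat it as a sketch under assumptions: retrain_bayes is a hypothetical wrapper, the backup file name is an arbitrary default, and it has to run as whichever user owns the bayes database on your system. The sa-learn options themselves (--backup, --clear, --ham, --spam, --mbox, --progress, --dump magic) are standard ones.

```shell
# Hypothetical wrapper: rebuild the Bayes database from hand-verified mboxes.
# Usage: retrain_bayes HAM_MBOX SPAM_MBOX [BACKUP_FILE]
retrain_bayes() {
    ham_mbox="$1"
    spam_mbox="$2"
    backup_file="${3:-bayes-backup.txt}"   # assumed location, change as needed

    # Save the current database first so it can be restored with
    # "sa-learn --restore" if the retrain turns out worse.
    sa-learn --backup > "$backup_file" || return 1

    # Wipe the existing tokens, then learn from the verified corpora.
    sa-learn --clear || return 1
    sa-learn --progress --ham --mbox "$ham_mbox" || return 1
    sa-learn --progress --spam --mbox "$spam_mbox" || return 1

    # Print the database summary, including the nspam and nham counts.
    sa-learn --dump magic
}
```

Call it once per ham mailbox paired with the spam mbox, or adapt it to loop over several ham files. The nspam and nham lines in the final dump show how many messages of each kind were learned.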

Best,
Dave


Thomas
