Re: Workflow for adding new ham/spam to existing site-wide database?

Steve Dondley Tue, 16 Mar 2021 12:34:11 -0700

You covered a lot of ground here. Thanks. If you have some spare cycles, I have follow-up questions to get an understanding of how you process your email:

21 seconds at that includes fetch the samples via imap from two
folders, fire them against a bayes-only spamassassin instance,

What is a "bayes-only" instance? I don't follow. What other kinds of instances are there?



ignore BAYES_00/BAYES_99 messages, move the rest to both training
folders, anonymize them, strip useless headers, fire sa-learn against

OK, so it looks like you are suggesting that emails get pre-screened to determine whether they are obvious spam or not.
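To make the question concrete, here is a minimal sketch of what I imagine such pre-screening could look like. It assumes the bayes-only instance has already added an X-Spam-Status header listing the rules that hit, and that a message hitting neither BAYES_00 nor BAYES_99 is the kind worth training on. The function names are made up for illustration; this is not necessarily how the setup described above does it.

```python
import re
from email import message_from_string

# Rules that mean Bayes is already confident about the message.
CONFIDENT_RULES = {"BAYES_00", "BAYES_99"}


def bayes_rules_hit(raw_message: str) -> set:
    """Return the BAYES_* rule names found in the X-Spam-Status header."""
    msg = message_from_string(raw_message)
    status = msg.get("X-Spam-Status", "")
    # The header may be folded across lines; findall copes with that.
    return set(re.findall(r"\bBAYES_\d+\b", status))


def needs_training(raw_message: str) -> bool:
    """True if Bayes did not already score the message as BAYES_00/BAYES_99."""
    return not (bayes_rules_hit(raw_message) & CONFIDENT_RULES)
```

Under this reading, a message with no BAYES_* hit at all (for example, an untrained database) also counts as needing training.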

And by anonymize, what do you mean? Remove the headers that contain email addresses? What other headers are useless? What exactly is the goal of anonymizing and removing the headers? I think I have a vague idea why but can't quite crystallize it in my head.
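For reference, this is roughly what I picture when I hear "strip useless headers": a small sketch using Python's standard email module. The list of header names is purely my guess for illustration, not the list used in the workflow quoted above, and strip_headers is a hypothetical helper.

```python
from email import message_from_string

# Hypothetical choice of headers to drop; the real list is what I am asking about.
DROP_HEADERS = ("Received", "Delivered-To", "X-Spam-Status", "X-Spam-Checker-Version")


def strip_headers(raw_message: str, drop=DROP_HEADERS) -> str:
    """Return the message with every occurrence of the listed headers removed."""
    msg = message_from_string(raw_message)
    for name in drop:
        # Deleting by name removes all matching headers and is a no-op if absent.
        del msg[name]
    return msg.as_string()
```

The body is left untouched here; anonymizing addresses inside the body or in the remaining headers would be a separate step.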

both folders, fire bogofilter training against both folders and verify
that the new sample files score with BAYES_99/BAYES_00 now


bogofilter training?

So the goal is to get all the new emails to score either 99 (spam) or 00 (ham).

So once I verify they score 00 or 99, do I then throw them on the larger collection of ham/spam with all headers restored? And what do I do if they still don't score 00 or 99?
