On Tue, 16 Apr 2019 15:16:59 +0300
Jari Fredriksson wrote:

> John Hardin wrote on 15.4.2019 1:33:
> > On Sun, 14 Apr 2019, Jari Fredriksson wrote:
> >
> >> Now, I am part of RuleQA. Should I accept everything and pass it
> >> to SpamAssassin and to my corpus or not?
> >
> > I would suggest yes, you should accept everything that reaches your
> > spamtrap addresses and include it in your corpora. Don't worry about
> > that, worry about whether or not the messages get correctly
> > classified.
>
> Thanks. I might test next weekend about dropping the postscreen
> scanning.

Before you do that, I would suggest you read what Henrik K wrote:

"There are already major spamtrappers etc contributing to ruleqa, I
think most of the 'easy dialup' spam is seen there too. Just try to look
for the hard to catch spam not ending up in ham corpus."

IMO the corpus should contain a bit of everything, but ideally it
should be dominated by the spam that would reach SA on a server
following best practice. A corpus generated from unfiltered spamtraps is
very heavily biased in the wrong direction.

Also, if someone is processing mail from an MTA they don't control, that
doesn't mean there's no upstream MTA filtering. My experience is that
lighter (low FP) spam filtering is something you have to pay extra for
these days.