Marcin Krol wrote:
Matus UHLAR - fantomas wrote:
- blocking at the MTA by RBL or other techniques (such as graylisting)
is efficient and effective, but deprives SpamAssassin of spam samples,
so if your resources permit, it is better to let SpamAssassin deal
with all RBLs.
I don't think so. We get "enough" spam even when using many RBLs at the
SMTP level.
Plus, note that the characteristics of spam that got through the RBL
"sieve" *might* differ from those of the spam that didn't.
If so - I have not done any tests, so I have no idea really - then Bayes
would be at least partially mistrained.
Having said that, I do have exceptions to my sender-verify and RBL rules
for spam traps. :-) Now, getting something useful done with that stuff
is another story.
Genuine spam traps are great for Bayes training, as they should contain a
representative sample of the spam your users will be seeing. Plus, you
know they only contain spam, so you don't need to check the contents
before feeding them to Bayes to learn :)
I do the same - whitelist a few *good* spamtraps through all my
different levels of filtering specifically to feed Bayes. I also use
these for statistical analysis to see which types of mail SA scores
poorly on, and then target custom rules at that spam to help bump
the scores.
I'm sure there's other useful stuff you can do with spamtrap mails too.
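
For the statistical analysis mentioned above, here is a rough sketch of
one way to find out which spamtrap mails SA scores poorly on. It assumes
the trap mail has already been through SA and carries the default
X-Spam-Status header layout ("score=... required=... tests=...").
The function names and the mbox path are made up for illustration; this
is not anyone's actual setup.

```python
import mailbox
import re
from collections import Counter


def parse_status(header):
    """Return (score, required, [rule names]), or None if the header does not parse."""
    # Folded headers put whitespace after commas in the tests list; undo that.
    flat = re.sub(r",\s+", ",", header)
    score = re.search(r"score=(-?\d+(?:\.\d+)?)", flat)
    required = re.search(r"required=(-?\d+(?:\.\d+)?)", flat)
    tests = re.search(r"tests=(\S*)", flat)
    if not (score and required and tests):
        return None
    # SA writes "tests=none" when no rule hit.
    rules = [t for t in tests.group(1).split(",") if t and t != "none"]
    return float(score.group(1)), float(required.group(1)), rules


def summarize_misses(headers):
    """Count trap mails scoring below the threshold and tally the rules that hit on them."""
    total = 0
    missed = 0
    rule_hits = Counter()
    for header in headers:
        parsed = parse_status(header)
        if parsed is None:
            continue  # not scanned, or a non-default header format
        total += 1
        score, required, rules = parsed
        if score < required:
            missed += 1
            rule_hits.update(rules)
    return total, missed, rule_hits


def headers_from_mbox(path):
    """Yield the X-Spam-Status header of every message in an mbox file."""
    for message in mailbox.mbox(path):
        yield str(message.get("X-Spam-Status", ""))


if __name__ == "__main__":
    # Hypothetical path; point it at your own spamtrap mailbox.
    total, missed, rule_hits = summarize_misses(headers_from_mbox("spamtrap.mbox"))
    print(f"{missed} of {total} trap mails scored below the threshold")
    for rule, count in rule_hits.most_common(10):
        print(f"{count:6d}  {rule}")
```

The rules that show up most often on the missed mails are only a starting
point. The interesting part is usually what did *not* hit, and for that
you still have to read the mails themselves.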