Ned Slider wrote:
> Genuine spam traps are great for bayes training as they should contain a
> representative sample of spam your users will be seeing plus you know
> they only contain spam so you don't need to check the contents before
> feeding them to bayes to learn :)
> 

You must be careful with traps. They can receive non-spam mail:

- bounces (backscatter). You may consider this spam, but I'm not sure
it won't simply poison your bayes.

- spammers can use the trap address in subscription forms. (I mean, if
they can send mail to these addresses, then they can also use them in
other ways; and if they can't send mail to them, the addresses are
useless as traps!) So you should at least exclude "confirmation requests".
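
As a rough illustration of the two exclusions above, here is a minimal
Python sketch that decides whether a trap message looks safe to learn as
spam. The function names, the header checks and the subject keywords are
all assumptions made for this example, not part of SpamAssassin, and they
would need tuning against real trap traffic.

```python
import email
import re

# Hypothetical keyword list for subscription confirmations; adjust to taste.
CONFIRM_RE = re.compile(r"\b(confirm\w*|verify|activate|subscri\w+)\b",
                        re.IGNORECASE)


def looks_like_bounce(msg):
    """Heuristic: null return path, daemon sender, or a delivery report."""
    return_path = (msg.get("Return-Path") or "").strip()
    sender = (msg.get("From") or "").lower()
    if return_path == "<>":
        return True
    if "mailer-daemon" in sender or "postmaster" in sender:
        return True
    return msg.get_content_type() == "multipart/report"


def looks_like_confirmation(msg):
    """Heuristic: the subject mentions confirming or subscribing."""
    subject = msg.get("Subject") or ""
    return bool(CONFIRM_RE.search(subject))


def safe_to_learn_as_spam(raw_bytes):
    """Return True only if the message is neither a bounce nor a confirmation."""
    msg = email.message_from_bytes(raw_bytes)
    return not (looks_like_bounce(msg) or looks_like_confirmation(msg))
```

Messages that pass could then be handed to `sa-learn --spam`; anything the
check rejects is better set aside for a quick manual look than discarded,
since these keyword tests will also catch some real spam (phishing that
asks you to "verify" an account, for instance).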

I do "whitelist" some pseudo-traps from time to time, but I manually
review the messages (quickly, of course).

> I do the same - whitelist a few *good* spamtraps through all my
> different levels of filtering specifically to feed bayes. I also use
> these for statistical analysis to see which types of mail SA scores
> poorly on and then target custom rules towards those spam to help bump
> the scores.
> 
> I'm sure there's other useful stuff you can do with spamtrap mails too.
