RE: question on training spamassassin

Webmaster Thu, 02 Mar 2006 09:26:26 -0800

 

> -----Original Message-----
> From: Matt Kettler [mailto:[EMAIL PROTECTED] 
> Sent: March 2, 2006 8:53 AM
> To: [EMAIL PROTECTED]
> Cc: users@spamassassin.apache.org
> Subject: Re: question on training spamassassin
> 
> Webmaster wrote:
> 
> >> Also, if your users are only or mostly forwarding spam, SA's bayes
> >> is going to develop a bias that all messages forwarded by your
> >> mail clients are spam, regardless of content.
> >>
> >>
> > 
> > Does this also mean that it is almost useless to share bayes from
> > one server to the next if each server has its own set of hosted
> > domains?
> 
> Yes, it's definitely very sub-optimal to share bayes DBs across
> different domains, but not because of header differences.
> 
> The reason this is useless is that the nonspam mail received by
> different domains is not likely to be similar.
> 
> Take for example a shipping company and a law firm. How much
> similarity is there going to be in the day-to-day nonspam of these
> sites? Sure, both are likely to have some personal "Hi hon, working
> late, be home at 7pm" type emails. However, their commercial nonspam
> is going to be VERY different.
> 
> 
> > Because if the headers play such an important role, spams targeting
> > different sets of domains, I assume, are learned differently.
> 
> To some degree, yes, but this is less severe than forwarding.
> 
> At least things like source IP, User-Agent, Message-ID and other
> patterns are NOT going to be different across domains.
> 
> To: and Received: headers will be considerably different, but with a
> forward you retain ZERO of the original headers.
>


OK, gotcha!
At least it will not be entirely useless. I am assuming that the
false-positive issues with non-spam mail will not be severe.

Thanks.
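
For anyone reading this thread later, the forwarding bias described in the
quoted text can be illustrated with a toy sketch. This is NOT SpamAssassin's
actual tokenizer or probability math. The ToyBayes class, the "H:" token
prefix, the mailer names and the smoothing formula are all assumptions made
up for illustration. It only shows why a header token that appears in every
trained spam, and in no trained nonspam, ends up looking very spammy
regardless of the message body.

```python
from collections import Counter


def tokenize(message):
    """Split a raw message into header tokens (prefixed 'H:') and body tokens."""
    headers, _, body = message.partition("\n\n")
    tokens = set()
    for line in headers.splitlines():
        name, _, value = line.partition(":")
        for word in value.split():
            tokens.add("H:" + name.strip().lower() + ":" + word.lower())
    tokens.update(word.lower() for word in body.split())
    return tokens


class ToyBayes:
    """Hypothetical per-token spam estimator, for illustration only."""

    def __init__(self):
        self.spam = Counter()
        self.ham = Counter()
        self.n_spam = 0
        self.n_ham = 0

    def learn(self, message, is_spam):
        tokens = tokenize(message)
        if is_spam:
            self.spam.update(tokens)
            self.n_spam += 1
        else:
            self.ham.update(tokens)
            self.n_ham += 1

    def spam_probability(self, token):
        # Add-one smoothing so unseen tokens do not produce 0 or 1.
        p_spam = (self.spam[token] + 1) / (self.n_spam + 2)
        p_ham = (self.ham[token] + 1) / (self.n_ham + 2)
        return p_spam / (p_spam + p_ham)


# Every trained spam was forwarded, so it carries the forwarder's headers
# (a made-up User-Agent here) instead of the original sender's.
bayes = ToyBayes()
for body in ("cheap pills now", "win a prize", "refinance today"):
    bayes.learn("User-Agent: ExampleMailer\n\n" + body, is_spam=True)

# Nonspam arrived directly, with the real senders' (made-up) headers.
for agent, body in (
    ("OtherMailer", "meeting at noon"),
    ("ThirdMailer", "invoice attached"),
    ("OtherMailer", "working late, home at 7pm"),
):
    bayes.learn("User-Agent: " + agent + "\n\n" + body, is_spam=False)

forwarder_token = "H:user-agent:examplemailer"
print(bayes.spam_probability(forwarder_token))
```

In this toy setup the forwarder's User-Agent token scores well above 0.5, so
any later message sent from that same mail client would start out with a
push toward "spam" before its content is even considered.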
