On Tue, 14 Oct 2014 13:58:27 +0200
Axb wrote:

> On 10/14/2014 01:51 PM, RW wrote:
> > On Tue, 14 Oct 2014 10:44:51 +0200
> > Axb wrote:
> >
> >>
> >> have you verified that some of these are not included?
> >>
> >> X-Originating-IP will not be included as it can be used to help
> >> detect ham or spam
> >
> > It's really no different to other headers you are ignoring.
> 
> for example, if you get a flood of 419s from the same source, you may 
> want it to be tokenized... 


As I do with, for example:

  X-AntiAbuse: Originator/Caller UID/GID - [514 32007] / [47 12]

In this spam, Bayes found

  0.999-4--HX-AntiAbuse:32007

These numbers seem to be very good indicators for me. 
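
As a rough sketch of how a token of that shape could come about: the
snippet below splits a header value into words and prefixes each with
"H<header-name>:". The function name and the splitting rule are my own
assumptions for illustration; this is not SpamAssassin's actual
tokenizer, which has more rules than this.

```python
import re

def tokenize_header(name, value):
    """Split a header value into tokens prefixed with 'H<name>:'.

    Hypothetical helper: it only imitates the shape of the token quoted
    above and is not SpamAssassin's real tokenizer.
    """
    # Keep runs of letters and digits; punctuation such as [ ] / - is dropped.
    words = re.findall(r"[A-Za-z0-9]+", value)
    return ["H%s:%s" % (name, w) for w in words]

tokens = tokenize_header(
    "X-AntiAbuse", "Originator/Caller UID/GID - [514 32007] / [47 12]"
)
print(tokens)
```

With this splitting rule, each UID/GID number becomes its own token, so
a number that keeps showing up in spam can be learned on its own.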


Most of the headers in the file have never appeared in my ham, so
they'll be pure spam indicators if they are ever faked. In general
it's difficult for a spammer to gain an overall advantage against
an average per-user database using faked headers.
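
To put a number on "pure spam indicator", here is a textbook
Robinson-style token estimate. The function, the constants s and x, and
the message counts are all assumptions made up for illustration; they
are not taken from my database and are not necessarily what
SpamAssassin computes internally.

```python
def token_spam_probability(spam_hits, ham_hits, n_spam, n_ham, s=1.0, x=0.5):
    """Robinson-style smoothed spam probability for one token.

    s is the weight given to the prior and x is the prior for a token
    that has never been seen; both values are assumptions.
    """
    spam_rate = spam_hits / n_spam
    ham_rate = ham_hits / n_ham
    if spam_rate + ham_rate == 0:
        return x  # token never seen: fall back to the prior
    p = spam_rate / (spam_rate + ham_rate)
    n = spam_hits + ham_hits
    # Pull the raw estimate towards the prior when there is little evidence.
    return (s * x + n * p) / (s + n)

# Hypothetical counts: a header token seen in 20 of 1000 spams, never in ham.
never_in_ham = token_spam_probability(20, 0, 1000, 1000)
# The same token if it had also turned up in 20 of 1000 hams.
also_in_ham = token_spam_probability(20, 20, 1000, 1000)
print(never_in_ham, also_in_ham)
```

Under these assumptions, a token with no ham sightings ends up close to
1, while one that is equally common in ham stays at 0.5, which is why a
faked header that never occurs in a user's ham is unlikely to help the
spammer.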

Whatever the merits of this on system-wide Bayes (if any, beyond
reducing token count), I think it would have a negative effect on
per-user Bayes.
