Re: low spam scores (was "bayes scoring q")

Glenn Little 10 Mar 2004 23:52:14 -0000

We seeded the bayes db with manual training at the
beginning (a few hundred messages each of ham and spam,
I think), and have been letting it auto-learn since
then.  Is that a bad approach?

I also saw a score config for HABEAS_VIOLATOR, but
it wasn't triggered by our spam with habeas headers.

                -glenn

Matt Kettler wrote:

At 06:05 PM 3/10/2004, [EMAIL PROTECTED] wrote:
We have auto-learn, and many spams don't make
a high enough score to be auto-learned as spam.  In addition,
some spams actually score low enough (see the habeas problem
I mentioned earlier) to be auto-learned as ham :-(

Autolearn is a good thing, but how much manual training are you doing?
Relying on autolearning as your sole source of bayes training is a very bad idea, and prone to disaster.
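
One way to narrow the window in which low-scoring spam gets auto-learned as ham is to tighten the auto-learn thresholds in local.cf. The fragment below is only a sketch: the values are illustrative assumptions, not tested recommendations, and the option names are the bayes_-prefixed spellings. Older releases may use the same names without that prefix, so check the Mail::SpamAssassin::Conf documentation for your version.

```
# Illustrative values only (assumptions, not tested recommendations).
bayes_auto_learn 1
# Auto-learn as ham only when the message scores below this value.
bayes_auto_learn_threshold_nonspam -1.0
# Auto-learn as spam only when the message scores above this value.
bayes_auto_learn_threshold_spam 12.0
```

Thresholds only limit what auto-learning picks up. They do not replace regular manual training with sa-learn (--ham and --spam) on hand-sorted mail.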

I might also suggest the following to help mitigate some of the habeas damage:
bayes_ignore_header X-Habeas-SWE-1
bayes_ignore_header X-Habeas-SWE-2
bayes_ignore_header X-Habeas-SWE-3
bayes_ignore_header X-Habeas-SWE-4
bayes_ignore_header X-Habeas-SWE-5
bayes_ignore_header X-Habeas-SWE-6
bayes_ignore_header X-Habeas-SWE-7
bayes_ignore_header X-Habeas-SWE-8
bayes_ignore_header X-Habeas-SWE-9
This stops the bayes database from giving either ham or spam points just because an email has these headers. Since there's already a rule for them, there's no reason to give "double credit" by giving them bayes consideration as well.
