Hi,

As a first step, I'd turn off all remote checks (-L) and use only the
Bayes database. I found that auto-whitelisting on a large volume of
mail slowed things down and increased the I/O on the SA disks. This is
because each spawned spamd process tries to gain a lock on the
auto-whitelist database, which can sometimes cause unnecessary
delays.
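
As a rough sketch, starting spamd with local tests only could look like
the lines below. This assumes a 2.6x-era spamd, where -L restricts
scanning to local tests and auto-whitelisting is only active when -a is
passed. The -d (daemonize) flag is just a typical extra and is not taken
from the setup described here.

```
# Local tests only (no network checks); -a is left out so the
# auto-whitelist database is never opened or locked.
spamd -d -L
```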

Another option to look at is the Bayes learning thresholds. Setting them
too low or too high could also cause problems. My score threshold is
currently set to 9.5 points. With this score, my Bayes spam threshold is
12 and my ham threshold is -2.0.
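
In local.cf, those values would look roughly like the fragment below.
It is only a sketch: the option names assume SpamAssassin 2.6x (newer
releases call the first one required_score), and it assumes the "spam"
and "ham" thresholds mentioned above are the Bayes auto-learn
thresholds.

```
# Score at which a message is tagged as spam
required_hits                       9.5

# Auto-learn as spam only at or above this score
bayes_auto_learn_threshold_spam     12

# Auto-learn as ham only at or below this score
bayes_auto_learn_threshold_nonspam  -2.0
```

Keeping the spam learning threshold well above the tagging score means
only clear-cut spam is fed back into the Bayes database.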

See how it goes from here. 

--

Regards

Richard Mayhew
Unix Specialist

MWEB Business
Tel:  + 27 11 340 7200
Fax:  + 27 11 340 7288
Website: www.mwebbusiness.co.za

-----Original Message-----
From: Johann Spies [mailto:[EMAIL PROTECTED] 
Sent: 10 May 2004 10:38 AM
To: [EMAIL PROTECTED]
Subject: Serious bayesian filter problems

It happened the first time on 26-28 April: the bayesian filter began to
give a lot of false positives.

I replaced the bayesian database on that server with that of a
second one.  Apart from auto-learning I run sa-learn at least once a
day feeding them the same input. 

These mail servers accept about 100 000 emails per day and together
stop about 25 000 unwanted messages (most of it spam) at smtp-level.

Yesterday the same thing happened on the second server: the bayesian
filter/auto whitelisting combination started to give false positives:
The same message tested with spamc scored a 9.3 on the one server and
2.0 on the other (which is about what it should be). I then used
sa-learn to learn the message as ham on the first one and tested it
again: 9.4!  The threshold is 8.0.

As a result of this behaviour I even received a warning from the
spamassassin mailing list server that messages sent to me bounced.

It might be both auto-whitelisting and bayesian corruption. 

I cannot afford unreliable software to do this important job.  Am I
the only one who experiences this type of behaviour?

How can I prevent this?  I can not watch spamassassin 24 hours per day
to jump in when something goes wrong.  

Regards
Johann

-- 
Johann Spies          Telefoon: 021-808 4036
Informasietegnologie, Universiteit van Stellenbosch

     "My son, do not despise the LORD's discipline and do
      not resent his rebuke, because the LORD disciplines
      those he loves, as a father the son he delights in."
                                       Proverbs 3:11,12 
