As a result I have been going back over the documentation to see what I
can do to improve the situation. I suspect the problem may be the sheer
volume of spam that has been fed to the bayes database. Currently I
have a little over 47000 spam and fewer than 4000 ham in the database. Each day they have been adding several hundred spam to the database, and
sometimes over the weekend there will be several thousand spam added.
Would I improve things by flushing the entire database and starting over? Is the imbalance between spam and ham causing the scores to drift, and should I try to keep the two counts closer together?
Interestingly, I'm seeing the problem in reverse (high ham count, low spam count). I'm wondering whether the maths behind bayes breaks down when there's a large imbalance between spam and ham.
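
For what it's worth, here is a rough sketch of the per-token arithmetic in a Robinson-style Bayes filter. It is not SpamAssassin's actual code: the function names, the example counts for the token, and the constants s=1.0 and x=0.5 are assumptions for illustration only. It suggests that the spam/ham ratio by itself should not skew a token's score, because the hit counts are divided by the number of messages learned in each class before they are compared:

```python
def token_spam_probability(spam_hits, ham_hits, nspam, nham):
    """Per-token spam estimate, normalised by the size of each corpus.

    spam_hits / ham_hits: messages of each class that contain the token.
    nspam / nham: total messages learned in each class.
    """
    spam_rate = spam_hits / nspam
    ham_rate = ham_hits / nham
    if spam_rate + ham_rate == 0:
        return 0.5  # token never seen: no evidence either way
    return spam_rate / (spam_rate + ham_rate)


def smoothed(p, n, s=1.0, x=0.5):
    """Robinson-style smoothing: pull p towards the prior x when the
    token has been seen in only a few messages (n is small).
    s and x are illustrative values, not SpamAssassin's settings."""
    return (s * x + n * p) / (s + n)


if __name__ == "__main__":
    # Counts from the quoted message: roughly 47000 spam, 4000 ham.
    nspam, nham = 47000, 4000
    # Hypothetical token seen in 10% of each class.
    spam_hits, ham_hits = 4700, 400

    normalised = token_spam_probability(spam_hits, ham_hits, nspam, nham)
    raw = spam_hits / (spam_hits + ham_hits)  # ignores corpus sizes

    print(f"normalised estimate: {normalised:.2f}")
    print(f"raw count ratio:     {raw:.2f}")
    print(f"smoothed (n=5100):   {smoothed(normalised, spam_hits + ham_hits):.2f}")
```

A token that is equally common in both classes comes out neutral once the counts are normalised, whereas the raw count ratio would lean heavily towards spam. What the imbalance does do is make the estimates for the smaller class noisier, since each message there moves the rate much more than a message in the larger class.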
-- 
Robert Brooks, Network Manager, Cable & Wireless UK
<[EMAIL PROTECTED]> http://hyperlink-interactive.co.uk/
Tel: +44 (0)20 7339 8600 Fax: +44 (0)20 7339 8601
- Help Microsoft stamp out piracy. Give Linux to a friend today! -
