Well, my two cents on this: When I upgraded my servers (about half a year ago) and started using a MySQL-based Bayes DB, image spam began to drive me crazy. It seemed like there was no way to stop it. But after a good purge of the Bayes database, a rebuild, and the addition of sa-update rules, things began to get better. I have now implemented a system that lets my users train a global Bayes database, and I must say it is working almost flawlessly. Only a few discussion lists got BAYES_99 hits, and as soon as the users forwarded those messages to the ham training account (or moved them to their webmail-based HAM folders), the hits stopped.

I'm a small fish in this fight (two servers, about 400 users each, ~25000 messages a day, ~20000 rejected, mostly via zen.spamhaus.org, ~1100 spam messages, and ~30 virus messages a day), but I must say that taking good care of my Bayes database has greatly improved the spam-fighting capabilities of my servers. That care includes forgetting false positives (sa-learn --forget) and then feeding them to sa-learn as ham, forgetting false negatives and then having SA analyze and report them, etc. Luckily, I managed to write some scripts to do the work for me. They're still at the test stage, but so far they seem to perform very well...
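The forget-then-relearn routine described above could be scripted roughly as follows. This is only a sketch, not the scripts mentioned in this post: the function names and the dry-run behaviour are made up for illustration, and it uses plain `sa-learn --spam` for false negatives instead of the analyze-and-report step. The only SpamAssassin pieces it relies on are the `sa-learn` options `--forget`, `--ham` and `--spam`.

```python
import subprocess
from pathlib import Path
from typing import List


def retrain_commands(message: Path, correct_class: str) -> List[List[str]]:
    """Build the sa-learn calls that undo a wrong learning and relearn.

    correct_class is what the message really is: "ham" for a false
    positive, "spam" for a false negative.
    """
    if correct_class not in ("ham", "spam"):
        raise ValueError("correct_class must be 'ham' or 'spam'")
    path = str(message)
    return [
        # Step 1: drop whatever Bayes learned from this message before.
        ["sa-learn", "--forget", path],
        # Step 2: learn it again under the correct label.
        ["sa-learn", "--" + correct_class, path],
    ]


def retrain(message: Path, correct_class: str, dry_run: bool = True) -> None:
    """Print the commands (dry run) or actually run them."""
    for cmd in retrain_commands(message, correct_class):
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True raises if sa-learn exits with an error.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical file name: a newsletter that wrongly hit BAYES_99.
    retrain(Path("newsletter.eml"), "ham")
```

With `dry_run=True` nothing is touched; the two commands are only printed, which is handy while the scripts are still being tested.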
A taste: http://www.biol.unlp.edu.ar/cgi-bin/mailgraph.cgi

Luis

2007/3/23, Jim Maul <[EMAIL PROTECTED]>:
Marc Perkel wrote:
> Perhaps what I need to do is to get rid of autolearn and write my own
> learning system that strips out the body of messages with images and
> just learns the headers. My problem is that when users get image spam
> they put it in the spam folders and they get learned. But the text in
> the image spam causes ham type text to be learned as spam. That causes
> ham to get higher scores.
>

Are you sure of this? Have you also trained these ham messages to counter this effect?

Not too long ago we were in the same situation. I have autolearn enabled but I have adjusted the thresholds to avoid learning false positives/negatives. We were getting ham (although arguably, they were newsletter-type ham) that was hitting BAYES_99. As soon as I started training them as ham the problem went away. Spam is still detected correctly by Bayes and these newsletters no longer hit BAYES_99.

-Jim
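On the point about adjusting the autolearn thresholds: the idea is that a message is only auto-learned when its score falls clearly on one side. The sketch below is a simplified model of that decision, not SpamAssassin's actual code (the real autolearn check applies further conditions beyond the two thresholds). The default values 0.1 and 12.0 mirror the documented defaults of `bayes_auto_learn_threshold_nonspam` and `bayes_auto_learn_threshold_spam`; the tightened values in the usage lines are hypothetical, since the values actually used are not given in this thread.

```python
from typing import Optional


def autolearn_decision(score: float,
                       nonspam_threshold: float = 0.1,
                       spam_threshold: float = 12.0) -> Optional[str]:
    """Simplified autolearn rule: learn only outside the two thresholds."""
    if score < nonspam_threshold:
        return "ham"    # clearly low score: learn as ham
    if score >= spam_threshold:
        return "spam"   # clearly high score: learn as spam
    return None         # in between: do not auto-learn at all


if __name__ == "__main__":
    # Hypothetical tightened thresholds: fewer messages get auto-learned,
    # so fewer borderline mistakes end up in the Bayes database.
    for score in (-2.0, 0.0, 5.0, 13.0, 20.0):
        print(score, autolearn_decision(score, -1.0, 15.0))
```

Widening the gap between the two thresholds trades learning volume for safety: less is learned automatically, and manual ham/spam training covers the rest.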
-- 
-------------------------------------------------
GNU-GPL: "May The Source Be With You...
-------------------------------------------------
