Re: getting bayes back on track after habeas mess

Bob George 21 Mar 2004 06:11:27 -0000

Brian Dial wrote:

[...] Given the fact that I receive about 200 times more spam that contains the habeas headers than actual mail, I've decided to score habeas 0, rendering it useless. However, I cannot seem to reverse the damage it has done. My guess is that enough of those Fwd: Get All Meds. Fwd: V|@GRA \ Vali/u/m ( [EMAIL PROTECTED] etc... mails went through and were autolearned as ham.

When I started using bayes, I didn't enable auto-learn for a LONG time, preferring instead to train manually. Bayes is fragile in that way. I can't offer a fix, but considering the bayes database is polluted, and you don't know exactly by what, a fresh start might be easiest.

I would recommend saving anything you train on for some time, if not indefinitely. I'm keeping about 2,000 "fresh" spam around for retraining purposes. I've managed to screw up my bogofilter bayes database more than once, and will no doubt get the others eventually. Having the common base they were trained from is helpful.

I would also suggest running some add-on rule sets. Those same spams are coming here in droves, but all have been ably tagged as spam, and by enough of a margin that even the habeas default -8 score didn't let them pass.

Does anyone have any suggestions on how to get bayes to start learning these as spam now that I have habeas turned off?

Well, feed them to bayes manually to start. However, without the aforementioned add-on rules, they may STILL score low enough to hit auto-training as ham (below 0.1 by default).
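
To make the threshold point concrete, here is a rough Python sketch of the arithmetic. It is a simplification, not SpamAssassin's actual code: the real auto-learn decision applies extra conditions and leaves the bayes rules out of the total. The rule names and scores are hypothetical, and the 12.0 auto-learn-as-spam threshold is my assumption about the default.

```python
# Simplified sketch of the auto-learn decision; NOT SpamAssassin's real logic.
HAM_THRESHOLD = 0.1    # auto-learn-as-ham threshold mentioned above
SPAM_THRESHOLD = 12.0  # assumed auto-learn-as-spam threshold


def autolearn_verdict(rule_scores):
    """Return 'ham', 'spam', or 'none' for a dict of rule name -> score."""
    total = sum(rule_scores.values())
    if total < HAM_THRESHOLD:
        return "ham"
    if total >= SPAM_THRESHOLD:
        return "spam"
    return "none"


# Hypothetical rule names and scores, for illustration only.
with_habeas = {"HABEAS": -8.0, "STOCK_RULE": 2.5}    # total -5.5
habeas_zeroed = {"HABEAS": 0.0}                      # total 0.0
with_addons = {"HABEAS": 0.0, "STOCK_RULE": 2.5,
               "ADDON_RULE_A": 6.0, "ADDON_RULE_B": 4.0}  # total 12.5

for name, hits in [("with habeas", with_habeas),
                   ("habeas zeroed", habeas_zeroed),
                   ("with add-ons", with_addons)]:
    print(name, "->", autolearn_verdict(hits))
```

In this toy version, zeroing the habeas score alone is not enough: a spam that hits nothing else still totals 0.0 and gets learned as ham. Only when other rules push the total up does it stop being auto-learned as ham.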

- Bob
