Re: [dspam-users] DSPAM has become horribly inaccurate, purge training data?

LedHed Mon, 28 Apr 2008 15:44:20 -0700

Campbell Krueger wrote:

Hey everyone,
Well, my DSPAM implementation has dropped to an accuracy of approximately 4% over the past few weeks, and training seems to have absolutely no impact on this. I'm pretty sure this is my own fault, as for a while I figured that running messages through the dspam_train tool repeatedly until they were positively identified was the best way to go (but later started to think about it and realized it probably polluted my training data). So, moving forward, I have the following questions:
1) When initially training, is TEFT the best way to go?

Yes. The first 2500 messages are always trained in TEFT mode regardless.

2) Should I initially train using an extremely large collection of SPAM I already have, as well as all my legit mail?

Yes (in equal proportions)

3) At what point should I switch over to TOE from TEFT?

After the initial training period (once more than 2500 messages have been trained).

4) What's the best overall procedure to go about training DSPAM?

Doesn't matter, but do it correctly. (Don't re-feed your corpus over and over again.)
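
As a rough illustration of the two points above (equal proportions, and each message fed only once), here is a minimal Python sketch. The function name build_training_plan and the "spam"/"ham" labels are made up for this example; it only prepares the list of messages and does not call DSPAM itself.

```python
import hashlib
import random


def build_training_plan(spam, ham, seed=0):
    """Return (label, message) pairs with equal spam/ham counts and no repeats."""

    def unique(messages):
        # Drop exact duplicates so no message is trained more than once.
        seen = set()
        kept = []
        for msg in messages:
            digest = hashlib.sha256(msg.encode("utf-8")).hexdigest()
            if digest not in seen:
                seen.add(digest)
                kept.append(msg)
        return kept

    spam, ham = unique(spam), unique(ham)
    # Use the same number of messages from each class.
    n = min(len(spam), len(ham))
    rng = random.Random(seed)
    spam_sample = rng.sample(spam, n)
    ham_sample = rng.sample(ham, n)

    # Interleave the classes so neither one dominates early training.
    plan = []
    for s, h in zip(spam_sample, ham_sample):
        plan.append(("spam", s))
        plan.append(("ham", h))
    return plan
```

Each entry in the returned plan would then be handed to whatever training command you use, exactly once.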

And most importantly...
5) How the heck do I purge all the training data already in place for my account?

This depends on your Storage Driver, and how many accounts you have.

If you only have a few users, I would just truncate the token, signature, and stats tables. This assumes you are using an SQL-based storage driver.
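
A minimal sketch of that purge in Python, under some loud assumptions: the table names (dspam_token_data, dspam_signature_data, dspam_stats) and the uid column are modeled on the stock MySQL schema and should be checked against your own database, and the helper purge_training_data is hypothetical. It uses DELETE rather than TRUNCATE so that it can also clear a single user, and it is written against Python's built-in sqlite3 module purely so it can be tried without a live server.

```python
import sqlite3

# Assumed table names, modeled on the MySQL storage driver's schema.
# Verify these against your own database before running anything.
TRAINING_TABLES = ("dspam_token_data", "dspam_signature_data", "dspam_stats")


def purge_training_data(conn, uid=None):
    """Delete training rows for one uid, or for every user when uid is None."""
    cur = conn.cursor()
    for table in TRAINING_TABLES:
        if uid is None:
            # Whole-table wipe: the DELETE equivalent of truncating the table.
            cur.execute(f"DELETE FROM {table}")
        else:
            # Per-user wipe; assumes each table has a uid column.
            cur.execute(f"DELETE FROM {table} WHERE uid = ?", (uid,))
    conn.commit()
```

With a different database module the parameter placeholder may differ (for example %s instead of ?). Take a backup first, since the deletion cannot be undone.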


I'd sincerely appreciate any information you can give.  Thanks!

Regards,
Campbell Krueger


-Jeff Harris
