On Mon, Aug 20, 2012 at 11:24:04AM -0700, David Rees wrote:
> I've run into a situation where I get emails that are so stubborn that
> even after repeated training they still classify as spam when they
> should be innocent.
>
> I have always retrained these messages when I've found them in my spam folder.
>
> For a long time I used teft, but recently I've switched to toe to
> avoid inadvertently reinforcing misclassifications where I might miss
> them.
>
> What I've been doing after checking these emails is testing them on
> the command line:
>
> # dspam --classify --stdout < {path-to-mail-file}
>
> Then to train them as innocent additional times:
>
> # dspam --source=corpus --class=innocent < {path-to-mail-file}
>
> And repeat until the mail is properly classified.
>
> This has worked for some stubborn emails - after say 5-10 retrainings,
> but now I've come across a couple which are stuck even after
> re-feeding them dozens of times.
>
> Is there anything I can do to fix this or get additional information
> to debug the issue?
>
> -Dave
>
Hi Dave,

You have just been whammied by the "TEFT-is-a-bad-bad-idea" problem. I
suspect that if you look at the tokens involved, you will find very large
counts, because TEFT stores and updates tokens on every message. This means
that you would need many, many retrainings to make enough of a difference
to change the rating. You may want to try "inoculation"
(--source=inoculation), as that is a bit more aggressive.

Regards,
Ken
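
To illustrate why large token counts make retraining so slow, here is a rough sketch. It assumes a deliberately simplified model in which a token's spam probability is spam_hits / (spam_hits + innocent_hits) and each corpus retraining as innocent adds exactly one innocent hit. That is not dspam's actual algorithm, and both the function name retrains_needed and the hit counts are made up for the example.

```shell
#!/bin/sh
# Hypothetical helper (not part of dspam): count how many innocent
# retrainings it takes before one token stops leaning toward spam,
# under the simplified model
#   p(spam) = spam_hits / (spam_hits + innocent_hits)
# Assumption: each retraining as innocent adds 1 to innocent_hits.
retrains_needed() {
    spam_hits=$1
    innocent_hits=$2
    n=0
    # Keep retraining while the token is still at or above 50% spam.
    while [ "$spam_hits" -ge $((innocent_hits + n)) ]; do
        n=$((n + 1))
    done
    echo "$n"
}

retrains_needed 12 4       # lightly trained token: prints 9
retrains_needed 5000 300   # counts inflated by long TEFT use: prints 4701
```

With small counts a handful of retrainings is enough to tip a token, which fits the 5-10 retrainings that worked before. Once the counts run into the thousands, dozens of extra retrainings barely move the ratio.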