On Mon, Aug 20, 2012 at 11:24:04AM -0700, David Rees wrote:
> I've run into a situation where I get emails that are so stubborn that
> even after repeated training they still classify as spam when they
> should be innocent.
> 
> I have always retrained these messages when I've found them in my spam folder.
> 
> For a long time I used teft, but recently I've switched to toe to
> avoid inadvertently reinforcing misclassifications where I might miss
> them.
> 
> What I've been doing after checking these emails is testing them on
> the command line:
> 
> # dspam --classify --stdout < {path-to-mail-file}
> 
> Then to train them as innocent additional times:
> 
> # dspam --source=corpus --class=innocent < {path-to-mail-file}
> 
> And repeat until the mail is properly classified.
> 
> This has worked for some stubborn emails - after say 5-10 retrainings,
> but now I've come across a couple which are stuck even after
> re-feeding them dozens of times.
> 
> Is there anything I can do to fix this or get additional information
> to debug the issue?
> 
> -Dave
> 

Hi Dave,

You have just been whammied by the "TEFT-is-a-bad-bad-idea" problem. I suspect
that if you look at the tokens involved, you will find very large counts,
because TEFT constantly stores and updates tokens. This means that you would
need many, many retrainings to make enough of a difference to change the
message's rating. You may want to try "inoculation" as the training source, as
that is a bit more aggressive.
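
To make the point about large counts concrete, here is a rough sketch in
Python. It assumes a deliberately simplified per-token score, spam hits divided
by total hits, which is not DSPAM's actual formula. It also assumes that each
innocent retraining adds exactly one innocent hit to the token. The function
names and the example counts are hypothetical, chosen only for illustration.

```python
def token_spam_probability(spam_hits: int, innocent_hits: int) -> float:
    """Simplified per-token score: fraction of hits that were spam.

    This is an illustrative assumption, not DSPAM's real calculation.
    """
    total = spam_hits + innocent_hits
    # A token that has never been seen is treated as neutral.
    return 0.5 if total == 0 else spam_hits / total


def retrainings_needed(spam_hits: int, innocent_hits: int,
                       threshold: float = 0.5) -> int:
    """Count innocent retrainings until the score drops below threshold.

    Assumes each retraining adds one innocent hit and no spam hits.
    """
    extra = 0
    while token_spam_probability(spam_hits, innocent_hits + extra) >= threshold:
        extra += 1
    return extra


# A token with small counts flips after a handful of retrainings.
print(retrainings_needed(10, 2))       # 9

# A token with inflated counts needs thousands of retrainings.
print(retrainings_needed(5000, 100))   # 4901
```

Under these assumptions, the number of retrainings grows with the size of the
stored counts, which would explain why a few messages stay stuck even after
dozens of attempts.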

Regards,
Ken

_______________________________________________
Dspam-user mailing list
Dspam-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspam-user
