Can someone help me with this?

I continue to be unable to get dspam to correctly classify some messages.

I'm working with one message as an example. I have repeatedly fed it to dspam as spam (--class=spam --source=corpus), but dspam continues to classify it as innocent.

My dspam_stats currently shows

TP: 7595  TN: 6121  FP: 1  FN: 133  SC: 64  NC: 0
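
For reference, those counters can be turned into rates. This is only a minimal Python sketch, assuming the usual dspam_stats meaning of the labels (TP/TN = correctly classified spam/innocent, FP/FN = false positives/false negatives, SC/NC = corpus-fed spam/innocent). The function names are made up for illustration and are not part of dspam.

```python
import re

def parse_dspam_stats(line):
    """Turn 'TP: 7595  TN: 6121 ...' into {'TP': 7595, 'TN': 6121, ...}."""
    return {key: int(value) for key, value in re.findall(r"([A-Z]{2}):\s*(\d+)", line)}

def error_rates(stats):
    """Compute simple rates from the counters (assumed label meanings, see above)."""
    spam_total = stats["TP"] + stats["FN"]      # all messages that really were spam
    innocent_total = stats["TN"] + stats["FP"]  # all messages that really were innocent
    total = spam_total + innocent_total
    return {
        "missed_spam_rate": stats["FN"] / spam_total if spam_total else 0.0,
        "false_positive_rate": stats["FP"] / innocent_total if innocent_total else 0.0,
        "accuracy": (stats["TP"] + stats["TN"]) / total if total else 0.0,
    }

stats = parse_dspam_stats("TP: 7595  TN: 6121  FP: 1  FN: 133  SC: 64  NC: 0")
rates = error_rates(stats)
for name, value in rates.items():
    print(f"{name}: {value:.2%}")
```

With the numbers above, about 1.7% of spam is being missed (133 of 7728), so the stubborn message is one of a fairly small group.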

When I run the message (one of those viagra messages) through dspam, I pipe it into the command:

dspam  --user <user> --stdout --deliver=innocent

and I always get

X-DSPAM-Result: Innocent
X-DSPAM-Confidence: 0.8749
X-DSPAM-Probability: 0.0000
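
To compare the verdict before and after each retraining run, the three headers can be pulled out of the --stdout output. A rough Python sketch using only the standard library; the helper name dspam_verdict and the sample text are illustrative assumptions, not anything dspam provides.

```python
from email.parser import HeaderParser

def dspam_verdict(raw_message):
    """Extract the X-DSPAM-* verdict headers from a message given as text."""
    headers = HeaderParser().parsestr(raw_message)
    return {
        "result": headers.get("X-DSPAM-Result"),
        # Missing headers fall back to 0.0 so the sketch does not crash.
        "confidence": float(headers.get("X-DSPAM-Confidence", "0")),
        "probability": float(headers.get("X-DSPAM-Probability", "0")),
    }

# Sample input built from the headers quoted in this message.
sample = (
    "X-DSPAM-Result: Innocent\n"
    "X-DSPAM-Confidence: 0.8749\n"
    "X-DSPAM-Probability: 0.0000\n"
    "\n"
    "message body\n"
)
verdict = dspam_verdict(sample)
print(verdict)
```

Saving the verdict after each training pass would show whether the probability moves at all, or stays at 0.0000 as it does here.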


----- Original Message -----
From: "Ricardo Kleemann" <[EMAIL PROTECTED]>
To: "David Rees" <[EMAIL PROTECTED]>; <[email protected]>
Sent: Wednesday, February 07, 2007 12:54 PM
Subject: Re: [dspam-users] false negatives


But I've trained with over 5000 messages, both negative and positive...

I've just run dspam with --class=spam --source=corpus, and afterwards I still get the message as innocent.

----- Original Message -----
From: "David Rees" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Wednesday, February 07, 2007 12:49 PM
Subject: Re: [dspam-users] false negatives


On 2/7/07, Arnaldo Mandel <[EMAIL PROTECTED]> wrote:
Ricardo Kleemann wrote (on Feb 7, 2007):
> I've installed and configured dspam, and trained it with my own set of
> messages.
>
> But I still get false negatives when testing dspam with a spam message that
> was fed as spam into dspam_train. I've also called dspam with --class=spam,
> feeding it the message, but when I test the message (using dspam --stdout)
> the message still comes out as innocent.

Have you used the parameter --source=corpus for dspam?
After some feeding it will certainly classify as spam, but in the
process you may bias your token database in an unwanted way.

Yep, and also keep in mind the number of messages you have trained
dspam with. Generally a couple thousand are required before dspam
really starts getting good.

-Dave



