On Wed, Apr 10, 2013 at 12:49 PM, David Rees <dree...@gmail.com> wrote:
> So it appears that the training is completely ineffective as I understand
> that once an email has been marked as spam, it should no longer consider the
> email for the whitelist.
>
> If you look at the dspam log or look at the dspam webui, the history page
> seems to indicate that the email is indeed being retrained successfully.
>
> Now, if I take one of these emails, remove the dspam headers and train them
> as an inoculation source, after retraining around 5 times the email will be
> successfully marked as spam.
>
> How can I debug this issue further?

Here is something interesting. In a sample email that was marked as
Innocent, I took a token from the headers and ran it through
dspam_dump:

$ dspam_dump <user> Subject*nodeposit+package
4592615711378120074  S: 00000  I: 00050  P: 0.0100

Then I retrained the message as an error.
$ dspam --class=spam --source=error < <plain-text-spam-file>

Then I checked dspam_dump again:
$ dspam_dump <user> Subject*nodeposit+package
4592615711378120074  S: 00000  I: 00050  P: 0.0100

Still no SPAM hits! Looking up the token directly in the database, its
spam_hits count is also still zero, which matches the dspam_dump output.

Yet the webui shows the email as having been retrained.

Then I removed the DSPAM headers from the message and inoculated the
message once:

$ dspam --class=spam --source=inoculation < <plain-text-spam-file>

And checked dspam_dump again:
$ dspam_dump customer-support Subject*nodeposit+package
4592615711378120074  S: 00002  I: 00050  P: 0.0434

Look at that, the spam probability has gone up!
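
For anyone who wants to repeat this comparison across many tokens, here is a
minimal sketch of a checker. It assumes that dspam_dump always prints lines
in the shape shown above (token id, then S:, I: and P: fields). The names
parse_dump_line and training_took_effect are hypothetical helpers for
illustration, not part of DSPAM.

```python
import re

# Line shape assumed from the dspam_dump output quoted in this message, e.g.
# "4592615711378120074  S: 00000  I: 00050  P: 0.0100"
DUMP_LINE = re.compile(r"^(\d+)\s+S:\s*(\d+)\s+I:\s*(\d+)\s+P:\s*([\d.]+)$")


def parse_dump_line(line):
    """Split one dspam_dump line into token id, hit counts and probability."""
    match = DUMP_LINE.match(line.strip())
    if match is None:
        raise ValueError("unrecognized dspam_dump line: %r" % line)
    token, spam, innocent, prob = match.groups()
    return {
        "token": int(token),
        "spam_hits": int(spam),
        "innocent_hits": int(innocent),
        "probability": float(prob),
    }


def training_took_effect(before, after):
    """Return True if the spam hit count for the same token increased."""
    b = parse_dump_line(before)
    a = parse_dump_line(after)
    if b["token"] != a["token"]:
        raise ValueError("lines refer to different tokens")
    return a["spam_hits"] > b["spam_hits"]


if __name__ == "__main__":
    baseline = "4592615711378120074  S: 00000  I: 00050  P: 0.0100"
    after_error = "4592615711378120074  S: 00000  I: 00050  P: 0.0100"
    after_inoculation = "4592615711378120074  S: 00002  I: 00050  P: 0.0434"
    print("error training took effect:", training_took_effect(baseline, after_error))
    print("inoculation took effect:", training_took_effect(baseline, after_inoculation))
```

Fed the three readings from this message, the check reports no change after
the error retrain and an increase after the inoculation.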

So at this point, it's very clear that error-training is not working
at all. Any hints before I start digging into the code?

-Dave
