Hi,

Here are my stats after retraining hundreds of messages, both spam and ham:
{227} dspam_stats -H
antispam:
        TP True Positives:               4818
        TN True Negatives:              22115
        FP False Positives:                 4
        FN False Negatives:                 5
        SC Spam Corpusfed:                  0
        NC Nonspam Corpusfed:               0
        TL Training Left:                   0
        SHR Spam Hit Rate                99.90%
        HSR Ham Strike Rate:              0.02%
        PPV Positive predictive value:   99.92%
        OCA Overall Accuracy:            99.97%

Last night it caught maybe 100 emails, but I had many more than that in
my inbox.

Kind Regards,
Al

On Jul 23, 2015, at 12:13 PM, Nathanael D. Noblet wrote:

> On Wed, 2015-07-22 at 17:48 -0700, waterdog wrote:
>
>> The dspam_stats for this user don't look too good even after multiple
>> training attempts:
>>
>> TP True Positives: 0
>> TN True Negatives: 4
>> FP False Positives: 2353
>> FN False Negatives: 1947
>> SC Spam Corpusfed: 0
>> NC Nonspam Corpusfed: 0
>> TL Training Left: 143
>
> You can see from this line that it needs to receive another 143
> messages before it is out of training. It requires about 2500 messages
> before it flips a switch. I can't remember which switch, but it flips
> one.
>
> When I set myself up years ago, I found a corpus of spam, and I fed it
> my entire mailbox + the spam. Now you can see my stats years later:
>
> TP True Positives: 3354
> TN True Negatives: 239849
> FP False Positives: 1448
> FN False Negatives: 981
> SC Spam Corpusfed: 0
> NC Nonspam Corpusfed: 0
> TL Training Left: 0
> SHR Spam Hit Rate 77.37%
> HSR Ham Strike Rate: 0.60%
> PPV Positive predictive value: 69.85%
> OCA Overall Accuracy: 99.01%
>
> You don't have enough data for dspam to reliably do anything.
> Retraining one message as spam will *not* automatically get it to be
> classified as spam on the *next* classification.
>
> Watch the numbers in your stats, which tell you whether training is
> occurring. If you have a false negative (spam delivered as ham),
> retrain it as spam and you should see the FN count increment. If it
> does, dspam is working as expected.
>
> The other implied part of your question is 'Why isn't dspam effective
> yet?' That is partly due to the amount of mail you've received so
> far, the type of spam, and the dspam settings. I used to set people up
> with TEFT, as that was the recommendation and, I think, the default.
> Over the years I've seen it mentioned on this list multiple times that
> you should use TOE by default.
>
> I also use
>
> Algorithm graham burton
> Tokenizer osb
>
> because users of this list explained, back in the day, that they were
> better defaults.

------------------------------------------------------------------------------
_______________________________________________
Dspam-user mailing list
Dspam-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspam-user
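
For anyone comparing the stats in this thread: the four percentage lines (SHR, HSR, PPV, OCA) appear to follow from the TP/TN/FP/FN counters in the usual confusion-matrix way. Below is a minimal Python sketch under that assumption. The formulas are my reading of the numbers, not taken from the dspam source, and `dspam_rates` is a made-up helper name. It does reproduce both sets of percentages quoted above.

```python
def _pct(numerator, denominator):
    """Percentage, or 0.0 when there is nothing to divide by."""
    return 100.0 * numerator / denominator if denominator else 0.0


def dspam_rates(tp, tn, fp, fn):
    """Return (SHR, HSR, PPV, OCA) as percentages.

    Assumed definitions (standard confusion-matrix rates):
      SHR = TP / (TP + FN)    share of spam that was caught
      HSR = FP / (TN + FP)    share of ham wrongly flagged as spam
      PPV = TP / (TP + FP)    share of flagged mail that really was spam
      OCA = (TP + TN) / all   share of all mail classified correctly
    """
    shr = _pct(tp, tp + fn)
    hsr = _pct(fp, tn + fp)
    ppv = _pct(tp, tp + fp)
    oca = _pct(tp + tn, tp + tn + fp + fn)
    return shr, hsr, ppv, oca


# Counters copied from the two complete stat blocks in this thread.
for name, counters in [
    ("Al", (4818, 22115, 4, 5)),
    ("Nathanael", (3354, 239849, 1448, 981)),
]:
    shr, hsr, ppv, oca = dspam_rates(*counters)
    print("%-10s SHR %.2f%%  HSR %.2f%%  PPV %.2f%%  OCA %.2f%%"
          % (name, shr, hsr, ppv, oca))
# Al         SHR 99.90%  HSR 0.02%  PPV 99.92%  OCA 99.97%
# Nathanael  SHR 77.37%  HSR 0.60%  PPV 69.85%  OCA 99.01%
```

One thing this makes visible: OCA is dominated by TN when most mail is ham, so a 99% overall accuracy can sit alongside a much lower spam hit rate.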