On Fri, Jul 24, 2015 at 10:17:18AM -0700, waterdog wrote:
> Okay, I apologize for all the following questions but, the more I
> troubleshoot dspam without progress, the more questions I have.
>
> Are there recommendations/documentation on how to properly train? It seems
> that some users do corpus training and other users just train based on
> actual messages.
>
> What are the pros/cons of using a corpus vs. actual messages?
>
> Does it help to retrain multiple times using the same corpus and/or
> messages?
>
> What are the specific stats that one should look to achieve to determine if
> dspam has had enough training?
>
> Does TL need to be at zero before dspam will work at all?
>
> Do you have to train separately for each user or can all users share the
> same training?
>
> I've tried training and retraining multiple times using corpuses and actual
> messages but don't seem to be making any real progress. Here are my current
> stats after training with a corpus:
>
> sudo dspam_train <username> spam_2 easy_ham_2
>
> sudo dspam_stats -H <username>
>
> TP True Positives: 0
> TN True Negatives: 1315
> FP False Positives: 2443
> FN False Negatives: 2154
> SC Spam Corpusfed: 0
> NC Nonspam Corpusfed: 0
> TL Training Left: 0
> SHR Spam Hit Rate 0.00%
> HSR Ham Strike Rate: 65.01%
> PPV Positive predictive value: 0.00%
> OCA Overall Accuracy: 22.24%
>
> As you can see, the OCA is still low but better than it was before.
>
> It might help if someone could post working configurations for postfix,
> dspam, dovecot, and clamAV for comparison. I've tried to follow the online
> documentation but apparently I'm missing something.
>

Hi,
I am not sure what your training corpus looks like, but those results are
pretty bad. Training a global/merged group can reduce the accuracy hit at
the beginning, but in general a train-on-error (TOE) setup with no initial
training would probably work better. Training with some known-good content
helps if your ham/spam ratio is very small. Accuracy is best with an even
mix of spam and ham to start; TOE will then keep it balanced.

Regards,
Ken
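
P.S. For anyone wanting to sanity-check the percentages in the quoted dspam_stats output, here is a minimal Python sketch that recomputes them from the four raw counters. The formulas are an assumption inferred from the posted numbers (they reproduce all four percentages), not copied from the dspam source, and the function name dspam_rates is made up for illustration.

```python
def dspam_rates(tp, tn, fp, fn):
    """Recompute the percentage lines of `dspam_stats -H` from raw counters.

    Assumed formulas (inferred from the posted output, not from dspam itself):
      SHR = TP / (TP + FN)    share of spam that was caught
      HSR = FP / (TN + FP)    share of ham that was flagged as spam
      PPV = TP / (TP + FP)    share of spam verdicts that were right
      OCA = (TP + TN) / all   share of all verdicts that were right
    """
    def pct(num, den):
        # Report 0.0 instead of dividing by zero when a class is empty.
        return round(100.0 * num / den, 2) if den else 0.0

    return {
        "SHR": pct(tp, tp + fn),
        "HSR": pct(fp, tn + fp),
        "PPV": pct(tp, tp + fp),
        "OCA": pct(tp + tn, tp + tn + fp + fn),
    }


# Counters taken from the quoted message.
rates = dspam_rates(tp=0, tn=1315, fp=2443, fn=2154)
for name, value in rates.items():
    print(f"{name}: {value:.2f}%")
```

Read this way, the counters say that none of the 2154 spam messages were caught and that roughly two thirds of the ham was flagged, which is why the overall accuracy sits near 22%.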