Hi,

On Jun 13, 2015, at 5:37 PM, RW wrote:

> On Thu, 11 Jun 2015 19:06:50 -0400
> Phil Stracchino wrote:
>
>> On 06/11/15 16:07, Al Zick wrote:
>
>>> At first, it looked
>>> like dspam was working, but then I realized that I could not
>>> retrain. After doing more research I found where people were saying
>>> that you can not retrain with hash and toe, so I switched to tum.
>
> I don't know for sure, but it sounded like that was just a
> bootstrapping problem that could be solved by creating the database
> with training from corpus.

I have been creating the database with:

cat 782081.emlx | dspam --mode=tum --process --deliver=stdout --user antispam --client

This last time I created the database with:

cat dspam_training/spam/new/781097.emlx | dspam --user antispam --class=spam --source=corpus --deliver=summary

How should the database be created?

> It's not a bad idea to do that anyway because TOE doesn't turn-on
> until there are 2500 non-spams in the database.

With TOE it filters very little spam. With TUM it gets almost everything; however, it usually crashes after 12 hours.

>> Why are you using hash storage? Even if you don't have a real DB
>> engine installed that you can use, is sqlite not an option for some
>> reason?
>
> I frequently rebuild from corpus. Training ~9000 emails takes
> ~ 3 minutes with the hash driver and ~ 7 hours with sqlite3.

That is an incredible difference.

> I can't say I've ever noticed any corruption problem, and because
> of the corpora I don't regard the css files as precious, or care
> about token expiry. Is there any particular reason why I shouldn't
> use the hash driver?

I moved dspam from procmail and put it into my postfix master.cf. I am hoping this will fix the css corruption problem.

Kind Regards,
Al
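
P.S. To make the "training from corpus" question concrete, here is a minimal sketch of how the second command above could be looped over a whole directory of messages. It only reuses the flags already shown in that command, with --class=innocent for the non-spam side. The train_dir helper and the DSPAM_BIN and DSPAM_USER variables are names I made up for illustration, and I have not confirmed that this is the recommended way to build the database.

```shell
#!/bin/sh
# Sketch only: train dspam from a corpus, one message per file.
# Assumptions (hypothetical, for illustration): the train_dir helper and
# the DSPAM_BIN / DSPAM_USER variables are not part of dspam itself.

DSPAM_BIN="${DSPAM_BIN:-dspam}"        # dspam binary to invoke
DSPAM_USER="${DSPAM_USER:-antispam}"   # dspam user from the commands above

# train_dir CLASS DIR
# Feed every regular file in DIR to dspam as CLASS ("spam" or "innocent"),
# using the same corpus-training flags as the single-message command.
train_dir() {
    class="$1"
    dir="$2"
    count=0
    for msg in "$dir"/*; do
        [ -f "$msg" ] || continue      # skip subdirectories and empty globs
        "$DSPAM_BIN" --user "$DSPAM_USER" --class="$class" \
            --source=corpus --deliver=summary < "$msg" || return 1
        count=$((count + 1))
    done
    echo "trained $count $class messages from $dir"
}
```

With that loaded, something like `train_dir spam dspam_training/spam/new` would cover the spam side. A matching call with `innocent` and a non-spam directory (I don't have a fixed path for that yet) would be needed too, since TOE reportedly waits for 2500 non-spams.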