Hi Tony, changing over to teft gives me better results now.
All the spam related to the one I discussed below (http://24x7server.net/spam.html), concerning USNMA, is now getting caught correctly.

Thanks for your help,
Rajesh

---------- Original Message ----------------------------------
From: Tony Earnshaw <[EMAIL PROTECTED]>
Date: Fri, 03 Aug 2007 09:35:54 +0200

>Raj wrote, on 03-08-2007 06:18:
>
>> i had a question concerning dspam training ...
>>
>> i used shared group -- one single user "common" for the entire server with
>> toe mode
>
>Same here with the group on 2 (entirely differently configured Postfix
>MTA) sites, but on both I use a shared group and teft. One of the sites
>is my own PC with Postfix/Fetchmail and few Postfix-configurable
>anti-spam features possible, one is a production site for 1500+ users on
>which Postfix/policyd is refusing a massive (and increasing every day)
>amount of stuff before it ever gets to dspam.
>
>> i train dspam using aliases -- ie just forward to spam / not-spam aliases
>
>I train dspam by the user dragging incorrectly judged messages (spam or
>non-spam) to a "misjudged" folder and running a cron job on it every
>hour. Same at both sites.
>
>> i have not done any corpus training till today
>
>The school site has had a massive corpus training, the home site didn't
>at first, but after a while the results were so unsatisfactory, that I
>fed it as much spam and non-spam as I could, with dspam_train. This
>doesn't offer trained spam as corpusfed, though.
>
>> i have never purged the dspam database
>
>Purge both sites every week with 'dspam_clean -p', 'cos I don't trust
>purge-4.1.sql.
>
>> i have noticed a few emails (html text) of absolutely the same type come
>> into my mailbox undetected as spam. This is a rare incident but happens. ie
>> once in around 2-3 days.
>>
>> Major part of the entire body content of the spam email ie html code behind
>> the scene is exactly the same. All that varies is the hyperlink at the
>> bottom which points to different websites every time.
>>
>> you can see them here
>> http://24x7server.net/spam.html
>
>Unfortunately, the code renders in my Firefox 2.0.0.6 and all I see is
>the spam message :)
>
>However, I have 2 of these from 22-05 and 26-05 in my own site's spam
>folder and can look at them there. My policy is to put everything that
>is spam that gets into my inbox and I have to retrain, into the spam
>folder after training. Everything - 80-90 per day - that dspam judges
>correctly I chuck. The fact that I only have two of these in my spam
>folder would tend to show that dspam has learned very quickly.
>
>> i want to know your experience in this matter ...any tips would be helpful
>
>Change toe to teft. Turn on debugging and go through the debug output
>for stuff that you're interested in and see on which premises spam is
>being detected. If you don't immediately know what some of the criteria
>mean, post here. Make sure logrotate is switched on for your debug
>stuff, with compress on. Purging old stuff does no harm, doesn't affect
>dspam's accuracy negatively. I don't think that my spams can help you,
>since, even though using a shared group, the recipient's name is used by
>dspam to judge, but if you want them, I can offer a tarball on my ftp site.
>
>> my dspam stats
>> common:
>>   TP True Positives:     40383
>>   TN True Negatives:     81087
>>   FP False Positives:    41
>>   FN False Negatives:    813
>>   SC Spam Corpusfed:     759
>>   NC Nonspam Corpusfed:  0
>>   TL Training Left:      0
>>   SHR Spam Hit Rate:     98.03%
>>   HSR Ham Strike Rate:   0.05%
>>   OCA Overall Accuracy:  99.30%
>
>That's better than my home site, but not good enough:
>
>   TP True Positives:     3465
>   TN True Negatives:     21215
>   FP False Positives:    4
>   FN False Negatives:    323
>   SC Spam Corpusfed:     74
>   NC Nonspam Corpusfed:  7
>   TL Training Left:      0
>   SHR Spam Hit Rate      91.47%
>   HSR Ham Strike Rate:   0.02%
>   OCA Overall Accuracy:  98.69%
>
>The school's site is:
>
>   TP True Positives:     20963
>   TN True Negatives:     111208
>   FP False Positives:    508
>   FN False Negatives:    408
>   SC Spam Corpusfed:     3486
>   NC Nonspam Corpusfed:  3002
>   TL Training Left:      0
>   SHR Spam Hit Rate      98.09%
>   HSR Ham Strike Rate:   0.45%
>   OCA Overall Accuracy:  99.31%
>
>I'm content with that.
>
>Best,
>
>--Tonni
>
>--
>Tony Earnshaw
>Email: tonni at hetnet dot nl
>
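For anyone comparing their own numbers with the three sets of stats quoted above: the percentage lines appear to follow directly from the four raw counters. Here is a minimal Python sketch, assuming SHR = TP / (TP + FN), HSR = FP / (TN + FP) and OCA = (TP + TN) / (TP + TN + FP + FN). These definitions are inferred from the posted figures (they reproduce all nine percentages), not taken from the dspam source, and the function name `dspam_rates` is made up for illustration.

```python
def dspam_rates(tp, tn, fp, fn):
    """Return (SHR, HSR, OCA) as percentages rounded to two decimals.

    Assumed definitions, inferred from the figures posted in this thread:
      SHR = TP / (TP + FN)         share of spam that was caught
      HSR = FP / (TN + FP)         share of ham wrongly flagged as spam
      OCA = (TP + TN) / all mail   share of all verdicts that were correct
    """
    shr = 100.0 * tp / (tp + fn)
    hsr = 100.0 * fp / (tn + fp)
    oca = 100.0 * (tp + tn) / (tp + tn + fp + fn)
    return round(shr, 2), round(hsr, 2), round(oca, 2)


# Raw counters copied from the thread: (TP, TN, FP, FN).
sites = {
    "common (Raj)": (40383, 81087, 41, 813),
    "home (Tony)": (3465, 21215, 4, 323),
    "school (Tony)": (20963, 111208, 508, 408),
}

for name, counts in sites.items():
    shr, hsr, oca = dspam_rates(*counts)
    print(f"{name}: SHR {shr:.2f}%  HSR {hsr:.2f}%  OCA {oca:.2f}%")
```

Under these assumptions the home site shows why OCA alone can flatter a setup: ham outnumbers spam there by roughly six to one, so overall accuracy stays at 98.69% even though only 91.47% of the spam is caught.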
