hi tony

changing over to teft gives me better results now

all the spam like the sample i discussed below
(http://24x7server.net/spam.html, concerning USNMA) is getting caught correctly

thanks for your help

rajesh



---------- Original Message ----------------------------------
From: Tony Earnshaw <[EMAIL PROTECTED]>
Date:  Fri, 03 Aug 2007 09:35:54 +0200

>Raj skrev, on 03-08-2007 06:18:
>
>> i had a question concerning dspam training ...
>> 
>> i used shared group -- one single user "common" for the entire server with 
>> toe mode
>
>Same here with the group on 2 (entirely differently configured Postfix 
>MTA) sites, but on both I use a shared group and teft. One of the sites 
>is my own PC with Postfix/Fetchmail and few Postfix-configurable 
>anti-spam features possible, one is a production site for 1500+ users on 
>which Postfix/policyd is refusing a massive (and increasing every day) 
>amount of stuff before it ever gets to dspam.
>
>> i train dspam using aliases -- ie just forward to spam / not-spam aliases
>
>I train dspam by the user dragging incorrectly judged messages (spam or 
>non-spam) to a "misjudged" folder and running a cron job on it every 
>hour. Same at both sites.
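
A minimal sketch of what such an hourly retraining job could look like, not the actual script used at either site. It assumes each message in the "misjudged" folder is a separate file that still carries the X-DSPAM-Result header dspam added at delivery, that the shared user is "common" as in the original question, and that retraining is done with dspam's --class and --source=error options. The function names and the "done" folder are hypothetical.

```python
import email
import subprocess
from pathlib import Path

SHARED_USER = "common"  # assumption: the shared-group user from the question


def corrected_class(raw: bytes):
    """Return the class dspam should have assigned, or None if unknown.

    Everything in the folder was misjudged, so the correct class is the
    opposite of the verdict recorded in X-DSPAM-Result.
    """
    msg = email.message_from_bytes(raw)
    verdict = (msg.get("X-DSPAM-Result") or "").strip().lower()
    if verdict == "spam":
        return "innocent"
    if verdict in ("innocent", "whitelisted"):
        return "spam"
    return None  # no usable verdict header: leave the message alone


def retrain_command(raw: bytes, user: str = SHARED_USER):
    """Build the dspam command line for one misjudged message."""
    cls = corrected_class(raw)
    if cls is None:
        return None
    return ["dspam", "--user", user, f"--class={cls}", "--source=error"]


def retrain_folder(folder: Path, done: Path) -> int:
    """Retrain every message file in `folder`, then move it to `done`."""
    count = 0
    for path in sorted(folder.iterdir()):
        if not path.is_file():
            continue
        raw = path.read_bytes()
        cmd = retrain_command(raw)
        if cmd is None:
            continue
        # The message itself is fed to dspam on standard input.
        subprocess.run(cmd, input=raw, check=True)
        path.rename(done / path.name)  # avoid retraining it again next hour
        count += 1
    return count
```

Run from cron, this would amount to calling retrain_folder() once an hour with the two folder paths for the local mail store.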
>
>> i have not done any corpus training till today
>
>The school site has had a massive corpus training, the home site didn't 
>at first, but after a while the results were so unsatisfactory, that I 
>fed it as much spam and non-spam as I could, with dspam_train. The stats 
>don't show spam trained this way as corpusfed, though.
>
>> i have never purged the dspam database
>
>Purge both sites every week with 'dspam_clean -p', 'cos I don't trust 
>purge-4.1.sql.
>
>> i have noticed a few emails (html text) of absolutely the same type come 
>> into my mailbox undetected as spam. This is a rare incident but happens. ie 
>> once in around 2-3 days.
>> 
>> Major part of the entire body content of the spam email ie html code behind 
>> the scene is exactly the same. All that varies is the hyperlink at the 
>> bottom which points to different websites every time.
>> 
>> you can see them here
>> http://24x7server.net/spam.html
>
>Unfortunately, the code renders in my Firefox 2.0.0.6 and all I see is 
>the spam message :)
>
>However, I have 2 of these from 22-05 and 26-05 in my own site's spam 
>folder and can look at them there. My policy is to move every spam that 
>reaches my inbox, and that I therefore have to retrain, into the spam 
>folder after training. Everything that dspam judges correctly (80-90 per 
>day) I chuck. That I only have two of these in my spam folder suggests 
>that dspam has learned very quickly.
>
>> i want to know your experience in this matter ...any tips would be helpful
>
>Change toe to teft. Turn on debugging and go through the debug output 
>for stuff that you're interested in and see on which premises spam is 
>being detected. If you don't immediately know what some of the criteria 
>mean, post here. Make sure logrotate is switched on for your debug 
>stuff, with compress on. Purging old stuff does no harm, doesn't affect 
>dspam's accuracy negatively. I don't think that my spams can help you, 
>since, even though using a shared group, the recipient's name is used by 
>dspam to judge, but if you want them, I can offer a tarball on my ftp site.
>
>> my dspam stats
>> common:
>> TP True Positives: 40383
>> TN True Negatives: 81087
>> FP False Positives: 41
>> FN False Negatives: 813
>> SC Spam Corpusfed: 759
>> NC Nonspam Corpusfed: 0
>> TL Training Left: 0
>> SHR Spam Hit Rate: 98.03%
>> HSR Ham Strike Rate: 0.05%
>> OCA Overall Accuracy: 99.30%
>
>That's better than my home site, but not good enough:
>
>                 TP True Positives:           3465
>                 TN True Negatives:          21215
>                 FP False Positives:             4
>                 FN False Negatives:           323
>                 SC Spam Corpusfed:             74
>                 NC Nonspam Corpusfed:           7
>                 TL Training Left:               0
>                 SHR Spam Hit Rate:         91.47%
>                 HSR Ham Strike Rate:        0.02%
>                 OCA Overall Accuracy:      98.69%
>
>The school's site is:
>
>                 TP True Positives:          20963
>                 TN True Negatives:         111208
>                 FP False Positives:           508
>                 FN False Negatives:           408
>                 SC Spam Corpusfed:           3486
>                 NC Nonspam Corpusfed:        3002
>                 TL Training Left:               0
>                 SHR Spam Hit Rate:         98.09%
>                 HSR Ham Strike Rate:        0.45%
>                 OCA Overall Accuracy:      99.31%
>
>I'm content with that.
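
For comparing the three sets of stats above, the percentages can be recomputed from the raw counts. The formulas below are inferred from the numbers in this thread (they reproduce all three reports to two decimals), not taken from the dspam source, and the function name is made up for illustration.

```python
def dspam_rates(tp: int, tn: int, fp: int, fn: int):
    """Return (SHR, HSR, OCA) as percentages rounded to two decimals.

    Assumed definitions, inferred from the figures quoted in the thread:
      SHR = TP / (TP + FN)           share of spam that was caught
      HSR = FP / (FP + TN)           share of ham wrongly flagged as spam
      OCA = (TP + TN) / all messages
    """
    shr = 100.0 * tp / (tp + fn)
    hsr = 100.0 * fp / (fp + tn)
    oca = 100.0 * (tp + tn) / (tp + tn + fp + fn)
    return round(shr, 2), round(hsr, 2), round(oca, 2)


# Counts copied from the three reports in this thread.
sites = {
    "common (Raj)": (40383, 81087, 41, 813),
    "home site": (3465, 21215, 4, 323),
    "school site": (20963, 111208, 508, 408),
}

for name, counts in sites.items():
    shr, hsr, oca = dspam_rates(*counts)
    print(f"{name}: SHR {shr:.2f}%  HSR {hsr:.2f}%  OCA {oca:.2f}%")
```

Seen this way, the home site's lower hit rate comes from its 323 false negatives against only 3465 true positives, while the school site pays for a similar hit rate to the "common" setup with a higher ham strike rate.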
>
>Best,
>
>--Tonni
>
>-- 
>Tony Earnshaw
>Email: tonni at hetnet dot nl
>
