Re: [Dspam-user] Increase Spam Hit Rate

Stevan Bajić Fri, 20 Apr 2012 09:46:08 -0700

On 20.04.2012 07:32, Steve Fatula wrote:

[...]

If you give me the SPAM corpus, I can just run dspam_train on it (and I'd even add my 80). But it will be pretty unbalanced since I have few HAM messages, since I only keep a month (maybe a few thousand messages). I am not sure that matters much? In the end, won't the detection still work, maybe biased towards SPAM at first, but, surely, it wouldn't take too long to stop false positives?

Let's say you want to make that merged global group. Then this is what you should do:

1) Create a new DSPAM user. If you can, create a flat user (no localp...@domain.tld), because a flat user name will be easier to recognize on your setup, where you usually have full-blown email addresses as user names. Let's say that newly created user is called "SpamHitRate".


2) Change preferences for that user to:
dspam_admin change preference "SpamHitRate" "dailyQuarantineSummary" "off"
dspam_admin change preference "SpamHitRate" "enableBNR" "on"
dspam_admin change preference "SpamHitRate" "enableWhitelist" "off"
dspam_admin change preference "SpamHitRate" "fallbackDomain" "off"
dspam_admin change preference "SpamHitRate" "ignoreGroups" "on"
dspam_admin change preference "SpamHitRate" "ignoreRBLLookups" "on"
dspam_admin change preference "SpamHitRate" "makeCorpus" "off"
dspam_admin change preference "SpamHitRate" "optIn" "on"
dspam_admin change preference "SpamHitRate" "optOut" "off"
dspam_admin change preference "SpamHitRate" "optOutClamAV" "on"
dspam_admin change preference "SpamHitRate" "processorBias" "off"
dspam_admin change preference "SpamHitRate" "showFactors" "off"
dspam_admin change preference "SpamHitRate" "signatureLocation" "headers"
dspam_admin change preference "SpamHitRate" "spamAction" "deliver"
dspam_admin change preference "SpamHitRate" "spamSubject" ""
dspam_admin change preference "SpamHitRate" "statisticalSedation" "0"
dspam_admin change preference "SpamHitRate" "storeFragments" "off"
dspam_admin change preference "SpamHitRate" "tagNonspam" "off"
dspam_admin change preference "SpamHitRate" "tagSpam" "off"
dspam_admin change preference "SpamHitRate" "trainingMode" "TOE"
dspam_admin change preference "SpamHitRate" "trainPristine" "off"
dspam_admin change preference "SpamHitRate" "whitelistThreshold" "9999999"

Basically you want that user to not use ClamAV, groups or any RBL, and you don't want whitelisting or any other mumbo jumbo. Usually you would not turn off that many helper mechanisms on a normal user, but this is not a normal user. You want that user to be as hard as possible. You don't care about false positives or false negatives on that user. In fact, this is exactly what you want: that user should generate as many false positives/negatives as needed, because the more FP/FN you have, the more you can make DSPAM learn. And this is mainly what you are going to do with that user: you are going to use dspam_train with spam/ham corpora.
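The preference list from step 2 can also be applied in a loop, which makes it easier to review or repeat. What follows is only a sketch: the function name apply_prefs and the DSPAM_ADMIN variable are made up for illustration, only some of the preferences are listed (the rest follow the same pattern), and by default it prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: print (or run) "dspam_admin change preference" for a list of
# preference/value pairs. apply_prefs and DSPAM_ADMIN are illustrative names,
# not part of DSPAM. The default is a dry run that only echoes each command.
DSPAM_ADMIN="${DSPAM_ADMIN:-echo dspam_admin}"

apply_prefs() {
    user="$1"
    # One "name value" pair per line, copied from the list in step 2.
    while read -r name value; do
        [ -n "$name" ] || continue
        $DSPAM_ADMIN change preference "$user" "$name" "$value"
    done <<EOF
enableBNR on
enableWhitelist off
ignoreGroups on
ignoreRBLLookups on
optOutClamAV on
trainingMode TOE
statisticalSedation 0
EOF
}
```

Calling apply_prefs SpamHitRate prints one command per preference. Once the output looks right, setting DSPAM_ADMIN=dspam_admin in the environment should make the same loop run the commands for real.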

3) Now go on and train with dspam_train:
dspam_train SpamHitRate [spam_corpus maildir or mbox] [nonspam_corpus maildir or mbox]

4) After you are finished with dspam_train, you should go on and run dspam_clean:
dspam_clean -s0 -p0 -u0,0,0,0 SpamHitRate
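Steps 3 and 4 can be chained so that dspam_clean only runs when training finished without error. This is a rough sketch, not a tested recipe: train_and_clean and RUN are hypothetical names, and the corpus paths are whatever you pass in. By default it only prints the two commands.

```shell
#!/bin/sh
# Sketch: run dspam_train, then dspam_clean, for one user.
# train_and_clean and RUN are illustrative names. RUN defaults to "echo"
# (dry run); set RUN to an empty string to actually execute the commands.
RUN="${RUN-echo}"

train_and_clean() {
    user="$1"; spam="$2"; ham="$3"
    # Refuse to start if either corpus path does not exist.
    for corpus in "$spam" "$ham"; do
        if [ ! -e "$corpus" ]; then
            echo "corpus not found: $corpus" >&2
            return 1
        fi
    done
    # dspam_clean is only reached if dspam_train exits successfully.
    $RUN dspam_train "$user" "$spam" "$ham" &&
    $RUN dspam_clean -s0 -p0 -u0,0,0,0 "$user"
}
```

For example, train_and_clean SpamHitRate followed by the spam and nonspam corpus paths prints the two command lines from steps 3 and 4 with the paths filled in.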

5) Now enable the merged global group by editing the DSPAM group file and adding:

SpamHitRate:merged:*
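If you prefer not to edit the group file by hand, something like the following could append the line only when it is missing. It is a sketch under assumptions: add_merged_group is a made-up helper name, and the location of the group file depends on your installation, so it is passed in as an argument.

```shell
#!/bin/sh
# Sketch: append the merged-group entry to the DSPAM group file once.
# add_merged_group is an illustrative name; the group file path is an
# argument because its location depends on the installation.
add_merged_group() {
    group_file="$1"
    entry='SpamHitRate:merged:*'
    # -x matches the whole line, -F treats "*" as a literal character.
    if grep -qxF "$entry" "$group_file" 2>/dev/null; then
        echo "already present"
    else
        printf '%s\n' "$entry" >> "$group_file"
        echo "added"
    fi
}
```

Running it a second time against the same file reports "already present" and leaves the file unchanged.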

6) You are using MySQL, right? Now it is time to delete all users' tokens except those of SpamHitRate. To do that, just execute this (assuming the uid of SpamHitRate is 1000):


delete from dspam_signature_data where uid!=1000;
delete from dspam_stats where uid!=1000;
delete from dspam_token_data where uid!=1000;

analyze table dspam_signature_data;
analyze table dspam_stats;
analyze table dspam_token_data;

optimize table dspam_signature_data;
optimize table dspam_stats;
optimize table dspam_token_data;

After you have done that, all old tokens, signatures and statistics for every user other than SpamHitRate should be removed. This will lead to problems if users try to retrain messages that they received in the last few days (since the signature data is purged). I don't think this will be a big issue on your setup, since your users are using the dovecot anti-spam plugin and all DSPAM stuff is masked/hidden from them.
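Since the delete statements in step 6 are destructive and the uid 1000 is only an assumption, it may be safer to generate them from the real uid and review them before feeding them to MySQL. A small sketch (purge_sql is a hypothetical helper, and how you look up the uid is not covered here):

```shell
#!/bin/sh
# Sketch: print the three delete statements from step 6 for a given uid.
# purge_sql is an illustrative name. Nothing is sent to the database;
# the statements are only written to standard output for review.
purge_sql() {
    keep_uid="$1"
    # Reject anything that is not a plain non-negative integer.
    case "$keep_uid" in
        ''|*[!0-9]*) echo "uid must be numeric" >&2; return 1 ;;
    esac
    for table in dspam_signature_data dspam_stats dspam_token_data; do
        printf 'delete from %s where uid!=%s;\n' "$table" "$keep_uid"
    done
}
```

For example, purge_sql 1000 > purge.sql writes the statements to a file that can be checked before it is run.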

7) Change your dspam.conf to run in TOE instead of TEFT. Don't forget to check each user's preferences to make sure "trainingMode" has not been set by accident to anything other than "TOE". Actually, you could delete "trainingMode" if the user has that preference (it will then fall back to what you have set in dspam.conf, which in your case should be TOE).
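A quick way to double check the dspam.conf change before restarting is to print the uncommented training mode lines. This sketch assumes the directive is spelled TrainingMode in your dspam.conf (verify against your own file); training_mode is a made-up helper name and the config path is passed as an argument.

```shell
#!/bin/sh
# Sketch: print the value of every uncommented TrainingMode line in a
# config file, lowercased. training_mode is an illustrative name, and the
# directive name is an assumption to verify against your own dspam.conf.
training_mode() {
    conf="$1"
    # Lines starting with "#" do not match because their first field
    # is "#TrainingMode" (or "#"), not "TrainingMode".
    awk 'tolower($1) == "trainingmode" { print tolower($2) }' "$conf"
}
```

If this prints anything other than a single "toe", the file probably still needs editing.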


8) Restart the DSPAM daemon.

Thanks in advance!




--
Kind Regards from Switzerland,

Stevan Bajić

_______________________________________________
Dspam-user mailing list
Dspam-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspam-user
