On Apr 20, 2012, at 9:39 AM, Stevan Bajić wrote:

> On 20.04.2012 07:32, Steve Fatula wrote:
>> [...]
> 
>> If you give me the SPAM corpus, I can just run dspam_train on it (and I'd 
>> even add my 80). But it will be pretty unbalanced since I have few HAM 
>> messages since I only keep a month (maybe a few thousand messages). I am not 
>> sure that matters much? In the end, won't the detection still work, maybe 
>> biased towards SPAM at first, but surely it wouldn't take too long to stop 
>> false positives?
>> 
> Let's say you want to make that merged global group. Then this is what you 
> should do:
> 
> 1) Create a new DSPAM user. If you can, create a flat user (no 
> localp...@domain.tld), because a flat user name will be easier to recognize on 
> your setup, where you usually have full email addresses as user names. 
> Let's say that newly created user is called "SpamHitRate".
> 
> 2) Change preferences for that user to:
> dspam_admin change preference "SpamHitRate" "dailyQuarantineSummary" "off"
> dspam_admin change preference "SpamHitRate" "enableBNR" "on"
> dspam_admin change preference "SpamHitRate" "enableWhitelist" "off"
> dspam_admin change preference "SpamHitRate" "fallbackDomain" "off"
> dspam_admin change preference "SpamHitRate" "ignoreGroups" "on"
> dspam_admin change preference "SpamHitRate" "ignoreRBLLookups" "on"
> dspam_admin change preference "SpamHitRate" "makeCorpus" "off"
> dspam_admin change preference "SpamHitRate" "optIn" "on"
> dspam_admin change preference "SpamHitRate" "optOut" "off"
> dspam_admin change preference "SpamHitRate" "optOutClamAV" "on"
> dspam_admin change preference "SpamHitRate" "processorBias" "off"
> dspam_admin change preference "SpamHitRate" "showFactors" "off"
> dspam_admin change preference "SpamHitRate" "signatureLocation" "headers"
> dspam_admin change preference "SpamHitRate" "spamAction" "deliver"
> dspam_admin change preference "SpamHitRate" "spamSubject" ""
> dspam_admin change preference "SpamHitRate" "statisticalSedation" "0"
> dspam_admin change preference "SpamHitRate" "storeFragments" "off"
> dspam_admin change preference "SpamHitRate" "tagNonspam" "off"
> dspam_admin change preference "SpamHitRate" "tagSpam" "off"
> dspam_admin change preference "SpamHitRate" "trainingMode" "TOE"
> dspam_admin change preference "SpamHitRate" "trainPristine" "off"
> dspam_admin change preference "SpamHitRate" "whitelistThreshold" "9999999"
> 
> Basically you want that user to not use ClamAV, any groups, any RBL, 
> whitelisting, or any other mumbo jumbo. Usually you would not turn off that 
> many helper mechanisms on a normal user, but this is not a normal user. You 
> want that user to be as hard as possible. You don't care about false 
> positives or false negatives on that user. In fact, they are exactly what 
> you want: that user should generate as many false positives / negatives as 
> needed, because the more FP/FN you have, the more you can make DSPAM learn. 
> And learning is mainly what you are going to do with that user. You are 
> going to use dspam_train with spam/ham corpora.
> 
> 3) Now go on and train with dspam_train: dspam_train SpamHitRate [spam_corpus 
> maildir or mbox] [nonspam_corpus maildir or mbox]
> 
> 4) After you are finished with dspam_train you should go on and run 
> dspam_clean: dspam_clean -s0 -p0 -u0,0,0,0 SpamHitRate
> 
> 5) Now enable the merged global group by editing the DSPAM group file and 
> adding this line:
> SpamHitRate:merged:*
> 
> 6) You are using MySQL, right? Now it is time to delete all users' tokens 
> except those of SpamHitRate. To do that, just execute this (assuming the uid 
> of SpamHitRate is 1000):
> 
> delete from dspam_signature_data where uid!=1000;
> delete from dspam_stats where uid!=1000;
> delete from dspam_token_data where uid!=1000;
> 
> analyze table dspam_signature_data;
> analyze table dspam_stats;
> analyze table dspam_token_data;
> 
> optimize table dspam_signature_data;
> optimize table dspam_stats;
> optimize table dspam_token_data;
> 
> After you have done that, all old tokens, signatures and statistics for 
> each user should be removed. This will lead to problems if users try to 
> retrain messages that they received in the last few days (since the 
> signature data is purged). I don't think this will be a big issue on your 
> setup, since your users are using the dovecot anti-spam plugin and all DSPAM 
> stuff is masked/hidden from them.
> 
> 7) Change your dspam.conf to run in TOE instead of TEFT. Don't forget to 
> check each user's preferences to make sure "trainingMode" has not been set 
> by accident to anything other than "TOE". Actually, you could just delete 
> "trainingMode" wherever a user has that preference (it will then fall back 
> to what you have set in dspam.conf, which in your case should be TOE).
> 
> 8) Restart the DSPAM daemon.
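
For anyone repeating step 2 for a different user name, the 22 preference
commands can be generated from a list rather than typed one by one. This is
only a rough sketch: the function name print_pref_commands is made up for
illustration and is not part of DSPAM. It prints the same dspam_admin
commands shown in step 2 and does not execute anything.

```shell
#!/bin/sh
# Sketch only: prints the step-2 commands for one user, runs nothing.
# print_pref_commands is a hypothetical helper name, not a DSPAM tool.

print_pref_commands() {
    user="$1"
    # Each line of the here-document is "<preference> <value>".
    # A missing value (spamSubject) is emitted as an empty string.
    while read -r pref value; do
        [ -n "$pref" ] || continue
        printf 'dspam_admin change preference "%s" "%s" "%s"\n' \
            "$user" "$pref" "$value"
    done <<'EOF'
dailyQuarantineSummary off
enableBNR on
enableWhitelist off
fallbackDomain off
ignoreGroups on
ignoreRBLLookups on
makeCorpus off
optIn on
optOut off
optOutClamAV on
processorBias off
showFactors off
signatureLocation headers
spamAction deliver
spamSubject
statisticalSedation 0
storeFragments off
tagNonspam off
tagSpam off
trainingMode TOE
trainPristine off
whitelistThreshold 9999999
EOF
}

print_pref_commands "SpamHitRate"
```

The output can be reviewed first and then piped to sh once it looks right.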

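Step 6 hard-codes uid 1000, which is only an assumed value, so a small
wrapper that takes the uid as an argument may help avoid deleting the wrong
rows. Again just a sketch: print_cleanup_sql is a hypothetical helper name,
and it only prints the statements from step 6 with the given uid filled in.

```shell
#!/bin/sh
# Sketch only: prints the step-6 SQL for a given uid, runs nothing.
# print_cleanup_sql is a hypothetical helper name, not a DSPAM tool.

print_cleanup_sql() {
    keep_uid="$1"
    # Accept digits only, so nothing unexpected ends up in the SQL.
    case "$keep_uid" in
        ''|*[!0-9]*)
            echo "uid must be numeric" >&2
            return 1
            ;;
    esac
    tables="dspam_signature_data dspam_stats dspam_token_data"
    # Same order as in step 6: delete, then analyze, then optimize.
    for table in $tables; do
        printf 'delete from %s where uid!=%s;\n' "$table" "$keep_uid"
    done
    for table in $tables; do
        printf 'analyze table %s;\n' "$table"
    done
    for table in $tables; do
        printf 'optimize table %s;\n' "$table"
    done
}

print_cleanup_sql 1000
```

The printed statements are destructive, so it is worth checking the uid and
taking a database backup before feeding them to the mysql client.
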
Thank you Stevan :)

Regards,
Bradley Giesbrecht


_______________________________________________
Dspam-user mailing list
Dspam-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspam-user