On 20.04.2012 19:41, Bradley Giesbrecht wrote:
> On Apr 20, 2012, at 9:39 AM, Stevan Bajić wrote:
>
>> On 20.04.2012 07:32, Steve Fatula wrote:
>>> [...]
>>> If you give me the SPAM corpus, I can just run dspam_train on it (and I'd 
>>> even add my 80). But it will be pretty unbalanced since I have few HAM 
>>> messages since I only keep a month (maybe a few thousand messages). I am 
>>> not sure that matters much? In the end, won't the detection still work, 
>>> maybe biased towards SPAM at first, but, surely, it wouldn't take too long 
>>> to stop false positives?
>>>
>> Let's say you want to make that merged global group. Then this is what you 
>> should do:
>>
>> 1) Create a new DSPAM user. If you can, create a flat user (no 
>> localp...@domain.tld), because a flat user name will be easier to recognize 
>> on your setup, where user names are usually full email addresses. Let's say 
>> the newly created user is called "SpamHitRate".
>>
>> 2) Change preferences for that user to:
>> dspam_admin change preference "SpamHitRate" "dailyQuarantineSummary" "off"
>> dspam_admin change preference "SpamHitRate" "enableBNR" "on"
>> dspam_admin change preference "SpamHitRate" "enableWhitelist" "off"
>> dspam_admin change preference "SpamHitRate" "fallbackDomain" "off"
>> dspam_admin change preference "SpamHitRate" "ignoreGroups" "on"
>> dspam_admin change preference "SpamHitRate" "ignoreRBLLookups" "on"
>> dspam_admin change preference "SpamHitRate" "makeCorpus" "off"
>> dspam_admin change preference "SpamHitRate" "optIn" "on"
>> dspam_admin change preference "SpamHitRate" "optOut" "off"
>> dspam_admin change preference "SpamHitRate" "optOutClamAV" "on"
>> dspam_admin change preference "SpamHitRate" "processorBias" "off"
>> dspam_admin change preference "SpamHitRate" "showFactors" "off"
>> dspam_admin change preference "SpamHitRate" "signatureLocation" "headers"
>> dspam_admin change preference "SpamHitRate" "spamAction" "deliver"
>> dspam_admin change preference "SpamHitRate" "spamSubject" ""
>> dspam_admin change preference "SpamHitRate" "statisticalSedation" "0"
>> dspam_admin change preference "SpamHitRate" "storeFragments" "off"
>> dspam_admin change preference "SpamHitRate" "tagNonspam" "off"
>> dspam_admin change preference "SpamHitRate" "tagSpam" "off"
>> dspam_admin change preference "SpamHitRate" "trainingMode" "TOE"
>> dspam_admin change preference "SpamHitRate" "trainPristine" "off"
>> dspam_admin change preference "SpamHitRate" "whitelistThreshold" "9999999"
>>
>> Basically you want that user to not use ClamAV, groups, RBL lookups, 
>> whitelisting, or any other mumbo jumbo. Usually you would not turn off that 
>> many helper mechanisms for a normal user, but this is not a normal user. You 
>> want that user to have it as hard as possible. You don't care about false 
>> positives or false negatives on that user. In fact, they are exactly what 
>> you want: the more FP/FN that user generates, the more DSPAM has to learn. 
>> And learning is mainly what this user is for: you are going to run 
>> dspam_train on it with spam/ham corpora.
>>
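A minimal sketch of how the preference changes in step 2 could be scripted instead of typed one by one. The function name apply_prefs and the DRY_RUN switch are made up for this example and are not part of DSPAM; only the "dspam_admin change preference" call is taken from the list above. By default it only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper, not part of DSPAM: reads "name value" pairs from
# stdin and emits one dspam_admin call per pair for the given user.
# DRY_RUN=1 (the default here) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

apply_prefs() {
    user="$1"
    while read -r name value; do
        # Skip empty lines in the input.
        [ -z "$name" ] && continue
        if [ "$DRY_RUN" = "1" ]; then
            printf 'dspam_admin change preference "%s" "%s" "%s"\n' \
                "$user" "$name" "$value"
        else
            dspam_admin change preference "$user" "$name" "$value"
        fi
    done
}

# Shortened demo list; extend it with the remaining pairs from step 2.
# A name without a value (spamSubject) yields an empty string as value.
apply_prefs "SpamHitRate" <<'EOF'
enableBNR on
enableWhitelist off
spamSubject
trainingMode TOE
EOF
```

Running it with DRY_RUN=0 would execute the calls for real, so check the printed list first.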
>> 3) Now go on and train with dspam_train: dspam_train SpamHitRate 
>> [spam_corpus maildir or mbox] [nonspam_corpus maildir or mbox]
>>
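Since the question was how unbalanced the two corpora are, it may help to count the messages on each side before running step 3. This is only a sketch: count_msgs is a made-up helper, and it assumes a maildir keeps one file per message under cur/ and new/ while an mbox separates messages with lines starting with "From ".

```shell
#!/bin/sh
# Hypothetical helper: print the number of messages in a maildir or mbox.
count_msgs() {
    corpus="$1"
    if [ -d "$corpus" ]; then
        # Maildir: one file per message under cur/ and new/.
        find "$corpus/cur" "$corpus/new" -type f 2>/dev/null | wc -l | tr -d ' '
    elif [ -f "$corpus" ]; then
        # mbox: every message starts with a "From " separator line.
        # grep -c exits non-zero on zero matches but still prints 0.
        grep -c '^From ' "$corpus" || true
    else
        echo 0
    fi
}

# Usage (paths are placeholders):
#   echo "spam: $(count_msgs /path/to/spam_corpus)"
#   echo "ham:  $(count_msgs /path/to/nonspam_corpus)"
```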
>> 4) After you are finished with dspam_train you should go on and run 
>> dspam_clean: dspam_clean -s0 -p0 -u0,0,0,0 SpamHitRate
>>
>> 5) Now enable the merged global group by adding this line to the DSPAM 
>> group file:
>> SpamHitRate:merged:*
>>
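If step 5 is scripted, something like the following keeps the group entry from being added twice. It is a sketch only: add_merged_group is a made-up name, and the location of the group file depends on your installation, so it is passed in as an argument rather than guessed.

```shell
#!/bin/sh
# Hypothetical helper: append a group entry only if it is not already there.
add_merged_group() {
    group_file="$1"
    entry="$2"
    # -F: match literally (the entry contains '*'), -x: whole line, -q: quiet.
    if ! grep -Fxq "$entry" "$group_file" 2>/dev/null; then
        printf '%s\n' "$entry" >> "$group_file"
    fi
}

# Usage (the path is a placeholder for your DSPAM group file):
#   add_merged_group /path/to/dspam/group 'SpamHitRate:merged:*'
```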
>> 6) You are using MySQL, right? Now it is time to delete all users' tokens 
>> except those of SpamHitRate. To do that, execute this (assuming the uid of 
>> SpamHitRate is 1000):
>>
>> delete from dspam_signature_data where uid!=1000;
>> delete from dspam_stats where uid!=1000;
>> delete from dspam_token_data where uid!=1000;
>>
>> analyze table dspam_signature_data;
>> analyze table dspam_stats;
>> analyze table dspam_token_data;
>>
>> optimize table dspam_signature_data;
>> optimize table dspam_stats;
>> optimize table dspam_token_data;
>>
>> After you have done that, all old tokens, signatures, and statistics of 
>> every other user should be removed. This will lead to problems if users try 
>> to retrain messages they received in the last few days (since the signature 
>> data is purged). I don't think this will be a big issue on your setup, since 
>> your users are using the dovecot anti-spam plugin and all DSPAM stuff is 
>> masked/hidden from them.
>>
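Because the uid 1000 in step 6 is only an example, one way to avoid a typo in the three DELETE statements is to generate them from the real uid. This is a sketch: purge_sql is a made-up helper, the database name in the usage comment is a placeholder, and the lookup assumes the usual dspam_virtual_uids table with uid and username columns, which you should verify against your own schema. Take a backup before running anything like this.

```shell
#!/bin/sh
# Hypothetical helper: print the DELETE statements from step 6 for a given
# uid to keep. It refuses anything that is not a plain number.
purge_sql() {
    keep_uid="$1"
    case "$keep_uid" in
        ''|*[!0-9]*)
            echo "purge_sql: uid must be numeric" >&2
            return 1
            ;;
    esac
    for table in dspam_signature_data dspam_stats dspam_token_data; do
        printf 'DELETE FROM %s WHERE uid != %s;\n' "$table" "$keep_uid"
    done
}

# Usage (database name "dspam" is an assumption):
#   Look up the uid first, for example with
#     SELECT uid FROM dspam_virtual_uids WHERE username = 'SpamHitRate';
#   then review the output of
#     purge_sql 1000
#   before piping it into the mysql client.
```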
>> 7) Change your dspam.conf to run in TOE instead of TEFT. Don't forget to 
>> check each user's preferences to make sure none of them has accidentally 
>> set "trainingMode" to anything other than "TOE". Actually, you could simply 
>> delete "trainingMode" wherever a user has that preference (it will then fall 
>> back to what you have set in dspam.conf, which in your case should be TOE).
>>
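A quick way to confirm the dspam.conf change from step 7 before restarting the daemon. This is a sketch: training_mode is a made-up helper, and it assumes the mode is set with a "TrainingMode" directive on its own line, as in the stock dspam.conf. Commented-out lines are ignored, and the last active line wins.

```shell
#!/bin/sh
# Hypothetical helper: print the active TrainingMode value, lowercased.
training_mode() {
    conf="$1"
    # Lines starting with '#' have a different first field, so they never
    # match. Nothing is printed if no active directive is found.
    awk 'tolower($1) == "trainingmode" { mode = tolower($2) }
         END { if (mode != "") print mode }' "$conf"
}

# Usage (the path is a placeholder for your dspam.conf):
#   [ "$(training_mode /path/to/dspam.conf)" = "toe" ] || echo "not TOE yet"
```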
>> 8) Restart the DSPAM daemon.
> Thank you Stevan :)
No problem. If anyone needs a bunch of spam corpora, have a look here 
-> http://untroubled.org/spam/


> Regards,
> Bradley Giesbrecht
>
>
> ------------------------------------------------------------------------------
> For Developers, A Lot Can Happen In A Second.
> Boundary is the first to Know...and Tell You.
> Monitor Your Applications in Ultra-Fine Resolution. Try it FREE!
> http://p.sf.net/sfu/Boundary-d2dvs2
> _______________________________________________
> Dspam-user mailing list
> Dspam-user@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/dspam-user


-- 
Kind Regards from Switzerland,

Stevan Bajić

