-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 29/01/2010 09:33, Stevan Bajić wrote:
> On Fri, 29 Jan 2010 08:26:44 +0100 "[email protected]"
> <[email protected]> wrote:
>
>> Our users are able to train DSPAM, CRM114 and SA. They share the
>> same data set.
>>
> So basically one user could mess up the whole data set for all
> other users. Is that really something you want?
>
>
I'm aware of that; it can be a problem (Bayes poisoning). We check
submitted samples at random before retraining, but that is not enough
to be certain that everything is OK.

>> We use Postfix as the global MTA, but we don't use it for
>> retraining (no special alias).
>>
> Postfix acting as an edge MTA. Right? Do you use other things in
> Postfix? Stuff like SPF, DKIM, SenderID, Milters, Policy
> Delegation, etc? What would that be?
>
>
Right, Postfix acts as a first line of defense, with all the less
CPU-intensive tests: DNS, RFC compliance, RBL, and traffic control
with policyd v2. SPF and DKIM are handled later by amavisd/SA.
About 95% of spam is blocked by the Postfix controls.

>> In order to retrain FPs, our customers can move email into two IMAP
>> folders in their mailbox, one for spam learning, the other for
>> ham learning. These feed two special folders on one centralized
>> server, to which we apply the learning scripts. The script runs
>> sa-learn for SA; for DSPAM, it checks the email headers and, if
>> DSPAM does not agree with the classification, the email is
>> retrained with the command:
>> /usr/bin/dspam --client --user amavis --class=spam --source=error
>> (or class=ham of course)
>>
> Sounds pretty much like what the Dovecot Anti-Spam plugin is
> doing. How do you handle POP users? How do they retrain?

We have a very limited number of POP users. It is a limitation of our
system, but by choice: POP users don't retrain.

>
>
>> This retraining greatly increases the accuracy of the three engines.
>>
>> Autolearning is trickier because it relies massively on the
>> heuristics engine (main scoring) to adjust the statistical engines
>> (SA Bayes, CRM) on the fly. But I agree with you: what's the
>> point of using the three statistical engines this way? For SA, it's
>> OK, but for CRM114 and DSPAM, I wonder if it's really clever.
>>
> I personally would say that it's not clever.
>
>
>> So I think I will let DSPAM do its job, and continue to use its
>> scoring to balance the others.
>>
> As an ISP you should consider using groups in DSPAM and split DSPAM
> so that every user has his/her own data set. I see a merged group
> for your scenario. Then you could just train that merged group
> while leaving it up to the user to train his/her data. I would only
> feed spam honeypots to the merged group, and from time to time I
> would feed some ham to the merged group. Or maybe set up a
> mechanism to feed users' outbound mail to their own data sets in
> order to get bulk ham data.
>
Interesting, I will take a look into it. Is it possible to do this
with the amavisd integration, or do I need to switch to a more
standard one?
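
For reference, the DSPAM part of the retraining script described above (check the headers, retrain only when DSPAM disagrees with the folder) could be sketched roughly as below. This is not our actual script, only a minimal illustration. It assumes that the DSPAM verdict is available in an X-DSPAM-Result header, and the function names are made up for the example. The binary path, the amavis user and the options are the ones from the command quoted above. For the ham case the sketch passes "innocent" as the class name, which is how I understand the DSPAM documentation spells it, so please check that against your version.

```python
from email import message_from_string

DSPAM_BIN = "/usr/bin/dspam"   # path as given in the command above
DSPAM_USER = "amavis"          # shared DSPAM user from the command above


def dspam_verdict(msg):
    """Map the X-DSPAM-Result header to 'spam' or 'ham'.

    Assumption: the header is present on delivered mail. Returns None
    when it is missing or holds an unexpected value.
    """
    result = (msg.get("X-DSPAM-Result") or "").strip().lower()
    if result == "spam":
        return "spam"
    if result in ("innocent", "whitelisted"):
        return "ham"
    return None


def retrain_command(msg, folder_class):
    """Build the dspam argv for a message found in a learning folder.

    folder_class is 'spam' or 'ham', depending on the IMAP folder the
    user moved the message to. Returns None when DSPAM already agrees
    with the user, or when there is no verdict to correct.
    """
    verdict = dspam_verdict(msg)
    if verdict is None or verdict == folder_class:
        return None
    # Assumption: DSPAM calls the ham class "innocent".
    dspam_class = "spam" if folder_class == "spam" else "innocent"
    return [
        DSPAM_BIN, "--client", "--user", DSPAM_USER,
        "--class=" + dspam_class, "--source=error",
    ]
```

In a real script the returned list would be passed to something like subprocess.run(cmd, input=raw_message), with the full message on standard input, once per message in the spam and ham learning folders.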
>
>> That's how it works at the moment, and I'm really satisfied:
>> accuracy is great and FPs are very low.
>>
> My current setup has about 1% spam volume. But I use a Policy
> Delegation service to block 60% to 80% of inbound mail. Out of the
> total inbound (excluding the blocked inbound) I have a very, very
> low FP/FN amount. I have no numbers handy but it's very low (also
> a single-digit percentage).
>
>
That's why I always try to adapt our system to be more effective.
I really appreciate your advice, thanks a lot.

>> And maybe I will do the same with CRM114.
>>
>> So I will give the DSPAM plugin at
>> http://eric.lubow.org/projects/dspam-spamassassin-plugin/
>> a try because, if I understand correctly, it can be used to
>> balance scoring more precisely.
>>
>> Thanks for your help on this
>> Regards, Tonio
>>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAktiqasACgkQ8FtMlUNHQIOcFgCfQEUhboxgf4WPruBOMT/K7VI1
fgoAn3vRuI0QYKjogTfRTeepXX0RpeY6
=2PKi
-----END PGP SIGNATURE-----

_______________________________________________
Dspam-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspam-user
