-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 29/01/2010 09:33, Stevan Bajić wrote:
> On Fri, 29 Jan 2010 08:26:44 +0100 "[email protected]"
> <[email protected]> wrote:
>
>> Our users are able to train DSPAM, CRM114 and SA. They share the
>> same data set.
>>
> So basically one user could mess up the whole data set for all
> other users. Is that really something you want?
>
>
I'm aware of that; it can be a problem (Bayes poisoning).
We check a random sample of submissions before retraining, but that's
not enough to be certain that everything is OK.
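
For illustration, the random spot check mentioned above could look
roughly like the Python sketch below. The function name, the 10%
default, and the idea of passing file paths are assumptions made for
this example; they are not taken from our actual scripts.

```python
import random


def pick_review_sample(paths, fraction=0.1, seed=None):
    """Return a random subset of submitted messages for manual review.

    `fraction` (hypothetical default: 10%) is the share of submissions
    to inspect; at least one message is picked when any were submitted.
    """
    paths = list(paths)
    if not paths:
        return []
    k = max(1, round(len(paths) * fraction))
    # A dedicated Random instance keeps the draw reproducible when a
    # seed is given, without touching the global random state.
    return random.Random(seed).sample(paths, k)
```

A spot check like this only reduces the risk of poisoning; anything
outside the sample still reaches the shared data set unreviewed.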
>> We use Postfix as the global MTA, but we don't use it for
>> retraining (no special alias).
>>
> Postfix acting as an edge MTA. Right? Do you use other things in
> Postfix? Stuff like SPF, DKIM, SenderID, Milters, Policy
> Delegation, etc? What would that be?
>
>
Right, Postfix acts as a first line of defense, with all the less
CPU-intensive tests: DNS, RFC compliance, RBL, traffic control with
policyd v2. SPF and DKIM are handled later by amavisd/SA.

95% of spam is blocked by the Postfix controls.
>> To retrain FPs, our customers can move email into two IMAP
>> folders in their mailbox, one for spam learning, the other for
>> ham learning. These feed two special folders on one centralized
>> server, on which we run a learning script. The script runs
>> sa-learn for SA; for DSPAM, it checks the email headers and, if
>> DSPAM disagrees with the classification, the email is retrained
>> with the command: /usr/bin/dspam --client --user amavis --class=spam
>> --source=error  (or class=ham of course)
>>
> Sounds pretty much to do what the Dovecot Anti-Spam plugin is
> doing. How do you handle POP users? How do they retrain?
We have a very limited number of POP users, but that's a limitation of
our system: by choice, POP users don't retrain.
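
For illustration, the DSPAM part of the retraining step described
above (read DSPAM's verdict from the headers, retrain only when it
disagrees with the folder) could be sketched in Python roughly as
follows. This is not the actual script: the header name
X-DSPAM-Result, the helper names, and the mapping of "ham" to DSPAM's
"innocent" class are assumptions, so check them against your own
DSPAM build.

```python
import subprocess
from email import message_from_bytes

# Assumption: DSPAM's verdict is recorded in an X-DSPAM-Result header.
# The original description only says that the script "checks email headers".
VERDICT_HEADER = "X-DSPAM-Result"

# Value passed to --class. The post writes "ham"; DSPAM's documentation
# names the non-spam class "innocent" (assumed mapping).
DSPAM_CLASS = {"spam": "spam", "ham": "innocent"}


def dspam_verdict(raw):
    """Return 'spam', 'ham', or None if the header is missing or unknown."""
    value = message_from_bytes(raw).get(VERDICT_HEADER, "")
    value = str(value).strip().lower()
    if value == "spam":
        return "spam"
    if value in ("innocent", "whitelisted"):
        return "ham"
    return None


def retrain_command(folder, raw, user="amavis"):
    """Build the dspam command for a message taken from the 'spam' or
    'ham' folder. Return None when DSPAM already agrees with the folder,
    or when no verdict is recorded (nothing to correct as an error)."""
    verdict = dspam_verdict(raw)
    if verdict is None or verdict == folder:
        return None
    return ["/usr/bin/dspam", "--client", "--user", user,
            "--class=" + DSPAM_CLASS[folder], "--source=error"]


def retrain(folder, raw):
    """Run dspam with the message on stdin if retraining is needed."""
    cmd = retrain_command(folder, raw)
    if cmd is not None:
        subprocess.run(cmd, input=raw, check=True)
```

Keeping the decision (retrain_command) separate from the call to
dspam makes it possible to dry-run the script over the two folders
before touching the data set.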

>
>
>> This retraining greatly increases the accuracy of the 3 engines.
>>
>> Autolearning is trickier because it relies heavily on the
>> heuristic engine (main scoring) to adjust the statistical engines
>> (SA Bayes, CRM) on the fly. But I agree with you: what's the
>> point of using the 3 statistical engines this way? For SA, it's
>> OK, but for CRM114 and DSPAM, I wonder if it's really clever.
>>
> I personally would say that it's not clever.
>
>
>> So I think I will let DSPAM do its job, and continue to use its
>> scoring to balance the others.
>>
> As an ISP you should consider using groups in DSPAM and split DSPAM
> so that every user has his/her own data set. I see a merged group
> for your scenario. Then you could just train that merged group
> while leaving it up to the user to train his/her data. I would only
> feed spam honeypots to the merged group, and from time to time I
> would feed some ham to the merged group. Or maybe set up a
> mechanism to feed each user's outbound mail to his/her data set in
> order to get bulk ham data.
>
Interesting, I will take a look into it.
Is it possible to do this with the amavisd integration, or do I need to
switch to a more standard one?

>
>> That's the way it currently works, and I'm really satisfied:
>> accuracy is great and FPs are very low.
>>
> My current setup has about 1% spam volume. But I use a Policy
> Delegation service to block 60% to 80% of inbound mail. Out of the
> total inbound (excluding the blocked inbound) I have a very, very
> low FP/FN amount. I have no numbers handy but it's very low (also
> a single-digit percentage).
>
>
That's why I always try to adapt our system to be more effective.

I really appreciate your advice, thanks a lot.
>> And maybe I will do the same with CRM114.
>>
>> So I will give the DSPAM plugin at
>> http://eric.lubow.org/projects/dspam-spamassassin-plugin/
>> a try because, if I understand correctly, it can be used to
>> balance scoring more precisely.
>>
>> Thanks for your help on this. Regards, Tonio
>>

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAktiqasACgkQ8FtMlUNHQIOcFgCfQEUhboxgf4WPruBOMT/K7VI1
fgoAn3vRuI0QYKjogTfRTeepXX0RpeY6
=2PKi
-----END PGP SIGNATURE-----


_______________________________________________
Dspam-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspam-user
