Re: [Dspam-user] Fw: clamav success story

Stevan Bajić Mon, 19 Apr 2010 15:46:11 -0700

On Mon, 19 Apr 2010 23:25:38 +0200 (CEST)
"Edward P. Ross" <epr...@acrocat.com> wrote:


> Weird -- I've been on the list for a few days now (just got my 2nd
> confirmation)...
> 
I have now subscribed you to the list. That's the reason you got a welcome
message. :)


> Thanks Stevan!
> 
No problem.


[...] 

> > I saw another thread where you suggested some things re: clamav and
> > additional databases... I followed your instructions and installed them.
> >
> > Amazingly, more than 35% of all incoming messages to our mail server here
> > are getting caught by clamav (no false positives).  Between that and
> > graymilter, dspam is now almost officially on vacation :) !
> >
I can imagine. I don't know what MTA you are using. I use Postfix (in the
front) and I have policyd-weight with a bunch of home-made patches that add
additional features to it. This beast alone is stopping the biggest chunk of my
Spam mails. DSPAM has almost nothing to do anymore since I added that patched
policyd-weight. So far I have installed that thing on Paul Cockings' MX server
and on Marko Weber's new MX server. Paul has one particular account that is
getting a lot of Spam. Since we added those additional ClamAV signatures, a lot
of the Spam for that account on Paul's server gets caught by them.
Policyd-weight does not capture anything on that particular account because
Paul had to disable it there: the target user mostly receives mail from other
(external) accounts that automatically forward mail to it, so policyd-weight
would not help much. But the other accounts all profit from the additional
functionality provided by the patched policyd-weight (and from the additional
ClamAV signatures).

On Marko's MX server I think he has not received a single Spam mail since I
activated that patched policyd-weight on his server (slightly over a week ago).
Okay, okay. Marko's setup is different from Paul's. I have enabled a lot of
Anti-Spam features on Marko's Postfix installation. And I automated everything.
I really mean everything. All the stuff gets downloaded and updated without the
need to do anything.

Last year I invested some time in building a virtual appliance (VMware
Server/ESX image) for Paul because no one really had the time to play around
with DSPAM 3.9.0 and I wanted Paul to be able to look at DSPAM 3.9.0 without
the need to convert his current setup to 3.9.0.

When building that appliance I wrote a lot of scripts that automate the
management of the appliance. In my own setup I already have many of those
scripts but I never really took the time to separate them and put them into
logical entities. But for the appliance I did it. Not all of them. Just the most
important ones. And this is more or less what I have included in Marko's setup.
It was btw easier for me to work on Marko's system since he is using Gentoo
like me and I instantly feel @ home on his server landscape :)

I am still working with Marko on his setup. I am not finished. There are many
additional things that can be done and that I will implement in Marko's setup.
So far the most important things are implemented.

What Paul's and Marko's setups are still lacking is all that automatic learning
stuff that I have implemented on my side. Stuff like honeypot traps that
automatically feed DSPAM and stuff like the automatic positive feedback process
for DSPAM (aka: learning outbound mail from authenticated users into their
DSPAM instance).

On both setups I have trained a global merged group in DSPAM and used that in
order to minimize the training for new and old users. Actually all users were
new to DSPAM since we removed all tokens from DSPAM and went with a completely
new tokenizer. That required removing all tokens anyway, since the old
tokenizer is not compatible with the one we use in the new setup.

For training I used my modified dspam_train, which is able to use TONE training
(TONE = Train On Error or Near Error) with an asymmetric training
threshold/thickness, double-sided training and other small additional
features. On Paul's setup I first used a dump from my DSPAM database and
imported it into his setup, but that did not result in a good catch rate. So
we discarded that data and used his own Ham corpus, and for the Spam corpus we
used publicly available corpora. The result was that after training with his
data the catch rate was way better than with my tokens. This is probably
because I have a unique situation where I need multiple languages in my DSPAM
token database (Switzerland officially has 4 national languages) and Paul has
almost only English mails.
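
A minimal sketch of the TONE idea mentioned above (train on an error, or when
a correct result lies too close to the decision boundary). This is not the
code of the modified dspam_train. The function name, the 0.5 boundary and the
two margin values are assumptions chosen only for illustration; the asymmetry
is expressed by using a different margin on the Spam side and the Ham side.

```python
def needs_training(spam_probability, is_spam, spam_margin=0.25, ham_margin=0.10):
    """Return True if a verified message should be trained (TONE-style rule).

    spam_probability: classifier output in [0, 1]; >= 0.5 is read as Spam.
    is_spam:          the verified, true class of the message.
    spam_margin,
    ham_margin:       hypothetical "thickness" on each side of the boundary.
    """
    predicted_spam = spam_probability >= 0.5
    if predicted_spam != is_spam:
        # Error (FP or FN): always train.
        return True
    if is_spam:
        # Correct Spam verdict, but too close to the boundary: near error.
        return spam_probability < 0.5 + spam_margin
    # Correct Ham verdict, but too close to the boundary: near error.
    return spam_probability > 0.5 - ham_margin


if __name__ == "__main__":
    for prob, truth in [(0.9, True), (0.6, True), (0.45, False), (0.2, False)]:
        print(prob, truth, needs_training(prob, truth))
```

With these example margins a Spam message is retrained more readily than a Ham
message; which side deserves the thicker margin depends on the corpus.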

For Marko I went straight ahead and asked him for Ham mails. I trained with
his Ham mails and again with publicly available Spam corpora, and in addition
trained Ham messages that I have been capturing for years by pulling a dozen
German-language newsgroups. I categorize those newsgroups automatically with
DSPAM but only train FP/FN and messages that are below my training
threshold/thickness. I verify EVERY message that falls into one of the above
conditions by hand. So they are all 100% verified. In the old days this was
a lot of work but right now I hardly have more than 100 messages a week that I
need to verify and correct (my DSPAM installation is very old).

Why am I writing this? Hm... my point is that if you take the time to plan an
MTA, take the time to harden that MTA with whatever is available out there on
the net and, if you know how to code, implement the stuff that is not available
as an out-of-the-box solution, then you can build, even without a content
filter, a very good setup that blocks above 95% of your Spam mails with
virtually no false positives/false negatives.

Spam is hardly an issue today. The problem is only lazy admins. I know, I know.
This sounds harsh. But hey. It is my personal opinion, so challenge me and
prove me wrong.

From my viewpoint so many MTA admins out there have not secured their
infrastructure well. It always puzzles me to see some basic stuff not closed
down. One such simple example is that they don't prevent spammers from claiming
to be an email address that they obviously can't be. For example you will NOT
be able to send me a mail directly over my mailserver and claim to be
<ste...@bajic.ch>. You can try as long as you want but it will not work (that
netbox system is NOT on my network. It is a system out on the internet):
-------------------------------------
netbox ~ # telnet mail.bajic.ch 25
Trying 62.12.131.156...
Connected to mail.bajic.ch.
Escape character is '^]'.
220 nyx.bajic.name ESMTP Postfix (2.7.0) [NO UCE, NO UBE, C=CH, L=ZU]
ehlo localhost
250-nyx.bajic.name
250-PIPELINING
250-SIZE 52428800
250-ETRN
250-STARTTLS
250-AUTH PLAIN LOGIN DIGEST-MD5 CRAM-MD5
250-AUTH=PLAIN LOGIN DIGEST-MD5 CRAM-MD5
250-ENHANCEDSTATUSCODES
250-8BITMIME
250 DSN
mail from:<ste...@bajic.ch>
553 5.7.1 <ste...@bajic.ch>: Sender address rejected: not logged in
rset
250 2.0.0 Ok
quit
221 2.0.0 Bye
Connection closed by foreign host.
netbox ~ #
-------------------------------------

And even trying from within that system itself does not help:
-------------------------------------
nyx ~ # telnet nyx.bajic.name 25
Trying 62.12.131.156...
Connected to nyx.bajic.name.
Escape character is '^]'.
220 nyx.bajic.name ESMTP Postfix (2.7.0) [NO UCE, NO UBE, C=CH, L=ZU]
ehlo nyx.bajic.name
250-nyx.bajic.name
250-PIPELINING
250-SIZE 52428800
250-ETRN
250-STARTTLS
250-AUTH PLAIN LOGIN DIGEST-MD5 CRAM-MD5
250-AUTH=PLAIN LOGIN DIGEST-MD5 CRAM-MD5
250-ENHANCEDSTATUSCODES
250-8BITMIME
250 DSN
mail from:<ste...@bajic.ch>
553 5.7.1 <ste...@bajic.ch>: Sender address rejected: not logged in
rset
250 2.0.0 Ok
quit
221 2.0.0 Bye
Connection closed by foreign host.
nyx ~ #
-------------------------------------

I don't allow such stupid things.
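
The rule behind the two transcripts above can be sketched as follows. This is
only an illustration of the decision, not the actual Postfix configuration on
that server (which is not shown here). The domain set, the function name and
the authenticated_user parameter are assumptions; the reply strings mimic the
ones Postfix prints.

```python
# Hypothetical list of domains hosted on this server.
LOCAL_DOMAINS = {"example.org"}


def check_sender(mail_from, authenticated_user=None):
    """Return an SMTP-style reply for a MAIL FROM address.

    A sender address in one of our own domains is only accepted when the
    SMTP session is authenticated. Everything else is passed on to the
    remaining checks (represented here by a plain "Ok").
    """
    domain = mail_from.rsplit("@", 1)[-1].lower()
    if domain in LOCAL_DOMAINS and authenticated_user is None:
        return "553 5.7.1 <%s>: Sender address rejected: not logged in" % mail_from
    return "250 2.1.0 Ok"


if __name__ == "__main__":
    print(check_sender("someone@example.org"))
    print(check_sender("someone@example.org", authenticated_user="someone"))
    print(check_sender("other@example.net"))
```

Note that the check does not care where the connection comes from, which is
why the attempt from the server itself is rejected as well.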

And I punish those idiots trying to work around those limits. I block them at
the IP level. In the last 3 days alone I have blocked (temporarily, for 600
seconds each time) over 400 IPs on one of my MX clusters:
-------------------------------------
theia ~ # head -n 1 /var/log/vmail.log
Apr 17 03:16:16 theia postfix/smtpd[7183]: connect from c-71-228-120-132.hsd1.nm.comcast.net[71.228.120.132]
theia ~ # tail -n 1 /var/log/vmail.log
Apr 20 00:17:41 theia postfix/smtpd[21703]: disconnect from unknown[77.31.181.253]
theia ~ # fail2ban-client status postfix-attack
Status for the jail: postfix-attack
|- filter
|  |- File list:        /var/log/vmail.log
|  |- Currently failed: 4
|  `- Total failed:     2352
`- action
   |- Currently banned: 1
   |  `- IP list:       189.55.36.31
   `- Total banned:     406
theia ~ #
-------------------------------------

400 does not sound like much. But the above is only one of many defense
measures implemented. All of them together prevent a lot of Spam mail from
even reaching DSPAM.
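
The temporary blocking shown above can be pictured with a small sketch. This
is a toy model and not fail2ban's implementation. Only the 600 second ban time
comes from the text above; the class name, the failure limit and the
observation window are assumptions.

```python
from collections import defaultdict, deque


class BanTracker:
    """Toy temporary-ban tracker keyed by client IP."""

    def __init__(self, max_failures=5, find_time=600, ban_time=600):
        self.max_failures = max_failures  # assumed failure limit
        self.find_time = find_time        # assumed observation window (seconds)
        self.ban_time = ban_time          # 600 seconds, as in the text above
        self.failures = defaultdict(deque)
        self.banned_until = {}

    def is_banned(self, ip, now):
        """True while the ban for this IP has not expired yet."""
        return self.banned_until.get(ip, 0) > now

    def record_failure(self, ip, now):
        """Record one failed attempt; return True if it triggers a ban."""
        recent = self.failures[ip]
        recent.append(now)
        # Forget failures that fell out of the observation window.
        while recent and recent[0] <= now - self.find_time:
            recent.popleft()
        if len(recent) >= self.max_failures:
            self.banned_until[ip] = now + self.ban_time
            recent.clear()
            return True
        return False


if __name__ == "__main__":
    tracker = BanTracker(max_failures=3)
    for second in (0, 10, 20):
        print(second, tracker.record_failure("192.0.2.1", now=second))
    print(tracker.is_banned("192.0.2.1", now=21))
    print(tracker.is_banned("192.0.2.1", now=620))
```

Timestamps are passed in explicitly so the behaviour is easy to check; a real
deployment would read them from the mail log, as the fail2ban jail above does.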

And I don't just have one domain on that server. There are hundreds of them and
each of them is unique. Unique in the sense that you can't implement rules that
are too strict without breaking mail for one domain or another. So I have taken
extra care to make the whole setup very flexible. Every domain owner can turn
things on/off for his/her domain and they can control what their users are
allowed to change and what not. It's their domain and they can control almost
every aspect of it.

I have been an email user for a long time (over 20 years). I first used email
on an IBM mainframe (using OfficeVision) and for me email is and always was an
open communication platform. But in today's crazy world a responsible MTA admin
needs to take his job seriously and close all those gaping holes. And there are
so many tools and options available to do that. I am always surprised to see so
many MTAs in very bad shape. I don't know why that is, especially since there
is so much collective knowledge available on the internet that one can just
use.


> > Edward
> >
-- 
Kind Regards from Switzerland,

Stevan Bajić

