On Thursday 01 May 2008 04:03, [EMAIL PROTECTED] 
wrote:
> Steven Champeon has been in touch regarding 'testing my enemieslist rDNS
> patterns data against the SpamAssassin spam/ham corpus(es) to see if
> there's a reason for us to collaborate.'

> I'm curious to see how incorporating EL DNSBL lookups into SpamAssassin
> might be useful; we have a DNSBL mirror network (currently three hosts,
> with more on the way) or I can talk about how to use it with a patched
> rbldnsd if you wanted to do some local testing. It'd be really

Actually, it is surprising that SA hasn't looked at something like this
already. We use a similar method in our Mail Server technologies, albeit
in the SMTP layer. But I think this raises a few questions:

o Should it be RBL based?

In the past, SA users have been stung by RBL-based lookups when RBLs get
blocked, etc., leading to very high system loads.

o Should SA start integrating a definition update program for something like 
this?

Compiling even 10k regex patterns takes very little overhead, and with
daily updates of a locally cached list there is little risk of problems:
even when the updater fails, the latest downloaded regexes will always be
on hand.
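
A rough sketch of the cached-list idea, in Python rather than SA's own
Perl, just to show the shape of it. The file format (one regex per line,
'#' for comments) and the fetch callable are assumptions of mine, not
anything Enemies List or SA defines. The point is only that a failed or
broken download never replaces the copy already on disk:

```python
import os
import re
import tempfile


def compile_patterns(lines):
    """Compile one regex per non-empty, non-comment line."""
    compiled = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        compiled.append(re.compile(line, re.IGNORECASE))
    return compiled


def load_cache(path):
    """Load and compile the locally cached pattern list."""
    with open(path, encoding="utf-8") as fh:
        return compile_patterns(fh)


def update_cache(path, fetch):
    """Replace the cache only if the fetched list is usable.

    `fetch` is a hypothetical callable returning the new list as text.
    Any failure (network error, bad regex, empty list) leaves the old
    cache file untouched.
    """
    try:
        text = fetch()
        if not compile_patterns(text.splitlines()):
            return False
    except Exception:  # broad on purpose: any failure keeps the old list
        return False
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        fh.write(text)
    os.replace(tmp, path)  # atomic swap on the same filesystem
    return True


def matches(patterns, hostname):
    """True if any cached pattern matches the PTR or HELO hostname."""
    return any(p.search(hostname) for p in patterns)
```

A daily cron job would call update_cache(), and the scanner would call
load_cache() at startup, so the scan path never touches the network.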

o Should this use one regex supplier, or community based?

This might be more helpful, since there are projects like Enemies List,
our own DynaRegex, and other companies or projects that might evolve out
of this.

It could also carry several different types of regex patterns, as
mentioned below, so that SA users could score some patterns differently
than others. Some patterns are safe enough to score very high, while
generic shared-webhost patterns may warrant a somewhat lower score.
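
To make the per-class scoring concrete, here is a minimal sketch (Python
again, purely illustrative). The class names, the patterns, and the score
values are all made up for the example; real classes and sensible scores
would have to come from the pattern supplier and from corpus testing:

```python
import re

# Hypothetical pattern classes; the first matching class wins.
RULES = [
    ("dynamic", re.compile(r"^(dsl|dyn|ppp)-\d+\.", re.IGNORECASE)),
    ("webhost", re.compile(r"^(web|host|vps)\d+\.", re.IGNORECASE)),
]

# Hypothetical per-class scores a site could tune independently.
SCORES = {"dynamic": 3.0, "webhost": 0.5}


def score_hostname(hostname, rules=RULES, scores=SCORES):
    """Return (class, score) for the first matching class, else (None, 0.0)."""
    for cls, pattern in rules:
        if pattern.search(hostname):
            return cls, scores.get(cls, 0.0)
    return None, 0.0
```

Keeping the class-to-score table separate from the patterns means a site
can change its scoring without waiting for a new pattern update.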

I think the regex pattern database would be an excellent candidate for
building out an SA definition updater.

> OK, sounds good. I'm really interested in seeing what the various FP
> rates would be for both the HELO and PTR for the various return values;
> I'm also interested in seeing what rates are for the different
> subclasses (as formed by the combination of A response and TXT response
> for the same lookup, so "static/cable" or "dynamic/dsl" or
> "natproxy/vpn"). Basically, I'm using these today as very blunt hammers,
> and I want to make sure I have a good sense of how to better tune the
> scoring. And you guys have such great stats, so I came to you :)
>
> > So, these are generally run against the SMTP connecting host's
> > rDNS, right?
>
> Both PTR and HELO/EHLO string, yes. We've found that PTR is a good
> indicator, but when the HELO string is a match for some EL pattern it's
> a very reliable indicator of bot activity with a very low FP rate, so we
> test both when available. Of course, this differs between the various
> types, so I wouldn't assume webhost or outmx or static PTR are
> necessarily bad, just indicative. But we'll see what the numbers
> look like after we run some tests, I suppose :)
>
> > By the way, do you mind if we conduct this conversation on a public
> > Bugzilla entry?  that's generally how we do it.  Doing that in the
> > open is also more likely to get useful info on how other hosts
> > have found the increased load from SpamAssassin lookups, too.
>
> No, not at all, though I definitely want to know how adding this to
> SA would affect our load; and give me time to throw a few more rbldnsd
> mirrors into the rotation if required. (Running lookups against the
> patterns is very fast, 75K/s here on my macbook, but once you add
> logging and DNS overhead it slows down considerably :-/)
>
> So, what next? Should we look at setting up a local rbldnsd instance
> to isolate testing from our production machines? Was the doc I sent
> a URL for in my last email sufficient to tweak whatever SA rules
> you need to test? I'm here to answer any questions you have :)
>
>
>
> Anyway, usage details are here:  http://enemieslist.com/how/use.html --
> we'd need to add some rules to do this.  I've been meaning to do this for
> several weeks(!) but things have been busy :( so here's a new ticket.

-- 
--
"Catch the Magic of Linux..."
------------------------------------------------------------------------
Michael Peddemors - President/CEO - LinuxMagic
Products, Services, Support and Development
Visit us at http://www.linuxmagic.com
------------------------------------------------------------------------
A Wizard IT Company - For More Info http://www.wizard.ca
"LinuxMagic" is a Registered TradeMark of Wizard Tower TechnoServices Ltd.
------------------------------------------------------------------------
604-589-0037 Beautiful British Columbia, Canada

This email and any electronic data contained are confidential and intended 
solely for the use of the individual or entity to which they are addressed. 
Please note that any views or opinions presented in this email are solely 
those of the author and are not intended to  represent those of the company.
