RE: SA Functional extension suggestions?

Vincent Fleming Sat, 21 Apr 2007 05:34:34 -0700

Warnocked indeed.

You're making me feel really old.  ;-)



I'm not so sure about File::Tail - "tail -f"'ing a logfile is ok for
short periods of time, but gets problematic with log rotations when you
try to tail a file for more than a day or two.  Besides, parsing the
logfile entries is needlessly memory and cpu intensive compared to using
sockets, and also not nearly as elegant. ;-)

Thanks for pointing out the plugin API. I haven't looked at 3.2.0 yet...
I guess it's time I did. :)

I was thinking that only sites that used the milter would be interested
in doing something like this, so I hadn't considered using spamd to
report scores...

That, and I haven't done much in Perl before.  I'm more of a C guy. ;)

One thought I had was to use UDP datagrams (low overhead, no connection
state or errors to handle) to report the scores to a daemon that would
track them, decide if an ipaddr needs to be blacklisted, age and remove
blacklist entries, and update the dns database.  I guess this
architecture would not have to change if the scores came from a Perl
plugin to spamd, rather than the milter...
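
A rough sketch of the datagram idea, in Python only to keep it short
(the real sender would be the milter or a spamd plugin).  The wire
format ("<ipaddr> <score>" as ASCII text) and all of the function names
are assumptions made up for illustration; nothing here comes from
spamass-milter or SpamAssassin.

```python
import ipaddress
import socket


def encode_report(ip, score):
    """Build one datagram payload: '<ipaddr> <score>' (hypothetical format)."""
    return f"{ip} {score:.1f}".encode("ascii")


def decode_report(data):
    """Parse a payload; return (ip, score), or None if it is malformed."""
    try:
        ip_text, score_text = data.decode("ascii").split()
        return str(ipaddress.ip_address(ip_text)), float(score_text)
    except (UnicodeDecodeError, ValueError):
        # Anything unparsable is dropped; the sender never hears about it.
        return None


def send_report(sock, daemon_addr, ip, score):
    """Fire-and-forget: one sendto(), no connection setup, no reply expected."""
    sock.sendto(encode_report(ip, score), daemon_addr)


def demo():
    """Send one report over loopback and read it back on the 'daemon' side."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    receiver.settimeout(2.0)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        send_report(sender, receiver.getsockname(), "192.0.2.10", 8.4)
        data, _ = receiver.recvfrom(512)
        return decode_report(data)
    finally:
        sender.close()
        receiver.close()
```

The tracking daemon would sit in a recvfrom() loop and hand each decoded
report to whatever does the scoring.  Datagrams can be lost, so this
only makes sense if a missed score now and then is acceptable.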

So, thanks again for pointing out the plugin API.  That sounds like the
way to go if datagrams are used.

I guess I should poll the users@ list to see how many people would
rather have realtime auto-blacklisting vs. a daily logparsing style.  I
like the idea of realtime because I can effectively age the entries and
delist them during the day, rather than at day's end, but I suppose
there isn't a really significant difference there in the end.

Thanks for not Warnocking me further.  ;-)


-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
Sent: Friday, April 20, 2007 1:34 PM
To: Vincent Fleming
Cc: [EMAIL PROTECTED]; [email protected]
Subject: Re: SA Functional extension suggestions? 


Vincent Fleming writes:
> Thanks for responding - I was wondering about the lack of response.

Warnocked! ;)
http://en.wikipedia.org/wiki/Warnocked

> My dilemma is that I want to add a few lines of code to spamass-milter
> so it can report spamscores in realtime.  I didn't think the users list
> was the right place for that.
> 
> What "existing interfaces" are you referring to?  I haven't seen
> anything of the sort in the code.  Are you referring to the logfiles?
> 
> Certainly, one could collect such data from logfiles, if a
> once-or-twice daily update is desired, but I'm thinking that doing this
> in realtime would be way cool.  It would be too much overhead to scan
> the logfiles every 10 minutes...

I've heard of similar systems driven from the logs -- File::Tail is
useful for this, iirc.  In addition, SpamAssassin 3.2.0 now has a plugin
API which is called with the spamd "result:" log line --
log_scan_result() -- so a SpamAssassin plugin can now take action based
on that data.

--j.

> --Vince
> 
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
> Sent: Friday, April 20, 2007 12:55 PM
> To: Vincent Fleming
> Cc: [email protected]
> Subject: Re: SA Functional extension suggestions? 
> 
> 
> Vincent --
> 
> It's a very good idea; I think however that you would get more
> interest/response on the users list.  The dev list is more oriented
> towards the internals of SpamAssassin, not developers using the
> existing interfaces. ;)
> 
> --j.
> 
> Vincent Fleming writes:
> > Hi Everyone,
> > 
> >  
> > 
> > Hi; I'm new to this list, but not to Spamassassin - I've been using
> > it for years, and thank all of you for all your efforts - it works
> > great for me.
> > 
> >  
> > 
> > I had an idea for a functional extension to SA, and thought I would
> > share it with you (I don't think the users@ list would really be
> > interested or appropriate, so please tell me if I'm bringing this up
> > with the wrong group of people).  Please don't flame me terribly if
> > I'm sending this to the wrong list.  ;-)
> > 
> >  
> > 
> > So, here's my idea - an option to feed spam scores from SA into a
> > local blacklist.
> > 
> >  
> > 
> > Here's where I got this idea from:
> > 
> >  
> > 
> > I've been using DNSBLs (njabl.org, spamcop.net, spamhaus.org, and
> > sorbs) from Sendmail to reduce the load on SA.
> > 
> >  
> > 
> > Although the DNSBLs reject a lot of connections, many sites are
> > still not listed in the blacklists.  I have SA reject anything
> > scoring over 7, but SA's been pretty busy...
> > 
> >  
> > 
> > Looking at where the spam is coming from, I see it's a relatively
> > small number of sites/subnets (70, currently).  It seems that the
> > spammers are just moving their servers to different IP addrs quicker
> > than the DNSBLs can keep up with them.  (After all, most DNSBLs do
> > try to verify the spam sources.)
> > 
> >  
> > 
> > So, blacklisting sites based on my own past experience (well, SA's
> > experience) seemed like a good idea.  It's merging the two forms of
> > anti-spam - blacklisting and content-filtering, and using the two to
> > augment each other.
> > 
> >  
> > 
> >  
> > 
> >  
> > 
> > So, as an experiment, I added blacklist functionality to
> > spamass-milter.  (I know, I know, but please read on ;-)
> > 
> >  
> > 
> > I must say - it's been working *very* well.  SA is experiencing a
> > 90% reduction in workload, and it hasn't blacklisted a ham site yet.
> > 
> >  
> > 
> > Here's what I did:  I decided to track spam scores (a running total)
> > and a timestamp (of the last spam detection).  If an ipaddr's
> > spamscore gets over a certain number (I picked 20), I reject
> > connections in mlfi_connect().  I implemented an auto-delisting by
> > deducting 1 point per day, so they won't stay on the blacklist
> > forever, and then track the number of times I delist them.  I weight
> > their scores thereafter with the number of times they've been
> > delisted, so they'll re-list automatically if they continue to send
> > spam, and list for longer each time. (I multiply the spamscore of all
> > new messages by the number of times I've delisted them.)
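
The bookkeeping described above might look roughly like the sketch
below.  It is not the actual spamass-milter patch.  The class and field
names are invented, the timestamp is used as "last time the total was
aged" rather than strictly "last spam detection", an address is assumed
to delist once its total decays back to the threshold, and a weight of 1
is assumed for addresses that have never been delisted.

```python
from dataclasses import dataclass

THRESHOLD = 20.0          # running total above which an address is listed
DECAY_PER_DAY = 1.0       # points deducted per day
SECONDS_PER_DAY = 86400.0


@dataclass
class Entry:
    total: float = 0.0        # running spam score
    last_update: float = 0.0  # when the total was last aged (epoch seconds)
    listed: bool = False
    delist_count: int = 0     # how many times this address was delisted


class ScoreTracker:
    def __init__(self):
        self.entries = {}

    def _age(self, entry, now):
        """Apply the per-day decay and delist if the total has dropped enough."""
        days = max(0.0, now - entry.last_update) / SECONDS_PER_DAY
        entry.total = max(0.0, entry.total - DECAY_PER_DAY * days)
        entry.last_update = now
        if entry.listed and entry.total <= THRESHOLD:
            entry.listed = False
            entry.delist_count += 1

    def report_spam(self, ip, score, now):
        """Add one message's score, weighted by the number of past delistings."""
        entry = self.entries.setdefault(ip, Entry(last_update=now))
        self._age(entry, now)
        weight = max(1, entry.delist_count)  # assumption: weight 1 before any delisting
        entry.total += score * weight
        if entry.total > THRESHOLD:
            entry.listed = True

    def is_blacklisted(self, ip, now):
        """The check a connect-time hook such as mlfi_connect() would make."""
        entry = self.entries.get(ip)
        if entry is None:
            return False
        self._age(entry, now)
        return entry.listed
```

Read literally, multiplying by the delist count changes nothing after
the first delisting (the weight is still 1); the longer listings only
start from the second delisting onward.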
> > 
> >  
> > 
> > So, it "learns" - at least to some limited degree.
> > 
> >  
> > 
> > Anyway, I digress...
> > 
> >  
> > 
> >  
> > 
> >  
> > 
> > I know the architecture I've implemented is not appropriate for
> > larger sites (I run a particularly small site), but it was a good
> > exercise nonetheless.
> > 
> >  
> > 
> > After a little research on how DNSBLs work, I think it would be
> > reasonable to scale this by integrating it with rbldnsd somehow.  If
> > I can collect scores from SA in realtime (via spamass-milter?), and
> > add blacklist entries to rbldnsd by creating a new "local" dataset
> > (ie: ip4set:local), that might work.
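
One way the "local" dataset could be fed is to regenerate its file
whenever the blacklist changes, as in the sketch below.  This assumes
the usual ip4set text layout as I understand it (a default value line
starting with ":", then one address per line) and assumes rbldnsd picks
up the changed file on its next periodic check; both should be verified
against the rbldnsd man page.  The function names, the reason text and
the return value 127.0.0.2 are illustrative choices.

```python
import ipaddress
import os
import tempfile


def render_ip4set(addresses, reason="Listed by local SpamAssassin scores"):
    """Render an ip4set-style dataset: a default value line, then one IP per line."""
    lines = [
        "# local dataset, regenerated automatically",
        f":127.0.0.2:{reason}",  # default A value and TXT for every entry
    ]
    # Parsing through IPv4Address rejects bad input and gives a numeric sort.
    for ip in sorted(ipaddress.IPv4Address(a) for a in addresses):
        lines.append(str(ip))
    return "\n".join(lines) + "\n"


def write_dataset(path, addresses):
    """Write the dataset to a temp file, then rename it over the old one."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write(render_ip4set(addresses))
    os.chmod(tmp_path, 0o644)   # mkstemp creates the file as 0600
    os.replace(tmp_path, path)  # atomic rename on the same filesystem
```

The rename means a reader never sees a half-written file.  Aged-out
addresses disappear simply by being left out of the next regeneration.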
> > 
> >  
> > 
> > I think this will scale well, as larger sites are probably running
> > rbldnsd already (ie: rsync'ed databases from njabl.org and/or
> > spamhaus.org), and this would merely extend the namespace.
> > 
> >  
> > 
> >  
> > 
> >  
> > 
> > My question is, do you people think this is a good idea, and if so,
> > I would like to discuss topics like how to get scores from SA,
> > overall architecture, more elaborate logic of when to locally
> > blacklist, aging, etc.
> > 
> >  
> > 
> > Thoughts?
> > 
> >  
> > 
> > Hey - thanks for listening.  I look forward to comments.
> > 
> > 
> > Regards,
> > 
> >  
> > 
> > Vince Fleming
> > 
> > HOME: [EMAIL PROTECTED] 
> > 
> > WORK: [EMAIL PROTECTED]
