Warnocked indeed. You're making me feel really old. ;-)
I'm not so sure about File::Tail - "tail -f"'ing a logfile is ok for short periods of time, but gets problematic with log rotations when you try to tail a file for more than a day or two. Besides, parsing the logfile entries is needlessly memory and cpu intensive compared to using sockets, and also not nearly as elegant. ;-)

Thanks for pointing out the plugin API. I haven't looked at 3.2.0 yet... I guess it's time I did. :)

I was thinking that only sites that used the milter would be interested in doing something like this, so I hadn't considered using spamd to report scores... That, and I haven't done much in Perl before. I'm more of a C guy. ;)

One thought I had was to use UDP datagrams (low overhead, no errors to handle) to report the scores to a daemon that would track them, decide if an ipaddr needs to be blacklisted, age and remove blacklist entries, and update the DNS database. I guess this architecture would not have to change if the scores came from a Perl plugin to spamd, rather than the milter...

So, thanks again for pointing out the plugin API. That sounds like the way to go if datagrams are used.

I guess I should poll the users@ list to see how many people would rather have realtime auto-blacklisting vs. a daily logparsing style. I like the idea of realtime because I can effectively age the entries and delist them during the day, rather than at day's end, but I suppose there isn't a really significant difference there in the end.

Thanks for not Warnocking me further. ;-)

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Friday, April 20, 2007 1:34 PM
To: Vincent Fleming
Cc: [EMAIL PROTECTED]; [email protected]
Subject: Re: SA Functional extension suggestions?

Vincent Fleming writes:
> Thanks for responding - I was wondering about the lack of response.

Warnocked!
;) http://en.wikipedia.org/wiki/Warnocked

> My dilemma is that I want to add a few lines of code to spamass-milter
> so it can report spamscores in realtime. I didn't think the users list
> was the right place for that.
>
> What "existing interfaces" are you referring to? I haven't seen
> anything of the sort in the code. Are you referring to the logfiles?
>
> Certainly, one could collect such data from logfiles, if a once-or-twice
> daily update is desired, but I'm thinking that doing this in realtime
> would be way cool. It would be too much overhead to scan the logfiles
> every 10 minutes...

I've heard of similar systems driven from the logs -- File::Tail is useful for this, iirc.

In addition, SpamAssassin 3.2.0 now has a plugin API which is called with the spamd "result:" log line -- log_scan_result() -- so a SpamAssassin plugin can now take action based on that data.

--j.

> --Vince
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> Sent: Friday, April 20, 2007 12:55 PM
> To: Vincent Fleming
> Cc: [email protected]
> Subject: Re: SA Functional extension suggestions?
>
>
> Vincent --
>
> It's a very good idea; I think however that you would get more
> interest/response on the users list. the dev list is more oriented
> towards the internals of SpamAssassin, not developers using the
> existing interfaces. ;)
>
> --j.
>
> Vincent Fleming writes:
> > Hi Everyone,
> >
> > Hi; I'm new to this list, but not to Spamassassin - I've been using it
> > for years, and thank all of you for all your efforts - it works great
> > for me.
> >
> > I had an idea for a functional extension to SA, and thought I would
> > share it with you (I don't think the users@ list would really be
> > interested or appropriate, so please tell me if I'm bringing this up
> > with the wrong group of people). Please don't flame me terribly if I'm
> > sending this to the wrong list.
> > ;-)
> >
> > So, here's my idea - an option to feed spam scores from SA into a local
> > blacklist.
> >
> > Here's where I got this idea from:
> >
> > I've been using DNSBLs (njabl.org, spamcop.net, spamhaus.org, and sorbs)
> > from Sendmail to reduce the load on SA.
> >
> > Although the DNSBLs reject a lot of connections, many sites are still
> > not listed in the blacklists. I have SA reject anything scoring over 7,
> > but SA's been pretty busy...
> >
> > Looking at where the spam is coming from, I see it's a relatively small
> > number of sites/subnets (70, currently). It seems that the spammers are
> > just moving their IP servers to different IP addrs quicker than the
> > DNSBLs can keep up with them. (After all, most DNSBLs do try to verify
> > the spam sources).
> >
> > So, blacklisting sites based on my own past experience (well, SA's
> > experience) seemed like a good idea. It's merging the two forms of
> > anti-spam - blacklisting and content-filtering, and using the two to
> > augment each other.
> >
> > So, as an experiment, I added blacklist functionality to spamass-milter.
> > (I know, I know, but please read on ;-)
> >
> > I must say - it's been working *very* well. SA is experiencing a 90%
> > reduction in workload, and it hasn't blacklisted a ham site yet.
> >
> > Here's what I did: I decided to track spam scores (a running total) and
> > a timestamp (of the last spam detection). If an ipaddr's spamscore gets
> > over a certain number (I picked 20), I reject connections in
> > mlfi_connect(). I implemented an auto-delisting by deducting 1 point
> > per day, so they won't stay on the blacklist forever, and then track the
> > number of times I delist them. I weight their scores thereafter with
> > the number of times they've been delisted, so they'll re-list
> > automatically if they continue to send spam, and list for longer each
> > time.
> > (I multiply the spamscore of all new messages by the number of
> > times I've delisted them.)
> >
> > So, it "learns" - at least to some limited degree.
> >
> > Anyway, I digress...
> >
> > I know the architecture I've implemented is not appropriate for larger
> > sites (I run a particularly small site), but it was a good exercise
> > nonetheless.
> >
> > After a little research on how DNSBLs work, I think it would be
> > reasonable to scale this by integrating it with rbldnsd somehow. If I
> > can collect scores from SA in realtime (via spamass-milter?), and add
> > blacklist entries to a rbldnsd via creating a new "local" dataset (ie:
> > ip4set:local), that might work.
> >
> > I think this will scale well, as larger sites are probably running
> > rbldnsd already (ie: rsync'ed databases from njabl.org and/or
> > spamhaus.net), and this would merely extend the namespace.
> >
> > My question is, do you people think this is a good idea, and if so, I
> > would like to discuss topics like how to get scores from SA, overall
> > architecture, more elaborate logic of when to locally blacklist, aging,
> > etc.
> >
> > Thoughts?
> >
> > Hey - thanks for listening. I look forward to comments.
> >
> > Regards,
> >
> > Vince Fleming
> >
> > HOME: [EMAIL PROTECTED]
> > WORK: [EMAIL PROTECTED]
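
For anyone following the thread, the listing/delisting logic described in the original message (running total per ipaddr, list above 20, deduct 1 point per day, weight new scores by the delist count) might be sketched roughly as below. This is only an illustration in Python, not the code that went into spamass-milter. The names LocalBlacklist, Entry, report() and is_blocked() are made up for the example. Two details are assumptions: the weight is taken as 1 until an address has been delisted at least once, and the stored timestamp is the last time the score was aged rather than strictly the last spam detection.

```python
from dataclasses import dataclass

THRESHOLD = 20.0       # list an address once its running total exceeds this
DECAY_PER_DAY = 1.0    # points deducted per day for auto-delisting
DAY = 86400.0          # seconds per day


@dataclass
class Entry:
    score: float = 0.0    # running total of reported spam scores
    updated: float = 0.0  # time the score was last aged (assumption, see lead-in)
    delisted: int = 0     # how many times this address has been delisted
    listed: bool = False


class LocalBlacklist:
    """Hypothetical in-memory tracker; a real daemon would persist this."""

    def __init__(self):
        self.entries = {}

    def _age(self, entry, now):
        # Deduct DECAY_PER_DAY for the time elapsed since the last update.
        days = max(0.0, now - entry.updated) / DAY
        entry.score = max(0.0, entry.score - days * DECAY_PER_DAY)
        entry.updated = now
        # Delist when the aged score has dropped back to the threshold or below.
        if entry.listed and entry.score <= THRESHOLD:
            entry.listed = False
            entry.delisted += 1

    def report(self, ip, spam_score, now):
        """Record the score of a message that SA flagged as spam."""
        entry = self.entries.get(ip)
        if entry is None:
            entry = self.entries[ip] = Entry(updated=now)
        self._age(entry, now)
        # Assumption: weight is 1 until the first delisting, then the delist count.
        weight = max(1, entry.delisted)
        entry.score += spam_score * weight
        if entry.score > THRESHOLD:
            entry.listed = True

    def is_blocked(self, ip, now):
        """Would be consulted at connect time, e.g. from mlfi_connect()."""
        entry = self.entries.get(ip)
        if entry is None:
            return False
        self._age(entry, now)
        return entry.listed


if __name__ == "__main__":
    bl = LocalBlacklist()
    for _ in range(3):
        bl.report("192.0.2.1", 8.0, now=0.0)    # three spams scoring 8 each
    print(bl.is_blocked("192.0.2.1", now=0.0))       # listed right away
    print(bl.is_blocked("192.0.2.1", now=5 * DAY))   # aged off after five days
```

Times are passed in explicitly (seconds) so the aging can be exercised without waiting a day. The same object could be fed from a milter or from a spamd plugin, since it only needs an address, a score and a timestamp.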
