On 6/17/2014 10:49 PM, Philip Prindeville wrote:
I’ve contributed fixes to Apache itself since 1997 (though not with any regularity), but can’t remember if I’ve ever had to furnish a CLA or not.
Of course. Small fixes aren't substantial enough to require a CLA, but having a CLA on file is a great first step to getting karma in the meritocracy that is the ASF. If you had a CLA, your name SHOULD be on this list: http://people.apache.org/committer-index.html#unlistedclas
Sure, opening a bug is fine.
Thanks.
As to your last questions: local blacklisting is for someone who doesn’t need the complexity of a DNSBL, doesn’t want its wide scope, doesn’t want to have to configure one, or perhaps just wants a significantly more precise tool to solve a very limited problem.
This is great. I would put ALL of this in the pm file so the perldoc includes it.
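For reference, a minimal sketch of the sort of POD block that could sit near the top of the .pm so perldoc picks it up (the module name and wording here are illustrative placeholders, not the actual module's documentation):

=head1 NAME

Mail::SpamAssassin::Plugin::URILocalBL - blacklist URIs by CIDR block, country code or ISP

=head1 DESCRIPTION

Describe here why a local URI blacklist is useful when a DNSBL is too
broad or too heavyweight, and include the uri_block_cidr / uri_block_cc
configuration examples so they show up in perldoc.

=cut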
As an example, we were recently hit by a volley of SPAM from a variety of mail relays, but they all had something in common. All of them contained HTML with URL’s pointing to websites hosted by “Solar VPS”, and in particular on the subnet 65.181.64.0/18 (in some cases, the web hosts had additional A records on the subnet 192.99.0.0/16).

It took a couple of hours to get URIDNSBL configured, scored appropriately, and working… and verifying that the ill-behaved hosts had corresponding entries in multi.uribl.com without prior understanding of the record encoding also took some time (since the use of DNS RR’s is an overloading of their intended use, it’s less than intuitive). When it was all over, it occurred to me that a trivial configuration like:

  uri_block_cidr L_BLOCK_CIDR 65.181.64.0/18 192.99.0.0/16
  body L_BLOCK_CIDR eval:check_uri_local_bl("L_BLOCK_CIDR")
  describe L_BLOCK_CIDR Block URI's pointing to bad CIDR's
  score L_BLOCK_CIDR 5.20

would be a much more pinpoint fix for my issue than the overly generalized approach of using multi.uribl.com. And I didn’t want to score everyone that was in that DNSBL, just the particular subnets.

After that, it occurred to me that I had never seen a legitimate email with a URL pointing to Vietnam or Nigeria in my life, and it would be nice to restrict those as well. So the plugin later evolved to:

  uri_block_cc L_BLOCK_CC cn vn ro bg ru ng eg
  body L_BLOCK_CC eval:check_uri_local_bl("L_BLOCK_CC")
  describe L_BLOCK_CC Block URI's pointing to countries with no CERT or anti-SPAM laws
  score L_BLOCK_CC 5.65

In the case of the 65.181.0.0/16 SPAM which provided this call to action, here are some subject lines you might recognize:

  News alert: you could apply for a CNA education program
  Wireless Internet plans online
  You've Been Accepted into the Who's Who
  Don't overpay for a phone. Try a free* one today
  Is your home missing something? How about custom blinds?
  Could you study at a CNA education program?
  cable service is a possibility

etc. All within a 6-hour spam run.

Looking at some recent traffic on the SpamAssassin users mailing list, it seemed that other people had had a similar idea at the same time to provide surgical blacklisting locally.

At this point, I’m thinking of adding whitelisting support to the country, ISP, and CIDR blacklists. For example, we’ve had issues getting ServerBeach to be proactive about Spam or even acknowledge complaints in a timely fashion; that said, we get legitimate traffic with URL’s pointing to a Fedora Project resource hosted on one of their networks. So we couldn’t blacklist that entire ISP without “punching a hole” for Fedora build reports. The whitelisting would take individual IP addresses and/or host names as they appear in the URL’s.

Hope that answers your questions.

On Jun 17, 2014, at 9:24 AM, Kevin A. McGrail <[email protected]> wrote:

Philip,

Do you have a CLA with the ASF? From checking, I don't believe so. Can you please take a look at http://wiki.apache.org/spamassassin/AboutClas

What might help you is that since this is a plugin, we could open a bug, add it to trunk, etc. for people to more readily test it. It wouldn't be enabled by default but should allow more people to readily implement it and provide feedback.

However, for me, I know I am curious if you could do a bit more description in the .pm on why this is good to implement, what type of spam you use it to block, etc.

Regards,
KAM

On 6/15/2014 10:47 PM, Philip Prindeville wrote:

Here’s a first attempt at a module. I based it on Plugin::URIDetail.
It depends on Net::CIDR::Lite and Geo::IP. If it detects a valid (though not necessarily current) ISP database, it will publish a handler for that. Same with the IP-Lite (or licensed IP) database from MaxMind. We’ve been using the MaxMind database for a couple of years on a commercial project with good success.

Currently the filtering is done by country code, ISP name, and explicit CIDR blocks. The last test is the least costly, but also the most fine-grained… you can configure rules to run in whichever order suits your needs best. I personally sort by country (cn ru bg vn ro ng ir), then by ISP (won’t name them here, but one of them is Over tHere in France), and lastly by CIDR block.

The only real wart on these plugins is that they all index their databases by IP address, and do their own (implicit or explicit) name-to-IP mapping. Obviously, this is both blocking and repetitive. Not sure why PerMsgStatus.pm can’t do the asynchronous name lookups when get_uri_detail_list() runs so we have that handy for each of the plugins. If I had the mappings already available, I’d definitely use them. That is, instead of having:

  hosts => { 'nqtel.com' => 'nqtel.com' }

why not instead have:

  hosts => { 'nqtel.com' => [ '107.158.259.74' ] }

or even both, e.g.

  [ 'nqtel.com', '107.158.259.74' ]

(i.e. the domain at index 0 followed by the list of A records).

One other shortcoming I noticed was the somewhat limited list of error returns, such as MISSING_REQUIRED_VALUE, INVALID_VALUE, INVALID_HEADER_FIELD_NAME… what about MISSING_DEPENDENCY or MISSING_RESOURCE? What if we want to filter on Geo::IP’s ISP database, but the database isn’t present?

I don’t do a lot of volume (maybe 10 messages per second peak), so doing blocking lookups isn’t a problem. But obviously this might be an issue for some high-volume sites.

Feedback is welcome.

-Philip
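To make the mechanics above concrete, here is a minimal sketch (not the posted module) of how a body eval rule like check_uri_local_bl() could take the hosts from get_uri_detail_list(), test them against CIDR blocks with Net::CIDR::Lite, and against country codes with Geo::IP. The package name, hard-coded rule data, GeoIP database path, and the blocking gethostbyname() lookup are illustrative assumptions; the actual plugin would parse its uri_block_* config lines and may differ in detail.

  package Mail::SpamAssassin::Plugin::URILocalBL_Sketch;

  use strict;
  use warnings;

  use Mail::SpamAssassin::Plugin;
  use Net::CIDR::Lite;
  use Geo::IP;
  use Socket qw(inet_ntoa);

  our @ISA = qw(Mail::SpamAssassin::Plugin);

  sub new {
    my ($class, $mailsa) = @_;
    my $self = $class->SUPER::new($mailsa);
    bless($self, $class);

    # Hard-coded here for brevity; a real plugin would parse the
    # uri_block_cidr / uri_block_cc lines in its parse_config() hook.
    $self->{cidr} = Net::CIDR::Lite->new;
    $self->{cidr}->add('65.181.64.0/18');
    $self->{cidr}->add('192.99.0.0/16');
    $self->{cc} = { map { $_ => 1 } qw(cn vn ro bg ru ng eg) };

    # Country database; skip the country test if it can't be opened.
    $self->{gi} = eval { Geo::IP->open('/usr/share/GeoIP/GeoIP.dat', GEOIP_STANDARD) };

    $self->register_eval_rule('check_uri_local_bl');
    return $self;
  }

  sub check_uri_local_bl {
    my ($self, $pms, @rest) = @_;   # body eval: extra args (body ref, rule arg) unused here

    my $uris = $pms->get_uri_detail_list();
    foreach my $uri (keys %{$uris}) {
      my $hosts = $uris->{$uri}->{hosts} || {};
      foreach my $host (keys %{$hosts}) {
        # Blocking forward lookup -- exactly the wart described above.
        my $packed = gethostbyname($host) or next;
        my $ip = inet_ntoa($packed);

        # Explicit CIDR blocks: cheapest and most fine-grained test.
        return 1 if $self->{cidr}->find($ip);

        # Country-code test via GeoIP, if the database was available.
        if ($self->{gi}) {
          my $cc = lc($self->{gi}->country_code_by_addr($ip) || '');
          return 1 if $cc and $self->{cc}->{$cc};
        }
      }
    }
    return 0;
  }

  1;

With rules like the uri_block_cidr / uri_block_cc examples quoted above, a hit from either test would fire the corresponding L_BLOCK_* rule; resolving each host inline is what makes the lookups blocking and repetitive, which is why having get_uri_detail_list() hand back already-resolved A records would help.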
--
Kevin A. McGrail
President
Peregrine Computer Consultants Corporation
3927 Old Lee Highway, Suite 102-C
Fairfax, VA 22030-2422
http://www.pccc.com/
703-359-9700 x50 / 800-823-8402 (Toll-Free)
703-359-8451 (fax)
[email protected]
