Re: RCVD_VIA_APNIC: CIDR to regex generator?

2009-10-02 Thread Steven W. Orr
On 10/02/09 02:43, quoth Warren Togami:
> # 2005/07/29, http://www.apnic.net/db/ranges.html
> header   RCVD_VIA_APNIC Received =~ /[^0-9.](?:5[89]|6[01]|12[456]|20[23]|21[0189]|22[012])(?:\.[012]?[0-9]{1,2}){3}(?:\]|\)| )/
> describe RCVD_VIA_APNIC Received through a relay in Asia/Pacific Network
> 
> Adam Katz had this rule in one of his channels.  While it is wholly unsafe
> to be used alone, it could be useful in masscheck statistics and possibly
> if used in meta booleans in combination with other rules.
> 
> http://www.apnic.net/publications/research-and-insights/ip-address-trends/apnic-resource-range
> 
> 
> Unfortunately, in testing the above rule on my own corpus I see it is 
> missing some obvious Asian addresses.  This page reveals that the regex is
> out of date.  Does there exist a good automated way to convert many CIDR
> ranges to a single regex?
> 
> Warren Togami wtog...@redhat.com

http://www.brandonhutchinson.com/CIDR_netmasks_with_sendmail.html

-- 
Time flies like the wind. Fruit flies like a banana. Stranger things have  .0.
happened but none stranger than this. Does your driver's license say Organ ..0
Donor?Black holes are where God divided by zero. Listen to me! We are all- 000
individuals! What if this weren't a hypothetical question?
steveo at syslang.net





Re: [SA] RE RCVD_VIA_APNIC

2009-10-02 Thread Adam Katz
Warren Togami wrote:
> # 2005/07/29, http://www.apnic.net/db/ranges.html
> header   RCVD_VIA_APNIC Received =~ /[^0-9.](?:5[89]|6[01]|12[456]|20[23]|21[0189]|22[012])(?:\.[012]?[0-9]{1,2}){3}(?:\]|\)| )/
> describe RCVD_VIA_APNIC Received through a relay in Asia/Pacific Network

> Adam Katz had this rule in one of his channels. While it is wholly 
> unsafe to be used alone, it could be useful in masscheck statistics
> and possibly if used in meta booleans in combination with other
> rules.
> 
> Unfortunately, in testing the above rule on my own corpus I see it
> is missing some obvious Asian addresses. This page reveals that the
> regex is out of date. Does there exist a good automated way to
> convert many CIDR ranges to a single regex?

Hm.  I didn't know that APNIC's space was updated that often.  I'll
adjust my rule.  Also, though I didn't respond when you approached me
on IRC (we're on vastly different schedules), I did change the rule to
make it safer: it now checks against trusted networks and DNS
whitelists, and it is scored at 0.001.

__RCVD_VIA_APNIC will soon be updated to a monster constructed from a
hand-tweaked copy of the table at http://www.apnic.net/db/ranges.html
and fed into Regexp::Assemble (post-tweaked perl code is attached).

The attached apnic.cf.txt file (named so as to better appear in your
mail reader) is a sample of the pending latest revision in khop-bl.

As to its "missing some obvious Asian addresses": I believe that is
because many Asian addresses are outside the jurisdiction of APNIC.
For example, I believe Japan has three /8 networks (43, 126, and 133)
independent of APNIC, and that's just from eyeing the XKCD map of the
IPv4 space!
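
A quick way to see which addresses a rule like this catches is to run it over a few sample Received fragments. The sketch below uses the 2005 pattern exactly as quoted earlier in the thread. The sample lines and the helper name old_rule_hits are made up for illustration; the first octets 116 and 175 are taken from the /8 list in the generator script, and 192.0.2.1 is a documentation address.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The 2005 rule as quoted in the thread.
my $old = qr/[^0-9.](?:5[89]|6[01]|12[456]|20[23]|21[0189]|22[012])(?:\.[012]?[0-9]{1,2}){3}(?:\]|\)| )/;

# Hypothetical helper: true if the 2005 pattern matches the fragment.
sub old_rule_hits {
    my ($line) = @_;
    return $line =~ $old ? 1 : 0;
}

# Hypothetical Received fragments, not taken from any real corpus.
my @samples = (
    'from a.example ([61.10.20.30])',
    'from b.example ([116.10.20.30])',
    'from c.example ([175.10.20.30])',
    'from d.example ([192.0.2.1])',
);

for my $line (@samples) {
    printf "%-34s %s\n", $line, old_rule_hits($line) ? 'hit' : 'miss';
}
```

Only the 61.x sample is a hit; 116.x and 175.x are in the newer list but not in the 2005 pattern, which is the kind of gap a stale range table produces.
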
# 2009/10/02 from http://www.apnic.net/db/ranges.html   meta bits added 20090930
header __RCVD_VIA_APNIC Received =~ /(?-xism:[^0-9.](?:2(?:0(?:2\.(?:1(?:2(?:3\.(?:0?(?:[4-9][0-9]|3[2-9])|[12][0-9]{2})\.[012]?[0-9]{1,2}|[0-24-9](?:\.[012]?[0-9]{1,2}){2})|[013-9][0-9](?:\.[012]?[0-9]{1,2}){2})|[02]?[0-9]{1,2}(?:\.[012]?[0-9]{1,2}){2})|3(?:\.[012]?[0-9]{1,2}){3})|(?:1[0189]|2[012])(?:\.[012]?[0-9]{1,2}){3})|1(?:(?:2[0123456]|8[023]|1\d|75)(?:\.[012]?[0-9]{1,2}){3}|69\.2(?:1[0-9]|2[0-3]|0[89])(?:\.[012]?[0-9]{1,2}){2})|(?:5[89]|6[01])(?:\.[012]?[0-9]{1,2}){3})[\]\)\s])/
meta     RCVD_VIA_APNIC __RCVD_VIA_APNIC && !__KHOP_DNSWLD && !ALL_TRUSTED
describe RCVD_VIA_APNIC Received through a relay in Asia/Pacific Network
tflags   RCVD_VIA_APNIC noautolearn
#score   RCVD_VIA_APNIC 0.4 0.2 0.7 0.5 # lowered for autolearn BLs
score    RCVD_VIA_APNIC 0.001 # 20090930: not suitable for blanket publication

meta __KHOP_DNSWLD  RCVD_IN_DNSWL_LOW || RCVD_IN_DNSWL_MED || RCVD_IN_DNSWL_HI || RCVD_IN_JMF_W || RCVD_IN_BSP_TRUSTED || RCVD_IN_IADB_DOPTIN || RCVD_IN_IADB_ML_DOPTIN || RCVD_IN_IADB_VOUCHED || RCVD_IN_SSC_TRUSTED_COI

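#!/usr/bin/perl
use strict;
use warnings;

use Regexp::Assemble;

my $ra = Regexp::Assemble->new;
my $start = '[^0-9.]';   # the address must not continue a longer number
my $end = '[\]\)\s]';    # closing bracket, closing paren, or whitespace
my $cidr8tail = '(?:\.[012]?[0-9]{1,2}){3}' . $end;   # last three octets of a /8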

$ra->add($start . '58' . $cidr8tail);
$ra->add($start . '59' . $cidr8tail);
$ra->add($start . '60' . $cidr8tail);
$ra->add($start . '61' . $cidr8tail);
$ra->add($start . '110' . $cidr8tail);
$ra->add($start . '111' . $cidr8tail);
$ra->add($start . '112' . $cidr8tail);
$ra->add($start . '113' . $cidr8tail);
$ra->add($start . '114' . $cidr8tail);
$ra->add($start . '115' . $cidr8tail);
$ra->add($start . '116' . $cidr8tail);
$ra->add($start . '117' . $cidr8tail);
$ra->add($start . '118' . $cidr8tail);
$ra->add($start . '119' . $cidr8tail);
$ra->add($start . '120' . $cidr8tail);
$ra->add($start . '121' . $cidr8tail);
$ra->add($start . '122' . $cidr8tail);
$ra->add($start . '123' . $cidr8tail);
$ra->add($start . '124' . $cidr8tail);
$ra->add($start . '125' . $cidr8tail);
$ra->add($start . '126' . $cidr8tail);
# 169.208.0.0 through 169.223.255.255
$ra->add($start . '169\.20[89](?:\.[012]?[0-9]{1,2}){2}' . $end);
$ra->add($start . '169\.21[0-9](?:\.[012]?[0-9]{1,2}){2}' . $end);
$ra->add($start . '169\.22[0-3](?:\.[012]?[0-9]{1,2}){2}' . $end);
$ra->add($start . '175' . $cidr8tail);
$ra->add($start . '180' . $cidr8tail);
$ra->add($start . '182' . $cidr8tail);
$ra->add($start . '183' . $cidr8tail);
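# 202/8 except 202.123.0.x through 202.123.31.x, which the patterns below leave out
$ra->add($start . '202\.[02]?[0-9]{1,2}(?:\.[012]?[0-9]{1,2}){2}' . $end);   # second octet 0-99, 200+
$ra->add($start . '202\.12[0-24-9](?:\.[012]?[0-9]{1,2}){2}' . $end);        # 120-129 except 123
$ra->add($start . '202\.1[013-9][0-9](?:\.[012]?[0-9]{1,2}){2}' . $end);     # 100-119, 130-199
$ra->add($start . '202\.123\.[12][0-9]{2}\.[012]?[0-9]{1,2}' . $end);        # 202.123.100+
$ra->add($start . '202\.123\.0?[4-9][0-9]\.[012]?[0-9]{1,2}' . $end);        # 202.123.40-99
$ra->add($start . '202\.123\.0?3[2-9]\.[012]?[0-9]{1,2}' . $end);            # 202.123.32-39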
$ra->add($start . '203' . $cidr8tail);
$ra->add($start . '210' . $cidr8tail);
$ra->add($start . '211' . $cidr8tail);
$ra->add($start . '218' . $cidr8tail);
$ra->add($start . '219' . $cidr8tail);
$ra->add($start . '220' . $cidr8tail);
$ra->add($start . '221' . $cidr8tail);
$ra->add($start . '222' . $cidr8tail);

print "header __RCVD_VIA_APNIC\tReceived =~ /" . $ra->re . "/\n";
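
On the original question of turning CIDR ranges into a regex automatically: the hand-written add() lines above could in principle be generated. What follows is only a minimal sketch using core Perl, not the method used for the khop-bl rule. The function name cidr_to_regex and the three sample blocks are assumptions chosen for illustration (they are not a current APNIC list), and the octet where the prefix ends is simply enumerated as an alternation, which is crude but easy to check. The resulting fragments could also be fed to Regexp::Assemble to compact them.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Any octet 0-255, without leading zeros.
my $OCTET = '(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])';

# Turn one IPv4 CIDR block (e.g. "169.208.0.0/12") into a regex fragment
# that matches every dotted-quad address inside the block.
sub cidr_to_regex {
    my ($cidr) = @_;
    my ($addr, $len) = split m{/}, $cidr;
    my @octets = split /\./, $addr;
    die "bad CIDR: $cidr\n"
        unless @octets == 4 && defined $len && $len =~ /^\d+$/ && $len <= 32;

    my @parts;
    for my $i (0 .. 3) {
        my $bits = $len - 8 * $i;          # prefix bits that fall in this octet
        if ($bits >= 8) {                  # octet fully fixed by the prefix
            push @parts, $octets[$i];
        }
        elsif ($bits <= 0) {               # octet entirely free
            push @parts, $OCTET;
        }
        else {                             # prefix ends inside this octet
            my $size = 2 ** (8 - $bits);   # number of values the octet may take
            my $lo   = $octets[$i] - ($octets[$i] % $size);
            push @parts, '(?:' . join('|', $lo .. $lo + $size - 1) . ')';
        }
    }
    return join '\.', @parts;
}

# Hypothetical input; real ranges would come from the APNIC table.
my @cidrs = ('58.0.0.0/7', '169.208.0.0/12', '202.123.32.0/19');

# Same framing as the rule in this thread: no digit or dot before the
# address, and a closing bracket, paren, or whitespace after it.
my $pattern = '[^0-9.](?:' . join('|', map { cidr_to_regex($_) } @cidrs) . ')[\]\)\s]';
my $re = qr/$pattern/;

print "header __RCVD_VIA_APNIC\tReceived =~ /$pattern/\n";
```

Enumerating the boundary octet keeps the logic obviously correct at the cost of a longer pattern; a range-to-regex step would shorten it but is easier to get wrong.
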

RE RCVD_VIA_APNIC

2009-10-02 Thread hamann . w

>> Warren Togami wrote:
>> # 2005/07/29, http://www.apnic.net/db/ranges.html
>> header   RCVD_VIA_APNIC Received =~ /[^0-9.](?:5[89]|6[01]|12[456]|20[23]|21[0189]|22[012])(?:\.[012]?[0-9]{1,2}){3}(?:\]|\)| )/
>> describe RCVD_VIA_APNIC Received through a relay in Asia/Pacific Network

>> Adam Katz had this rule in one of his channels.  While it is wholly 
>> unsafe to be used alone, it could be useful in masscheck statistics and 
>> possibly if used in meta booleans in combination with other rules.
>>
>> http://www.apnic.net/publications/research-and-insights/ip-address-trends/apnic-resource-range
>> Unfortunately, in testing the above rule on my own corpus I see it is 
>> missing some obvious Asian addresses.  This page reveals that the regex 
>> is out of date.  Does there exist a good automated way to convert many 
>> CIDR ranges to a single regex?
>> 
>> Warren Togami

Hi Warren,

I am using the geoIP database in a similar context, but rather than
converting it to a regex, I convert it to a cdb file and do a lookup on that.
To integrate this with SpamAssassin, a Perl cdb module would be needed.

More info about cdb is available at http://cr.yp.to/cdb.html

Regards
Wolfgang
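
The lookup approach described above can be sketched in a few lines of Perl. This is an illustration under stated assumptions, not Wolfgang's setup: an ordinary hash stands in for the cdb file, the keys are /24 networks, and the table contents, the placeholder country codes AA and BB, and the function name country_for are all hypothetical. With a cdb module that exposes the file as a tied hash (CDB_File on CPAN is believed to work this way), the lookup code itself would stay the same.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical data: /24 network => placeholder country code. In a real
# setup this hash would be backed by a cdb file built from the geoIP data.
my %net_to_cc = (
    '61.10.20'   => 'AA',
    '202.123.40' => 'BB',
);

# Return the code for the first bracketed IPv4 address in a Received
# fragment, or undef if there is no address or it is not in the table.
sub country_for {
    my ($received) = @_;
    my ($net) = $received =~ /\[(\d{1,3}\.\d{1,3}\.\d{1,3})\.\d{1,3}\]/
        or return;
    return $net_to_cc{$net};
}

for my $line ('from a.example ([61.10.20.30])', 'from d.example ([192.0.2.1])') {
    my $cc = country_for($line);
    printf "%-34s %s\n", $line, defined $cc ? $cc : 'unknown';
}
```

An exact-match key such as the /24 keeps each lookup to a single hash fetch, which is the property that makes cdb attractive here; ranges that are not aligned to /24 would need a different key scheme.
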