On 10/03/2009 05:08 PM, John Hardin wrote:
On Sat, 3 Oct 2009, Warren Togami wrote:

On 10/01/2009 02:36 PM, John Hardin wrote:
On Thu, 1 Oct 2009, Warren Togami wrote:

> The "oddity" I was pointing out at the beginning of the thread is not
> the prevalence of .cn URIs, but rather that most of them appear to be
> exactly 8 characters long. Could someone please commit my T_CN_8_URL
> rule to the sandbox so we can see if that trend holds beyond my own
> corpora?

I've put a .CN 8 URI rule into my sandbox file, but it may be a few days
before it gets committed; my stuff is in flux right now...


# 8-letter .cn domain, per Warren Togami
uri CN_EIGHT m;^https?://(?:[^./]+\.)*[^./]{8}\.cn/;
describe CN_EIGHT .CN uri with eight-letter domain name
score CN_EIGHT 0.10

Possible bug here... Do all URIs necessarily have a trailing slash?
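
One way to look at the trailing-slash question outside SpamAssassin is to run the same pattern over a few sample URIs. The sketch below uses Python's re module in place of Perl, the hostnames are made up, and CN_EIGHT_RELAXED is only one possible variant, not a committed rule. It says nothing about whether SpamAssassin's URI extraction ever hands a rule a bare host.

```python
import re

# Pattern from the CN_EIGHT rule, with the m;...; delimiters removed.
CN_EIGHT = re.compile(r"^https?://(?:[^./]+\.)*[^./]{8}\.cn/")

# Hypothetical variant: after ".cn" accept a slash, port, query,
# fragment, or the end of the string.
CN_EIGHT_RELAXED = re.compile(
    r"^https?://(?:[^./]+\.)*[^./]{8}\.cn(?:[/:?#]|$)"
)

# Made-up sample URIs.
samples = [
    "http://abcdefgh.cn/",         # eight letters, trailing slash
    "http://abcdefgh.cn",          # eight letters, bare host
    "http://www.abcdefgh.cn?x=1",  # eight letters, query but no path
    "http://abcdefghi.cn/",        # nine letters, should never hit
]

def hits(pattern, uris):
    """Return one boolean per URI: does the pattern match it?"""
    return [bool(pattern.search(u)) for u in uris]

original_hits = hits(CN_EIGHT, samples)
relaxed_hits = hits(CN_EIGHT_RELAXED, samples)

for uri, old, new in zip(samples, original_hits, relaxed_hits):
    print(f"{uri:30} original={old} relaxed={new}")
```

With these samples the posted pattern only hits the first URI, while the relaxed variant also hits the bare-host and query-only forms.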

First results are in:

http://ruleqa.spamassassin.org/20091003-r821273-n/T_CN_EIGHT/detail


Can't trust those results yet. There is the trailing slash bug, and John Rudd
might be correct about whitespace.

[^./]{8}\.cn

Actually, doesn't this match other characters that shouldn't be in a domain 
name?
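
To make that concrete, here is a small sketch comparing the negated class with a stricter label restricted to letters, digits, and hyphens. It is Python rather than Perl, the labels are made up, and the STRICT pattern is an assumption about what "should be in a domain name", not something taken from an existing rule.

```python
import re

# The fragment quoted above: any eight characters except "." and "/".
LOOSE = re.compile(r"[^./]{8}\.cn")

# Hypothetical stricter label: letters, digits, and hyphen only.
STRICT = re.compile(r"[A-Za-z0-9-]{8}\.cn")

# Made-up host strings, each with an eight-character label.
labels = [
    "abcdefgh.cn",  # plain letters
    "abc-1234.cn",  # letters, digits, hyphen
    "ab cd_ef.cn",  # contains a space and an underscore
    "a?b=c&d!.cn",  # contains query-string punctuation
]

loose_hits = [bool(LOOSE.fullmatch(s)) for s in labels]
strict_hits = [bool(STRICT.fullmatch(s)) for s in labels]

for s, lo, st in zip(labels, loose_hits, strict_hits):
    print(f"{s!r:16} loose={lo} strict={st}")
```

The negated class accepts all four strings, including the ones with whitespace and punctuation; the stricter class accepts only the first two.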

Then there are "valid" URLs like http://password:usern...@example.com/ that
this rule does not match.
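
A sketch of the userinfo case, again in Python rather than Perl. The credentials and hostnames are made up, and CN_EIGHT_USERINFO is a hypothetical variant that skips an optional "user:password@" prefix before the host; it is not a committed rule.

```python
import re

# Pattern from the CN_EIGHT rule as posted.
CN_EIGHT = re.compile(r"^https?://(?:[^./]+\.)*[^./]{8}\.cn/")

# Hypothetical variant: allow an optional userinfo part ending in "@".
CN_EIGHT_USERINFO = re.compile(
    r"^https?://(?:[^/@]*@)?(?:[^./]+\.)*[^./]{8}\.cn/"
)

# Made-up sample URIs.
plain = "http://abcdefgh.cn/"
with_userinfo = "http://user:secret@abcdefgh.cn/"
nine_letters = "http://user:secret@abcdefghi.cn/"

posted_results = [
    bool(CN_EIGHT.search(u)) for u in (plain, with_userinfo, nine_letters)
]
variant_results = [
    bool(CN_EIGHT_USERINFO.search(u))
    for u in (plain, with_userinfo, nine_letters)
]

print("posted: ", posted_results)
print("variant:", variant_results)
```

The posted pattern misses the userinfo form because "user:secret@abcdefgh" is treated as a single label longer than eight characters; the variant matches it and still rejects the nine-letter host.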

Could you please add the following to the sandbox before tomorrow?

# from http://www.apnic.net/db/ranges.html at 20091002, meta bits added 20090930
# copied from khop-bl.sa.khopesh.com
header __RCVD_VIA_APNIC Received =~ 
/(?-xism:[^0-9.](?:2(?:0(?:2(?:\.1(?:2(?:3\.(?:0?(?:[4-9][0-9]|3[2-9])|[12][0-9]{2})\.[012]?[0-9]{1,2}|[^3]\.(?:012]?[0-9]{1,2}){2})|[^2]3\.(?:012]?[0-9]{1,2}){2})|(?:\.[02]?[0-9]{1,2}){3})|3(?:\.[012]?[0-9]{1,2}){3})|(?:1[0189]|2[012])(?:\.[012]?[0-9]{1,2}){3})|1(?:(?:2[0123456]|8[023]|1\d|75)(?:\.[012]?[0-9]{1,2}){3}|69\.2(?:1[0-9]|2[0-3]|0[89])(?:\.[012]?[0-9]{1,2}){2})|(?:5[89]|6[01])(?:\.[012]?[0-9]{1,2}){3})(?:[\]\)\s]))/
describe __RCVD_VIA_APNIC Received through a relay in Asia/Pacific Network

meta CN_EIGHT_NOAPNIC CN_EIGHT && !__RCVD_VIA_APNIC && !ALL_TRUSTED
describe CN_EIGHT_NOAPNIC .cn URI with eight-letter domain name, excluding APNIC

This combines one silly arbitrary rule with the exclusion of one prejudiced
rule. It is still unsafe, but it should show us some interesting numbers.

Warren Togami
wtog...@redhat.com
