Re: URL rule creation question

2009-09-12 Thread MySQL Student
 \s is the proper way to represent whitespace.

 lol, yes, I know that; I was actually trying to match 's' and the
 slash is the start of the pattern match.

 I wasn't referring to the beginning of the RE.

Yeah, I realized that just after I sent this, if anyone cares :-)

Thanks again,
Alex


Re: URL rule creation question

2009-09-11 Thread Matus UHLAR - fantomas
On 10.09.09 18:28, MySQL Student wrote:
 I've seen this pattern in spam quite a bit lately:
 
 href=http://doubleheaderover.com/jazert/html/?39.6d.3d.31.66.67.6b.79.77.63.77.63.65.6e.74.69.6e.6e.69
 .61.6c.5f.68.31.33.33.2e.6f.39.39.41.4d.2e.30.30.45.33.39.2e.30.32.30.61.64.6b.37.61.76.61.67.63.31.66.
 62.2e.6a.61.7a.65.72.74.2e.68.74.6d.6c3az8fO

what kind of URL/service is this? Isn't it worth to block this at all?
-- 
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
   One OS to rule them all, One OS to find them, 
One OS to bring them all and into darkness bind them 


Re: URL rule creation question

2009-09-11 Thread McDonald, Dan
On Fri, 2009-09-11 at 14:37 +0200, Matus UHLAR - fantomas wrote:
 On 10.09.09 18:28, MySQL Student wrote:
  I've seen this pattern in spam quite a bit lately:
  
  href=http://EXAMPLE.com/jazert/html/?39.6d.3d.31.66.67.6b.79.77.63.77.63.65.6e.74.69.6e.6e.69
  .61.6c.5f.68.31.33.33.2e.6f.39.39.41.4d.2e.30.30.45.33.39.2e.30.32.30.61.64.6b.37.61.76.61.67.63.31.66.
  62.2e.6a.61.7a.65.72.74.2e.68.74.6d.6c3az8fO
 
 what kind of URL/service is this? Isn't it worth to block this at all?

The 'doubleheadedrover' domain currently shows up in Razor(E8),
uribl_black, surbl_jp, and invaluement.

But it wasn't in all of those when he first started posting about it.
So he is looking for a way of identifying bad urls by examining the path
portion rather than the domain


-- 
Daniel J McDonald, CCIE # 2495, CISSP # 78281, CNX
www.austinenergy.com




Re: URL rule creation question

2009-09-11 Thread MySQL Student
Hi,

 The 'doubleheadedrover' domain currently shows up in Razor(E8),
 uribl_black, surbl_jp, and invaluement.

 But it wasn't in all of those when he first started posting about it.

Yes, that's correct. Thanks for your help. That's already caught a
few. I have another that I thought you could help with.

I'd like to create a rule that matches a specific letter and up to 5
spaces after it, repeated ten times. I'm thinking something like this:

/s\ {5}o\ {5}n\ {5}i\ {5}c\ {5}\ m\ {5}e\ {5}d\ {5}i\ {5}a/i

I'm still learning regex's, so hopefully this isn't too far off. The
opportunities for rules are coming faster than my ability to learn.

Thanks,
Alex


Re: URL rule creation question

2009-09-11 Thread Karsten Bräckelmann
On Fri, 2009-09-11 at 15:09 -0400, Alex wrote:
 I'd like to create a rule that matches a specific letter and up to 5
 spaces after it, repeated ten times. I'm thinking something like this:
 
 /s\ {5}o\ {5}n\ {5}i\ {5}c\ {5}\ m\ {5}e\ {5}d\ {5}i\ {5}a/i

A space does not have any special meaning in REs. Don't escape it.

The quantifier {5} means *exactly* 5 occurrences. What you are after is
the {n,m} quantifier with a lower bound n and an (optional) upper bound m.
Thus, to match at least one, and up to 5 occurrences: {1,5}
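
To make the difference concrete, here is a minimal sketch using Python's re
module as a stand-in for the Perl regex engine (an assumption; the two agree
on these quantifiers). The shortened pattern and the test string are made up
for illustration.

```python
import re

# Original idea: {5} demands exactly five spaces after each letter.
exact = re.compile(r"s {5}o {5}n {5}i {5}c", re.IGNORECASE)
# Suggested form: {1,5} accepts one to five spaces after each letter.
ranged = re.compile(r"s {1,5}o {1,5}n {1,5}i {1,5}c", re.IGNORECASE)

# Made-up sample with 2, 3, 1 and 5 spaces between the letters.
sample = "S" + " " * 2 + "o" + " " * 3 + "n" + " " + "i" + " " * 5 + "c"

print(bool(exact.search(sample)))   # False: the gaps are not all 5 wide
print(bool(ranged.search(sample)))  # True: every gap is within 1..5
```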


 I'm still learning regex's, so hopefully this isn't too far off. The
 opportunities for rules are coming faster than my ability to learn.

  http://perldoc.perl.org/perlre.html

The reference. In particular, also do have a look at the perlrequick
Introduction and perlretut Tutorial referenced early in the Description
section.


-- 
char *t=\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4;
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;il;i++){ i%8? c=1:
(c=*++x); c128  (s+=h); if (!(h=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}



Re: URL rule creation question

2009-09-11 Thread Karsten Bräckelmann
On Fri, 2009-09-11 at 12:43 -0700, John Hardin wrote:
 \s is the proper way to represent whitespace.

True. However, in all rule types that use rendered text, there is only a
space -- no tabs. Well, there are newlines, but that doesn't matter
unless you use special modifiers. ;)

Actually, this reminds me -- if Alex is writing his rule as a body rule,
the text parts are rendered and normalized. This effectively means any
run of consecutive whitespace characters (within a paragraph) will be
condensed to a single space.

Thus /a b/ and /a {1,5}b/ become identical.
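
A rough sketch of why the two patterns end up equivalent, again in Python as
a stand-in. The helper render_like_body is hypothetical and only imitates the
whitespace collapsing described above; it is not SpamAssassin's actual
rendering code.

```python
import re

def render_like_body(text):
    # Hypothetical stand-in for body normalization: collapse each run of
    # spaces or tabs to one space. Newlines are left alone.
    return re.sub(r"[ \t]+", " ", text)

plain = re.compile(r"a b")
ranged = re.compile(r"a {1,5}b")

raw = "a" + " " * 4 + "b"          # what the sender typed
rendered = render_like_body(raw)   # what a body rule would roughly see

# On the raw text only the ranged pattern matches ...
print(bool(plain.search(raw)), bool(ranged.search(raw)))
# ... but on the rendered text both patterns match the same thing.
print(bool(plain.search(rendered)), bool(ranged.search(rendered)))
```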


-- 
char *t=\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4;
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;il;i++){ i%8? c=1:
(c=*++x); c128  (s+=h); if (!(h=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}



Re: URL rule creation question

2009-09-11 Thread McDonald, Dan
On Fri, 2009-09-11 at 15:09 -0400, MySQL Student wrote:
 Hi,
 
  The 'doubleheadedrover' domain currently shows up in Razor(E8),
  uribl_black, surbl_jp, and invaluement.
 
  But it wasn't in all of those when he first started posting about it.
 
 Yes, that's correct. Thanks for your help. That's already caught a
 few. I have another that I thought you could help with.
 
 I'd like to create a rule that matches a specific letter and up to 5
 spaces after it, repeated ten times.

Unless you are using rawbody rules, multiple spaces are collapsed to
single spaces in the regularized body that rules are run against.


-- 
Daniel J McDonald, CCIE # 2495, CISSP # 78281, CNX
www.austinenergy.com




Re: URL rule creation question

2009-09-11 Thread Matt Kettler
McDonald, Dan wrote:

 From: Matt Kettler [mailto:mkettler...@verizon.net]

 This rule should detect 10 consecutive occurrences.
 uri   L_URI_FUNNYDOTS   /(?:\.[a-z,0-9]{2}\.){10}/

 Warning: I wrote this quickly without too much thought. It may have
 bugs, but I'm short on time at the moment.

 your variant would require two periods in a row between each pair.

So it would... Hence the warning :)


URL rule creation question

2009-09-10 Thread MySQL Student
Hi all,

I've seen this pattern in spam quite a bit lately:

href=http://doubleheaderover.com/jazert/html/?39.6d.3d.31.66.67.6b.79.77.63.77.63.65.6e.74.69.6e.6e.69
.61.6c.5f.68.31.33.33.2e.6f.39.39.41.4d.2e.30.30.45.33.39.2e.30.32.30.61.64.6b.37.61.76.61.67.63.31.66.
62.2e.6a.61.7a.65.72.74.2e.68.74.6d.6c3az8fO

Would it be reasonable to create a rule that looks for this two-char
then dot pattern, or is it reasonable that it might appear in a
legitimate email too frequently? If possible, how would you create a
rule to capture this?

Thanks,
Alex


Re: URL rule creation question

2009-09-10 Thread Matt Kettler
MySQL Student wrote:
 Hi all,

 I've seen this pattern in spam quite a bit lately:

   
snip - URI that verizon won't let me send
 Would it be reasonable to create a rule that looks for this two-char
 then dot pattern, or is it reasonable that it might appear in a
 legitimate email too frequently? If possible, how would you create a
 rule to capture this?
   

This rule should detect 10 consecutive occurrences.
uri   L_URI_FUNNYDOTS   /(?:\.[a-z,0-9]{2}\.){10}/

I do think that 4-in-a-row might be pretty common (ie: IP addresses),
but 10 in a row seems unlikely.

Warning: I wrote this quickly without too much thought. It may have
bugs, but I'm short on time at the moment.



Re: URL rule creation question

2009-09-10 Thread McDonald, Dan
On Thu, 2009-09-10 at 18:28 -0400, MySQL Student wrote:
 Hi all,
 
 I've seen this pattern in spam quite a bit lately:
 
 href=http://doubleheaderover.com/jazert/html/?39.6d.3d.31.66.67.6b.79.77.63.77.63.65.6e.74.69.6e.6e.69
 .61.6c.5f.68.31.33.33.2e.6f.39.39.41.4d.2e.30.30.45.33.39.2e.30.32.30.61.64.6b.37.61.76.61.67.63.31.66.
 62.2e.6a.61.7a.65.72.74.2e.68.74.6d.6c3az8fO
 
 Would it be reasonable to create a rule that looks for this two-char
 then dot pattern, or is it reasonable that it might appear in a
 legitimate email too frequently? If possible, how would you create a
 rule to capture this?

uri URI_HEX_DOTTED  /(?:[[:xdigit:]]{2}\.){10}/

That would look for 10 two-digit hex numbers separated by periods in a
URL.  Figure if you have at least 10 of them, it's probably a match...
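
A quick way to sanity-check the idea outside SpamAssassin is sketched below
in Python. Python's re has no POSIX [[:xdigit:]] class, so the sketch spells
it out as [0-9a-fA-F] (an assumption that the two are equivalent here). The
URLs are shortened, made-up test inputs on example.com.

```python
import re

# Equivalent of /(?:[[:xdigit:]]{2}\.){10}/ with the hex class written out.
hex_dotted = re.compile(r"(?:[0-9a-fA-F]{2}\.){10}")

# Shortened version of the reported URL shape (hypothetical host).
spam_url = "http://example.com/jazert/html/?39.6d.3d.31.66.67.6b.79.77.63.77.63.65"
# A plain dotted-quad URL, which has far fewer than ten such groups.
ip_url = "http://192.168.10.20/index.html"

print(bool(hex_dotted.search(spam_url)))  # True
print(bool(hex_dotted.search(ip_url)))    # False
```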

-- 
Daniel J McDonald, CCIE # 2495, CISSP # 78281, CNX
www.austinenergy.com




RE: URL rule creation question

2009-09-10 Thread McDonald, Dan
From: Matt Kettler [mailto:mkettler...@verizon.net]
 
This rule should detect 10 consecutive occurrences.
uri   L_URI_FUNNYDOTS   /(?:\.[a-z,0-9]{2}\.){10}/

Warning: I wrote this quickly without too much thought. It may have
bugs, but I'm short on time at the moment.

your variant would require two periods in a row between each pair.
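
The double-period problem can be seen in a small Python sketch. The name
one_possible_fix is hypothetical and is only one way the pattern might be
repaired, not something proposed in the thread; the test strings are built
from the first hex pairs of the reported URL.

```python
import re

# The variant as posted: each group has a leading AND a trailing period.
funnydots = re.compile(r"(?:\.[a-z,0-9]{2}\.){10}")
# Hypothetical repair: one leading period per group, no comma in the class.
one_possible_fix = re.compile(r"(?:\.[a-z0-9]{2}){10}")

pairs = "39 6d 3d 31 66 67 6b 79 77 63 77 63".split()

single = "?" + ".".join(pairs)                  # ?39.6d.3d. ... as in the spam
doubled = "".join("." + p + "." for p in pairs)  # .39..6d..3d.. ...

print(bool(funnydots.search(single)))         # False: needs ".." between pairs
print(bool(funnydots.search(doubled)))        # True
print(bool(one_possible_fix.search(single)))  # True
```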