McDonald, Dan wrote:
>>> I'm considering a low-scoring rule like:
>>> body         AE_MEDS37 /\(\s?w{2,4}\s[:alpha:]{4}\d{1,4}\s(?:net|com|org)\s?\)/
>>> describe AE_MEDS37  rule to catch the next wave of spaced domains
>>> score        AE_MEDS37  1.0
>
> oops.  Doesn't compile.  should be:
> body   AE_MEDS37 /\(\s?w{2,4}\s[[:alpha:]]{4}\d{1,4}\s(?:net|com|org)\s?\)/

Maybe we can't anticipate which way the spammers are going to go.  The
next wave actually turned out to use punctuation but keep the same
style of domain name.  One rule that covers several of the possible
presentations is:

body CK_MEDS50 /\sw\s*w{1,3}\s*(?:[.,]\s*)?(?:meds|shop)\d{1,4}\s*(?:[.,]\s*)?(?:net|c\s?o\s?m|org)\b/i
describe CK_MEDS50              gappy pharma website address in text
score CK_MEDS50                 3.0
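
A quick way to sanity-check a pattern like CK_MEDS50 before deploying it is
to run it against a few invented strings.  The sketch below uses Python's re
module rather than SpamAssassin itself, on the assumption that the two regex
engines agree for this particular pattern.  The sample strings and the helper
name hits_ck_meds50 are made up for illustration.

```python
import re

# Same pattern as CK_MEDS50, with Perl's /i expressed as re.IGNORECASE.
CK_MEDS50 = re.compile(
    r"\sw\s*w{1,3}\s*(?:[.,]\s*)?(?:meds|shop)\d{1,4}\s*(?:[.,]\s*)?"
    r"(?:net|c\s?o\s?m|org)\b",
    re.IGNORECASE,
)


def hits_ck_meds50(text):
    """Return True if the sample text matches the CK_MEDS50 pattern."""
    return CK_MEDS50.search(text) is not None


# Hypothetical sample lines, not taken from real spam.
samples = [
    "visit www meds37 com now",      # spaces instead of dots
    "go to www . shop123 . c o m",   # spaced dots and a spaced TLD
    "try w ww, meds5, net today",    # commas instead of dots
    "see www example com",           # no meds/shop token
]
for s in samples:
    print(hits_ck_meds50(s), repr(s))
```

Because the pattern begins with \s, an address at the very start of the
string is not matched in this sketch; it needs at least one whitespace
character in front of it.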

Spam that uses full stops (periods) would also hit a rule that looks for
obfuscation, such as:

body EVADE_URI2B /\b(?:H\s*T\s*T\s*P\s*:(?<!http:)|W\s*W\s*W\s*\.(?<!www\.))/i
describe EVADE_URI2B            gappy web address
score EVADE_URI2B               0.6
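
The trick in EVADE_URI2B is that the \s* version matches both the spaced and
the unspaced form, and the negative lookbehind then rejects the unspaced one.
A minimal sketch of that behaviour, again using Python's re as a stand-in for
Perl (an assumption) and invented test strings:

```python
import re

# Same pattern as EVADE_URI2B. The lookbehinds are fixed-width, which
# Python's re also accepts.
EVADE_URI2B = re.compile(
    r"\b(?:H\s*T\s*T\s*P\s*:(?<!http:)|W\s*W\s*W\s*\.(?<!www\.))",
    re.IGNORECASE,
)


def is_gappy(text):
    """Return True if text contains a spaced-out 'http:' or 'www.'."""
    return EVADE_URI2B.search(text) is not None


# Hypothetical examples: the first three contain gaps, the last does not.
for s in ["h t t p ://example.com",
          "w w w .example.com",
          "www .example.com",
          "http://www.example.com"]:
    print(is_gappy(s), repr(s))
```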

Bearing in mind that "full" rules can be something of a CPU-killer,
another low-scoring rule might simply look for a short message with some
kind of link marker, obfuscated or not:

full LINK_NR_TOP                /\nContent-Type: text.{5,500}(?:http|www).{5,700}$/si
describe LINK_NR_TOP            Short message with link
score LINK_NR_TOP               0.2
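
What makes LINK_NR_TOP a "short message" test is the unanchored-to-lines $:
without /m it only matches at the end of the whole message, so the link
marker has to sit within about 700 characters of the end.  A rough sketch of
that, assuming Python's re behaves like Perl here and using an invented
two-header message:

```python
import re

# Same pattern as LINK_NR_TOP: /s becomes re.DOTALL, /i becomes re.IGNORECASE.
LINK_NR_TOP = re.compile(
    r"\nContent-Type: text.{5,500}(?:http|www).{5,700}$",
    re.DOTALL | re.IGNORECASE,
)

# Hypothetical, heavily simplified message headers.
HEADERS = "Subject: hi\nContent-Type: text/plain\n\n"


def is_short_with_link(message):
    """Return True if the raw message text matches LINK_NR_TOP."""
    return LINK_NR_TOP.search(message) is not None


short_msg = HEADERS + "Cheap pills at www.example.com\n"
long_msg = HEADERS + "Cheap pills at www.example.com\n" + "x" * 1000 + "\n"
no_link = HEADERS + "See you tomorrow\n"

for name, msg in [("short", short_msg), ("long", long_msg), ("no link", no_link)]:
    print(name, is_short_with_link(msg))
```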

Keeping the general idea of CK_MEDS50, but specifying the position of the link,
I have:

full NONLINK_SHORT              /^Content-Type:\s*text([^\n]+\n){0,30}\n.{0,300}\b(?:H\s*T\s*T\s*P\s*[:;](?<!http:)\W{0,10}|W\s{0,10}W\s{0,10}W\s{0,10}(?:[.,\'`]\s{0,10})(?<!www\.)\s{0,10})[a-z0-9\-]{3,13}\s{0,10}(?:[.,\'`]\s{0,10})?(?:net|c\s{0,10}o\s{0,10}m|org)\b/msi
describe NONLINK_SHORT          Obfuscated link near top of text
score NONLINK_SHORT             2.0
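
NONLINK_SHORT is long enough that it is worth checking piece by piece.  The
sketch below feeds it three invented messages: one with a gappy address near
the top, one with an ordinary link, and one where the gappy address sits more
than 300 characters into the body.  It assumes Python's re handles the
pattern the same way Perl does; make_message and the header lines are
hypothetical.

```python
import re

# Same pattern as NONLINK_SHORT, split across lines for readability only.
# /m, /s and /i become re.MULTILINE, re.DOTALL and re.IGNORECASE.
NONLINK_SHORT = re.compile(
    r"^Content-Type:\s*text([^\n]+\n){0,30}\n.{0,300}\b"
    r"(?:H\s*T\s*T\s*P\s*[:;](?<!http:)\W{0,10}"
    r"|W\s{0,10}W\s{0,10}W\s{0,10}(?:[.,\'`]\s{0,10})(?<!www\.)\s{0,10})"
    r"[a-z0-9\-]{3,13}\s{0,10}(?:[.,\'`]\s{0,10})?"
    r"(?:net|c\s{0,10}o\s{0,10}m|org)\b",
    re.MULTILINE | re.DOTALL | re.IGNORECASE,
)


def make_message(body):
    """Build a simplified, hypothetical raw message around a body string."""
    headers = ("Content-Type: text/plain; charset=us-ascii\n"
               "Content-Transfer-Encoding: 7bit\n")
    return headers + "\n" + body + "\n"


def hits_nonlink_short(message):
    """Return True if the raw message text matches NONLINK_SHORT."""
    return NONLINK_SHORT.search(message) is not None


near_top = make_message("Buy now at w w w . meds37 . c o m today")
plain_link = make_message("Buy now at http://www.example.com today")
too_deep = make_message("x" * 400 + " w w w . meds37 . c o m")

for name, msg in [("near top", near_top),
                  ("plain link", plain_link),
                  ("too deep", too_deep)]:
    print(name, hits_nonlink_short(msg))
```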

I don't think there's any way to specify the start of the body in a body rule,
is there? A caret matches the start of a paragraph.

These messages also sometimes hit MSGID_SHORT, and they show other unusual
header forms that a rule like this can catch:

header FROM_NOSPACE             From =~ /[a-z0-9"']</i
describe FROM_NOSPACE           Address directly follows 'real' name
score FROM_NOSPACE              0.6
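
For comparison, here is a small sketch of what FROM_NOSPACE does and does not
match, using Python's re and invented From values (the addresses are
placeholders, and from_nospace is a hypothetical helper name):

```python
import re

# Same pattern as FROM_NOSPACE: a letter, digit or quote directly before '<'.
FROM_NOSPACE = re.compile(r"""[a-z0-9"']<""", re.IGNORECASE)


def from_nospace(from_value):
    """Return True if the From value has no space before the '<'."""
    return FROM_NOSPACE.search(from_value) is not None


for value in ['John Smith<john@example.com>',
              '"John Smith"<john@example.com>',
              'John Smith <john@example.com>',
              '<john@example.com>']:
    print(from_nospace(value), value)
```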

CK

