Going through a few spams tonight, I came up with a small handful of
possibly useful rules.
I'd appreciate it if anyone who has a corpus set up and some spare CPU
cycles could test these and see if they are actually any good.

Thanks,
        Loren

PS: Beware the probable wrap on several of these!


# Misc test spammer rules

header X_UNAUTHENTIC_WARNING X-Authentication-Warning =~ /(?:[a-z]{4,20},? ){2,8}/ # no /i, trailing space
describe X_UNAUTHENTIC_WARNING Fake X-Authentication-Warning header

header X_BOGUS_MAILER   X-Mailer =~ /(?:[a-z]{4,20},? ){2,8}/ # no /i, trailing space
describe X_BOGUS_MAILER   Fake X-Mailer header
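
For anyone who wants to poke at the pattern before running a full corpus test, here is a minimal Python sketch of the regex shared by X_UNAUTHENTIC_WARNING and X_BOGUS_MAILER. The helper name looks_bogus and the sample header values are made up for illustration; this only exercises the regex and is not how SpamAssassin evaluates rules.

```python
import re

# Same pattern as the two header rules: 2 to 8 lowercase words of 4-20
# letters, each followed by an optional comma and then a mandatory space.
# Compiled without IGNORECASE, mirroring the "no /i" note on the rules.
LOWER_WORDS = re.compile(r"(?:[a-z]{4,20},? ){2,8}")

def looks_bogus(header_value: str) -> bool:
    """Return True if the unanchored pattern is found anywhere in the value."""
    return LOWER_WORDS.search(header_value) is not None

# Hypothetical header values, not taken from real mail.
samples = [
    "quietly marble, window lantern ",  # lowercase words, trailing space
    "Microsoft Outlook Express 6.00",   # capitalised words break the run
    "Quietly Marble Window ",           # same idea, capitals only
]

for value in samples:
    print(looks_bogus(value), repr(value))
```

Because the pattern is unanchored, a hit only needs two qualifying lowercase words in a row somewhere in the header, so it is worth checking the ham side of a corpus as well as the spam side.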

header __BOGUS_SUBJECT   Subject =~ /(?:R[eE]: )?(?:[a-z]{2,20},?\s?){1,8}/ # no /i!
meta BOGUS_SUBJECT   (__BOGUS_SUBJECT && (X_BOGUS_MAILER || X_UNAUTHENTIC_WARNING))
describe BOGUS_SUBJECT   Subject is possibly random words
score BOGUS_SUBJECT   0.5
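
One thing testers may want to check about __BOGUS_SUBJECT: as written the pattern is unanchored, so it is satisfied by any two consecutive lowercase letters anywhere in the Subject. The sketch below compares it with an anchored variant. The anchored form is only an assumption about what might be intended, not part of the rules above, and the sample subjects are made up.

```python
import re

# The pattern exactly as it appears in the __BOGUS_SUBJECT rule.
AS_WRITTEN = re.compile(r"(?:R[eE]: )?(?:[a-z]{2,20},?\s?){1,8}")

# Hypothetical variant: the same pattern anchored to the whole Subject.
ANCHORED = re.compile(r"^(?:R[eE]: )?(?:[a-z]{2,20},?\s?){1,8}$")

def hits(pattern: re.Pattern, subject: str) -> bool:
    """Return True if the pattern matches somewhere in the subject."""
    return pattern.search(subject) is not None

# Made-up subjects for illustration.
subjects = [
    "Re: quietly marble window",  # all-lowercase words after "Re: "
    "Meeting Tomorrow",           # ordinary capitalised subject
]

for subject in subjects:
    print(repr(subject), hits(AS_WRITTEN, subject), hits(ANCHORED, subject))
```

With the unanchored form, the meta rule is in practice decided almost entirely by X_BOGUS_MAILER and X_UNAUTHENTIC_WARNING, which may or may not be the intent.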

header BOGUS_MSGID    ALL =~ /\<[^\>[EMAIL PROTECTED]@[a-z]+\>/ # no /i!
describe BOGUS_MSGID   Possibly bogus <msgid> construct
score BOGUS_MSGID    0.1

header STMP_NO_ID    Received =~ /(?:\([^\)]+\))? with STMP(?!\sid [\w\-\.\$]{4,40})/ # no /i!
describe STMP_NO_ID    Received header with "STMP" not followed by "id"
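
The negative lookahead in STMP_NO_ID can be tried out with the following Python sketch. The Received values are invented, use reserved example hostnames and addresses, and the helper name stmp_without_id is hypothetical. Note that the pattern as written looks for the literal string "STMP", so a header containing "SMTP" never matches.

```python
import re

# Same pattern as STMP_NO_ID, with the wrapped line joined by one space.
# (?!...) is a negative lookahead: the match fails if " with STMP" is
# followed by whitespace, "id", a space, and a 4-40 character identifier.
STMP_NO_ID = re.compile(r"(?:\([^\)]+\))? with STMP(?!\sid [\w\-\.\$]{4,40})")

def stmp_without_id(received: str) -> bool:
    """Return True if the header has 'with STMP' not followed by an id."""
    return STMP_NO_ID.search(received) is not None

# Invented Received header values for illustration.
no_id = "from a.example (b.example [192.0.2.1]) by c.example with STMP; Mon, 1 Jan"
with_id = "from a.example (b.example [192.0.2.1]) by c.example with STMP id ABC12345; Mon, 1 Jan"
real_smtp = "from a.example (b.example [192.0.2.1]) by c.example with SMTP id ABC12345; Mon, 1 Jan"

for header in (no_id, with_id, real_smtp):
    print(stmp_without_id(header), header)
```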

header DES_ENC_STMP    Received =~ /\bwith with DES-CBC3-SHA encrypted SMTP\;/
describe DES_ENC_STMP   Spammer fake STMP id

body    PT_WORDLIST_30 /(?:\b(?!(?:from|that|have|this|were|with)\b)[a-z]{4,12}\s+){30}/
describe PT_WORDLIST_30   string of 30+ random words
score   PT_WORDLIST_30   10.0
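
A rough way to see what PT_WORDLIST_30 needs in order to fire is sketched below in Python. The word list is made up and the helper name has_wordlist is hypothetical; the sketch only exercises the regex against plain strings and does not reproduce how SpamAssassin prepares body text.

```python
import re

# Same pattern as PT_WORDLIST_30: 30 consecutive lowercase words of 4-12
# letters, each followed by whitespace, where none of the words is one of
# the six excluded common words.
PT_WORDLIST_30 = re.compile(
    r"(?:\b(?!(?:from|that|have|this|were|with)\b)[a-z]{4,12}\s+){30}"
)

def has_wordlist(text: str) -> bool:
    """Return True if the text contains a run of 30 qualifying words."""
    return PT_WORDLIST_30.search(text) is not None

# Made-up filler: 5 words repeated 6 times gives exactly 30 words.
words = ["lantern", "marble", "window", "quietly", "orchard"] * 6
random_run = " ".join(words) + " "  # trailing space: each word needs \s+

# Swap one word in the middle for an excluded word to break the run
# into 15 qualifying words before it and 14 after it.
broken = list(words)
broken[15] = "with"
broken_run = " ".join(broken) + " "

print(has_wordlist(random_run))
print(has_wordlist(broken_run))
```

Words shorter than four letters, capitalised words, and punctuation attached to a word also break the run, which is presumably what keeps ordinary prose from matching.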


