Going through a few spams tonight, I came up with a small handful of
possibly useful rules.
I'd appreciate it if anyone who has a corpus set up and some spare CPU
cycles could test these and see whether they are actually any good.
Thanks,
Loren
PS: Beware the probable wrap on several of these!
# Misc test spammer rules
header X_UNAUTHENTIC_WARNING X-Authentication-Warning =~ /(?:[a-z]{4,20},? ){2,8}/ # no /i, trailing space
describe X_UNAUTHENTIC_WARNING Fake X-Authentication-Warning header
header X_BOGUS_MAILER X-Mailer =~ /(?:[a-z]{4,20},? ){2,8}/ # no /i, trailing space
describe X_BOGUS_MAILER Fake X-Mailer header
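For anyone who wants to poke at the word-run pattern shared by the two rules above before running a full corpus test, here is a rough sketch in Python. Python's `re` module stands in for Perl here (the constructs used are common to both), and the sample header values are invented for illustration, not taken from real mail.

```python
import re

# Same pattern as X_UNAUTHENTIC_WARNING and X_BOGUS_MAILER.
# No re.IGNORECASE: the rules deliberately omit /i.
WORD_RUN = re.compile(r"(?:[a-z]{4,20},? ){2,8}")


def looks_like_word_run(header_value: str) -> bool:
    """True if the value contains 2+ consecutive lowercase words of 4-20
    letters, each optionally followed by a comma and then a space."""
    return WORD_RUN.search(header_value) is not None


# Hypothetical header values, invented for illustration.
samples = [
    "quiet river, stone garden window ",
    "quiet river",  # second word has no trailing space
    "Microsoft Outlook Express 6.00.2800.1106",
    "mail.example.com: nobody set sender to user@example.com using -f",
]

for value in samples:
    print(looks_like_word_run(value), repr(value))
```

Only the first sample should hit. The second shows why the trailing space matters: every word in the run, including the last, has to be followed by a space.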
header __BOGUS_SUBJECT Subject =~ /(?:R[eE]: )?(?:[a-z]{2,20},?\s?){1,8}/ # no /i!
meta BOGUS_SUBJECT (__BOGUS_SUBJECT && (X_BOGUS_MAILER || X_UNAUTHENTIC_WARNING))
describe BOGUS_SUBJECT Subject is possibly random words
score BOGUS_SUBJECT 0.5
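A quick sketch of how the subject sub-rule and the meta rule combine, again using Python's `re` as a stand-in for Perl. Because the subject pattern is unanchored, it appears to hit any subject containing two consecutive lowercase letters, so the meta rule leans almost entirely on the other two rules. The anchored variant below is a hypothetical alternative for comparison, not part of the posted rules, and the sample subjects are invented.

```python
import re

# Subject pattern exactly as posted (unanchored, case-sensitive).
SUBJECT_AS_POSTED = re.compile(r"(?:R[eE]: )?(?:[a-z]{2,20},?\s?){1,8}")

# Hypothetical anchored variant: the whole subject must be lowercase words.
SUBJECT_ANCHORED = re.compile(r"^(?:R[eE]: )?(?:[a-z]{2,20},?\s?){1,8}$")


def bogus_subject(subject_hit: bool, mailer_hit: bool, warning_hit: bool) -> bool:
    """Mirror of the meta expression:
    __BOGUS_SUBJECT && (X_BOGUS_MAILER || X_UNAUTHENTIC_WARNING)"""
    return subject_hit and (mailer_hit or warning_hit)


ordinary = "Meeting moved to 3pm"            # invented, ordinary-looking subject
wordy = "Re: window garden, table river"     # invented, random-word subject

for subject in (ordinary, wordy):
    posted = SUBJECT_AS_POSTED.search(subject) is not None
    anchored = SUBJECT_ANCHORED.search(subject) is not None
    print(f"{subject!r}: as posted={posted}, anchored={anchored}")
```

With the pattern as posted, both subjects hit. With the anchored variant only the second one does. Whether anchoring actually helps on real mail is something only a corpus run can say.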
header BOGUS_MSGID ALL =~ /\<[^\>[EMAIL PROTECTED]@[a-z]+\>/ # no /i!
describe BOGUS_MSGID Possibly bogus <msgid> construct
score BOGUS_MSGID 0.1
header STMP_NO_ID Received =~ /(?:\([^\)]+\))? with STMP(?!\sid [\w\-\.\$]{4,40})/ # no /i!
describe STMP_NO_ID Received header with "STMP" not followed by "id"
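The negative lookahead in STMP_NO_ID can be checked the same way. This sketch assumes the wrapped rule joins with a single space between `\sid` and the character class, and it uses Python's `re` in place of Perl. The Received fragments are invented for illustration.

```python
import re

# STMP_NO_ID pattern, assuming the wrapped line joins with one space.
# The literal is "STMP", exactly as posted; case-sensitive (no /i).
STMP_NO_ID = re.compile(r"(?:\([^\)]+\))? with STMP(?!\sid [\w\-\.\$]{4,40})")


def stmp_without_id(received: str) -> bool:
    """True if ' with STMP' appears and is NOT followed by ' id <4-40 id chars>'."""
    return STMP_NO_ID.search(received) is not None


# Hypothetical Received fragments, invented for illustration.
samples = [
    "by mx.example.com with STMP; Mon, 1 Jan 2001",               # no id at all
    "by mx.example.com with STMP id ab; Mon, 1 Jan 2001",         # id too short
    "by mx.example.com with STMP id ABC12345; Mon, 1 Jan 2001",   # plausible id
    "by mx.example.com with SMTP id ABC12345; Mon, 1 Jan 2001",   # spelled SMTP
]

for received in samples:
    print(stmp_without_id(received), received)
```

The first two fragments hit and the last two do not. Note that a header spelling the keyword "SMTP" never matches, since the pattern looks for the literal "STMP".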
header DES_ENC_STMP Received =~ /\bwith with DES-CBC3-SHA encrypted SMTP\;/
describe DES_ENC_STMP Spammer fake SMTP id
body PT_WORDLIST_30 /(?:\b(?!(?:from|that|have|this|were|with)\b)[a-z]{4,12}\s+){30}/
describe PT_WORDLIST_30 string of 30+ random words
score PT_WORDLIST_30 10.0
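A small sketch for PT_WORDLIST_30, with Python's `re` standing in for Perl and an invented word list. It shows the three things the pattern depends on: 30 consecutive lowercase words of 4 to 12 letters, whitespace after every one of them (including the last), and none of the six excluded common words inside the run.

```python
import re

# PT_WORDLIST_30 pattern as posted (case-sensitive).
WORDLIST_30 = re.compile(
    r"(?:\b(?!(?:from|that|have|this|were|with)\b)[a-z]{4,12}\s+){30}"
)


def has_word_run(body_text: str) -> bool:
    """True if the text contains 30 qualifying lowercase words in a row."""
    return WORDLIST_30.search(body_text) is not None


# Invented filler words, repeated to get exactly 30.
words = ["garden", "window", "river", "stone", "table"] * 6

run_with_trailing_space = " ".join(words) + " "
run_without_trailing_space = " ".join(words)

# Put one excluded word in the middle: the run splits into 15 + 14 words.
broken = list(words)
broken[15] = "with"
run_broken_by_common_word = " ".join(broken) + " "

print(has_word_run(run_with_trailing_space))
print(has_word_run(run_without_trailing_space))
print(has_word_run(run_broken_by_common_word))
```

Only the first string should hit. A single "with" or "that" in the run is enough to break it, which is presumably what keeps ordinary prose from matching, though short words, capitals and punctuation break the run as well.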