> Would a rule to calculate some kind of "special chars" vs
> "total chars" ratio be useful?
> Does anybody have that kind of rule already?
Doing that as a ratio would require an eval, I suspect. However, detecting
obfuscated things is pretty easy. You need some new rules! :-) Hie thee
off to exit0 or rulesemporium or the like and get Matt's antidrug set, just
for a start.
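For what it's worth, the arithmetic behind such a ratio is simple. Here is a rough Python sketch of the idea only; it is not a SpamAssassin rule or plugin (a real eval test would have to be written in Perl inside SpamAssassin), and the function name and sample strings are made up for illustration:

```python
def special_char_ratio(text: str) -> float:
    """Fraction of non-whitespace characters that are neither letters nor digits."""
    # Ignore whitespace so ordinary word spacing does not dilute the ratio.
    visible = [ch for ch in text if not ch.isspace()]
    if not visible:
        return 0.0
    special = sum(1 for ch in visible if not ch.isalnum())
    return special / len(visible)


# Hypothetical sample strings, not taken from any real message.
plain = "Buy Viagra now"
obfuscated = "B.u.y V-i-a-g-r-a n*o*w"

plain_ratio = special_char_ratio(plain)
obfuscated_ratio = special_char_ratio(obfuscated)
```

Any cutoff you pick for "too many special chars" would be a guess until tested against real ham, since URLs, code snippets and signatures push the ratio up in legitimate mail too.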
Here are the results for the first few of your spam messages:
Content analysis details: (27.1 points, 4.6 required)
pts rule name description
---- ---------------------- --------------------------------------------------
1.4 SARE_ALC Some header matches /improve your/i
0.6 RATWR10a_MESSID Message-ID has ratware pattern (HEXHEX.HEXHEX@)
1.8 LOCAL_OBFU_CELEXA BODY: Obfuscated 'CELEXA' in body
1.8 LOCAL_OBFU_XANAX BODY: Obfuscated 'XANAX' in body
1.8 LOCAL_OBFU_LEVITRA BODY: Obfuscated 'LEVITRA' in body
1.8 LOCAL_OBFU_PAXIL BODY: Obfuscated 'PAXIL' in body
1.8 LOCAL_OBFU_VIAGRA BODY: Obfuscated 'VIAGRA' in body
1.0 SARE_OBFUGIRLS BODY: masked spam word(s)
1.8 LOCAL_OBFU_MERIDIA BODY: Obfuscated 'MERIDIA' in body
0.1 TW_OC BODY: Odd Letter Triples with OC
1.8 LOCAL_OBFU_CIALIS BODY: Obfuscated 'CIALIS' in body
1.8 LOCAL_OBFU_XENICAL BODY: Obfuscated 'XENICAL' in body
1.5 DRUGS_ERECTILE_OBFU Obfuscated reference to an erectile drug
1.0 DRUGS_ANXIETY_OBFU Obfuscated reference to an anxiety control drug
1.0 DRUGS_ERECTILE Refers to an erectile drug
0.0 DRUGS_ANXIETY Refers to an anxiety control drug
0.0 DRUGS_DEPRESSION Refers to an antidepressant
0.0 DRUGS_DIET Refers to a diet drug
1.0 DRUGS_DEPR_EREC Refers to both an erectile and an antidepressant
1.0 DRUGS_ANXIETY_EREC Refers to both an erectile and an anxiety drug
1.0 DRUGS_DIET_EREC Refers to both an erectile and a diet drug
1.0 DRUGS_MANYKINDS Refers to at least four kinds of drugs
Content analysis details: (31.0 points, 4.6 required)
pts rule name description
---- ---------------------- --------------------------------------------------
1.8 LOCAL_OBFU_XANAX BODY: Obfuscated 'XANAX' in body
1.8 LOCAL_OBFU_ZOLOFT BODY: Obfuscated 'ZOLOFT' in body
1.8 LOCAL_OBFU_LEVITRA BODY: Obfuscated 'LEVITRA' in body
1.8 LOCAL_OBFU_CELEBREX BODY: Obfuscated 'CELEBREX' in body
1.8 LOCAL_OBFU_PAXIL BODY: Obfuscated 'PAXIL' in body
2.8 LOCAL_OBFU_VICODIN BODY: Obfuscated 'VICODIN' in body
1.8 LOCAL_OBFU_VIAGRA BODY: Obfuscated 'VIAGRA' in body
1.8 LOCAL_OBFU_MERIDIA BODY: Obfuscated 'MERIDIA' in body
1.8 LOCAL_OBFU_VIOXX BODY: Obfuscated 'VIOXX' in body
1.8 LOCAL_OBFU_XENICAL BODY: Obfuscated 'XENICAL' in body
-0.0 BAYES_44 BODY: Bayesian spam probability is 44 to 50%
[score: 0.4966]
1.5 DRUGS_ERECTILE_OBFU Obfuscated reference to an erectile drug
1.0 DRUGS_ANXIETY_OBFU Obfuscated reference to an anxiety control drug
1.0 DRUGS_ERECTILE Refers to an erectile drug
0.0 DRUGS_ANXIETY Refers to an anxiety control drug
1.0 DRUGS_PAIN_OBFU Obfuscated reference to a pain relief drug
0.0 DRUGS_DEPRESSION Refers to an antidepressant
0.0 DRUGS_PAIN Refers to a pain relief drug
0.0 DRUGS_DIET Refers to a diet drug
1.0 DRUGS_DEPR_EREC Refers to both an erectile and an antidepressant
1.0 DRUGS_ANXIETY_EREC Refers to both an erectile and an anxiety drug
1.0 DRUGS_PAIN_EREC Refers to both an erectile and a painkiller
0.5 DRUGS_DIET_PAIN Refers to both a diet drug and a pain drug
1.0 DRUGS_DIET_EREC Refers to both an erectile and a diet drug
1.0 DRUGS_MANYKINDS Refers to at least four kinds of drugs
Content analysis details: (9.3 points, 4.6 required)
pts rule name description
---- ---------------------- --------------------------------------------------
0.6 RATWR10a_MESSID Message-ID has ratware pattern (HEXHEX.HEXHEX@)
3.3 SARE_SUB_ONLINE_OB subject has obfuscated spammer topic
1.7 BAYES_80 BODY: Bayesian spam probability is 80 to 90%
[score: 0.8257]
1.7 SARE_SPEC_ANUMA URI: Domain with ALPHAs NUMBERs APLHAs