Hi,
our Bayes data! It seems that the ASCII artists don't
always change the strings they use for their "art", so things
like "rvwndsho" and "xpoebbcr" started to become statistically
significant in our Bayes data.
We use a specific ruleset against those 'ASCII artists'; the rule
rawbody __SMALL_FONT /font-size:[\s\t ]{1,3}(?:1|2)(?:px|pt|;)/i
is part of it. We also look for gaps of various widths between characters:
rawbody __GAP_2_CHAR /[a-z][ ]{5}[a-z]/i
rawbody __GAP_3_CHAR /[a-z][ ]{6}[a-z]/i
rawbody __GAP_4_CHAR /[a-z][ ]{7}[a-z]/i
rawbody __GAP_5_CHAR /[a-z][ ]{8}[a-z]/i
rawbody __GAP_6_CHAR /[a-z][ ]{9}[a-z]/i
rawbody __GAP_7_CHAR /[a-z][ ]{10}[a-z]/i
rawbody __GAP_8_CHAR /[a-z][ ]{11}[a-z]/i
rawbody __GAP_9_CHAR /[a-z][ ]{12}[a-z]/i
rawbody __GAP_10_CHAR /[a-z][ ]{13}[a-z]/i
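To see what these patterns catch, here is a minimal Python sketch that applies the same regular expressions to a few strings. The helper name `matching_rules` and the sample strings are made up for illustration. SpamAssassin uses Perl regexes on the raw message body, so this only approximates its behaviour.

```python
import re

# Patterns copied from the rules above; the /i flag becomes re.IGNORECASE.
SMALL_FONT = re.compile(r"font-size:[\s\t ]{1,3}(?:1|2)(?:px|pt|;)", re.IGNORECASE)

# __GAP_n_CHAR looks for exactly n + 3 spaces between two letters (n = 2..10).
GAP_RULES = {
    f"__GAP_{n}_CHAR": re.compile(rf"[a-z][ ]{{{n + 3}}}[a-z]", re.IGNORECASE)
    for n in range(2, 11)
}


def matching_rules(text):
    """Return the names of all rules that match text (hypothetical helper)."""
    hits = [name for name, rx in GAP_RULES.items() if rx.search(text)]
    if SMALL_FONT.search(text):
        hits.append("__SMALL_FONT")
    return hits


# Illustrative inputs, not taken from real spam.
print(matching_rules('<span style="font-size: 1px">rvwndsho</span>'))
print(matching_rules("V" + " " * 7 + "I"))
print(matching_rules("an ordinary sentence"))
```

Note that __SMALL_FONT requires at least one whitespace character after the colon, and that a size such as 10px does not match because the digit must be followed directly by px, pt or a semicolon.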
But using just these rules would produce too many false positives, so we
have developed our own ruleset (I have tried to avoid false positives,
but I am still very happy to receive bug fixes):
http://antispam.imp.ch/rules/asciispam.cf
Maybe it is useful for you.
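One common way to keep such indicators from firing on legitimate mail is to count a message only when several of them coincide. The Python sketch below is purely illustrative and is not taken from asciispam.cf: the function name `looks_like_ascii_art`, the threshold `min_gap_lines` and the sample messages are all assumptions.

```python
import re

SMALL_FONT = re.compile(r"font-size:[\s\t ]{1,3}(?:1|2)(?:px|pt|;)", re.IGNORECASE)
# Any of the gap widths listed above: 5 to 13 spaces between two letters.
ANY_GAP = re.compile(r"[a-z][ ]{5,13}[a-z]", re.IGNORECASE)


def looks_like_ascii_art(body, min_gap_lines=3):
    """Hypothetical combination: a tiny font AND gaps on several lines."""
    lines = body.splitlines()
    gap_lines = sum(1 for line in lines if ANY_GAP.search(line))
    has_small_font = any(SMALL_FONT.search(line) for line in lines)
    return has_small_font and gap_lines >= min_gap_lines


# A plain-text table has wide gaps but no tiny font.
table = "\n".join([
    "Name" + " " * 5 + "Qty",
    "Apple" + " " * 5 + "x",
    "Pear" + " " * 6 + "y",
])
# The same gaps preceded by a 1px font declaration.
spam = '<font style="font-size: 1px">\n' + table

print(looks_like_ascii_art(table))
print(looks_like_ascii_art(spam))
```

With the gap patterns alone, the table would be flagged. Requiring the small font as well is one way to avoid that kind of false positive.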
Martin
_______________________________________________
Visit http://www.mimedefang.org and http://www.canit.ca
MIMEDefang mailing list
http://lists.roaringpenguin.com/mailman/listinfo/mimedefang