[EMAIL PROTECTED] (Justin Mason) writes:

> Namely, Theo wrote a plugin which allows rules to be written which

*cough*

Actually, it was written by Felix Bauer and I extended/rewrote it some.

I had a ticket in bugzilla way back when to do what the normalized text
does except on a per-rule basis.  However, it would have been expensive.
The normal rules idea is not nearly as expensive, but is somewhat less
flexible and flexibility is important so you can tune each rule to
eliminate false positives.  Also, as Justin mentioned, you can't
transform spam garble to a standard format because there are lots of
characters used more than one way (like '|' can be 1, i, l, or a
building block for multi-character representations of letters).

However, you can loosen a regexp up such that it will match most garbled
text and that's what the new ReplaceTags plugin does (in 3.1.0-pre1).
Further, it does all the replacements at start-up time, so they're cheap
in spamd (still an expensive regexp, but it's no worse than any complex
body rule) *and* you can use different replacements for different rules.
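
To illustrate the trade-off described above, here is a minimal Python sketch (not SpamAssassin code). The `NORMALIZE` table, the `normalized_match` helper, and the sample word "pills" are all hypothetical and chosen only for illustration. A one-to-one normalization has to pick a single meaning for '|', so it gets one of the two garbled spellings wrong, while a loosened regexp can list '|' in every character class it might stand for.

```python
import re

# Hypothetical one-to-one normalization: every glyph must pick ONE meaning.
NORMALIZE = str.maketrans({"|": "l", "1": "l", "@": "a", "0": "o"})

def normalized_match(word, text):
    """Normalize the text first, then look for the plain word."""
    return word in text.lower().translate(NORMALIZE)

# Loosened regexp: the ambiguous glyph appears in each class it can stand for.
PILLS = re.compile(r"p[i|1!][l|1][l|1]s", re.IGNORECASE)

for sample in ("pi||s", "p|lls"):
    print(sample, normalized_match("pills", sample), bool(PILLS.search(sample)))
```

The normalization approach only catches the spelling where '|' happens to mean the letter it was mapped to; the loosened pattern catches both.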

Here's the usage:

Mail::SpamAssassin::Plugin::ReplaceTags - tags for SpamAssassin rules

The plugin allows rules to contain regular expression tags to be used in
regular expression rules.  The tags make it much easier to maintain
complicated rules.

Warning: This plugin relies on data structures specific to this version of
SpamAssassin; it is not guaranteed to work with other versions of SpamAssassin.

  loadplugin    Mail::SpamAssassin::Plugin::ReplaceTags

  replace_start <
  replace_end   >

  replace_tag   A       [EMAIL PROTECTED]
  replace_tag   G       [gk]
  replace_tag   I       [il|!1y\?\xcc\xcd\xce\xcf\xec\xed\xee\xef]
  replace_tag   R       [r3]
  replace_tag   V       (?:[vu]|\\\/)
  replace_tag   SP      [\s~_-]

  body          VIAGRA_OBFU     /(?!viagra)<V>+<SP>*<I>+<SP>*<A>+<SP>*<G>+<SP>*<R>+<SP>*<A>+/i
  describe      VIAGRA_OBFU     Attempt to obfuscate "viagra"

  replace_rules VIAGRA_OBFU
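
As a rough illustration of what the start-up substitution amounts to, here is a Python sketch of expanding the tags in the rule above into one ordinary regexp. It is not the plugin's Perl code: `expand_tags` is a hypothetical helper, and the class used for tag A is an assumption, since that value is redacted in the archived message.

```python
import re

# Tag table copied from the config above. The class for A is an ASSUMPTION:
# the archived message shows it as "[EMAIL PROTECTED]".
TAGS = {
    "A": r"[a@]",
    "G": r"[gk]",
    "I": r"[il|!1y\?\xcc\xcd\xce\xcf\xec\xed\xee\xef]",
    "R": r"[r3]",
    "V": r"(?:[vu]|\\\/)",
    "SP": r"[\s~_-]",
}

def expand_tags(pattern, tags, start="<", end=">"):
    """Replace every <NAME> with its regexp fragment; unknown names are left alone."""
    tag_re = re.compile(re.escape(start) + r"(\w+)" + re.escape(end))
    return tag_re.sub(lambda m: tags.get(m.group(1), m.group(0)), pattern)

RULE = r"(?!viagra)<V>+<SP>*<I>+<SP>*<A>+<SP>*<G>+<SP>*<R>+<SP>*<A>+"

# Expansion happens once, up front; matching then uses a plain compiled regexp.
expanded = expand_tags(RULE, TAGS)
VIAGRA_OBFU = re.compile(expanded, re.IGNORECASE)

for text in ("buy v1agra now", "V I A G R A", "cheap viagra here"):
    print(repr(text), bool(VIAGRA_OBFU.search(text)))
```

Because the substitution is done before any mail is scanned, the per-message cost is just that of the resulting regexp.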

But, wait, there's more!

You can also define "pre", "post", and "inter" tags, which are
automatically placed before each, after each, and between adjacent
tags, respectively.  So, if you wanted, you could define the above
VIAGRA rule like this:

  replace_post RE       +
  replace_inter SP      [\s~_-]*

  body          VIAGRA_OBFU     /<inter SP><post RE>(?!viagra)<V><I><A><G><R><A>/i
  describe      VIAGRA_OBFU     Attempt to obfuscate "viagra"

  replace_rules VIAGRA_OBFU
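
The pre/post/inter behaviour described above can be sketched in Python as follows. This is one reading of the description, not the plugin's actual implementation: the `expand` helper is hypothetical, the class for tag A is assumed (it is redacted in the archived message), and the rule is assumed to select the inter tag SP and to spell out V, I, A, G, R, A.

```python
import re

# Plain tags (A is an ASSUMED value; it is redacted in the archived message).
TAGS = {
    "A": r"[a@]",
    "G": r"[gk]",
    "I": r"[il|!1y\?\xcc\xcd\xce\xcf\xec\xed\xee\xef]",
    "R": r"[r3]",
    "V": r"(?:[vu]|\\\/)",
}
PRE = {}                         # no "pre" tags are defined in this example
POST = {"RE": "+"}               # replace_post  RE  +
INTER = {"SP": r"[\s~_-]*"}      # replace_inter SP  [\s~_-]*

META = re.compile(r"<(pre|post|inter) (\w+)>")
RUN = re.compile(r"(?:<\w+>)+")  # a run of directly adjacent plain tags

def expand(pattern):
    """Strip the meta tags, then wrap each tag and join adjacent tags."""
    chosen = {"pre": "", "post": "", "inter": ""}
    tables = {"pre": PRE, "post": POST, "inter": INTER}

    def pick(m):
        kind, name = m.groups()
        chosen[kind] = tables[kind][name]
        return ""                # the meta tag itself disappears

    pattern = META.sub(pick, pattern)

    def expand_run(m):
        names = re.findall(r"<(\w+)>", m.group(0))
        parts = [chosen["pre"] + TAGS[n] + chosen["post"] for n in names]
        return chosen["inter"].join(parts)

    return RUN.sub(expand_run, pattern)

RULE = r"<inter SP><post RE>(?!viagra)<V><I><A><G><R><A>"
print(expand(RULE))
```

Under these assumptions the result is the same regexp as the longhand rule with explicit `+` and `<SP>*` after every tag.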

In case you're not familiar with Perl regexps, the (?!viagra) is a
negative lookahead: the match fails at any position where plain-text
"viagra" begins, so the rule only matches obfuscated "viagra".
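
A small Python sketch of the lookahead's effect, using a deliberately simplified pattern rather than the full rule (the names `loose` and `guarded` are made up for this example):

```python
import re

# Simplified stand-in for the rule, with and without the lookahead guard.
loose = re.compile(r"[vu]+[i1!|]+[a@]+g+r+[a@]+", re.IGNORECASE)
guarded = re.compile(r"(?!viagra)[vu]+[i1!|]+[a@]+g+r+[a@]+", re.IGNORECASE)

for text in ("viagra", "VIAGRA", "v1agra", "vi@gra"):
    print(text, bool(loose.search(text)), bool(guarded.search(text)))
```

The lookahead consumes no characters; it only vetoes a match that would start exactly where the plain spelling starts, and with /i that covers any capitalization of it.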

Daniel

-- 
Daniel Quinlan
http://www.pathname.com/~quinlan/
