Tony White wrote:
> nothing seems to stop this Subject passing through
> all the filters in QMT.
>
> P_E N-I_S --E-N..L_A-R-G-E_M-E N-T.._ P_I-L L_S

The regex:

    P(\.\.|_| |-)E(\.\.|_| |-)N(\.\.|_| |-)I(\.\.|_| |-)S

applied to the Subject should get pretty much all of them.
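
As a quick check outside SpamAssassin, here is a minimal Python sketch
that applies the same regex to the quoted Subject. The names SEP, PATTERN
and is_gappy_match are mine, for illustration only; they are not part of
QMT or SpamAssassin.

```python
import re

# The separator alternatives are copied from the regex above.
# SEP, PATTERN and is_gappy_match are illustrative names only.
SEP = r"(\.\.|_| |-)"
PATTERN = re.compile("P" + SEP + "E" + SEP + "N" + SEP + "I" + SEP + "S")


def is_gappy_match(subject: str) -> bool:
    """Return True if the obfuscated word appears anywhere in the subject."""
    return PATTERN.search(subject) is not None


subject = "P_E N-I_S --E-N..L_A-R-G-E_M-E N-T.._ P_I-L L_S"
print(is_gappy_match(subject))        # the Subject quoted above
print(is_gappy_match("P__E N-I_S"))   # doubled separator after the P
```

The first call matches; the second does not, because the regex allows
exactly one separator between letters. If variants with doubled separators
turn up, putting a `+` after each group would probably cover them.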

Looking at my records, it seems that a typical SpamAssassin result for one
of these looks like:

  * -0.7 RCVD_IN_DNSWL_LOW RBL: Sender listed at http://www.dnswl.org/, low
  *      trust
  *      [208.72.237.26 listed in list.dnswl.org]
  *  3.5 BAYES_99 BODY: Bayes spam probability is 99 to 100%
  *      [score: 1.0000]
  *  0.0 HTML_MESSAGE BODY: HTML included in message
  *  0.0 HTML_FONT_LOW_CONTRAST BODY: HTML font color similar or identical to
  *       background
  *  0.7 MIME_HTML_ONLY BODY: Message only has text/html MIME parts
  *  2.0 GAPPY_SUBJECT Subject: contains G.a.p.p.y-T.e.x.t
  *  0.8 RDNS_NONE Delivered to internal network by a host with no rDNS
  *  1.0 BODY_URI_ONLY Message body is only a URI in one line of text or for
  *      an image

It looks like the bulk of the work is being done by BAYES_99, so maybe
when you've seen a few more of them and trained your SpamAssassin against
them, you'll start seeing more tagged as spam.
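
To put a number on that, here is a rough Python sketch that adds up the
scores from the report above. It assumes the listing shows every rule
that fired, and it assumes the stock threshold of 5.0 (required_score);
your installation may use a different value.

```python
# Scores copied from the report above.
scores = {
    "RCVD_IN_DNSWL_LOW": -0.7,
    "BAYES_99": 3.5,
    "HTML_MESSAGE": 0.0,
    "HTML_FONT_LOW_CONTRAST": 0.0,
    "MIME_HTML_ONLY": 0.7,
    "GAPPY_SUBJECT": 2.0,
    "RDNS_NONE": 0.8,
    "BODY_URI_ONLY": 1.0,
}

# Assumption: SpamAssassin's default required_score; a site may change it.
REQUIRED_SCORE = 5.0

# Round to one decimal place to hide floating-point noise in the sums.
total = round(sum(scores.values()), 1)
without_bayes = round(total - scores["BAYES_99"], 1)

print("total:", total, "tagged:", total >= REQUIRED_SCORE)
print("without BAYES_99:", without_bayes,
      "tagged:", without_bayes >= REQUIRED_SCORE)
```

On those assumptions the message scores 7.3 with BAYES_99 and 3.8 without
it, so an untrained Bayes database would leave it under the threshold.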

I don't know if there's a SpamAssassin rule that checks for 'excessive use
of HTML entities', particularly in URLs, but if there were, then
something like:

   http://&#1103;&#1092;&#1072;&#1080;&#1095;&#1095 ...

really ought to set it off.
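
Purely as an illustration of the idea, and not an existing SpamAssassin
rule, a check along these lines could count numeric character references
in a URL. The function names, the sample URL and the threshold of 3 are
all my own assumptions.

```python
import re

# Matches decimal or hex numeric character references, with or without
# the trailing semicolon (spam often leaves it off).
NUMERIC_ENTITY = re.compile(r"&#(?:[0-9]+|[xX][0-9a-fA-F]+);?")


def count_numeric_entities(url: str) -> int:
    """Count numeric HTML entities appearing in the raw URL text."""
    return len(NUMERIC_ENTITY.findall(url))


def looks_entity_heavy(url: str, threshold: int = 3) -> bool:
    """Hypothetical heuristic: flag URLs with at least `threshold` entities."""
    return count_numeric_entities(url) >= threshold


# Illustrative input: a host name written entirely as numeric entities.
sample = "http://&#1103;&#1092;&#1072;&#1080;&#1095;&#1095"
print(count_numeric_entities(sample), looks_entity_heavy(sample))
print(looks_entity_heavy("http://www.dnswl.org/"))
```

A legitimate URL rarely needs more than the odd `&amp;`, which this
pattern ignores because it only counts the numeric `&#...` form.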

Angus

