On Thursday, July 25, 2013 03:23:39 AM Karsten Bräckelmann wrote:
> On Wed, 2013-07-24 at 20:28 -0400, Ian Turner wrote:
> > I notice that the old rule ADDRESS_IN_SUBJECT was dropped starting in
> > SpamAssassin 3.3 (The change is in bug 5123 and commit 467038). Lately,
> > however, I've started getting a lot of spam again where the To: address is
> > in the subject. Perhaps it's time to evaluate restoring this rule?
> 
> Well, how do they score usually? It's hardly worth adding a point if
> they are rather high scoring anyway.
> 
>   header LOCALPART_IN_SUBJECT    eval:check_for_to_in_subject('user')
> 
> And all of them do hit that rule. A super-set of the ADDRESS variant,
> using the local part instead of the complete address. Still in stock
> rules.

They are moderately low-scoring, sadly (I wouldn't have noticed otherwise!),
mainly due to Bayes poisoning. A typical message looks like this:

  0.0 NO_DNS_FOR_FROM        DNS: Envelope sender has no MX or A DNS records
  1.9 DATE_IN_FUTURE_06_12   Date: is 6 to 12 hours after Received: date
 -1.9 BAYES_00               BODY: Bayes spam probability is 0 to 1%
                             [score: 0.0000]
  0.5 MISSING_MID            Missing Message-Id: header
  0.8 RDNS_NONE              Delivered to internal network by a host with no
                             rDNS
  0.0 T_DKIM_INVALID         DKIM-Signature header exists but is not valid

Looking at the code for check_for_to_in_subject, it looks like the regular
expression used for LOCALPART_IN_SUBJECT is rather different (much more
specific) than the one used for ADDRESS_IN_SUBJECT. Presumably that's why
LOCALPART_IN_SUBJECT doesn't hit on these messages.

An example subject from this spam (address changed to protect the innocent):
<some...@example.com>_Need Approval for Fast Funds? July 24th 2013_

For "address" mode, the regex is this one: /\b\Q$full_to\E\b/i
But for "user" mode, the regex is this one:
    /^(?:
        (?:re|fw):\s*(?:\w+\s+)?\Q$to\E$
        |(?-i:\Q$to\E)\s*[,:;!?-](?:$|\s)
        |\Q$to\E$
        |,\s*\Q$to\E[,:;!?-]$
    )/ix

Among other restrictions, this regex seems to only match the username at the 
beginning or end of the subject.
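
To make the difference concrete, here is a minimal sketch of the two patterns
quoted above, ported to Python's re module purely for illustration. It is not
SpamAssassin's code. The function name to_in_subject and the address
jdoe@example.com are made up; the first subject is the example above with that
placeholder address substituted in.

```python
import re


def to_in_subject(subject: str, full_to: str, mode: str) -> bool:
    """Illustrative port of the two regexes quoted above (hypothetical helper)."""
    if mode == "address":
        # Equivalent of /\b\Q$full_to\E\b/i: full address anywhere in the subject.
        pattern = rf"\b{re.escape(full_to)}\b"
        return re.search(pattern, subject, re.IGNORECASE) is not None
    if mode == "user":
        # Assumption: the local part is everything before the first "@".
        to = re.escape(full_to.split("@", 1)[0])
        # Equivalent of the /ix pattern: every alternative is anchored at ^.
        pattern = rf"""^(?:
            (?:re|fw):\s*(?:\w+\s+)?{to}$
            |(?-i:{to})\s*[,:;!?-](?:$|\s)
            |{to}$
            |,\s*{to}[,:;!?-]$
        )"""
        return re.search(pattern, subject, re.IGNORECASE | re.VERBOSE) is not None
    raise ValueError(f"unknown mode: {mode!r}")


spam_subject = "<jdoe@example.com>_Need Approval for Fast Funds? July 24th 2013_"
print(to_in_subject(spam_subject, "jdoe@example.com", "address"))
print(to_in_subject(spam_subject, "jdoe@example.com", "user"))
```

With this port, the "address" pattern matches the sample subject and the "user"
pattern does not, because the subject starts with "<" rather than the local
part. A subject such as "jdoe, your loan is approved" would match in "user"
mode.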

--Ian
