On Thursday, July 25, 2013 03:23:39 AM Karsten Bräckelmann wrote:
> On Wed, 2013-07-24 at 20:28 -0400, Ian Turner wrote:
> > I notice that the old rule ADDRESS_IN_SUBJECT was dropped starting in
> > SpamAssassin 3.3 (The change is in bug 5123 and commit 467038). Lately,
> > however, I've started getting a lot of spam again where the To: address is
> > in the subject. Perhaps it's time to evaluate restoring this rule?
>
> Well, how do they score usually? It's hardly worth adding a point if
> they are rather high scoring anyway.
>
>   header LOCALPART_IN_SUBJECT  eval:check_for_to_in_subject('user')
>
> And all of them do hit that rule. A super-set of the ADDRESS variant,
> using the local part instead of the complete address. Still in stock
> rules.

They are moderately low-scoring, sadly (I wouldn't have noticed otherwise!),
mainly due to bayes poison. A typical message looks like this:

   0.0 NO_DNS_FOR_FROM        DNS: Envelope sender has no MX or A DNS records
   1.9 DATE_IN_FUTURE_06_12   Date: is 6 to 12 hours after Received: date
  -1.9 BAYES_00               BODY: Bayes spam probability is 0 to 1%
                              [score: 0.0000]
   0.5 MISSING_MID            Missing Message-Id: header
   0.8 RDNS_NONE              Delivered to internal network by a host with no rDNS
   0.0 T_DKIM_INVALID         DKIM-Signature header exists but is not valid

Looking at the code for check_for_to_in_subject, it looks like the regular
expression used for LOCALPART_IN_SUBJECT is rather different (much more
specific) than the one used for ADDRESS_IN_SUBJECT. Presumably that's why
this rule doesn't match. An example subject from this spam (address changed
to protect the innocent):

  <some...@example.com>_Need Approval for Fast Funds? July 24th 2013_

For "address" mode, the regex is this one:

  /\b\Q$full_to\E\b/i

But for "user" mode, the regex is this one:

  /^(?:
      (?:re|fw):\s*(?:\w+\s+)?\Q$to\E$
      |(?-i:\Q$to\E)\s*[,:;!?-](?:$|\s)
      |\Q$to\E$
      |,\s*\Q$to\E[,:;!?-]$
  )/ix

Among other restrictions, this regex seems to only match the username at the
beginning or end of the subject.

--Ian
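
P.S. To make the difference between the two modes concrete, here is a minimal sketch that ports both regular expressions to Python and runs them against a subject shaped like the example above. It is an illustration only, not SpamAssassin's code. The address someuser@example.com is a hypothetical stand-in for the elided one, the function names are made up, re.escape stands in for \Q...\E, and re.VERBOSE stands in for Perl's /x.

```python
import re

# Hypothetical stand-ins: the real address in the example subject is elided.
FULL_TO = "someuser@example.com"
LOCAL_PART = FULL_TO.split("@", 1)[0]
SUBJECT = "<someuser@example.com>_Need Approval for Fast Funds? July 24th 2013_"


def address_in_subject(subject, full_to):
    # Port of "address" mode: /\b\Q$full_to\E\b/i
    # Unanchored, so the address may appear anywhere in the subject.
    pattern = r"\b" + re.escape(full_to) + r"\b"
    return re.search(pattern, subject, re.IGNORECASE) is not None


def localpart_in_subject(subject, to):
    # Port of "user" mode. Every alternative is anchored with ^, and most
    # also with $, so the local part must sit at the start or end.
    t = re.escape(to)
    pattern = rf"""^(?:
          (?:re|fw):\s*(?:\w+\s+)?{t}$
        | (?-i:{t})\s*[,:;!?-](?:$|\s)
        | {t}$
        | ,\s*{t}[,:;!?-]$
    )"""
    return re.search(pattern, subject, re.IGNORECASE | re.VERBOSE) is not None


if __name__ == "__main__":
    print("address mode:", address_in_subject(SUBJECT, FULL_TO))
    print("user mode:   ", localpart_in_subject(SUBJECT, LOCAL_PART))
```

With these stand-ins, the address-mode check matches and the user-mode check does not, because the subject starts with "<" and none of the anchored alternatives can begin there. A subject such as "someuser: hello" or "Re: someuser" does satisfy the user-mode pattern.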