On 19 Sep 2015, at 10:51, AK wrote:
Hi all.
I'm getting hit with lots of JUNK mail that contains several lines
consisting of just a '.' [0]. Most of the JUNK email has at least 5
and at most 10 such lines (so far) somewhere in the middle of the
message.
I've copied the message source to RegexBuddy [1] and have been able to
come up with a regex that matches what I want using the Perl 5.20
engine:
(^\.\n){5,}
However, adding this rule to /etc/spamassassin/local.cf doesn't hit at
all when I run it against my test message as follows:
===== Start Rule Block =====
rawbody __MANY_PERIODS_1 ALL =~ /(^\.\n){5,}/
meta MANY_PERIODS __MANY_PERIODS_1
score MANY_PERIODS 2.0
describe MANY_PERIODS JUNK mail with several lines that contain single dot
===== End Rule Block =====
===== Begin Test Command =====
spamassassin -L -t test.msg
===== End Test Command =====
Please help me understand what I'm doing wrong, as this is my first
attempt at creating a rule. Previously I've just copied and pasted
what I've found here in the forums; this time I'm trying to do it
myself, but failing.
There are multiple issues...
0. I have no basis to criticize RegexBuddy specifically but as a general
principle, that class of tool is usually more of a hindrance than an aid
for understanding what you're doing with regular expressions. If you're
using SA for anything more than your personal email (i.e. if you're
managing a mail system that uses SA) you really need to learn regular
expressions well enough to write them yourself.
1. As Benny noted, the '=~' isn't used in rawbody or body rules. It is
the Perl regex-match operator, and SA uses it only in header rules,
between the name of the header to be checked and the regex to be
matched. The 'ALL' in front of it is header-rule syntax as well; a
rawbody rule takes just the rule name followed by the /regex/. I think
'spamassassin --lint' would have identified that as bogus, and it is
always good practice to run that after adding new rules.
2. The 'meta' rule structure is pointlessly complex (but see (4)
below): a meta that only repeats a single subrule adds nothing over
naming and scoring that rule directly.
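3. To match across multiple lines, you need the 'm' modifier. Without
it, '^' and '$' anchor only at the very start and end of the text being
matched; with it, they also match at the start and end of each line
inside that text.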
4. You might find it more flexible to make the base rule match '^\.$'
with a tflags setting of 'multiple', and then either set one or more
meta rules for 5 or more hits, or just make the base rule a normal rule
with a score and let the multiple hits add up.
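
A rough, untested sketch of what points (1), (3) and (4) could look
like together in local.cf. The subrule name __LONE_DOT_LINE is made up
for illustration, and the threshold of 5 and the score of 2.0 are simply
carried over from the original rule, so treat them as starting points to
tune rather than recommendations:

```
# Subrule: one hit per line that consists of a single dot.
# No header name and no '=~' here, since this is a rawbody rule.
# The 'm' modifier lets ^ and $ match at line boundaries.
rawbody  __LONE_DOT_LINE  /^\.$/m
# Count every matching line instead of stopping at the first hit.
tflags   __LONE_DOT_LINE  multiple

# Scored rule: fires once the subrule has hit five or more times.
meta     MANY_PERIODS     (__LONE_DOT_LINE >= 5)
score    MANY_PERIODS     2.0
describe MANY_PERIODS     Five or more lines containing only a dot
```

After saving, 'spamassassin --lint' should report nothing for these
lines, and the same 'spamassassin -L -t test.msg' command can be used to
check whether MANY_PERIODS shows up in the report.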