On 04/11/2010 8:11 PM, Karsten Bräckelmann wrote:
Moving back on-list, since it doesn't appear to be personally directed
at me.
On Thu, 2010-11-04 at 19:22 -0230, Lawrence @ Rogers wrote:
On 04/11/2010 7:13 PM, Karsten Bräckelmann wrote:
No, that requires the Subject to consist of exactly one whitespace.
Read it out loud. The ^ beginning of the string, followed by exactly one
whitespace char [2]. Followed by the $ end of the string.
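
To make that reading concrete, here is a minimal sketch. It assumes the pattern under discussion was something like /^\s$/ (the original rule is not quoted in this message), and it uses Python's re module rather than SA itself; the sample header contents are made up for illustration.

```python
import re

# Hypothetical header contents, as a rule might see them (assumed values).
samples = {
    "empty": "",
    "one space": " ",
    "two spaces": "  ",
    "text": "hello",
}

# ^ beginning of string, exactly one whitespace char, $ end of string.
one_whitespace = re.compile(r"^\s$")

results = {name: bool(one_whitespace.search(value))
           for name, value in samples.items()}

for name, matched in results.items():
    print(f"{name!r}: {matched}")
```

Only the single-space sample matches; a completely empty string does not.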
No offense, but I am a C and PHP programmer and Perl's documentation is
lacking, to put it politely. Too much theory and far too few actual real
world examples.
This is not about Perl, but Regular Expressions: the much more
feature-rich (and widely adopted) Perl flavor, out of all the existing
variants. But that's actually irrelevant in this case, because you would
need only a very limited subset, pretty much available in any tool
sporting REs. Any introduction to REs would do, no need to turn to the
Perl docs you don't like. Though it sounds like you didn't even have a
look at the docs I pointed you to.
That is exactly what I am trying to match, and according to my tests, it
works as expected. When the To and Subject are empty, all that's there
(before the newline) is one whitespace.
Are you referring to the whitespace delimiter between the Header: and
its content? It's not part of the content.
What I am looking to check is a situation where both the To: and
Subject: headers contain nothing at all, but are set (I've seen this in
several spam e-mails recently)
Now you're confusing me. Do you want to match a single whitespace, or a
completely empty header?
If there's a better way of doing this, I would appreciate you providing
an example.
Well, better way... One that does what you just described.
Assuming you want to match "headers containing nothing at all", as per
your previous paragraph. That would be nothing between the beginning and
end.
header __FOO Foo =~ /^$/
Or, negated, not anything.
header __FOO Foo !~ /./
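
As a rough illustration of why the two forms agree on a truly empty value, here is a small sketch in Python's re module (not SA itself). The helper names is_empty and has_no_char are hypothetical, and the sample values are assumptions.

```python
import re

def is_empty(value):
    # Mirrors `=~ /^$/`: nothing between beginning and end.
    return re.search(r"^$", value) is not None

def has_no_char(value):
    # Mirrors `!~ /./`: not a single character anywhere.
    return re.search(r".", value) is None

# Hypothetical header contents (assumed values).
samples = ["", " ", "foo"]
checks = [(value, is_empty(value), has_no_char(value)) for value in samples]

for value, empty, no_char in checks:
    print(f"{value!r}: /^$/ -> {empty}, !~ /./ -> {no_char}")
```

Both checks are true for the empty string and false for a single space, so neither one matches the "one whitespace" case.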
Now, since you specifically constrained this, you might want to check
for the header's existence. Probably not worth it, though. The following
is copied from stock 20_head_tests.cf, and documented in SA Conf.
header __HAS_SUBJECT exists:Subject
Anyway, in cases like these it's best to provide a *raw* sample, showing
the headers in question completely un-munged and exactly as seen by SA.
(Otherwise our help often is limited to guessing and an informal
description.) This prohibits copy-n-paste from your MUA, which too often
changes subtle but important details.
One easy way to come to a conclusion whether you want to match
whitespace or not, is the following ad-hoc header rule with spamassassin
debug. The matching header's contents are shown in double quotes.
spamassassin -D --cf="header FOO To =~ /^.*/" < msg 2>&1 | grep FOO
And just for reference, 'grep' uses REs...
Thanks Karsten,
One of these days when I get some free time, I will be sitting down and
reading up on REs :)
Using your examples, and some hackery, I came up with this. It checks
for the existence of the To header as well, as SA doesn't seem to have a
rule for doing this on its own (a grep -r "exists:To" * on the rules
pulled in from updates.spamassassin.org produced nothing).
# Message has empty To: and Subject: headers
# Likely spam
header __LW_HAS_TO exists:To
header __LW_EMPTY_TO To =~ /^$/
header __LW_EMPTY_SUBJECT Subject =~ /^$/
meta LW_EMPTY_SUBJECT_TO (__HAS_SUBJECT && __LW_HAS_TO && __LW_EMPTY_SUBJECT && __LW_EMPTY_TO)
describe LW_EMPTY_SUBJECT_TO Message has empty To and Subject headers
score LW_EMPTY_SUBJECT_TO 2.5
I added this to my custom .cf rules file and ran spamassassin --lint and
got no complaints. I ran it over a sample spam, and it hit. I took
another spam where both headers had information in them, and it didn't
hit. Guess it works as expected :)
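
For anyone wanting to sanity-check the logic outside SA, the meta rule's boolean combination can be sketched roughly as follows. This is only a Python approximation: the function name empty_to_and_subject and the dict representation of headers are hypothetical, and treating a missing header as an empty string is an assumption of this sketch, not a statement about how SA handles unset headers.

```python
import re

def empty_to_and_subject(headers):
    """Approximate the LW_EMPTY_SUBJECT_TO meta rule.

    `headers` is a hypothetical dict mapping header names to contents.
    """
    has_subject = "Subject" in headers           # exists:Subject
    has_to = "To" in headers                     # exists:To
    # Assumption of this sketch: a missing header is treated as "".
    empty_subject = re.search(r"^$", headers.get("Subject", "")) is not None
    empty_to = re.search(r"^$", headers.get("To", "")) is not None
    return has_subject and has_to and empty_subject and empty_to

# Made-up sample messages, reduced to the two headers of interest.
spam_like = {"To": "", "Subject": ""}
normal = {"To": "user@example.com", "Subject": "Hello"}
missing_to = {"Subject": ""}

print(empty_to_and_subject(spam_like))
print(empty_to_and_subject(normal))
print(empty_to_and_subject(missing_to))
```

In this sketch the existence checks are what keep a message with no To header at all from being flagged.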
Cheers,
Lawrence