On 9/27/2012 1:48 PM, Alexandre Boyer wrote:
Alex, from prypiat.
Yes, I recycle.
On 12-09-27 11:09 AM, Bowie Bailey wrote:
On 9/27/2012 10:41 AM, Alexandre Boyer wrote:
Hello all,
Here is a small ruleset that I'm working with. I added it to our
local ruleset in prod:
# BAD LINKS N-NG ;-) ;
# Canada Post
uri_detail AJB_CANPOST_BADLINK raw !~ /canadapost\./ text =~ /(?:https?:\/\/(?:www\.)?|www\.)canadapost\./ type =~ /^a$/
describe AJB_CANPOST_BADLINK Found a mismatch between href and anchored text pretending to link to www.canadapost.ca
score AJB_CANPOST_BADLINK 1.0
meta AJB_CANPOST_PHISH_BADTRACKNUM Z_CANPOST_BADLINK && !Z_CANPOST_TRACKNUM
describe AJB_CANPOST_PHISH_BADTRACKNUM Mismatch between href and anchored + unofficial tracking number from CanadaPost
score AJB_CANPOST_PHISH_BADTRACKNUM 2.0
# youtube
uri_detail AJB_UTUBE_BADLINK raw !~ /youtube\./ text =~ /(?:https?:\/\/(?:www\.)?|www\.)youtube\./ type =~ /^a$/
describe AJB_UTUBE_BADLINK Found a mismatch between href and anchored text pretending to link to www.youtube.com
score AJB_UTUBE_BADLINK 0.5
# because of link trackers (from mass mailers, for example), we must
# meta this with other rules to be sure we are facing our fake youtube botnet
meta AJB_FK_UTUBE_BOTNET Z_UTUBE_BADLINK && Z_EMPTY_SUBJ && MIME_HTML_ONLY
describe AJB_FK_UTUBE_BOTNET mismatch between href and anchored + empty subject = botnet
score AJB_FK_UTUBE_BOTNET 5.5
##
# TODO: check if we could work with DKIM, exists:List-Unsubscribe,
# SPF_PASS, RCVD_IN_RP_SAFE, RCVD_IN_RP_CERTIFIED and others
# in order to avoid FPs from MassMailers.
Note the TODO ;-)
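For anyone who wants to poke at the matching logic outside SpamAssassin, here is a minimal Python sketch of what the two BADLINK rules test: the anchor text looks like a link to the brand, but the href does not mention the brand. This is not how uri_detail is implemented. The function name looks_like_bad_link and the sample URLs are made up for illustration, and Python's re module stands in for the Perl regexps.

```python
import re

def looks_like_bad_link(href, anchor_text, brand):
    """Return True when the anchor text shows a link to `brand` but the href does not.

    Hypothetical helper: mirrors the rule's `raw !~` and `text =~` conditions.
    """
    brand_re = re.escape(brand) + r"\."
    # Same shape as the rule's "text" pattern: scheme and/or www., then the brand.
    text_re = r"(?:https?://(?:www\.)?|www\.)" + brand_re
    href_mentions_brand = re.search(brand_re, href) is not None
    text_shows_brand_link = re.search(text_re, anchor_text) is not None
    return text_shows_brand_link and not href_mentions_brand

# Anchor text claims canadapost, href goes elsewhere: flagged.
print(looks_like_bad_link("http://example.test/track", "www.canadapost.ca", "canadapost"))
# Href and anchor text agree: not flagged.
print(looks_like_bad_link("http://www.canadapost.ca/track", "www.canadapost.ca", "canadapost"))
```

Note that, like the rule itself, the check only asks whether "canadapost." appears anywhere in the href, so a hostile URL that embeds that string in its path would not be flagged.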
Don't know if it makes much difference in this case, but...
(?:https?:\/\/(?:www\.)?|www\.)
Should catch:
http://
https://
http://www.
https://www.
www.
can be simplified to:
(?:https?:\/\/|www\.)
While this catches:
http://
https://
www.
Covering less. It may be overkill, but my regex has one and only one
purpose: to match any kind of "valid" web link, as per common user
experience (i.e. "as seen on TV").
The spammer will try to lure the common user by mimicking what the common
user is used to seeing, no?
Check again. "http://www." and "https://www." are caught by the "www."
pattern. Matching the "https?://" as well is not needed. That's why I
mentioned anchoring. If you were anchoring the front of the regexp, you
would need this match. Since you are not, the extra specificity is not
needed. My regexp matches exactly the same strings as yours.
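The anchoring point can be checked with a quick sketch. The two patterns below are the ones from this thread with the canadapost suffix attached. The sample list and helper names are assumptions added for the test, and Python's re is used in place of Perl (the syntax is the same for these patterns). re.search behaves like an unanchored regexp, while re.match behaves as if the pattern started with ^.

```python
import re

ORIGINAL = re.compile(r"(?:https?://(?:www\.)?|www\.)canadapost\.")
SIMPLIFIED = re.compile(r"(?:https?://|www\.)canadapost\.")

# Illustrative inputs only.
SAMPLES = [
    "http://canadapost.ca",
    "https://canadapost.ca",
    "http://www.canadapost.ca",
    "https://www.canadapost.ca",
    "www.canadapost.ca",
    "canadapost.ca",  # no scheme and no www.: neither pattern should hit
]

def unanchored_hits(pattern):
    """Strings where the pattern matches anywhere (how the rule is written)."""
    return [s for s in SAMPLES if pattern.search(s)]

def anchored_hits(pattern):
    """Strings where the pattern matches at the very start (as if prefixed with ^)."""
    return [s for s in SAMPLES if pattern.match(s)]

# Unanchored: both patterns hit the same strings.
print(unanchored_hits(ORIGINAL) == unanchored_hits(SIMPLIFIED))
# Anchored: the simplified pattern misses the "http(s)://www." forms.
print(anchored_hits(ORIGINAL) == anchored_hits(SIMPLIFIED))
```

Unanchored, the "www." alternative of the simplified pattern simply matches a few characters further into "http://www.canadapost.ca", which is why the result is the same.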
Since you're not anchoring the front of the regexp or trying to
capture the match, the results will be the same.
I'm not capturing because I don't use the match afterwards. On a small
system, this makes no difference. On large systems (millions of emails
filtered a day), this is probably making a difference. That's a guess;
I don't want to prove this on my own systems :-)
Right. No need to capture here or in most SA rules. I only mentioned
it since there would be a difference between your original regexp and my
suggestion if you were doing some capturing.
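To illustrate where capturing would make the two regexps diverge, here is a small sketch with the (?:...) group turned into a capturing group. Neither rule in this thread does this; the capturing variants and the helper name captured_prefix are hypothetical, and Python's re again stands in for Perl.

```python
import re

# Hypothetical capturing variants of the two patterns discussed above.
ORIGINAL_CAP = re.compile(r"(https?://(?:www\.)?|www\.)canadapost\.")
SIMPLIFIED_CAP = re.compile(r"(https?://|www\.)canadapost\.")

def captured_prefix(pattern, text):
    """Return what group 1 captured, or None if the pattern does not match."""
    m = pattern.search(text)
    return m.group(1) if m else None

url = "http://www.canadapost.ca"
print(captured_prefix(ORIGINAL_CAP, url))    # http://www.
print(captured_prefix(SIMPLIFIED_CAP, url))  # www.
```

Both patterns still match the URL, but the captured text differs, so anything that used the capture afterwards would see a different value.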
As I said, it may not make any real difference here, I was simply
pointing out the possible simplification of the regexp.
--
Bowie