Hiyo!

Having a problem. Maybe with the RE, or with SA processing HTML....
I set up a rule to look for any.garbage.anything.stuff.realdomain.tld -
anything with a bunch of bogus host names on the front:

uri LOC_HTMLLONGHREF      /(?:[^"\/\.]+\.){5,}[^"\/\.]+/i
describe LOC_HTMLLONGHREF href has too many 'hostnames' in domain
score LOC_HTMLLONGHREF    0.5

If I got it right, the RE says:
Look for five or more occurrences of 
  ( one or more of anything except a " or / or .  
    followed by a . )
All followed by a string of anything except a " or / or .
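
One way to sanity-check that reading is to run the same pattern outside SA. The following is a minimal sketch in Python, assuming Python's `re` module treats this pattern the same way Perl does (it uses only constructs common to both). The helper name `hits` and the third test string are made up for illustration; the first URL is the one from the mail quoted further down. It only shows what the pattern matches, not which strings SA actually hands to a `uri` rule.

```python
import re

# Same pattern as the LOC_HTMLLONGHREF rule, with the /i flag as IGNORECASE.
LONG_HREF = re.compile(r'(?:[^"\/\.]+\.){5,}[^"\/\.]+', re.IGNORECASE)

def hits(text):
    """Return True if the pattern matches anywhere in text (hypothetical helper)."""
    return LONG_HREF.search(text) is not None

# The URL from the mail has only two label-dot pairs after "//": no hit.
print(hits("http://www.sunnymail.info?id=1094"))

# The kind of URL the rule was written for: five label-dot pairs, then "tld".
print(hits("http://any.garbage.anything.stuff.realdomain.tld/"))

# The character class excludes only ", / and ., so spaces are accepted and
# ordinary dotted text (illustrative string) also satisfies the pattern.
print(hits("dnodnad. nylun gblcla. racsuzwjigihaa.eowhifr. ygieml ndmbd. end"))
```

Note that the negated class allows whitespace and newlines, so any run of text containing five or more periods with no quote or slash in between satisfies the pattern, whether or not it looks like a host name.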

Yet it triggered on what looks like a regular URL. There is
definitely something 'weird' about the mail. It has a
'quoted-printable' part, which repeats the plain text mail, and it
includes a couple of '3D' codes in places where I think they mess up
the scanning of HTML and URIs by SA? A new spammer trick?

Relevant e-mail bits:

X-Spam-Status: No, hits=3.0 required=3.5 autolearn=no tests=HTML_20_30=0.474,
        HTML_MESSAGE=0.001,LOC_HTMLLONGHREF=0.5,NO_DNS_FOR_FROM=1.105,
        PRIORITY_NO_NAME=0.831,RCVD_IN_SORBS=0.1

This is how the HTML is quoted when I just hit my 'reply' in PINE:

!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
HTML>HEAD>
META http-equiv=Content-Type content="text/html; charset=utf-8">
STYLE></STYLE>
/HEAD>
BODY bgColor=#ffffff>
Hello, [EMAIL PROTECTED]>

These btihces don't want to fuck with you? Show WHO THE REAL MAN
IS!BR>BR>BR>
JOIN NOW and FORCE THEM!BR>BR>
Get to know about REAL BURTE RAPE!BR>BR>
Tons of donwoldabale moives, phtoos and stroies!BR>BR>
A href="http://www.sunnymail.info?id=1094";>down mouse button there/A>
dubtzkau ckylcohoup. dnodnad.BR>BR>BR>nylun upybreube; xiuramiri
tigawemus
mokmroa gblcla.>
wortrvanenurm mnogicagaz- racsuzwjigihaa.eowhifr. ygieml ndmbd.BR>BR>
nhenugmogmp quogarnulirr; usbuoei eregespaw.BR><Bhuutrj. topoct-
aptyrtli.BR>
lmlyisfa.BR>BR>
aeghuater hsckhmaj dduknioev- jabulet misairuurf.br>BR>
doidazi.BR>BR>/BODY>/HTML>

I've snipped all the left-angle brackets to avoid having anyone's e-mail
parse the HTML, but I notice there is a right angle bracket and a left
bracket in the obfuscating text, in the last seven lines, above.

Now here is what it looks like when I edit my mailbox as raw text.
Notice the '3D' on the 'meta' line, and in the 'href'.....

META http-equiv=3DContent-Type content=3D"text/html; charset=3Dutf-8">
STYLE>/STYLE>
/HEAD>
BODY bgColor=3D#ffffff>
Hello, [EMAIL PROTECTED]>

These btihces don't want to fuck with you? Show WHO THE REAL MAN =
IS!BR>BR>BR>
JOIN NOW and FORCE THEM!BR>BR>
Get to know about REAL BURTE RAPE!BR>BR>
Tons of donwoldabale moives, phtoos and stroies!BR>BR>
<A href=3D"http://www.sunnymail.info?id=3D1094";>down mouse butto=
n there/A>
dubtzkau ckylcohoup. dnodnad.BR>BR>BR>=

Gee, does the line-break "=" in the middle of the URI mess up the test?
If this is a spammer trick, I'm thinking that my 'longhref'
test may be doing a better job than I ever intended.... :-)
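
On the '3D' codes: in the quoted-printable transfer encoding, `=3D` is the encoded form of a literal `=`, and a `=` at the very end of a line is a soft line break that is dropped when the part is decoded. The sketch below uses Python's standard `quopri` module on the two raw href lines copied from above. It only shows what a standards-following decoder produces from those lines; it is an assumption, not something verified here, that SA sees the same decoded text before URI rules run.

```python
import quopri

# Raw quoted-printable lines as they appear in the mailbox (copied from above).
raw = (
    b'<A href=3D"http://www.sunnymail.info?id=3D1094";>down mouse butto=\n'
    b'n there/A>\n'
)

# "=3D" decodes to "=", and the trailing "=" joins the line to the next one.
decoded = quopri.decodestring(raw).decode("ascii")
print(decoded)
```

After decoding, the href reads the same as in the PINE reply, and "butto=" / "n there" rejoin into "button there", which suggests the '3D' codes and the line-ending "=" are ordinary encoding rather than obfuscation.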

- Charles
