I don't know if anyone is currently running the latest version of AlliGate
(formerly known as SpamManager) for Declude/IMail, but I have been running
it for the last week or so, and it has a bunch of new features and spam
tests that have greatly increased its ability to flag spam.

The discussion about excess HTML tags (fake or legit) in e-mail messages may
benefit from a couple of the new tests incorporated into AlliGate.  One of
these tests helps to detect e-mail messages that have a high HTML-to-text
ratio.  Here is the pertinent part of the AlliGate manual that explains how
this test works:

Many messages have HTML formatting to make them more interesting and
readable by the end user. Of course, this includes spam as well. Some spam
messages have a higher degree of HTML specific tags and content than other
non-spam messages. SpamManager calculates the ratio of HTML related content
to actual, readable, text and a percentage is calculated. Our research
indicates that as the percentage of HTML/text reached values in excess of
55%, the likelihood of the message being spam increases. This is a
sliding-scale test and the penalty increases as the ratio increases above
the base percentage. The base percentage can be adjusted to suit your needs.
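
To make the sliding-scale idea concrete, here is a rough sketch of how a test
like this could be scored. This is not AlliGate's code. The manual does not
say exactly how the percentage is computed, so the sketch assumes it is markup
characters as a share of markup plus visible text. The function names and the
points_per_percent weight are made up for illustration; only the 55% base
comes from the manual excerpt above.

```python
import re

# Anything between < and > counts as markup, including made-up tags.
TAG_RE = re.compile(r"<[^>]*>")


def html_percentage(body: str) -> float:
    """Share of the message that is markup, as a percentage (assumed formula)."""
    markup_chars = sum(len(tag) for tag in TAG_RE.findall(body))
    # Visible text is what remains after tags are removed, ignoring whitespace.
    visible_chars = len(re.sub(r"\s+", "", TAG_RE.sub("", body)))
    total = markup_chars + visible_chars
    return 100.0 * markup_chars / total if total else 0.0


def sliding_penalty(percent: float, base: float = 55.0,
                    points_per_percent: float = 0.2) -> float:
    """No penalty at or below the base; grows linearly above it.

    points_per_percent is a hypothetical weight, not an AlliGate setting.
    """
    return max(0.0, percent - base) * points_per_percent


if __name__ == "__main__":
    sample = '<p><font face="Arial,Helvetica">* G<ksfvuh135aju042>ain</font>'
    pct = html_percentage(sample)
    print(f"HTML share: {pct:.1f}%  penalty: {sliding_penalty(pct):.2f}")
```

Note that a plain regex treats fake tags such as <ksfvuh135aju042> the same as
real ones, so no list of valid tags is needed for this kind of test.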

There is also a compression test that works pretty slick:

Many spam messages contain text that is repeated numerous times, such as
repeating HTML tags and URL's. This means that when applying a compression
algorithm to the message, much like is done with ZIP files, that the more a
message can be compressed, the more likely it is to be spam. SpamManager
applies a fast, low overhead, proprietary compression technique that is
optimized for text messages and calculates the amount of compression
achieved. Our research has shown that as a message's compression increases
above 40%, so does its probability of being spam. This is a sliding-scale
test and the penalty increases as the amount of compression increases above
the base percentage. The base percentage can be adjusted to suit your needs.
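
The same idea can be sketched with an off-the-shelf compressor. AlliGate uses
its own proprietary technique, which is not public, so this sketch swaps in
zlib (DEFLATE) from the Python standard library instead. The percentages it
produces will not match AlliGate's. The function names and the
points_per_percent weight are made up for illustration; only the 40% base
comes from the manual excerpt above.

```python
import zlib


def compression_percent(body: str) -> float:
    """Percentage by which zlib shrinks the message (0 if it does not shrink)."""
    raw = body.encode("utf-8")
    if not raw:
        return 0.0
    packed = zlib.compress(raw)
    # Very short or non-repetitive input can grow when compressed; clamp to 0.
    return max(0.0, 100.0 * (1.0 - len(packed) / len(raw)))


def sliding_penalty(percent: float, base: float = 40.0,
                    points_per_percent: float = 0.1) -> float:
    """No penalty at or below the base; grows linearly above it.

    points_per_percent is a hypothetical weight, not an AlliGate setting.
    """
    return max(0.0, percent - base) * points_per_percent


if __name__ == "__main__":
    repetitive = '<font face="Arial,Helvetica"></font>' * 200
    pct = compression_percent(repetitive)
    print(f"compression: {pct:.1f}%  penalty: {sliding_penalty(pct):.2f}")
```

A general-purpose compressor also shrinks ordinary English text quite a bit,
so the base percentage would need tuning against real mail before using
something like this.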

These are in addition to about a half dozen other spam tests that have been
added to the release version of AlliGate.  You may want to take a look at
www.alligate.com.  Overall, it has been a very nice additional plug-in to
our Declude/Sniffer/SpamCheck spam filtering system.


----- Original Message ----- 
From: "R. Scott Perry" <[EMAIL PROTECTED]>
Sent: Friday, June 06, 2003 5:07 AM
Subject: Re: [Declude.JunkMail] Request for new/enhanced feature

> >I keep getting mail that slipps through that IMO shouldn't be that
> >hard to catch really...
> <G>
> >They use a variant of the html comments but
> >the way they do it it don't get detected as a mail with to many html
> >comments.
> Correct.  Because if Declude JunkMail were to count all the HTML tags, then
> all that Microsoft Word E-mail Garbage (those one line E-mails that turn
> into 10K E-mails) would get caught, and a lot of other legitimate HTML
> E-mail would get caught, too.
> >Below is a snippet of example text inside the html formated e-mail :
> >
> >P<k73ch7b1tddy>en<kqjezab3w79ej>is
> >En<kpv36t91gfs2>larg<ktwn2sd3kn7tq>eme<k63uv4i3njxxc>nt
> >Pi<kxl9qjl2r3ervk>ll On The
> >Ma<k9jgo17u5v244>rke<kth2amv3m1s>t!</font></font></font></b><font
> >face="Arial,Helvetica"></font>
> ><p><font face="Arial,Helvetica">* G<ksfvuh135aju042>ai<kndkb4w1ppwy192>n
> >3<kbq72kb2dv2xsd2>+ Full In<kn46ft9yw8p>ch<kwhb2wy27wls3>es In
> >Leng<ka4vte11x26Leng<ka4vte11x26w>th</font>
> ><br><font face="Arial,Helvetica">* Ex<kcay5sz12le0>pand Your
> >Pe<kt70s753udaio49>nis Up To 20<kh3tfh82ejp1>%
> >
> >Basically remove the <xxxxx> junk and you get the text.
> That's exactly what the latest beta version does, so you can filter on it.
> >Since these are "invalid" html comments most e-mail clients just simply
> >ignore the
> >"comment" text all together since it has the <> around the text.
> Technically, these aren't invalid HTML comments, they are made-up HTML
> (which could be valid in the future).  That's the problem.  The only way to
> tell whether a tag is valid or not is to have a database of valid tags,
> which would be very expensive (CPU time, storage space, man-hours to
> the data and update it, false positives, etc.).  If I recall correctly,
> HTML isn't even covered by the RFCs, which makes it more difficult to
> assess the tags.
> >IMO this should also have failed HTMLCOMMENTS  which it did not.
> >So my question.. Would it be possible to add the above "junk" as
> >detected html comment ?
> In this case, we could say "OK, '<k73ch7b1tddy>' is a bogus HTML tag.  And
> '<ksfvuh135aju042>' is a bogus HTML tag.  And...", but a spammer could get
> around that simply by making another fake tag.
> So the only alternatives seem to be either [1] Count all HTML tags and
> catch legitimate E-mail, or [2] Keep a database of HTML tags.
>                                                     -Scott
> ---
> Declude JunkMail: The advanced anti-spam solution for IMail mailservers.
> Declude Virus: Catches known viruses and is the leader in mailserver
> vulnerability detection.
> Find out what you have been missing: Ask for a free 30-day evaluation.
> ---
> [This E-mail was scanned for viruses by Declude Virus (http://www.declude.com)]
> ---
> This E-mail came from the Declude.JunkMail mailing list.  To
> unsubscribe, just send an E-mail to [EMAIL PROTECTED], and
> type "unsubscribe Declude.JunkMail".  The archives can be found
> at http://www.mail-archive.com.

[This E-mail was scanned for viruses by Declude Virus (http://www.declude.com)]