> Is it possible or practical to create a test that will check 
> to see if the body is only HTML, and if so, if the amount of 
> total characters is small, fail?


We have some tests for this in the latest beta release of SpamChk (the one
with MIME support).
Possible results include:

        Mail is HTML only
        Mail is quoted printable
        Mail HTML only AND quoted printable encoded

Note: quite a few legit mails, especially autogenerated ones such as
newsletters, have only an HTML part. So be careful and assign only a few
points for this.
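
As a rough illustration of how such a check could work (this is not SpamChk's
actual code), here is a minimal Python sketch using the standard email
package. The function name, the flag names and the weights are all made up
for the example; the weights are kept small on purpose, as suggested above.

```python
from email import message_from_string

# Hypothetical weights, deliberately low: many legit newsletters are HTML only.
WEIGHTS = {"html_only": 1, "quoted_printable": 1}


def classify(raw_message):
    """Return (flags, points) for a raw RFC 822 message given as a string."""
    msg = message_from_string(raw_message)
    # Only look at the leaf text parts, not at multipart containers.
    text_parts = [
        part for part in msg.walk()
        if not part.is_multipart() and part.get_content_maintype() == "text"
    ]
    flags = {
        # HTML only: at least one text part, and every text part is text/html.
        "html_only": bool(text_parts) and all(
            part.get_content_type() == "text/html" for part in text_parts
        ),
        # Quoted printable: any text part declares that transfer encoding.
        "quoted_printable": any(
            str(part.get("Content-Transfer-Encoding", "")).strip().lower()
            == "quoted-printable"
            for part in text_parts
        ),
    }
    points = sum(WEIGHTS[name] for name, hit in flags.items() if hit)
    return flags, points
```

A message that trips both flags gets the sum of both weights, which matches
the third result in the list above.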

Kami,
We already strip the HTML code in SpamChk before searching for keywords.
The problem is that most "image-only spams" contain some hidden or very
small (font size=1px) text that either looks very legit or is random, in
order to bypass Bayesian filtering.

Looking for HTML-only mails with little text can also catch legit messages
(as Sheldon mentioned). But we can do some research to see whether it is
statistically demonstrable that much more of this message type is spam than
legit. (In that case it would be a usable test in a weighting system.)
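
The test asked about in the original question could be sketched roughly like
this in Python. It is a sketch under assumptions, not a SpamChk or Declude
feature: the function names, the 100-character threshold and the 2 points
are placeholders that would have to come out of the research mentioned
above.

```python
from email import message_from_string
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text content, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.chunks.append(data)


def visible_char_count(html):
    """Count non-whitespace characters left after removing the markup."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return len("".join("".join(parser.chunks).split()))


def short_html_only_points(raw_message, min_chars=100, points=2):
    """Return `points` if the mail is HTML only and has little text, else 0."""
    msg = message_from_string(raw_message)
    text_parts = [
        part for part in msg.walk()
        if not part.is_multipart() and part.get_content_maintype() == "text"
    ]
    if not text_parts or any(
        part.get_content_type() != "text/html" for part in text_parts
    ):
        return 0  # not an HTML-only mail, so the test does not apply
    total = 0
    for part in text_parts:
        payload = part.get_payload(decode=True) or b""
        charset = part.get_content_charset() or "latin-1"
        try:
            html = payload.decode(charset, errors="replace")
        except LookupError:  # unknown charset name in the header
            html = payload.decode("latin-1")
        total += visible_char_count(html)
    return points if total < min_chars else 0
```

Returning a few points instead of failing the mail outright keeps the test
usable in a weighting system even though it will sometimes hit legit mail.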

Question:
How often does this type of spam pass your filters?
Of the eight spams this month that passed our filters and reached my
personal mailbox, only one was an image-only spam.

Markus

---
This E-mail came from the Declude.JunkMail mailing list.  To
unsubscribe, just send an E-mail to [EMAIL PROTECTED], and
type "unsubscribe Declude.JunkMail".  The archives can be found
at http://www.mail-archive.com.
