There was some talk a while ago about blocking the HTML comment tags
spammers use to make the text in their messages difficult to parse.

Stuff like this:

<!--r7-->ou th<!--c6-->ousands o<!--sE-->ver the l<!--S2-->ife of your
lo<!--J2-->an! W<!--cJ-->hat are you
wai<!--dR-->ting for? <!--x2--><br><br><!--TY--><A
href="http://mtggreat1.com/4/index.asp?RefID=588897";>
<font size="5">Vi<!--Ow-->sit N<!--OC-->ow</font></a></font><!--iw-->
</td><!--ZP--></tr><br></table><br><!--vz-->

Scott suggested this would be impractical, since any number of random tags
could be invented and the browser must handle them cleanly, so scanning for
each one is useless.  However, it would be possible to add a text comparison
function that examines only the text outside of the <>'s.
This way you could separate your rules into those which can work on the
entire HTML body, for instance:

CONTAINS 10 <charset=windows-1251>

and those which must ignore html

HTML_CLEAN_CONTAINS 10 pen1s

That way you don't burn CPU cycles parsing HTML tags on stuff that's not
important, but you can spend them in cases where you want to.
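To illustrate the idea, here is a minimal sketch of what such a
comparison function might do.  The name html_clean_contains and the
details are my own assumptions, not Declude's actual rule syntax: strip
everything between < and > (comments first, since they can span tag-like
text), then search the remaining visible text.

```python
import re

def html_clean_contains(body: str, phrase: str) -> bool:
    """Hypothetical HTML_CLEAN_CONTAINS: match only text outside the <>'s."""
    # Drop HTML comments first (the obfuscation tags in the spam sample).
    visible = re.sub(r'<!--.*?-->', '', body, flags=re.DOTALL)
    # Then drop any remaining tags.
    visible = re.sub(r'<[^>]*>', '', visible)
    return phrase.lower() in visible.lower()

spam = ('W<!--cJ-->hat are you wai<!--dR-->ting for? '
        '<A href="#"><font size="5">Vi<!--Ow-->sit N<!--OC-->ow</font></a>')

print(html_clean_contains(spam, "Visit Now"))  # True: phrase survives stripping
print("Visit Now" in spam)                     # False: plain CONTAINS misses it
```

A plain substring test never sees "Visit Now" because the comment tags
split the words, which is exactly why a separate tag-ignoring rule type
would be worth the extra CPU.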

Rob Salmond
Ontario Die Company
(519)-576-8950 ext. 132


---
This E-mail came from the Declude.JunkMail mailing list.  To
unsubscribe, just send an E-mail to [EMAIL PROTECTED], and
type "unsubscribe Declude.JunkMail".  The archives can be found
at http://www.mail-archive.com.