http://bugzilla.spamassassin.org/show_bug.cgi?id=3661





------- Additional Comments From [EMAIL PROTECTED]  2005-03-14 13:17 -------
Subject: Re:  Request for HTML de-obfuscation of invisible SPAN's

On Sun, Mar 13, 2005 at 11:39:04PM -0800, [EMAIL PROTECTED] wrote:
> In passing, I don't understand the general reluctance to add new rule types.

The short version is that adding a rule type is non-trivial.  Having several
rule types that are basically the same and differ only in small ways (such as
using array2 instead of array1) is a real pain and adds a lot of overhead.

> > However, the rule is horrible as a spam detector:
> >
> > OVERALL%   SPAM%     HAM%      S/O   RANK   SCORE  NAME
> >   4.187   4.0333   5.2863    0.433   0.00    0.01  T_HTML_INVIS_TEXT
> 
> That amazes me.  I wonder what kind of things are invisible in that ham
> mail?  Are these newsletter type things, or Word HTML output?  Or is there
> just normally some hidden text in most all HTML?
> 
> Maybe there are 2-3 really common hidden things in HTML, and after excepting
> them, the results would improve?
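
For readers unfamiliar with the numbers quoted above: assuming the columns
follow the usual hit-frequencies layout (overall %, spam %, ham %, S/O, rank,
score, rule name), the S/O figure can be reproduced from the spam and ham hit
rates. The sketch below is only an illustration; the function name is made up
here, and the formula is assumed to be spam% / (spam% + ham%).

```python
def s_over_o(spam_pct: float, ham_pct: float) -> float:
    """Share of a rule's hits that fall on spam, given per-corpus hit rates.

    Assumed formula: spam% / (spam% + ham%). A value near 1.0 means the rule
    fires almost only on spam; a value below 0.5 means it fires on a larger
    share of ham than of spam.
    """
    total = spam_pct + ham_pct
    if total == 0:
        raise ValueError("rule never hit, so S/O is undefined")
    return spam_pct / total


# Hit rates quoted for T_HTML_INVIS_TEXT: 4.0333% of spam, 5.2863% of ham.
ratio = s_over_o(4.0333, 5.2863)
print(round(ratio, 3))  # 0.433
```

A ratio below 0.5 is what makes the rule poor as a spam detector: it hits a
larger fraction of the ham corpus than of the spam corpus.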

I haven't looked into this much, but there were ham samples with tracking
information at the bottom, usually in a size 1 font: mail from Monster,
Sierra Games (now Vivendi Universal), BBC, CNET, etc.
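
To make the "size 1 font" case concrete, here is a minimal sketch of how such
text could be pulled out of an HTML body for inspection. It is not how
SpamAssassin's HTML parser works; the class and function names are
hypothetical, and it only looks at <font size="1">, ignoring CSS and the
other ways text can be hidden.

```python
from html.parser import HTMLParser


class TinyFontText(HTMLParser):
    """Collect text whose innermost enclosing <font> has size="1"."""

    def __init__(self):
        super().__init__()
        self.stack = []  # one bool per open <font>: True if size is "1"
        self.found = []  # text fragments seen inside size-1 fonts

    def handle_starttag(self, tag, attrs):
        if tag == "font":
            size = dict(attrs).get("size")
            self.stack.append(size is not None and size.strip() == "1")

    def handle_endtag(self, tag):
        if tag == "font" and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        # Only the innermost open <font> decides whether the text is tiny.
        if self.stack and self.stack[-1] and data.strip():
            self.found.append(data.strip())


def tiny_font_text(html: str) -> list:
    parser = TinyFontText()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    return parser.found


sample = '<p>Your weekly jobs</p><font size="1">msgid=12345</font>'
print(tiny_font_text(sample))  # ['msgid=12345']
```

Running something like this over the ham hits would be one way to see which
few kinds of hidden text are common enough to be worth excepting.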





------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
