http://bugzilla.spamassassin.org/show_bug.cgi?id=1987





------- Additional Comments From [EMAIL PROTECTED]  2004-02-05 14:44 -------
> Rather than reinventing the wheel, how about using the Tidy project to check
> the HTML portion of an e-mail. I know that the project does have a perl
> module in addition to a library and executable.

I'm not sure that would work well, both for performance reasons and because
we're not looking for tidiness but spamminess, and in the world of HTML email
those are somewhat different.

However, I suspect we could get some kick-ass rules if we (or someone; this is
more of a development experiment) ran Tidy on both HTML spam and HTML ham and
compared the relative frequency of each type of warning.  Any warning with a
good spam/overall ratio and a fair hit rate could be turned into a new
high-powered rule.  I've already written some checks of my own, but Tidy
probably has more.
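
A rough sketch of the comparison step, in Python purely for illustration.  It
assumes the Tidy warnings have already been collected as one list of warning
strings per message; how Tidy gets run is left out.  The function names, the
thresholds, and the definition of spam/overall used here (spam hit rate
divided by spam hit rate plus ham hit rate) are assumptions for the sketch,
not anything taken from existing tooling.

```python
from collections import Counter


def warning_stats(spam_msgs, ham_msgs):
    """Compare how often each warning type shows up in spam versus ham.

    spam_msgs and ham_msgs are lists with one entry per message; each entry
    is an iterable of warning strings (assumed to come from a Tidy run).
    Returns {warning: (spam_rate, ham_rate, spam_overall)}.
    """
    # Count each warning at most once per message.
    spam_hits = Counter(w for msg in spam_msgs for w in set(msg))
    ham_hits = Counter(w for msg in ham_msgs for w in set(msg))

    stats = {}
    for warning in set(spam_hits) | set(ham_hits):
        spam_rate = spam_hits[warning] / len(spam_msgs) if spam_msgs else 0.0
        ham_rate = ham_hits[warning] / len(ham_msgs) if ham_msgs else 0.0
        # Assumed definition: share of the combined hit rate that is spam.
        spam_overall = spam_rate / (spam_rate + ham_rate)
        stats[warning] = (spam_rate, ham_rate, spam_overall)
    return stats


def candidate_rules(stats, min_so=0.95, min_spam_rate=0.01):
    """Return warnings with a good spam/overall ratio and a fair hit rate.

    The default thresholds are placeholders, not tuned values.
    """
    picked = [
        warning
        for warning, (spam_rate, _ham_rate, so) in stats.items()
        if so >= min_so and spam_rate >= min_spam_rate
    ]
    # Most frequently hit warnings first.
    return sorted(picked, key=lambda w: stats[w][0], reverse=True)
```

Each warning that survives candidate_rules() would still need a mass-check
against a real corpus before it could become a rule.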

Anyone interested in working on this experiment?




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
