http://bugzilla.spamassassin.org/show_bug.cgi?id=3140
Summary: HTML renderer should "remember" certain attributes of
rendered text
Product: Spamassassin
Version: unspecified
Platform: Other
OS/Version: other
Status: NEW
Severity: enhancement
Priority: P3
Component: Rules (Eval Tests)
AssignedTo: [EMAIL PROTECTED]
ReportedBy: [EMAIL PROTECTED]
Currently there are a number of effective RE rules, run over either rawbody or
sometimes 'full', that look for certain HTML attributes. One of the most common
targets is very small fonts, such as 0 or 1 point/pixel. Writing such a test so
that it works correctly in all cases using an RE is almost impossible, due to
the many ways that a font size can be specified. Even one that works in most
cases will be moderately complex, and will take some amount of time to scan the
body.
The HTML renderer presumably parses tags correctly, and probably already knows
whether a small font has been used in the message. It probably doesn't care,
since it is rendering to text, but at some level it has the information. If it
recorded this information somewhere rules can reach it, a simple eval test
could check the value to determine the presence of small fonts, and assign an
appropriate score.
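
To illustrate the idea, here is a minimal sketch in Python rather than in
SpamAssassin's own code. It assumes a renderer built on the standard library's
html.parser; the names RememberingRenderer, min_font_size and check_tiny_font
are hypothetical, and only CSS font-size values in px or pt are recognised. The
renderer still produces text, but also remembers the smallest font size it
saw, so the eval test becomes a single comparison.

```python
import re
from html.parser import HTMLParser

# Matches CSS declarations such as "font-size: 1px" or "FONT-SIZE:0pt".
FONT_SIZE_RE = re.compile(r"font-size\s*:\s*(\d*\.?\d+)\s*(?:px|pt)", re.IGNORECASE)


class RememberingRenderer(HTMLParser):
    """Hypothetical renderer: converts HTML to text and remembers font sizes."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        # Smallest CSS font size (px or pt) seen so far, or None if none was seen.
        self.min_font_size = None

    def handle_starttag(self, tag, attrs):
        # The parser has already split out the attributes, so no RE over the
        # raw message body is needed to find the style attribute.
        style = dict(attrs).get("style") or ""
        for match in FONT_SIZE_RE.finditer(style):
            size = float(match.group(1))
            if self.min_font_size is None or size < self.min_font_size:
                self.min_font_size = size

    def handle_data(self, data):
        self.text_parts.append(data)

    def rendered_text(self):
        return "".join(self.text_parts)


def check_tiny_font(renderer, limit=1.0):
    """Hypothetical eval test: true if a font size at or below `limit` was seen."""
    return renderer.min_font_size is not None and renderer.min_font_size <= limit
```

A caller would feed the message body to the renderer once, then run
check_tiny_font (and any similar tests) against the stored value instead of
rescanning the body.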
There are doubtless other HTML attributes currently being detected with RE
rules that the HTML renderer has already found and ignored. Bogus tags and
ending tags might be a couple of possibilities. If the renderer left flags for
these things lying about, it would reduce the number of REs that have to be run
on the body of the message, and thus increase efficiency.
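
The same approach could cover tag-level flags. The sketch below is again
Python using the standard html.parser, not SpamAssassin code. FlaggingRenderer,
KNOWN_TAGS, VOID_TAGS and the flag names are hypothetical, the tag list is
deliberately tiny, and "ending tags" is assumed here to mean end tags with no
matching open tag. Text rendering is omitted to keep the bookkeeping visible.

```python
from html.parser import HTMLParser

# Illustrative subset only; a real renderer would know the full tag list.
KNOWN_TAGS = frozenset({
    "html", "head", "title", "body", "p", "br", "a", "b", "i",
    "font", "span", "div", "table", "tr", "td", "img",
})
# Tags that never take an end tag.
VOID_TAGS = frozenset({"br", "img"})


class FlaggingRenderer(HTMLParser):
    """Hypothetical parser that counts bogus tags and stray end tags."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of currently open known tags
        self.flags = {"bogus_tags": 0, "stray_end_tags": 0}

    def handle_starttag(self, tag, attrs):
        if tag not in KNOWN_TAGS:
            self.flags["bogus_tags"] += 1
        elif tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Unknown names were already counted when opened; void tags such as
        # <br/> reach this method via the default handle_startendtag.
        if tag not in KNOWN_TAGS or tag in VOID_TAGS:
            return
        if tag in self.open_tags:
            # Close everything up to and including the matching open tag.
            while self.open_tags.pop() != tag:
                pass
        else:
            self.flags["stray_end_tags"] += 1
```

An eval test would then read a counter from flags after one parse, rather than
running a separate RE over the body for each condition.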