http://bugzilla.spamassassin.org/show_bug.cgi?id=3891
Summary: Enhancement: do not pass urls from null <a> references
to SpamcopUri
Product: Spamassassin
Version: SVN Trunk (Latest Devel Version)
Platform: Other
OS/Version: other
Status: NEW
Severity: enhancement
Priority: P3
Component: Rules (Eval Tests)
AssignedTo: [email protected]
ReportedBy: [EMAIL PROTECTED]
Observing a recent spam:
<A href="http://www.ethylene.org"></A>ail re<A
href="http://www.divest.org"></A>mov<A
href="http://www.tailor.org"></A>a<A
href="http://www.english.org"></A>l, g<A
href="http://www.stimuli.org"></A>o <A
href="http://www.riterates.net/book.php">here.</A></FONT></P>
Observe that all the above urls except the last are useless, because the A tag
closes with no content. Only the last URL is real and is the spam domain.
Probably the leading urls are fakes, also.
There are obviously a number of ways to code a null <a> tag, and the above is
only the simplest. I would suggest that the HTML renderer learn about A tag
visible content (if it doesn't know that already) and be tied to the URL
stripper, so that a URL can be tagged as completely useless if the A tag
content is completely null (not just spaces) after rendering. (Or
alternatively, if the rendered content is smaller than NxM pixels, where
N,M < {small number}.)
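As a rough illustration of the tagging idea (not SpamAssassin code), here is a
minimal Python sketch using the standard html.parser module. The names
NullAnchorTagger and tag_urls are made up for this example. It assumes the
stricter reading of "completely null": an anchor containing only spaces is not
flagged, and an <img> inside the anchor counts as content. It does no real
rendering, so the NxM pixel variant is not covered.

```python
from html.parser import HTMLParser


class NullAnchorTagger(HTMLParser):
    """Collect (href, useless) pairs; useless means the <a> had no content.

    Hypothetical helper for illustration only, not part of SpamAssassin.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.urls = []             # list of (href, useless) tuples
        self._href = None          # href of the <a> currently open, if any
        self._has_content = False  # any text or <img> seen inside that <a>?

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._close_anchor()   # tolerate a previous <a> left unclosed
            self._href = dict(attrs).get("href")
            self._has_content = False
        elif tag == "img" and self._href is not None:
            self._has_content = True

    def handle_data(self, data):
        # Assumption: any character data, even whitespace, counts as content.
        if self._href is not None and data:
            self._has_content = True

    def handle_endtag(self, tag):
        if tag == "a":
            self._close_anchor()

    def close(self):
        super().close()
        self._close_anchor()       # flush an <a> still open at end of input

    def _close_anchor(self):
        if self._href is not None:
            self.urls.append((self._href, not self._has_content))
        self._href = None


def tag_urls(html):
    """Return a list of (url, useless) pairs for every <a href=...> in html."""
    parser = NullAnchorTagger()
    parser.feed(html)
    parser.close()
    return parser.urls
```

Run over the spam fragment above, this should flag the five leading .org URLs
as useless and leave the riterates.net link unflagged, since only that anchor
wraps any text.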
These tagged URLs can then be deleted by the Spamcop plugin (or anyone else
that cares) rather than being sent out to steal bandwidth and load DNS servers
with useless bogus URLs.
Note I am NOT suggesting these null URLs be dropped completely! There can be
good reason to know about them both in normal rules and in plugins. I'm only
suggesting the addition of a "useless url" tag to each appropriate URL, and an
addition to the Spamcop plugin to drop such urls rather than processing them.
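To make the "tag, don't drop" distinction concrete, a small Python sketch
follows. The function name and the (url, useless) pair format are assumptions
for illustration; the real plugin interface would look different.

```python
def split_urls_for_lookup(tagged_urls):
    """Split tagged URLs into those worth a lookup and the full list.

    tagged_urls is a list of (url, useless) pairs (hypothetical format).
    Nothing is discarded: all_urls still holds every URL for ordinary rules,
    while to_query omits the ones tagged useless.
    """
    to_query = [url for url, useless in tagged_urls if not useless]
    all_urls = [url for url, _ in tagged_urls]
    return to_query, all_urls
```

A lookup plugin would only send to_query out to DNS, while normal rules and
other plugins could still inspect all_urls.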
Ideally there would be some easy well-defined way to get to the 'useless url'
flag in conjunction with a url, even in normal rule processing. Perhaps some
different kind of rule tag to specify defunct or live urls for the rule, as
opposed to the current 'uri', which would pick up all urls, as it currently
does. Or, if you don't care for new rule bases, some kind of modifier on uri.
Or some such.