http://bugzilla.spamassassin.org/show_bug.cgi?id=3891

           Summary: Enhancement: do not pass urls from null <a> references
                    to SpamcopUri
           Product: Spamassassin
           Version: SVN Trunk (Latest Devel Version)
          Platform: Other
        OS/Version: other
            Status: NEW
          Severity: enhancement
          Priority: P3
         Component: Rules (Eval Tests)
        AssignedTo: dev@spamassassin.apache.org
        ReportedBy: [EMAIL PROTECTED]


Observing a recent spam:

<A href="http://www.ethylene.org";></A>ail re<A 
href="http://www.divest.org";></A>mov<A 
href="http://www.tailor.org";></A>a<A 
href="http://www.english.org";></A>l, g<A 
href="http://www.stimuli.org";></A>o <A 
href="http://www.riterates.net/book.php";>here.</A></FONT></P>

Observe that all of the above URLs except the last are useless, because each A tag 
closes with no content.  Only the last URL is real and points to the spam domain. 
The leading URLs are probably fakes as well.

There are obviously a number of ways to code a null <a> tag, and the above is 
only the simplest.  I would suggest that the HTML renderer learn about A tag 
visible content (if it doesn't know that already) and be tied to the URL 
stripper, so that a URL can be tagged as useless if the A tag's content is 
completely null (not just whitespace) after rendering.  (Or, alternately, if the 
rendered content is smaller than NxM pixels, where N,M < {small number}.)
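As a rough illustration of the first (simpler) criterion, here is a sketch in
Python using the standard-library HTML parser.  This is not SpamAssassin code
and `NullAnchorFinder` is a hypothetical name; it only shows the idea of
flagging a URL when its anchor's rendered text is empty:

```python
# Sketch only: flag URLs whose <a>...</a> content is empty after rendering.
from html.parser import HTMLParser

class NullAnchorFinder(HTMLParser):
    """Collect (url, is_useless) pairs; a URL is 'useless' when the
    anchor's visible text is empty after stripping whitespace."""
    def __init__(self):
        super().__init__()
        self.urls = []          # list of (url, is_useless)
        self._href = None       # href of the currently open <a>, if any
        self._text = []         # text fragments seen inside the current <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":          # HTMLParser lowercases tag names
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            useless = not "".join(self._text).strip()
            self.urls.append((self._href, useless))
            self._href = None

# Two anchors from the spam sample above: one null, one with real content.
spam = ('<A href="http://www.ethylene.org"></A>ail re'
        '<A href="http://www.riterates.net/book.php">here.</A>')
finder = NullAnchorFinder()
finder.feed(spam)
for url, useless in finder.urls:
    print(url, "USELESS" if useless else "live")
```

Run against the sample, this marks www.ethylene.org as useless and keeps
only the riterates.net URL as live, which is exactly the distinction the
Spamcop plugin would want.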

These tagged URLs can then be deleted by the Spamcop plugin (or anything else 
that cares) rather than being sent out to waste bandwidth and load DNS servers 
with useless bogus lookups.

Note I am NOT suggesting these null URLs be dropped completely!  There can be 
good reason to know about them, both in normal rules and in plugins.  I'm only 
suggesting the addition of a "useless url" tag to each appropriate URL, and an 
addition to the Spamcop plugin to drop such URLs rather than processing them.

Ideally there would be some easy, well-defined way to get at the "useless url" 
flag in conjunction with a URL, even in normal rule processing.  Perhaps a 
different kind of rule tag to specify defunct or live URLs for the rule, as 
opposed to the current 'uri', which picks up all URLs, as it currently does.  
Or, if you don't care for new rule types, some kind of modifier on 'uri'.  Or 
some such.


