http://bugzilla.spamassassin.org/show_bug.cgi?id=3891
           Summary: Enhancement: do not pass urls from null <a> references to SpamcopUri
           Product: Spamassassin
           Version: SVN Trunk (Latest Devel Version)
          Platform: Other
        OS/Version: other
            Status: NEW
          Severity: enhancement
          Priority: P3
         Component: Rules (Eval Tests)
        AssignedTo: dev@spamassassin.apache.org
        ReportedBy: [EMAIL PROTECTED]

Observing a recent spam:

<A href="http://www.ethylene.org"></A>ail re<A href="http://www.divest.org"></A>mov<A href="http://www.tailor.org"></A>a<A href="http://www.english.org"></A>l, g<A href="http://www.stimuli.org"></A>o <A href="http://www.riterates.net/book.php">here.</A></FONT></P>

Observe that all the above urls except the last are useless, because the A tag closes with no content. Only the last URL is real and is the spam domain. Probably the leading urls are fakes, also.

There are obviously a number of ways to code a null <a> tag, and the above is only the simplest. I would suggest that the html renderer learn about A tag visible content (if it doesn't know that already), and be tied to the URL stripper, in such a way that a URL can be tagged as completely useless if the A tag content is completely null (not just spaces) after rendering. (Or alternately if the rendered content is less than NxM pixels, where N,M < {small number}).

These tagged URLs can then be deleted by the Spamcop plugin (or anyone else that cares) rather than being sent out to steal bandwidth and load DNS servers with useless bogus URLs.

Note I am NOT suggesting these null URLs be dropped completely! There can be good reason to know about them both in normal rules and in plugins. I'm only suggesting the addition of a "useless url" tag to each appropriate URL, and an addition to the Spamcop plugin to drop such urls rather than processing them.

Ideally there would be some easy well-defined way to get to the 'useless url' flag in conjunction with a url, even in normal rule processing.
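
As a rough illustration of the 'useless url' flag suggested above, here is a small Python sketch that pairs each href with a flag saying whether its A element had any content at all. This is not SpamAssassin code (which is Perl), and the names AnchorUrlTagger, tag_urls and live_urls are made up for this example. It only checks for completely empty content; it does not render anything, so the NxM pixel idea is not covered.

```python
from html.parser import HTMLParser


class AnchorUrlTagger(HTMLParser):
    """Collect (href, useless) pairs; 'useless' means the <a> had no content."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.results = []    # (href, useless) pairs, in closing order
        self.open_tags = []  # stack of [href_or_None, saw_content]

    def _mark_content(self):
        # Any content marks every currently open anchor as non-null.
        for entry in self.open_tags:
            entry[1] = True

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # Anchors without href are tracked too, so </a> pairs up correctly.
            self.open_tags.append([dict(attrs).get("href"), False])
        elif tag == "img":
            # Assumption: an image inside a link counts as visible content.
            self._mark_content()

    def handle_data(self, data):
        # Per the report, spaces alone do NOT make an anchor null,
        # so any text at all (even whitespace) counts as content.
        self._mark_content()

    def handle_endtag(self, tag):
        if tag == "a" and self.open_tags:
            href, saw_content = self.open_tags.pop()
            if href:
                self.results.append((href, not saw_content))


def tag_urls(html):
    """Return a list of (href, useless) pairs for the anchors in html."""
    parser = AnchorUrlTagger()
    parser.feed(html)
    parser.close()
    # Flush anchors that were never closed.
    for href, saw_content in parser.open_tags:
        if href:
            parser.results.append((href, not saw_content))
    return parser.results


def live_urls(html):
    """Return only the hrefs a URL-checking plugin would still look up."""
    return [href for href, useless in tag_urls(html) if not useless]
```

With something like this, a plugin in the position of the Spamcop one could call live_urls() to skip the null anchors, while ordinary rules could still see every URL together with its flag through tag_urls().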

Perhaps a different kind of rule tag could specify defunct or live urls for the rule, as opposed to the current 'uri', which would keep picking up all urls as it does now. Or, if you don't care for new rule bases, some kind of modifier on uri. Or some such.

------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.