[Bug 5780] URI processing turns uuencoded strings into http URI's which then causes FPs

bugzilla-daemon Thu, 17 Jan 2008 14:11:35 -0800

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5780

------- Additional Comments From [EMAIL PROTECTED]  2008-01-17 14:11 -------
(In reply to comment #5)
> (In reply to comment #4)
> > Perhaps the answer is to find a way to keep the extraction of host names
> > for RBL lookups liberal, but make rules that use the full URL be stricter.
> 
> FWIW, we have some of this by the uri_detail data.  At least, we can tell if
> the URL came from a HTML tag (and which one) versus being parsed out of the
> text/plain, or both.  It doesn't track being parsed w/ protocol versus
> "likely domain", etc.
>
> uri_detail is already used in the URIDNSBL plugin, for example, to give a
> different lookup priority to parsed domains versus HTML referenced domains.

+1 to making WEIRD_PORT stricter so that it doesn't FP in this case.

If any changes are made to the URI extraction algorithms, let's
ensure we come up with a set of test cases covering both what they should
and what they shouldn't extract; test-driven development is the way to do
this IMO.
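
To make the should/shouldn't idea concrete, here is a rough sketch of what
such a case table could look like. It is written in Python only for brevity;
extract_uris() is a hypothetical stand-in and not SpamAssassin's extraction
code, and the sample inputs (including the uuencode-style line) are made up
for illustration rather than taken from the bug report.

```python
import re

# Stand-in extractor for illustration only; this is NOT SpamAssassin's
# algorithm. It accepts explicit http/https URLs with a dotted host name.
_URL = re.compile(
    r"\bhttps?://[a-z0-9-]+(?:\.[a-z0-9-]+)+(?::\d+)?(?:/\S*)?",
    re.IGNORECASE,
)


def extract_uris(text):
    """Return every substring of text that the stand-in pattern matches."""
    return _URL.findall(text)


# Inputs paired with the exact list of URIs we expect back.
SHOULD_EXTRACT = [
    ("see http://example.com/page for details", ["http://example.com/page"]),
    ("http://example.com:8080/", ["http://example.com:8080/"]),
]

# Inputs from which nothing should be extracted (illustrative samples).
SHOULD_NOT_EXTRACT = [
    "begin 644 file.bin",
    "M9F]O8F%R+F-O;3HX,#@P+V9O;V)A<@IF;V]B87(N8V]M.C@P.#`O9F]O8F%R",
    "no links here",
]


def run_cases():
    """Run both tables and return a list of human-readable failures."""
    failures = []
    for text, expected in SHOULD_EXTRACT:
        got = extract_uris(text)
        if got != expected:
            failures.append("%r: expected %r, got %r" % (text, expected, got))
    for text in SHOULD_NOT_EXTRACT:
        got = extract_uris(text)
        if got:
            failures.append("%r: expected nothing, got %r" % (text, got))
    return failures


if __name__ == "__main__":
    for line in run_cases():
        print("FAIL", line)
```

The point of keeping two separate tables is that a fix for a false positive
adds a row to the second one, and any later loosening of the extractor has to
keep both tables passing.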



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
