http://bugzilla.spamassassin.org/show_bug.cgi?id=3427
------- Additional Comments From [EMAIL PROTECTED] 2004-09-28 15:57 -------
Subject: Re: [review] get_uri_list generates excess uris from this message
On Tue, Sep 28, 2004 at 05:57:41PM -0400, Theo Van Dinter wrote:
> The href version does not work in my testing at all, the MUA considers it a
> relative link and that fails.
This comes down to a combination of: what MUA, and what browser. Most of the
MUAs will happily make text/plain "www..." and "http:..." into links, but then
it's up to the browser to figure out what to do from there. Firefox is the
most permissive, IE generally doesn't work on malformed ones, and Safari is
50/50.
In text/html mode, the "www..." ones are not links and only the href URIs are
followed. None of the browsers could follow href="www..." as it was
considered a relative URL, but they all at least tried the "http:..." ones.
Same general results as above.
Per my previous comments, I think 2381 fixes most of this, but it also strips
out the valid schemeless URLs (#anchor). I'd like to see this section simply
removed:
    if ($nuri !~ /^[a-z0-9_-]+:/i) { # no scheme?
      $nuri = "http://".$nuri; # assume HTTP, as a web browser would
    [...]
That will stop us from mangling "#anchor" into "http://#anchor", while
preserving the fact that "#anchor" was included as a link in the message.
We want get_uri_list() to do three things: provide every actual link in the
message, fix malformed URIs (http:www.spamassassin.org ->
http://www.spamassassin.org, etc.), and find things that MUAs typically
treat as links (www.spamassassin.org).
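To make that behaviour concrete, here is a rough sketch of the normalization I have in mind. It is not the actual get_uri_list() code: normalize_uri() is a made-up helper name, and adding "http://" to bare "www." hosts is my assumption, based on what the MUAs do with them in text/plain.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration only; not part of SpamAssassin.
sub normalize_uri {
    my ($uri) = @_;

    # Repair "http:host" and "http:/host" into "http://host".
    # A well-formed "http://host" passes through unchanged.
    $uri =~ s{^(https?):/{0,2}(?=[^/])}{${1}://}i;

    # Assumption: a bare "www." host is treated as HTTP, since most MUAs
    # auto-link it in text/plain.
    $uri = "http://$uri" if $uri =~ /^www\./i;

    # Anything else without a scheme ("#anchor", "spamassassin.org") is
    # returned as-is instead of being forced into "http://...".
    return $uri;
}

for my $u ('#anchor', 'http:www.spamassassin.org',
           'http:/spamassassin.org', 'www.spamassassin.org') {
    print "$u -> ", normalize_uri($u), "\n";
}
```

The point of the design is that the repair is only applied where a scheme or a "www." prefix is actually present, so "#anchor" survives as a schemeless link.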
Here are the details:
Y at front indicates "Yes, it's a link"
N at front indicates "Yes, it's a link, but it doesn't work"
blank means it's not specially considered a link at all
At the end, I specify which browser was brought up, if any.
Outlook Web Access (through Exchange via Safari on Mac OS X)
  text/plain
      spamassassin.org
      www.spamassassin.org
    Y http:spamassassin.org
    Y http:www.spamassassin.org
    N http:/spamassassin.org (safari turns into http://localhost/spamassassin.org)
    N http:/www.spamassassin.org (safari turns into http://localhost/www.spamassassin.org)
    Y http://spamassassin.org
    Y http://www.spamassassin.org
  text/html
    N href-spamassassin.org (relative)
      spamassassin.org
    N href-www.spamassassin.org (relative)
      www.spamassassin.org
    Y href-http:spamassassin.org
      http:spamassassin.org
    Y href-http:www.spamassassin.org
      http:www.spamassassin.org
    N href-http:/spamassassin.org (see above)
      http:/spamassassin.org
    N href-http:/www.spamassassin.org (see above)
      http:/www.spamassassin.org
    Y href-http://spamassassin.org
      http://spamassassin.org
    Y href-http://www.spamassassin.org
      http://www.spamassassin.org
Outlook Web Access (through Exchange via IE on Windows XP)
  text/plain
      spamassassin.org
      www.spamassassin.org
    N http:spamassassin.org (ie can't handle it)
    N http:www.spamassassin.org (ditto)
    N http:/spamassassin.org (ditto)
    N http:/www.spamassassin.org (ditto)
    Y http://spamassassin.org
    Y http://www.spamassassin.org
  text/html
    N href-spamassassin.org (relative)
      spamassassin.org
    N href-www.spamassassin.org (relative)
      www.spamassassin.org
    N href-http:spamassassin.org (see above)
      http:spamassassin.org
    N href-http:www.spamassassin.org (see above)
      http:www.spamassassin.org
    N href-http:/spamassassin.org (see above)
      http:/spamassassin.org
    N href-http:/www.spamassassin.org (see above)
      http:/www.spamassassin.org
    Y href-http://spamassassin.org
      http://spamassassin.org
    Y href-http://www.spamassassin.org
      http://www.spamassassin.org
Outlook Web Access (through Exchange via Firefox on Windows XP)
  text/plain
      spamassassin.org
      www.spamassassin.org
    Y http:spamassassin.org (ff rewrites to http://...)
    Y http:www.spamassassin.org (ff rewrites to http://...)
    Y http:/spamassassin.org (ff rewrites to http://...)
    Y http:/www.spamassassin.org (ff rewrites to http://...)
    Y http://spamassassin.org
    Y http://www.spamassassin.org
  text/html
    N href-spamassassin.org
      spamassassin.org
    N href-www.spamassassin.org
      www.spamassassin.org
    Y href-http:spamassassin.org (see above)
      http:spamassassin.org
    Y href-http:www.spamassassin.org (see above)
      http:www.spamassassin.org
    Y href-http:/spamassassin.org (see above)
      http:/spamassassin.org
    Y href-http:/www.spamassassin.org (see above)
      http:/www.spamassassin.org
    Y href-http://spamassassin.org
      http://spamassassin.org
    Y href-http://www.spamassassin.org
      http://www.spamassassin.org
Apple Mail (Safari is default browser)
  text/plain
      spamassassin.org
    Y www.spamassassin.org (safari)
    Y http:spamassassin.org (safari)
    Y http:www.spamassassin.org (safari)
    N http:/spamassassin.org (see above)
    N http:/www.spamassassin.org (see above)
    Y http://spamassassin.org (safari)
    Y http://www.spamassassin.org (safari)
  text/html
    N href-spamassassin.org (relative)
      spamassassin.org
    N href-www.spamassassin.org (relative)
      www.spamassassin.org
    Y href-http:spamassassin.org (safari, firefox)
      http:spamassassin.org
    Y href-http:www.spamassassin.org (safari, firefox)
      http:www.spamassassin.org
    N href-http:/spamassassin.org (safari doesn't work, firefox does)
      http:/spamassassin.org
    N href-http:/www.spamassassin.org (safari doesn't work, firefox does)
      http:/www.spamassassin.org
    Y href-http://spamassassin.org (safari, firefox)
      http://spamassassin.org
    Y href-http://www.spamassassin.org (safari, firefox)
      http://www.spamassassin.org
Outlook Express 6 (Windows XP, Firefox is default browser)
  text/plain
      spamassassin.org
    Y www.spamassassin.org (firefox)
    Y http:spamassassin.org (firefox)
    Y http:www.spamassassin.org (firefox)
    Y http:/spamassassin.org (firefox)
    Y http:/www.spamassassin.org (firefox)
    Y http://spamassassin.org (firefox)
    Y http://www.spamassassin.org (firefox)
  text/html (OE ignores HTML markup by default)
      href-spamassassin.org
      spamassassin.org
      href-www.spamassassin.org
    Y www.spamassassin.org (firefox)
      href-http:spamassassin.org
    Y http:spamassassin.org (firefox)
      href-http:www.spamassassin.org
    Y http:www.spamassassin.org (firefox)
      href-http:/spamassassin.org
    Y http:/spamassassin.org (firefox)
      href-http:/www.spamassassin.org
    Y http:/www.spamassassin.org (firefox)
      href-http://spamassassin.org
    Y http://spamassassin.org (firefox)
      href-http://www.spamassassin.org
    Y http://www.spamassassin.org (firefox)
  text/html (forcing OE to interpret HTML)
    N href-spamassassin.org (ie)
      spamassassin.org
    N href-www.spamassassin.org (ie)
      www.spamassassin.org
    N href-http:spamassassin.org (ie)
      http:spamassassin.org
    N href-http:www.spamassassin.org (ie)
      http:www.spamassassin.org
    N href-http:/spamassassin.org (ie)
      http:/spamassassin.org
    N href-http:/www.spamassassin.org (ie)
      http:/www.spamassassin.org
    Y href-http://spamassassin.org (firefox)
      http://spamassassin.org
    Y href-http://www.spamassassin.org (firefox)
      http://www.spamassassin.org
I don't know why OE launches IE for some links and Firefox for others in this
last case. It's very strange, but my theory is that OE doesn't quite recognize
the malformed ones as HTTP links, so it passes them to IE to figure out what
to do.