http://bugzilla.spamassassin.org/show_bug.cgi?id=3427





------- Additional Comments From [EMAIL PROTECTED]  2004-09-28 15:57 -------
Subject: Re:  [review] get_uri_list generates excess uris from this message

On Tue, Sep 28, 2004 at 05:57:41PM -0400, Theo Van Dinter wrote:
> The href version does not work in my testing at all, the MUA considers it a
> relative link and that fails.

This comes down to a combination of which MUA and which browser is in use.
Most MUAs will happily turn text/plain "www..." and "http:..." into links, but
then it's up to the browser to figure out what to do from there.  Firefox is
the most permissive, IE generally doesn't work on malformed ones, and Safari
is 50/50.

In text/html mode, the "www..." ones are not links and only the href URIs are
followed.  None of the browsers could follow href="www..." as it was
considered a relative URL, but they all at least tried the "http:..." ones.
Same general results as above.
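
To illustrate why the schemeless and single-slash hrefs end up where they do,
here is a small sketch that resolves them as relative references.  It is
written in Python rather than SpamAssassin's Perl, purely to keep it
self-contained.  The base URL is a made-up stand-in for whatever page the
webmail client renders the message on, and urllib.parse.urljoin is only an
approximation of what any particular browser does.

```python
from urllib.parse import urljoin

# Hypothetical base URL: the page a webmail client might render the message on.
# This is an assumption for illustration, not something taken from the tests.
BASE = "http://localhost/exchange/inbox/message.html"

# href values of the same shapes as the ones in the test message.
HREFS = [
    "www.spamassassin.org",         # no scheme: treated as a relative path
    "http:www.spamassassin.org",    # scheme, no slashes
    "http:/www.spamassassin.org",   # scheme, one slash: absolute path, no host
    "http://www.spamassassin.org",  # well-formed absolute URL
    "#anchor",                      # fragment-only reference
]


def resolve(href, base=BASE):
    """Resolve href against base the way a relative reference would be."""
    return urljoin(base, href)


if __name__ == "__main__":
    for href in HREFS:
        print(f"{href:30} -> {resolve(href)}")
```

With this base, the schemeless form resolves to a path under the message's own
directory, and the single-slash form resolves to a path on the base host, which
is consistent with the http://localhost/... result seen in Safari.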

Per my previous comments, I think 2381 fixes most of this, but it also strips
out the valid schemeless URLs (#anchor).  I'd like to see the:

if ($nuri !~ /^[a-z0-9_-]+:/i) {    # no scheme?
  $nuri = "http://".$nuri;          # assume HTTP, as a web browser would
[...]

section just removed.  This will stop us malforming "#anchor" into
"http://#anchor", while preserving the fact that "#anchor" was included as a
link in the message.  We want get_uri_list() to provide every actual link in
the message, to fix malformations (http:www.spamassassin.org ->
http://www.spamassassin.org, etc.), and to find things that MUAs
typically consider a link (www.spamassassin.org).
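
A rough sketch of that behaviour, in Python rather than the Perl that
get_uri_list() is actually written in, might look like the following.  The
function name uri_variants and its exact rules are assumptions for
illustration only; this is not the patch under review.

```python
import re


def uri_variants(uri):
    """Return the URI as found, plus a repaired form when one applies.

    Hypothetical helper: keeps every link exactly as it appeared, and adds a
    second entry only for the malformations discussed above.
    """
    variants = [uri]

    # http:host, http:/host and http://host all become http://host.
    m = re.match(r'^(https?):/*(.+)$', uri, re.IGNORECASE)
    if m:
        fixed = f"{m.group(1)}://{m.group(2)}"
    elif re.match(r'^www\.', uri, re.IGNORECASE):
        # MUAs typically linkify a bare "www." host, so assume HTTP for it.
        fixed = "http://" + uri
    else:
        # Anything else (for example "#anchor") is left alone, so no scheme
        # is ever forced onto it.
        fixed = uri

    if fixed != uri:
        variants.append(fixed)
    return variants


if __name__ == "__main__":
    for u in ["#anchor", "http:www.spamassassin.org",
              "www.spamassassin.org", "http://www.spamassassin.org"]:
        print(u, "->", uri_variants(u))
```

The point of returning a list is that the original string is never lost:
"#anchor" stays "#anchor", while the malformed forms gain a repaired
companion rather than being replaced.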


Here are the details:

Y at front indicates "Yes, it's a link"
N at front indicates "Yes, it's a link, but it doesn't work"
blank means it's not specially considered a link at all

At the end, I specify which browser was brought up, if any.

Outlook Web Access (through Exchange via Safari on Mac OS X)
text/plain
  spamassassin.org
  www.spamassassin.org
Y http:spamassassin.org
Y http:www.spamassassin.org
N http:/spamassassin.org                (safari turns into http://localhost/spamassassin.org)
N http:/www.spamassassin.org            (safari turns into http://localhost/www.spamassassin.org)
Y http://spamassassin.org
Y http://www.spamassassin.org

text/html
N href-spamassassin.org                 (relative)
  spamassassin.org
N href-www.spamassassin.org             (relative)
  www.spamassassin.org
Y href-http:spamassassin.org
  http:spamassassin.org
Y href-http:www.spamassassin.org
  http:www.spamassassin.org
N href-http:/spamassassin.org           (see above)
  http:/spamassassin.org
N href-http:/www.spamassassin.org       (see above)
  http:/www.spamassassin.org
Y href-http://spamassassin.org
  http://spamassassin.org
Y href-http://www.spamassassin.org
  http://www.spamassassin.org

Outlook Web Access (through Exchange via IE on Windows XP)
text/plain
  spamassassin.org
  www.spamassassin.org
N http:spamassassin.org                 (ie can't handle it)
N http:www.spamassassin.org             (ditto)
N http:/spamassassin.org                (ditto)
N http:/www.spamassassin.org            (ditto)
Y http://spamassassin.org
Y http://www.spamassassin.org

text/html
N href-spamassassin.org                 (relative)
  spamassassin.org
N href-www.spamassassin.org             (relative)
  www.spamassassin.org
N href-http:spamassassin.org            (see above)
  http:spamassassin.org
N href-http:www.spamassassin.org        (see above)
  http:www.spamassassin.org
N href-http:/spamassassin.org           (see above)
  http:/spamassassin.org
N href-http:/www.spamassassin.org       (see above)
  http:/www.spamassassin.org
Y href-http://spamassassin.org
  http://spamassassin.org
Y href-http://www.spamassassin.org
  http://www.spamassassin.org

Outlook Web Access (through Exchange via Firefox on Windows XP)
text/plain
  spamassassin.org
  www.spamassassin.org
Y http:spamassassin.org         (ff rewrites to http://...)
Y http:www.spamassassin.org     (ff rewrites to http://...)
Y http:/spamassassin.org        (ff rewrites to http://...)
Y http:/www.spamassassin.org    (ff rewrites to http://...)
Y http://spamassassin.org
Y http://www.spamassassin.org

text/html
N href-spamassassin.org
  spamassassin.org
N href-www.spamassassin.org
  www.spamassassin.org
Y href-http:spamassassin.org            (see above)
  http:spamassassin.org
Y href-http:www.spamassassin.org        (see above)
  http:www.spamassassin.org
Y href-http:/spamassassin.org           (see above)
  http:/spamassassin.org
Y href-http:/www.spamassassin.org       (see above)
  http:/www.spamassassin.org
Y href-http://spamassassin.org
  http://spamassassin.org
Y href-http://www.spamassassin.org
  http://www.spamassassin.org

Apple Mail (Safari is default browser)
text/plain
  spamassassin.org
Y www.spamassassin.org                  (safari)
Y http:spamassassin.org                 (safari)
Y http:www.spamassassin.org             (safari)
N http:/spamassassin.org                (see above)
N http:/www.spamassassin.org            (see above)
Y http://spamassassin.org               (safari)
Y http://www.spamassassin.org           (safari)

text/html
N href-spamassassin.org                 (relative)
  spamassassin.org
N href-www.spamassassin.org             (relative)
  www.spamassassin.org
Y href-http:spamassassin.org            (safari, firefox)
  http:spamassassin.org
Y href-http:www.spamassassin.org        (safari, firefox)
  http:www.spamassassin.org
N href-http:/spamassassin.org           (safari doesn't work, firefox does)
  http:/spamassassin.org
N href-http:/www.spamassassin.org       (safari doesn't work, firefox does)
  http:/www.spamassassin.org
Y href-http://spamassassin.org          (safari, firefox)
  http://spamassassin.org
Y href-http://www.spamassassin.org      (safari, firefox)
  http://www.spamassassin.org

Outlook Express 6 (Windows XP, Firefox is default browser)
text/plain
  spamassassin.org
Y www.spamassassin.org                  (firefox)
Y http:spamassassin.org                 (firefox)
Y http:www.spamassassin.org             (firefox)
Y http:/spamassassin.org                (firefox)
Y http:/www.spamassassin.org            (firefox)
Y http://spamassassin.org               (firefox)
Y http://www.spamassassin.org           (firefox)

text/html (OE ignores HTML markup by default)
  href-spamassassin.org
  spamassassin.org
  href-www.spamassassin.org
Y www.spamassassin.org                  (firefox)
  href-http:spamassassin.org
Y http:spamassassin.org                 (firefox)
  href-http:www.spamassassin.org
Y http:www.spamassassin.org             (firefox)
  href-http:/spamassassin.org
Y http:/spamassassin.org                (firefox)
  href-http:/www.spamassassin.org
Y http:/www.spamassassin.org            (firefox)
  href-http://spamassassin.org
Y http://spamassassin.org               (firefox)
  href-http://www.spamassassin.org
Y http://www.spamassassin.org           (firefox)

text/html (forcing OE to interpret HTML)
N href-spamassassin.org                 (ie)
  spamassassin.org
N href-www.spamassassin.org             (ie)
  www.spamassassin.org
N href-http:spamassassin.org            (ie)
  http:spamassassin.org
N href-http:www.spamassassin.org        (ie)
  http:www.spamassassin.org
N href-http:/spamassassin.org           (ie)
  http:/spamassassin.org
N href-http:/www.spamassassin.org       (ie)
  http:/www.spamassassin.org
Y href-http://spamassassin.org          (firefox)
  http://spamassassin.org
Y href-http://www.spamassassin.org      (firefox)
  http://www.spamassassin.org

I don't know why OE launches IE for some links and Firefox for others in this
last case.  It's very strange, but my theory is that OE doesn't quite think
the malformed ones are HTTP links, so it passes them to IE to figure out what
to do.





