Martin Panter added the comment:
The main backward compatibility consideration would be Issue 754016, but I don’t
agree with the changes made there, and would support reverting them. The original bug
reporter wanted urlparse("1.2.3.4:80", "http") to be treated as the URL
http://1.2.3.4:80, but the IP address was being parsed as a scheme, so the
default “http” scheme was ignored.
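
To illustrate the original complaint, here is a minimal sketch of a naive scheme split. It is not the stdlib code; the helper name naive_split_scheme and the SCHEME_CHARS set are assumptions made for this example. Because digits and dots are allowed in the characters checked, "1.2.3.4" passes as a scheme and the default is never applied.

```python
import string

# Characters accepted in a scheme by this sketch (letters, digits, "+", "-", ".").
SCHEME_CHARS = set(string.ascii_letters + string.digits + "+-.")

def naive_split_scheme(url, default_scheme=""):
    """Hypothetical helper: take everything before the first ':' as the scheme."""
    head, sep, rest = url.partition(":")
    if sep and head and all(c in SCHEME_CHARS for c in head):
        return head.lower(), rest
    # No usable "scheme:" prefix, so fall back to the caller's default.
    return default_scheme, url

print(naive_split_scheme("1.2.3.4:80", "http"))  # ('1.2.3.4', '80')
```
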
The original fix (r83701) affected any URL that had a digit 0–9 immediately
after the “scheme:” prefix. In such URLs, the scheme component was no longer
parsed. A test case for “path:80” was added, and a demonstration of not parsing
any scheme from www.cwi.nl:80/%7Eguido/Python.html was added in the
documentation.
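
The effect of that first fix can be sketched roughly as follows. This paraphrases the behaviour described above rather than reproducing r83701; the function name split_scheme_digit_after_colon is hypothetical.

```python
import string

SCHEME_CHARS = set(string.ascii_letters + string.digits + "+-.")

def split_scheme_digit_after_colon(url, default_scheme=""):
    """Hypothetical helper: refuse to parse a scheme if a digit follows 'scheme:'."""
    head, sep, rest = url.partition(":")
    looks_like_scheme = bool(sep and head and all(c in SCHEME_CHARS for c in head))
    # rest[:1] is "" when nothing follows the colon, and "".isdigit() is False.
    if looks_like_scheme and not rest[:1].isdigit():
        return head.lower(), rest
    return default_scheme, url

# Under this rule "path:80" keeps no scheme, but neither does a clsid: URL
# whose identifier happens to begin with a digit.
print(split_scheme_digit_after_colon("path:80"))
print(split_scheme_digit_after_colon("clsid:85bbd92o-42a0-1o69-a2e4-08002b30309d"))
```
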
Later, the logic was altered to test if the URL looked like an integer
(revision 495d12196487, Issue 11467). This restored proper parsing of
clsid:85bbd92o-42a0-1o69-a2e4-08002b30309d and mailto:[email protected],
although another URL given, javascript:123, remains misparsed. The
documentation was subsequently adjusted in Issue 16932 to just demonstrate
www.cwi.nl/%7Eguido/Python.html being parsed as a path.
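
A rough sketch of the "looks like an integer" test, assuming it amounts to trying int() on everything after the colon. The function name split_scheme_unless_integer is hypothetical and this is not a copy of the revision.

```python
import string

SCHEME_CHARS = set(string.ascii_letters + string.digits + "+-.")

def split_scheme_unless_integer(url, default_scheme=""):
    """Hypothetical helper: keep the scheme unless the remainder parses as an int."""
    head, sep, rest = url.partition(":")
    if not (sep and head and all(c in SCHEME_CHARS for c in head)):
        return default_scheme, url
    try:
        int(rest)  # succeeds for "80", "123" and also for "+31641044153"
    except ValueError:
        return head.lower(), rest
    # The remainder looks like a port number, so no scheme is reported.
    return default_scheme, url

print(split_scheme_unless_integer("clsid:85bbd92o-42a0-1o69-a2e4-08002b30309d"))
print(split_scheme_unless_integer("javascript:123"))
```

In this sketch int() also accepts a leading sign, so a tel: URL with a "+" prefix loses its scheme as well.
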
The logic was watered down to its current form by revision 9f6b7576c08c, Issue
14072. Now it tests for a non-digit anywhere after the scheme, so that
tel:+31641044153 is again parsed properly. But it was pointed out that tel:1234
remains misparsed.
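
The current form of the check could be sketched like this, again as a paraphrase of the behaviour described rather than the stdlib source; split_scheme_unless_all_digits is a hypothetical name.

```python
import string

SCHEME_CHARS = set(string.ascii_letters + string.digits + "+-.")

def split_scheme_unless_all_digits(url, default_scheme=""):
    """Hypothetical helper: keep the scheme if any non-digit follows the colon."""
    head, sep, rest = url.partition(":")
    if not (sep and head and all(c in SCHEME_CHARS for c in head)):
        return default_scheme, url
    if not rest or any(c not in "0123456789" for c in rest):
        return head.lower(), rest
    # Only digits after the colon: treated as a port, so no scheme is reported.
    return default_scheme, url

print(split_scheme_unless_all_digits("tel:+31641044153"))  # ('tel', '+31641044153')
print(split_scheme_unless_all_digits("tel:1234"))          # ('', 'tel:1234')
```
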
What’s the next step in the watering-down process? All the attempts so far
break valid URLs in favour of special-casing inputs that are not valid URLs.
----------
nosy: +martin.panter, orsenthil
versions: +Python 2.7, Python 3.5, Python 3.6
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue27657>
_______________________________________