[issue754016] urlparse goes wrong with IP:port without scheme

Anthony Lenton Sat, 21 Jun 2008 11:46:02 -0700

Anthony Lenton <[EMAIL PROTECTED]> added the comment:

I agree with facundobatista that the patch is bad, but for a different
reason: it now breaks with:


>>> import urlparse
>>> urlparse.urlparse ('http:')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/anthony/svn/python26/Lib/urlparse.py", line 108, in urlparse
tuple = urlsplit(url, scheme, allow_fragments)
  File "/home/anthony/svn/python26/Lib/urlparse.py", line 148, in urlsplit
if i > 0 and not url[i+1].isdigit():
IndexError: string index out of range

I'm afraid that it it's not evident that the expected behavior isn't
evident.

Take for example:

>>> import urlparse
>>> urlparse.urlparse('some.com', 'http')
ParseResult(scheme='http', netloc='', path='some.com', params='',
query='', fragment='')

Is the url referring to the some.com domain or to a windows executable file?

If you're using urlparse to parse only absolute urls then probably you
want the first component to be considered a net_loc, but not if you're
thinking of accepting also relative urls.

It would probably be better to be explicit and raise an exception if the
url is invalid, so that the user can prepend a '//' and resubmit if
needed.  Also we'd probably stop seeing bugreports about this issue :)

----------
nosy: +elachuni

_______________________________________
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue754016>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

[issue754016] urlparse goes wrong with IP:port without scheme

Reply via email to