New submission from JTMoon79 <jtm.moon.forum.user+pyt...@gmail.com>:

Copy of issue 10696
This issue is exactly the same as issue 10696 except it affects a different 
function, urllib.parse.urlparse (instead of urllib.parse.urlsplit).

The urlparse function (urllib.parse.urlparse) does not return a separate port 
field.
REPRO STEPS:
>>> import urllib.parse
>>> urllib.parse.urlparse(r'http://foo.bar.com:80/blarg?a=1&b=2')
RETURNS:
ParseResult(scheme='http', netloc='foo.bar.com:80', path='/blarg', params='', 
query='a=1&b=2', fragment='')
EXPECTED: 
ParseResult(scheme='http', netloc='foo.bar.com', path='/blarg', port='80', 
params='', query='a=1&b=2', fragment='')
END REPRO

The documentation at 
http://docs.python.org/py3k/library/urllib.parse.html#urllib.parse.urlsplit 
shows a port as part of the expected result.  What is the purpose of a port 
field if that field is never set?

According to RFC 1808 the syntactic components are 
<scheme>://<net_loc>/<path>;<params>?<query>#<fragment>
However, according to RFC 1738 (which RFC 1808 references)
http://tools.ietf.org/html/rfc1738#section-3.1
the <net_loc> can be further separated into <host> and <port>.
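
To illustrate the split I mean, here is a minimal sketch.  The helper name 
split_netloc is hypothetical (it is not part of urllib), and it assumes a 
plain host:port netloc with no userinfo ("user:pass@") and no bracketed IPv6 
address.

```python
from urllib.parse import urlparse


def split_netloc(netloc):
    """Hypothetical helper: split 'host:port' into (host, port)."""
    # Simplification: ignores userinfo and bracketed IPv6 hosts.
    host, sep, port = netloc.rpartition(':')
    if not sep:
        # No colon found, so the netloc carries no explicit port.
        return netloc, None
    return host, port


result = urlparse('http://foo.bar.com:80/blarg?a=1&b=2')
host, port = split_netloc(result.netloc)
print(host, port)
```

The port is left as a string here only to match the EXPECTED output above; 
returning an integer would be an equally reasonable choice.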

I guess my bigger, more general complaint is: why not make urlparse more 
useful by separating <host> and <port>?
I imagine this is a common need of users.  I like standards, and doing a 
little extra to work with standards makes those standards even more useful.

----------
components: Library (Lib)
messages: 123898
nosy: JTMoon79
priority: normal
severity: normal
status: open
title: port not split in function urllib.parse.urlparse
type: behavior
versions: Python 3.1

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue10697>
_______________________________________