New submission from JTMoon79 <jtm.moon.forum.user+pyt...@gmail.com>:
Copy of issue 10696

This issue is exactly the same as issue 10696, except that it affects a different function: urllib.parse.urlparse (instead of urllib.parse.urlsplit).

The urlparse function in urllib.parse does not return the port field.

REPRO STEPS:
>>> import urllib
>>> import urllib.parse
>>> urllib.parse.urlparse(r'http://foo.bar.com:80/blarg?a=1&b=2')

RETURNS:
ParseResult(scheme='http', netloc='foo.bar.com:80', path='/blarg', params='', query='a=1&b=2', fragment='')

EXPECTED:
ParseResult(scheme='http', netloc='foo.bar.com', path='/blarg', port='80', params='', query='a=1&b=2', fragment='')

END REPRO

The documentation at http://docs.python.org/py3k/library/urllib.parse.html#urllib.parse.urlsplit shows this as expected. What is the purpose of a port parameter if that parameter is never set?

According to RFC 1808, the syntactic components are
<scheme>://<net_loc>/<path>;<params>?<query>#<fragment>
However, according to RFC 1738 (referenced by RFC 1808), http://tools.ietf.org/html/rfc1738#section-3.1, the <net_loc> can be further separated into <host> and <port>.

A bigger, more general complaint: why not make urlparse more useful by separating <host> and <port>? I imagine this is a common need of users.

I like standards, and doing a little extra to work with standards makes those standards even more useful.

----------
components: Library (Lib)
messages: 123898
nosy: JTMoon79
priority: normal
severity: normal
status: open
title: port not split in function urllib.parse.urlparse
type: behavior
versions: Python 3.1

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue10697>
_______________________________________
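
For reference, the <host>/<port> split discussed in the report can be sketched as follows. This is a minimal sketch, assuming the `hostname` and `port` attributes that the standard library's parse results expose alongside the tuple fields; the variable names are illustrative only, and the sketch does not change the tuple that urlparse returns.

```python
from urllib.parse import urlparse

# URL taken from the repro steps in the report.
result = urlparse('http://foo.bar.com:80/blarg?a=1&b=2')

# netloc keeps host and port together, as shown under RETURNS.
netloc = result.netloc

# hostname and port are attributes of the result rather than tuple fields;
# port is an int, or None when the URL carries no port.
host = result.hostname
port = result.port

print(netloc, host, port)
```

Note that `port` comes back as an integer here, not the string '80' shown under EXPECTED.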