New submission from Tim Burke <[email protected]>:
Not sure if this is a documentation or behavior bug, but... the docs for
urllib.request.Request.set_proxy
(https://docs.python.org/3/library/urllib.request.html#urllib.request.Request.set_proxy)
say
> Prepare the request by connecting to a proxy server. *The host and type will
> replace those of the instance*, and the instance’s selector will be the
> original URL given in the constructor.
(Emphasis mine.) In practice, behavior is more nuanced than that:
>>> from urllib.request import Request
>>> req = Request('http://hostame:port/some/path')
>>> req.host, req.type
('hostame:port', 'http')
>>> req.set_proxy('proxy:other-port', 'https')
>>> req.host, req.type # So far, so good...
('proxy:other-port', 'https')
>>>
>>> req = Request('https://hostame:port/some/path')
>>> req.host, req.type
('hostame:port', 'https')
>>> req.set_proxy('proxy:other-port', 'http')
>>> req.host, req.type # Type doesn't change!
('proxy:other-port', 'https')
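Looking at the source
(https://github.com/python/cpython/blob/v3.7.0/Lib/urllib/request.py#L397), it's
obvious why https is treated specially: when the request's type is already https
and no tunnel host has been recorded yet, set_proxy saves the original host as
the tunnel host and leaves the type untouched. Only otherwise does it assign the
new type.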
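
For anyone who wants to check this without a REPL, here is a minimal sketch that
repeats the two py3 cases above as a script. The helper name
host_and_type_after_proxy, the example.com URLs and the proxy:3128 address are
made up for illustration; only Request and set_proxy come from urllib.request.
No network connection is made, since nothing is ever opened.

```python
from urllib.request import Request


def host_and_type_after_proxy(url, proxy_host, proxy_type):
    """Build a Request, call set_proxy once, and return (host, type).

    Hypothetical helper for illustration only; it is not part of urllib.
    """
    req = Request(url)
    req.set_proxy(proxy_host, proxy_type)
    return req.host, req.type


if __name__ == "__main__":
    # http request sent via an "https" proxy: host and type are both replaced.
    print(host_and_type_after_proxy(
        "http://example.com:8080/some/path", "proxy:3128", "https"))

    # https request sent via an "http" proxy: the host is replaced,
    # but the type is left as https.
    print(host_and_type_after_proxy(
        "https://example.com:8443/some/path", "proxy:3128", "http"))
```

If the documentation were taken literally, the second call would report 'http'
as the type; on 3.7 it reports 'https', matching the session above.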
The behavior is consistent with how things worked on py2...
>>> from urllib2 import Request
>>> req = Request('http://hostame:port/some/path')
>>> req.get_host(), req.get_type()
('hostame:port', 'http')
>>> req.set_proxy('proxy:other-port', 'https')
>>> req.get_host(), req.get_type()
('proxy:other-port', 'https')
>>>
>>> req = Request('https://hostame:port/some/path')
>>> req.get_host(), req.get_type()
('hostame:port', 'https')
>>> req.set_proxy('proxy:other-port', 'http')
>>> req.get_host(), req.get_type()
('proxy:other-port', 'https')
... but only if you're actually inspecting host/type along the way!
>>> from urllib2 import Request
>>> req = Request('https://hostame:port/some/path')
>>> req.set_proxy('proxy:other-port', 'http')
>>> req.get_host(), req.get_type()
('proxy:other-port', 'http')
(FWIW, this came up while porting an application from py2 to py3; there was a
unit test expecting that last behavior of proxying an https connection through
an http proxy.)
----------
components: Library (Lib)
messages: 325449
nosy: tburke
priority: normal
severity: normal
status: open
title: urllib.request.Request.set_proxy doesn't (necessarily) replace type
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue34698>
_______________________________________