On Mon, Feb 7, 2022 at 4:56 AM Steve Dower <steve.do...@python.org> wrote:
> On 2/6/2022 4:44 PM, Christian Heimes wrote:
> > If I had the power and time, then I would replace urllib with a simpler,
> > reduced HTTP client that uses platform's HTTP library under the hood
> > (WinHTTP on Windows, NSURLSession (?) on macOS, Web API for Emscripten,
> > maybe curl on Linux/BSD). For non-trivial HTTP requests, httpx or
> > aiohttp are much better suited than urllib.
>
> I'm +1 on this, though I think it would have to be in place before the
> "two releases until removal" kicked in for urllib.request.

Yes, we definitely couldn't deprecate anything regarding downloading over HTTP w/o having a replacement in place. I am not even considering deprecating urllib.parse.

> The stdlib can't get by without at least the basic functionality of curl
> built in natively. But we can do this on most platforms without
> vendoring OpenSSL, which is a HUGE win. Then our default behaviour could
> correctly use proxies (including auto-config), CA certificate bundles,
> integrated authentication, and other OS features that are currently
> ignored by our core.

I also agree this is the best of the 2 options, although I would also accept Christian's other option of a more targeted, tight, standards-compliant solution if that would somehow lead to less maintenance overhead. And when I say "less maintenance overhead," I really mean it: I would question whether following redirects as an option is worth the overhead in this scenario. I'm very much thinking of this from a bootstrap/script/learning scenario and pushing people towards e.g. httpx for anything fancier.

> Chances are we could keep simple urlopen() calls in place, and use the
> deprecation as a "potential change of behaviour" without necessarily
> having to break the API. I'm yet to come across a case where making a
> trivial urlopen() call _better_ would break things (the cases I've seen
> that would break are things like "using an OpenSSL environment variable
> to configure something that I wish had been automatic").

We could try to get fancy and only raise DeprecationWarning in cases where things won't work, which would push back the point at which we have to start steering people to the better API.

> The nature of network/internet access is that we have to break things
> periodically anyway, because all the code that was written over the last
> 30+ years is eventually going to be found to be exploitable. I'd be
> quite happy to say "Python gives you what your OS gives you; update the
> OS for fixes".

Exactly. My guideline for this whole idea would be that if it doesn't make sense in a beginner course that says to "download an HTML page and count all the anchor tags," then it's too fancy for the stdlib. And that should be enough to bootstrap installers which then get you httpx. Otherwise the networking stack moves too fast (from a security POV) and requires specialized knowledge to get right, and we simply have not kept up with it as much as we would like. I think it's okay to admit it might be time to trim this part of the stdlib down to something that we can manage easily (but we *cannot* drop the ability to download something over HTTPS).
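
To make the beginner-course guideline concrete, here is a minimal sketch of the "download an HTML page and count all the anchor tags" exercise using only what the stdlib offers today (urllib.request.urlopen() and html.parser). The names AnchorCounter, count_anchors, and count_anchors_at are made up for illustration, and the URL is just an example; none of this is a proposed API.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class AnchorCounter(HTMLParser):
    """Hypothetical helper: counts opening <a> tags while parsing."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        if tag == "a":
            self.count += 1


def count_anchors(html):
    """Return the number of <a> start tags in an HTML string."""
    parser = AnchorCounter()
    parser.feed(html)
    parser.close()
    return parser.count


def count_anchors_at(url):
    """Download a page with a plain urlopen() call and count its anchors."""
    with urlopen(url) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return count_anchors(html)


if __name__ == "__main__":
    # Example URL only; any HTTPS page works.
    print(count_anchors_at("https://www.python.org/"))
```

The only networking piece is the single urlopen() call with no custom handlers, which is roughly the level of functionality the guideline would keep in the stdlib.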
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/VP2WXOBWPGAX7UIH25DWRSYWFEDNINNU/
Code of Conduct: http://python.org/psf/codeofconduct/