On Sun, 6 Feb 2022 at 16:51, Christian Heimes <christ...@python.org> wrote:

> The urllib package -- and to some degree also the http package -- are
> a constant source of security bugs. The code is old and the parsers for
> HTTP and URLs don't handle edge cases well. Python core lacks a true
> maintainer of the code. To be honest, we have to admit defeat and be up
> front that urllib is not up to the task for this decade. It was designed
> and written during a more friendly, less scary time on the internet.
>
> If I had the power and time, then I would replace urllib with a simpler,
> reduced HTTP client that uses the platform's HTTP library under the hood
> (WinHTTP on Windows, NSURLSession (?) on macOS, Web API for Emscripten,
> maybe curl on Linux/BSD). For non-trivial HTTP requests, httpx or
> aiohttp are much better suited than urllib.
>
> The second best option is to reduce the feature set of urllib to core
> HTTP (no ftp, proxy, HTTP auth) and a partial rewrite with stricter,
> more standards-conformant parsers for URLs, query strings, and RFC 2822
> instead of RFC 822 for headers.

I'd likely be fine with either of these two options. I'm not worried
about supporting "advanced" uses. But having no way of getting a file
from the internet without relying on 3rd party packages seems like a
huge gap in functionality for a modern language. And having to use a
3rd party library to parse URLs will simply push more people to use
home-grown regexes rather than something safe and correct. Remember
that a lot of Python users are not professional software developers,
but scientists, data analysts, and occasional users, for whom the
existence of something in the stdlib is the *only* reason they have
any idea that URLs need specialised parsing in the first place.
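
To make the regex concern concrete, here is a minimal sketch comparing a
naive home-grown pattern with urllib.parse.urlsplit. The sample URL, the
pattern, and the helper names naive_host and stdlib_host are all made up
for illustration; real home-grown regexes vary, but many share this kind
of blind spot.

```python
import re
from urllib.parse import urlsplit

# Hypothetical sample URL with userinfo, an explicit port, and a query string.
URL = "https://user:pw@example.com:8443/data/file.csv?rev=2#top"

# A naive home-grown pattern (illustrative only): it treats everything
# between "://" and the first "/" as the host.
NAIVE = re.compile(r"^(\w+)://([^/]+)(/.*)?$")


def naive_host(url):
    """Return what the naive regex believes the host is, or None."""
    match = NAIVE.match(url)
    return match.group(2) if match else None


def stdlib_host(url):
    """Return the host as parsed by the standard library."""
    return urlsplit(url).hostname


if __name__ == "__main__":
    # The regex lumps credentials and port in with the host name.
    print("regex :", naive_host(URL))
    # urlsplit separates userinfo, host, and port.
    print("stdlib:", stdlib_host(URL), urlsplit(URL).port)
```

The regex result still contains the credentials and the port, which is
exactly the sort of thing that leaks into logs or breaks host comparisons,
while urlsplit exposes each component separately.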

And while we all like to say 3rd party modules are great, the reality
is that they pose a genuine problem for many of these
non-specialist users - and I say that as a packaging specialist and
pip maintainer. The packaging ecosystem is *not* newcomer-friendly in
the way that core Python is, much as we're trying to improve that
situation.

I've said it previously, but I'll reiterate - IMO this *must* have a
PEP, and that PEP must be clear that the intention is to *remove*
urllib, not simply to "deprecate and then think about it". That could
be making it part of PEP 594, or a separate PEP, but one way or
another it needs a PEP.

Paul
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/KT6TGUBTLLETHES2OVVGZWSGYC5JCEKC/
Code of Conduct: http://python.org/psf/codeofconduct/
