Steven D'Aprano <steve+pyt...@pearwood.info> added the comment:

> The “urllib.parse” module generally follows RFC 3986, which does not
> allow a literal backslash in the “userinfo” part:

And yet the urlparse() function seems to allow arbitrary unescaped characters. This is from 3.8.0a0:

py> from urllib.parse import urlparse
py> urlparse(r'http://spam\eggs!cheese&aardv...@evil.com').netloc
'spam\\eggs!cheese&aardv...@evil.com'
py> urlparse(r'http://spam\eggs!cheese&aardv...@evil.com').hostname
'evil.com'

If that's a bug, it is a separate bug from this issue.

Backslash doesn't seem relevant to the security issue of userinfo being used to mislead:

py> urlparse('http://www.google....@evil.com').netloc
'www.google....@evil.com'
py> urlparse('http://www.google....@evil.com').hostname
'evil.com'

If it is relevant, can somebody explain to me how?

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue35748>
_______________________________________
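
For anyone following along, here is a minimal sketch of the netloc versus hostname distinction discussed in this comment. The helper name describe_authority and the example hosts trusted.example and evil.example are hypothetical, made up for illustration; only urllib.parse.urlsplit and its netloc and hostname attributes come from the standard library. It is not a proposed fix, just a way to see which part of the authority is userinfo and which part is the host actually contacted.

```python
from urllib.parse import urlsplit


def describe_authority(url):
    """Split a URL's authority into userinfo and hostname (illustrative helper)."""
    parts = urlsplit(url)
    netloc = parts.netloc
    # Everything before the last '@' is userinfo; the host comes after it.
    userinfo, sep, _ = netloc.rpartition('@')
    return {
        'netloc': netloc,
        'userinfo': userinfo if sep else None,
        'hostname': parts.hostname,
    }


# Hypothetical URL whose userinfo looks like a host name.
info = describe_authority('http://trusted.example@evil.example/login')
print(info['netloc'])
print(info['userinfo'])
print(info['hostname'])
```

A caller worried about misleading URLs could, under these assumptions, compare the hostname against an allow-list and treat any non-empty userinfo as suspicious, with or without a backslash in it.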