Giovanni Cappellotto <potoma...@gmail.com> added the comment:
What do you mean when you say that urlparse behaves unexpectedly? I tried your example and I think urlparse's behavior is correct.

From RFC 1738:

> Octets must be encoded if they have no corresponding graphic
> character within the US-ASCII coded character set, if the use of the
> corresponding character is unsafe, or if the corresponding character
> is reserved for some other interpretation within the particular URL
> scheme.

Your example:

```
>>> from urllib.parse import urlparse
>>> urlparse('http://user:pass#?[w...@example.com:80/path')
ParseResult(scheme='http', netloc='user:pass', path='', params='', query='', fragment='?[w...@example.com:80/path')
```

Part of the password is parsed as the URL fragment because the character `#` has a special meaning:

> The character "#" is unsafe and should
> always be encoded because it is used in World Wide Web and in other
> systems to delimit a URL from a fragment/anchor identifier that might
> follow it.

----------
nosy: +potomak

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue37678>
_______________________________________
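
For illustration, here is a minimal sketch of the encoding that RFC 1738 asks for, applied before the URL is handed to urlparse. The user name and password below are hypothetical stand-ins, not the values from the report, and passing `safe=''` to `quote` is my own choice so that `/` gets encoded as well.

```python
from urllib.parse import quote, unquote, urlparse

# Hypothetical credentials for illustration only (not taken from the report).
user = "user"
password = "pa#ss?w/rd"

# Percent-encode the userinfo part; safe="" means "/" is encoded too,
# so "#", "?" and "/" can no longer be read as URL delimiters.
netloc = f"{quote(user, safe='')}:{quote(password, safe='')}@example.com:80"
url = f"http://{netloc}/path"

parts = urlparse(url)
print(parts.netloc)              # userinfo stays inside the netloc
print(repr(parts.fragment))      # no fragment is split off
print(unquote(parts.password))   # urlparse leaves the password percent-encoded
```

With the password encoded this way, the `#` no longer starts a fragment and the host, port, and path are parsed from the intended positions. Note that `parts.password` is returned still percent-encoded, so the caller has to `unquote` it.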