[issue37678] Incorrect behaviour for user@password URI pattern in urlparse

Giovanni Cappellotto Sun, 28 Jul 2019 20:30:40 -0700


Giovanni Cappellotto <potoma...@gmail.com> added the comment:


What do you mean when you say that urlparse acts unexpectedly?

I tried your example and I think urlparse's behavior is correct.

From RFC 1738:

> Octets must be encoded if they have no corresponding graphic
> character within the US-ASCII coded character set, if the use of the
> corresponding character is unsafe, or if the corresponding character
> is reserved for some other interpretation within the particular URL
> scheme.

Your example:

```
>>> from urllib.parse import urlparse
>>> urlparse('http://user:pass#?[w...@example.com:80/path')
ParseResult(scheme='http', netloc='user:pass', path='', params='', query='', fragment='?[w...@example.com:80/path')
```

Part of the password is parsed as the URL fragment because the character `#` 
has a special meaning:

> The character "#" is unsafe and should
> always be encoded because it is used in World Wide Web and in other
> systems to delimit a URL from a fragment/anchor identifier that might
> follow it.
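
A minimal sketch of the usual workaround: percent-encode the userinfo before building the URL. The credentials below are hypothetical placeholders, not the ones from the original report, and `quote(..., safe='')` is only one reasonable way to do the encoding.

```python
from urllib.parse import quote, unquote, urlparse

# Hypothetical credentials; the password contains reserved characters.
user = "user"
password = "s3cret#?[x"

# safe="" makes quote() percent-encode every reserved character, including "/".
netloc = f"{quote(user, safe='')}:{quote(password, safe='')}@example.com:80"
result = urlparse(f"http://{netloc}/path")

print(result.netloc)    # userinfo stays inside netloc once "#" is encoded
print(result.fragment)  # empty: no raw "#" is left to start a fragment
print(result.hostname, result.port)

# urlparse does not decode the password; unquote() recovers the original.
print(unquote(result.password))
```

With the `#` encoded as `%23`, nothing in the userinfo is treated as a fragment delimiter, so the host, port and path are parsed as intended.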

----------
nosy: +potomak

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue37678>
_______________________________________