[issue18140] urlparse, urlsplit confused when password includes fragment (#), query (?)

2020-12-23 Thread Senthil Kumaran


Senthil Kumaran  added the comment:

Not a bug. Message msg375109 explains how to quote and unquote the '#' in 
the password field, and demonstrates that urllib parses it correctly.

I guess it was set to open by mistake. Closing it again.

--
stage:  -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue18140] urlparse, urlsplit confused when password includes fragment (#), query (?)

2020-12-18 Thread Senthil Kumaran


Change by Senthil Kumaran :


--
versions: +Python 3.10 -Python 2.7, Python 3.5, Python 3.6




[issue18140] urlparse, urlsplit confused when password includes fragment (#), query (?)

2020-08-10 Thread david.six


david.six  added the comment:

tl;dr: '#', '?' and a few other characters should be URL-encoded/%-encoded when 
they appear in userinfo; once encoded, they already parse correctly.

---

Following up on what Martin said, RFC 3986 has the specifications for how these 
examples should be parsed.

userinfo    = *( unreserved / pct-encoded / sub-delims / ":" )

unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
pct-encoded = "%" HEXDIG HEXDIG
sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
            / "*" / "+" / "," / ";" / "="

Notably, gen-delims other than ":" are _not_ included in the allowed 
characters, nor are non-ASCII characters.

gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"

These and other characters not mentioned should be URL-encoded/%-encoded if 
they appear in the password.
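
For contrast, here is a minimal sketch of what happens when the '#' is left 
unencoded. The URL is an illustrative variant of the first example with a 
literal '#' in the password. The parser treats the '#' as the start of the 
fragment, so the netloc ends early and the real host lands in the fragment:

```python
from urllib.parse import urlsplit

# Illustrative URL: same as the encoded example, but with a literal '#'
# in the password instead of '%23'.
raw = 'http://auser:secr#et@192.168.0.1:8080/a/b/c.html'
parts = urlsplit(raw)

print(parts.netloc)    # 'auser:secr'
print(parts.path)      # ''
print(parts.fragment)  # 'et@192.168.0.1:8080/a/b/c.html'
print(parts.username)  # None, because the netloc contains no '@'
```

This follows the RFC 3986 grammar: '#' cannot appear literally in userinfo, 
so the first '#' always delimits the fragment.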

Taking the first example:

>>> from urllib.parse import unquote, urlparse
>>> u = 'http://auser:secr%23et@192.168.0.1:8080/a/b/c.html'
>>> urlparse(u)
ParseResult(scheme='http', netloc='auser:secr%23et@192.168.0.1:8080', 
path='/a/b/c.html', params='', query='', fragment='')
>>> unquote(urlparse(u).password)
'secr#et'
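
Going the other way, a sketch of how a caller might build such a URL from raw 
credentials. The user name, password, and host below are placeholders chosen 
for illustration. Passing safe='' to quote() is one reasonable choice here, 
so that '/' is encoded too:

```python
from urllib.parse import quote, unquote, urlsplit

# Placeholder credentials for illustration only.
user = 'auser'
password = 'secr#et?x'

# Percent-encode each userinfo component separately; safe='' also encodes '/'.
userinfo = f"{quote(user, safe='')}:{quote(password, safe='')}"
url = f'http://{userinfo}@192.168.0.1:8080/a/b/c.html'

parts = urlsplit(url)
print(url)                      # http://auser:secr%23et%3Fx@192.168.0.1:8080/a/b/c.html
print(parts.hostname)           # 192.168.0.1
print(unquote(parts.password))  # secr#et?x
```

Note that urlsplit() returns the password still percent-encoded, so the 
caller has to unquote() it before use.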

--
nosy: +david.six
status: pending -> open




[issue18140] urlparse, urlsplit confused when password includes fragment (#), query (?)

2019-01-18 Thread Martin Panter

Martin Panter  added the comment:

Today I read RFC 3986, and I think the URLs in the bug reports are valid, and 
are already parsed correctly. The path is allowed to have a literal “at” symbol:

path-abempty = *( "/" segment )
segment = *pchar
pchar = unreserved / pct-encoded / sub-delims / ":" / "@"

The query and fragment are allowed to have “at” and question marks:

query = *( pchar / "/" / "?" )
fragment = *( pchar / "/" / "?" )

So I think this could be closed because the parsing is working correctly.
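
A quick sketch to illustrate the point, using a made-up URL (example.com and 
the path, query and fragment values are placeholders, not taken from the 
original reports). Once the authority has no userinfo, a literal "@" or "?" 
later in the URL does not disturb the split:

```python
from urllib.parse import urlsplit

# Made-up URL with "@" in the path, and "@" and "?" in the query and fragment.
url = 'http://example.com/a@b/c?x=1@2?y#frag@z?w'
parts = urlsplit(url)

print(parts.netloc)    # 'example.com'
print(parts.path)      # '/a@b/c'
print(parts.query)     # 'x=1@2?y'
print(parts.fragment)  # 'frag@z?w'
print(parts.username)  # None
```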

--
resolution:  -> not a bug
status: open -> pending




[issue18140] urlparse, urlsplit confused when password includes fragment (#), query (?)

2017-06-16 Thread Martin Panter

Changes by Martin Panter :


--
dependencies: +[security] urllib connects to a wrong host




[issue18140] urlparse, urlsplit confused when password includes fragment (#), query (?)

2016-03-19 Thread Martin Panter

Changes by Martin Panter :


--
title: urlparse.urlsplit confused to fragment when password include # -> 
urlparse, urlsplit confused when password includes fragment (#), query (?)
versions: +Python 3.5, Python 3.6
