[issue29651] Inconsistent/undocumented urlsplit/urlparse behavior on invalid inputs

2017-05-15 Thread Senthil Kumaran

Changes by Senthil Kumaran:


--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29651] Inconsistent/undocumented urlsplit/urlparse behavior on invalid inputs

2017-05-15 Thread Senthil Kumaran

Senthil Kumaran added the comment:


New changeset 75b8a54bcad70806d9dcbbe20786f4d9092ab39c by Senthil Kumaran in 
branch '3.6':
 bpo-29651 - Cover edge case of square brackets in urllib docs (#1128) (#1596)
https://github.com/python/cpython/commit/75b8a54bcad70806d9dcbbe20786f4d9092ab39c


--




[issue29651] Inconsistent/undocumented urlsplit/urlparse behavior on invalid inputs

2017-05-15 Thread Senthil Kumaran

Senthil Kumaran added the comment:


New changeset 72e5aa1ef812358b3b113e784e7365fec13dfd69 by Senthil Kumaran in 
branch '3.5':
 bpo-29651 - Cover edge case of square brackets in urllib docs (#1128) (#1597)
https://github.com/python/cpython/commit/72e5aa1ef812358b3b113e784e7365fec13dfd69


--




[issue29651] Inconsistent/undocumented urlsplit/urlparse behavior on invalid inputs

2017-05-15 Thread Senthil Kumaran

Changes by Senthil Kumaran:


--
pull_requests: +1691




[issue29651] Inconsistent/undocumented urlsplit/urlparse behavior on invalid inputs

2017-05-15 Thread Senthil Kumaran

Changes by Senthil Kumaran:


--
pull_requests: +1690




[issue29651] Inconsistent/undocumented urlsplit/urlparse behavior on invalid inputs

2017-05-15 Thread Senthil Kumaran

Senthil Kumaran added the comment:


New changeset f6e863d868a621594df2a8abe072b5d4766e7137 by Senthil Kumaran 
(Howie Benefiel) in branch 'master':
 bpo-29651 - Cover edge case of square brackets in urllib docs (#1128)
https://github.com/python/cpython/commit/f6e863d868a621594df2a8abe072b5d4766e7137


--




[issue29651] Inconsistent/undocumented urlsplit/urlparse behavior on invalid inputs

2017-04-14 Thread Berker Peksag

Changes by Berker Peksag:


--
stage: needs patch -> patch review
versions: +Python 3.5




[issue29651] Inconsistent/undocumented urlsplit/urlparse behavior on invalid inputs

2017-04-13 Thread Roundup Robot

Changes by Roundup Robot:


--
pull_requests: +1263




[issue29651] Inconsistent/undocumented urlsplit/urlparse behavior on invalid inputs

2017-04-13 Thread Howie Benefiel

Howie Benefiel added the comment:

I'm going to make a note in the documentation. I should have a PR for it in 
about 1 day.

--
nosy: +Howie Benefiel




[issue29651] Inconsistent/undocumented urlsplit/urlparse behavior on invalid inputs

2017-03-03 Thread Raymond Hettinger

Raymond Hettinger added the comment:

A note in the docs would be useful.  This API is far too well established to 
make any behavioral changes at this point.

--
nosy: +rhettinger




[issue29651] Inconsistent/undocumented urlsplit/urlparse behavior on invalid inputs

2017-03-03 Thread Terry J. Reedy

Changes by Terry J. Reedy:


--
nosy: +orsenthil
stage:  -> needs patch
versions:  -Python 3.3, Python 3.4, Python 3.5




[issue29651] Inconsistent/undocumented urlsplit/urlparse behavior on invalid inputs

2017-02-25 Thread Vasiliy Faronov

New submission from Vasiliy Faronov:

There is a problem with the standard library's urlsplit and urlparse functions, 
in Python 2.7 (module urlparse) and 3.2+ (module urllib.parse).

The documentation for these functions [1] does not explain how they behave when 
given an invalid URL.

One could try invoking them manually and conclude that they tolerate anything 
thrown at them:

>>> urlparse('http:::!!::!!++///')
ParseResult(scheme='http', netloc='', path='::!!::!!++///',
params='', query='', fragment='')

>>> urlparse(os.urandom(32).decode('latin-1'))
ParseResult(scheme='', netloc='', path='\x7f¼â1gdä»6\x82', params='',
query='', fragment='\n\xadJ\x18+fli\x9cÛ\x9ak*ÄÅ\x02³F\x85Ç\x18')

Without studying the source code, it is impossible to know that there is a very 
narrow class of inputs on which they raise ValueError [2]:

>>> urlparse('http://[')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.5/urllib/parse.py", line 295, in urlparse
    splitresult = urlsplit(url, scheme, allow_fragments)
  File "/usr/lib/python3.5/urllib/parse.py", line 345, in urlsplit
    raise ValueError("Invalid IPv6 URL")
ValueError: Invalid IPv6 URL
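
For code that parses untrusted URLs, the practical consequence might look like the sketch below. It is only an illustration and not part of any patch: `safe_urlsplit` is a hypothetical helper name, and returning None on failure is an assumed convention.

```python
from urllib.parse import urlsplit


def safe_urlsplit(url):
    """Return urlsplit(url), or None if urlsplit raises ValueError.

    Hypothetical helper: the name and the None-on-error convention
    are assumptions made for this example.
    """
    try:
        return urlsplit(url)
    except ValueError:
        # Raised, for example, for an unbalanced '[' in the authority.
        return None


print(safe_urlsplit('http://['))       # rejected input
print(safe_urlsplit('http://[::1]/'))  # well-formed IPv6 literal
```

A caller that assumes urlsplit never raises would skip the try/except and could fail on the first input.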

This could be viewed as a documentation issue. But it could also be viewed as 
an implementation issue. Instead of raising ValueError on those square 
brackets, urlsplit could simply consider them *invalid* parts of an RFC 3986 
reg-name, and lump them into netloc, as it already does with other *invalid* 
characters:

>>> urlparse('http://\0\0æí\n/')
ParseResult(scheme='http', netloc='\x00\x00æí\n', path='/', params='',
query='', fragment='')
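
The asymmetry can be checked with a short script. This is a sketch assuming a current Python 3 urllib.parse; the sample URLs and the helper name `raises_value_error` are made up for illustration and do not come from the report above.

```python
from urllib.parse import urlsplit


def raises_value_error(url):
    """Report whether urlsplit(url) raises ValueError (illustrative helper)."""
    try:
        urlsplit(url)
    except ValueError:
        return True
    return False


# Characters that are not valid in an RFC 3986 reg-name are kept in netloc.
print(repr(urlsplit('http://exa mple!!/').netloc))

# An unbalanced square bracket in the authority raises instead.
for url in ('http://[', 'http://]'):
    print(url, raises_value_error(url))
```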

Note that the raising behavior was introduced in Python 2.7/3.2.

See also issue 8721 [3].


[1] https://docs.python.org/3/library/urllib.parse.html
[2] https://github.com/python/cpython/blob/e32ec93/Lib/urllib/parse.py#L406-L408
[3] http://bugs.python.org/issue8721

--
assignee: docs@python
components: Documentation, Library (Lib)
messages: 288577
nosy: docs@python, vfaronov
priority: normal
severity: normal
status: open
title: Inconsistent/undocumented urlsplit/urlparse behavior on invalid inputs
type: behavior
versions: Python 2.7, Python 3.3, Python 3.4, Python 3.5, Python 3.6, Python 3.7
