[issue22234] urllib.parse.urlparse accepts any falsy value as an url

2021-06-13 Thread Jacob Walls


Jacob Walls  added the comment:

Well, now I've looked at the CPython test failure more closely, and it's in 
`test.test_venv.EnsurePipTest`, where we just download the latest pip. 

pip's release cadence suggests a new release in July, about 2-4 weeks from now. 
So I'll wait on adding the DeprecationWarning for passing `None` unless folks 
ask for it during review, and am happy to wait a few weeks for green CI.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22234] urllib.parse.urlparse accepts any falsy value as an url

2021-06-13 Thread Jacob Walls


Jacob Walls  added the comment:

Both this ticket and #19094 started from one method in urllib.parse and then 
generalized a proposal for the rest of the submodule to move away from 
duck-typing and to instead raise TypeErrors (or at least some error) for 
invalid types. The attached PR does that. But it broke pip, which was passing 
None to the `fragment` argument of `urlunsplit()`.

In a blaze of efficiency, Pip already merged my fix, but since the CPython test 
suite depends on the current version of pip -- perhaps there's an argument for 
at least raising a deprecation warning for ``None`` and then converting to '' 
or b''? In any case, green CI makes patches more reviewable, so I'm inclined to 
add such a warning/conversion for now and then see how feedback goes. Cheers, 
all.
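
A minimal sketch of the warn-and-convert idea described above, not the code in the attached PR. The wrapper name `urlunsplit_compat` is hypothetical; only `urllib.parse.urlunsplit` and `warnings.warn` are real stdlib calls. It assumes `None` should become '' or b'' depending on the type of the other components.

```python
import warnings
from urllib.parse import urlunsplit


def urlunsplit_compat(components):
    """Hypothetical wrapper: warn about None fields, then treat them as empty."""
    names = ("scheme", "netloc", "path", "query", "fragment")
    # Choose '' or b'' to match the first component that is not None.
    first = next((c for c in components if c is not None), "")
    empty = b"" if isinstance(first, (bytes, bytearray)) else ""
    cleaned = []
    for name, value in zip(names, components):
        if value is None:
            warnings.warn(
                f"passing None as {name!r} is deprecated; use {empty!r}",
                DeprecationWarning,
                stacklevel=2,
            )
            value = empty
        cleaned.append(value)
    return urlunsplit(tuple(cleaned))
```

A caller that used to pass `None` as the fragment keeps working but sees one DeprecationWarning per `None` field.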

BTW: perhaps we can close this as dupe of #19094?




[issue22234] urllib.parse.urlparse accepts any falsy value as an url

2021-06-13 Thread Jacob Walls


Change by Jacob Walls :


--
nosy: +jacobtylerwalls
nosy_count: 7.0 -> 8.0
pull_requests: +25291
pull_request: https://github.com/python/cpython/pull/26687




[issue22234] urllib.parse.urlparse accepts any falsy value as an url

2016-05-06 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Thank you Luiz for your testing.

> __main__:1: DeprecationWarning: Use of '' is deprecated

It is bad that a warning is emitted for the default value.

> Will bytes be deprecated if used as a default_schema?

No, only using an empty bytes scheme with a string url is deprecated (because it 
works now). Using a non-empty bytes scheme with a string url just causes an error.

> Shouldn't it complain that the types are different?

This special case is left for compatibility with wrappers.

> __main__:1: DeprecationWarning: Use of [] is deprecated

The warning should not be emitted for a value that the user did not provide.

If we go the way of strong deprecation, the patch needs reworking. But that 
way leads to overcomplication.

Since, as Antti pointed out, only str, bytes and bytearray are documented, I 
think we can ignore the breakage for other types.




[issue22234] urllib.parse.urlparse accepts any falsy value as an url

2016-04-28 Thread Antti Haapala

Antti Haapala added the comment:

I do not believe there is code that would depend on `urlparse(urlstring={})` 
*not* throwing an error, since `{}` obviously is neither a URL, nor a string.

Further down, the documentation explicitly states that

> The URL parsing functions were originally designed to operate on 
> character strings only. In practice, it is useful to be able to 
> manipulate properly quoted and encoded URLs as sequences of ASCII 
> bytes. Accordingly, the URL parsing functions in this module all 
> operate on bytes and bytearray objects in addition to str objects.

As the documentation does not state that it should work on any other objects, 
there shouldn't be anything that needs to go through deprecation. Furthermore, 
even in 3.5 the `bool(datetime.time(0, 0)) == False` behavior was removed 
without any deprecation cycle, despite it having been a documented feature for 
more than a decade (unlike this one).

And IMHO not giving an object of the expected type should result in a TypeError.




[issue22234] urllib.parse.urlparse accepts any falsy value as an url

2016-04-27 Thread Senthil Kumaran

Senthil Kumaran added the comment:

Serhiy:

I left review comments on the patch too. 
I agree with "tightening" the input arg type in these urlparse functions. 

Before we go for the next version, I think it will be helpful to enumerate the 
behavior for wrong arg types that you would like to see for these functions.

1. Invalid formats like {}, [], None could just be TypeError.

2. The mix of bytes and str should be deprecated, and the deprecation warning 
could suggest the single type we encourage.

3. Anything else w.r.t. special rules for various parts of the url.

In general, if we are going with the deprecation cycle, we might as well 
decide what to allow and present it in a simple way.
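
Points 1 and 2 of the list above could be sketched roughly as follows. This is an illustration under stated assumptions, not any of the attached patches: the function name `check_url_args` is hypothetical, and accepting bytearray alongside bytes is an assumption taken from the documentation quoted elsewhere in this thread.

```python
import warnings

# Assumption: these are the only accepted argument types.
_URL_TYPES = (str, bytes, bytearray)


def check_url_args(*args):
    """Hypothetical validator for rules 1 and 2 of the list above."""
    saw_str = saw_bytes = False
    for arg in args:
        # Rule 1: {}, [], None and other unsupported types raise TypeError.
        if not isinstance(arg, _URL_TYPES):
            raise TypeError(
                f"expected str, bytes or bytearray, got {type(arg).__name__}"
            )
        if isinstance(arg, str):
            saw_str = True
        else:
            saw_bytes = True
    # Rule 2: mixing str with bytes-like values only warns, and the
    # message names the single type the caller is encouraged to use.
    if saw_str and saw_bytes:
        preferred = "str" if isinstance(args[0], str) else "bytes"
        warnings.warn(
            f"mixing str and bytes arguments is deprecated; "
            f"pass only {preferred}",
            DeprecationWarning,
            stacklevel=2,
        )
```

Here the encouraged type is simply the type of the first argument; that choice is arbitrary and only serves the illustration.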




[issue22234] urllib.parse.urlparse accepts any falsy value as an url

2016-04-26 Thread Luiz Poleto

Luiz Poleto added the comment:

As for urlparse_empty_bad_arg_disallow.patch, I didn't go too deep into testing 
it but I found that calling urlparse with different non-str args produces 
different errors:

urlparse({})
TypeError: unhashable type: 'slice'

urlparse([])
AttributeError: 'list' object has no attribute 'decode'

urlparse(())
AttributeError: 'tuple' object has no attribute 'decode'

I thought they should all raise a TypeError, but again, I am not sure whether 
this is what the patch's author intended.




[issue22234] urllib.parse.urlparse accepts any falsy value as an url

2016-04-26 Thread Martin Panter

Martin Panter added the comment:

Regarding urlparse(b'', ''). Currently the second parameter is “scheme”, which 
is documented as being an empty text string by default. If we deprecate this, 
we should update the documentation.




[issue22234] urllib.parse.urlparse accepts any falsy value as an url

2016-04-26 Thread Luiz Poleto

Luiz Poleto added the comment:

I am seeing the following results when running urlparse with patch 
urlparse_empty_bad_arg_deprecation2.patch applied:

>>> urllib.parse.urlparse({})
__main__:1: DeprecationWarning: Use of {} is deprecated
__main__:1: DeprecationWarning: Use of '' is deprecated
ParseResultBytes(scheme=b'', netloc=b'', path=b'', params=b'', query=b'', 
fragment=b'')

>>> urllib.parse.urlparse('', b'')
__main__:1: DeprecationWarning: Use of b'' is deprecated
/home/poleto/SCMws/python/latest/cpython/Lib/urllib/parse.py:378: 
DeprecationWarning: Use of b'' is deprecated
  splitresult = urlsplit(url, scheme, allow_fragments)
ParseResult(scheme=b'', netloc='', path='', params='', query='', fragment='')
Will bytes be deprecated if used as a default_schema?

>>> urllib.parse.urlparse(b'', '')
ParseResultBytes(scheme=b'', netloc=b'', path=b'', params=b'', query=b'', 
fragment=b'')
Shouldn't it complain that the types are different? In fact it does, if you 
don't provide empty strings:

>>> urllib.parse.urlparse(b'www.python.org', 'http')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "(...)/cpython/Lib/urllib/parse.py", line 377, in urlparse
    url, scheme, _coerce_result = _coerce_args(url, scheme)
  File "(...)/cpython/Lib/urllib/parse.py", line 120, in _coerce_args
    raise TypeError("Cannot mix str and non-str arguments")
TypeError: Cannot mix str and non-str arguments

>>> urllib.parse.urlparse({'a' : 1})
__main__:1: DeprecationWarning: Use of '' is deprecated
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "(...)/cpython/Lib/urllib/parse.py", line 377, in urlparse
    url, scheme, _coerce_result = _coerce_args(url, scheme)
  File "(...)/cpython/Lib/urllib/parse.py", line 128, in _coerce_args
    return _decode_args(args) + (_encode_result,)
  File "(...)/cpython/Lib/urllib/parse.py", line 98, in _decode_args
    return tuple(x.decode(encoding, errors) if x else '' for x in args)
  File "(...)/cpython/Lib/urllib/parse.py", line 98, in <genexpr>
    return tuple(x.decode(encoding, errors) if x else '' for x in args)
AttributeError: 'dict' object has no attribute 'decode'

>>> urllib.parse.urlparse(['a', 'b', 'c'])
__main__:1: DeprecationWarning: Use of [] is deprecated
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "(...)/cpython/Lib/urllib/parse.py", line 377, in urlparse
    url, scheme, _coerce_result = _coerce_args(url, scheme)
  File "(...)/cpython/Lib/urllib/parse.py", line 128, in _coerce_args
    return _decode_args(args) + (_encode_result,)
  File "(...)/cpython/Lib/urllib/parse.py", line 98, in _decode_args
    return tuple(x.decode(encoding, errors) if x else '' for x in args)
  File "(...)/cpython/Lib/urllib/parse.py", line 98, in <genexpr>
    return tuple(x.decode(encoding, errors) if x else '' for x in args)
AttributeError: 'list' object has no attribute 'decode'

I thought about writing test cases, but I wasn't 100% sure whether the above is 
working as expected, so I thought I should ask first.




[issue22234] urllib.parse.urlparse accepts any falsy value as an url

2016-04-26 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Updated patch addresses Martin's comment.

--
Added file: 
http://bugs.python.org/file42607/urlparse_empty_bad_arg_deprecation2.patch





[issue22234] urllib.parse.urlparse accepts any falsy value as an url

2016-04-26 Thread Anish Shah

Changes by Anish Shah :


--
nosy:  -anish.shah




[issue22234] urllib.parse.urlparse accepts any falsy value as an url

2016-04-26 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

And here is a simpler patch that just disallows bad arguments without deprecation.

--
Added file: 
http://bugs.python.org/file42602/urlparse_empty_bad_arg_disallow.patch




[issue22234] urllib.parse.urlparse accepts any falsy value as an url

2016-04-26 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Here is a patch that deprecates empty non-str and non-decodable arguments for 
urlparse, urlsplit, urlunparse, urlunsplit, urldefrag, and parse_qsl.

--
Added file: 
http://bugs.python.org/file42600/urlparse_empty_bad_arg_deprecation.patch




[issue22234] urllib.parse.urlparse accepts any falsy value as an url

2016-04-26 Thread Martin Panter

Martin Panter added the comment:

Luiz: Your _36 patch looks like it adds an unconditional warning whenever 
urlparse() is called, but I would have expected it to depend on the type of the 
“url” parameter.

There are related functions that seem to accept false values like None in 
Python 3, but not in Python 2. Perhaps they should also be considered with any 
changes:

urlsplit(None)
parse_qs(None)
parse_qsl(None)
urldefrag(None)

Also, I wonder if we should continue to accept bytearray as well as bytes. 
Bytearray has a decode() method.
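
The point about bytearray can be checked directly. This short sketch only uses built-in types; the sample URL is an arbitrary illustration.

```python
# bytearray exposes the same decode() method as bytes, so code that
# duck-types on decode() treats the two alike; list has no such method.
raw = bytearray(b"http://www.python.org/doc")
decoded = raw.decode("ascii")

print(type(decoded).__name__, decoded)
print(hasattr(bytes, "decode"), hasattr(bytearray, "decode"))
print(hasattr(list, "decode"))
```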

--
stage:  -> patch review




[issue22234] urllib.parse.urlparse accepts any falsy value as an url

2016-04-25 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

`x != ''` emits BytesWarning if x is bytes. The 'encode' attribute is not needed 
for URL parsing. any() is slower than a `for` loop.

I would suggest looking at the efficient os.path implementations.

--
nosy: +serhiy.storchaka




[issue22234] urllib.parse.urlparse accepts any falsy value as an url

2016-04-25 Thread Luiz Poleto

Luiz Poleto added the comment:

As discussed on the Mentors list, the attached patch (issue22234_37.patch) 
changes the urlparse function to handle non-str and non-bytes arguments and 
adds a new test case for it.

--
Added file: http://bugs.python.org/file42592/issue22234_37.patch




[issue22234] urllib.parse.urlparse accepts any falsy value as an url

2016-04-25 Thread Luiz Poleto

Luiz Poleto added the comment:

As discussed on the Mentors list, the attached patch (issue22234_36.patch) 
includes the deprecation warning (and related test) on the urlparse function.

--
keywords: +patch
versions: +Python 3.6 -Python 3.4
Added file: http://bugs.python.org/file42591/issue22234_36.patch




[issue22234] urllib.parse.urlparse accepts any falsy value as an url

2016-04-22 Thread R. David Murray

R. David Murray added the comment:

I just posted about this on the mentors list, where someone brought this issue 
up with a question about our policy on type checking.  The short version is 
that the better (preserves duck-typing) and more backward compatible fix is to 
change the test to be

   x != ''

That will result in an attribute error on 'decode' for values of the incorrect 
type.

But even that should go through a deprecation cycle, since there may be 
working programs depending on the current behavior.  It's worth fixing, though, 
because of the error propagation you report.  I also suggested a rewrite of 
_check_args to get a better error message that would indeed be a type error, 
and I'm anticipating someone from the mentors list will turn that into a patch 
here.
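
For illustration, the `x != ''` test suggested above could look like the following simplified stand-in. The name `decode_args_strict` is hypothetical and this is not the stdlib `_decode_args`; the default encoding and error handler are assumptions. Note that comparing bytes with '' emits a BytesWarning when Python runs with the `-b` flag.

```python
def decode_args_strict(args, encoding="ascii", errors="strict"):
    """Hypothetical variant of the decode step using the `x != ''` test."""
    # Only the exact empty str is passed through unchanged. Everything
    # else must provide decode(), so None, {} or [] raise AttributeError
    # instead of silently turning into ''.
    return tuple(x.decode(encoding, errors) if x != "" else "" for x in args)
```

With this test, `decode_args_strict((b'http', ''))` still decodes the bytes argument, while `decode_args_strict((None,))` fails on the missing `decode` attribute.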

--
nosy: +r.david.murray




[issue22234] urllib.parse.urlparse accepts any falsy value as an url

2016-04-21 Thread Luiz Poleto

Changes by Luiz Poleto :


--
nosy: +luiz.poleto




[issue22234] urllib.parse.urlparse accepts any falsy value as an url

2016-02-22 Thread Anish Shah

Changes by Anish Shah :


--
nosy: +anish.shah




[issue22234] urllib.parse.urlparse accepts any falsy value as an url

2016-02-22 Thread Antti Haapala

Antti Haapala added the comment:

I believe `urlparse` should throw a `TypeError` if not isinstance(url, (str, 
bytes))
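
A minimal sketch of that check, written as a wrapper so it can be tried without patching the stdlib. The name `urlparse_checked` is hypothetical; `urllib.parse.urlparse` and its `scheme` and `allow_fragments` parameters are the real API.

```python
from urllib.parse import urlparse


def urlparse_checked(url, scheme="", allow_fragments=True):
    """Hypothetical wrapper: reject anything that is not str or bytes."""
    if not isinstance(url, (str, bytes)):
        raise TypeError(
            f"url must be str or bytes, not {type(url).__name__}"
        )
    return urlparse(url, scheme, allow_fragments)
```

Valid str and bytes input behaves exactly like `urlparse()`, while None, {} and [] fail immediately with a TypeError.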




[issue22234] urllib.parse.urlparse accepts any falsy value as an url

2014-08-24 Thread Antti Haapala

Antti Haapala added the comment:

On Python 2.7 urlparse.urlparse, parsing None, () or 0 will throw 
AttributeError because these classes do not have any 'find' method. [] has the 
find method, but will fail with TypeError, because the built-in caching 
requires that the input be hashable.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22234
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22234] urllib.parse.urlparse accepts any falsy value as an url

2014-08-21 Thread Martin Panter

Changes by Martin Panter vadmium...@gmail.com:


--
nosy: +vadmium




[issue22234] urllib.parse.urlparse accepts any falsy value as an url

2014-08-20 Thread Antti Haapala

New submission from Antti Haapala:

Because of the `if x else ''` test in `_decode_args` 
(http://hg.python.org/cpython/file/3.4/Lib/urllib/parse.py#l96), 
urllib.parse.urlparse accepts any falsy value as an url, returning a 
ParseResultBytes with all members set to empty bytestrings.

Thus you get:

>>> urllib.parse.urlparse({})
ParseResultBytes(scheme=b'', netloc=b'', path=b'', params=b'', query=b'', 
fragment=b'')

which may result in some very confusing exceptions later on: I had a list of 
URLs that accidentally contained some Nones and got very confusing TypeErrors 
while processing the results expecting them to be strings.

If the `if x else ''` part were removed, such invalid falsy values would fail 
with `AttributeError: 'foo' object has no attribute 'decode'`, as happens with 
any truthy invalid value.
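
The effect of the falsy test can be reproduced in isolation. The function below is a simplified stand-in built around the one expression quoted in this thread, not the stdlib function itself; its name and defaults are assumptions.

```python
def decode_args(args, encoding="ascii", errors="strict"):
    """Simplified stand-in for the decode step, keeping the falsy test."""
    return tuple(x.decode(encoding, errors) if x else "" for x in args)


# Every falsy value, valid or not, collapses to the empty string.
print(decode_args(({}, None, 0)))

# A truthy invalid value fails on the missing decode() method.
try:
    decode_args(({"a": 1},))
except AttributeError as exc:
    print(exc)
```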

--
components: Library (Lib)
messages: 225566
nosy: Ztane
priority: normal
severity: normal
status: open
title: urllib.parse.urlparse accepts any falsy value as an url
type: behavior
versions: Python 3.4




[issue22234] urllib.parse.urlparse accepts any falsy value as an url

2014-08-20 Thread Berker Peksag

Changes by Berker Peksag berker.pek...@gmail.com:


--
nosy: +orsenthil




[issue22234] urllib.parse.urlparse accepts any falsy value as an url

2014-08-20 Thread Demian Brecht

Changes by Demian Brecht demianbre...@gmail.com:


--
nosy: +demian.brecht
