[issue35377] urlsplit scheme argument broken

2018-12-03 Thread devkral


devkral  added the comment:

ah, I get the idea behind urlunsplit. But still:
for e.g. http, http:/// is invalid. These urls are invalid for e.g. requests.
May I ask:
urllib.parse is for web protocol urls? If yes, then there should be an option 
which defaults to my version.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue35377] urlsplit scheme argument broken

2018-12-03 Thread Steven D'Aprano


Steven D'Aprano  added the comment:

You haven't given a convincing reason that there is a problem that needs 
fixing, or if there is, that your patch is the right way to fix it.

You have already pointed out that there is at least one scheme where :/// is 
part of a valid URL: "file:///". Where there is one, there could be others, if 
not today, then in the future:

spam:///eggs.cheese

There are well over 200 registered schemes:

https://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml

I don't think it will be practical to put in a whitelist or a blacklist of 
schemes that are, or aren't, permitted to include :/// after the scheme. It is 
the caller's responsibility to use the correct scheme, without adding extra 
characters to the end.

I asked you to justify why this should be considered a bug in the library, 
rather than a bug in your code. I'm not an expert on URLs, but the functions 
look correct to me. If you can't justify why this is a bug in the library that 
needs fixing, rather than user-error, we should close this or change it to a 
request for better documentation.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue35377] urlsplit scheme argument broken

2018-12-03 Thread Roundup Robot


Change by Roundup Robot :


--
keywords: +patch
pull_requests: +10097
stage:  -> patch review

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue35377] urlsplit scheme argument broken

2018-12-03 Thread devkral


devkral  added the comment:

Ok, then the problem is in unsplit.
Because: :/// is not a valid url.

Next try: in urlunsplit:


if netloc or (scheme and scheme in uses_netloc and url[:2] != '//'):
if url and url[:1] != '/' and scheme in ("file", ""): # my change
url = '/' + url

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue35377] urlsplit scheme argument broken

2018-12-02 Thread Steven D'Aprano

Steven D'Aprano  added the comment:

I don't think this is broken, but I do think it could be documented better. You 
have to read the documentation for `urlparse` to see this:

[Quote]
Following the syntax specifications in RFC 1808, urlparse recognizes a netloc 
only if it is properly introduced by ‘//’. Otherwise the input is presumed to 
be a relative URL and thus to start with a path component.

https://docs.python.org/3/library/urllib.parse.html#urllib.parse.urlparse

so the function is correct. You're making two errors:

- providing a relative URL "httpbin.org" and expecting it to be treated as an 
absolute URL;

- specifying scheme+delimiter instead of the scheme alone.

So I don't think this is a bug. urlsplit rightly accepts any arbitrary string 
as a scheme (it used to have a whitelist of permitted schemes, and that was a 
problem), and we can't assume that :/// is ONLY valid for file protocols.

Unless you come up with a convincing argument for why this is a bug, I'm going 
to change it to a documentation issue. The docs could do with some improvement 
to make it more clear, although the examples are good.

--
nosy: +steven.daprano

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue35377] urlsplit scheme argument broken

2018-12-02 Thread devkral


devkral  added the comment:

first a correction; there is one protocol where it make sense:
file
I would change:
...
elif scheme:
...
to
...
elif scheme and scheme != "file":
...


Second: no it is not a correct behaviour. Urlunsplit is supposed to create a 
valid url from a tuple created by urlsplit. :/// is not a valid url (except for 
the file protocol).

Third: rstrip could be simplified to rstrip(":/")

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue35377] urlsplit scheme argument broken

2018-12-02 Thread SilentGhost


SilentGhost  added the comment:

While it might seem broken, it behaves according to documentation, it seems to 
me.

--
nosy: +SilentGhost, orsenthil
type:  -> behavior

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue35377] urlsplit scheme argument broken

2018-12-02 Thread devkral


New submission from devkral :

the scheme argument of urlsplit/urlparse is completely broken.
here two examples:

urlunsplit(urlsplit("httpbin.org", scheme="https://;))
'https://:httpbin.org'

urlunsplit(urlsplit("httpbin.org", scheme="https"))
'https:///httpbin.org'

Fix: change urlsplit logic like this:
...
url, scheme, _coerce_result = _coerce_args(url, scheme)
scheme = scheme.rstrip("://") # this removes ://
...
i = url.find('://') # harden against arbitrary :
if i > 0:
...
elif scheme:
netloc, url = _splitnetloc(url, 0)  # if scheme is specified, netloc is 
implied

sry too lazy to create a patch from this. Most probably are all python versions 
affected but I checked only 2.7 and 3.7 .

--
components: Library (Lib)
messages: 330884
nosy: devkral
priority: normal
severity: normal
status: open
title: urlsplit scheme argument broken
versions: Python 2.7, Python 3.7

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com