Steven D'Aprano <steve+pyt...@pearwood.info> added the comment:
I'm changing the name to better describe the problem, and suggest a better solution.

The urlparse.urlsplit and .urlunsplit functions currently don't validate the scheme argument, if given. According to the RFC:

    Scheme names consist of a sequence of characters. The lower case letters "a"--"z", digits, and the characters plus ("+"), period ("."), and hyphen ("-") are allowed. For resiliency, programs interpreting URLs should treat upper case letters as equivalent to lower case in scheme names (e.g., allow "HTTP" as well as "http").

    https://www.ietf.org/rfc/rfc1738.txt

If the scheme is specified, I suggest it should be normalised to lowercase and validated, something like this:

    # untested
    if scheme:
        # scheme_chars already defined in module
        badchars = set(scheme) - set(scheme_chars)
        if badchars:
            raise ValueError('"%c" is invalid in URL schemes' % badchars.pop())
        scheme = scheme.lower()

This will help avoid errors such as passing 'http://' as the scheme.

----------
keywords: -patch
stage: patch review ->
title: urlsplit scheme argument broken -> urlparse doesn't validate the scheme
versions: +Python 3.8 -Python 2.7, Python 3.7

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue35377>
_______________________________________
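
For illustration, the validation suggested in the comment can be sketched as a standalone function. This is only a sketch under stated assumptions, not the patch that was applied to the standard library: the helper name normalise_scheme and the constant SCHEME_CHARS are hypothetical, and the allowed character set is rebuilt here from the RFC 1738 text quoted in the comment (with upper case letters accepted and then lowercased) rather than taken from the module. Unlike the untested snippet, it reports the lowest-sorting bad character so that the error message is deterministic.

```python
import string

# Assumption: the characters RFC 1738 allows in a scheme, plus upper case
# letters, which are accepted on input and lowercased on output.
SCHEME_CHARS = set(string.ascii_letters + string.digits + "+-.")


def normalise_scheme(scheme):
    """Hypothetical helper: validate a URL scheme and return it lowercased."""
    if not scheme:
        # An empty scheme means "not given"; pass it through unchanged.
        return scheme
    badchars = set(scheme) - SCHEME_CHARS
    if badchars:
        # min() rather than pop() so the reported character does not depend
        # on set iteration order.
        raise ValueError('"%c" is invalid in URL schemes' % min(badchars))
    return scheme.lower()
```

With this sketch, normalise_scheme("HTTP") returns "http", while normalise_scheme("http://") raises ValueError, which is the kind of mistake the comment wants to catch.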