Demian Brecht added the comment:
> >>> urlsplit("////evil.com").netloc
> ''
> >>> urlsplit("////evil.com").has_netloc
> True
> >>> urlunsplit(urlsplit("////evil.com"))  # Adds "//" back
> '////evil.com'
RFC 3986, section 3.3:
If a URI contains an authority component, then the path component
must either be empty or begin with a slash ("/") character. If a URI
does not contain an authority component, then the path cannot begin
with two slash characters ("//").
Because this would be a backwards-incompatible behavioural change, and the
result would be just as invalid as far as the RFC goes, I think the current
behaviour should be preserved. It's still incorrect, but leaving it unchanged
won't break existing code.
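
To make the RFC rule above concrete, here is a minimal sketch of the section
3.3 check. The helper name path_is_rfc3986_valid is hypothetical and not part
of urllib.parse, and it assumes the caller decides whether an authority is
present, since an empty netloc alone cannot tell "missing" from "empty".

```python
from urllib.parse import urlsplit


def path_is_rfc3986_valid(has_authority: bool, path: str) -> bool:
    """Check the RFC 3986 section 3.3 constraint on the path component."""
    if has_authority:
        # With an authority, the path must be empty or begin with "/".
        return path == "" or path.startswith("/")
    # Without an authority, the path must not begin with "//".
    return not path.startswith("//")


parts = urlsplit("////evil.com")
# Assumption: an empty netloc is treated as "no authority" here, which is
# exactly the ambiguity under discussion in this issue.
print(repr(parts.netloc), repr(parts.path),
      path_is_rfc3986_valid(bool(parts.netloc), parts.path))
```

Either reading of "////evil.com" leaves something the RFC disallows, which is
why changing the behaviour would not make the result any more valid.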
> ## _NetlocResultMixinBase abuse ##
>
> The _NetlocResultMixinBase class is a common class used by the four result
> classes I'm interested in. I probably should rename it to something like
> _SplitParseMixinBase, since it is the common base to both urlsplit() and
> urlparse() results.
I think I'm coming around to this and realizing that it's actually quite close
to my proposal, the major difference being the additional level of hierarchy in
mine. My issue was mostly due to the addition of the variadic signature in the
docs (i.e. line 407 here:
http://bugs.python.org/review/22852/diff/14176/Doc/library/urllib.parse.rst)
which led me to believe a nonsensical signature would be valid. After looking
at it again, __new__ is still bound to the tuple's signature, so you still get
the following:
>>> SplitResult('scheme','authority','path','query','fragment','foo','bar','baz')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Volumes/src/p/cpython/Lib/urllib/parse.py", line 137, in __new__
    self = super().__new__(type, *pos, **kw)
TypeError: __new__() takes 6 positional arguments but 9 were given
So I'm less opposed to this as-is. I would like to see the "*" removed from the
docs though as it's misleading in the context of each of (Split|Parse)Result. I
do agree that renaming _NetlocResultMixinBase would be helpful, but it might
also be nice (from a pedant's point of view) to remove "mixin" altogether if
the __new__ implementation stays as-is.
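
The point about __new__ staying bound to the tuple's signature can be
reproduced outside the patch with a rough sketch. The class names _FiveFields
and VariadicResult are hypothetical stand-ins, not the names used in the patch
under review, and the variadic __new__ only approximates what it does.

```python
from collections import namedtuple

# Hypothetical stand-in for the five-field tuple behind SplitResult.
_FiveFields = namedtuple("_FiveFields", "scheme netloc path query fragment")


class VariadicResult(_FiveFields):
    __slots__ = ()

    def __new__(cls, *pos, **kw):
        # The signature looks variadic, but the arguments are forwarded
        # unchanged, so the namedtuple's own __new__ still enforces arity.
        return super().__new__(cls, *pos, **kw)


ok = VariadicResult("scheme", "authority", "path", "query", "fragment")

try:
    VariadicResult("scheme", "authority", "path", "query", "fragment",
                   "foo", "bar", "baz")
except TypeError as exc:
    error = exc  # Extra positional arguments are rejected.
else:
    error = None
```

So documenting the constructor with "*" would advertise a signature that the
class does not actually accept.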
----------
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue22852>
_______________________________________