[issue22852] urllib.parse wrongly strips empty #fragment, ?query, //netloc

2022-02-12 Thread Éric Araujo
Éric Araujo added the comment: See also #46337 -- nosy: +eric.araujo versions: +Python 3.11 -Python 3.5

[issue22852] urllib.parse wrongly strips empty #fragment, ?query, //netloc

2020-06-10 Thread Open Close
Open Close added the comment: I found another related issue (issue37969). I also filed one myself (issue 40938). --- One thing against the 'has_netloc' etc. solution is that while it guarantees round-trips (urlunsplit(urlsplit('...')) etc.), it is conditional on 'urlunsplit' getting

[issue22852] urllib.parse wrongly strips empty #fragment, ?query, //netloc

2018-08-04 Thread Martin Panter
Martin Panter added the comment: I like this option. I suppose choosing which option to take is a compromise between compatibility and simplicity. In the short term, the “allows_none” option requires user code to be updated. In the long term it may break compatibility. But the “has_netloc”

[issue22852] urllib.parse wrongly strips empty #fragment, ?query, //netloc

2018-07-31 Thread Piotr Dobrogost
Change by Piotr Dobrogost: -- nosy: +piotr.dobrogost

[issue22852] urllib.parse wrongly strips empty #fragment, ?query, //netloc

2018-07-31 Thread Chris Jerdonek
Chris Jerdonek added the comment: I just learned of this issue. Rather than adding has_netloc, etc. attributes, why not use None to distinguish missing values as is preferred above, but add a new boolean keyword argument to urlparse(), etc. to get the new behavior (e.g. "allow_none" to

[issue22852] urllib.parse wrongly strips empty #fragment, ?query, //netloc

2015-08-02 Thread Robert Collins
Robert Collins added the comment: See also issue 6631 -- nosy: +rbcollins ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue22852 ___ ___

[issue22852] urllib.parse wrongly strips empty #fragment, ?query, //netloc

2015-05-31 Thread Berker Peksag
Changes by Berker Peksag berker.pek...@gmail.com: -- nosy: +berker.peksag

[issue22852] urllib.parse wrongly strips empty #fragment, ?query, //netloc

2015-05-30 Thread Martin Panter
Martin Panter added the comment: Anyone want to review my new patch? This is a perennial issue; see all the duplicates I just linked. -- keywords: +needs review title: urllib.parse wrongly strips empty #fragment -> urllib.parse wrongly strips empty #fragment, ?query, //netloc

[issue22852] urllib.parse wrongly strips empty #fragment

2015-03-22 Thread Martin Panter
Martin Panter added the comment: Posting patch v2 with these changes: * Split out scheme documentation fixes to Issue 23684. * Renamed _NetlocResultMixinBase → _SplitParseBase * Explained the default values of the flags better, and what None means * Changed to Demian’s forward-looking “version

[issue22852] urllib.parse wrongly strips empty #fragment

2015-03-17 Thread Demian Brecht
Demian Brecht added the comment: I cannot imagine some existing code (other than an exploit) that would be broken by restoring the empty “//” component; do you have an example? You're likely right about the usage (I can't think of a plausible use case at any rate). At first read of #23505,

[issue22852] urllib.parse wrongly strips empty #fragment

2015-03-16 Thread Demian Brecht
Demian Brecht added the comment: urlsplit(evil.com).netloc '' urlsplit(evil.com).has_netloc True urlunsplit(urlsplit(evil.com)) # Adds “//” back 'evil.com' RFC 3986, section 3.3: If a URI contains an authority component, then the path component must either be empty

[issue22852] urllib.parse wrongly strips empty #fragment

2015-03-16 Thread Martin Panter
Martin Panter added the comment: Regarding unparsing of evil.com, see Issue 23505, where the invalid behaviour is pointed out as a security issue. This was one of the bugs that motivated me to make this patch. I cannot imagine some existing code (other than an exploit) that would be

[issue22852] urllib.parse wrongly strips empty #fragment

2015-03-16 Thread Demian Brecht
Demian Brecht added the comment: I've done an initial pass in Rietveld and left some comments, mostly around docs. Here are some additional questions though: Given has_* flags can be inferred during instantiation of *Result classes, is there a reason to have them writable, meaning is there a

[issue22852] urllib.parse wrongly strips empty #fragment

2015-03-16 Thread Demian Brecht
Demian Brecht added the comment: I avoided making them positional parameters, as they are not part of the underlying tuple object. Ignore me, I was off my face and you're absolutely correct.

[issue22852] urllib.parse wrongly strips empty #fragment

2015-03-16 Thread Martin Panter
Martin Panter added the comment: ## Inferring flags ## The whole reason for the has_netloc etc flags is that I don’t think we can always infer their values, so we have to explicitly remember them. Consider the following two URLs, which I think should both have empty “netloc” strings for

[issue22852] urllib.parse wrongly strips empty #fragment

2015-03-13 Thread Demian Brecht
Changes by Demian Brecht demianbre...@gmail.com: -- stage: -> patch review

[issue22852] urllib.parse wrongly strips empty #fragment

2015-03-12 Thread Martin Panter
Martin Panter added the comment: There have been a few recent bug reports (Issue 23505, Issue 23636) that may be solved by the has_netloc proposal. So I am posting a patch implementing it. The changes were a bit more involved than I anticipated, but should still be usable. I reused some of

[issue22852] urllib.parse wrongly strips empty #fragment

2015-03-06 Thread Demian Brecht
Changes by Demian Brecht demianbre...@gmail.com: -- nosy: +demian.brecht

[issue22852] urllib.parse wrongly strips empty #fragment

2015-02-08 Thread Martin Panter
Martin Panter added the comment: I also liked the idea of returning None to distinguish a missing URL component from an empty-but-present component, and it would make them more consistent with the “username” and “password” fields. But I agree it would break backwards compatibility too much.
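
To illustrate the "None for a missing component" idea discussed here, the following is a minimal sketch that is not taken from any patch in this thread. It uses the regular expression from RFC 3986, Appendix B, and the recomposition steps from RFC 3986, section 5.3. The names Parts, split_keep_missing and unsplit_keep_missing are hypothetical and are not part of urllib.parse.

```python
import re
from typing import NamedTuple, Optional

# Regular expression from RFC 3986, Appendix B. DOTALL is an addition
# here so that the fragment group also captures any newline characters.
_URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?",
    re.DOTALL,
)


class Parts(NamedTuple):
    """Hypothetical result type: None means the component is absent."""
    scheme: Optional[str]
    authority: Optional[str]
    path: str
    query: Optional[str]
    fragment: Optional[str]


def split_keep_missing(uri: str) -> Parts:
    """Split a URI reference, keeping missing and empty components distinct."""
    match = _URI_RE.match(uri)
    assert match is not None  # every group is optional, so this always matches
    # Groups that did not take part in the match are returned as None.
    return Parts(match.group(2), match.group(4), match.group(5),
                 match.group(7), match.group(9))


def unsplit_keep_missing(parts: Parts) -> str:
    """Recompose the components as described in RFC 3986, section 5.3."""
    result = ""
    if parts.scheme is not None:
        result += parts.scheme + ":"
    if parts.authority is not None:
        result += "//" + parts.authority
    result += parts.path
    if parts.query is not None:
        result += "?" + parts.query
    if parts.fragment is not None:
        result += "#" + parts.fragment
    return result


if __name__ == "__main__":
    for uri in ("http://example.com/#", "http://example.com/", "x:///a", "x:/a"):
        parts = split_keep_missing(uri)
        print(repr(uri), parts, unsplit_keep_missing(parts) == uri)
```

Because a delimiter that is present with nothing after it yields an empty string while an absent delimiter yields None, recomposition can restore the original text exactly. The cost, as noted in the comment above, is that callers expecting empty strings everywhere would have to be updated.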

[issue22852] urllib.parse wrongly strips empty #fragment

2014-11-13 Thread Stian Soiland-Reyes
Stian Soiland-Reyes added the comment: I tried to make a patch for this, but I found it quite hard as the urllib/parse.py is fairly low-level, e.g. it is constantly encoding/decoding bytes and strings within each URI component. Basically the code assumes there are tuples of strings, with

[issue22852] urllib.parse wrongly strips empty #fragment

2014-11-12 Thread Stian Soiland-Reyes
New submission from Stian Soiland-Reyes: urllib.parse can't handle URIs with empty #fragments. The fragment is removed and not reconstituted. http://tools.ietf.org/html/rfc3986#section-3.5 permits empty fragment strings: URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]
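
The reported behaviour can be reproduced with a short round trip through urlsplit() and urlunsplit(). This is a sketch rather than the reporter's own test case; the example.com URLs and the round_trip helper are assumptions chosen for illustration, and the result shown in the comments is what the default behaviour described in this report produces.

```python
from urllib.parse import urlsplit, urlunsplit


def round_trip(url: str) -> str:
    """Split a URL and join it again (hypothetical helper for this demo)."""
    return urlunsplit(urlsplit(url))


if __name__ == "__main__":
    # An empty fragment and a missing fragment both parse to ''.
    print(repr(urlsplit("http://example.com/#").fragment))
    print(repr(urlsplit("http://example.com/").fragment))

    # The trailing '#' or '?' is therefore lost when the URL is rebuilt.
    for url in ("http://example.com/#", "http://example.com/?"):
        print(repr(url), "->", repr(round_trip(url)))
```

Since the parsed result is the same whether or not the delimiter was present, urlunsplit() has no information from which to restore it, which is the gap the has_netloc-style flags and the None proposal in this thread both try to close.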

[issue22852] urllib.parse wrongly strips empty #fragment

2014-11-12 Thread Georg Brandl
Changes by Georg Brandl ge...@python.org: -- nosy: +orsenthil

[issue22852] urllib.parse wrongly strips empty #fragment

2014-11-12 Thread Martin Panter
Changes by Martin Panter vadmium...@gmail.com: -- nosy: +vadmium