[issue32779] urljoining an empty query string doesn't clear query string

2021-05-28 Thread Paul Fisher


Paul Fisher  added the comment:

Reading more into this, from section 5.2.1 of RFC 3986:

> A component is undefined if its associated delimiter does not appear in the 
> URI reference

So you could say that since there is a '?', the query component is *defined*, 
but *empty*. This would mean that assigning the target query to be '' has the 
desired effect as implemented by browsers and other languages' standard 
libraries.
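
To illustrate the "defined but empty" reading, here is a minimal sketch. The helper name query_is_defined is hypothetical (it is not part of urllib.parse); it only checks whether the '?' delimiter appears before any '#' fragment.

```python
def query_is_defined(ref: str) -> bool:
    """Return True if the '?' delimiter appears before any '#' fragment.

    Hypothetical helper for illustration; not part of urllib.parse.
    """
    # A '?' inside the fragment does not start a query, so cut the fragment first.
    before_fragment = ref.split('#', 1)[0]
    return '?' in before_fragment


for ref in ('?', '?x=1', 'g', '', '#frag?x'):
    print(repr(ref), query_is_defined(ref))
```

Under this reading, '?' has a defined (empty) query, while 'g' and '' have an undefined one.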

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32779] urljoining an empty query string doesn't clear query string

2021-05-28 Thread Irit Katriel


Irit Katriel  added the comment:

Sorry, urlparse returns '' rather than None when there is no query.
So we do need a check such as

    if '?' not in url:

or whatever is in Paul's patch.

However, my main point was to question whether fixing this is actually in 
contradiction with the RFC.
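
A small sketch of why the parsed result alone is not enough, assuming urlsplit's default behaviour: both URLs below report an empty-string query, so the presence of the '?' delimiter is lost after parsing.

```python
from urllib.parse import urlsplit

# One URL has the '?' delimiter with nothing after it, the other has no '?'.
with_delimiter = urlsplit('http://a/b/c?')
without_delimiter = urlsplit('http://a/b/c')

# Both report query == '', so "defined but empty" and "undefined"
# cannot be told apart from the parsed result.
print(repr(with_delimiter.query))
print(repr(without_delimiter.query))
```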




[issue32779] urljoining an empty query string doesn't clear query string

2021-05-28 Thread Irit Katriel


Irit Katriel  added the comment:

The relevant part in the RFC pseudo code is 

   if defined(R.query) then
      T.query = R.query;
   else
      T.query = Base.query;
   endif;

which is implemented in urljoin as:

    if not query:
        query = bquery


Is this correct? Should the code not say "if query is not None"?
(I can't see in the RFC a definition of defined()).
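
For comparison, a rough sketch of the query step with an explicit notion of "defined", taking None to mean undefined and '' to mean defined but empty. The names split_query and merge_query are hypothetical and this is not urljoin's code; it covers only the query step for a reference with an empty path.

```python
from typing import Optional


def split_query(ref: str) -> Optional[str]:
    """Return the query of ref, or None when no '?' delimiter is present."""
    # Drop the fragment first; a '?' after '#' is not a query delimiter.
    before_fragment = ref.split('#', 1)[0]
    if '?' not in before_fragment:
        return None  # undefined
    return before_fragment.split('?', 1)[1]  # may be '' (defined but empty)


def merge_query(ref_query: Optional[str], base_query: Optional[str]) -> Optional[str]:
    """Mirror 'if defined(R.query) then T.query = R.query else Base.query'."""
    if ref_query is not None:
        return ref_query
    return base_query


print(repr(merge_query(split_query('?'), 'd=e')))
print(repr(merge_query(split_query(''), 'd=e')))
```

With the "is not None" test, a bare '?' replaces the base query with an empty one, while an empty reference keeps the base query.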

--
nosy: +iritkatriel
versions: +Python 3.10, Python 3.11, Python 3.9 -Python 2.7, Python 3.6




[issue32779] urljoining an empty query string doesn't clear query string

2018-02-15 Thread Paul Fisher

Paul Fisher  added the comment:

In this case, the RFC is mismatched from the actual behaviour of browsers (as 
described and codified by WhatWG).  It was surprising to me that urljoin() 
didn't do what I perceived as "the right thing" (and I expect other users would 
be surprised too).

I would personally expect urljoin to do "the thing that everybody else does".  
Is there a sensible way to reduce this mismatch?

For reference, Java's stdlib does what I would expect here:

URI base = URI.create("https://example.com/?a=b");
URI rel = base.resolve("?");
System.out.println(rel);

https://example.com/?




[issue32779] urljoining an empty query string doesn't clear query string

2018-02-15 Thread Andrew Svetlov

Andrew Svetlov  added the comment:

Python follows the RFC, not WhatWG.
https://tools.ietf.org/html/rfc3986#section-5.2.2 is the proper definition of 
the URL joining algorithm.

--
nosy: +asvetlov




[issue32779] urljoining an empty query string doesn't clear query string

2018-02-12 Thread Roundup Robot

Change by Roundup Robot :


--
keywords: +patch
pull_requests: +5446
stage:  -> patch review




[issue32779] urljoining an empty query string doesn't clear query string

2018-02-09 Thread Paul Fisher

Paul Fisher  added the comment:

I'm working on a patch for this and can have one up in the next week or so, 
once I get the CLA signed and other boxes ticked.  I'm new to the GitHub 
process but hopefully it will be a good start for the discussion.




[issue32779] urljoining an empty query string doesn't clear query string

2018-02-09 Thread Terry J. Reedy

Change by Terry J. Reedy :


--
nosy: +orsenthil




[issue32779] urljoining an empty query string doesn't clear query string

2018-02-05 Thread Paul Fisher

New submission from Paul Fisher :

urljoining with '?' will not clear a query string:

ACTUAL:
>>> import urllib.parse
>>> urllib.parse.urljoin('http://a/b/c?d=e', '?')
'http://a/b/c?d=e'

EXPECTED:
'http://a/b/c' (optionally, with a ? at the end)

WhatWG's URL standard expects a relative URL consisting of only a ? to replace 
a query string:

https://url.spec.whatwg.org/#relative-state

Seen in versions 3.6 and 2.7, but probably also affects later versions.
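
Until this is settled, one possible workaround (a sketch only, not the proposed patch) is to clear the query explicitly instead of relying on urljoin. The helper name strip_query is hypothetical. It also drops the fragment, on the assumption that resolving a bare '?' reference should not carry the base fragment over.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url: str) -> str:
    """Return url with its query (and fragment) removed.

    Hypothetical helper for illustration; not part of urllib.parse.
    """
    parts = urlsplit(url)
    # SplitResult is a named tuple, so _replace returns a modified copy.
    return urlunsplit(parts._replace(query='', fragment=''))


print(strip_query('http://a/b/c?d=e'))
```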

--
components: Library (Lib)
messages: 311704
nosy: Paul Fisher
priority: normal
severity: normal
status: open
title: urljoining an empty query string doesn't clear query string
type: behavior
versions: Python 2.7, Python 3.6
