[bug #60494] Percent character in filename gets escaped twice

2021-05-16 Thread Kode Charlie
Follow-up Comment #7, bug #60494 (project wget):

My $0.02:  there's (a) the specification, and then there's (b) the reality of
what's out there in the wild.  Inline docs for *wget* make it clear that at
least part of the logic is concerned with (b).

An inefficient but possibly easy fix for this bug might be to make a deep
copy of the *struct url* each time *url_escape()* is called. This issue,
after all, has grown out of an optimization (i.e., in-place string edits) that
isn't working quite as expected.
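
The copy-before-escape idea can be sketched as follows. This Python fragment is only an analogy for the C code: the Url class and the escape_in_place() helper are hypothetical stand-ins for *struct url* and *url_escape()*, and the escaping rule is deliberately reduced to the percent sign.

```python
import copy
from dataclasses import dataclass


@dataclass
class Url:
    """Hypothetical stand-in for wget's struct url (path only)."""
    path: str


def escape_in_place(url: Url) -> None:
    """Simplified stand-in for url_escape(): mutates url.path,
    encoding every '%' as '%25'."""
    url.path = url.path.replace("%", "%25")


def escaped_copy(url: Url) -> Url:
    """Escape a deep copy so the caller's object is never modified."""
    clone = copy.deepcopy(url)
    escape_in_place(clone)
    return clone


original = Url(path="a%23b")

# Two calls on the same original give the same result,
# because each call starts from untouched data.
first = escaped_copy(original)
second = escaped_copy(original)

# Two in-place calls on one object stack the escaping instead.
mutated = Url(path="a%23b")
escape_in_place(mutated)
escape_in_place(mutated)
```

With the copy, original.path stays "a%23b" and both results are "a%2523b"; the in-place variant ends up with "a%252523b" after the second call.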


___

Reply to this item at:

  

___
  Message sent via Savannah
  https://savannah.gnu.org/




[bug #60494] Percent character in filename gets escaped twice

2021-05-16 Thread Petr Pisar
Follow-up Comment #6, bug #60494 (project wget):

You cannot pose the question like that, because a random string is ambiguous
by its nature.

According to the specification, there is no such thing as an unescaped URI.
A URI is always escaped by definition.

Look at the original report: you have a file name,
"qtwebengine-everywhere-src-5.15.2-%231904652.patch.gz". It's a file name, not
a URI. If you construct a URL for the file name using the base URL
"https://mirrors.slackware.com/slackware/slackware64-current/source/l/qt5/patches/",
then you need to escape the file name string and then append it after the
path delimiter of the base URL. I.e., you convert the file name to
"qtwebengine-everywhere-src-5.15.2-%25231904652.patch.gz" and then append it
to the base, resulting in the URL
"https://mirrors.slackware.com/slackware/slackware64-current/source/l/qt5/patches/qtwebengine-everywhere-src-5.15.2-%25231904652.patch.gz".
This URL is passed to the wget command. Thus wget should not escape it
again. It could validate it and report an error, but not escape it.
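
The construction described above can be sketched in Python with the standard library's urllib.parse.quote(). This only illustrates the steps and is not wget's C code; the variable names are illustrative.

```python
from urllib.parse import quote, unquote

BASE_URL = "https://mirrors.slackware.com/slackware/slackware64-current/source/l/qt5/patches/"
FILE_NAME = "qtwebengine-everywhere-src-5.15.2-%231904652.patch.gz"

# Escape the raw file name exactly once, while producing the URL.
# safe="" means no reserved character is left unescaped in this segment.
escaped_once = quote(FILE_NAME, safe="")
url = BASE_URL + escaped_once

# Decoding the escaped segment gives back the original file name.
round_trip = unquote(escaped_once)

# Escaping the already-escaped segment again corrupts it:
# the "%" of "%25" is itself turned into "%25".
escaped_twice = quote(escaped_once, safe="")

print(escaped_once)   # ...-%25231904652.patch.gz
print(escaped_twice)  # ...-%2525231904652.patch.gz
```

The second result no longer decodes to the original file name in one step, which is the double escaping this report is about.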

I will quote the specification here:

   Under normal circumstances, the only time when octets within a URI
   are percent-encoded is during the process of producing the URI from
   its component parts.  This is when an implementation determines which
   of the reserved characters are to be used as subcomponent delimiters
   and which can be safely used as data.  Once produced, a URI is always
   in its percent-encoded form.

Please pay attention to the last sentence.

Of course wget could state that its argument is a byte stream without any
other constraints. But the wget(1) manual says otherwise; it states that the
argument is a URL:

SYNOPSIS
   wget [option]... [URL]...

Hence wget should not attempt any escaping.





[bug #60617] POST not continued after 301, 302

2021-05-16 Thread Frank Heckenbach
URL:
  

         Summary: POST not continued after 301, 302
         Project: GNU Wget
    Submitted by: frank
    Submitted on: Sun 16 May 2021 03:43:23 PM UTC
        Category: Program Logic
        Severity: 3 - Normal
        Priority: 5 - Normal
          Status: None
         Privacy: Public
     Assigned to: None
 Originator Name: 
Originator Email: 
     Open/Closed: Open
         Release: 1.20
 Discussion Lock: Any
Operating System: GNU/Linux
 Reproducibility: Every Time
   Fixed Release: None
 Planned Release: None
      Regression: None
   Work Required: None
  Patch Included: Yes

___

Details:

According to the man page: "In case of a 301 Moved Permanently, 302 Moved
Temporarily or 307 Temporary Redirect, Wget will, in accordance with RFC2616,
continue to send a POST request."

That doesn't seem to work after 301 and 302; see the attached debug.output.

It seems the comparison is inverted. It should keep POST if the request was
POST ("==", not "!="). See the attached patch, which seems to fix the problem
for me.
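
The behaviour the man page describes can be sketched as a small decision function. This is a hypothetical Python illustration, not the content of redirect.patch or of wget's source. The function name is made up, only GET and POST are modelled, and falling back to GET for any other status code is an assumption.

```python
# Status codes for which the man page says a POST is sent again.
KEEP_POST_STATUSES = (301, 302, 307)


def method_after_redirect(original_method: str, status: int) -> str:
    """Pick the request method to use when following a redirect."""
    # Keep POST only if the original request *was* POST ("==", not "!=").
    if status in KEEP_POST_STATUSES and original_method == "POST":
        return "POST"
    # Assumption: every other case is followed with GET.
    return "GET"


for code in (301, 302, 307):
    print(code, method_after_redirect("POST", code))
```

With the comparison inverted to "!=", a POST followed by a 301 or 302 would be turned into GET, which matches the symptom reported here.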




___

File Attachments:


---
Date: Sun 16 May 2021 03:43:23 PM UTC  Name: debug.output  Size: 3KiB   By:
frank


---
Date: Sun 16 May 2021 03:43:23 PM UTC  Name: redirect.patch  Size: 767B   By:
frank







[bug #60494] Percent character in filename gets escaped twice

2021-05-16 Thread Tim Ruehsen
Follow-up Comment #5, bug #60494 (project wget):

The main question is: how does Wget know whether the input URL is already
escaped?

Did you read this comment?
https://gitlab.com/gnuwget/wget/-/blob/master/src/url.c#L329
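
To make the ambiguity concrete, here is a rough Python sketch of one possible heuristic: treat "%" followed by two hex digits as an existing escape and encode every other "%". It is an assumption for illustration, not a transcription of the logic in url.c, and the function name is hypothetical.

```python
import re

# "%" followed by exactly two hexadecimal digits looks like an escape.
_ESCAPE = re.compile(r"%[0-9A-Fa-f]{2}")


def escape_stray_percents(text: str) -> str:
    """Encode '%' as '%25' unless it already starts a %XX sequence."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "%":
            if _ESCAPE.match(text, i):
                # Looks already escaped: copy the three characters as-is.
                out.append(text[i:i + 3])
                i += 3
            else:
                # A lone percent sign: escape it.
                out.append("%25")
                i += 1
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


print(escape_stray_percents("100%"))   # 100%25
print(escape_stray_percents("a%23b"))  # a%23b (left untouched)
```

Such a heuristic is safe to apply repeatedly, but it cannot tell whether "%23" was meant as an escaped "#" or as the literal three characters of a file name, which is exactly the case in the original report.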
