[bug #60494] Percent character in filename gets escaped twice

2021-05-16 Thread Kode Charlie
Follow-up Comment #7, bug #60494 (project wget):

My $0.02:  there's (a) the specification, and then there's (b) the reality of
what's out there in the wild.  Inline docs for *wget* make it clear that at
least part of the logic is concerned with (b).

An inefficient but possibly easy fix to this bug might simply be to make a
deep copy of the *struct url* each time *url_escape()* is called. This issue,
after all, has grown out of an optimization (ie, in-place string-edits) that
isn't working quite as expected.


___

Reply to this item at:

  

___
  Message sent via Savannah
  https://savannah.gnu.org/




[bug #60494] Percent character in filename gets escaped twice

2021-05-15 Thread Kode Charlie
Follow-up Comment #4, bug #60494 (project wget):

Here's one proposed change (see patch below).  It likely is not comprehensive
enough.  

The broader issue seems to be that there are circumstances in *./src/retr.c*
when *retrieve_url()* calls *url_parse()*, which already was called in
*./src/main.c*.  This can result in the *file* property in *struct url* being
modified in-place more than once (triggering errors as the one originally
reported in this issue).

For now, I'm going to defer making these changes (including suggested patch
below) to those more familiar with the *wget* repo than I.


diff --git a/src/http.c b/src/http.c
index d25762fa..a9fa237f 100644
--- a/src/http.c
+++ b/src/http.c
@@ -4280,8 +4280,10 @@ http_loop (const struct url *u, struct url
*original_url, char **newloc,
 }
   else if (!opt.content_disposition)
 {
-  hstat.local_file =
-url_file_name (opt.trustservernames ? u : original_url, NULL);
+  if (opt.trustservernames)
+hstat.local_file = url_file_name (u, NULL);
+  else
+hstat.local_file = original_url->file;
   got_name = true;
 }


___

Reply to this item at:

  

___
  Message sent via Savannah
  https://savannah.gnu.org/




[bug #60494] Percent character in filename gets escaped twice

2021-05-15 Thread Kode Charlie
Follow-up Comment #3, bug #60494 (project wget):

I have an explanation but as yet, no fix.

In *./src/url.c*, the routine *url_unescape()* is called twice for the same
path component (ie, the basename).  Since *url_unescape_1()* modifies the path
component in-place, the effect of processing:

qtwebengine-everywhere-src-5.15.2-%25231904652.patch.gz

twice is that after the first pass, the substring *%2523* is converted to
*%23*, and then after the second pass to *#*.

Hence, the resulting filename:

qtwebengine-everywhere-src-5.15.2-#1904652.patch.gz


I'm looking into a fix.

___

Reply to this item at:

  

___
  Message sent via Savannah
  https://savannah.gnu.org/




[bug #60494] Percent character in filename gets escaped twice

2021-05-15 Thread Kode Charlie
Follow-up Comment #2, bug #60494 (project wget):

Correction: based on debug-logs, quoted logic does seem ok. I am looking into
other logic _before_ convert_fname() (which calls _iconv()_) is called.

[comment #1 comment #1:]
> system: macOS Big Sur v. 11.2.3
> wget: master branch
> 
> Offending logic from *./src/url.c* is shown below. I believe the issue is in
*iconv()*.
> 
> 
>   if (iconv (cd, (ICONV_CONST char **) , , , ) ==
0
>   && iconv (cd, NULL, NULL, , ) == 0)
> {
>   *(converted_fname + len - outlen - done) = '\0';
>   iconv_close (cd);
>   DEBUGP (("Converted file name '%s' (%s) -> '%s' (%s)\n",
>orig_fname, from_encoding, converted_fname,
to_encoding));
>   xfree (orig_fname);
>   return converted_fname;
> }
> 

___

Reply to this item at:

  

___
  Message sent via Savannah
  https://savannah.gnu.org/




[bug #60494] Percent character in filename gets escaped twice

2021-05-15 Thread Kode Charlie
Follow-up Comment #1, bug #60494 (project wget):

system: macOS Big Sur v. 11.2.3
wget: master branch

Offending logic from *./src/url.c* is shown below. I believe the issue is in
*iconv()*.


  if (iconv (cd, (ICONV_CONST char **) , , , ) == 0
  && iconv (cd, NULL, NULL, , ) == 0)
{
  *(converted_fname + len - outlen - done) = '\0';
  iconv_close (cd);
  DEBUGP (("Converted file name '%s' (%s) -> '%s' (%s)\n",
   orig_fname, from_encoding, converted_fname, to_encoding));
  xfree (orig_fname);
  return converted_fname;
}


___

Reply to this item at:

  

___
  Message sent via Savannah
  https://savannah.gnu.org/