The following code in url.c makes it impossible to request URLs that
contain multiple consecutive slashes in their query string:

        else if (*h == '/')
        {
          /* Ignore empty path elements.  Supporting them well is hard
             (where do you save "http://x.com///y.html"?), and they
             don't bring any practical gain.  Plus, they break our
             filesystem-influenced assumptions: allowing them would
             make "x/y//../z" simplify to "x/y/z", whereas most people
             would expect "x/z".  */
          ++h;
        }

Consider a URL such as http://foo/bar/redirect.cgi?http://...
wget translates this into:

http://foo/bar/redirect.cgi?http:/...

and then the web server of course gives an error. Note that the
problem occurs even if the slashes were URL-escaped, since wget
unescapes them.

Removing the offending code fixes the problem, but I'm not sure if
this is the correct solution. I expect it would be more correct to
remove multiple slashes only before the first occurrence of ?, but not
afterwards.
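
A minimal sketch of that idea, for illustration only. It is not the
actual url.c code: collapse_path_slashes is a hypothetical standalone
helper, and it assumes it is handed just the path-plus-query part of
the URL (no "http://" prefix) as a writable, NUL-terminated string.
Runs of slashes are collapsed up to the first '?', and everything from
the '?' onward is copied through untouched.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not part of wget: collapse runs of '/' in
   place, but only in the part of S that precedes the first '?'.
   S is assumed to be the path and query only, without the scheme
   and host.  */
static void
collapse_path_slashes (char *s)
{
  char *h;   /* read position */
  char *t;   /* write position */

  assert (s != NULL);
  h = t = s;

  while (*h && *h != '?')
    {
      if (*h == '/' && t > s && t[-1] == '/')
        ++h;                /* ignore an empty path element */
      else
        *t++ = *h++;        /* keep everything else */
    }

  /* Copy the query string (if any) verbatim, including the
     terminating NUL.  The regions may overlap, hence memmove.  */
  memmove (t, h, strlen (h) + 1);
}
```

With this, an input such as "/bar//redirect.cgi?http://x//y" would
become "/bar/redirect.cgi?http://x//y": the doubled slash in the path
is removed, while the slashes after the ? are left alone. Whether the
real fix belongs at this point in url.c, and how it should interact
with the unescaping step, is a separate question.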

Rich
