Hrvoje Niksic wrote:
> Frank McCown <[EMAIL PROTECTED]> writes:
>
>> Earlier today I sent an email explaining that wget already handles
>> ".." in the middle of a URL correctly; it just doesn't handle ".."
>> immediately after the domain name correctly.
>
> But it does, at least according to RFC 1808, which mandates leading
> ".." to be left alone. I don't know what the new RFC 3986 says about
> that, though.
>
>> Wget will currently convert a request for
>> http://foo.org/BLAH/../page.html to http://foo.org/page.html. What
>> it should also do is convert a request for
>> http://foo.org/../page.html to http://foo.org/page.html.
>
> Not according to RFC 1808.
>
>> As far as I can tell, the only time you'd want to encode ".." to
>> "%2E%2E" would be in a query string.
>
> Are you referring to URL encoding or to file name encoding? As far as
> I know, Wget converts ".." to "%2E%2E" only when doing file name
> encoding, to make sure that malformed or malicious URLs don't write to
> arbitrary portions of the file system.

According to RFC 1808 sec. 5.2, a ".." at the beginning of the URL path
should be left in place. But according to the newer RFC 3986 sec. 5.4.2,
a ".." at the beginning of the URL path should be removed.
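
To make the RFC 3986 behavior concrete, here is a minimal Python sketch of
the remove_dot_segments procedure from RFC 3986 sec. 5.2.4. It is only an
illustration of the rule, not wget's actual code; the function name is my
own choice.

```python
def remove_dot_segments(path: str) -> str:
    """Illustrative sketch of RFC 3986 sec. 5.2.4 (not wget's code)."""
    output = []  # completed path segments, each with its leading "/"
    while path:
        if path.startswith("../"):
            path = path[3:]            # drop a leading "../"
        elif path.startswith("./"):
            path = path[2:]            # drop a leading "./"
        elif path.startswith("/./"):
            path = path[2:]            # "/./" becomes "/"
        elif path == "/.":
            path = "/"
        elif path.startswith("/../"):
            path = path[3:]            # "/../" becomes "/"
            if output:
                output.pop()           # and the previous segment is removed
        elif path == "/..":
            path = "/"
            if output:
                output.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (with its leading "/", if any) to output.
            idx = path.find("/", 1)
            if idx == -1:
                output.append(path)
                path = ""
            else:
                output.append(path[:idx])
                path = path[idx:]
    return "".join(output)


print(remove_dot_segments("/BLAH/../page.html"))
print(remove_dot_segments("/../page.html"))
```

Under this rule both example paths come out as /page.html, because a ".."
with no preceding segment to remove is simply dropped.
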
With this new behavior implemented, wget would never request a URL with
".." in it except in the query string. Therefore ".." would never need to
be encoded: a query string containing "blah/../blah" would always be
encoded to "blah%2F..%2Fblah" and so would not affect the file system path.
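
A small Python sketch of that last point, assuming the query string is
percent-encoded with "/" treated as unsafe before it becomes part of a
local file name. The helper name is hypothetical and this is not how wget
itself is implemented; it only shows why the ".." becomes harmless once
the slashes around it are encoded.

```python
from urllib.parse import quote


def encode_query_for_filename(query: str) -> str:
    """Hypothetical helper: percent-encode a query string for use in a
    local file name, leaving no literal "/" behind."""
    # safe="" makes quote() encode "/" as well; "." is left untouched.
    return quote(query, safe="")


encoded = encode_query_for_filename("blah/../blah")
print(encoded)
# No path separator survives, so ".." cannot act as a parent-directory step.
print("/" in encoded)
```
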
Frank