Mauro Tortonesi wrote:
Frank McCown wrote:

It would be nice if wget could handle these malformed URLs properly, since they do appear from time to time. (Unfortunately, on www.merseyfire.gov.uk they appear very frequently.)


Yes, it would be nice. But how exactly are you proposing to achieve this result? By adding yet another cryptic --ignore-double-dot option?


I don't think it would be necessary to add an option, as this should be the expected behavior all the time.

Right now wget properly handles '..' if it occurs later in the URL.

Example:

wget -r http://www.merseyfire.gov.uk/BLAH/../pages/fire_auth/councillors.htm

correctly changes the URL to request

http://www.merseyfire.gov.uk/pages/fire_auth/councillors.htm

All that would be needed is to ignore any '..' segments that appear directly after the domain name, since there is no directory above the root for them to refer to.

wget -r http://www.foo.org/../../../../page.html

should simply request

http://www.foo.org/page.html
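
To make the proposed behavior concrete, here is a rough sketch in Python. It is not wget's actual code (wget is written in C, and I have not checked how its path simplification is implemented); the function name collapse_dot_segments is made up for illustration. The idea is roughly what RFC 3986 section 5.2.4 describes for removing dot segments: a '..' removes the previous path segment, and a '..' with nothing left to remove is simply dropped.

```python
from urllib.parse import urlsplit, urlunsplit


def collapse_dot_segments(url):
    """Hypothetical helper: resolve '.' and '..' in the path of an absolute URL.

    A '..' that would climb above the root is ignored rather than kept.
    """
    parts = urlsplit(url)
    if not parts.path:
        return url  # no path at all, nothing to simplify

    segments = parts.path.split("/")[1:]  # drop the empty piece before the leading '/'
    out = []
    for seg in segments:
        if seg == "..":
            if out:
                out.pop()  # step back one level
            # else: already at the root, so ignore the '..'
        elif seg != ".":
            out.append(seg)

    # A path ending in '.' or '..' names a directory, so keep a trailing slash.
    if segments and segments[-1] in (".", ".."):
        out.append("")

    path = "/" + "/".join(out)
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


if __name__ == "__main__":
    print(collapse_dot_segments(
        "http://www.merseyfire.gov.uk/BLAH/../pages/fire_auth/councillors.htm"))
    print(collapse_dot_segments("http://www.foo.org/../../../../page.html"))
```

With this sketch, both example URLs above come out in the form I would expect wget to request, and no extra command-line option is involved.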

BTW, you guys have done a great job with wget and I hope you don't take any of my suggestions as criticism about your software.

Thanks,
Frank
