Recently I used the following wget command on a hosted Linux account:

 $ wget --mirror <url> -o mirror.log

The web site contained files and virtual directories with spaces in their names.
URL encoding translates these spaces to %20.

wget correctly URL-decoded the file names (creating file names containing spaces) but failed to URL-decode the directory names (creating directory paths containing %20 instead of spaces). The resulting mirror therefore contained broken links. Some hyperlinks were embedded inside Flash graphics files, so rewriting the hyperlinks was not an option.

Personally, I would never put a space in a web-hosted file or directory name, but in this case I was migrating a web site that had been developed by someone else. I think mirroring should work regardless.

Example:
Original path:          abc def/xyz pqr.gif
After wget mirroring:   abc%20def/xyz pqr.gif   (broken link)
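
To make the expected result concrete, here is a minimal Python sketch of decoding every component of a path in the same way. It is only an illustration of the behaviour I would expect, not wget's actual code; the helper name decode_path is hypothetical.

```python
from urllib.parse import unquote


def decode_path(url_path: str) -> str:
    """Percent-decode each '/'-separated component of a URL path.

    Hypothetical helper for illustration only; not part of wget.
    """
    # Split first, then decode, so directory and file components
    # are all treated identically.
    return "/".join(unquote(part) for part in url_path.split("/"))


# Both the directory and the file name end up with real spaces.
print(decode_path("abc%20def/xyz%20pqr.gif"))  # abc def/xyz pqr.gif
```

Applied to the path from the broken mirror (abc%20def/xyz pqr.gif), the same function also yields abc def/xyz pqr.gif, which is the local path the links expect.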

wget --version reports GNU Wget 1.8.2.

Thanks for the invaluable wget.

Tony O'Hagan.


