In replies to the post requesting support of the file:// scheme, requests were made for someone to provide a compelling reason to want to do this. Perhaps the following is such a reason.
I have a CD with HTML content (a CD of abstracts from a scientific conference). For space reasons, not all of the content was included on the CD, so there remain links to figures and diagrams on a remote web site. I'd like to create a complete local archive by having wget retrieve everything and convert the links to point to the retrieved material. The wget functionality when retrieving the local files should therefore work the same as if the files were retrieved from a web server (i.e. the local input file needs to be processed, both local and remote content retrieved, and the copies of the local and remote files all adjusted to refer to the local copies rather than the remote content). A simple shell script that runs cp or rsync on the local files without any further processing would not achieve this aim.
Regarding where the local files should be copied, I suggest a default scheme similar to the current http functionality. For example, if the local source were /source/index.htm, and I ran something like:
wget.exe -m -np -k file:///source/index.htm
this could be retrieved to ./source/index.htm (assuming that I ran the command from anywhere other than the root directory). On Windows, if the local source file is c:\test.htm, then the destination could be .\c\test.htm. It would probably be fair enough for wget to throw up an error if the source and destination were the same file (and perhaps helpfully suggest that the user changes into a new subdirectory and retry the command).
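To make the proposed mapping concrete, here is a rough Python sketch of it. This is only an illustration of the scheme described above, not anything wget currently does; the function name mirror_destination and its windows flag are made up for the example.

```python
from pathlib import PurePosixPath, PureWindowsPath


def mirror_destination(source: str, windows: bool = False) -> str:
    """Map an absolute local source path to a destination relative to
    the directory the command is run from (hypothetical helper)."""
    if windows:
        p = PureWindowsPath(source)
        # "c:" becomes a plain directory name "c"
        drive = p.drive.rstrip(":").lower()
        # skip the anchor ("c:\") and keep the remaining components
        rest = list(p.parts[1:]) if p.anchor else list(p.parts)
        parts = ([drive] if drive else []) + rest
        return ".\\" + "\\".join(parts)
    p = PurePosixPath(source)
    # drop the leading "/" so the path hangs off the current directory
    parts = [part for part in p.parts if part != "/"]
    return "./" + "/".join(parts)


print(mirror_destination("/source/index.htm"))
print(mirror_destination("c:\\test.htm", windows=True))
```

The two calls reproduce the examples given above: /source/index.htm maps to ./source/index.htm, and c:\test.htm maps to .\c\test.htm. A real implementation would also need the same-file check mentioned above, which is left out here.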
One additional problem this scheme needs to deal with is when one or more /../ components in the path specification result in a destination above the current directory; the destination would then have to be adjusted to ensure the file remains within the current directory's structure. For example, if I am in /dir/dest/ and run
wget.exe -m -np -k file://../../source/index.htm
this could be saved to ./source/index.htm (i.e. /dir/dest/source/index.htm).
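One possible way to do that adjustment is sketched below in Python, assuming the simplest rule: any ".." that would climb above the directory the command was run from is silently dropped. The function name clamp_relative is hypothetical, and other policies (such as refusing the path with an error) would be equally reasonable.

```python
def clamp_relative(path: str) -> str:
    """Normalise a relative path so the result never escapes the
    current directory (hypothetical helper, one possible policy)."""
    kept = []
    for part in path.replace("\\", "/").split("/"):
        if part in ("", "."):
            continue  # empty and "." components change nothing
        if part == "..":
            if kept:
                kept.pop()  # step back inside the tree as usual
            # otherwise the ".." would leave the tree, so ignore it
            continue
        kept.append(part)
    return "./" + "/".join(kept)


print(clamp_relative("../../source/index.htm"))
```

With the example above, ../../source/index.htm comes out as ./source/index.htm, so running the command from /dir/dest/ would save the file as /dir/dest/source/index.htm.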
-David.
