Hi, I'm working on a "Web-in-a-sandbox" project, trying to host shallow (-l 2) copies of several web sites on a server running in a private Internet "replica".
So far, httrack's "-K5" option (which they call "transparent proxy URL") appears to do what I need (see http://www.httrack.com/html/httrack.man.html):

1. rename the script output of site.com/article.cgi?25 to ./site.com/articleDEADBEEF.css (note the difference from --adjust-extension)
2. rewrite the link in the referencing document as "http://site.com/articleDEADBEEF.css?25"

This works perfectly when hosting both the referencing and the referenced sites in the sandbox.

Unfortunately, I found httrack to be otherwise very fragile and (the major dealbreaker for me) still unable to follow meta refresh links, so I'd like to see wget gain the ability to rename source links and target documents the way httrack's "-K5" flag does, as described above.

With wget, I'm using "-k -E" (--convert-links and --adjust-extension) when mirroring these web sites, but I would be interested in an alternative way of accomplishing --convert-links.

As far as I was able to tell, --adjust-extension on its own will append a .html or .css when saving script output, e.g. saving http://site.com/article.cgi?25 as ./site.com/article.cgi?25.css, but it will not also rewrite the referencing URL in the document that caused wget to recurse and fetch the output of this script.

If I add --convert-links into the mix, the referencing link does get rewritten, but it ends up looking like "../site.com/article.cgi?25.html". That form is designed for offline viewing via "file://" and is unsuitable for actually hosting both the referencing and the referenced sites as virtual servers in a web server within the sandbox.

Am I missing something about wget's capabilities that would allow me to get it to work in a way similar to httrack's -K5 option? If not, and assuming I can come up with a patch, would there be any interest in upstreaming this type of additional functionality?

Thanks much,
--Gabriel
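
To make the requested behaviour concrete, here is a minimal Python sketch of the renaming and link rewriting described above. It is only an illustration, not httrack's or wget's actual code: the function name k5_style_names is hypothetical, and the 8-hex-digit tag (standing in for "DEADBEEF") is computed with CRC32 purely as a placeholder, since the real tag scheme is not specified here. The extension is assumed to have already been chosen from the response's content type.

```python
import posixpath
import zlib
from urllib.parse import urlsplit, urlunsplit


def k5_style_names(url, ext):
    """Return (local_path, rewritten_url) for a script URL.

    ext is the extension picked from the content type, e.g. ".css".
    Hypothetical helper; the tag scheme below is an assumption.
    """
    parts = urlsplit(url)
    directory, filename = posixpath.split(parts.path)
    stem, _old_ext = posixpath.splitext(filename)

    # Placeholder tag: CRC32 over the original path and query string.
    tag = format(zlib.crc32(f"{parts.path}?{parts.query}".encode()), "08X")

    # Replace the script extension with tag + new extension.
    new_path = posixpath.join(directory, stem + tag + ext)

    # Where the mirrored copy is saved on disk.
    local_path = "./" + parts.netloc + new_path

    # The link stays absolute and keeps its original query string.
    rewritten_url = urlunsplit(
        (parts.scheme, parts.netloc, new_path, parts.query, "")
    )
    return local_path, rewritten_url


if __name__ == "__main__":
    print(k5_style_names("http://site.com/article.cgi?25", ".css"))
```

The point of keeping the rewritten link absolute is that it still resolves to the right virtual host inside the sandbox, whereas the relative "../site.com/..." form produced by --convert-links only makes sense when browsing from the local file system.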
