Nigel Horne wrote:
> Use your favourite browser to visit the page
> http://web.archive.org/web/20080207072124/http://barry-white.members.beeb.net/
> and look at its source and you'll find URLs such as
> http://web.archive.org/web/20080207072124/http://barry-white.members.beeb.net/registers/pr_birch_c1.pdf
>
> Now run wget -m -k -K -E
> http://web.archive.org/web/20080207072124/http://barry-white.members.beeb.net
> and look at the index.html that's been retrieved and
> you'll find that the above URL has been changed to
> http://barry-white.members.beeb.net.wstub.archive.org/registers/pr_birch_c1.pdf
> which is entirely different.

Guess what? Wget's not doing that, archive.org is. In fact, if you look
closer at those sources, you'll see that the HTML BASE tag is already set
that way in the page as Wget sees it; archive.org inserts JavaScript to
replace that tag after the fact.

Please also see http://www.archive.org/about/faqs.php#28

-- 
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer.
Maintainer of GNU Wget and GNU Teseq
http://micah.cowan.name/
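
To illustrate the point about the BASE tag, here is a minimal Python sketch of how a client that reads the raw HTML (without running any JavaScript) would resolve a relative link. This is not Wget's code. The SAMPLE_HTML snippet, the LinkResolver class and the resolve_links helper are hypothetical, and the BASE value is an assumption reconstructed from the URLs quoted in this thread.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkResolver(HTMLParser):
    """Collect <a href> values, resolved against <base href> if present."""

    def __init__(self, page_url):
        super().__init__()
        self.base = page_url  # fall back to the page's own URL
        self.links = []

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href")
        if href is None:
            return
        if tag == "base":
            # A BASE tag overrides the URL that relative links resolve against.
            self.base = href
        elif tag == "a":
            self.links.append(urljoin(self.base, href))


def resolve_links(html, page_url):
    """Return the absolute form of every <a href> found in html."""
    parser = LinkResolver(page_url)
    parser.feed(html)
    return parser.links


# Hypothetical stand-in for the archived page as served, before any
# JavaScript has had a chance to rewrite the BASE tag.
SAMPLE_HTML = """
<html><head>
<base href="http://barry-white.members.beeb.net.wstub.archive.org/">
</head><body>
<a href="registers/pr_birch_c1.pdf">Birch</a>
</body></html>
"""

PAGE_URL = ("http://web.archive.org/web/20080207072124/"
            "http://barry-white.members.beeb.net/")

if __name__ == "__main__":
    for link in resolve_links(SAMPLE_HTML, PAGE_URL):
        print(link)
```

Under those assumptions the script prints the wstub.archive.org URL from the original report. A browser would show something different only because it runs the inserted JavaScript before you look at the links.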

