Thank you for the great response. It is much appreciated - see below...

On Tue, 7 Oct 2003, Hrvoje Niksic wrote:
> www.zorg.org/vsound/ contains this markup:
>
>     <META NAME="ROBOTS" CONTENT="NOFOLLOW">
>
> That explicitly tells robots, such as Wget, not to follow the links in
> the page.  Wget respects this and does not follow the links.  You can
> tell Wget to ignore the robot directives.  For me, this works as
> expected:
>
>     wget -km -e robots=off http://www.zorg.org/vsound/

Perfect - thank you.

> > At first it will act normally, just going over the site in question,
> > but sometimes, you will come back to the terminal and see it grabbing
> > all sorts of pages from totally different sites (!)
>
> The only way I've seen it happen is when it follows a redirection to a
> different site.  The redirection is followed because it's considered
> to be part of the same download.  However, further links on the
> redirected site are not (supposed to be) followed.

OK, is there a way to tell wget not to follow redirects, so it will
never do that at all?  Basically I am looking for a way to tell wget
"don't ever get anything with a different FQDN than what I started you
with".

Thanks.
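For what it's worth, two options that may get close to "never leave the starting FQDN" (both hedged: --max-redirect only exists in newer wget releases, so check your wget --help before relying on it):

```shell
# Refuse to follow any HTTP redirect at all.
# (--max-redirect is not present in older wget versions.)
wget -km -e robots=off --max-redirect=0 http://www.zorg.org/vsound/

# Restrict recursive retrieval to the starting domain.
# Note: per the behavior described above, a redirect itself may still
# be followed cross-host; -D only limits which links are recursed into.
wget -km -e robots=off --domains=zorg.org http://www.zorg.org/vsound/
```

With --max-redirect=0, any redirected URL is simply reported as a redirect and skipped, which sounds like what you are after; the -D form is the usual way to keep recursion on one domain.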
