Hello,
I have noticed some unpredictable behavior from wget 1.8.2. Specifically,
I have noticed two things:
a) sometimes it does not follow all of the links it should
b) sometimes it follows links to other sites and URLs, even though the
command line used should not allow that.
Here
Thank you for the great response. It is much appreciated - see below...
On Tue, 7 Oct 2003, Hrvoje Niksic wrote:
www.zorg.org/vsound/ contains this markup:
<META NAME="ROBOTS" CONTENT="NOFOLLOW">
That explicitly tells robots, such as Wget, not to follow the links in
the page. Wget
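
A quick way to check a saved page for this kind of tag is sketched below. The function name has_robots_nofollow is made up for illustration, and the pattern is only a rough heuristic: it assumes the whole META tag sits on one line with the word ROBOTS before NOFOLLOW. The wget manual also documents a "robots" wgetrc setting that can be passed with -e; it is shown only as a comment, and whether it suits a given site is a separate question.

```shell
# has_robots_nofollow is a hypothetical helper, not part of wget.
# It succeeds (exit status 0) if FILE contains a META ROBOTS tag that
# mentions NOFOLLOW. Heuristic only: the tag must be on a single line,
# with "robots" appearing before "nofollow".
has_robots_nofollow() {
    grep -qi '<meta[^>]*robots[^>]*nofollow' "$1"
}

# Usage, assuming the page was already saved as index.html:
#   has_robots_nofollow index.html && echo "page asks robots not to follow links"
#
# The wget manual describes a wgetrc "robots" setting; passed on the
# command line it would look like this (not run here):
#   wget -e robots=off -r http://www.zorg.org/vsound/
```

Checking the page first makes it easier to tell a robots exclusion apart from a real link-following problem.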
Generally, I mirror an entire web site with:
wget --tries=inf -nH --no-parent --random-wait -r -l inf --convert-links
--html-extension www.example.com
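
For reference, the same command can be wrapped in a small shell function with each option annotated. This is only a sketch: mirror_site is a made-up name, and the option notes are short paraphrases of the wget manual rather than a full description.

```shell
# mirror_site is a hypothetical helper name, not part of wget.
# Options, in order:
#   --tries=inf       retry each download indefinitely
#   -nH               do not create a host-name directory
#   --no-parent       never ascend above the starting directory
#   --random-wait     randomize the delay between requests
#                     (it scales the --wait interval)
#   -r -l inf         recurse with no depth limit
#   --convert-links   rewrite links for local viewing
#   --html-extension  save HTML pages with an .html suffix
mirror_site() {
    wget --tries=inf -nH --no-parent --random-wait \
         -r -l inf --convert-links --html-extension "$1"
}

# Usage (not run here):
#   mirror_site www.example.com
```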
But that is for mirroring an _entire_ web site, where the URL looks
like:
www.example.com
BUT, how can I mirror a URL that looks like: