This will probably be one of those "d'oh" moments... I'm trying to build a log file so I can see which broken links get reported. Wget works as I expect with -r, -m, etc., but when I add the --spider option it doesn't work correctly; it seems to stop at the index.html file. I can grab index.html just fine using wget without the spider option.
$ wget --spider -r -o logfile.txt www.domain.com
$ cat logfile.txt
--14:51:45--  http://www.domain.com/
           => `www.domain.com/index.html'
Resolving www.domain.com ... 10.52.129.200
Connecting to www.domain.com|10.52.129.200|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 34,133 (33K) [text/html]
200 OK

www.domain.com/index.html: No such file or directory

FINISHED --14:51:45--
Downloaded: 0 bytes in 0 files

So I looked around and thought maybe setting robots = off in wgetrc was the catch, but no dice! Anyone have a clue about something I may be overlooking?

Cheers,
Ken
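P.S. In case it helps narrow things down, here's the workaround sketch I may fall back on if spider mode stays stuck: let wget actually download recursively but throw the files away with --delete-after, then grep the log for error lines. This is untested on my end; the "ERROR 404" pattern is what GNU wget normally writes to its log for missing pages, so the grep may need adjusting for other wget versions.

# Recurse for real (so links get parsed), but delete each file after download.
$ wget -r -nd --delete-after -o logfile.txt www.domain.com
# Show each 404 error with a couple of preceding log lines for context,
# which is where wget prints the URL that failed.
$ grep -B 2 'ERROR 404' logfile.txt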
