(Please cc me/reply all)
No matter what I try (including explicitly limiting domains with -H
and -D), wget crawls sites that are not specified on the command line.
For example, a simple:
% wget -r -l 1 http://www.nytimes.com
[stopped after about 2 minutes, and then]
% ls -1
homedelivery.nytimes.com/
jobmarket.nytimes.com/
listings.nytimes.com/
personal.fidelity.com/
schools.nyc.gov/
select.nytimes.com/
video.on.nytimes.com/
www.brownharrisstevens.com/
www.continental.com/
www.nytimes.com/
Shouldn't it be fetching things *only* from nytimes.com? It also
appears to be recursively crawling those other sites, not just grabbing
single linked pages, which seems to go against the -l switch.
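For completeness, the domain-limited variants I tried look roughly like
the following (a sketch from memory; the exact domain list passed to -D
is just what I'd expect per my reading of the man page):

```shell
# Attempt 1: recursion depth 1, no host spanning
# (the documentation says staying on the start host is the default):
wget -r -l 1 http://www.nytimes.com

# Attempt 2: restrict the crawl to nytimes.com with -D
# (-D takes a comma-separated domain list and, as I understand it,
# only filters hosts when -H has enabled host spanning):
wget -r -l 1 -H -D nytimes.com http://www.nytimes.com
```

Both runs still left me with the directory listing shown above.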
Thanks,
- Jesse