Hello,

I am running Nutch with: bin/nutch crawl urls -dir crawl -depth 3 -topN 3

and in my urls/sites file I have two sites:

http://www.mysite.com
http://www.mysite2.com

I would like to crawl those two sites to infinite depth and index all
the pages within them. But I don't want the crawler to follow links to
remote sites, like Facebook, if those sites link to them.
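My best guess from skimming the docs (unverified, so please correct me)
is that conf/regex-urlfilter.txt controls which URLs get followed, and
that something like this would restrict the crawl to my two seed domains:

```
# conf/regex-urlfilter.txt -- my guess, not tested:
# accept only URLs on the two seed sites
+^http://(www\.)?mysite\.com/
+^http://(www\.)?mysite2\.com/
# reject everything else (replaces the default catch-all "+.")
-.
```

I assume I would also drop -topN and raise -depth to get every page,
but I am not sure that is the right way.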

How do I do this? I know this is a basic question, but I have looked
through all the documentation and could not figure it out.

Best Regards,
C.B.
