Hi all, I have been trying to run a crawl on a couple of different domains using Nutch:
bin/nutch crawl urls -dir crawled -depth 3

Every time I get the response:

Stopping at depth=x - no more URLs to fetch.

Sometimes a page or two at the first level get crawled; in most other cases, nothing gets crawled at all. I don't know if I have been making a mistake in the crawl-urlfilter.txt file. Here is how it looks for me:

# accept hosts in MY.DOMAIN.NAME
+^http://([a-z0-9]*\.)*blogspot.com/

(all other sections in the file have their default values)

My urllist.txt file has only one URL:

http://gmailblog.blogspot.com

The only website where the crawl seems to be working properly is http://lucene.apache.org

Any suggestions are appreciated.

--
View this message in context: http://old.nabble.com/Stopping-at-depth%3D0---no-more-URLs-to-fetch-tp26310955p26310955.html
Sent from the Nutch - User mailing list archive at Nabble.com.
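[Editor's note: one quick sanity check is whether the filter pattern actually matches the seed URL as written in urllist.txt. A minimal sketch in Python below, assuming the pattern uses no Java-only regex features (Nutch itself evaluates these lines as Java regexes, minus the leading +/- include/exclude marker); note the pattern requires a trailing slash that the seed URL lacks:]

```python
import re

# The crawl-urlfilter.txt line, with Nutch's leading "+" marker stripped.
# The unescaped dot in "blogspot.com" matches any character, but that is
# harmless here; the trailing "/" is the important part.
pattern = re.compile(r"^http://([a-z0-9]*\.)*blogspot.com/")

# The seed URL exactly as written in urllist.txt has no trailing slash,
# so the filter rejects it; adding the slash makes it pass.
print(bool(pattern.match("http://gmailblog.blogspot.com")))   # no trailing slash
print(bool(pattern.match("http://gmailblog.blogspot.com/")))  # with trailing slash
```

If the first check fails and your Nutch version does not normalize the seed URL to add the slash, that alone would explain "no more URLs to fetch" at depth 0.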