Hi all,

I am trying to set up Nutch 1.2 on Hadoop and followed the instructions at http://wiki.apache.org/nutch/NutchHadoopTutorial, which have been very useful.
However, when I execute the command

    $ bin/nutch crawl urls -dir crawl -depth 4 -topN 50

the crawler stops at the generator stage with the message:

    2011-03-06 17:23:49,538 WARN crawl.Generator - Generator: 0 records selected for fetching, exiting ...

I have configured the following plugins in nutch-site.xml:

    protocol-http|parse-(text|html|js)|urlnormalizer-(pass|regex|basic)|urlfilter-regex|index-(basic|anchor)

I am not using crawl-urlfilter.txt or regex-urlfilter.txt to filter URLs. I initiated the crawl with 10 seed URLs from popular sites on the internet.

Any pointers to what I am missing here?

Regards,
Chidu
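
P.S. In case it helps to see the plugin list in context, it sits in nutch-site.xml roughly as follows. This is only a sketch: I am assuming the standard plugin.includes property name, and the value is the plugin list quoted in this message.

```xml
<!-- Sketch of the nutch-site.xml entry; the property name plugin.includes is assumed -->
<property>
  <name>plugin.includes</name>
  <!-- Value is the plugin list from this message, unchanged -->
  <value>protocol-http|parse-(text|html|js)|urlnormalizer-(pass|regex|basic)|urlfilter-regex|index-(basic|anchor)</value>
</property>
```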

