I've followed the tutorial on the Wiki site and have successfully indexed a few pages on www.apache.com with the command

    bin/nutch crawl /etc/opt/nutch/urls -dir /var/lib/nutch-crawls/test1 -depth 3 -topN 50

A query for "apache" on my local nutch/tomcat installation gives me 52 matching pages.

Next I changed /usr/local/nutch/conf/crawl-urlfilter.txt to allow www.circuitcity.com with

    +^http://www.circuitcity.com/

I also added the root page to /etc/opt/nutch/urls/circuitcity. I cleared out my test run with

    rm /var/lib/nutch-crawls/test1/* -Rf

and reran my crawl:

    bin/nutch crawl /etc/opt/nutch/urls -dir /var/lib/nutch-crawls/test1 -depth 3 -topN 50

It looks like it downloads plenty of pages (all from circuitcity). When I try searching for anything on the tomcat/nutch app, I get 0 results every time. I can switch back to apache and the index turns up results.

Is there a config file I missed somewhere?

Regards,
Daniel Garcia
