Dear all,

I am new to Nutch and have recently been trying Nutch 1.7 and Solr 4.4 to build a search engine. Here are some questions after trying for a while:
1. I use this command to start the crawl, as stated in the tutorial:

   /bin/bash ./bin/crawl urls/seed.txt TestCrawl http://localhost:8983/solr/ 2

So when will the crawled pages be sent to Solr for indexing? When I look at the Solr dashboard, the number of docs does not increase while the crawl is in progress (the check I am running is sketched after this list).

2. About error handling: if some Java exceptions are thrown in the middle of the crawl, how can I tell how much of the crawled data has been indexed, and where will the crawl resume if I execute the above command again? (My understanding of the individual steps is sketched below.)

3. Any advice on running the crawl if I want to index frequently updated pages, e.g. BBC News? (A rough sketch of what I have in mind also follows.)
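For question 1, this is how I have been checking the document count while the crawl runs; it is a minimal check, assuming the default collection1 core that ships with the Solr 4.4 example:

   # ask Solr for the total number of indexed documents (numFound)
   # without returning any of them (rows=0)
   curl "http://localhost:8983/solr/collection1/select?q=*:*&rows=0&wt=json"

The numFound value stays flat for the whole run, which is what prompted the question.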
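For question 2, my understanding from the tutorial is that the crawl script is essentially a loop over the individual Nutch commands, roughly the sequence below (the TestCrawl/... paths are my guesses based on the crawl ID I passed in; please correct me if the script lays things out differently):

   # one round of the crawl, step by step
   bin/nutch inject TestCrawl/crawldb urls
   bin/nutch generate TestCrawl/crawldb TestCrawl/segments
   # pick up the segment that generate just created
   s1=`ls -d TestCrawl/segments/2* | tail -1`
   bin/nutch fetch $s1
   bin/nutch parse $s1
   bin/nutch updatedb TestCrawl/crawldb $s1
   bin/nutch invertlinks TestCrawl/linkdb -dir TestCrawl/segments
   bin/nutch solrindex http://localhost:8983/solr/ TestCrawl/crawldb -linkdb TestCrawl/linkdb $s1

If that is right, does recovering from an exception just mean rerunning from the failed step against the current segment?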
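For question 3, what I have in mind is rerunning the whole crawl from cron and lowering Nutch's refetch interval so that already-fetched pages become due again quickly. A sketch, assuming the db.fetch.interval.default property in conf/nutch-site.xml (the value is in seconds; the default is 30 days) and a hypothetical install path:

   <!-- conf/nutch-site.xml: consider pages due for refetching after 4 hours -->
   <property>
     <name>db.fetch.interval.default</name>
     <value>14400</value>
   </property>

   # crontab entry: rerun the crawl every 6 hours
   0 */6 * * * cd /path/to/nutch && /bin/bash ./bin/crawl urls/seed.txt TestCrawl http://localhost:8983/solr/ 2

Is that a reasonable approach for sites like news pages, or is there a better mechanism for this?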
Thanks.

Regards,
Patrick
