Ian, can you please help with this? I have upgraded to Nutch 0.9. I am able to run Nutch in standalone mode, i.e. without Hadoop, but with Hadoop I get the error "Generator: 0 records selected for fetching, exiting ...". I have performed this step:

    bin/hadoop dfs -put urls urls

and upon running bin/hadoop dfs -ls, I see that urls is there in the DFS:

    bin/hadoop dfs -ls
    Found 2 items
    /user/nutch/crawl    <dir>
    /user/nutch/urls     <dir>
Output of crawl:

    crawl started in: crawl
    rootUrlDir = urls
    threads = 10
    depth = 3
    topN = 50
    Injector: starting
    Injector: crawlDb: crawl/crawldb
    Injector: urlDir: urls
    Injector: Converting injected urls to crawl db entries.
    Injector: Merging injected urls into crawl db.
    Injector: done
    Generator: Selecting best-scoring urls due for fetch.
    Generator: starting
    Generator: segment: crawl/segments/20070419134155
    Generator: filtering: false
    Generator: topN: 50
    Generator: 0 records selected for fetching, exiting ...
    Stopping at depth=0 - no more URLs to fetch.
    No URLs to fetch - check your seed list and URL filters.
    crawl finished: crawl
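For what it's worth, the last line of the output ("check your seed list and URL filters") is the usual hint here. With the Nutch 0.9 crawl command, URL filtering at inject time is driven by conf/crawl-urlfilter.txt; if its accept pattern does not match the seed URLs, the injected urls are all dropped and the Generator then finds zero records. A minimal sketch of what the relevant lines should look like, assuming the seeds live under example.com (a placeholder domain, not taken from the post):

```
# conf/crawl-urlfilter.txt (fragment)

# accept URLs within the seed domain
# (replace example.com with your actual domain, i.e. the
#  default MY.DOMAIN.NAME placeholder must be edited)
+^http://([a-z0-9]*\.)*example.com/

# reject everything else
-.
```

It is also worth confirming that the seed file inside the urls directory in DFS is non-empty (e.g. with bin/hadoop dfs -cat), since an empty or missing seed file produces the same "0 records selected" symptom.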
