After repeating all the steps, I am now getting the output below with Nutch 1.2.
Getting closer! (I think)

crawl started in: crawl
rootUrlDir = urls
threads = 10
depth = 3
indexer=lucene
topN = 50
Injector: starting at 2010-08-16 23:19:52
Injector: crawlDb: crawl/crawldb
Injector: urlDir: urls
Injector: Converting injected urls to crawl db entries.
*Skipping http://lucene.apache.org/nutch/:java.lang.NullPointerException*
Injector: Merging injected urls into crawl db.
Injector: finished at 2010-08-16 23:19:55, elapsed: 00:00:02
Generator: starting at 2010-08-16 23:19:55
Generator: Selecting best-scoring urls due for fetch.
Generator: filtering: true
Generator: normalizing: true
Generator: topN: 50
Generator: jobtracker is 'local', generating exactly one partition.
Generator: 0 records selected for fetching, exiting ...
Stopping at depth=0 - no more URLs to fetch.
No URLs to fetch - check your seed list and URL filters.
crawl finished: crawl

I will continue to investigate, but would really appreciate some help ;)

J