Hello guys, I am pretty new to Nutch, so bear with me. I have been encountering an IOException during one of my test crawls. I am using Nutch 1.6 with Hadoop 0.20.2 (I chose this Hadoop version for Windows compatibility when setting file access rights).
I am running Nutch through Eclipse. I followed this guide for importing Nutch from SVN: http://wiki.apache.org/nutch/RunNutchInEclipse

My crawler's code is from this tutorial: http://cmusphinx.sourceforge.net/2012/06/building-a-java-application-with-apache-nutch-and-solr/

Here is the exception log:

solrUrl is not set, indexing will be skipped...
crawl started in: crawl
rootUrlDir = urls
threads = 1
depth = 1
solrUrl=null
topN = 1
Injector: starting at 2013-03-31 23:51:11
Injector: crawlDb: crawl/crawldb
Injector: urlDir: urls
Injector: Converting injected urls to crawl db entries.
java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
        at org.apache.nutch.crawl.Injector.inject(Injector.java:218)
        at org.apache.nutch.crawl.Crawl.run(Crawl.java:127)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at rjpb.sp.crawler.CrawlerTest.main(CrawlerTest.java:51)

Before the call to Injector.inject(), I see Crawl.java building these paths:

Path crawlDb = new Path(dir + "/crawldb");
Path linkDb = new Path(dir + "/linkdb");
Path segments = new Path(dir + "/segments");
Path indexes = new Path(dir + "/indexes");
Path index = new Path(dir + "/index");

Currently my Eclipse project does not contain the folders crawldb, linkdb, segments, and so on. I think my problem is that I have not set up all the files necessary for crawling -- so far I have only set nutch-site.xml, regex-urlfilter.txt, and urls/seed.txt. Any advice on the matter will be of great help. Thanks!
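P.S. In case it helps, my CrawlerTest is essentially the crawl invocation from that tutorial. Roughly (simplified; the argument values match the log above, and the ToolRunner.run call is the one at line 51 of the stack trace):

package rjpb.sp.crawler;

import org.apache.hadoop.util.ToolRunner;
import org.apache.nutch.crawl.Crawl;
import org.apache.nutch.util.NutchConfiguration;

public class CrawlerTest {
    public static void main(String[] args) throws Exception {
        // Same values as in the log: seed dir "urls", output dir "crawl",
        // 1 fetch thread, crawl depth 1, topN 1. No -solr, so indexing is skipped.
        String[] crawlArgs =
            {"urls", "-dir", "crawl", "-threads", "1", "-depth", "1", "-topN", "1"};
        // Crawl implements Tool, so ToolRunner.run() ends up in Crawl.run(),
        // which calls Injector.inject() -- the point where the job fails for me.
        int res = ToolRunner.run(NutchConfiguration.create(), new Crawl(), crawlArgs);
        System.exit(res);
    }
}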

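My nutch-site.xml is also minimal -- essentially just the agent name, something like this (the value is a placeholder):

<?xml version="1.0"?>
<configuration>
  <property>
    <name>http.agent.name</name>
    <value>TestCrawler</value>
  </property>
</configuration>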
