Hi all: Just FYI, I have solved the problem. I checked the hadoop.log file and found that the "plugin.folders" property was set incorrectly.
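For reference, that property lives in conf/nutch-site.xml. The snippet below is just a sketch of what a working entry looks like; the value ("plugins" here) depends on where the compiled plugin directories sit in your own install:

==============================================================
<property>
  <name>plugin.folders</name>
  <!-- Directories to scan for plugins, relative to the Nutch
       home or absolute. "plugins" is the usual value for a
       local runtime install; adjust for your layout. -->
  <value>plugins</value>
</property>
==============================================================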
Thank you for your help.

Andy

On 11 April 2012 16:41, Andy Xue <[email protected]> wrote:

> Hi Lewis:
>
> Thank you for the help. This is the (entire) output after I set the log4j
> property to debug.
> ==============================================================
> crawl started in: crawl
> rootUrlDir = urls
> threads = 10
> depth = 2
> solrUrl=http://localhost:8983/solr/
> topN = 10
> Injector: starting at 2012-04-11 16:37:20
> Injector: crawlDb: crawl/crawldb
> Injector: urlDir: urls
> Injector: Converting injected urls to crawl db entries.
>
> Exception in thread "main" java.io.IOException: Job failed!
>     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
>     at org.apache.nutch.crawl.Injector.inject(Injector.java:217)
>     at org.apache.nutch.crawl.Crawl.run(Crawl.java:127)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at org.apache.nutch.crawl.Crawl.main(Crawl.java:55)
> ==============================================================
>
> And btw, the "urls" directory is correct and it does contain a txt file
> with a list of urls.
>
> Regards
> Andy
>
>
> On 10 April 2012 22:08, Lewis John Mcgibbney <[email protected]> wrote:
>
>> There is no more log information before the solrUrl stuff, no?
>>
>> Try setting log4j.properties to debug in conf/, rebuild the project and
>> see what's going on.
>>
>> On Tue, Apr 10, 2012 at 1:03 PM, Andy Xue <[email protected]> wrote:
>>
>> > Lewis:
>> > Thanks for the reply.
>> > However, as far as I know, I don't have to set solrUrl unless I want to
>> > index using solr.
>>
>> Correct. My fault. I just assumed that this was required.
>>
>> Lewis
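P.S. For anyone finding this in the archives: the "set log4j.properties to debug" step Lewis describes above means turning the logger levels in conf/log4j.properties up from INFO to DEBUG, roughly like this (a sketch using standard log4j syntax; match it to the logger names in your own copy of the file, then rebuild so the change lands under runtime/):

==============================================================
# Log Nutch's own classes at DEBUG instead of INFO
log4j.logger.org.apache.nutch=DEBUG
# The Injector runs as a MapReduce job, so Hadoop's logging can help too
log4j.logger.org.apache.hadoop=DEBUG
==============================================================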

