This is a newbie question. Please forgive me if it has already been
answered elsewhere.
I am trying to follow the Nutch 0.8 tutorial to run the Nutch
crawler over the web. I tried to bootstrap the crawldb by injecting the
URLs obtained from DMOZ, using the command:
bin/nutch inject crawl/crawldb dmoz
The following exception occurred:
Injector: starting
Injector: crawlDb: crawl/crawldb
Injector: urlDir: ../devel/dmoz
Injector: Converting injected urls to crawl db entries.
Injector: java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
    at org.apache.nutch.crawl.Injector.inject(Injector.java:162)
    at org.apache.nutch.crawl.Injector.run(Injector.java:192)
    at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:189)
    at org.apache.nutch.crawl.Injector.main(Injector.java:182)
The exception did not offer much detail. Any help would be highly
appreciated.
ND