Hi Team,

    I initially followed the steps in the Nutch wiki
tutorial<http://wiki.apache.org/nutch/NutchTutorial>
to set up Nutch from the binary distribution, and I was able to crawl
and index successfully.


Now I am trying to set up Nutch in Eclipse, and I am stuck at step 1.4.3 (
Link <http://wiki.apache.org/nutch/RunNutchInEclipse#Configure_Nutch>),
quoted below:

   - 1. see the Tutorial and follow all configuration steps, ensure that
   you DO NOT undertake any crawling. The directory structure for Nutch trunk
   enables us to edit nutch-site.xml.template, nutch-default.xml and
   regex-urlfilter.txt.template in our /conf directory, these properties will
   then be automatically built into our /runtime build folder.
   - 2. ensure that you change the property "plugin.folders" to
   "./src/plugin" on $NUTCH_HOME/conf/nutch-site.xml.

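For reference, this is how I understand the change in step 2 would look in
conf/nutch-site.xml. This is only a sketch: the property name and value come
from the wiki text above, and I am assuming the usual <configuration> wrapper
that nutch-default.xml uses.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Assumption: overrides the plugin.folders default from nutch-default.xml.
       The relative path is presumably resolved against the working directory
       of the Eclipse run configuration. -->
  <property>
    <name>plugin.folders</name>
    <value>./src/plugin</value>
  </property>
</configuration>
```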

Step 1 points to the same tutorial that I followed earlier when I set up
the binary distribution. My question is whether I should reuse that existing
setup (and if so, where in the Eclipse Nutch project do I specify the
location of NUTCH_HOME?), or whether I should repeat the same configuration
steps in the trunk folder of my Eclipse workspace.

When I run the crawl, I get a "Job failed" message with the following error:

java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:432)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1265)
    at org.apache.nutch.crawl.Injector.inject(Injector.java:281)
    at org.apache.nutch.crawl.Crawl.run(Crawl.java:127)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.nutch.crawl.Crawl.main(Crawl.java:55)

Regards
Rajani
