It looks to me like the version of Hadoop you are running does not match the one your Nutch build was compiled against. Nutch trunk is on Hadoop 0.19. You might want to try building from SVN and then retrying.
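As a rough sketch of how to check for that kind of mismatch: `bin/hadoop version` prints the installed release on its first line (for example "Hadoop 0.19.0"), and that can be compared against the release line the Nutch build targets. The two helper functions below are hypothetical names made up for illustration, not part of Hadoop or Nutch.

```shell
# Hypothetical helper: pull the version number out of the first line of
# `hadoop version` output, which looks like "Hadoop 0.19.0".
hadoop_version_of() {
  printf '%s\n' "$1" | sed -n '1s/^Hadoop \([0-9][0-9.]*\).*/\1/p'
}

# Hypothetical helper: succeed only if the installed version ($1) belongs to
# the release line the Nutch build targets ($2), for example "0.19".
matches_release_line() {
  case "$1" in
    "$2"|"$2".*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage on a real install (assumes you are in the Hadoop directory):
#   installed=$(hadoop_version_of "$(bin/hadoop version)")
#   matches_release_line "$installed" 0.19 \
#     || echo "Hadoop $installed does not match the 0.19 line Nutch trunk uses"
```

This only tells you what is installed on the cluster side; which Hadoop release a given nutch*.job file was compiled against still has to come from the Nutch build it was produced by.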

Dennis

Shirley Cohen wrote:
Hi,

I found a workaround to this problem. I was able to run the fetcher with the nutch*.job command using the latest working nightly build from 12-28-2008.

Shirley

Shirley Cohen wrote:
Hi,

I'm new to nutch and am trying to run it on an existing hadoop 0.19.0 install. I'm using the command "hadoop jar nutch-2008-12-02_04-01-57.job", as suggested by Dennis Kubes in an earlier post. I've been able to crawl and generate segments successfully using the following commands:

hadoop dfs -put dmoz dmoz
bin/hadoop jar nutch-2008-12-02_04-01-57.job org.apache.nutch.crawl.Injector crawl/crawldb dmoz
bin/hadoop jar nutch-2008-12-02_04-01-57.job org.apache.nutch.crawl.Generator crawl/crawldb crawl/segments

However, when I try to run the fetcher using the command:

bin/hadoop jar nutch-2008-12-02_04-01-57.job org.apache.nutch.fetcher.Fetcher crawl/segments/20090104094558

I get the following error:

09/01/04 10:20:31 INFO fetcher.Fetcher: Fetcher: starting
09/01/04 10:20:31 INFO fetcher.Fetcher: Fetcher: segment: crawl/segments/20090104094558
****calling init JobTracker*****
java.lang.NoSuchMethodError: org.apache.nutch.fetcher.Fetcher$InputFormat.listPaths(Lorg/apache/hadoop/mapred/JobConf;)[Lorg/apache/hadoop/fs/Path;
       at org.apache.nutch.fetcher.Fetcher$InputFormat.getSplits(Fetcher.java:61)
       at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:783)
       at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1128)
       at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:530)
       at org.apache.nutch.fetcher.Fetcher.run(Fetcher.java:565)
       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
       at org.apache.nutch.fetcher.Fetcher.main(Fetcher.java:537)
       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
       at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
       at java.lang.reflect.Method.invoke(Unknown Source)
       at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
       at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
       at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)

Note: The subdirectory "20090104094558" was created by the generator.

I'm running the 0.9 release of nutch downloaded from: http://mirrors.24-7-solutions.net/pub/apache/lucene/nutch/

Does anyone know what is going on?

Thanks in advance,

Shirley

