Hi,

I found a workaround for this problem: I was able to run the fetcher with the nutch*.job command using the latest working nightly build, from 12-28-2008.

Shirley

Shirley Cohen wrote:
Hi,

I'm new to nutch and am trying to run it on an existing hadoop 0.19.0 install. I'm using the command "hadoop jar nutch-2008-12-02_04-01-57.job", as suggested by Dennis Kubes in an earlier post. I've been able to crawl and generate segments successfully using the following commands:

hadoop dfs -put dmoz dmoz
bin/hadoop jar nutch-2008-12-02_04-01-57.job org.apache.nutch.crawl.Injector crawl/crawldb dmoz
bin/hadoop jar nutch-2008-12-02_04-01-57.job org.apache.nutch.crawl.Generator crawl/crawldb crawl/segments

However, when I try to run the fetcher using the command:

bin/hadoop jar nutch-2008-12-02_04-01-57.job org.apache.nutch.fetcher.Fetcher crawl/segments/20090104094558

I get the following error:

09/01/04 10:20:31 INFO fetcher.Fetcher: Fetcher: starting
09/01/04 10:20:31 INFO fetcher.Fetcher: Fetcher: segment: crawl/segments/20090104094558
****calling init JobTracker*****
java.lang.NoSuchMethodError: org.apache.nutch.fetcher.Fetcher$InputFormat.listPaths(Lorg/apache/hadoop/mapred/JobConf;)[Lorg/apache/hadoop/fs/Path;
       at org.apache.nutch.fetcher.Fetcher$InputFormat.getSplits(Fetcher.java:61)
       at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:783)
       at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1128)
       at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:530)
       at org.apache.nutch.fetcher.Fetcher.run(Fetcher.java:565)
       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
       at org.apache.nutch.fetcher.Fetcher.main(Fetcher.java:537)
       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
       at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
       at java.lang.reflect.Method.invoke(Unknown Source)
       at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
       at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
       at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
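
A NoSuchMethodError generally means the code was compiled against a class that had a method which the class found at run time does not have. Here the descriptor reads as a listPaths method that takes a JobConf and returns Path[], looked up on Fetcher$InputFormat and therefore on whatever Hadoop input format class it inherits from. That would be consistent with the job file having been built against an older Hadoop API than the 0.19.0 install, though the trace alone does not prove it. A minimal sketch of one way to check is below; the helper name has_method, the jar path, and the choice of FileInputFormat as the class to inspect are all assumptions for illustration, not something taken from this thread.

```shell
# has_method NAME
# Hypothetical helper: reads `javap` output on stdin and succeeds (exit 0)
# if a method called NAME is declared in it, fails (exit 1) otherwise.
has_method() {
  grep -q "[[:space:]]$1("
}

# Example use (the jar path is a placeholder, and FileInputFormat is an
# assumed candidate for the inherited class; adjust both to your install):
#   javap -classpath /path/to/hadoop-core.jar \
#       org.apache.hadoop.mapred.FileInputFormat \
#     | has_method listPaths && echo present || echo missing
```

If the method turns out to be missing from the installed Hadoop jar, a Nutch build compiled against a matching Hadoop version is the likely fix.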

Note: The subdirectory "20090104094558" was created by the generator.

I'm running the 0.9 release of nutch downloaded from: http://mirrors.24-7-solutions.net/pub/apache/lucene/nutch/

Does anyone know what is going on?

Thanks in advance,

Shirley

