Hi,

 

I'm trying to run Nutch 1.13 on Hadoop 2.8.0 in pseudo-distributed mode.
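
Hadoop itself is the stock single-node setup, i.e. (as I assume matters for the error below, given the hdfs://localhost:9000 path in it) core-site.xml points fs.defaultFS at the local NameNode:

  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>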

Running the command:

deploy/bin/crawl urls crawl 2

The Injector and Generator run successfully, but in the Fetcher I get the
following error:

17/04/30 08:43:48 ERROR fetcher.Fetcher: Fetcher: java.lang.IllegalArgumentException: Wrong FS: hdfs://localhost:9000/user/root/crawl/segments/20170430084337/crawl_fetch, expected: file:///
        at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:665)
        at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:86)
        at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:630)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:861)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:625)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:435)
        at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1436)
        at org.apache.nutch.fetcher.FetcherOutputFormat.checkOutputSpecs(FetcherOutputFormat.java:55)
        at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:270)
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:141)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1338)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1338)
        at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:575)
        at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:570)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:570)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:561)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:870)
        at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:486)
        at org.apache.nutch.fetcher.Fetcher.run(Fetcher.java:521)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
        at org.apache.nutch.fetcher.Fetcher.main(Fetcher.java:495)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:234)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
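
From the trace it looks like FetcherOutputFormat.checkOutputSpecs() ends up calling FileSystem.exists() on the crawl_fetch path against the local filesystem (file:///), even though the segment path is on HDFS. A minimal sketch of what I assume is happening (my own reading of the trace, not the actual Nutch code), just to show how that exact "Wrong FS" message comes about:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class WrongFsSketch {
    public static void main(String[] args) throws Exception {
      Configuration conf = new Configuration();
      Path crawlFetch = new Path(
          "hdfs://localhost:9000/user/root/crawl/segments/20170430084337/crawl_fetch");

      // Asking the *local* filesystem about an hdfs:// path fails checkPath():
      // java.lang.IllegalArgumentException: Wrong FS: hdfs://..., expected: file:///
      FileSystem localFs = FileSystem.getLocal(conf);
      localFs.exists(crawlFetch);

      // Resolving the filesystem from the path itself would go to HDFS instead:
      FileSystem fs = crawlFetch.getFileSystem(conf);
      fs.exists(crawlFetch);
    }
  }

So my guess is that the job is somehow picking up a file:/// default filesystem at submission time instead of hdfs://localhost:9000, but I don't see why.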

 

Error running:

  /data/apache-nutch-1.13/runtime/deploy/bin/nutch fetch -D mapreduce.job.reduces=2
      -D mapred.child.java.opts=-Xmx1000m -D mapreduce.reduce.speculative=false
      -D mapreduce.map.speculative=false -D mapreduce.map.output.compress=true
      -D fetcher.timelimit.mins=180 crawl/segments/20170430084337 -noParsing -threads 50

Failed with exit value 255.

 

 

Any ideas how to fix this?

 

Thanks,

               Yossi.
