Hi,
Setting the MapReduce framework to YARN solved this issue.
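For anyone hitting the same error: the stack trace shows RawLocalFileSystem, so the job was presumably falling back to the local job runner (file:///), which rejects the hdfs:// segment paths. The setting that fixed it for me lives in mapred-site.xml (a minimal sketch of that file):

```xml
<?xml version="1.0"?>
<!-- $HADOOP_CONF_DIR/mapred-site.xml: submit MapReduce jobs to YARN
     instead of the default local runner -->
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```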
Yossi.
From: Yossi Tamari [mailto:[email protected]]
Sent: 30 April 2017 17:04
To: [email protected]
Subject: Wrong FS exception in Fetcher
Hi,
I'm trying to run Nutch 1.13 on Hadoop 2.8.0 in pseudo-distributed mode.
Running the command:
deploy/bin/crawl urls crawl 2
The Injector and Generator run successfully, but in the Fetcher I get the
following error:
17/04/30 08:43:48 ERROR fetcher.Fetcher: Fetcher: java.lang.IllegalArgumentException: Wrong FS: hdfs://localhost:9000/user/root/crawl/segments/20170430084337/crawl_fetch, expected: file:///
	at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:665)
	at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:86)
	at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:630)
	at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:861)
	at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:625)
	at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:435)
	at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1436)
	at org.apache.nutch.fetcher.FetcherOutputFormat.checkOutputSpecs(FetcherOutputFormat.java:55)
	at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:270)
	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:141)
	at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
	at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1338)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1338)
	at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:575)
	at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:570)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
	at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:570)
	at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:561)
	at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:870)
	at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:486)
	at org.apache.nutch.fetcher.Fetcher.run(Fetcher.java:521)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
	at org.apache.nutch.fetcher.Fetcher.main(Fetcher.java:495)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:234)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Error running:
/data/apache-nutch-1.13/runtime/deploy/bin/nutch fetch -D mapreduce.job.reduces=2 -D mapred.child.java.opts=-Xmx1000m -D mapreduce.reduce.speculative=false -D mapreduce.map.speculative=false -D mapreduce.map.output.compress=true -D fetcher.timelimit.mins=180 crawl/segments/20170430084337 -noParsing -threads 50
Failed with exit value 255.
Any ideas how to fix this?
Thanks,
Yossi.