Hi Yossi,

strange error, indeed. Is it also reproducible in pseudo-distributed mode using Hadoop 2.7.2, the version Nutch depends on?
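An untested sketch, just to illustrate what I mean (the install path /opt/hadoop-2.7.2 is only a placeholder, and as far as I remember deploy/bin/nutch simply runs whatever "hadoop" executable is first on the PATH):

  # assumption: Hadoop 2.7.2 unpacked under /opt/hadoop-2.7.2 and already
  # configured for pseudo-distributed mode (core-site.xml, hdfs-site.xml)
  export HADOOP_HOME=/opt/hadoop-2.7.2
  export PATH="$HADOOP_HOME/bin:$PATH"
  $HADOOP_HOME/sbin/start-dfs.sh       # start HDFS of that version
  $HADOOP_HOME/sbin/start-yarn.sh      # start YARN of that version
  runtime/deploy/bin/crawl urls crawl 2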
Could you also add the line "set -x" to bin/nutch and run bin/crawl again, to see how all steps are executed?
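Just to be explicit about that change (the exact surrounding lines in your copy of the script may look a bit different):

  # runtime/deploy/bin/nutch, added near the top, right after the shebang line
  set -x   # echo every command, with variables expanded, before it is executed

The trace should then show the full "hadoop jar ..." command line that each step of bin/crawl ends up running.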
Thanks,
Sebastian

On 04/30/2017 04:04 PM, Yossi Tamari wrote:
> Hi,
>
> I'm trying to run Nutch 1.13 on Hadoop 2.8.0 in pseudo-distributed mode.
>
> Running the command:
>
>   Deploy/bin/crawl urls crawl 2
>
> The Injector and Generator run successfully, but in the Fetcher I get the
> following error:
>
> 17/04/30 08:43:48 ERROR fetcher.Fetcher: Fetcher: java.lang.IllegalArgumentException:
> Wrong FS: hdfs://localhost:9000/user/root/crawl/segments/20170430084337/crawl_fetch,
> expected: file:///
>         at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:665)
>         at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:86)
>         at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:630)
>         at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:861)
>         at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:625)
>         at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:435)
>         at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1436)
>         at org.apache.nutch.fetcher.FetcherOutputFormat.checkOutputSpecs(FetcherOutputFormat.java:55)
>         at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:270)
>         at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:141)
>         at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>         at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1338)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
>         at org.apache.hadoop.mapreduce.Job.submit(Job.java:1338)
>         at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:575)
>         at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:570)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
>         at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:570)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:561)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:870)
>         at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:486)
>         at org.apache.nutch.fetcher.Fetcher.run(Fetcher.java:521)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>         at org.apache.nutch.fetcher.Fetcher.main(Fetcher.java:495)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at org.apache.hadoop.util.RunJar.run(RunJar.java:234)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
>
> Error running:
>   /data/apache-nutch-1.13/runtime/deploy/bin/nutch fetch -D mapreduce.job.reduces=2
>   -D mapred.child.java.opts=-Xmx1000m -D mapreduce.reduce.speculative=false
>   -D mapreduce.map.speculative=false -D mapreduce.map.output.compress=true
>   -D fetcher.timelimit.mins=180 crawl/segments/20170430084337 -noParsing -threads 50
>
> Failed with exit value 255.
>
> Any ideas how to fix this?
>
> Thanks,
> Yossi.

