Hi Yossi,

strange error, indeed. Is it also reproducible in pseudo-distributed mode using
Hadoop 2.7.2, the version Nutch depends on?

Could you also add the line
  set -x
to bin/nutch and run bin/crawl again, to see how all steps are executed?
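
Just a guess at the cause: "Wrong FS: hdfs://..., expected: file:///"
usually means an hdfs:// path was handed to the local filesystem, i.e.
the filesystem was resolved from the default fs.defaultFS instead of
from the path itself. A minimal standalone sketch of the difference
(demo code under that assumption, not the actual FetcherOutputFormat):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class WrongFsDemo {
    public static void main(String[] args) throws Exception {
      // Assume no core-site.xml on the classpath, so fs.defaultFS
      // stays at its built-in default, file:///
      Configuration conf = new Configuration();
      Path p = new Path("hdfs://localhost:9000/user/root/crawl/segments/x");

      // Correct: resolve the filesystem from the path's own scheme.
      FileSystem hdfs = p.getFileSystem(conf);
      System.out.println(hdfs.getUri()); // hdfs://localhost:9000

      // Wrong: take the default (local) filesystem and ask it about an
      // hdfs:// path -- exists() throws IllegalArgumentException: Wrong FS
      FileSystem local = FileSystem.get(conf);
      local.exists(p);
    }
  }

The trace (FilterFileSystem -> RawLocalFileSystem -> checkPath) looks like
the second case, so it would be interesting to see which filesystem
FetcherOutputFormat.checkOutputSpecs gets on 2.8.0.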

Thanks,
Sebastian

On 04/30/2017 04:04 PM, Yossi Tamari wrote:
> Hi,
> 
> I'm trying to run Nutch 1.13 on Hadoop 2.8.0 in pseudo-distributed mode.
> 
> Running the command:
> 
> deploy/bin/crawl urls crawl 2
> 
> The Injector and Generator run successfully, but in the Fetcher I get the
> following error:
> 
> 17/04/30 08:43:48 ERROR fetcher.Fetcher: Fetcher: java.lang.IllegalArgumentException: Wrong FS: hdfs://localhost:9000/user/root/crawl/segments/20170430084337/crawl_fetch, expected: file:///
>         at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:665)
>         at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:86)
>         at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:630)
>         at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:861)
>         at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:625)
>         at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:435)
>         at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1436)
>         at org.apache.nutch.fetcher.FetcherOutputFormat.checkOutputSpecs(FetcherOutputFormat.java:55)
>         at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:270)
>         at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:141)
>         at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>         at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1338)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
>         at org.apache.hadoop.mapreduce.Job.submit(Job.java:1338)
>         at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:575)
>         at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:570)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
>         at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:570)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:561)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:870)
>         at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:486)
>         at org.apache.nutch.fetcher.Fetcher.run(Fetcher.java:521)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>         at org.apache.nutch.fetcher.Fetcher.main(Fetcher.java:495)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at org.apache.hadoop.util.RunJar.run(RunJar.java:234)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
> 
> Error running:
>   /data/apache-nutch-1.13/runtime/deploy/bin/nutch fetch -D mapreduce.job.reduces=2 -D mapred.child.java.opts=-Xmx1000m -D mapreduce.reduce.speculative=false -D mapreduce.map.speculative=false -D mapreduce.map.output.compress=true -D fetcher.timelimit.mins=180 crawl/segments/20170430084337 -noParsing -threads 50
> Failed with exit value 255.
> 
> Any ideas how to fix this?
> 
> Thanks,
> 
>                Yossi.