Hi,

Nutch 1.10 is supposed to run with Hadoop 1.2.0. Nutch 1.11 (to be released soon) will run with 2.4.0, and probably also with newer Hadoop versions.
If you need Nutch with a recent Hadoop version right now, you could build it by yourself from trunk.

Cheers,
Sebastian

2015-09-11 16:14 GMT+02:00 Imtiaz Shakil Siddique <[email protected]>:

> Hi,
>
> I was trying to test nutch 1.10 with Hadoop-2.7.1 but during the inject
> phase I came across with some errors.
>
> I was executing $NUTCH_HOME/runtime/deploy/bin/crawl -i /home/nutch/urls
> /home/nutch/crawl/ 1
>
> 15/09/10 19:41:17 ERROR crawl.Injector: Injector:
> java.lang.IllegalArgumentException: Wrong FS:
> hdfs://localhost:9000/user/root/inject-temp-875522145, expected: file:///
>     at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:646)
>     at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:82)
>     at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:601)
>     at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:819)
>     at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:596)
>     at org.apache.hadoop.fs.FileSystem.isDirectory(FileSystem.java:1437)
>     at org.apache.hadoop.fs.ChecksumFileSystem.rename(ChecksumFileSystem.java:506)
>     at org.apache.nutch.crawl.CrawlDb.install(CrawlDb.java:168)
>     at org.apache.nutch.crawl.Injector.inject(Injector.java:356)
>     at org.apache.nutch.crawl.Injector.run(Injector.java:379)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>     at org.apache.nutch.crawl.Injector.main(Injector.java:369)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:497)
>     at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
>
> my conf file (hadoop-2.7.1) is given below
>
> -------- core-site.xml --------
>
> <property>
>   <name>fs.default.name</name>
>   <value>hdfs://localhost:9000</value>
> </property>
> <property>
>   <name>hadoop.tmp.dir</name>
>   <value>/home/nutch/hadoopData/hadoopTmpDir</value>
> </property>
>
> -------- hdfs-site.xml --------
>
> <property>
>   <name>dfs.namenode.name.dir</name>
>   <value>/home/nutch/hadoopData/nameNodeData</value>
> </property>
>
> <property>
>   <name>dfs.datanode.data.dir</name>
>   <value>/home/nutch/hadoopData/dataNodeData</value>
> </property>
>
> <property>
>   <name>dfs.replication</name>
>   <value>1</value>
> </property>
>
> -------- mapred-site.xml --------
>
> <property>
>   <name>mapred.job.tracker</name>
>   <value>localhost:9001</value>
> </property>
> <property>
>   <name>mapred.system.dir</name>
>   <value>/home/nutch/hadoopData/mapredJobTrackerData</value>
> </property>
> <property>
>   <name>mapred.local.dir</name>
>   <value>/home/nutch/hadoopData/mapredTaskTrackerData</value>
> </property>
>
> But the same command works successfully when I use Hadoop-1.2.1.
> What is the preferred version of Hadoop that we should use with Apache
> Nutch 1.10
>
> Thank you so much.
> Imtiaz Shakil Siddique

