Hi,
I was trying to test Nutch 1.10 with Hadoop 2.7.1, but I ran into some
errors during the inject phase.
> I was executing $NUTCH_HOME/runtime/deploy/bin/crawl -i /home/nutch/urls
> /home/nutch/crawl/ 1
15/09/10 19:41:17 ERROR crawl.Injector: Injector:
> java.lang.IllegalArgumentException: Wrong FS: hdfs://localhost:9000/user/root/inject-temp-875522145, expected: file:///
> at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:646)
> at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:82)
> at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:601)
> at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:819)
> at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:596)
> at org.apache.hadoop.fs.FileSystem.isDirectory(FileSystem.java:1437)
> at org.apache.hadoop.fs.ChecksumFileSystem.rename(ChecksumFileSystem.java:506)
> at org.apache.nutch.crawl.CrawlDb.install(CrawlDb.java:168)
> at org.apache.nutch.crawl.Injector.inject(Injector.java:356)
> at org.apache.nutch.crawl.Injector.run(Injector.java:379)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.nutch.crawl.Injector.main(Injector.java:369)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
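As far as I can tell from the trace, the message is raised by a path check in which a local file system instance (RawLocalFileSystem) is handed an hdfs:// path. Below is a minimal sketch of that kind of scheme comparison, using only java.net.URI. The class name WrongFsSketch and its checkPath helper are hypothetical and written for illustration; this is not Hadoop's actual code.

```java
import java.net.URI;

public class WrongFsSketch {

    // Hypothetical helper: rejects a path whose URI scheme differs from
    // the scheme of the file system that is asked to handle it.
    public static void checkPath(URI fsUri, URI path) {
        String pathScheme = path.getScheme();
        // A path without a scheme is assumed to belong to this file system.
        if (pathScheme == null) {
            return;
        }
        if (!pathScheme.equalsIgnoreCase(fsUri.getScheme())) {
            throw new IllegalArgumentException(
                "Wrong FS: " + path + ", expected: " + fsUri);
        }
    }

    public static void main(String[] args) {
        // The two URIs are taken from the error message above.
        URI localFs = URI.create("file:///");
        URI hdfsPath = URI.create(
            "hdfs://localhost:9000/user/root/inject-temp-875522145");
        try {
            checkPath(localFs, hdfsPath);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Running the sketch with the two URIs from my log produces a message of the same shape as the one in the trace, which suggests that the rename in CrawlDb.install is being attempted on a local file system object rather than on HDFS.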
My configuration files (Hadoop 2.7.1) are given below.
-------- core-site.xml --------
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/nutch/hadoopData/hadoopTmpDir</value>
</property>
-------- hdfs-site.xml --------
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/nutch/hadoopData/nameNodeData</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/nutch/hadoopData/dataNodeData</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
-------- mapred-site.xml --------
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
<property>
<name>mapred.system.dir</name>
<value>/home/nutch/hadoopData/mapredJobTrackerData</value>
</property>
<property>
<name>mapred.local.dir</name>
<value>/home/nutch/hadoopData/mapredTaskTrackerData</value>
</property>
The same command works successfully when I use Hadoop 1.2.1.
Which version of Hadoop is preferred for use with Apache Nutch 1.10?
Thank you so much.
Imtiaz Shakil Siddique