The Hadoop version that Nutch 1.6 depends on causes this issue on Windows.
It can be resolved by changing the hadoop-core dependency to *0.20.2* in
$NUTCH_HOME/ivy/ivy.xml:

    <dependency org="org.apache.hadoop" name="hadoop-core"
                rev="0.20.2" conf="*->default">
        <exclude org="hsqldb" name="hsqldb" />
        <exclude org="net.sf.kosmosfs" name="kfs" />
        <exclude org="net.java.dev.jets3t" name="jets3t" />
        <exclude org="org.eclipse.jdt" name="core" />
        <exclude org="org.mortbay.jetty" name="jsp-*" />
        <exclude org="ant" name="ant" />
    </dependency>
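After editing ivy.xml, the project has to be rebuilt so Ivy re-resolves the downgraded dependency. A minimal sketch, assuming you are working from the Nutch 1.6 source distribution with Apache Ant installed (adjust the path to your checkout):

```shell
# Rebuild Nutch so Ivy fetches hadoop-core 0.20.2
# (run from the Nutch source root; requires Apache Ant)
cd $NUTCH_HOME
ant clean runtime
```

The rebuilt runtime under runtime/local is what bin/nutch should be run from.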

This is what worked for me, although there may be a way to fix this
without rolling Hadoop back to 0.20.2 that I'm unaware of.

HTH
Thanks
Chethan


On Tue, Apr 30, 2013 at 6:15 PM, Benjamin Sznajder <[email protected]> wrote:

>
> Hi,
>
> What software is needed to run Nutch 1.6 on Windows?
>
> - I downloaded Nutch.1.6
> - I am using JDK 1.7 from IBM
> - I installed Cygwin and am running Nutch from Cygwin.
>
> However, when launching the basic script, I am getting the following error.
>
> Are there specific steps that are necessary for running on Windows?
>
> Is running Nutch on Windows supported at all, i.e. not just for
> development, but also in production?
>
> Best regards
> Benjamin
>
> benjams@BENJAMS-TP /cygdrive/c/apache-nutch-1.6
> $ bin/nutch inject crawl/crawldb C:/temp/urls
> cygpath: can't convert empty path
> Injector: starting at 2013-04-30 15:42:09
> Injector: crawlDb: crawl/crawldb
> Injector: urlDir: C:/temp/urls
> Injector: Converting injected urls to crawl db entries.
> Injector: java.io.IOException: Failed to set permissions of path: \tmp
> \hadoop-benjams\mapred\staging\benjams-422849630\.staging to 0700
>         at org.apache.hadoop.fs.FileUtil.checkReturnValue
> (FileUtil.java:689)
>         at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:662)
>         at org.apache.hadoop.fs.RawLocalFileSystem.setPermission
> (RawLocalFileSystem.java:509)
>         at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs
> (RawLocalFileSystem.java:344)
>         at org.apache.hadoop.fs.FilterFileSystem.mkdirs
> (FilterFileSystem.java:189)
>         at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir
> (JobSubmissionFiles.java:116)
>         at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:918)
>         at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:912)
>         at java.security.AccessController.doPrivileged
> (AccessController.java:314)
>         at javax.security.auth.Subject.doAs(Subject.java:572)
>         at org.apache.hadoop.security.UserGroupInformation.doAs
> (UserGroupInformation.java:1149)
>         at org.apache.hadoop.mapred.JobClient.submitJobInternal
> (JobClient.java:912)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:886)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1323)
>         at org.apache.nutch.crawl.Injector.inject(Injector.java:281)
>         at org.apache.nutch.crawl.Injector.run(Injector.java:318)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>         at org.apache.nutch.crawl.Injector.main(Injector.java:308)
>
>
>
