Any resolution to this problem? I just tried installing on Windows and I'm hitting the same error.
Susam Pal wrote:

> I tried setting hadoop.tmp.dir to /cygdrive/d/tmp and it created
> D:\cygdrive\d\tmp\mapred\temp\inject-temp-1365510909\_reduce_n7v9vq.
>
> The same error occurred:-
>
> 2008-02-15 10:19:22,833 WARN  mapred.LocalJobRunner - job_local_1
> java.io.IOException: Target file:/D:/cygdrive/d/tmp/mapred/temp/inject-temp-1365510909/_reduce_n7v9vq/part-00000 already exists
>         at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:246)
>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:125)
>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:116)
>         at org.apache.hadoop.fs.RawLocalFileSystem.rename(RawLocalFileSystem.java:180)
>         at org.apache.hadoop.fs.ChecksumFileSystem.rename(ChecksumFileSystem.java:394)
>         at org.apache.hadoop.mapred.Task.moveTaskOutputs(Task.java:452)
>         at org.apache.hadoop.mapred.Task.moveTaskOutputs(Task.java:469)
>         at org.apache.hadoop.mapred.Task.saveTaskOutput(Task.java:426)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:165)
>
> Regards,
> Susam Pal
>
> On Thu, Feb 14, 2008 at 10:07 PM, Susam Pal <[EMAIL PROTECTED]> wrote:
>> What I did try was setting hadoop.tmp.dir to /opt/tmp. I found the
>> behavior strange. I had an /opt/tmp directory in my Cygwin
>> installation (absolute Windows path: D:\Cygwin\opt\tmp) and I was
>> expecting Hadoop to use it. However, it created a new D:\opt\tmp and
>> wrote the temp files there. Of course this failed with the same error.
>>
>> Right now I don't have a Windows system with me. I will try setting it
>> as /cygdrive/d/tmp/ tomorrow when I again have access to a Windows
>> system and then I'll update the mailing list with the observations.
>> Thanks for the suggestion.
>>
>> Regards,
>> Susam Pal
>>
>> On Thu, Feb 14, 2008 at 9:41 PM, Dennis Kubes <[EMAIL PROTECTED]> wrote:
>>> I think what might be occurring is a file path issue with hadoop. I
>>> have seen it in the past. Can you try on windows using the cygdrive
>>> path and see if that works? For below it would be /cygdrive/D/tmp/ ...
>>>
>>> Dennis
>>>
>>> Susam Pal wrote:
>>>> I can confirm this error as I just tried running the last revision
>>>> of Nutch, rev-620818, on Debian as well as Cygwin on Windows.
>>>>
>>>> It works fine on Debian but fails on Cygwin with this error:-
>>>>
>>>> 2008-02-14 19:49:47,756 WARN  regex.RegexURLNormalizer - can't find
>>>> rules for scope 'inject', using default
>>>> 2008-02-14 19:49:48,381 WARN  mapred.LocalJobRunner - job_local_1
>>>> java.io.IOException: Target file:/D:/tmp/hadoop-guest/mapred/temp/inject-temp-322737506/_reduce_bjm6rw/part-00000 already exists
>>>>         at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:246)
>>>>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:125)
>>>>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:116)
>>>>         at org.apache.hadoop.fs.RawLocalFileSystem.rename(RawLocalFileSystem.java:196)
>>>>         at org.apache.hadoop.fs.ChecksumFileSystem.rename(ChecksumFileSystem.java:394)
>>>>         at org.apache.hadoop.mapred.Task.moveTaskOutputs(Task.java:452)
>>>>         at org.apache.hadoop.mapred.Task.moveTaskOutputs(Task.java:469)
>>>>         at org.apache.hadoop.mapred.Task.saveTaskOutput(Task.java:426)
>>>>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:165)
>>>> 2008-02-14 19:49:49,225 FATAL crawl.Injector - Injector: java.io.IOException: Job failed!
>>>>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:831)
>>>>         at org.apache.nutch.crawl.Injector.inject(Injector.java:162)
>>>>         at org.apache.nutch.crawl.Injector.run(Injector.java:192)
>>>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>>>         at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:54)
>>>>         at org.apache.nutch.crawl.Injector.main(Injector.java:182)
>>>>
>>>> Indeed the 'inject-temp-322737506' directory is present in the
>>>> specified folder of the D drive and doesn't get deleted.
>>>>
>>>> Is this because multiple map/reduce tasks are running and one of
>>>> them finds the directory to be present and therefore fails?
>>>>
>>>> So, I also tried setting this in 'conf/hadoop-site.xml':-
>>>>
>>>> <property>
>>>>   <name>mapred.speculative.execution</name>
>>>>   <value>false</value>
>>>>   <description></description>
>>>> </property>
>>>>
>>>> I wonder why the same issue doesn't occur in Linux. I am not well
>>>> acquainted with the Hadoop code yet. Could someone throw light on
>>>> what might be going wrong?
>>>>
>>>> Regards,
>>>> Susam Pal
>>>>
>>>> On 2/7/08, DS jha <[EMAIL PROTECTED]> wrote:
>>>>> Hi -
>>>>>
>>>>> Looks like the latest trunk version of Nutch is failing with the
>>>>> following exception when trying to perform the inject operation:
>>>>>
>>>>> java.io.IOException: Target file:/tmp/hadoop-user/mapred/temp/inject-temp-1280136828/_reduce_dv90x0/part-00000 already exists
>>>>>         at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:246)
>>>>>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:125)
>>>>>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:116)
>>>>>         at org.apache.hadoop.fs.RawLocalFileSystem.rename(RawLocalFileSystem.java:196)
>>>>>         at org.apache.hadoop.fs.ChecksumFileSystem.rename(ChecksumFileSystem.java:394)
>>>>>         at org.apache.hadoop.mapred.Task.moveTaskOutputs(Task.java:452)
>>>>>         at org.apache.hadoop.mapred.Task.moveTaskOutputs(Task.java:469)
>>>>>         at org.apache.hadoop.mapred.Task.saveTaskOutput(Task.java:426)
>>>>>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:165)
>>>>>
>>>>> Any thoughts?
>>>>>
>>>>> Thanks
>>>>> Jha

--
View this message in context: http://www.nabble.com/nutch-latest-build---inject-operation-failing-tp15328068p15726097.html
Sent from the Nutch - Dev mailing list archive at Nabble.com.
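On the question of why the failure shows up only on Windows: a likely explanation (an assumption, not confirmed in this thread) is the platform difference in rename semantics. POSIX rename() silently replaces an existing target, while a Windows rename typically fails when the target already exists, so a leftover part-00000 only becomes fatal on Windows. A minimal sketch, using illustrative file names rather than Hadoop's actual paths:

```java
import java.io.File;
import java.io.IOException;

public class RenameDemo {
    public static void main(String[] args) throws IOException {
        // Two temp files standing in for a task's output file and its
        // final destination (names are illustrative only).
        File src = File.createTempFile("task-output", ".tmp");
        File dst = File.createTempFile("part-00000", ".tmp");

        // dst already exists, like the stale inject-temp output in
        // the reports above.
        boolean renamed = src.renameTo(dst);

        // On Linux this rename replaces dst and returns true; on
        // Windows File.renameTo typically fails when the target
        // exists, which is consistent with the error surfacing only
        // in the Cygwin runs above.
        System.out.println("rename succeeded: " + renamed);

        // Clean up whichever files remain.
        src.delete();
        dst.delete();
    }
}
```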
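For anyone wanting to try the two settings discussed in this thread together, a conf/hadoop-site.xml sketch follows. This only collects what was attempted above; neither setting is confirmed here as a fix, and the Cygwin-style path should be adjusted to your own drive layout:

```xml
<!-- Sketch of the settings tried in this thread; adjust the path. -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/cygdrive/d/tmp</value>
  <description>Base for temporary files (Cygwin-style path, as
  suggested by Dennis above).</description>
</property>
<property>
  <name>mapred.speculative.execution</name>
  <value>false</value>
  <description>Disable speculative execution, as tried above.</description>
</property>
```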