I tried setting hadoop.tmp.dir to /cygdrive/d/tmp and it created
D:\cygdrive\d\tmp\mapred\temp\inject-temp-1365510909\_reduce_n7v9vq.
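The D:\cygdrive\d\tmp result is consistent with how Java resolves a drive-less absolute path on Windows: the JVM knows nothing about Cygwin's /cygdrive mount table, and a path beginning with '/' is anchored at the root of the current drive. A minimal illustrative sketch (the Windows result in the comment assumes the JVM's working directory is on D:; on other platforms the path comes back unchanged):

```java
import java.io.File;

public class CygwinPathDemo {
    public static void main(String[] args) {
        // The JVM does not consult Cygwin's mount table, so "/cygdrive/d/tmp"
        // is just a rooted path. On Windows it is resolved against the current
        // drive (e.g. D:\cygdrive\d\tmp when the working directory is on D:),
        // which matches the directory Hadoop actually created.
        File f = new File("/cygdrive/d/tmp");
        System.out.println(f.getAbsolutePath());
    }
}
```

In other words, Cygwin-style paths are passed through to java.io.File verbatim; only native Windows paths (or paths relative to the working directory) resolve to the drive you expect.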

The same error occurred:

2008-02-15 10:19:22,833 WARN  mapred.LocalJobRunner - job_local_1
java.io.IOException: Target file:/D:/cygdrive/d/tmp/mapred/temp/inject-temp-1365510909/_reduce_n7v9vq/part-00000 already exists
       at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:246)
       at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:125)
       at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:116)
       at org.apache.hadoop.fs.RawLocalFileSystem.rename(RawLocalFileSystem.java:180)
       at org.apache.hadoop.fs.ChecksumFileSystem.rename(ChecksumFileSystem.java:394)
       at org.apache.hadoop.mapred.Task.moveTaskOutputs(Task.java:452)
       at org.apache.hadoop.mapred.Task.moveTaskOutputs(Task.java:469)
       at org.apache.hadoop.mapred.Task.saveTaskOutput(Task.java:426)
       at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:165)
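The exception itself comes from a defensive existence check before the rename: as the trace shows, FileUtil.checkDest refuses to move task output onto a target that already exists, so a stale part-00000 left behind by an earlier failed run (or by a second task resolving to the same local path) makes every subsequent attempt fail with this same message. A simplified stand-in for that check, to show the failure mode (the real checkDest operates on Hadoop FileSystem paths, not java.io.File, and CheckDestDemo/moveTaskOutput here are hypothetical names):

```java
import java.io.File;
import java.io.IOException;

public class CheckDestDemo {
    // Simplified stand-in for org.apache.hadoop.fs.FileUtil.checkDest:
    // refuse to move output onto an existing target instead of overwriting it.
    static void checkDest(File dst) throws IOException {
        if (dst.exists()) {
            throw new IOException("Target " + dst + " already exists");
        }
    }

    static void moveTaskOutput(File src, File dst) throws IOException {
        checkDest(dst);
        if (!src.renameTo(dst)) {
            throw new IOException("Rename failed: " + src + " -> " + dst);
        }
    }

    public static void main(String[] args) throws IOException {
        File src = File.createTempFile("part-00000", null);
        File dst = File.createTempFile("part-00000", null); // plays the stale leftover
        try {
            moveTaskOutput(src, dst);
        } catch (IOException e) {
            System.out.println(e.getMessage()); // e.g. "Target /tmp/... already exists"
        } finally {
            src.delete();
            dst.delete();
        }
    }
}
```

Under this reading, deleting the leftover inject-temp directory (or pointing hadoop.tmp.dir at a path that resolves consistently) should let the rename go through again.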

Regards,
Susam Pal

On Thu, Feb 14, 2008 at 10:07 PM, Susam Pal <[EMAIL PROTECTED]> wrote:
> What I did try was setting hadoop.tmp.dir to /opt/tmp. I found the
>  behavior strange. I had an /opt/tmp directory in my Cygwin
>  installation (Absolute Windows path: D:\Cygwin\opt\tmp) and I was
>  expecting Hadoop to use it. However, it created a new D:\opt\tmp and
>  wrote the temp files there. Of course this failed with the same error.
>
>  Right now I don't have a Windows system with me. I will try setting it
>  as /cygdrive/d/tmp/ tomorrow when I again have access to a Windows
>  system and then I'll update the mailing list with the observations.
>  Thanks for the suggestion.
>
>  Regards,
>  Susam Pal
>
>
>
>  On Thu, Feb 14, 2008 at 9:41 PM, Dennis Kubes <[EMAIL PROTECTED]> wrote:
>  > I think what might be occurring is a file path issue with Hadoop.  I
>  >  have seen it in the past.  Can you try on Windows using the cygdrive
>  >  path and see if that works?  For below it would be /cygdrive/D/tmp/ ...
>  >
>  >  Dennis
>  >
>  >
>  >
>  >  Susam Pal wrote:
>  >  > I can confirm this error as I just tried running the latest revision
>  >  > of Nutch, rev-620818, on Debian as well as under Cygwin on Windows.
>  >  >
>  >  > It works fine on Debian but fails on Cygwin with this error:
>  >  >
>  >  > 2008-02-14 19:49:47,756 WARN  regex.RegexURLNormalizer - can't find
>  >  > rules for scope 'inject', using default
>  >  > 2008-02-14 19:49:48,381 WARN  mapred.LocalJobRunner - job_local_1
>  >  > java.io.IOException: Target file:/D:/tmp/hadoop-guest/mapred/temp/inject-temp-322737506/_reduce_bjm6rw/part-00000 already exists
>  >  >       at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:246)
>  >  >       at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:125)
>  >  >       at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:116)
>  >  >       at org.apache.hadoop.fs.RawLocalFileSystem.rename(RawLocalFileSystem.java:196)
>  >  >       at org.apache.hadoop.fs.ChecksumFileSystem.rename(ChecksumFileSystem.java:394)
>  >  >       at org.apache.hadoop.mapred.Task.moveTaskOutputs(Task.java:452)
>  >  >       at org.apache.hadoop.mapred.Task.moveTaskOutputs(Task.java:469)
>  >  >       at org.apache.hadoop.mapred.Task.saveTaskOutput(Task.java:426)
>  >  >       at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:165)
>  >  > 2008-02-14 19:49:49,225 FATAL crawl.Injector - Injector:
>  >  > java.io.IOException: Job failed!
>  >  >       at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:831)
>  >  >       at org.apache.nutch.crawl.Injector.inject(Injector.java:162)
>  >  >       at org.apache.nutch.crawl.Injector.run(Injector.java:192)
>  >  >       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>  >  >       at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:54)
>  >  >       at org.apache.nutch.crawl.Injector.main(Injector.java:182)
>  >  >
>  >  > Indeed, the 'inject-temp-322737506' directory is present in the
>  >  > specified folder on the D: drive and doesn't get deleted.
>  >  >
>  >  > Is this because multiple map/reduce tasks are running and one of
>  >  > them finds the directory already present and therefore fails?
>  >  >
>  >  > So, I also tried setting this in 'conf/hadoop-site.xml':
>  >  >
>  >  > <property>
>  >  > <name>mapred.speculative.execution</name>
>  >  > <value>false</value>
>  >  > <description></description>
>  >  > </property>
>  >  >
>  >  > I wonder why the same issue doesn't occur on Linux. I am not well
>  >  > acquainted with the Hadoop code yet. Could someone shed light on
>  >  > what might be going wrong?
>  >  >
>  >  > Regards,
>  >  > Susam Pal
>  >  >
>  >  > On 2/7/08, DS jha <[EMAIL PROTECTED]> wrote:
>  >  >> Hi -
>  >  >> Looks like the latest trunk version of Nutch is failing with the
>  >  >> following exception when trying to perform the inject operation:
>  >  >>
>  >  >> java.io.IOException: Target file:/tmp/hadoop-user/mapred/temp/inject-temp-1280136828/_reduce_dv90x0/part-00000 already exists
>  >  >>         at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:246)
>  >  >>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:125)
>  >  >>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:116)
>  >  >>         at org.apache.hadoop.fs.RawLocalFileSystem.rename(RawLocalFileSystem.java:196)
>  >  >>         at org.apache.hadoop.fs.ChecksumFileSystem.rename(ChecksumFileSystem.java:394)
>  >  >>         at org.apache.hadoop.mapred.Task.moveTaskOutputs(Task.java:452)
>  >  >>         at org.apache.hadoop.mapred.Task.moveTaskOutputs(Task.java:469)
>  >  >>         at org.apache.hadoop.mapred.Task.saveTaskOutput(Task.java:426)
>  >  >>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:165)
>  >  >>
>  >  >> Any thoughts?
>  >  >>
>  >  >> Thanks
>  >  >> Jha
>  >  >>
>  >
>
