Author: tomwhite
Date: Sat Apr 21 11:55:38 2007
New Revision: 531084

URL: http://svn.apache.org/viewvc?view=rev&rev=531084
Log:
HADOOP-1152. Fix race condition in MapOutputCopier.copyOutput file rename causing possible reduce task hang. Contributed by Tahir Hashmi.

Modified:
    lucene/hadoop/trunk/CHANGES.txt
    lucene/hadoop/trunk/src/java/org/apache/hadoop/mapred/ReduceTask.java

Modified: lucene/hadoop/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/CHANGES.txt?view=diff&rev=531084&r1=531083&r2=531084
==============================================================================
--- lucene/hadoop/trunk/CHANGES.txt (original)
+++ lucene/hadoop/trunk/CHANGES.txt Sat Apr 21 11:55:38 2007
@@ -234,6 +234,10 @@
 70. HADOOP-1275. Fix misspelled job notification property in
     hadoop-default.xml. (Alejandro Abdelnur via tomwhite)
 
+71. HADOOP-1152. Fix race condition in MapOutputCopier.copyOutput file
+    rename causing possible reduce task hang.
+    (Tahir Hashmi via tomwhite)
+
 
 Release 0.12.3 - 2007-04-06
 

Modified: lucene/hadoop/trunk/src/java/org/apache/hadoop/mapred/ReduceTask.java
URL: http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/java/org/apache/hadoop/mapred/ReduceTask.java?view=diff&rev=531084&r1=531083&r2=531084
==============================================================================
--- lucene/hadoop/trunk/src/java/org/apache/hadoop/mapred/ReduceTask.java (original)
+++ lucene/hadoop/trunk/src/java/org/apache/hadoop/mapred/ReduceTask.java Sat Apr 21 11:55:38 2007
@@ -698,14 +698,17 @@
           fs.delete(tmpFilename);
           return CopyResult.OBSOLETE;
         }
+
+        bytes = fs.getLength(tmpFilename);
         // if we can't rename the file, something is broken (and IOException
         // will be thrown).
         if (!fs.rename(tmpFilename, finalFilename)) {
           fs.delete(tmpFilename);
+          bytes = -1;
           throw new IOException("failure to rename map output " +
                                 tmpFilename);
         }
-        bytes = fs.getLength(finalFilename);
+
         LOG.info(reduceId + " done copying " + loc.getMapTaskId() +
                  " output from " + loc.getHost() + ".");
         //Create a thread to do merges. Synchronize access/update to
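
Note on the pattern: the ReduceTask.java hunk reads the byte count from the temporary file before the rename instead of from the final file after it. This is presumably because the final file may be touched by another thread as soon as it becomes visible. The sketch below illustrates only that ordering, under stated assumptions: it uses java.nio.file rather than the Hadoop FileSystem API, and the class name RenameThenReport, the method commit, and the file names are hypothetical and not taken from the commit.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration; not the Hadoop implementation.
class RenameThenReport {

    /**
     * Moves tmp to dest and returns the number of bytes moved.
     * The size is read from tmp BEFORE the move, so the result does not
     * depend on dest still existing (or being unchanged) afterwards.
     */
    static long commit(Path tmp, Path dest) throws IOException {
        long bytes = Files.size(tmp); // measure while only this thread uses the file
        // Files.move throws IOException on failure, unlike a boolean-returning rename.
        Files.move(tmp, dest, StandardCopyOption.REPLACE_EXISTING);
        return bytes;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("copy-demo");
        Path tmp = dir.resolve("map_0.out.tmp");
        Path dest = dir.resolve("map_0.out");
        Files.write(tmp, new byte[] {1, 2, 3, 4, 5});

        long bytes = commit(tmp, dest);

        // Simulate another thread consuming the final file right after the rename.
        Files.delete(dest);

        // The byte count is still valid because it was taken before the rename.
        System.out.println("copied " + bytes + " bytes"); // prints "copied 5 bytes"
    }
}
```

Had commit called Files.size(dest) after the move, the simulated deletion in main could have happened between the move and the size call, and the size lookup would then fail.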