Sandy Ryza created MAPREDUCE-4933:
-------------------------------------

             Summary: MR1 merger asks for length of file it just wrote before flushing it
                 Key: MAPREDUCE-4933
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4933
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: mrv1, task
    Affects Versions: 1.1.1
            Reporter: Sandy Ryza
            Assignee: Sandy Ryza


createKVIterator in ReduceTask contains the following code:
{code}

          try {
            Merger.writeFile(rIter, writer, reporter, job);
            addToMapOutputFilesOnDisk(fs.getFileStatus(outputPath));
          } catch (Exception e) {
            if (null != outputPath) {
              fs.delete(outputPath, true);
            }
            throw new IOException("Final merge failed", e);
          } finally {
            if (null != writer) {
              writer.close();
            }
          }
{code}

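Merger#writeFile() does not close the writer after writing the file; the writer 
is closed only in the finally block, after fs.getFileStatus() has already been 
called on outputPath.  Data still buffered in the writer may not have reached 
the file at that point, so the reported length may be too small.  This causes 
bad accounting further down the line, which can lead to map output data being 
lost.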
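
The effect can be reproduced outside Hadoop. The following is a minimal sketch, assuming a local file and a plain java.io.BufferedOutputStream in place of the MR1 writer; the class name LengthAfterClose and the method measure() are hypothetical and are not part of Hadoop. It only shows that a length queried before the stream is closed can miss buffered bytes, which suggests closing the writer before calling fs.getFileStatus().

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-alone demo; it does not use any Hadoop classes.
public class LengthAfterClose {

    /**
     * Writes payloadBytes through a buffered stream and returns
     * {length seen before close, length seen after close}.
     */
    static long[] measure(int payloadBytes) throws IOException {
        Path path = Files.createTempFile("merge-demo", ".out");
        try {
            long before;
            try (OutputStream out =
                     new BufferedOutputStream(Files.newOutputStream(path), 8192)) {
                out.write(new byte[payloadBytes]);
                // Same ordering as the reported bug: ask for the length
                // while the data may still sit in the stream's buffer.
                before = Files.size(path);
            }
            // The stream is closed here, so the buffer has been flushed.
            long after = Files.size(path);
            return new long[] {before, after};
        } finally {
            Files.deleteIfExists(path);
        }
    }

    public static void main(String[] args) throws IOException {
        long[] sizes = measure(1024);
        System.out.println("before close: " + sizes[0]
            + ", after close: " + sizes[1]);
    }
}
```

With a payload smaller than the buffer, the length read before close does not include the buffered bytes, while the length read after close does.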

