Sandy Ryza created MAPREDUCE-4933:
-------------------------------------

             Summary: MR1 merger asks for length of file it just wrote before flushing it
                 Key: MAPREDUCE-4933
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4933
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: mrv1, task
    Affects Versions: 1.1.1
            Reporter: Sandy Ryza
            Assignee: Sandy Ryza

createKVIterator in ReduceTask contains the following code:

{code}
try {
  Merger.writeFile(rIter, writer, reporter, job);
  addToMapOutputFilesOnDisk(fs.getFileStatus(outputPath));
} catch (Exception e) {
  if (null != outputPath) {
    fs.delete(outputPath, true);
  }
  throw new IOException("Final merge failed", e);
} finally {
  if (null != writer) {
    writer.close();
  }
}
{code}

Merger#writeFile() does not close the file after writing it, and the writer is only closed in the finally block, after fs.getFileStatus() has already been called. Some of the data may still be unflushed at that point, so fs.getFileStatus() may not return the correct length. This causes bad accounting further down the line, which can lead to map output data being lost.
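
The ordering problem can be reproduced in miniature outside Hadoop. The sketch below is only an analogy: it uses plain java.io on a local temp file rather than the Hadoop FileSystem API, and the class name LengthBeforeClose and the method measure are hypothetical names invented for illustration. It writes through a buffered stream, asks for the file length before the stream is closed, and asks again afterwards.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical illustration, not Hadoop code: local java.io stands in for
// the writer and fs.getFileStatus() from the snippet in this issue.
public class LengthBeforeClose {

    // Returns {length seen before close, length seen after close}.
    // payloadSize is assumed to be smaller than the 64 KiB buffer.
    static long[] measure(int payloadSize) {
        try {
            File file = File.createTempFile("merge-sketch", ".out");
            file.deleteOnExit();
            long before;
            try (OutputStream out =
                     new BufferedOutputStream(new FileOutputStream(file), 64 * 1024)) {
                out.write(new byte[payloadSize]);
                // Same mistake as in the issue: the length is queried while
                // the written bytes are still sitting in the writer's buffer.
                before = file.length();
            } // try-with-resources closes (and therefore flushes) the stream here
            long after = file.length();
            return new long[] {before, after};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] lengths = measure(1024);
        System.out.println("length before close: " + lengths[0]);
        System.out.println("length after close:  " + lengths[1]);
    }
}
```

In this local analogy the length read before close lags behind what was written, and the length read after close matches it. By the same reasoning, one plausible direction for createKVIterator would be to close the writer before calling fs.getFileStatus(outputPath); that is an assumption about the fix, not something stated in this report.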