Rob Reeves created MAPREDUCE-7500:
-------------------------------------

             Summary: Support optimistic file renames in the commit protocol
                 Key: MAPREDUCE-7500
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7500
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: client
            Reporter: Rob Reeves


During a commit in FileOutputCommitter, each file commit first checks whether a 
file or directory already exists at the destination and, if so, deletes it 
before the rename. This FileSystem.getFileStatus call can account for a 
significant share of the total commit time. In the common case, however, nothing 
exists at the destination, so the getFileStatus call is wasted time. The commit 
protocol can avoid this cost by optimistically assuming that the destination 
does not exist and attempting the delete only if the rename fails. In our HDFS 
environment, this change reduced the commit time by 70%.
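
To illustrate the idea, here is a minimal sketch that contrasts the current 
check-then-rename flow with the proposed optimistic flow. It is not the 
FileOutputCommitter code: SimpleFs, InMemoryFs, checkThenRename and 
optimisticRename are hypothetical names, and the in-memory rename is assumed to 
return false whenever the destination already exists. Real FileSystem 
implementations may treat an existing destination differently, so an actual 
patch would have to handle those cases.

```java
import java.util.HashSet;
import java.util.Set;

class OptimisticRename {
    /** Hypothetical stand-in for the few file system calls the commit uses. */
    interface SimpleFs {
        boolean exists(String path);             // models the getFileStatus probe
        boolean delete(String path);
        boolean rename(String src, String dst);  // assumed: false if dst exists
    }

    /** In-memory model that counts how often the existence probe is issued. */
    static class InMemoryFs implements SimpleFs {
        final Set<String> paths = new HashSet<>();
        int probeCalls = 0;

        public boolean exists(String path) {
            probeCalls++;
            return paths.contains(path);
        }

        public boolean delete(String path) {
            return paths.remove(path);
        }

        public boolean rename(String src, String dst) {
            if (!paths.contains(src) || paths.contains(dst)) {
                return false;
            }
            paths.remove(src);
            paths.add(dst);
            return true;
        }
    }

    /** Current behaviour: always probe the destination before renaming. */
    static void checkThenRename(SimpleFs fs, String src, String dst) {
        if (fs.exists(dst)) {
            fs.delete(dst);
        }
        if (!fs.rename(src, dst)) {
            throw new IllegalStateException("Could not rename " + src + " to " + dst);
        }
    }

    /** Proposed behaviour: rename first, clean up and retry only on failure. */
    static void optimisticRename(SimpleFs fs, String src, String dst) {
        if (fs.rename(src, dst)) {
            return; // happy path: no probe, no delete
        }
        fs.delete(dst); // destination may be stale output from an earlier attempt
        if (!fs.rename(src, dst)) {
            throw new IllegalStateException("Could not rename " + src + " to " + dst);
        }
    }

    public static void main(String[] args) {
        InMemoryFs a = new InMemoryFs();
        a.paths.add("tmp/part-0");
        checkThenRename(a, "tmp/part-0", "out/part-0");

        InMemoryFs b = new InMemoryFs();
        b.paths.add("tmp/part-0");
        optimisticRename(b, "tmp/part-0", "out/part-0");

        System.out.println("probes, check-then-rename: " + a.probeCalls);
        System.out.println("probes, optimistic: " + b.probeCalls);
    }
}
```

In this model the optimistic path issues no existence probe when the 
destination is absent, and pays for a delete plus a second rename only when 
the first rename fails.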




