[ https://issues.apache.org/jira/browse/MAPREDUCE-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arun C Murthy updated MAPREDUCE-3902:
-------------------------------------
    Attachment: MAPREDUCE-3902.patch

Ok, I spent a long (isolated) flight on this - it clearly needs more work, but it's a start. *smile*

This patch improves the classic JVM re-use on both dimensions described in the jira.

We need to pay more attention to the user interface, some options:
# Allow user to specify actual number of map slots to be used (supported now, in the patch)
# Allow user to specify a target block-size for maps (which is greater than real HDFS block size) i.e. get around the small-files problem.

Thoughts?

> MR AM should reuse containers for map tasks
> -------------------------------------------
>
>                 Key: MAPREDUCE-3902
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3902
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: applicationmaster, mrv2
>            Reporter: Arun C Murthy
>            Assignee: Arun C Murthy
>         Attachments: MAPREDUCE-3902.patch
>
>
> The MR AM is now in a great position to reuse containers across (map) tasks.
> This is something similar to JVM re-use we had in 0.20.x, but in a significantly better manner:
> # Consider data-locality when re-using containers
> # Consider the new shuffle - ensure that reduces fetch output of the whole container at once (i.e. all maps)
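
To make the second option in the comment concrete (a target block-size for maps that is larger than the real HDFS block size), here is a minimal sketch of how small inputs could be packed greedily into one map input each, up to a target size. This is not taken from MAPREDUCE-3902.patch and is not a Hadoop API: the class name TargetSizeGrouper, the method group, and the 128 MB target are all hypothetical, chosen only for illustration. It also ignores data-locality, which the issue description says a real implementation should consider.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration; not part of the patch or of Hadoop. */
public class TargetSizeGrouper {

    /**
     * Greedily packs file lengths (in bytes) into groups, in input order.
     * A group is closed as soon as adding the next file would push it past
     * targetBytes. A single file larger than the target gets its own group.
     */
    public static List<List<Long>> group(List<Long> fileLengths, long targetBytes) {
        if (targetBytes <= 0) {
            throw new IllegalArgumentException("targetBytes must be positive");
        }
        List<List<Long>> groups = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentBytes = 0;
        for (long len : fileLengths) {
            // Close the current group if this file would overflow the target.
            if (!current.isEmpty() && currentBytes + len > targetBytes) {
                groups.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(len);
            currentBytes += len;
        }
        if (!current.isEmpty()) {
            groups.add(current);
        }
        return groups;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Five input files; most are far smaller than the assumed 128 MB target.
        List<Long> files = Arrays.asList(10 * mb, 20 * mb, 30 * mb, 200 * mb, 5 * mb);
        List<List<Long>> groups = group(files, 128 * mb);
        System.out.println(groups.size() + " map inputs instead of " + files.size());
    }
}
```

With one map task per file, the five inputs above would need five maps. Grouping against the assumed target leaves three map inputs, which is the kind of reduction the small-files option is after.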