[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-3902:
-------------------------------------

    Attachment: MAPREDUCE-3902.patch

Ok, I spent a long (isolated) flight on this - it clearly needs more work, but 
it's a start. *smile*

This patch improves on the classic JVM re-use along both dimensions described 
in the jira: data-locality and the new shuffle.
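
To make the data-locality dimension concrete, here is a rough sketch of how a 
freed container could be matched to the next map task. This is not the code in 
the patch; ContainerReuseSketch, PendingMap and pickNext are hypothetical names 
made up for illustration, and the fallback to "any pending map" is an 
assumption (rack-locality is ignored entirely).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Hypothetical sketch only: none of these types are Hadoop APIs.
public class ContainerReuseSketch {

    /** A map task waiting to run, with the hosts that hold its input split. */
    static final class PendingMap {
        final String taskId;
        final Set<String> hosts;

        PendingMap(String taskId, String... hosts) {
            this.taskId = taskId;
            this.hosts = new HashSet<>(Arrays.asList(hosts));
        }
    }

    /**
     * Removes and returns the next map to run in a container that just became
     * free on the given host. A node-local map is preferred; otherwise the
     * oldest pending map is taken (assumption). Returns null if nothing is
     * pending, in which case the container would be released.
     */
    static PendingMap pickNext(List<PendingMap> pending, String host) {
        for (Iterator<PendingMap> it = pending.iterator(); it.hasNext();) {
            PendingMap candidate = it.next();
            if (candidate.hosts.contains(host)) {
                it.remove();
                return candidate;
            }
        }
        return pending.isEmpty() ? null : pending.remove(0);
    }
}
```

The list passed to pickNext must be mutable (e.g. an ArrayList), since the 
chosen task is removed from it.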

We need to pay more attention to the user interface. Some options:
# Allow the user to specify the actual number of map slots to be used 
(supported now, in the patch)
# Allow the user to specify a target block-size for maps (greater than the real 
HDFS block size), i.e. get around the small-files problem.
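
For the second option, a minimal sketch of what a target block-size could mean 
in practice: small inputs are packed greedily until a map's share reaches the 
target. This is an illustration under my own assumptions, not what the patch 
does; TargetSplitSketch and pack are hypothetical names, and locality is 
ignored to keep it short.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only: not a Hadoop API and not taken from the patch.
public class TargetSplitSketch {

    /**
     * Greedily packs file lengths (in bytes) into groups, one group per map.
     * A group is closed as soon as adding the next file would push it past
     * targetBytes. A single file larger than the target gets a group of its
     * own, so groups can exceed the target in that case.
     */
    static List<List<Long>> pack(List<Long> fileLengths, long targetBytes) {
        List<List<Long>> groups = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentBytes = 0;
        for (long len : fileLengths) {
            if (!current.isEmpty() && currentBytes + len > targetBytes) {
                groups.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(len);
            currentBytes += len;
        }
        if (!current.isEmpty()) {
            groups.add(current);
        }
        return groups;
    }
}
```

With a larger target, the same set of small files ends up in fewer groups, 
which means fewer maps to launch.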

Thoughts?
                
> MR AM should reuse containers for map tasks
> -------------------------------------------
>
>                 Key: MAPREDUCE-3902
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3902
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: applicationmaster, mrv2
>            Reporter: Arun C Murthy
>            Assignee: Arun C Murthy
>         Attachments: MAPREDUCE-3902.patch
>
>
> The MR AM is now in a great position to reuse containers across (map) tasks. 
> This is something similar to JVM re-use we had in 0.20.x, but in a 
> significantly better manner:
> # Consider data-locality when re-using containers
> # Consider the new shuffle - ensure that reduces fetch output of the whole 
> container at once (i.e. all maps) 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
