Naganarasimha G R commented on MAPREDUCE-6749:

Hi [~devaraj.k],
Thanks for the detailed explanation. As it would involve considerable
modifications and changes in core code, I hope we can create a new branch and
get the changes in there, so that it's easier for others to have a look before
it gets into trunk or the mainstream branches.

bq. I think the limit configuration for no of map/reduce reuse containers would 
allow other applications to start running without waiting for the Job to be 
finished when reuse is enabled. If there is a big Job running which could 
occupy the entire cluster, and then any high priority application gets 
submitted this limit for maps/reduce container would probably give a room for 
high priority application to start running without preempting the containers of 
the previous Job. By default there is no limit for number of containers to be 
reused and if any user/Job wanted to have this constraint they can configure it.
Yes, I understand it, thanks for the explanation. But the issue would be how
the application knows what the right configuration for these is. From the
application's own point of view, it would always seem right to run all the
tasks in the given container rather than launch more containers. If there were
a way for the admin to enforce it, it would be useful. If it's just a
client-level configuration, it only adds to an already long list of
configurations, and users will not be clear what to configure for it. Besides,
would it be better to have just one setting for how many tasks can reuse a
given container, and avoid separate settings for Map and Reduce?
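
To make the admin-enforcement idea concrete, here is a minimal sketch of how a
job-level limit could be clamped by a cluster-level cap. Everything in it is
hypothetical: the class name, the method name and the "-1 means unlimited"
convention are my assumptions for illustration, not anything taken from the
patches.

```java
/** Sketch only: all names and the -1 sentinel are assumptions, not Hadoop APIs. */
final class ReuseLimitSketch {
    /** Assumed sentinel meaning "no limit configured". */
    static final int UNLIMITED = -1;

    /**
     * Returns the limit the AM would honour: the stricter of the value the
     * job asked for and the cap set by the admin.
     */
    static int effectiveLimit(int jobRequested, int adminCap) {
        if (adminCap == UNLIMITED) {
            return jobRequested;   // admin set no cap, the job's value stands
        }
        if (jobRequested == UNLIMITED) {
            return adminCap;       // job left it open, admin cap applies
        }
        return Math.min(jobRequested, adminCap);
    }

    public static void main(String[] args) {
        System.out.println(effectiveLimit(UNLIMITED, 20));
        System.out.println(effectiveLimit(50, 20));
        System.out.println(effectiveLimit(5, 20));
    }
}
```

With something like this, a job that leaves the value unset would still be
bounded by the admin cap, and users would not have to guess a value.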

Btw, it could also be good to introduce a metric for the number of Map tasks or
Reduce tasks which have reused containers.
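
As a rough illustration of the kind of metric I mean, the sketch below keeps
one count per task type and increments it only when a task is launched in a
container that was reused. The class, enum and method names are made up for
this example and are not the MR counter or metrics APIs.

```java
import java.util.concurrent.atomic.LongAdder;

/** Sketch only: hypothetical names, not the MapReduce counters API. */
final class ReuseMetricsSketch {
    enum TaskKind { MAP, REDUCE }

    private final LongAdder reusedByMaps = new LongAdder();
    private final LongAdder reusedByReduces = new LongAdder();

    /** Records a task launch; only launches in a reused container are counted. */
    void taskLaunched(TaskKind kind, boolean containerWasReused) {
        if (!containerWasReused) {
            return; // a fresh container is not a reuse
        }
        if (kind == TaskKind.MAP) {
            reusedByMaps.increment();
        } else {
            reusedByReduces.increment();
        }
    }

    long reusedMapTasks() {
        return reusedByMaps.sum();
    }

    long reusedReduceTasks() {
        return reusedByReduces.sum();
    }
}
```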

bq. If you want to have a try this feature, you can apply MAPREDUCE-6773, 
MAPREDUCE-6781, MAPREDUCE-6784, MAPREDUCE-6785, MAPREDUCE-6786 and then try 
this feature. 
Sure Deva, I will try it over the weekend and update you. Anyway, I have
started to take a look at them.

bq. Here we should note that the whole container log which is displaying for 
TaskAttempt is not applicable to the TaskAttempt and the log can be identified 
easily which part applicable to it.
This is the problem we have generally faced: it is difficult for customers to
understand that the entire log is not for the task attempt, so I was wondering
whether there is a better approach to this.

> MR AM should reuse containers for Map/Reduce Tasks
> --------------------------------------------------
>                 Key: MAPREDUCE-6749
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6749
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: applicationmaster, mrv2
>            Reporter: Devaraj K
>            Assignee: Devaraj K
>         Attachments: MAPREDUCE-6749-Container Reuse-v0.pdf
> It is with the continuation of MAPREDUCE-3902, MR AM should reuse containers 
> for Map/Reduce Tasks similar to the JVM Reuse feature we had in MRv1.

This message was sent by Atlassian JIRA
