[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14027123#comment-14027123
 ] 

Maysam Yabandeh commented on MAPREDUCE-5844:
--------------------------------------------

Thanks [~kasha] for reviewing it. About the unit test: I looked into it, and it
seems non-trivial to me. On one hand, preemptReducesIfNeeded uses local fields,
so it is not feasible to test it separately via mocking. The alternative would
be to test the entire RMContainerAllocator object; however, to make sure that
preemptReducesIfNeeded is exercised in the test, the RMContainerAllocator
object would have to be fed a complicated sequence of events: some mappers are
not finished, but enough are finished to trigger reducer start, and finally a
mapper fails. The complexity of such a unit test would be much greater than
that of the minor change introduced by the patch. I guess it would be possible
to come up with unit tests of reasonable complexity if we changed
RMContainerAllocator to make it more testable, but I am not sure whether such
changes are desirable as part of this jira.

> Reducer Preemption is too aggressive
> ------------------------------------
>
>                 Key: MAPREDUCE-5844
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5844
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Maysam Yabandeh
>            Assignee: Maysam Yabandeh
>         Attachments: MAPREDUCE-5844.patch
>
>
> We observed cases where the reducer preemption makes the job finish much 
> later, and the preemption does not seem to be necessary since after 
> preemption both the preempted reducer and the mapper are assigned 
> immediately--meaning that there was already enough space for the mapper.
> The logic for triggering preemption is at 
> RMContainerAllocator::preemptReducesIfNeeded
> The preemption is triggered if the following is true:
> {code}
> headroom +  am * |m| + pr * |r| < mapResourceRequest
> {code} 
> where am is the number of assigned mappers, |m| is the mapper size, pr is the
> number of reducers being preempted, and |r| is the reducer size.
> The original idea apparently was that if headroom is not big enough for the 
> new mapper requests, reducers should be preempted. This would work if the job 
> is alone in the cluster. Once we have queues, the headroom calculation 
> becomes more complicated and it would require a separate headroom calculation 
> per queue/job.
> So, as a result, the headroom variable is effectively given up currently:
> *headroom is always set to 0*. What this implies for preemption is that it
> becomes very aggressive, not considering whether there is enough space for
> the mappers or not.
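
For illustration only, the trigger condition quoted above can be written as a
stand-alone predicate. This is a minimal sketch, not the actual
RMContainerAllocator code: the class name, method name, parameter names, and
the sample sizes (in MB) are all assumptions made for the example. It shows how
forcing headroom to 0 flips the decision even when the cluster really has room
for the mapper.

```java
public class PreemptionCheckSketch {

    /**
     * Hypothetical stand-alone form of the quoted condition:
     *   headroom + am * |m| + pr * |r| < mapResourceRequest
     * All names here are illustrative; sizes are plain MB values.
     */
    public static boolean shouldPreemptReducers(long headroom,
                                                int assignedMappers,
                                                long mapperSize,
                                                int preemptingReducers,
                                                long reducerSize,
                                                long mapResourceRequest) {
        long available = headroom
                + assignedMappers * mapperSize
                + preemptingReducers * reducerSize;
        return available < mapResourceRequest;
    }

    public static void main(String[] args) {
        // No assigned mappers, no reducers being preempted, a 1024 MB map request.
        // With a real headroom of 2048 MB the condition is false: no preemption.
        System.out.println(shouldPreemptReducers(2048, 0, 1024, 0, 2048, 1024));
        // With headroom forced to 0 the same state satisfies the condition.
        System.out.println(shouldPreemptReducers(0, 0, 1024, 0, 2048, 1024));
    }
}
```

In this sketch the first call prints false and the second prints true, which
matches the behavior described in the issue: with headroom at 0, reducers are
preempted although the mapper could have been scheduled without it.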



--
This message was sent by Atlassian JIRA
(v6.2#6252)
