[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14112367#comment-14112367
 ] 

Jason Lowe commented on MAPREDUCE-6043:
---------------------------------------

For the first case you described, the AM currently kills tasks immediately 
after they report that they have completed (either successfully or 
unsuccessfully).  This behavior may change after MAPREDUCE-5465, but that's 
not in yet.  Did the kill not take place, or did it fail in some other 
fashion?  Killing a lingering-but-finished map task is better than preempting 
a reducer.

For the second case, could you elaborate on how the RM failed to report the 
completed container to the AM?  That sounds like a bug in YARN rather than 
MapReduce, but it would be good to know the circumstances in which it occurred.

Both of these sound like they could be cases of YARN failing to convey 
completed container status to the AM.  If so, those seem like bugs in YARN 
and not MapReduce.  Also, could you elaborate on how you're proposing to fix 
these exceptional corner cases in MapReduce?

> Reducer-preemption does not kick in
> -----------------------------------
>
>                 Key: MAPREDUCE-6043
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6043
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Maysam Yabandeh
>
> We have seen various cases in which reducer preemption does not kick in and 
> the scheduled mappers wait behind running reducers forever. Each time there 
> seems to be a different scenario. So far we have tracked down two such cases, 
> and the common element between them is that the variables in 
> RMContainerAllocator go out of sync, since they are only updated when a 
> completed container is reported by the RM. However, there are many corner 
> cases in which such a report is never received from the RM and yet the 
> MapReduce app moves forward. Perhaps one possible fix would be to update 
> these variables after exceptional cases as well.
> The logic for triggering preemption is in 
> RMContainerAllocator::preemptReducesIfNeeded.
> The preemption is triggered if the following is true:
> {code}
> headroom +  am * |m| + pr * |r| < mapResourceRequest
> {code} 
> where am is the number of assigned mappers, |m| is the mapper size, pr is 
> the number of reducers being preempted, and |r| is the reducer size. Any of 
> these variables going out of sync will cause preemption not to kick in. In 
> the following comment, we explain two such cases.
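
The trigger described in the quoted report can be sketched as a standalone check. This is a hypothetical illustration: only the inequality comes from the description of RMContainerAllocator::preemptReducesIfNeeded; the class, method, and parameter names below are not from the Hadoop source.

```java
// Hypothetical sketch of the preemption trigger described above.
// Only the inequality is taken from the report; all names are illustrative.
public class PreemptionCheck {

    // Preempt reducers when the available headroom, plus the resources that
    // assigned mappers and already-preempting reducers will eventually free,
    // still cannot satisfy a single map's resource request.
    static boolean shouldPreemptReducers(long headroom,
                                         int assignedMaps, long mapSize,
                                         int preemptingReduces, long reduceSize,
                                         long mapResourceRequest) {
        return headroom + (long) assignedMaps * mapSize
                + (long) preemptingReduces * reduceSize
                < mapResourceRequest;
    }
}
```

With accurate bookkeeping, zero headroom and no in-flight maps or preempting reduces makes the check fire. But if, say, assignedMaps goes stale and still counts a mapper that has actually finished, the left-hand side is inflated and preemption never triggers, which matches the symptom in the report.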



--
This message was sent by Atlassian JIRA
(v6.2#6252)