[jira] [Resolved] (MAPREDUCE-6043) Lost messages from RM to MRAppMaster

Maysam Yabandeh (JIRA) Wed, 27 Aug 2014 11:14:32 -0700

     [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Maysam Yabandeh resolved MAPREDUCE-6043.
----------------------------------------

    Resolution: Invalid

> Lost messages from RM to MRAppMaster
> ------------------------------------
>
>                 Key: MAPREDUCE-6043
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6043
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Maysam Yabandeh
>
> We have seen several cases in which reducer preemption does not kick in and 
> the scheduled mappers wait behind running reducers forever. The scenario 
> seems to differ each time. So far we have tracked down two such cases, and 
> the common element is that the variables in RMContainerAllocator go out of 
> sync, since they are updated only when the RM reports a completed container. 
> However, there are many corner cases in which no such report is received 
> from the RM and yet the MapReduce app moves forward. One possible fix would 
> be to also update these variables after exceptional cases.
> The logic for triggering preemption is in 
> RMContainerAllocator::preemptReducesIfNeeded.
> Preemption is triggered if the following holds:
> {code}
> headroom +  am * |m| + pr * |r| < mapResourceRequest
> {code} 
> where am is the number of assigned mappers, |m| is the mapper size, pr is 
> the number of reducers being preempted, and |r| is the reducer size. If any 
> of these variables goes out of sync, it can keep preemption from kicking in. 
> In the following comment, we explain two such cases.
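
As an illustration only, the condition above can be written as a small standalone check. This is a minimal sketch, not the code in RMContainerAllocator: the class name, method name, and parameters are hypothetical, and resources are assumed to be plain memory sizes in MB. It shows how a single stale counter (an assigned mapper whose completion was never reported) is enough to make the inequality false.

```java
// Hypothetical sketch: names are illustrative and are NOT Hadoop APIs.
// Resources are assumed to be simple memory amounts in MB.
class PreemptionCheck {

    /**
     * Evaluates the condition from the issue description:
     *   headroom + am * |m| + pr * |r| < mapResourceRequest
     */
    static boolean shouldPreemptReducers(long headroom,
                                         int assignedMaps,       // am
                                         long mapSize,           // |m|
                                         int preemptingReducers, // pr
                                         long reduceSize,        // |r|
                                         long mapResourceRequest) {
        long accountedFor = headroom
                + assignedMaps * mapSize
                + preemptingReducers * reduceSize;
        return accountedFor < mapResourceRequest;
    }

    public static void main(String[] args) {
        // Counters in sync: nothing is available for the pending mapper,
        // so the check asks for reducer preemption.
        System.out.println(shouldPreemptReducers(0, 0, 1024, 0, 2048, 1024)); // true

        // Stale counter: one mapper is still counted as assigned although it
        // has already finished, so the check no longer asks for preemption.
        System.out.println(shouldPreemptReducers(0, 1, 1024, 0, 2048, 1024)); // false
    }
}
```

In the second call, the stale `assignedMaps` value inflates the left-hand side to exactly the request size. The strict inequality then fails and the pending mapper keeps waiting.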



--
This message was sent by Atlassian JIRA
(v6.2#6252)
