[
https://issues.apache.org/jira/browse/MAPREDUCE-6043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Maysam Yabandeh resolved MAPREDUCE-6043.
----------------------------------------
Resolution: Invalid
> Lost messages from RM to MRAppMaster
> ------------------------------------
>
> Key: MAPREDUCE-6043
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6043
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Maysam Yabandeh
>
> We have seen various cases in which reducer preemption does not kick in and
> the scheduled mappers wait behind running reducers forever. Each time there
> seems to be a different scenario. So far we have tracked down two such cases,
> and the common element between them is that the variables in
> RMContainerAllocator go out of sync, since they are only updated when a
> completed container is reported by the RM. However, there are many corner
> cases in which no such report is received from the RM and yet the MapReduce
> app moves forward. One possible fix would be to also update these variables
> after exceptional cases.
> The logic for triggering preemption is in
> RMContainerAllocator::preemptReducesIfNeeded
> Preemption is triggered if the following condition holds:
> {code}
> headroom + am * |m| + pr * |r| < mapResourceRequest
> {code}
> where am is the number of assigned mappers, |m| is the mapper size, pr is the
> number of reducers being preempted, and |r| is the reducer size. If any of
> these variables goes out of sync, preemption can fail to kick in. In the
> following comment, we explain two such cases.
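
The condition quoted above can be sketched in Java as follows. This is an illustrative simplification, not the actual RMContainerAllocator code: the class name PreemptionCheck, the method shouldPreemptReduces, and the use of plain memory values in MB are all assumptions made for this example. It shows how a stale counter (an assigned mapper whose completion was never reported) makes the left-hand side too large, so the check never fires.

```java
// Hypothetical sketch of the preemption condition from the description.
// Names and units (memory in MB) are assumptions, not Hadoop's real API.
final class PreemptionCheck {

    // Returns true when: headroom + am * |m| + pr * |r| < mapResourceRequest
    static boolean shouldPreemptReduces(long headroom,
                                        long assignedMaps,      // am
                                        long mapSize,           // |m|
                                        long preemptingReduces, // pr
                                        long reduceSize,        // |r|
                                        long mapResourceRequest) {
        long available = headroom
                + assignedMaps * mapSize
                + preemptingReduces * reduceSize;
        return available < mapResourceRequest;
    }

    public static void main(String[] args) {
        // Counters in sync: nothing is available for the pending mapper,
        // so the condition holds and preemption would be triggered.
        System.out.println(shouldPreemptReduces(0, 0, 1024, 0, 2048, 1024));

        // Stale counter: one mapper is still counted as assigned although it
        // is gone. The left-hand side equals the request, so the condition
        // fails and preemption is not triggered.
        System.out.println(shouldPreemptReduces(0, 1, 1024, 0, 2048, 1024));
    }
}
```

With these assumed values, the first call evaluates 0 < 1024 and the second evaluates 1024 < 1024, which is why a single stale count of assigned mappers is enough to suppress preemption.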
--
This message was sent by Atlassian JIRA
(v6.2#6252)