[ https://issues.apache.org/jira/browse/MAPREDUCE-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13137122#comment-13137122 ]
Robert Joseph Evans commented on MAPREDUCE-3274: ------------------------------------------------ You are correct, I got confused by the JVM ID not lining up with the attempt/task ID. The r_000008 is confusing. I will keep digging. The deadlock has been fairly reproducible, but I end up with about 20GB of logs to go through afterwards so it has taken me a while to get through them all. > Race condition in MR App Master Preemtion can cause a dead lock > --------------------------------------------------------------- > > Key: MAPREDUCE-3274 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-3274 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2, scheduler > Affects Versions: 0.23.0, 0.24.0 > Reporter: Robert Joseph Evans > Assignee: Robert Joseph Evans > Priority: Critical > Fix For: 0.23.0, 0.24.0 > > > There appears to be a race condition in the MR App Master in relation to > preempting reducers to let a mapper run. In the particular case that I have > been debugging a reducer was selected for preemption that did not have a > container assigned to it yet. When the container became available that reduce > started running and the previous TA_KILL event appears to have been ignored. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira