[ https://issues.apache.org/jira/browse/MAPREDUCE-7314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17268715#comment-17268715 ]
Eric Payne commented on MAPREDUCE-7314:
---------------------------------------

[~BilwaST], on what version are you seeing this?

> Job will hang if NM is restarted while its running
> --------------------------------------------------
>
>                 Key: MAPREDUCE-7314
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7314
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>            Reporter: Bilwa S T
>            Assignee: Bilwa S T
>            Priority: Major
>
> This is due to three different reasons:
> # PRIORITY_FAST_FAIL_MAP priority containers should be considered for reuse.
> # Whenever CONTAINER_REMOTE_CLEANUP is fired for a task attempt, it won't kill the current attempt that is assigned to the container. That is because the task attempt is not updated in the ContainerLauncherImpl#Container class.
> # A container gets assigned to a task attempt even when the container has stopped running, i.e. after the Container completed event has been processed. This is because we add the reuse container map to the allocated list. makeRemoteRequest gets the same container in allocationResponse, whereas the RM has sent the same container in the finished container list. To avoid this, we need to make sure the allocated list doesn't have any containers that are finished.
> Test credits : [~Rajshree]
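
The third reason in the quoted description proposes that the allocated list should never contain a container that the RM has also reported as finished. A minimal sketch of that filtering step follows. It is not the actual patch: the class name AllocatedListFilter, the method dropFinished, and the use of plain strings as container IDs are all hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names come from Hadoop.
public class AllocatedListFilter {

    // Returns the allocated container IDs that are NOT in the finished set,
    // preserving the original order of the allocated list.
    static List<String> dropFinished(List<String> allocated, Set<String> finished) {
        List<String> stillRunning = new ArrayList<>();
        for (String containerId : allocated) {
            if (!finished.contains(containerId)) {
                stillRunning.add(containerId);
            }
        }
        return stillRunning;
    }

    public static void main(String[] args) {
        // Assumed scenario: container_02 appears both as allocated (via the
        // reuse map) and as finished in the same response.
        List<String> allocated = List.of("container_01", "container_02", "container_03");
        Set<String> finished = Set.of("container_02");

        // Only containers that are still running remain assignable.
        System.out.println(dropFinished(allocated, finished));
    }
}
```

Under these assumptions, a container reported as finished is dropped before any task attempt can be assigned to it, which is the condition the description says is needed to avoid the hang.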