[jira] [Commented] (MAPREDUCE-7314) Job will hang if NM is restarted while its running

Eric Payne (Jira) Wed, 20 Jan 2021 09:23:15 -0800


    [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17268715#comment-17268715
 ]


Eric Payne commented on MAPREDUCE-7314:
---------------------------------------

[~BilwaST], on what version are you seeing this?

> Job will hang if NM is restarted while its running
> --------------------------------------------------
>
>                 Key: MAPREDUCE-7314
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7314
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>            Reporter: Bilwa S T
>            Assignee: Bilwa S T
>            Priority: Major
>
> This is due to three different reasons:
>  # PRIORITY_FAST_FAIL_MAP priority containers should be considered for reuse.
>  # Whenever CONTAINER_REMOTE_CLEANUP is fired for a task attempt, it won't kill 
> the current attempt that is assigned to the container. That is because the task 
> attempt is not updated in the ContainerLauncherImpl#Container class. 
>  # A container gets assigned to a task attempt even when the container has 
> stopped running, i.e. after the container completed event is processed. This is 
> because we add the reuse container map to the allocated list. makeRemoteRequest 
> gets the same container in allocationResponse, whereas the RM has sent the same 
> container in the finished container list. To avoid this we need to make sure 
> the allocated list doesn't have any containers which are finished.
> Test credits : [~Rajshree]
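
The third reason in the quoted description comes down to removing finished containers from the allocated list before any of them is assigned to a task attempt. A minimal sketch of that filtering step is below. It is an illustration only and not the code in the MAPREDUCE-7314 patch: container IDs are plain strings, and the class name AllocatedContainerFilter and the method dropFinished are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper; plain String IDs stand in for YARN container IDs.
final class AllocatedContainerFilter {

    // Returns the allocated IDs that do not also appear in the finished list,
    // preserving the original order of the allocated list.
    static List<String> dropFinished(List<String> allocated, List<String> finished) {
        Set<String> finishedIds = new HashSet<>(finished);
        List<String> live = new ArrayList<>();
        for (String id : allocated) {
            if (!finishedIds.contains(id)) {
                live.add(id);
            }
        }
        return live;
    }

    public static void main(String[] args) {
        // Assumed scenario: container_02 was re-added for reuse but has already completed.
        List<String> allocated = Arrays.asList("container_01", "container_02", "container_03");
        List<String> finished = Arrays.asList("container_02");
        System.out.println(dropFinished(allocated, finished)); // prints [container_01, container_03]
    }
}
```

Building a set from the finished list keeps the check cheap when an allocation response carries many containers.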



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
