[jira] [Updated] (MAPREDUCE-7314) Job will hang if NM is restarted while its running

Bilwa S T (Jira) Wed, 20 Jan 2021 11:08:07 -0800


     [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Bilwa S T updated MAPREDUCE-7314:
---------------------------------
    Attachment: MAPREDUCE-7314-MR-6749.001.patch

> Job will hang if NM is restarted while its running
> --------------------------------------------------
>
>                 Key: MAPREDUCE-7314
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7314
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>            Reporter: Bilwa S T
>            Assignee: Bilwa S T
>            Priority: Major
>         Attachments: MAPREDUCE-7314-MR-6749.001.patch
>
>
> The hang has three causes:
>  # Containers with PRIORITY_FAST_FAIL_MAP priority are not considered for 
> reuse, but should be.
>  # Whenever CONTAINER_REMOTE_CLEANUP is fired for a task attempt, it won't 
> kill the current attempt assigned to the container, because the task attempt 
> is not updated in the ContainerLauncherImpl#Container class.
>  # A container gets assigned to a task attempt even after it has stopped 
> running, i.e. after the container completed event has been processed. This 
> happens because we add the reuse container map to the allocated list: 
> makeRemoteRequest gets the same container in allocationResponse while the RM 
> has sent that same container in the finished container list. To avoid this, 
> we need to make sure the allocated list doesn't contain any finished 
> containers.
> Test credits : [~Rajshree]
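
The third cause comes down to removing finished containers from the allocated list before any task attempt is assigned. Below is a minimal sketch of that filtering step only, not the code from the attached patch. The class name, the method name, and the use of plain strings as container IDs are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; none of these names come from Hadoop or the patch.
public class AllocatedFilterSketch {

    /** Returns the allocated container IDs that do not appear in the finished list. */
    static List<String> dropFinished(List<String> allocated, List<String> finished) {
        // A set gives constant-time membership checks on finished IDs.
        Set<String> done = new HashSet<>(finished);
        List<String> live = new ArrayList<>();
        for (String id : allocated) {
            if (!done.contains(id)) {
                live.add(id); // still running, safe to hand to a task attempt
            }
        }
        return live;
    }

    public static void main(String[] args) {
        // container_02 is reported as both allocated (reused) and finished.
        List<String> allocated = List.of("container_01", "container_02", "container_03");
        List<String> finished = List.of("container_02");
        System.out.println(dropFinished(allocated, finished));
    }
}
```

In this sketch a container reported as both allocated and finished in the same response is dropped, so it can never be handed to a new task attempt.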



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
