[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated MAPREDUCE-7314:
---------------------------------
    Description: 
This is due to three different reasons:
 # Containers with PRIORITY_FAST_FAIL_MAP priority should be considered for
reuse.
 # Whenever CONTAINER_REMOTE_CLEANUP is fired for a task attempt, it won't kill
the current attempt that is assigned to the container. That is because the task
attempt is not updated in the ContainerLauncherImpl#Container class.
 # A container gets assigned to a task attempt even after the container has
stopped running, i.e. after the container-completed event has been processed.
This is because we add the reuse container map to the allocated list.
makeRemoteRequest gets the same container in allocationResponse, whereas the RM
has sent the same container in the finished container list. To avoid this, we
need to make sure the allocated list doesn't contain any containers that are
finished.

Test credits : [~Rajshree]

  was:
This is due to three different reasons:
 # Containers with PRIORITY_FAST_FAIL_MAP priority should be considered for
reuse.
 # Whenever CONTAINER_REMOTE_CLEANUP is fired for a task attempt, it won't kill
the current attempt that is assigned to the container. That is because the task
attempt is not updated in the ContainerLauncherImpl#Container class.
 # A container gets assigned to a task attempt even after the container has
stopped running, i.e. after the container-completed event has been processed.
This is because we add the reuse container map to the allocated list.
makeRemoteRequest gets the same container in allocationResponse, whereas the RM
has sent the same container in the finished container list. To avoid this, we
need to make sure the allocated list doesn't contain any containers that are
finished.


> Job will hang if NM is restarted while its running
> --------------------------------------------------
>
>                 Key: MAPREDUCE-7314
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7314
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>            Reporter: Bilwa S T
>            Assignee: Bilwa S T
>            Priority: Major
>
> This is due to three different reasons:
>  # Containers with PRIORITY_FAST_FAIL_MAP priority should be considered for
> reuse.
>  # Whenever CONTAINER_REMOTE_CLEANUP is fired for a task attempt, it won't
> kill the current attempt that is assigned to the container. That is because
> the task attempt is not updated in the ContainerLauncherImpl#Container class.
>  # A container gets assigned to a task attempt even after the container has
> stopped running, i.e. after the container-completed event has been
> processed. This is because we add the reuse container map to the allocated
> list. makeRemoteRequest gets the same container in allocationResponse,
> whereas the RM has sent the same container in the finished container list.
> To avoid this, we need to make sure the allocated list doesn't contain any
> containers that are finished.
> Test credits : [~Rajshree]
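
The third point in the description above (keeping finished containers out of
the allocated list) can be illustrated with a minimal sketch. This is not the
actual patch: the class name AllocatedContainerFilter, the method dropFinished,
and the use of plain strings as container IDs are all assumptions made for
illustration, and the real code works with YARN container objects rather than
strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/**
 * Hypothetical sketch only: names and types are illustrative and are not
 * taken from the Hadoop code base or from the MAPREDUCE-7314 patch.
 */
public class AllocatedContainerFilter {

    /**
     * Returns the allocated container IDs that are not also reported as
     * finished, preserving the original order of the allocated list.
     */
    public static List<String> dropFinished(List<String> allocated,
                                            List<String> finished) {
        // A set gives constant-time lookups for the finished IDs.
        Set<String> finishedIds = new HashSet<>(finished);
        List<String> live = new ArrayList<>();
        for (String id : allocated) {
            if (!finishedIds.contains(id)) {
                live.add(id);
            }
        }
        return live;
    }

    public static void main(String[] args) {
        // Assumed scenario: a reused container shows up in both lists
        // within the same heartbeat response.
        List<String> allocated =
            Arrays.asList("container_01", "container_02", "container_03");
        List<String> finished = Arrays.asList("container_02");

        // Prints [container_01, container_03]
        System.out.println(dropFinished(allocated, finished));
    }
}
```

With a filter like this applied before assignment, a container that appears in
both lists would never be handed to a new task attempt, which is the hang
condition the description attributes to a NodeManager restart.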


