[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe reopened MAPREDUCE-4992:
-----------------------------------


This is still occurring in a number of ways:

* If the task attempt that succeeded was attempt 1 but the history file has no 
completion event for attempt 0, the AM recovers only attempt 0 but then waits 
for attempt 1 to complete.
* If two task attempts succeed simultaneously, the AM recovers only attempt 0 
but then waits for attempt 1 to complete.
* If the prior AM attempt was backed up in event processing and launched 
speculative task attempts *after* a task attempt completed, the recovering AM 
ends up waiting on them, but they were never launched.
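
The common pattern in the cases above can be sketched as a toy model. This is 
not the actual RecoveryService code: the class name, method name, and attempt 
ids are hypothetical, and the only assumption is that recovery finishes once 
every awaited attempt has had a completion event replayed.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical toy model, not Hadoop code.
class RecoverySketch {

    // Returns the attempts recovery is still waiting on after replaying
    // the completion events it actually recovered from the history file.
    static Set<String> stillPending(Collection<String> awaited,
                                    Collection<String> replayed) {
        Set<String> pending = new HashSet<>(awaited);
        pending.removeAll(replayed);
        return pending;
    }

    public static void main(String[] args) {
        // First case: attempt 1 is the one that succeeded, so recovery
        // waits on it, but only attempt 0 is recovered and replayed.
        Set<String> pending =
            stillPending(List.of("attempt_1"), List.of("attempt_0"));

        // A non-empty set means the pending count never reaches zero,
        // so the recovery phase never exits.
        System.out.println("hung = " + !pending.isEmpty()
            + ", waiting on " + pending);
    }
}
```

In this model the hang goes away only when the awaited set and the replayed set 
agree, which is the invariant all three cases above violate.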
                
> AM hangs in RecoveryService when recovering tasks with speculative attempts
> ---------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4992
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4992
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mr-am
>    Affects Versions: trunk, 2.0.2-alpha, 0.23.6
>            Reporter: Robert Parker
>            Assignee: Robert Parker
>            Priority: Critical
>             Fix For: 0.23.7, 2.0.5-beta
>
>         Attachments: MAPREDUCE-4992v1.patch, MAPREDUCE-4992v2.patch
>
>
> A job hung in the Recovery Service on an AM restart. Four map task events 
> were not processed, which prevented the complete task count from reaching 
> zero, the condition that exits the recovery service. All four tasks were 
> speculative.
