[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang reassigned MAPREDUCE-7183:
------------------------------------------

    Assignee: Mikayla Konst

> Make app master recover history from latest history file that exists
> --------------------------------------------------------------------
>
>                 Key: MAPREDUCE-7183
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7183
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: applicationmaster
>            Reporter: Mikayla Konst
>            Assignee: Mikayla Konst
>            Priority: Major
>         Attachments: MAPREDUCE-7183.patch
>
>
> When the original app master of a MapReduce job is killed, the new app 
> master normally attempts to recover by reading the jhist file written by 
> the app master from the previous app attempt (i.e. current app 
> attempt - 1).
> This is usually fine, but is a problem in the following situation:
>  # App master 1 writes history to jobid_1.jhist, then is killed
>  # App master 2 starts up but is killed before it has the chance to write any 
> history to jobid_2.jhist
>  # App master 3 attempts to recover, but it can't find jobid_2.jhist, so all 
> job progress is lost.
> The symptoms are the errors "Unable to parse prior job history, aborting 
> recovery" and "Could not parse the old history file. Will not have old 
> AMinfos", the loss of all job progress, and previous app attempts not 
> showing up in the job history UI.
> To fix this problem, if jobid_2.jhist is missing, app master 3 should just 
> recover using the history in jobid_1.jhist.
> Related JIRAs that mention this same problem:
> https://issues.apache.org/jira/browse/MAPREDUCE-4729
> https://issues.apache.org/jira/browse/MAPREDUCE-4767 
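
The fallback proposed in the description can be sketched as a backwards scan over earlier attempt numbers. This is only an illustration of the idea, not the code in MAPREDUCE-7183.patch: the class HistoryFallback, the method latestAttemptWithHistory, and the historyExists predicate are hypothetical names, and the real app master would check for the jhist file on the staging file system rather than through a predicate.

```java
import java.util.OptionalInt;
import java.util.function.IntPredicate;

/** Hypothetical helper for illustration; not taken from the attached patch. */
final class HistoryFallback {

    /**
     * Returns the highest attempt number below currentAttempt for which
     * historyExists reports a history file, or empty if no earlier attempt
     * left one behind.
     */
    static OptionalInt latestAttemptWithHistory(int currentAttempt,
                                                IntPredicate historyExists) {
        // Walk back from the immediately preceding attempt to attempt 1.
        for (int attempt = currentAttempt - 1; attempt >= 1; attempt--) {
            if (historyExists.test(attempt)) {
                return OptionalInt.of(attempt);
            }
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        // Scenario from the description: only attempt 1 wrote a history file.
        IntPredicate onlyAttemptOneWrote = attempt -> attempt == 1;
        OptionalInt source = latestAttemptWithHistory(3, onlyAttemptOneWrote);
        System.out.println(source.isPresent()
            ? "recover from jobid_" + source.getAsInt() + ".jhist"
            : "no prior history, starting from scratch");
    }
}
```

With this approach, app master 3 in the scenario above would pick jobid_1.jhist instead of failing on the missing jobid_2.jhist, and recovery would only be skipped when no earlier attempt wrote any history at all.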



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
