[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated MAPREDUCE-5641:
-------------------------------------

    Attachment: MAPREDUCE-5641.patch

I’ve attached a preliminary version of the patch.  Once we all agree on the 
specifics of the design, I can add unit tests.  
The patch follows the design I outlined before, where the RM writes a file 
when it sees an AM die, and the JHS sees that file and copies the jhist and 
similar files to the done_intermediate dir.  I have tested this by running jobs 
and killing the AM.  This results in incomplete information, as expected; 
however, in some cases some of the information doesn’t make complete sense or 
is missing (e.g. no Finish Time if the AM didn’t actually finish).  I’ve put in 
some code to handle these situations.  I’ve also attached a preliminary YARN 
patch to YARN-1731.  
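
To make the flow above concrete, here is a rough sketch of the idea in Python.  
It is not the code in the patch: it uses the local filesystem as a stand-in for 
HDFS, and the function names, the marker directory, and the convention of 
naming each marker file after its job ID are all assumptions made up for 
illustration.  It also only copies *.jhist files and leaves out the "similar 
files".

```python
import shutil
from pathlib import Path
from typing import List


def write_marker(marker_dir: Path, job_id: str) -> Path:
    """RM side (hypothetical): record that the AM for job_id died."""
    marker_dir.mkdir(parents=True, exist_ok=True)
    marker = marker_dir / job_id  # assumed convention: marker named after the job
    marker.touch()
    return marker


def process_markers(marker_dir: Path, staging_root: Path,
                    done_intermediate: Path) -> List[Path]:
    """JHS side (hypothetical): copy .jhist files for each marked job.

    Returns the destination paths of the files that were copied.
    """
    if not marker_dir.is_dir():
        return []
    done_intermediate.mkdir(parents=True, exist_ok=True)
    copied = []
    for marker in sorted(marker_dir.iterdir()):
        # Assumed layout: one staging subdirectory per job ID.
        job_staging = staging_root / marker.name
        if job_staging.is_dir():
            for src in sorted(job_staging.glob("*.jhist")):
                dst = done_intermediate / src.name
                shutil.copy2(src, dst)  # copy, leaving the staging file in place
                copied.append(dst)
        marker.unlink()  # marker handled; do not process it again
    return copied
```

In this sketch the marker is removed once it has been handled, so a second scan 
does not copy the same history twice.  Whether the real patch cleans up the 
marker the same way is a design detail still open for discussion.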

{quote}
How will the JHS copy the file to the intermediate directory? It likely won't 
have access to the staging directory containing the jhist file.
{quote}
I modified the permissions from 0700 to 0701.
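
For anyone who wants to see what that mode change means bit by bit, here is a 
small Python sketch.  It only decodes standard POSIX-style mode bits and does 
not touch HDFS; the helper names are made up for illustration.  The extra bit 
in 0701 is "execute for others", which on a directory means other users can 
traverse it to reach an entry whose name they already know, but still cannot 
list its contents.

```python
import stat


def describe_mode(mode: int) -> str:
    """Render the nine permission bits of a directory mode as rwx text."""
    # stat.filemode() returns e.g. 'drwx------'; drop the leading type character.
    return stat.filemode(stat.S_IFDIR | mode)[1:]


def others_can_traverse(mode: int) -> bool:
    """True if users other than owner/group have the execute (search) bit."""
    return bool(mode & stat.S_IXOTH)


def others_can_list(mode: int) -> bool:
    """True if users other than owner/group have the read (list) bit."""
    return bool(mode & stat.S_IROTH)


if __name__ == "__main__":
    for mode in (0o700, 0o701):
        print(oct(mode), describe_mode(mode),
              "traverse:", others_can_traverse(mode),
              "list:", others_can_list(mode))
```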

> History for failed Application Masters should be made available to the Job 
> History Server
> -----------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5641
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5641
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: applicationmaster, jobhistoryserver
>    Affects Versions: 2.2.0
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>         Attachments: MAPREDUCE-5641.patch
>
>
> Currently, the JHS has no information about jobs whose AMs have failed.  This 
> is because the History is written by the AM to the intermediate folder just 
> before finishing, so when it fails for any reason, this information isn't 
> copied there.  However, it is not lost as it's in the AM's staging directory.  
> To make the History available in the JHS, all we need to do is have another 
> mechanism to move the History from the staging directory to the intermediate 
> directory.  The AM also writes a "Summary" file before exiting normally, 
> which is also unavailable when the AM fails.  



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
