[ https://issues.apache.org/jira/browse/MAPREDUCE-5641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906448#comment-13906448 ]
Vinod Kumar Vavilapalli commented on MAPREDUCE-5641:
----------------------------------------------------
bq. Vinod Kumar Vavilapalli, I'm a bit reluctant to get the JHS to depend on
the AHS at this point as the AHS is not fully cooked. I would prefer dropping
the JHS altogether in favor of the AHS when the AHS is ready for prime time
with AM extensions.
The problem is that, as I understand it, this JIRA requires corresponding
changes in YARN via YARN-1731, and it doesn't make sense to add duplicate
functionality in YARN.
Instead of adding new functionality, can the JHS simply ask the RM for the
application status? Why would that not work? If the RM goes down and comes
back up it may lose that history, but to avoid that you need to enable the
state-store anyway. Otherwise, it should work for the most part. Thoughts?
> History for failed Application Masters should be made available to the Job
> History Server
> -----------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-5641
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5641
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: applicationmaster, jobhistoryserver
> Affects Versions: 2.2.0
> Reporter: Robert Kanter
> Assignee: Robert Kanter
> Attachments: MAPREDUCE-5641.patch, MAPREDUCE-5641.patch
>
>
> Currently, the JHS has no information about jobs whose AMs have failed. This
> is because the History is written by the AM to the intermediate folder just
> before finishing, so when it fails for any reason, this information isn't
> copied there. However, it is not lost as it's in the AM's staging directory.
> To make the History available in the JHS, all we need to do is have another
> mechanism to move the History from the staging directory to the intermediate
> directory. The AM also writes a "Summary" file before exiting normally,
> which is also unavailable when the AM fails.
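
The "another mechanism" in the description can be sketched with plain
java.nio.file on a local filesystem. This is only an illustration under stated
assumptions: the real directories live on HDFS and would go through Hadoop's
FileSystem API, and the class name, the method name, and the "*.jhist" filter
are assumptions made for this sketch, not taken from the attached patch.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

public class StagingHistoryMover {
    /**
     * Moves every *.jhist file from stagingDir into intermediateDir and
     * returns the new paths. Both are local directories in this sketch.
     */
    public static List<Path> moveHistory(Path stagingDir, Path intermediateDir)
            throws IOException {
        Files.createDirectories(intermediateDir);

        // Collect the candidates first so nothing is moved while listing.
        List<Path> sources = new ArrayList<>();
        try (DirectoryStream<Path> files =
                 Files.newDirectoryStream(stagingDir, "*.jhist")) {
            for (Path p : files) {
                sources.add(p);
            }
        }

        List<Path> moved = new ArrayList<>();
        for (Path src : sources) {
            Path dst = intermediateDir.resolve(src.getFileName());
            Files.move(src, dst, StandardCopyOption.REPLACE_EXISTING);
            moved.add(dst);
        }
        return moved;
    }

    public static void main(String[] args) throws IOException {
        // Temporary directories stand in for the staging and intermediate dirs.
        Path staging = Files.createTempDirectory("staging");
        Path intermediate = Files.createTempDirectory("done_intermediate");
        Files.write(staging.resolve("job_0001_1.jhist"), "events".getBytes());
        System.out.println(moveHistory(staging, intermediate));
    }
}
```

The sketch leaves open who triggers the move after an AM failure, which is the
question under discussion in this issue.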
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)