[
https://issues.apache.org/jira/browse/MAPREDUCE-5641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13901821#comment-13901821
]
Jason Lowe commented on MAPREDUCE-5641:
---------------------------------------
bq. Do you have any alternatives on how to allow the JHS to have access to
those files?
Outside of imposing new restrictions on where the staging directory can be and
how it has to be configured, no, I don't know of an easy way to do that. To
allow the JHS to access these files, we'd minimally have to require the user
directories in the staging area to have their group set to the "hadoop" group
(see
http://hadoop.apache.org/docs/r2.2.0/hadoop-project-dist/hadoop-common/ClusterSetup.html#Running_Hadoop_in_Secure_Mode
for details on that group) and have permissions of 0750 all the way down to
the specific staging directory for a job. Read permission is required so the
history server can scan for the proper jhist file to grab: when a job has
multiple AM attempts, the JHS can't know the name of the correct jhist file in
advance -- it would have to scan to see which is the latest. That would
relax the permissions on a user's staging files to include the hadoop group.
That's probably OK and far better than letting everyone in, but I haven't
thought through all of the security ramifications of doing so.
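To make the two requirements above concrete, here is a rough sketch (not code
from the attached patch). It assumes POSIX-style mode bits as a stand-in for
the HDFS permission model, and it assumes staging history files are named
"<job id>_<AM attempt number>.jhist"; that naming pattern and all function
names below are illustrative assumptions.

```python
import re
import stat

# Assumed naming for illustration only: "<job id>_<AM attempt number>.jhist".
JHIST_PATTERN = re.compile(r"^(job_\d+_\d+)_(\d+)\.jhist$")


def group_can_scan(mode: int) -> bool:
    """True if the group bits allow listing (r) and traversing (x) a directory."""
    needed = stat.S_IRGRP | stat.S_IXGRP
    return (mode & needed) == needed


def others_locked_out(mode: int) -> bool:
    """True if users outside the owner and the group have no access at all."""
    return (mode & stat.S_IRWXO) == 0


def latest_jhist(file_names, job_id):
    """Return the jhist name with the highest AM attempt number, or None."""
    best_attempt, best_name = -1, None
    for name in file_names:
        match = JHIST_PATTERN.match(name)
        if match and match.group(1) == job_id:
            attempt = int(match.group(2))
            if attempt > best_attempt:
                best_attempt, best_name = attempt, name
    return best_name
```

With mode 0750, a member of the directory's group (here, the "hadoop" group)
passes group_can_scan while everyone else stays locked out, and latest_jhist
shows why a directory listing is needed: the name of the right file depends on
how many AM attempts actually ran.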
bq. Or to somehow get those files into the done_intermediate dir?
A proper way to do this would be to have something run by the user of the job
do this, as that doesn't require any additional security beyond what's already
done today. However, that probably involves adding the ability in YARN for a
specified task to run when an application fails or is killed, to clean up after
the unsuccessful run. It's a non-trivial task, but it would also help solve the
problem we have today where staging directories are leaked for applications
that are killed before the AM launches.
> History for failed Application Masters should be made available to the Job
> History Server
> -----------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-5641
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5641
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: applicationmaster, jobhistoryserver
> Affects Versions: 2.2.0
> Reporter: Robert Kanter
> Assignee: Robert Kanter
> Attachments: MAPREDUCE-5641.patch
>
>
> Currently, the JHS has no information about jobs whose AMs have failed. This
> is because the History is written by the AM to the intermediate folder just
> before finishing, so when it fails for any reason, this information isn't
> copied there. However, it is not lost, as it's in the AM's staging directory.
> To make the History available in the JHS, all we need to do is have another
> mechanism to move the History from the staging directory to the intermediate
> directory. The AM also writes a "Summary" file before exiting normally,
> which is also unavailable when the AM fails.