[
https://issues.apache.org/jira/browse/MAPREDUCE-6550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robert Kanter updated MAPREDUCE-6550:
-------------------------------------
Attachment: MAPREDUCE-6550.001.patch
The patch fixes the ownership problem by using UGI to proxy as the correct
user, and fixes the permissions problem by setting the correct permissions on
the files. Beyond those two changes, most of the diff comes from moving code
around and re-indenting it.
I've updated the unit tests to check the permissions, and I also verified on a
cluster that the tool behaves correctly.
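For anyone who wants to run a similar check by hand, here is a minimal sketch.
It is not the test code from the patch. It assumes the listing has one entry
per line in the usual "hdfs dfs -ls" field order (mode, replication, owner,
group, size, date, time, path), and the class and method names are hypothetical.
It flags entries that are not owned by the expected user or that grant any
access to "other".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of Hadoop or of the MAPREDUCE-6550 patch.
class HarListingCheck {

    // True if the "other" triplet of an ls-style mode string such as
    // "-rw-r--r--" grants any access (characters 8-10 are not "---").
    static boolean isWorldAccessible(String mode) {
        return mode.length() >= 10 && !mode.substring(7, 10).equals("---");
    }

    // Returns one message per problem found: an owner other than
    // expectedOwner, or a mode that is open to "other".
    static List<String> findProblems(List<String> lsLines, String expectedOwner) {
        List<String> problems = new ArrayList<>();
        for (String line : lsLines) {
            String[] f = line.trim().split("\\s+");
            // Assumed fields: mode, replication, owner, group, size, date, time, path
            if (f.length < 8) {
                continue; // skip "..." and other non-entry lines
            }
            String mode = f[0];
            String owner = f[2];
            String path = f[7];
            if (!owner.equals(expectedOwner)) {
                problems.add(path + ": owned by " + owner);
            }
            if (isWorldAccessible(mode)) {
                problems.add(path + ": mode " + mode + " is open to other");
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Two entries taken from the listing in the issue description.
        List<String> listing = Arrays.asList(
            "drwxr-xr-x - yarn hadoop 0 2015-10-02 13:24 "
                + "/tmp/logs/paul/logs/application_1443805425363_0005/"
                + "application_1443805425363_0005.har",
            "-rw-r----- 3 paul hadoop 1155 2015-10-02 13:24 "
                + "/tmp/logs/paul/logs/application_1443805425363_0006/"
                + "gs-centos66-2.vpc.cloudera.com_8041");
        for (String problem : findProblems(listing, "paul")) {
            System.out.println(problem);
        }
    }
}
```

With the patch applied, a listing of the archived application directories
should produce no messages for the owning user.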
> archive-logs tool changes log ownership to the Yarn user when using
> DefaultContainerExecutor
> --------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-6550
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6550
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 2.8.0
> Reporter: Robert Kanter
> Assignee: Robert Kanter
> Attachments: MAPREDUCE-6550.001.patch
>
>
> The archive-logs tool added in MAPREDUCE-6415 leverages the Distributed Shell
> app. When using the DefaultContainerExecutor, this means that the job
> will actually run as the Yarn user, so the resulting har files are owned by
> the Yarn user instead of the original owner. The permissions are also now
> world-readable.
> In the example below, the archived logs are owned by 'yarn' instead of 'paul'
> and are now world-readable:
> {noformat}
> [root@gs28-centos66-5 ~]# sudo -u hdfs hdfs dfs -ls -R /tmp/logs
> ...
> drwxrwx--- - paul hadoop 0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005
> drwxr-xr-x - yarn hadoop 0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har
> -rw-r--r-- 3 yarn hadoop 0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/_SUCCESS
> -rw-r--r-- 3 yarn hadoop 1256 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/_index
> -rw-r--r-- 3 yarn hadoop 24 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/_masterindex
> -rw-r--r-- 3 yarn hadoop 8451177 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/part-0
> drwxrwx--- - paul hadoop 0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0006
> -rw-r----- 3 paul hadoop 1155 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0006/gs-centos66-2.vpc.cloudera.com_8041
> -rw-r----- 3 paul hadoop 4880 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0006/gs28-centos66-3.vpc.cloudera.com_8041
> ...
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)