[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated MAPREDUCE-6550:
----------------------------------------
    Description: 
The archive-logs tool added in MAPREDUCE-6415 leverages the Distributed Shell 
app.  When using the DefaultContainerExecutor, the job actually runs as the 
YARN user, so the resulting HAR files are owned by the YARN user instead of 
the original owner. The archived files are also world-readable, unlike the 
original logs.

In the example below, the archived logs are owned by 'yarn' instead of 'paul' 
and are world-readable:
{noformat}
[root@gs28-centos66-5 ~]# sudo -u hdfs hdfs dfs -ls -R /tmp/logs
...
drwxrwx---   - paul  hadoop          0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005
drwxr-xr-x   - yarn  hadoop          0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har
-rw-r--r--   3 yarn  hadoop          0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/_SUCCESS
-rw-r--r--   3 yarn  hadoop       1256 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/_index
-rw-r--r--   3 yarn  hadoop         24 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/_masterindex
-rw-r--r--   3 yarn  hadoop    8451177 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/part-0
drwxrwx---   - paul  hadoop          0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0006
-rw-r-----   3 paul  hadoop       1155 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0006/gs-centos66-2.vpc.cloudera.com_8041
-rw-r-----   3 paul  hadoop       4880 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0006/gs28-centos66-3.vpc.cloudera.com_8041
...
{noformat}

  was:
The archive-logs tool added in MAPREDUCE-6415 leverages the Distributed Shell 
app.  When using the DistributedContainerExecutor, the job actually runs as the 
YARN user, so the resulting HAR files are owned by the YARN user instead of 
the original owner. The archived files are also world-readable, unlike the 
original logs.

In the example below, the archived logs are owned by 'yarn' instead of 'paul' 
and are world-readable:
{noformat}
[root@gs28-centos66-5 ~]# sudo -u hdfs hdfs dfs -ls -R /tmp/logs
...
drwxrwx---   - paul  hadoop          0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005
drwxr-xr-x   - yarn  hadoop          0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har
-rw-r--r--   3 yarn  hadoop          0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/_SUCCESS
-rw-r--r--   3 yarn  hadoop       1256 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/_index
-rw-r--r--   3 yarn  hadoop         24 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/_masterindex
-rw-r--r--   3 yarn  hadoop    8451177 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/part-0
drwxrwx---   - paul  hadoop          0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0006
-rw-r-----   3 paul  hadoop       1155 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0006/gs-centos66-2.vpc.cloudera.com_8041
-rw-r-----   3 paul  hadoop       4880 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0006/gs28-centos66-3.vpc.cloudera.com_8041
...
{noformat}


> archive-logs tool changes log ownership to the Yarn user when using 
> DefaultContainerExecutor
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6550
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6550
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 2.8.0
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>         Attachments: MAPREDUCE-6550.001.patch
>
>
> The archive-logs tool added in MAPREDUCE-6415 leverages the Distributed Shell 
> app.  When using the DefaultContainerExecutor, the job actually runs as the 
> YARN user, so the resulting HAR files are owned by the YARN user instead of 
> the original owner. The archived files are also world-readable, unlike the 
> original logs.
> In the example below, the archived logs are owned by 'yarn' instead of 'paul' 
> and are world-readable:
> {noformat}
> [root@gs28-centos66-5 ~]# sudo -u hdfs hdfs dfs -ls -R /tmp/logs
> ...
> drwxrwx---   - paul  hadoop          0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005
> drwxr-xr-x   - yarn  hadoop          0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har
> -rw-r--r--   3 yarn  hadoop          0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/_SUCCESS
> -rw-r--r--   3 yarn  hadoop       1256 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/_index
> -rw-r--r--   3 yarn  hadoop         24 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/_masterindex
> -rw-r--r--   3 yarn  hadoop    8451177 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har/part-0
> drwxrwx---   - paul  hadoop          0 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0006
> -rw-r-----   3 paul  hadoop       1155 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0006/gs-centos66-2.vpc.cloudera.com_8041
> -rw-r-----   3 paul  hadoop       4880 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0006/gs28-centos66-3.vpc.cloudera.com_8041
> ...
> {noformat}
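
The check that the listing above illustrates by eye can also be scripted. 
Below is a minimal sketch in Python, not part of the MAPREDUCE-6550 patch, 
that parses `hdfs dfs -ls -R` output and reports entries whose owner differs 
from the expected user or that are world-readable. The names `parse_ls` and 
`find_violations` and both regular expressions are assumptions made for 
illustration; the column layout is inferred from the listing above.

```python
import re
from typing import List, NamedTuple, Tuple


class Entry(NamedTuple):
    perms: str
    owner: str
    group: str
    path: str


# Assumed column layout, inferred from the listing in this issue:
# permissions, replication, owner, group, size, date, time, path.
LS_LINE = re.compile(
    r"^(?P<perms>[d-][rwxsStT-]{9}\+?)\s+\S+\s+(?P<owner>\S+)\s+(?P<group>\S+)"
    r"\s+\d+\s+\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}\s+(?P<path>\S+)$"
)
# A line that begins a new entry starts with a permission string.
PERMS_START = re.compile(r"^[d-][rwxsStT-]{9}\+?\s")


def parse_ls(text: str) -> List[Entry]:
    """Parse listing text, rejoining entries that mail wrapping split in two."""
    entries = []
    pending = ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "...":
            pending = ""  # elided or blank line: drop any half-read entry
            continue
        if PERMS_START.match(line):
            pending = line  # start of a new entry
        elif pending:
            pending = pending + " " + line  # continuation (the wrapped path)
        else:
            continue  # shell prompt or other noise
        m = LS_LINE.match(pending)
        if m:
            entries.append(Entry(m["perms"], m["owner"], m["group"], m["path"]))
            pending = ""
    return entries


def find_violations(entries: List[Entry], expected_owner: str) -> List[Tuple[str, str]]:
    """Return (path, reason) pairs for wrong-owner or world-readable entries."""
    problems = []
    for e in entries:
        if e.owner != expected_owner:
            problems.append((e.path, f"owned by {e.owner}, expected {expected_owner}"))
        if e.perms[7] == "r":  # index 7 is the read bit for "other"
            problems.append((e.path, "world-readable"))
    return problems


SAMPLE = """\
drwxrwx---   - paul  hadoop          0 2015-10-02 13:24
/tmp/logs/paul/logs/application_1443805425363_0005
drwxr-xr-x   - yarn  hadoop          0 2015-10-02 13:24
/tmp/logs/paul/logs/application_1443805425363_0005/application_1443805425363_0005.har
-rw-r-----   3 paul  hadoop       1155 2015-10-02 13:24 /tmp/logs/paul/logs/application_1443805425363_0006/gs-centos66-2.vpc.cloudera.com_8041
"""

if __name__ == "__main__":
    for path, reason in find_violations(parse_ls(SAMPLE), "paul"):
        print(f"{path}: {reason}")
```

Run against the full listing above with "paul" as the expected owner, this 
would flag the .har directory and the files under it for both ownership and 
world-readability, and leave the application_1443805425363_0006 entries alone.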



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
