[jira] [Updated] (MAPREDUCE-6792) Allow user's full principal name as owner of MapReduce staging directory in JobSubmissionFiles#JobStagingDir()

2016-10-25 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated MAPREDUCE-6792:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0-alpha2
   2.9.0
   Status: Resolved  (was: Patch Available)

I have committed the latest patch to trunk and branch-2. Thanks [~snayak] for 
contributing the patch!

> Allow user's full principal name as owner of MapReduce staging directory in 
> JobSubmissionFiles#JobStagingDir()
> --
>
> Key: MAPREDUCE-6792
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6792
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Reporter: Santhosh G Nayak
>Assignee: Santhosh G Nayak
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: MAPREDUCE-6792.1.patch, MAPREDUCE-6792.2.patch, 
> MAPREDUCE-6792.3.patch
>
>
> Background - 
> Currently, {{JobSubmissionFiles#JobStagingDir()}} assumes that the file owner 
> returned by {{FileSystem#getFileStatus()}} is always the user's short 
> principal name, which is true for HDFS. However, some HDFS-compatible file 
> systems that work in multi-tenant environments, such as [Azure Data Lake Store (ADLS) 
> |https://azure.microsoft.com/en-in/services/data-lake-store/], can have 
> users with the same name belonging to different domains, for example 
> {{us...@company1.com}} and {{us...@company2.com}}. The owner of a file or 
> directory would be ambiguous if {{FileSystem#getFileStatus()}} returned only 
> the user's short principal name (without the domain name). 
> The following code block allows only the short principal name as owner. It 
> fails, reporting that the ownership of the staging directory is not as 
> expected, if the owner returned by {{FileStatus#getOwner()}} is not equal to 
> the short principal name of the current user or of the login user.
> {code}
> String realUser;
> String currentUser;
> // Short names of the login (real) user and of the current user.
> UserGroupInformation ugi = UserGroupInformation.getLoginUser();
> realUser = ugi.getShortUserName();
> currentUser = UserGroupInformation.getCurrentUser().getShortUserName();
> if (fs.exists(stagingArea)) {
>   FileStatus fsStatus = fs.getFileStatus(stagingArea);
>   String owner = fsStatus.getOwner();
>   // Only an exact match against one of the two short names is accepted.
>   if (!(owner.equals(currentUser) || owner.equals(realUser))) {
>     throw new IOException("The ownership on the staging directory " +
>         stagingArea + " is not as expected. " +
>         "It is owned by " + owner + ". The directory must " +
>         "be owned by the submitter " + currentUser + " or " +
>         "by " + realUser);
>   }
>   // (the rest of the method is not part of this excerpt)
> }
> {code}
> The proposal is to remove the strict restriction on short principal name by 
> allowing the user's full principal name as owner of staging area directory in 
> {{JobSubmissionFiles#JobStagingDir()}}.
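
To illustrate the proposal, here is a minimal sketch that uses plain strings instead of Hadoop's {{UserGroupInformation}} so that it runs standalone. The class name, method name, and the principal {{alice@company1.com}} are hypothetical; this is not the code from the attached patches. It only shows the idea of accepting either the short or the full principal name as owner, using exact string comparison.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration; not taken from the MAPREDUCE-6792 patch. */
final class StagingDirOwnerCheck {

  /**
   * Returns true if the owner reported by the file system equals one of the
   * names the submitter may legitimately appear under (short or full name).
   */
  static boolean isAcceptableOwner(String owner, List<String> acceptableNames) {
    return acceptableNames.contains(owner);
  }

  public static void main(String[] args) {
    // Assumed example user: short name "alice", full name "alice@company1.com".
    List<String> names = Arrays.asList("alice", "alice@company1.com");

    // HDFS-style owner (short name) is still accepted.
    System.out.println(isAcceptableOwner("alice", names));
    // Owner reported with its domain is accepted as well.
    System.out.println(isAcceptableOwner("alice@company1.com", names));
    // Same short name in a different domain is rejected.
    System.out.println(isAcceptableOwner("alice@company2.com", names));
  }
}
```

With a check of this shape, a same-named user from another domain no longer passes, while the existing HDFS behaviour is unchanged.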



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6792) Allow user's full principal name as owner of MapReduce staging directory in JobSubmissionFiles#JobStagingDir()

2016-10-20 Thread Santhosh G Nayak (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Santhosh G Nayak updated MAPREDUCE-6792:

Attachment: MAPREDUCE-6792.3.patch

> Allow user's full principal name as owner of MapReduce staging directory in 
> JobSubmissionFiles#JobStagingDir()
> --
>
> Key: MAPREDUCE-6792
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6792
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Reporter: Santhosh G Nayak
>Assignee: Santhosh G Nayak
> Attachments: MAPREDUCE-6792.1.patch, MAPREDUCE-6792.2.patch, 
> MAPREDUCE-6792.3.patch
>
>



[jira] [Updated] (MAPREDUCE-6792) Allow user's full principal name as owner of MapReduce staging directory in JobSubmissionFiles#JobStagingDir()

2016-10-20 Thread Santhosh G Nayak (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Santhosh G Nayak updated MAPREDUCE-6792:

Attachment: MAPREDUCE-6792.2.patch

[~djp], thanks for reviewing the patch.

{quote}1. fileOwner.equalsIgnoreCase(currentUser.getUserName()) - I think our 
current assumption in hadoop is user name should be case sensitive, so user and 
USER are treated as different user. In AzureFS or other similar cloud based FS, 
do we change the assumption here especially for domain name? If not, we should 
keep case sensitive check here
{quote}

I would suggest making the comparison of the user's full principal name case 
insensitive, for the following two reasons:

- [RFC 4343|https://tools.ietf.org/html/rfc4343] specifies that domain names 
are case insensitive, and in most cases the Kerberos realm is the domain name 
in upper-case letters. Quoting the relevant line from 
[What-is-a-Kerberos-Principal|http://web.mit.edu/KERBEROS/krb5-1.5/krb5-1.5.4/doc/krb5-user/What-is-a-Kerberos-Principal_003f.html],
{quote} In most cases, your Kerberos realm is your domain name, in upper-case 
letters. For example, the machine daffodil.example.com would be in the realm 
EXAMPLE.COM.
{quote}
So, to match the file owner {{u...@example.com}} returned by the external 
cloud storage through {{FileSystem.getFileStatus()}} against 
{{u...@example.com}} from {{UserGroupInformation.getUserName()}}, we need a 
case-insensitive comparison.
- In Active Directory (AD) and Azure, the principal names {{u...@example.com}} 
and {{u...@example.com}} are treated as the same.
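
A small standalone sketch of the comparison being argued for: the short name is compared exactly, while the full principal name is compared with {{String#equalsIgnoreCase}}, as in the {{fileOwner.equalsIgnoreCase(currentUser.getUserName())}} call under review. The class, the method, and the principal {{alice@EXAMPLE.COM}} are hypothetical and chosen only for illustration.

```java
/** Hypothetical illustration of the owner comparison discussed above. */
final class PrincipalMatch {

  /**
   * Short names stay case sensitive; full principal names are matched
   * ignoring case, so a realm written in upper case still matches.
   */
  static boolean ownerMatches(String fileOwner, String shortName, String fullName) {
    return fileOwner.equals(shortName) || fileOwner.equalsIgnoreCase(fullName);
  }

  public static void main(String[] args) {
    String shortName = "alice";             // assumed short user name
    String fullName = "alice@EXAMPLE.COM";  // assumed Kerberos-style principal

    // Owner reported in lower case by the storage layer still matches.
    System.out.println(ownerMatches("alice@example.com", shortName, fullName));
    // A short name that differs only in case is still a different user.
    System.out.println(ownerMatches("ALICE", shortName, fullName));
  }
}
```

Note that {{equalsIgnoreCase}} ignores case in the user part as well as in the realm, which is the trade-off raised in the review comment.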

{quote}
2. The exception message include all possible usernames, it could be duplicated 
in case login user = real user (in case no proxy user get used). So we should 
do a quick check and only log both when login user != real user. Isn't it?
{quote}
Yes, I agree with you. Addressed it in patch #2.
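
One possible way to avoid the duplicated user name, sketched with plain strings: collect the expected owners in an insertion-ordered set so that identical names appear only once in the message. The class and method names are hypothetical, and this is not necessarily how patch #2 implements it.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical illustration; not taken from the MAPREDUCE-6792 patch. */
final class OwnershipMessage {

  /** Builds the error text, listing each expected owner only once. */
  static String build(String stagingArea, String owner,
                      String currentUser, String realUser) {
    // LinkedHashSet drops the duplicate when currentUser equals realUser
    // and keeps the submitter first.
    Set<String> expected = new LinkedHashSet<>();
    expected.add(currentUser);
    expected.add(realUser);
    return "The ownership on the staging directory " + stagingArea
        + " is not as expected. It is owned by " + owner
        + ". The directory must be owned by the submitter "
        + String.join(" or by ", expected);
  }

  public static void main(String[] args) {
    // No proxy user: both names are the same, so only one is printed.
    System.out.println(build("/tmp/staging", "carol", "alice", "alice"));
    // Proxy user: both names differ, so both are printed.
    System.out.println(build("/tmp/staging", "carol", "alice", "bob"));
  }
}
```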

{quote}
 3. It would be great if we can figure out some way to add unit test for use 
case that we are adding here.
{quote}
Added a few unit tests in patch #2 for the use case we are adding.

> Allow user's full principal name as owner of MapReduce staging directory in 
> JobSubmissionFiles#JobStagingDir()
> --
>
> Key: MAPREDUCE-6792
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6792
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Reporter: Santhosh G Nayak
>Assignee: Santhosh G Nayak
> Attachments: MAPREDUCE-6792.1.patch, MAPREDUCE-6792.2.patch
>
>

[jira] [Updated] (MAPREDUCE-6792) Allow user's full principal name as owner of MapReduce staging directory in JobSubmissionFiles#JobStagingDir()

2016-10-17 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6792:
---
Target Version/s: 2.9.0, 3.0.0-alpha2  (was: 2.9.0, 3.0.0-alpha1)

> Allow user's full principal name as owner of MapReduce staging directory in 
> JobSubmissionFiles#JobStagingDir()
> --
>
> Key: MAPREDUCE-6792
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6792
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Reporter: Santhosh G Nayak
>Assignee: Santhosh G Nayak
> Attachments: MAPREDUCE-6792.1.patch
>
>



[jira] [Updated] (MAPREDUCE-6792) Allow user's full principal name as owner of MapReduce staging directory in JobSubmissionFiles#JobStagingDir()

2016-10-13 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated MAPREDUCE-6792:
--
Target Version/s: 3.0.0-alpha1, 2.9.0  (was: 3.0.0-alpha1)

> Allow user's full principal name as owner of MapReduce staging directory in 
> JobSubmissionFiles#JobStagingDir()
> --
>
> Key: MAPREDUCE-6792
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6792
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Reporter: Santhosh G Nayak
>Assignee: Santhosh G Nayak
> Attachments: MAPREDUCE-6792.1.patch
>
>



[jira] [Updated] (MAPREDUCE-6792) Allow user's full principal name as owner of MapReduce staging directory in JobSubmissionFiles#JobStagingDir()

2016-10-13 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated MAPREDUCE-6792:
--
Status: Patch Available  (was: Open)

Submitting the patch to kick off the Jenkins tests.
The patch looks good overall. Several comments:
1. {{fileOwner.equalsIgnoreCase(currentUser.getUserName())}} - I think the 
current assumption in Hadoop is that user names are case sensitive, so user and 
USER are treated as different users. For AzureFS or other similar cloud-based 
file systems, do we change that assumption here, especially for the domain 
name? If not, we should keep the case-sensitive check here.
2. The exception message includes all possible user names, so it could contain 
duplicates when login user = real user (that is, when no proxy user is used). 
We should do a quick check and only log both when login user != real user, 
shouldn't we?
3. It would be great if we could figure out some way to add a unit test for the 
use case we are adding here.

> Allow user's full principal name as owner of MapReduce staging directory in 
> JobSubmissionFiles#JobStagingDir()
> --
>
> Key: MAPREDUCE-6792
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6792
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Reporter: Santhosh G Nayak
>Assignee: Santhosh G Nayak
> Attachments: MAPREDUCE-6792.1.patch
>
>



[jira] [Updated] (MAPREDUCE-6792) Allow user's full principal name as owner of MapReduce staging directory in JobSubmissionFiles#JobStagingDir()

2016-10-13 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated MAPREDUCE-6792:
--
Assignee: Santhosh G Nayak

> Allow user's full principal name as owner of MapReduce staging directory in 
> JobSubmissionFiles#JobStagingDir()
> --
>
> Key: MAPREDUCE-6792
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6792
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Reporter: Santhosh G Nayak
>Assignee: Santhosh G Nayak
> Attachments: MAPREDUCE-6792.1.patch
>
>



[jira] [Updated] (MAPREDUCE-6792) Allow user's full principal name as owner of MapReduce staging directory in JobSubmissionFiles#JobStagingDir()

2016-10-13 Thread Santhosh G Nayak (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Santhosh G Nayak updated MAPREDUCE-6792:

Attachment: MAPREDUCE-6792.1.patch

Attaching a patch containing the proposed changes.

> Allow user's full principal name as owner of MapReduce staging directory in 
> JobSubmissionFiles#JobStagingDir()
> --
>
> Key: MAPREDUCE-6792
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6792
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: client
>Reporter: Santhosh G Nayak
> Attachments: MAPREDUCE-6792.1.patch
>
>