[jira] [Commented] (TEZ-3894) Tez intermediate outputs implicitly rely on permissive umask for shuffle

2020-07-23 Thread Tarek Abouzeid (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-3894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163589#comment-17163589
 ] 

Tarek Abouzeid commented on TEZ-3894:
-

Hi,

An update on this ticket: in Hortonworks HDP, the umask used by Tez was being 
taken from the HDFS service umask setting, which was 077. Changing it to 022 
fixed the problem.
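
For anyone checking the same thing, here is a minimal sketch of reading a umask value out of a Hadoop-style configuration file. The property name fs.permissions.umask-mode and the sample XML are assumptions for illustration; this thread does not say which file or property the HDP setting maps to, and find_property is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the <configuration>/<property> layout used by Hadoop *-site.xml files.
SAMPLE = """<configuration>
  <property>
    <name>fs.permissions.umask-mode</name>
    <value>077</value>
  </property>
</configuration>"""


def find_property(xml_text, name):
    """Return the value of the named property, or None if it is not set."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


umask = find_property(SAMPLE, "fs.permissions.umask-mode")
print("configured umask:", umask)
```

On a real cluster the XML would come from the deployed configuration rather than an inline string.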

Best Regards, 

> Tez intermediate outputs implicitly rely on permissive umask for shuffle
> ------------------------------------------------------------------------
>
> Key: TEZ-3894
> URL: https://issues.apache.org/jira/browse/TEZ-3894
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Jason Darrell Lowe
> Assignee: Jason Darrell Lowe
> Priority: Major
> Fix For: 0.9.2
>
> Attachments: TEZ-3894.001.patch
>
>
> Tez does not explicitly set the permissions of intermediate output files for 
> shuffle. In a secure cluster the shuffle service is running as a different 
> user than the task, so the output files require group readability in order to 
> serve up the data during the shuffle phase. If the umask is too restrictive 
> (e.g.: 077) then the task's file.out and file.out.index permissions can be 
> too restrictive to allow the shuffle handler to access them.
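
The effect described above can be reproduced on any POSIX system. The sketch below is not the attached patch; it only illustrates, under a umask of 077, that a newly created file loses group read access unless its mode is set explicitly afterwards. The helper name create_with_umask and the 0640 target mode are assumptions for illustration.

```python
import os
import stat
import tempfile


def create_with_umask(path, umask, explicit_mode=None):
    """Create a file under the given umask and return its permission bits."""
    old = os.umask(umask)
    try:
        # Request 0666 at creation; the kernel clears the bits set in the umask.
        fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o666)
        os.close(fd)
        if explicit_mode is not None:
            # chmod is not filtered by the umask, so the mode is applied as given.
            os.chmod(path, explicit_mode)
    finally:
        os.umask(old)  # always restore the previous umask
    return stat.S_IMODE(os.stat(path).st_mode)


with tempfile.TemporaryDirectory() as d:
    implicit = create_with_umask(os.path.join(d, "file.out"), 0o077)
    explicit = create_with_umask(os.path.join(d, "file.out.index"), 0o077, 0o640)
    print(oct(implicit), oct(explicit))
```

With the implicit mode, a reader in the file's group (such as a shuffle service running as a different user) cannot open the file; with the explicit mode it can.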



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (TEZ-3894) Tez intermediate outputs implicitly rely on permissive umask for shuffle

2019-05-17 Thread Jonathan Eagles (JIRA)


[ 
https://issues.apache.org/jira/browse/TEZ-3894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16842776#comment-16842776
 ] 

Jonathan Eagles commented on TEZ-3894:
--

[~tarekabouzeid91], as this jira is fixed in 0.9.2 (you state you are running a 
vendor version based on 0.9.1), I would consider upgrading Tez to 0.9.2. 
Otherwise, it would be best to pursue debugging through the vendor or through 
the Tez user group (u...@tez.apache.org).



[jira] [Commented] (TEZ-3894) Tez intermediate outputs implicitly rely on permissive umask for shuffle

2019-05-17 Thread Tarek Abouzeid (JIRA)


[ 
https://issues.apache.org/jira/browse/TEZ-3894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16841984#comment-16841984
 ] 

Tarek Abouzeid commented on TEZ-3894:
-

I am using Tez 0.9.1, and in prod file.out and file.out.index are being 
created with different permissions:
{code:java}
-rw-------. 1 hive hadoop 28 May 16 16:17 file.out
-rw-r-----. 1 hive hadoop 32 May 16 16:17 file.out.index
{code}
I am using Hortonworks 3.1. In another environment using the same version it 
works fine. I also checked the umask for the users (tez, hive, mapred), and all 
are set to 0022. Any tips that could help solve this, please?
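
A quick way to confirm what a reader in the file's group would see is to inspect the mode bits directly. This is a minimal sketch using only the Python standard library; the describe helper is hypothetical, and the temporary file and the 0600 and 0640 modes stand in for real intermediate outputs.

```python
import os
import stat
import tempfile


def describe(path):
    """Report the permission bits of a file and whether its group may read it."""
    st = os.stat(path)
    mode = stat.S_IMODE(st.st_mode)
    return {
        "octal": oct(mode),
        "string": stat.filemode(st.st_mode),
        "group_readable": bool(mode & stat.S_IRGRP),
    }


with tempfile.TemporaryDirectory() as d:
    sample = os.path.join(d, "file.out")
    open(sample, "w").close()

    os.chmod(sample, 0o600)  # owner-only: a group reader is refused
    restrictive = describe(sample)

    os.chmod(sample, 0o640)  # owner read/write plus group read
    readable = describe(sample)

print(restrictive)
print(readable)
```

Running describe against the actual file.out and file.out.index paths on an affected node would show whether group read is missing on either file.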



[jira] [Commented] (TEZ-3894) Tez intermediate outputs implicitly rely on permissive umask for shuffle

2018-02-09 Thread Kuhu Shukla (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16358844#comment-16358844
 ] 

Kuhu Shukla commented on TEZ-3894:
--

Precommit Output:
{noformat}
{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12909299/TEZ-3894.001.patch
  against master revision 96c988c.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/2725//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2725//console
{noformat}



[jira] [Commented] (TEZ-3894) Tez intermediate outputs implicitly rely on permissive umask for shuffle

2018-02-09 Thread Kuhu Shukla (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16358845#comment-16358845
 ] 

Kuhu Shukla commented on TEZ-3894:
--

The patch looks good to me. +1. Committing this shortly.



[jira] [Commented] (TEZ-3894) Tez intermediate outputs implicitly rely on permissive umask for shuffle

2018-02-08 Thread Kuhu Shukla (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357081#comment-16357081
 ] 

Kuhu Shukla commented on TEZ-3894:
--

Will review this today. Thanks [~jlowe] for the patch.



[jira] [Commented] (TEZ-3894) Tez intermediate outputs implicitly rely on permissive umask for shuffle

2018-01-31 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16347600#comment-16347600
 ] 

Jason Lowe commented on TEZ-3894:
-

This is the Tez equivalent of MAPREDUCE-7033. It will become more of an issue 
with Hadoop 3.x, since HADOOP-11347 fixed a bug in the local filesystem so that 
it honors the configured fs.permissions.umask-mode property. In 2.x that 
property was ignored and the local filesystem implicitly relied on the UNIX 
umask.
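
To make the arithmetic concrete, here is a small sketch of how a umask turns a requested mode into an effective one. It assumes an octal umask string and a requested file mode of 0666; both are illustrative assumptions, and the function names are hypothetical rather than part of Hadoop or Tez.

```python
def apply_umask(requested_mode: int, umask_octal: str) -> int:
    """Return the permission bits that remain after the umask is applied."""
    umask = int(umask_octal, 8)
    # Every bit set in the umask is cleared from the requested mode.
    return requested_mode & ~umask & 0o777


def group_can_read(mode: int) -> bool:
    """True if the group-read bit (0040) is set."""
    return bool(mode & 0o040)


for umask in ("022", "077"):
    mode = apply_umask(0o666, umask)
    print(umask, oct(mode), "group readable:", group_can_read(mode))
```

Under these assumptions a umask of 022 leaves the group-read bit in place, while 077 clears it, which is the case where a reader running as a different user in the same group is locked out.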
