[jira] [Updated] (SPARK-8427) Incorrect ACL checking for partitioned table in Spark SQL-1.4

2015-06-18 Thread Patrick Wendell (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-8427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Wendell updated SPARK-8427:
---
Priority: Critical  (was: Blocker)

 Incorrect ACL checking for partitioned table in Spark SQL-1.4
 -------------------------------------------------------------

 Key: SPARK-8427
 URL: https://issues.apache.org/jira/browse/SPARK-8427
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.4.0
 Environment: CentOS 6 & OS X 10.9.5, Hive-0.13.1, Spark-1.4, Hadoop 2.6.0
Reporter: Karthik Subramanian
Priority: Critical
  Labels: security

 Problem Statement:
 When querying a partitioned table with Spark SQL (version 1.4.0), an
 access-denied exception is thrown for a partition the user does not belong
 to (user permissions are controlled using HDFS ACLs). The same query works
 correctly in Hive.
 Use case: multitenancy.
 Consider a table containing multiple customers, each customer with multiple
 facilities, partitioned by customer and facility. A user belonging to one
 facility must not have access to other facilities; this is enforced using
 HDFS ACLs on the corresponding directories. When querying the table as
 'user1', who belongs to 'customer1' and 'facility1', restricted to that
 particular partition (using a 'where' clause), only access to the
 corresponding directory should be verified, not the entire table.
 The above use case works as expected with the Hive client, versions 0.13.1
 and 1.1.0.
 The query used: select count(*) from customertable where customer='customer1'
 and facility='facility1'
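 For reference, a minimal spark-shell repro (a sketch; the table, partition,
 and user names are the hypothetical ones used in this report):

   // Assumed layout (illustrative): user1 may read only
   //   /data/customertable/customer=customer1/facility=facility1
   // and is denied on
   //   /data/customertable/customer=customer2/facility=facility2
   // In Spark 1.4's spark-shell with Hive support, sqlContext is a HiveContext.
   val counts = sqlContext.sql(
     "select count(*) from customertable " +
     "where customer = 'customer1' and facility = 'facility1'")
   counts.show()  // fails with AccessControlException while listing facility2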
 Below is the exception received in spark-shell:
 org.apache.hadoop.security.AccessControlException: Permission denied: 
 user=user1, access=READ_EXECUTE, 
 inode="/data/customertable/customer=customer2/facility=facility2":root:supergroup:drwxrwx---:group::r-x,group:facility2:rwx
   at 
 org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkAccessAcl(FSPermissionChecker.java:351)
   at 
 org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:253)
   at 
 org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:185)
   at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6512)
   at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6494)
   at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPathAccess(FSNamesystem.java:6419)
   at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListingInt(FSNamesystem.java:4954)
   at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListing(FSNamesystem.java:4915)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getListing(NameNodeRpcServer.java:826)
   at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:612)
   at 
 org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
   at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
   at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
   at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
   at 
 org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
   at 
 org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
   at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1971)
   at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1952)
   at 
 org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:693)
   at 
 org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:105)
   at 
 org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:755)
  at ...
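 One possible workaround until partition pruning happens before the directory
 listing (a sketch, not verified against 1.4; it assumes the partition data is
 stored as Parquet, so adjust the reader to the table's actual format): point
 Spark at the single partition directory, so only the path user1 can read is
 ever listed:

   // Hedged workaround sketch: listing starts inside the accessible
   // partition, so the inaccessible facility2 directory is never touched.
   val part = sqlContext.read.parquet(
     "/data/customertable/customer=customer1/facility=facility1")
   println(part.count())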

[jira] [Updated] (SPARK-8427) Incorrect ACL checking for partitioned table in Spark SQL-1.4

2015-06-17 Thread Karthik Subramanian (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-8427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Subramanian updated SPARK-8427:
---
Environment: CentOS 6 & OS X 10.9.5, Hive-0.13.1, Spark-1.4, Hadoop 2.6.0
(was: CentOS 6, Hive-0.13.1, Spark-1.4, Hadoop 2.6.0)
