Abhishek,

If you are using Ranger for Hive authorization, we recommend that you allow users to access Hive only through HiveServer2 (beeline or JDBC). That way, you can set access permissions on tables, columns, or even at the row level from Ranger, and they will be enforced at HiveServer2.
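As a rough illustration of what "HiveServer2 only" access looks like from the client side, here is a minimal Python sketch that builds a HiveServer2 JDBC URL and the matching beeline command line. The host name, user name, and helper function names are hypothetical, and port 10000 and the "default" database are assumed defaults that may differ in your setup.

```python
import shlex


def hive_jdbc_url(host, port=10000, database="default"):
    """Build a HiveServer2 JDBC URL of the form jdbc:hive2://host:port/database."""
    return f"jdbc:hive2://{host}:{port}/{database}"


def beeline_command(url, user):
    """Return the argv for a beeline session (-u is the JDBC URL, -n the user name)."""
    return ["beeline", "-u", url, "-n", user]


# Hypothetical host and user, for illustration only.
url = hive_jdbc_url("hs2.example.com")
cmd = beeline_command(url, "abhishek")

# Print a shell-safe version of the command that a user would run.
print(shlex.join(cmd))
```

Every query issued this way passes through HiveServer2, which is where the Ranger policies are checked.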
On the HiveServer2 service, you need to set doAs=false (hive.server2.enable.doAs=false), which means HiveServer2 uses the hive service user's credentials to access the underlying store (HDFS or S3). This has several advantages, including limiting the permissions you need to grant at the HDFS or S3 layer.

RANGER-1300 is primarily about enabling authorization for tools that access the data layer directly, without any intermediate process, e.g. Apache Spark with LLAP. In that case, the only logical enforcement point is HDFS or S3.

S3 is a shared service hosted by AWS, and it doesn't provide any hook for third-party extensions. This makes it difficult for Ranger to embed its plugin within S3. Currently, the only option open is for Ranger to manage the S3 ACLs, which would require some work on the Ranger side.

If you have any suggestions for managing or enforcing permissions in S3, let's discuss them in RANGER-1300. It would be very helpful for everyone.

Thanks

Bosco

On 2/2/17, 11:28 PM, "Abhishek Somani" <[email protected]> wrote:

Hi,

I am currently evaluating Ranger for Hive authorization for tables with data residing in S3. With reference to https://issues.apache.org/jira/browse/RANGER-1300, can someone please explain what the current support for S3 in Ranger is? Does Ranger (primarily focused on Hive authorization) work at all for tables backed by data in S3?

I am sorry, but in my few searches I have not been able to find relevant documentation.

Thanks,
Abhishek
