Abhishek

If you are using Ranger for Hive authorization, we recommend that you 
allow users to access Hive only through HiveServer2 (beeline or JDBC). That 
way, you can set access permissions on tables, columns, or even at the row 
level from Ranger, and they will be enforced at HiveServer2.

On the HiveServer2 service, you need to set doAs=false 
(hive.server2.enable.doAs), which essentially means that HiveServer2 uses the 
(hive) service user's credentials to access the underlying store (HDFS or S3). 
This has multiple advantages, including limiting the permissions you need to 
grant at the HDFS or S3 layer.
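
For reference, a minimal sketch of that setting, assuming you manage 
HiveServer2 through hive-site.xml (if you use a management tool such as 
Ambari, the same property would be set through its UI instead):

```xml
<!-- hive-site.xml fragment: run queries as the hive service user
     instead of impersonating the connecting end user -->
<property>
  <name>hive.server2.enable.doAs</name>
  <value>false</value>
</property>
```

HiveServer2 typically needs a restart for the change to take effect.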

RANGER-1300 is primarily about enabling authorization for tools that access 
the data layer directly, without any intermediate process, e.g. Apache Spark 
with LLAP. In that case, the only logical enforcement point is HDFS or S3.

S3 is a shared service hosted by AWS, and it doesn't provide any hook for 
third-party extensions. This makes it difficult for Ranger to embed its plugin 
within S3. Currently, the only option open is for Ranger to manage the S3 ACLs, 
which would require some work on the Ranger side.

If you have any suggestions for managing or enforcing permissions in S3, 
let's discuss them in RANGER-1300. That would be very helpful for everyone.

Thanks

Bosco

On 2/2/17, 11:28 PM, "Abhishek Somani" <[email protected]> wrote:

    Hi,
    
    I am currently evaluating using Ranger for hive authorization for tables
    with data residing in s3. With reference to
    https://issues.apache.org/jira/browse/RANGER-1300, can someone please
    explain what is the current support for s3 in Ranger. Does Ranger(primarily
    focused on Hive Authorization) work at all for tables backed with data in
    s3? I am sorry but in my few searches, I have not been able to find
    relevant documentation.
    
    
    Thanks,
    Abhishek
    

