Re: Ranger with Hive & Knox/WebHCat

Don Bosco Durai Fri, 18 Aug 2017 08:52:04 -0700

James

Yes, this should address your requirement. My only other suggestion is,
wherever possible, to have most users go via HiveServer2 and only a small set
of users (mostly ETL service users) access via Hive CLI. If you provide access
**only** via JDBC/HiveServer2 and no HDFS-level permission (i.e. without #3 in
your list), then you should also be able to enforce the following now or in
the future:
1. Column Level Access Control
2. Dynamic Masking of Columns based on users and groups
3. Row Level Filtering


I assume you have already taken care of the HDFS umask and done the initial
chmod. The recommended umask is 077 or 027.
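
As a rough illustration of what those two umask values leave behind, here is a minimal sketch. It assumes the usual base modes of 666 for new files and 777 for new directories, and that the umask is the one configured through fs.permissions.umask-mode; the helper name effective_mode is made up for this example.

```python
# Base modes before the umask is applied (assumed: 666 for files, 777 for directories).
FILE_BASE = 0o666
DIR_BASE = 0o777


def effective_mode(base: int, umask: int) -> str:
    """Return the permission bits left after masking, as a 3-digit octal string."""
    return format(base & ~umask & 0o777, "03o")


for umask in (0o077, 0o027):
    print(
        f"umask {umask:03o}: files {effective_mode(FILE_BASE, umask)}, "
        f"directories {effective_mode(DIR_BASE, umask)}"
    )
```

With 077 only the owning user keeps access to newly created files and directories, so everyone else has to be granted access through Ranger policies. 027 additionally leaves read access for the owning group.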

This article describes the HDFS-level security settings in more detail:
https://hortonworks.com/blog/best-practices-in-hdfs-authorization-with-apache-ranger/

Thanks

Bosco


On 8/18/17, 8:24 AM, "James Srinivasan" <[email protected]> wrote:

    Thanks very much for clarifying, here's what I have done:
    
    1) Locked down Hadoop HDFS permissions to /apps/hive/warehouse
    2) Added Ranger HDFS policy for hive user to access /apps/hive/warehouse/*
    3) Added Ranger HDFS policy for johndoe user to access
    /apps/hive/warehouse/mydatabase/*
    4) Added Ranger Hive policy for johndoe user to access mydatabase
    
    We are using Kerberos & LDAP (Windows AD implementation) throughout.
    
    Inside our network, for clients using beeline #1-4 apply.
    Inside our network, for clients using the hive command line #1-3 apply.
    Outside our network, clients come in via Knox, to WebHCat (not Hive
    JDBC), to the hive endpoint [1] for which #1-3 apply
    
    This gives the combination of internal and external security we
    require - many thanks!
    
    James
    
    [1] See https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference+Hive#WebHCatReferenceHive-CurlCommand
    
    On 15 August 2017 at 23:55, Don Bosco Durai <[email protected]> wrote:
    > Larry, you are correct. If it is not going via HiveServer2, then Hive
    > policies will not be enforced. In this case HDFS policies need to be
    > configured.
    >
    > I was assuming James had configured Knox to use JDBC/HiveServer2, which I
    > feel should be the correct thing to do.
    >
    > Bosco
    >
    >
    > On 8/15/17, 3:50 PM, "Larry McCay" <[email protected]> wrote:
    >
    >     Hive access via WebHCat - via java, pig or whatever is probably not
    >     going to be protected by same policies that are set for HiveServer2
    >     access.
    >     JDBC enforcement point is inside the HS2 server and WebHCat
    >     enforcement point must be closer to the actual resource.
    >
    >     @Bosco, please correct me if I am wrong.
    >
    >     > On Aug 15, 2017, at 6:45 PM, Don Bosco Durai <[email protected]> wrote:
    >     >
    >     > If you are using Knox, then it is just a pass through to connect to
    >     > HiveServer2 via JDBC. So the policies should just work the same way
    >     > as you will be connecting via beeline or any other JDBC client.
    >     >
    >     > The best way to validate is to see how Ranger is allowing it. You
    >     > can check Ranger Audit logs and it will tell you which policy
    >     > allowed and for which user.
    >     >
    >     > Bosco
    >     >
    >     >
    >     > On 8/15/17, 2:45 PM, "James Srinivasan" <[email protected]> wrote:
    >     >
    >     >    Does Ranger support the same fine grained access control when
    >     >    Hive is accessed via JDBC versus when Hive is accessed via
    >     >    Knox/WebHCat? Our experience is that it works fine in the former
    >     >    case, but in the latter case the fine grained access control set
    >     >    in our Hive policies seems to be ignored.
    >     >
    >     >    Many thanks
    >     >
    >     >
    >     >
    >
    >
    >
    >
