Billy

 

I was going through the Ranger emails and came across this and found it 
interesting.

 

You might have already figured this out, but let me answer from my point of 
view anyway:

 

> We’d basically want to configure N master Kerberos with RW access and a list 
> of Hadoop groups (thousands) for the readers. All these would need to be able 
> to access encrypted files, the union of the groups need to read, the masters 
> need to write.

By design, it should just work as you expect. Previously we had UI issues when 
showing thousands of groups in a single policy, but if you are using the REST 
API, it should work for you. Let us know otherwise.
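
As a rough illustration of the REST route, here is a minimal Python sketch that 
builds one policy body with the ring principals as writers and a few thousand 
groups as readers. The service name, policy name, path, principal names and 
group names are all made up for the example, and the field layout follows my 
understanding of the policy JSON used by Ranger's public v2 API, so please 
verify it against your Ranger version before relying on it.

```python
import json


def build_hdfs_policy(service, name, path, masters, reader_groups):
    """Build a policy body: masters get read/write, reader groups get read."""

    def accesses(*types):
        return [{"type": t, "isAllowed": True} for t in types]

    return {
        "service": service,          # hypothetical Ranger HDFS service name
        "name": name,                # hypothetical policy name
        "isEnabled": True,
        "resources": {
            "path": {"values": [path], "isRecursive": True, "isExcludes": False}
        },
        "policyItems": [
            # One item for the ring principals (writers).
            {"users": sorted(masters), "groups": [],
             "accesses": accesses("read", "write", "execute")},
            # One item carrying the union of all reader groups.
            {"users": [], "groups": sorted(set(reader_groups)),
             "accesses": accesses("read", "execute")},
        ],
    }


policy = build_hdfs_policy(
    service="cluster_hadoop",
    name="lake-encryption-zone",
    path="/lake",
    masters=["ring1", "ring2"],
    reader_groups=[f"dataset_{i}" for i in range(3000)],
)
body = json.dumps(policy)  # JSON text to send to Ranger Admin
```

The resulting body would then be POSTed to Ranger Admin (I believe the endpoint 
is /service/public/v2/api/policy) with admin credentials. Keeping all reader 
groups in a single policy item means one request per refresh rather than one 
per group.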

 

> Can Ranger use the HDFS group provider as a source of groups for example?

The requirements are different. The HDFS group provider returns the groups for 
a given user, while Ranger needs to get all the users and groups. There was a 
discussion some time ago about getting the users from LDAP but doing the group 
lookup using the Hadoop UGI utility. This would eliminate the need to configure 
group filtering in Ranger UserSync. Sailaja might be able to provide more 
insight into what happened to this suggestion. I am not able to find the JIRA 
for it.

 

Regardless, you should be able to sync your own groups from a custom flat file 
or even through the API. With this, you don’t have to depend on the groups from 
AD/LDAP.
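
To make the flat-file idea concrete, here is a small Python sketch that turns a 
group-to-members mapping (the shape your entitlement metadata might have) into 
one line per user listing that user's groups. The "user,group1,group2" layout 
and the sample names are assumptions for illustration only; the exact file 
format expected by the UserSync file source should be checked in the Ranger 
documentation.

```python
import csv
import io
from collections import defaultdict


def invert_membership(group_members):
    """Turn {group: [users]} into {user: [groups]}, sorted for stable output."""
    user_groups = defaultdict(set)
    for group, members in group_members.items():
        for user in members:
            user_groups[user].add(group)
    return {user: sorted(groups) for user, groups in sorted(user_groups.items())}


def to_flat_file(user_groups):
    """Render one CSV line per user: user,group1,group2,... (assumed layout)."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for user, groups in user_groups.items():
        writer.writerow([user, *groups])
    return out.getvalue()


# Hypothetical entitlement data: dataset group -> member users.
entitlements = {
    "dataset_a": ["alice", "bob"],
    "dataset_b": ["bob"],
}
flat = to_flat_file(invert_membership(entitlements))
```

Since your groups are recalculated every 15 minutes, a job like this could 
regenerate the file on the same schedule.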

 

Bosco


From: "Newport, Billy" <[email protected]>
Reply-To: <[email protected]>
Date: Wednesday, March 29, 2017 at 6:52 AM
To: "'[email protected]'" <[email protected]>
Subject: HDFS encryption zone policies with Ranger

 

We have a current setup with HW 2.5 using our own group provider for HDFS 
permissions. We did this because we manage permissions for thousands of users 
in thousands of groups, which are recalculated every 15 minutes. We had NameNode 
stability problems when we were using an LDAP provider (slow responses caused 
the NN to fall over, resulting in outages). Once we switched to our group 
provider, we got sub-second response times and it has been stable.

 

My question is about adding HDFS encryption to this cluster. Our ‘data lake’ 
implementation maintains metadata on all data that we keep in the cluster. 
Right now, we have a single master Kerberos which owns every file in our lake 
folder. Every dataset that we store in a folder tree is owned by that user and 
uses a dataset-specific group to control read access.

 

We are moving to a ‘ring’ system where instead of having a single Kerberos, we 
have a Kerberos per ring. Each ring owns an exclusive set of datasets. So, it’s 
exactly the same but now ownership of the files is divided between the rings. 
The ring Kerberos must be a member of the dataset groups in order for chgrp to 
work.

 

Now, back to the question. We want to enable encryption. For now, simply using 
a single encryption zone for all rings is sufficient. How do we automatically 
configure the Ranger side of this using the entitlement metadata in our lake? 
We’d basically want to configure N master Kerberos with RW access and a list of 
Hadoop groups (thousands) for the readers. All these would need to be able to 
access encrypted files, the union of the groups need to read, the masters need 
to write.

 

Is it possible to do this efficiently using the REST APIs? Ranger would be 
maintaining this information on behalf of HDFS as far as I can see. Can Ranger 
use the HDFS group provider as a source of groups for example? This would make 
this very easy.

 

Thanks for the help

Billy

 
