We currently run HW 2.5 with our own group provider for HDFS permissions. We did this because we manage permissions for thousands of users in thousands of groups, which are recalculated every 15 minutes. We had NameNode stability problems when we were using an LDAP provider (slow responses caused the NN to fall over, resulting in outages). Since we switched to our own group provider, we get sub-second response times and the cluster is stable.
My question is about adding HDFS encryption to this cluster. Our 'data lake' implementation maintains metadata on all data that we keep in the cluster. Right now, we have a single master Kerberos principal which owns every file in our lake folder. Every dataset that we store in a folder tree is owned by that user and uses a dataset-specific group to control read access.

We are moving to a 'ring' system where, instead of having a single Kerberos principal, we have one Kerberos principal per ring. Each ring owns an exclusive set of datasets. So it's exactly the same as before, but ownership of the files is now divided between the rings. The ring principal must be a member of the dataset groups in order for chgrp to work.

Now, back to the question. We want to enable encryption. For now, a single encryption zone for all rings is sufficient. How do we automatically configure the Ranger side of this using the entitlement metadata in our lake? We'd basically want to configure N master Kerberos principals with RW access and a list of Hadoop groups (thousands) for the readers. All of these would need to be able to access encrypted files: the union of the groups needs to read, and the masters need to write.

Is it possible to do this efficiently using REST APIs? As far as I can see, Ranger would be maintaining this information on behalf of HDFS. Can Ranger use the HDFS group provider as a source of groups, for example? That would make this very easy.

Thanks for any help,
Billy
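To make the request above concrete, here is a minimal sketch of what generating such a policy from the lake metadata might look like. It is not a confirmed solution. It assumes the Ranger public v2 policy endpoint (`/service/public/v2/api/policy`) with basic authentication, a Ranger KMS service whose key resource is named `keyname`, and that both writers and readers need the `decrypteek` and `getmetadata` access types on the zone key, with read versus write still decided by HDFS file permissions. The service name, key name, principals, and groups are all hypothetical placeholders.

```python
import base64
import json
import urllib.request


def build_kms_policy(service_name, key_name, ring_principals, reader_groups):
    """Build one Ranger policy dict covering a single encryption zone key.

    Assumption: the KMS service definition exposes a 'keyname' resource and
    the 'getmetadata' / 'decrypteek' access types.
    """
    def accesses():
        return [
            {"type": "getmetadata", "isAllowed": True},
            {"type": "decrypteek", "isAllowed": True},
        ]

    return {
        "service": service_name,
        "name": "lake-zone-" + key_name,
        "isEnabled": True,
        "resources": {
            "keyname": {"values": [key_name], "isExcludes": False, "isRecursive": False}
        },
        "policyItems": [
            # One item for the N ring (master) principals.
            {"users": sorted(set(ring_principals)), "groups": [],
             "accesses": accesses(), "delegateAdmin": False},
            # One item for the union of all dataset reader groups.
            {"users": [], "groups": sorted(set(reader_groups)),
             "accesses": accesses(), "delegateAdmin": False},
        ],
    }


def post_policy(ranger_url, admin_user, admin_password, policy):
    """POST the policy to Ranger Admin (endpoint path is an assumption)."""
    token = base64.b64encode((admin_user + ":" + admin_password).encode()).decode()
    request = urllib.request.Request(
        ranger_url.rstrip("/") + "/service/public/v2/api/policy",
        data=json.dumps(policy).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Authorization": "Basic " + token},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Hypothetical values; in practice these would come from the lake metadata.
    policy = build_kms_policy(
        "cluster_kms", "lake_key",
        ring_principals=["ring1", "ring2"],
        reader_groups=["ds_sales_r", "ds_hr_r"],
    )
    print(json.dumps(policy, indent=2))
```

Keeping all reader groups in a single policy item would mean the 15-minute recalculation only has to rewrite one policy, rather than maintain thousands of them. Whether Ranger handles a policy item with thousands of groups well is something that would need testing.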
