If we only have ALLOW that does not mean DENY for people that have not been 
explicitly allowed, it means NOT_SPECIFIED (or similar is what it's called in 
the code) and the HDFS ACLs are checked.
 

You are correct. This is by design, so that authorization plugins can be 
chained. When a plugin has no explicit DENY or ALLOW, evaluation goes to the 
next plugin. In the case of HDFS and YARN, we fall back to the native 
policies. YARN has a global switch to turn this off. HDFS is trickier: in some 
cases it would be too much work to manage the policies in Ranger, e.g. 
policies for the /tmp folder, service folders, etc. 
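
The chaining described above can be sketched roughly as follows. This is only 
a simplified model, not Ranger's actual code: the names Decision, ranger_like, 
native_acl_like and authorize are made up for illustration, and the final 
default of DENY when nobody decides is an assumption.

```python
from enum import Enum
from typing import Callable, Iterable


class Decision(Enum):
    ALLOW = "ALLOW"
    DENY = "DENY"
    NOT_SPECIFIED = "NOT_SPECIFIED"


# An authorizer maps (user, path) to a Decision.
Authorizer = Callable[[str, str], Decision]


def ranger_like(user: str, path: str) -> Decision:
    # Hypothetical plugin with a single ALLOW for "alice" under /data.
    if path.startswith("/data") and user == "alice":
        return Decision.ALLOW
    # No explicit ALLOW or DENY: leave the decision to the next authorizer.
    return Decision.NOT_SPECIFIED


def native_acl_like(user: str, path: str) -> Decision:
    # Hypothetical stand-in for native HDFS permissions: /tmp is open to all.
    if path.startswith("/tmp"):
        return Decision.ALLOW
    return Decision.DENY


def authorize(user: str, path: str, chain: Iterable[Authorizer]) -> Decision:
    # Walk the chain until one authorizer gives an explicit answer.
    for authorizer in chain:
        decision = authorizer(user, path)
        if decision is not Decision.NOT_SPECIFIED:
            return decision
    # Assumption: if nobody decides, access is denied.
    return Decision.DENY
```

With the chain [ranger_like, native_acl_like], a request from "bob" for a 
path under /tmp is not decided by the first authorizer and is allowed by the 
second one, which is the fallback behaviour discussed in this thread.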

 

The JIRA Madhan mentioned would be a good way to solve some specific use 
cases, such as the way you have it set up (3 policies).

 

I feel that in the long run we should have something similar to Security Zone 
(or an option in SecurityZone itself) in which we identify certain resources, 
e.g. /user, /hive/warehouse, /data_folders, etc. (or the inverse), to be 
managed exclusively by Ranger with no fallback. That way, without Ranger 
policies the users won’t get access to the resource. This might be a cleaner 
approach.

 

Bosco

 

 

 

 

From: Lars Francke <lars.fran...@gmail.com>
Reply-To: <user@ranger.apache.org>
Date: Friday, January 24, 2020 at 12:31 AM
To: <user@ranger.apache.org>
Subject: Re: Ranger policies best practices

 

Madhan,

 

thank you for the pointer. That looks promising! We'll try to get Ranger 2 
running to evaluate.

 

Cheers,

Lars

 

On Fri, Jan 24, 2020 at 9:03 AM Madhan Neethiraj <mad...@apache.org> wrote:

Lars,

 

The enhancement in RANGER-2507 introduced the notion of “DenyAllElse”, which 
denies access to specified resources unless explicitly allowed by the policy. 
This should help address your use case. Please review.
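
As a rough illustration of the “DenyAllElse” semantics described above (a 
simplified model only, not the RANGER-2507 implementation; the Policy class, 
its field names and the evaluate function are hypothetical):

```python
from dataclasses import dataclass, field
from typing import Iterable, Set


@dataclass
class Policy:
    path_prefix: str
    allowed_groups: Set[str] = field(default_factory=set)
    # Models the "DenyAllElse" notion; the field name is illustrative only.
    deny_all_else: bool = False


def evaluate(policy: Policy, user_groups: Iterable[str], path: str) -> str:
    # The policy says nothing about resources it does not cover.
    if not path.startswith(policy.path_prefix):
        return "NOT_SPECIFIED"
    if policy.allowed_groups & set(user_groups):
        return "ALLOW"
    # Without deny_all_else the request falls through to the next authorizer;
    # with it, everyone not explicitly allowed is denied.
    return "DENY" if policy.deny_all_else else "NOT_SPECIFIED"
```

In this model a single policy with deny_all_else set gives an explicit DENY 
to everyone outside the allowed groups for the covered paths, while paths the 
policy does not cover still return NOT_SPECIFIED.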

 

Madhan

 

 

From: Lars Francke <lars.fran...@gmail.com>
Reply-To: "user@ranger.apache.org" <user@ranger.apache.org>
Date: Thursday, January 23, 2020 at 11:43 PM
To: "user@ranger.apache.org" <user@ranger.apache.org>
Subject: Re: Ranger policies best practices

 

Hi Bosco and thanks for the quick response!

 

Ranger policy definitions have evolved over time to address more complex use 
cases. Can you come up with some real-world use cases? We can try to come up 
with policies for them.

 

Relatively simple:

* If we have a policy for a resource (talking about HDFS) then we want to ALLOW 
only based on the Ranger policy and _not_ fall back on HDFS

* If we do not have a policy for a resource we want the fallback

 

At a high level, here are the key points:

* A Deny policy anywhere (tag/resource level) trumps. The exception would be 
conditional policies in Ranger 2.0.
* An Allow policy is needed for providing access to a resource. Allow policies 
are processed after all DENY policies are processed.
 

In the flow you gave, you only need an “ALLOW” policy.

* add an ALLOW <group> policy

* add a DENY public group

* add a DENY EXCLUDE <group> policy

 

I believe that's not correct but would be happy to be wrong myself ;-)

But I think this was due to my earlier mail not being clear on what our 
requirements are (see above).

 

If we only have ALLOW that does not mean DENY for people that have not been 
explicitly allowed, it means NOT_SPECIFIED (or similar is what it's called in 
the code) and the HDFS ACLs are checked.

So to prevent HDFS checking we need the DENY "public" group but because that is 
checked before ALLOW we _also_ need DENY EXCLUDE.
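
A small sketch of why all three entries are needed, under a simplified model 
of the evaluation order discussed in this thread (DENY first, then ALLOW, 
otherwise NOT_SPECIFIED). The Policy class and evaluate function are 
hypothetical, and treating "public" as a group that contains every user is an 
assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import Iterable, Set

PUBLIC = "public"  # assumed to be a group every user belongs to


@dataclass
class Policy:
    allow: Set[str] = field(default_factory=set)
    deny: Set[str] = field(default_factory=set)
    deny_exclude: Set[str] = field(default_factory=set)


def evaluate(policy: Policy, user_groups: Iterable[str]) -> str:
    groups = set(user_groups) | {PUBLIC}
    # DENY is checked first; members of an excluded group skip the DENY.
    if policy.deny & groups and not (policy.deny_exclude & groups):
        return "DENY"
    if policy.allow & groups:
        return "ALLOW"
    # Neither allowed nor denied: the fallback (e.g. HDFS ACLs) decides.
    return "NOT_SPECIFIED"


allow_only = Policy(allow={"analysts"})
deny_public = Policy(allow={"analysts"}, deny={PUBLIC})
three_rules = Policy(allow={"analysts"}, deny={PUBLIC},
                     deny_exclude={"analysts"})
```

In this model, allow_only leaves non-members at NOT_SPECIFIED, deny_public 
denies even the members of "analysts", and only three_rules allows the group 
while denying everyone else.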

 

To sum it up: We want the fallback to HDFS to be configurable not just 
globally but per policy, and until yesterday I always assumed this was already 
the case.

 

One example for DENY would be:

Your company is hosting interns over the summer and they will be doing some 
machine learning projects. The interns will need access to your dataset, but 
your company policy doesn’t allow them to view PII data. However, one intern 
named Julia is an exception and may access PII data.

 
* Tag-based policy: “DENY” all resources tagged as “PII” for group “INTERN”
* Exclude user “Julia”
* Now for PII resources you want Julia to access, you give “ALLOW” access to 
user “julia”
 

Note that an Exclude from DENY doesn’t mean the user will get the permission. 
There must be an explicit ALLOW for the excluded user/group to access the 
resource.
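
The intern example can be sketched like this. It is a toy model of the three 
steps above, not Ranger's tag-based policy engine; the evaluate function and 
its parameters are invented for illustration.

```python
from typing import AbstractSet


def evaluate(user: str,
             groups: AbstractSet[str],
             resource_tags: AbstractSet[str],
             allow_users: AbstractSet[str] = frozenset()) -> str:
    # DENY: resources tagged PII, for group INTERN, excluding user "julia".
    if "PII" in resource_tags and "INTERN" in groups and user != "julia":
        return "DENY"
    # ALLOW is only considered after the DENY check.
    if user in allow_users:
        return "ALLOW"
    # Being excluded from the DENY grants nothing by itself.
    return "NOT_SPECIFIED"
```

Without an ALLOW entry, "julia" ends up at NOT_SPECIFIED for a PII resource; 
only once she is added to allow_users does the result become ALLOW. Another 
intern stays at DENY even if an ALLOW entry exists for them.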

 

Cheers,

Lars

 

 

 

Bosco

 

 

From: Lars Francke <lars.fran...@gmail.com>
Reply-To: <user@ranger.apache.org>
Date: Thursday, January 23, 2020 at 4:49 AM
To: <user@ranger.apache.org>
Subject: Ranger policies best practices

 

Hi,

 

I'm wondering what the best practices for policies in Ranger are?

With Deny policies I'm not sure anymore.

 

The way I understand it I now need to

 

* add an ALLOW <group> policy

* add a DENY public group

* add a DENY EXCLUDE <group> policy

 

so that I can allow access for people from the <group>. That would be three 
rules for one ALLOW.

 

We can disable the HDFS fallback but it's global.

What I had assumed so far (wrongly) is that as soon as there is a policy that 
matches a resource it is authoritative, i.e. if this policy doesn't allow 
access, the request is denied rather than falling through.

 

Is there anything I misunderstood and/or what are the best practices for 
policies in Ranger these days?

 

I know this Wiki page 
(<https://cwiki.apache.org/confluence/display/RANGER/How+Deny+Policies+Work+in+Apache+Ranger>)
 but it misses exactly those corner cases.

 

I assume (from my experience with customers) that quite a few people are 
actually using Ranger wrong if my understanding is correct.

 

Thanks for your help!

 

Cheers,

Lars
