Lars,
The enhancement in RANGER-2507 introduced the notion of “DenyAllElse”, which denies access to the specified resources unless the policy explicitly allows it. This should help address your use case. Please review.

Madhan

From: Lars Francke <lars.fran...@gmail.com>
Reply-To: "user@ranger.apache.org" <user@ranger.apache.org>
Date: Thursday, January 23, 2020 at 11:43 PM
To: "user@ranger.apache.org" <user@ranger.apache.org>
Subject: Re: Ranger policies best practices

Hi Bosco and thanks for the quick response!

Ranger policy definitions have evolved over time to address more complex use cases. Can you come up with some real-world use cases? We can try to come up with policies for them.

Relatively simple:
* If we have a policy for a resource (talking about HDFS), then we want to ALLOW only based on the Ranger policy and _not_ fall back on HDFS
* If we do not have a policy for a resource, we want the fallback

At a high level, here are the key points:
* A Deny policy anywhere (tag/resource level) trumps. The exception would be conditional policies in Ranger 2.0.
* An Allow policy is needed for providing access to a resource.
* Allow policies are processed after all DENY policies are processed.

In the flow you gave, you only need an “ALLOW” policy.

* add an ALLOW <group> policy
* add a DENY public group
* add a DENY EXCLUDE <group> policy

I believe that's not correct but would be happy to be wrong myself ;-) But I think this was due to my earlier mail not being clear on what our requirements are (see above).

If we only have ALLOW, that does not mean DENY for people who have not been explicitly allowed; it means NOT_SPECIFIED (or something similar is what it's called in the code), and the HDFS ACLs are checked. So to prevent the HDFS check we need the DENY for the "public" group, but because that is checked before ALLOW we _also_ need DENY EXCLUDE.

To sum it up: We want the fallback to HDFS to be configurable not just globally but per policy, and until yesterday I always assumed this was already the case.
One example for DENY would be: Your company is hosting interns over the summer, and they will be doing some machine learning projects. The interns will need access to your dataset, but your company policy doesn’t allow them to view PII data. However, there is one intern named Julia who is an exception and may access PII data.

Tag-based policy: “DENY” all resources tagged as “PII” for group “INTERN”
Exclude user “Julia”

Now, for the PII resources you want Julia to access, you give “ALLOW” access to user “julia”.

Note: Exclude from DENY doesn’t mean the user will get the permission. There should be an explicit ALLOW for the excluded user/group to access the resource.

Cheers,
Lars

Bosco

From: Lars Francke <lars.fran...@gmail.com>
Reply-To: <user@ranger.apache.org>
Date: Thursday, January 23, 2020 at 4:49 AM
To: <user@ranger.apache.org>
Subject: Ranger policies best practices

Hi,

I'm wondering what the best practices for policies in Ranger are? With Deny policies I'm not sure anymore.

The way I understand it, I now need to
* add an ALLOW <group> policy
* add a DENY public group
* add a DENY EXCLUDE <group> policy
so that I can allow access for people from the <group>. Those would be three rules for one ALLOW. We can disable the HDFS fallback, but it's global.

What I had assumed so far (wrongly) is that as soon as there is a policy that matches a resource, it is authoritative, i.e. if this policy doesn't allow access, it will not fall through but deny.

Is there anything I misunderstood, and/or what are the best practices for policies in Ranger these days?

I know this Wiki page (<https://cwiki.apache.org/confluence/display/RANGER/How+Deny+Policies+Work+in+Apache+Ranger>), but it misses exactly those corner cases.

I assume (from my experience with customers) that quite a few people are actually using Ranger wrong if my understanding is correct.

Thanks for your help!

Cheers,
Lars
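
The evaluation order discussed in this thread can be sketched as a small toy model. This is not Ranger's policy engine or its API: the `Policy` fields, the `Decision` values, and the `evaluate` function are hypothetical names chosen for illustration, and the model assumes a single matching policy, treats user and group names as one namespace, and puts every user in the "public" group. It only tries to show why an ALLOW alone leaves the fallback to HDFS in place, why excluding someone from a DENY does not grant access, and how a per-policy "DenyAllElse"-style flag would change the outcome.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    ALLOWED = "allowed"
    DENIED = "denied"
    # No verdict from the policy; the caller may fall back to HDFS permissions.
    # (The thread refers to this as "NOT_SPECIFIED (or similar)".)
    NO_DECISION = "no decision"


@dataclass
class Policy:
    # Hypothetical field names; this is not Ranger's policy model.
    allow: set = field(default_factory=set)
    deny: set = field(default_factory=set)
    deny_exclude: set = field(default_factory=set)
    deny_all_else: bool = False  # stands in for the "DenyAllElse" idea


def evaluate(policy, user, groups):
    """Return a Decision for one user against one matching policy."""
    principals = {user} | set(groups) | {"public"}
    # DENY is processed first; an exclusion only lifts the deny.
    if principals & policy.deny and not principals & policy.deny_exclude:
        return Decision.DENIED
    # An explicit ALLOW is still required to grant access.
    if principals & policy.allow:
        return Decision.ALLOWED
    # With the flag set, anything not explicitly allowed is denied.
    if policy.deny_all_else:
        return Decision.DENIED
    return Decision.NO_DECISION


# ALLOW only: users outside the group get no decision (fallback applies).
allow_only = Policy(allow={"analysts"})
print(evaluate(allow_only, "bob", ["staff"]).name)        # NO_DECISION

# The three-rule workaround from the thread.
three_rules = Policy(allow={"analysts"}, deny={"public"},
                     deny_exclude={"analysts"})
print(evaluate(three_rules, "bob", ["staff"]).name)       # DENIED
print(evaluate(three_rules, "amy", ["analysts"]).name)    # ALLOWED

# The PII example: excluding julia from the DENY does not grant access.
pii = Policy(deny={"INTERN"}, deny_exclude={"julia"})
print(evaluate(pii, "julia", ["INTERN"]).name)            # NO_DECISION
```

In this model, a single policy with `allow={"analysts"}` and `deny_all_else=True` gives the same results as the three-rule workaround, which is the per-policy behaviour asked for at the start of the thread.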