Hi Madhan,

Thanks for confirming that the other two solutions are also feasible. These are
great insights for us. :)

SMIT SHAH
SDE, Big Data
Pronouns: he/him/his
[signature_164655020]<http://www.zillow.com/>


From: Madhan Neethiraj <[email protected]>
Date: Sunday, September 6, 2020 at 2:50 PM
To: Smit Shah <[email protected]>, "[email protected]" 
<[email protected]>
Cc: "[email protected]" <[email protected]>, "[email protected]" 
<[email protected]>
Subject: Re: Help: Tag based policy for non-Atlas solution

Smit,

I understand the reasoning for leveraging the existing Ranger tag-sync and
tag-store implementation instead of going with a custom context enricher. While
this is feasible, it requires the use of internal APIs, which could change in
future releases. If you still want to provide an alternate source for tags, I
suggest extending org.apache.ranger.tagsync.model.AbstractTagSource, similar to
AtlasTagSource, and registering it with the following configuration in
ranger-tagsync-site.xml:
ranger.tagsync.source.<name-of-your-source>=true
ranger.tagsync.source.<name-of-your-source>.class=<implementation-class-name>
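
A minimal sketch of the poll-and-push shape such a custom source might take.
It is self-contained and uses stand-in types only: PollingTagSourceSketch, the
Supplier, and the Consumer are hypothetical names, not Ranger classes. A real
implementation would extend AbstractTagSource and use Ranger's own tag model,
and the exact methods to override should be checked against the Ranger release
in use.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Sketch only: stand-in types, not Ranger classes. A real source would extend
// org.apache.ranger.tagsync.model.AbstractTagSource instead.
final class PollingTagSourceSketch {
    // Hypothetical reader: returns resource name -> tag names from the external store.
    private final Supplier<Map<String, Set<String>>> externalStore;
    // Hypothetical sink: stands in for whatever uploads tags to Ranger.
    private final Consumer<Map<String, Set<String>>> sink;
    // Snapshot of what was pushed last, used to skip unchanged polls.
    private Map<String, Set<String>> lastPushed = Collections.emptyMap();

    PollingTagSourceSketch(Supplier<Map<String, Set<String>>> externalStore,
                           Consumer<Map<String, Set<String>>> sink) {
        this.externalStore = externalStore;
        this.sink = sink;
    }

    // Reads the external store once and pushes to the sink only when the
    // snapshot differs from the last push. Returns true if a push happened.
    boolean pollOnce() {
        Map<String, Set<String>> current = externalStore.get();
        if (current.equals(lastPushed)) {
            return false;
        }
        sink.accept(current);
        lastPushed = new HashMap<>(current); // shallow copy of the snapshot
        return true;
    }
}
```

Pushing only on change keeps the sink from being flooded when the external
store is polled frequently; a change-notification feed could replace polling.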

Hope this helps.

Madhan

From: Smit Shah <[email protected]>
Date: Tuesday, September 1, 2020 at 4:06 PM
To: Madhan Neethiraj <[email protected]>, "[email protected]" 
<[email protected]>
Cc: "[email protected]" <[email protected]>, "[email protected]" 
<[email protected]>
Subject: Re: Help: Tag based policy for non-Atlas solution

Hi Madhan,

Thank you for writing back with a suggestion.

I would like to get some more insight into a few options, and I have some
general questions based on your suggestion and further investigation.

Option A: The solution you suggested (it’s really helpful)
With this option we would not be leveraging the ranger-tagsync process or the
tag-related tables (ranger.x_tag*) that Ranger maintains. I can think of two
challenges we would need to tackle:

  1.  Given our high request volume, the end-point that retrieves tags for a
resource needs to be highly available, fast, and able to handle concurrent
requests.
  2.  If the end-point or our tag store is down, the lookup will fail, and we
have to decide whether to deny the resource request or let it pass through.
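
One way to soften both challenges is a small cache in front of the tag
end-point, with an explicit policy for when the store is unreachable. The
sketch below is self-contained and hypothetical: CachedTagLookup, its Outcome
values, and the backend function are illustrative names and not part of
Ranger.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Sketch only: caches tag lookups and applies a configured policy on failure.
final class CachedTagLookup {
    enum Outcome { TAGS, DENY, PASS_THROUGH }

    static final class Result {
        final Outcome outcome;
        final Set<String> tags;
        Result(Outcome outcome, Set<String> tags) {
            this.outcome = outcome;
            this.tags = tags;
        }
    }

    private static final class Entry {
        final Set<String> tags;
        final long loadedAtMillis;
        Entry(Set<String> tags, long loadedAtMillis) {
            this.tags = tags;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final Function<String, Set<String>> backend; // hypothetical call to the tag end-point
    private final long ttlMillis;                         // how long a cached entry counts as fresh
    private final Outcome onFailure;                      // DENY or PASS_THROUGH
    private final LongSupplier clock;                     // injectable so the sketch is testable
    private final Map<String, Entry> cache = new ConcurrentHashMap<>();

    CachedTagLookup(Function<String, Set<String>> backend, long ttlMillis,
                    Outcome onFailure, LongSupplier clock) {
        this.backend = backend;
        this.ttlMillis = ttlMillis;
        this.onFailure = onFailure;
        this.clock = clock;
    }

    Result lookup(String resource) {
        long now = clock.getAsLong();
        Entry entry = cache.get(resource);
        // Fresh entry: answer from memory without calling the end-point.
        if (entry != null && now - entry.loadedAtMillis < ttlMillis) {
            return new Result(Outcome.TAGS, entry.tags);
        }
        try {
            Set<String> fresh = backend.apply(resource);
            cache.put(resource, new Entry(fresh, now));
            return new Result(Outcome.TAGS, fresh);
        } catch (RuntimeException e) {
            // Store is down: prefer stale tags, otherwise apply the configured policy.
            if (entry != null) {
                return new Result(Outcome.TAGS, entry.tags);
            }
            return new Result(onFailure, Collections.<String>emptySet());
        }
    }
}
```

Serving stale tags during an outage is itself a policy decision; a stricter
deployment could skip that branch and always apply the failure outcome.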

Option B: Leveraging ranger-tagsync process
Similar to how Ranger listens to Atlas’s Kafka topic, we can create an Apache
Kafka topic for our tag store’s change notifications and let the ranger-tagsync
process listen to it. With this we can skip Option A.
Many of the property names defined in
install.properties<https://cwiki.apache.org/confluence/display/RANGER/Tag+Synchronizer+Installation+and+Configuration>
are specific to Atlas, so I am not sure whether ranger-tagsync is designed
specifically for Atlas.
Can you think of any challenges here?

Option C: Storing our tags directly in Ranger’s internal tag store
There are
end-points<https://ranger.apache.org/apidocs/index.html>
provided by Ranger that we can leverage. So, instead of implementing a context
enricher (Option A), we can store our tags in Ranger’s tag store and let
Ranger work the normal way.
Can you think of any challenges here?


General question:
Do Ranger plugins also keep a cached copy of Ranger’s internal tag store, in
addition to the policies? I am trying to see whether there are benefits to
putting our tag details inside Ranger’s tag store.


Overall, Option B seems like the better option to me, if it is possible to
implement.


SMIT SHAH
SDE, Big Data
Pronouns: he/him/his


From: Madhan Neethiraj <[email protected]>
Date: Monday, August 31, 2020 at 1:28 AM
To: Smit Shah <[email protected]>, "[email protected]" 
<[email protected]>
Cc: "[email protected]" <[email protected]>, "[email protected]" 
<[email protected]>, "[email protected]" <[email protected]>
Subject: Re: Help: Tag based policy for non-Atlas solution

Smit,

I suggest implementing a context enricher that retrieves tags from your tag
store and sets them for the resource in the request context, with a call to
RangerAccessRequestUtil.setRequestTagsInContext(context, tags). The tag
service-def<https://github.com/apache/ranger/blob/master/agents-common/src/main/resources/service-defs/ranger-servicedef-tag.json#L55>
should be updated to register this context enricher instead of the current
enricher implementation (RangerAdminTagRetriever).
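
A minimal, self-contained sketch of the enricher idea follows. It
deliberately uses stand-in types so it compiles without Ranger on the
classpath: ExternalTagStore, SketchRequest, ExternalTagEnricherSketch, and
TAGS_KEY are hypothetical names. In a real enricher the request and context
come from Ranger, the tags use Ranger's own tag objects rather than plain
strings, and the context is filled through
RangerAccessRequestUtil.setRequestTagsInContext(context, tags).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical client for the external tag store.
interface ExternalTagStore {
    Set<String> tagsFor(String resourceName);
}

// Stand-in for the part of an access request that matters here:
// the resource being accessed and a mutable request context.
final class SketchRequest {
    final String resourceName;
    final Map<String, Object> context = new HashMap<>();

    SketchRequest(String resourceName) {
        this.resourceName = resourceName;
    }
}

// Sketch only: looks up tags for the requested resource and places them in
// the request context so that later policy evaluation can read them.
final class ExternalTagEnricherSketch {
    // Illustrative key; real code would not manage the key itself but call
    // RangerAccessRequestUtil.setRequestTagsInContext(context, tags).
    static final String TAGS_KEY = "SKETCH_TAGS";

    private final ExternalTagStore store;

    ExternalTagEnricherSketch(ExternalTagStore store) {
        this.store = store;
    }

    void enrich(SketchRequest request) {
        Set<String> tags = store.tagsFor(request.resourceName);
        request.context.put(TAGS_KEY, tags);
    }
}
```

Because the lookup runs on every access request, the store client is the
natural place for caching and failure handling.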

Hope this helps.

Madhan



From: Smit Shah <[email protected]>
Date: Wednesday, August 26, 2020 at 3:59 PM
To: "[email protected]" <[email protected]>
Cc: "[email protected]" <[email protected]>, "[email protected]" 
<[email protected]>, "[email protected]" <[email protected]>
Subject: Help: Tag based policy for non-Atlas solution

cc: Team members who created the Confluence wiki pages that I have referred to

Hi Apache Ranger Dev Team,

I am Smit Shah, working at Zillow<https://www.zillow.com/corp/About.htm> as a
Data Engineer. My team is working on Data Governance around Apache Hive. We
came across Apache Ranger, and one of the key features we like is Tag Based
Policies. We are really interested in leveraging it. :)

While going through the documentation for Tag Based
Policies<https://cwiki.apache.org/confluence/display/RANGER/Tag+Based+Policies>,
I found that Tag Sync has native support for Apache Atlas. Our team already has
its own tag store, and we are trying to avoid adding another layer. So I am
checking whether there are any examples/blogs/documentation that you can share
which can help us with:
1. How to store tags
2. How to make tag based policies work in Apache Ranger for a non Apache Atlas
solution

Some web pages that I came across during my initial investigation:
1. Context
enrichers<https://cwiki.apache.org/confluence/display/RANGER/Dynamic+Policy+Hooks+in+Ranger+-+Configure+and+Use>
 – Not sure if this is important for my use-case
2. Installing Tag
Synchronizer<https://cwiki.apache.org/confluence/display/RANGER/Tag+Synchronizer+Installation+and+Configuration>
 – How to make this work for a non-Atlas solution
3. Ranger
API<https://ranger.apache.org/apidocs/index.html>
 – This might be needed for storing tags; for example, we could create a service
that takes data from our tag store and calls this end-point to store it in
Ranger in the required format.

Your help/details will be really helpful to us. Sending an email seemed like the
best way to reach out to the team. Thank you very much in advance. :)

SMIT SHAH
SDE, Big Data
Pronouns: he/him/his
