Hi All,

Ranger assumes that clients are tag unaware, so the Tag Enricher depends on a 
resource-to-tag mapping supplied externally, for example by Apache Atlas. We 
found that the delay before tags become available in Ranger can be prohibitive. 
For example, data arrives at the platform and is tagged programmatically in 
Apache Atlas. Atlas then puts the data on Kafka and Ranger picks it up. The 
client (or another) then needs to refresh its policies before the tagging info 
becomes available for evaluation. This is often too slow: Kafka introduces a 
lag and the policy refresh introduces another (tested).

If the client is tag aware and could supply this information to the plugin, 
policy evaluation could continue without waiting. I have created 
https://issues.apache.org/jira/browse/RANGER-2302 to track this. I also have 
created an initial patch. The patch allows a client to set the special 
“RangerTagEnricher.KEY_CLIENT_TAGS” value in the access request. This will 
then be picked up by the Tag Enricher. Currently, client-supplied tags 
overwrite the system-supplied tags, because the client might have more recent 
information. Most likely this will need to be checked against the “updated” 
field in the tag itself, but that wasn't readily available.
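
To make the two behaviours concrete, here is a rough, self-contained sketch. 
It does not use the Ranger classes or the code from the patch. The class 
name, the key string "CLIENT_TAGS", and the modelling of a tag as a name 
mapped to its "updated" timestamp are all assumptions made for illustration. 
The first method mirrors the current behaviour (client tags replace the system 
tags outright, which is my reading of "overwrite"), the second shows what a 
check against "updated" could look like.

```java
import java.util.HashMap;
import java.util.Map;

final class ClientTagSketch {
    // Hypothetical key string; the patch uses RangerTagEnricher.KEY_CLIENT_TAGS,
    // whose actual value is not shown here.
    static final String KEY_CLIENT_TAGS = "CLIENT_TAGS";

    // Current behaviour as described: if the request context carries client
    // tags, they replace the system tags entirely.
    // Assumption: a tag is modelled as name -> "updated" timestamp (millis).
    @SuppressWarnings("unchecked")
    static Map<String, Long> overwrite(Map<String, Long> systemTags,
                                       Map<String, Object> context) {
        Object clientTags = context.get(KEY_CLIENT_TAGS);
        if (clientTags instanceof Map) {
            return new HashMap<>((Map<String, Long>) clientTags);
        }
        // No client tags supplied: fall back to the system tags.
        return new HashMap<>(systemTags);
    }

    // Possible refinement: per tag name, keep whichever copy has the newer
    // "updated" timestamp; tags known to only one side are kept as they are.
    static Map<String, Long> newestWins(Map<String, Long> systemTags,
                                        Map<String, Long> clientTags) {
        Map<String, Long> merged = new HashMap<>(systemTags);
        clientTags.forEach((name, updated) -> merged.merge(name, updated, Long::max));
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Long> system = new HashMap<>();
        system.put("PII", 100L);
        system.put("FINANCE", 300L);

        Map<String, Long> client = new HashMap<>();
        client.put("PII", 200L); // the client has seen a newer PII tag

        Map<String, Object> context = new HashMap<>();
        context.put(KEY_CLIENT_TAGS, client);

        System.out.println("overwrite:  " + overwrite(system, context));
        System.out.println("newestWins: " + newestWins(system, client));
    }
}
```

With the overwrite variant the FINANCE tag is lost as soon as the client 
supplies anything, which is part of why I think a check against "updated" 
will be needed.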

I am looking for feedback on whether we can get this in, or whether there are 
other ways to solve this.

Cheers
Bolke

