Hi Bolke, My comments inline.
Thanks,
-Abhay

On 12/4/18, 1:07 PM, "Bolke de Bruin" <bdbr...@gmail.com> wrote:

>Hi Abhay,
>
>Good point on #1; I will take that into account if possible (can an enricher
>call audit events?).
>
>On #2: yes, otherwise the resource matcher will stop working. Maybe proper
>namespacing is the way to go here. Implementing it this way ensures
>backwards compatibility. On a broader note, I think Ranger is lacking
>here. Context could also be provided by the client, and there is no real
>clean way of doing this at the moment.

Abhay> I will need to take a look to figure out why the resource matcher will not work. However, instead of implementing a new API (removeValue()), is it possible to use the setValue() API to set the KEY_CLIENT_TAG entry to null?

>
>Question: should client tags only apply to SELF, or also to
>SELF_OR_DESCENDENT and ANCESTOR? I wasn't sure here.

Abhay> I don't see any issue, at this time, with applying client tags when the match-type is SELF, SELF_OR_DESCENDENT or ANCESTOR.

>
>Second question (a bit unrelated): how scalable is the tagsync approach?
>If we have millions of tagged files and sources, and they all end up being
>registered in Ranger, this could easily grow exponentially, besides
>getting outdated. The other approach could be to have this handled in the
>client (pick up info from the TagSource, i.e. Atlas, and supply this to the
>policy engine).

Abhay> I see that there is some lag involved. But, overall, the architecture allows tag-based policies (really the ABAC way of authorization) to be applied uniformly across all components. Having ranger-admin as a central repository of policies and tags, with components acting simply as clients that download these artifacts, has many more advantages than each component having to do all the work by itself. Also, any Kafka delay would still be an issue even if components received tags directly from Atlas without ranger-admin mediating the tag transfer.
Moreover, there are several optimizations possible (such as incremental download of tags, which is not implemented yet) that can speed up tag downloads significantly. With a large number of tags, the size of the ranger-admin tag tables will surely increase, but IMO it is a fair trade-off considering all the other advantages this architecture provides. Also, it would be useful to know the order of magnitude of the delay you experienced (other than the possible delay of up to 1 minute caused by the interval between tag downloads).

>
>Cheers
>Bolke
>
>
>Sent from my iPad
>
>> On 4 Dec 2018, at 21:51, Abhay Kulkarni
>> <akulka...@hortonworks.com> wrote:
>>
>> Hi Bolke,
>>
>> This looks like a good addition to tag-based authorization in Ranger. I
>> will review the patch separately. However, here are a few thoughts.
>>
>> 1. If the client component is tag-aware and client-supplied tags overwrite
>> admin-supplied tags, audit needs to record this very clearly. This will
>> avoid any potential confusion about why the authorization decision was
>> different only for a certain component (or a certain type of component).
>>
>> 2. Do the client-supplied tags have to be removed from the access-request?
>>
>> Thanks,
>> -Abhay
>>
>>> On 12/4/18, 6:02 AM, "Bolke de Bruin" <bdbr...@gmail.com> wrote:
>>>
>>> Hi All,
>>>
>>> Ranger assumes that clients are tag unaware. So the Tag Enricher is
>>> dependent on a resource-to-tag mapping supplied externally by, for example,
>>> Apache Atlas. We found out that having tags available in Ranger can have
>>> a prohibitive delay. For example, data arrives at the platform and is
>>> being tagged programmatically in Apache Atlas. Atlas then puts the data on
>>> Kafka and Ranger picks it up. The client (or another) needs to refresh
>>> its policies before the tagging info becomes available for evaluation.
>>> Typically, this can be too slow. Kafka introduces a lag and the policy
>>> refresh also introduces a lag (tested).
>>>
>>> If the client is tag aware and could supply this information to the
>>> plugin, policy evaluation could continue. I have created
>>> https://issues.apache.org/jira/browse/RANGER-2302 to track this. I also
>>> have created an initial patch. The patch allows a client to set the
>>> special "RangerTagEnricher.KEY_CLIENT_TAGS" as a value in the access
>>> request. This will then be picked up by the Tag Enricher. Currently,
>>> client-supplied tags overwrite the system-supplied tags. The reason for
>>> this is that the client might have more recent information. Most likely
>>> this will need to be checked against the "updated" field in the tag
>>> itself, but that wasn't readily available.
>>>
>>> I am looking for feedback to see if we can have this in. Or are there
>>> other ways to solve this?
>>>
>>> Cheers
>>> Bolke
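
For illustration only, the flow discussed in this thread could be sketched roughly as below: the enricher picks up client-supplied tags from the request context, lets a client tag win only when its update time is newer than the admin-supplied one, reports whether any client tag was used (so the caller can audit it), and then clears the context entry by setting it to null rather than through a separate remove call. Everything here is an assumption made for the sketch: the class and method names, the key's string value, and the representation of tags as a map from tag name to update time. It is not the RANGER-2302 patch and does not use Ranger's actual classes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the enricher logic; not a Ranger class.
final class ClientTagSketch {
    // Assumed key value, modeled on the KEY_CLIENT_TAGS constant named in the thread.
    static final String KEY_CLIENT_TAGS = "CLIENT_TAGS";

    /**
     * Merges client-supplied tags (tag name -> last-updated epoch millis) found in
     * the request context into effectiveTags, which starts out holding the
     * admin-supplied tags. A client tag replaces an admin tag only if it is newer.
     * Returns true if at least one client tag was applied, so the caller can
     * record that fact in the audit log.
     */
    @SuppressWarnings("unchecked")
    static boolean mergeClientTags(Map<String, Object> context,
                                   Map<String, Long> effectiveTags) {
        boolean clientTagUsed = false;
        Object supplied = context.get(KEY_CLIENT_TAGS);
        if (supplied instanceof Map) {
            Map<String, Long> clientTags = (Map<String, Long>) supplied;
            for (Map.Entry<String, Long> entry : clientTags.entrySet()) {
                Long adminUpdated = effectiveTags.get(entry.getKey());
                if (adminUpdated == null || entry.getValue() > adminUpdated) {
                    effectiveTags.put(entry.getKey(), entry.getValue());
                    clientTagUsed = true;
                }
            }
        }
        // Clear the entry by setting it to null instead of adding a remove API.
        context.put(KEY_CLIENT_TAGS, null);
        return clientTagUsed;
    }

    /** Convenience factory for a mutable context that accepts null values. */
    static Map<String, Object> newContext() {
        return new HashMap<>();
    }
}
```

A caller would put a map of client tags into the context under KEY_CLIENT_TAGS, call mergeClientTags with the admin-supplied tags, and write an audit note whenever the return value is true. Whether comparing update times is the right precedence rule is exactly the open question raised above, so treat that comparison as one possible choice.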