Re: Allow clients to supply tag information

Bolke de Bruin Wed, 05 Dec 2018 11:59:35 -0800

Hi Abhay,

My answers are also inline.


B.

Sent from my iPad

> On 5 Dec 2018, at 20:25, Abhay Kulkarni <akulka...@hortonworks.com> wrote:
> 
> Hi Bolke,
> 
> My comments inline.
> 
> Thanks,
> -Abhay
> 
>> On 12/4/18, 1:07 PM, "Bolke de Bruin" <bdbr...@gmail.com> wrote:
>> 
>> Hi Abhay,
>> 
>> Good point on #1; I will take that into account if possible (can an
>> enricher call audit events?).
>> 
>> On #2 yes, otherwise the resource matcher will stop working. Maybe proper
>> namespacing is the way to go here. Implementing it this way ensures
>> backwards compatibility. On a broader thought, I think Ranger is lacking
>> here. Context could also be provided by the client and there is no real
>> clean way of doing this at the moment.
> 
> Abhay> I will need to take a look to figure out why resource matcher will
> not work. However, instead of implementing a new API (removeValue()), is
> it possible to use setValue() API to set KEY_CLIENT_TAG entry to null?

I don’t think that is possible. The resource matcher checks which elements are 
present, and an entry set to null still counts as present, so the signature 
still doesn’t match.
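
A minimal sketch of the distinction between a null entry and a removed entry. 
It assumes the request context behaves like a plain java.util.HashMap, which 
has not been checked against the actual resource matcher code. The class 
name, the helper methods and the key string are hypothetical and exist only 
for this example.

```java
import java.util.HashMap;
import java.util.Map;

class ContextKeyDemo {
    // Hypothetical key string, standing in for the constant discussed in this thread.
    static final String KEY_CLIENT_TAGS = "CLIENT_TAGS";

    // True when the key is still an element of the context map,
    // regardless of whether its value is null.
    static boolean isPresent(Map<String, Object> context) {
        return context.containsKey(KEY_CLIENT_TAGS);
    }

    // Builds a context that already carries a client-supplied tag.
    static Map<String, Object> contextWithTags() {
        Map<String, Object> context = new HashMap<>();
        context.put(KEY_CLIENT_TAGS, "PII");
        return context;
    }

    public static void main(String[] args) {
        Map<String, Object> nulled = contextWithTags();
        nulled.put(KEY_CLIENT_TAGS, null);   // analogue of setValue(key, null)

        Map<String, Object> removed = contextWithTags();
        removed.remove(KEY_CLIENT_TAGS);     // analogue of the proposed removeValue(key)

        System.out.println("after put(null): present=" + isPresent(nulled));
        System.out.println("after remove():  present=" + isPresent(removed));
    }
}
```

Under that assumption the first line reports present=true and the second 
present=false, which is why a separate remove operation seems to be needed.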

>> 
>> Question: should client tags apply only to SELF, or also to
>> SELF_OR_DESCENDENT and ANCESTOR? I wasn’t sure here.
> 
> Abhay> I don’t see any issue, at this time, to apply client-tags when
> match-type is SELF, SELF_OR_DESCENDENT or ANCESTOR.

This means a client tag will match against all of them at any time. The client 
isn’t aware of match-types. Correct?

>> 
>> Second question (a bit unrelated): how scalable is the tagsync approach?
>> If we have millions of tagged files and sources, they all end up being
>> registered in Ranger, which could easily grow exponentially, besides
>> getting outdated. The other approach could be to have this handled in the
>> client (pick up info from the TagSource, i.e. Atlas, and supply this to
>> the policy engine).
> 
> Abhay> I see that there is some lag involved. But, overall, the
> architecture allows for tag-based policies (really ABAC way of
> authorization) to be applied across all components uniformly. Having
> ranger-admin as a central repository of policies and tags, and components
> as simply clients downloading these artifacts has many more advantages
> than each component having to do all the work by itself. Also, any Kafka
> delay would still be an issue if components received tags directly
> from Atlas without ranger-admin mediating tag transfer. Moreover, there
> are several optimizations possible (such as incremental download of tags -
> not implemented yet) which can speed up tag downloads significantly. With
> a large number of tags, surely, the size of ranger-admin tag tables will
> increase, but IMO, it is a fair trade-off considering all other advantages
> this architecture provides us. Also, it will be useful to know the order
> of magnitude of delay you experienced (other than possibly up to 1 minute
> delay because of the interval between tag downloads).

The one minute is already too much for us. The example I gave happens within a 
few milliseconds so basically any delay is not acceptable.

To me it seems architecturally incorrect for Ranger to be a source of tags, 
as that is the role of Atlas (or some other tag source). Ranger is duplicating 
things here rather than sticking to what it is good at: policies. Clients are 
already downloading tags; doing that from Atlas instead of Ranger does not add 
a lot of complexity and can be handled in the plugin transparently. But that 
is just my opinion. 

Maybe there is a possibility to accept client tags as temporary in Ranger, to 
be overwritten later by the Tag Store (i.e. Atlas). Just thinking out loud.
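
A rough sketch of how such a merge could look. Everything here is 
hypothetical: the Tag holder, the merge method and the timestamp comparison 
are not Ranger APIs. The timestamp check is an assumption borrowed from the 
"updated" field mentioned earlier in the thread.

```java
import java.util.HashMap;
import java.util.Map;

class TagMergeSketch {
    // Hypothetical tag holder; not a Ranger class.
    static final class Tag {
        final String name;
        final long updatedMillis;

        Tag(String name, long updatedMillis) {
            this.name = name;
            this.updatedMillis = updatedMillis;
        }
    }

    // Starts from the client-supplied (temporary) tags, then lets the tag
    // store overwrite every tag it knows about, unless the client copy is
    // strictly newer than the store copy.
    static Map<String, Tag> merge(Map<String, Tag> clientTags, Map<String, Tag> storeTags) {
        Map<String, Tag> result = new HashMap<>(clientTags);
        for (Tag fromStore : storeTags.values()) {
            Tag fromClient = result.get(fromStore.name);
            if (fromClient == null || fromStore.updatedMillis >= fromClient.updatedMillis) {
                result.put(fromStore.name, fromStore);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Tag> client = new HashMap<>();
        client.put("PII", new Tag("PII", 200L));

        Map<String, Tag> store = new HashMap<>();
        store.put("PII", new Tag("PII", 100L)); // store copy is older than the client copy

        Tag winner = merge(client, store).get("PII");
        System.out.println("PII updated at " + winner.updatedMillis);
    }
}
```

With this rule a client tag is used only until the store delivers a copy that 
is at least as recent, so the window in which client and store disagree stays 
limited to the sync delay.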

>> 
>> Cheers
>> Bolke
>> 
>> 
>> Sent from my iPad
>> 
>>> On 4 Dec 2018, at 21:51, Abhay Kulkarni
>>> <akulka...@hortonworks.com> wrote:
>>> 
>>> Hi Bolke, 
>>> 
>>> This looks like a good addition to tag-based authorization in Ranger. I
>>> will review the patch separately. However, here are a few thoughts.
>>> 
>>> 1. If the client component is tag-aware and client-supplied tags
>>> overwrite admin-supplied tags, audit needs to record this very clearly.
>>> This will avoid any potential confusion about why the authorization
>>> decision was different only for a certain component (or a certain type
>>> of component).
>>> 
>>> 2. Do the client-supplied tags have to be removed from the access-request?
>>> 
>>> Thanks,
>>> -Abhay
>>> 
>>>> On 12/4/18, 6:02 AM, "Bolke de Bruin" <bdbr...@gmail.com> wrote:
>>>> 
>>>> Hi All,
>>>> 
>>>> Ranger assumes that clients are tag unaware. So the Tag Enricher is
>>>> dependent on a resource-to-tag mapping supplied externally, for example
>>>> by Apache Atlas. We found out that having tags available in Ranger can
>>>> have a prohibitive delay. For example, data arrives at the platform and
>>>> is being tagged programmatically in Apache Atlas. Atlas then puts the
>>>> data on Kafka and Ranger picks it up. The client (or another) needs to
>>>> refresh its policies before the tagging info becomes available for
>>>> evaluation. Typically, this can be too slow. Kafka introduces a lag and
>>>> the policy refresh also introduces a lag (tested).
>>>> 
>>>> If the client is tag aware and could supply this information to the
>>>> plugin, policy evaluation could continue. I have created
>>>> https://issues.apache.org/jira/browse/RANGER-2302 to track this. I have
>>>> also created an initial patch. The patch allows a client to set the
>>>> special "RangerTagEnricher.KEY_CLIENT_TAGS" as a value in the access
>>>> request. This will then be picked up by the Tag Enricher. Currently,
>>>> client-supplied tags overwrite the system-supplied tags. The reason for
>>>> this is that the client might have more recent information. Most likely
>>>> this will need to be checked against the "updated" field in the tag
>>>> itself, but that wasn't readily available.
>>>> 
>>>> I am looking for feedback to see if we can have this in. Or are there
>>>> other ways to solve this?
>>>> 
>>>> Cheers
>>>> Bolke
>>>> 
>>>> 
>>> 
>> 
>
