[jira] [Comment Edited] (ATLAS-3755) Allow system attributes to be updated when policy allows

2020-05-03 Thread Bolke de Bruin (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-3755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17098286#comment-17098286
 ] 

Bolke de Bruin edited comment on ATLAS-3755 at 5/3/20, 8:20 AM:


[~madhan] I have made a couple of updates:

1. I removed the defaults in AtlasEntity to ensure that no defaults are set 
during JSON deserialisation. This change is very limited in scope, as the 
deserialiser uses the default constructor and the default constructor isn't 
used anywhere else. All other constructors call "init()", which does set the 
default values.
2. Updating system attributes is now behind a feature flag, 
"atlas.store.system_attribute.enable". It defaults to false, which disables 
updating/creation of system attributes unless in "import" mode.
3. Access requests are done in one go, with virtually no impact on processing 
time.
4. Unit tests have been updated to verify the feature flags.
5. Removed the side effect in preCreateOrUpdate that was setting system 
attributes; all of this is now contained in createOrUpdate.
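
The gating described in item 2 can be sketched roughly as follows. This is an illustration in Python, not the Java from the patch: only the flag name "atlas.store.system_attribute.enable" comes from the comment above. The names may_write_system_attributes, import_in_progress and policy_allows are hypothetical, and the assumption that policy decides once the flag is on follows the issue description rather than the patch itself.

```python
FLAG = "atlas.store.system_attribute.enable"  # flag name from the comment


def may_write_system_attributes(config, import_in_progress, policy_allows):
    """Decide whether system attributes may be created or updated.

    config: dict of configuration properties (hypothetical stand-in).
    import_in_progress: True when running in "import" mode.
    policy_allows: result of the authorization check (assumed precomputed).
    """
    if import_in_progress:
        # Import mode was already allowed to write system attributes.
        return True
    # The flag defaults to false, which keeps the previous behaviour.
    enabled = str(config.get(FLAG, "false")).lower() == "true"
    return enabled and policy_allows
```

With the flag left at its default, only import mode can write system attributes; with the flag on, the policy decision is what counts.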

*Use case*
We intend to receive about 0.5 million entity updates per day from other 
metadata systems, in addition to the ~2 million events per day we already 
receive. We use Kafka to handle the backpressure, hence the need to have this 
managed by policy. Glossary (which relies on the entity system) also needs to 
expose homeId and other system attributes, as Glossary items can be created by 
other metadata systems. This needs to be enabled in a follow-up PR.


> Allow system attributes to be updated when policy allows
> 
>
> Key: ATLAS-3755
> URL: https://issues.apache.org/jira/browse/ATLAS-3755
> Project: Atlas
> Issue Type: Improvement
> Components: atlas-core
> Affects Versions: 2.0.0, 2.1.0
> Reporter: Bolke de Bruin
> Assignee: Bolke de Bruin
> Priority: Critical
> Attachments: 
> 0001-ATLAS-3755-Allow-system-attributes-to-be-updated-by-.patch, 
> 0001-ATLAS-3755-Allow-system-attributes-to-be-updated-by-.patch, 
> 0001-ATLAS-3755-Allow-system-attributes-to-be-updated-by-.patch, 
> feature.patch, feature.patch
>
>
> Atlas does not operate in an isolated environment; this is one of the reasons 
> the "homeId" system attribute was introduced. Unfortunately, system attributes 
> can only be updated when importing. This means any integration with other 
> services is significantly limited (Kafka, REST API will not work). (See also 
> ATLAS-3754)
> To resolve this I propose to make it possible to update the system attributes 
> when policy allows it. This introduces new 
> AtlasPrivilege.ENTITY_UPDATE_SYSTEM_ATTRIBUTE and 
> AtlasPrivilege.ENTITY_CREATE_SYSTEM_ATTRIBUTE next to 
> AtlasPrivilege.ENTITY_UPDATE_ATTRIBUTE and 
> AtlasPrivilege.ENTITY_CREATE_ATTRIBUTE rather than just checking on the 
> entity level. In certain places we will then drop the requirement for an 
> import to be active as this can now happen through other channels as well.
> This allows operators to specify policies that allow granular controls over 
> attributes and system attributes.
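
As a rough illustration of the proposal quoted above, the privilege to check could be chosen per attribute as sketched below. The four privilege names are taken from the description. The Python enum, the SYSTEM_ATTRIBUTES set (only "homeId" is named in this thread) and required_privilege are hypothetical and not part of Atlas.

```python
from enum import Enum, auto


class AtlasPrivilege(Enum):
    # Privilege names as given in the issue description.
    ENTITY_CREATE_ATTRIBUTE = auto()
    ENTITY_UPDATE_ATTRIBUTE = auto()
    ENTITY_CREATE_SYSTEM_ATTRIBUTE = auto()
    ENTITY_UPDATE_SYSTEM_ATTRIBUTE = auto()


# Illustrative only: "homeId" is the one system attribute named in the thread.
SYSTEM_ATTRIBUTES = {"homeId"}


def required_privilege(attribute, is_create):
    """Pick the privilege a policy must grant for writing this attribute."""
    if attribute in SYSTEM_ATTRIBUTES:
        return (AtlasPrivilege.ENTITY_CREATE_SYSTEM_ATTRIBUTE if is_create
                else AtlasPrivilege.ENTITY_UPDATE_SYSTEM_ATTRIBUTE)
    return (AtlasPrivilege.ENTITY_CREATE_ATTRIBUTE if is_create
            else AtlasPrivilege.ENTITY_UPDATE_ATTRIBUTE)
```

Keeping create and update as separate privileges lets an operator allow one without the other.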



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (ATLAS-3755) Allow system attributes to be updated when policy allows

2020-04-29 Thread Bolke de Bruin (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-3755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095795#comment-17095795
 ] 

Bolke de Bruin edited comment on ATLAS-3755 at 4/29/20, 7:09 PM:

[~madhan] I don't think that would work for the KafkaConsumer, or does it? 
Please note that Glossary also needs to be updated to contain all system 
attributes, as it is missing at least "homeId". In addition, I don't think the 
risk of system attributes being updated inadvertently is that high. The default 
authorization model denies access to these attributes. On top of that, the 
incoming message would have to include those system properties. In that case 
you could argue that the message should not be consumed at all if this is not 
allowed, as it comes from a badly behaving client. We would like to integrate 
Atlas with another metadata system, which would make this far more frequent: 
around 50% of the updates. Given that system attributes are part of the vertex, 
there is not much additional cost, as far as I can see, in doing this during 
entity updates.

On your second point: I'm not strongly attached to them, so I can merge them. I 
do think there might be cases where you would like to allow an update but 
disallow a create.

Authorization per attribute allows end users to edit a particular attribute 
(say, description) without being allowed to edit all properties of the entity. 
This is actually a very common use case, as you would like users to be able to 
enrich the metadata without adjusting core attributes or system-generated 
attributes. I understand your point about performance. What I could do is 
submit an ArrayList and create a RangerCollectionResourceMatcher that requires 
all items in the submitted array to match (as opposed to 
RangerDefaultResourceMatcher); that should resolve the issue of CPU cycles and 
audit logs.
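
The difference between the two matching modes can be sketched as below. This is plain Python, not Ranger code: matches_single and matches_all are hypothetical helpers, the policy is reduced to a set of allowed attribute names, and RangerCollectionResourceMatcher itself is only a proposal in this thread.

```python
def matches_single(allowed, attribute):
    """One access request per attribute: true if this attribute is allowed."""
    return attribute in allowed


def matches_all(allowed, attributes):
    """One access request for the whole list: every item must be allowed."""
    return all(attr in allowed for attr in attributes)


allowed = {"description", "homeId"}  # illustrative policy values

# Per-attribute checks: one evaluation for each submitted attribute.
per_attribute = [matches_single(allowed, a) for a in ["description", "homeId"]]

# Collection check: a single evaluation covering all submitted attributes.
combined = matches_all(allowed, ["description", "homeId"])
denied = matches_all(allowed, ["description", "owner"])
```

A single collection check is what would keep the number of evaluations, and presumably of audit entries, independent of how many attributes an update touches.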

What do you think?




[jira] [Comment Edited] (ATLAS-3755) Allow system attributes to be updated when policy allows

2020-04-29 Thread Bolke de Bruin (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-3755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095795#comment-17095795
 ] 

Bolke de Bruin edited comment on ATLAS-3755 at 4/29/20, 6:51 PM:

[~madhan] I don't think that would work for the KafkaConsumer, or does it? 
Please note that Glossary also needs to be updated to contain all system 
attributes, as it is missing at least "homeId". In addition, I don't think the 
risk of system attributes being updated inadvertently is that high. The default 
authorization model denies access to these attributes. On top of that, the 
incoming message would have to include those system properties. In that case 
you could argue that the message should not be consumed at all if this is not 
allowed, as it comes from a badly behaving client. We would like to integrate 
Atlas with another metadata system, which would make this far more frequent: 
around 50% of the updates. Given that system attributes are part of the vertex, 
there is not much additional cost, as far as I can see, in doing this during 
entity updates.

On your second point: I'm not strongly attached to them, so I can merge them. I 
do think there might be cases where you would like to allow an update but 
disallow a create.

Authorization per attribute allows end users to edit a particular attribute 
(say, description) without being allowed to edit all properties of the entity. 
This is actually a very common use case, as you would like users to be able to 
enrich the metadata without adjusting core attributes or system-generated 
attributes. I understand your point about performance. What I could do is 
submit an array in string format (e.g. "attribute1;attribute2;attribute4") and 
create a RangerArrayResourceMatcher that allows matching on 1 item, which 
should resolve the issue of CPU cycles and audit logs.

What do you think?


