[ https://issues.apache.org/jira/browse/ATLAS-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16007924#comment-16007924 ]
David Radley commented on ATLAS-1690:
-------------------------------------

Hi [~madhan.neethiraj] and [~cmgrote],

This is interesting. It seems to me that if we are going to embrace using tags in context specific ways, and we feel tag propagation decisions should not be made by the relationship author, then we do not want to hard bake tag propagation into the Atlas store. I am hearing that you feel there are use cases where the relationship author or updater would be responsible for tag propagation. Can I just check how you are thinking about the tag propagation implementation in your scenarios:

1) when a table/column is classified as PII, any lineage from this table/view/column should also automatically be classified as PII.
This means we need a relationship from an attribute defined in one entity to an attribute defined in the other entity. The current proposal does not allow this, because the relationship specifies attribute names for each end that are not defined in the entity. The current relationship proposal adds a new attribute of type 'the other end'. I think we would need a new tag propagating relationship, or more generally a mapping.

2) when a term is classified as PII, all entities that are associated with the term also should automatically be classified as PII.
So I am thinking that an asset / entity will have an assignedTerms attribute. The GlossaryTerm would have an assignedEntities attribute. These attributes would be added by virtue of the relationship. If the GlossaryTerm was tagged PII, we would need special Atlas logic so that the tag propagates from assignedEntities to the entities' assignedTerms, and then further special logic so that the assignedTerms tag is picked up by the entity itself.

3) when a term is classified as PII, all terms that are synonyms of this term (and all the entities associated with the synonym terms) also should automatically be classified the same.
We would need special Atlas logic for the synonym case that propagates the PII tag from the synonym relationship to the terms themselves.

In summary I see two main tag propagation scenarios, and propose a way forward for each:
- mapping between existing entity attributes, which could be used as the basis of propagation. I suggest we introduce the top level concept of a mapping separately from this relationships Jira.
- the need for additional logic in Atlas to propagate tags across glossary term to term or term to asset relationships. I can see this is useful, so I suggest that I add a tag_propagation_hint enum, including NONE. The actual propagation will depend on specific relationship types. When we implement the glossary, we will use this hint to propagate the tags in glossary specific Atlas logic. Further enhancements can occur as we bring in logic around the classification level use case and enhanced Ranger integration; Ranger will need to be able to override the hinted classification. Outside of the glossary case, tags will not be propagated, even if the tag_propagation_hint is set on a relationship.

Given this thinking, it makes sense for the tag_propagation_hint to be only on the Glossary Relationship types and not on the top level relationships.

Madhan and [~mandy_chessell], does this make sense?
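
To make the hint idea above concrete, here is a rough sketch in Python. It is not Atlas code and not part of the proposal: only the NONE value is named in this comment, so the other enum values, the RelationshipDef class, the is_glossary_type flag and the relationship names are all hypothetical and used purely for illustration. It shows the one rule stated above: the hint is only honoured on glossary relationship types, and is ignored everywhere else.

```python
from dataclasses import dataclass
from enum import Enum


class TagPropagationHint(Enum):
    # Only NONE is named in the comment; the other values are assumptions.
    NONE = "NONE"
    ONE_TO_TWO = "ONE_TO_TWO"   # hypothetical: end 1 -> end 2
    TWO_TO_ONE = "TWO_TO_ONE"   # hypothetical: end 2 -> end 1
    BOTH = "BOTH"               # hypothetical: both directions


@dataclass(frozen=True)
class RelationshipDef:
    # Hypothetical stand-in for a relationship type definition.
    name: str
    is_glossary_type: bool
    hint: TagPropagationHint = TagPropagationHint.NONE


def propagation_directions(rel: RelationshipDef) -> set:
    """Return the directions in which a tag may flow across rel.

    "1->2" means from end 1 to end 2. Non-glossary relationship
    types never propagate, whatever hint they carry.
    """
    if not rel.is_glossary_type:
        return set()
    if rel.hint is TagPropagationHint.ONE_TO_TWO:
        return {"1->2"}
    if rel.hint is TagPropagationHint.TWO_TO_ONE:
        return {"2->1"}
    if rel.hint is TagPropagationHint.BOTH:
        return {"1->2", "2->1"}
    return set()


# Scenario 2: term (end 1) to assigned entity (end 2).
term_assignment = RelationshipDef("TermAssignment", True, TagPropagationHint.ONE_TO_TWO)
# Scenario 3: synonym terms propagate in both directions.
synonym = RelationshipDef("Synonym", True, TagPropagationHint.BOTH)
# A non-glossary relationship with a hint set: the hint is ignored.
lineage = RelationshipDef("Lineage", False, TagPropagationHint.BOTH)

for rel in (term_assignment, synonym, lineage):
    print(rel.name, sorted(propagation_directions(rel)))
```

A Ranger override, as mentioned above, would sit on top of this as a separate step and is not modelled here.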

> Introduce top level relationships
> ---------------------------------
>
>                 Key: ATLAS-1690
>                 URL: https://issues.apache.org/jira/browse/ATLAS-1690
>             Project: Atlas
>          Issue Type: Improvement
>            Reporter: David Radley
>            Assignee: David Radley
>              Labels: VirtualDataConnector
>         Attachments: Atlas_RelationDef_Json_Structure_v1.pdf,
>                      Atlas Relationships proposal v1.0.pdf,
>                      Atlas Relationships proposal v1.1.pdf,
>                      Atlas Relationships proposal v1.2.pdf,
>                      Atlas Relationships proposal v1.3.pdf,
>                      Atlas Relationships proposal v1.4.pdf,
>                      Atlas Relationships proposal v1.5.pdf,
>                      Atlas Relationships proposal v1.6.pdf,
>                      Atlas Relationships proposal v1.7.pdf
>
>
> Introduce top level relationships including support for
> - many to many relationships
> - relationship names including the name for both ends and the relationship.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)