sanket bhor created ATLAS-5312:
----------------------------------

             Summary: Handle Delete Propagation between Related entities
                 Key: ATLAS-5312
                 URL: https://issues.apache.org/jira/browse/ATLAS-5312
             Project: Atlas
          Issue Type: New Feature
            Reporter: sanket bhor
            Assignee: sanket bhor


*Problem Statement*

When a container entity is deleted in Atlas, entities linked via AGGREGATION or
ASSOCIATION relationships are NOT deleted. They become orphaned, stale metadata
that remains visible in the UI, search, lineage, and governance policies.

*Two gaps exist:*

  1. Delete Cascade (AGGREGATION): When a trino_schema is deleted (e.g., DROP
SCHEMA sales CASCADE), trino_table entities linked via AGGREGATION
(trino_table_schema) are NOT deleted. Only COMPOSITION-owned children are
cascaded today, via DeleteHandlerV1.getOwnedVertices().

  2. Delete Propagation (Cross-system aliases): When a source entity is deleted
(e.g., hive_table or hive_db), alias entities in other systems (e.g.,
trino_table via trino_table_hive_table, trino_schema via trino_schema_hive_db)
remain as stale metadata pointing to non-existent source entities.
                                                                              
*Current behavior:*

  - DeleteHandlerV1.getOwnedVertices() follows only attributes with
isOwnedRef=true, which is injected only for COMPOSITION relationships
  - For AGGREGATION/ASSOCIATION edges, only the relationship edge is removed;
the child/alias entity persists
  - Result: orphaned tables, columns, and schemas remain visible in Atlas after
the source is deleted
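
The gap described above can be reproduced in a small in-memory model. The sketch below is only an illustration: the Edge class, the Category enum, and ownedClosure() are hypothetical names and are not Atlas classes or the actual DeleteHandlerV1 code. It assumes that an owned-only traversal is equivalent to following COMPOSITION edges and nothing else.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory model for illustration; none of these are Atlas classes.
class OwnedOnlyDeleteSketch {
    enum Category { COMPOSITION, AGGREGATION, ASSOCIATION }

    // A directed edge from a container/source entity to a child/alias entity.
    static final class Edge {
        final String from;
        final String to;
        final Category category;

        Edge(String from, String to, Category category) {
            this.from = from;
            this.to = to;
            this.category = category;
        }
    }

    // Collects the root plus everything reachable through COMPOSITION edges only,
    // approximating an owned-reference traversal (an assumption of this sketch).
    static Set<String> ownedClosure(String root, List<Edge> edges) {
        Set<String> result = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            if (!result.add(current)) {
                continue; // already collected
            }
            for (Edge e : edges) {
                if (e.from.equals(current) && e.category == Category.COMPOSITION) {
                    pending.push(e.to);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Edge> edges = Arrays.asList(
            new Edge("trino_schema:sales", "trino_table:orders", Category.AGGREGATION),
            new Edge("trino_table:orders", "trino_column:id", Category.COMPOSITION));
        // The schema's deletion set does not include the table or its column.
        System.out.println(ownedClosure("trino_schema:sales", edges));
    }
}
```

In this model, deleting the schema collects only the schema itself, because the only edge leaving it is AGGREGATION. The table and its column are never reached, which corresponds to the orphaning described in the Result bullet.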

*Proposed Solution (High Level)*

Add a typedef-driven propagateDelete boolean flag on AtlasRelationshipEndDef,
mirroring the existing propagateRename pattern:

  - The flag is configured via model patches (SET_PROPAGATE_DELETE action); no
hook-side changes are required
  - At typedef resolution time, AtlasEntityType pre-computes a
deletePropagationTargets list
  - At runtime, DeleteHandlerV1 traverses propagateDelete-marked edges after
getOwnedVertices() and adds the connected entities to the deletion set
  - Multi-hop propagation is supported via recursion (e.g., hive_db →
trino_schema → trino_table → trino_column)
  - Idempotent: already-DELETED entities are skipped, and a visited set
prevents cycles
  - All propagated deletes happen within the same @GraphTransaction, so they
commit or roll back atomically
  - Both soft delete and hard delete are supported (propagation targets inherit
the parent's delete type)
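
The runtime traversal in the list above could look roughly like the following. This is a minimal sketch over a hypothetical in-memory graph, not the proposed DeleteHandlerV1 change: the Edge class, the Status enum, and collectDeletionSet() are invented for illustration. For brevity it follows owned and propagateDelete edges in a single pass rather than in two phases, and it does not model transactions or the soft/hard delete type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model for illustration; none of these are Atlas classes.
class DeletePropagationSketch {
    enum Status { ACTIVE, DELETED }

    // Directed edge: deleting "from" may lead to deleting "to".
    static final class Edge {
        final String from;
        final String to;
        final boolean owned;            // stands in for a COMPOSITION-owned reference
        final boolean propagateDelete;  // stands in for the proposed per-end flag

        Edge(String from, String to, boolean owned, boolean propagateDelete) {
            this.from = from;
            this.to = to;
            this.owned = owned;
            this.propagateDelete = propagateDelete;
        }
    }

    // Returns the entities to delete, in discovery order, starting at root.
    static List<String> collectDeletionSet(String root, List<Edge> edges,
                                           Map<String, Status> status) {
        Set<String> visited = new LinkedHashSet<>();
        collect(root, edges, status, visited);
        return new ArrayList<>(visited);
    }

    private static void collect(String id, List<Edge> edges,
                                Map<String, Status> status, Set<String> visited) {
        if (status.get(id) == Status.DELETED) {
            return; // idempotent: skip entities that are already deleted
        }
        if (!visited.add(id)) {
            return; // visited set: stops cycles and repeated work
        }
        for (Edge e : edges) {
            if (e.from.equals(id) && (e.owned || e.propagateDelete)) {
                collect(e.to, edges, status, visited); // recursion gives multi-hop
            }
        }
    }

    public static void main(String[] args) {
        List<Edge> edges = Arrays.asList(
            new Edge("hive_db", "trino_schema", false, true),
            new Edge("trino_schema", "trino_table", false, true),
            new Edge("trino_table", "trino_column", true, false));
        Map<String, Status> status = new HashMap<>(); // missing entries count as ACTIVE
        System.out.println(collectDeletionSet("hive_db", edges, status));
        System.out.println(collectDeletionSet("trino_schema", edges, status));
    }
}
```

Because the edges are directed, starting from trino_schema never reaches hive_db in this model. The visited set is what keeps the recursion finite if two relationship ends were ever both flagged.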

 

*Steps to Reproduce*

Scenario A — AGGREGATION orphan (trino_schema → trino_table):

1. Create trino_schema entity (qualifiedName=cat1.sales@inst1) with 3 
trino_table entities linked via trino_table_schema relationship
2. Send ENTITY_DELETE_V2 event for trino_schema:
{"type":"ENTITY_DELETE_V2","user":"trino","entities":[{"typeName":"trino_schema","uniqueAttributes":{"qualifiedName":"cat1.sales@inst1"}}]}
3. Observe: trino_schema is deleted, but all 3 trino_table entities and their 
trino_column entities remain in Atlas (orphaned)

Scenario B — Cross-system alias orphan (hive_table → trino_table):

1. Create hive_table entity (qualifiedName=default.orders@cluster) linked to 
trino_table (qualifiedName=cat1.schema1.orders@inst1) via 
trino_table_hive_table relationship
2. Send ENTITY_DELETE_V2 event for hive_table:
{"type":"ENTITY_DELETE_V2","user":"hive","entities":[{"typeName":"hive_table","uniqueAttributes":{"qualifiedName":"default.orders@cluster"}}]}
3. Observe: hive_table and its hive_column entities are deleted, but 
trino_table and its trino_column entities remain (stale alias)

Scenario C — Cross-system schema orphan (hive_db → trino_schema):

1. Create hive_db (qualifiedName=sales@cluster) linked to trino_schema 
(qualifiedName=cat1.sales@inst1) via trino_schema_hive_db relationship
2. Send ENTITY_DELETE_V2 event for hive_db:
{"type":"ENTITY_DELETE_V2","user":"hive","entities":[{"typeName":"hive_db","uniqueAttributes":{"qualifiedName":"sales@cluster"}}]}
3. Observe: hive_db is deleted, but trino_schema, its trino_table entities, and
their trino_column entities all remain (stale)

*Acceptance Criteria*

Functional

- [ ] Deleting a trino_schema cascades deletion to all trino_table entities 
linked via trino_table_schema (AGGREGATION) and their trino_column entities 
(COMPOSITION)
- [ ] Deleting a hive_table propagates deletion to linked trino_table (via 
trino_table_hive_table) and its trino_column entities
- [ ] Deleting a hive_db propagates deletion to linked trino_schema (via 
trino_schema_hive_db), which further cascades to all its trino_table and 
trino_column entities
- [ ] Propagation is unidirectional: deleting trino_table does NOT delete 
hive_table; deleting trino_schema does NOT delete hive_db
- [ ] Multi-hop propagation works: hive_db → trino_schema → trino_table → 
trino_column (full chain)
- [ ] Both soft delete and hard delete modes are supported (propagation targets 
inherit parent's delete type)
- [ ] Feature is opt-in via model patches — no behavior change without explicit 
SET_PROPAGATE_DELETE patch enablement
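
The opt-in and unidirectional criteria above both follow naturally if the flag lives on one end of a relationship and defaults to false. The sketch below shows one possible reading of how a deletePropagationTargets map could be derived from relationship definitions. EndDef, RelationshipDef, and computeTargets() are hypothetical and are not the Atlas typedef classes. Two points are assumptions of this sketch rather than statements from the issue: that a flag on an end means "deleting an entity at this end also deletes the entity at the other end", and the choice of which type sits at which end.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for relationship typedefs; not the Atlas model classes.
class PropagationTargetsSketch {
    static final class EndDef {
        final String typeName;
        final boolean propagateDelete; // assumed to default to false (opt-in)

        EndDef(String typeName, boolean propagateDelete) {
            this.typeName = typeName;
            this.propagateDelete = propagateDelete;
        }
    }

    static final class RelationshipDef {
        final String name;
        final EndDef end1;
        final EndDef end2;

        RelationshipDef(String name, EndDef end1, EndDef end2) {
            this.name = name;
            this.end1 = end1;
            this.end2 = end2;
        }
    }

    // Builds: source type name -> type names deleted along with it.
    static Map<String, List<String>> computeTargets(List<RelationshipDef> defs) {
        Map<String, List<String>> targets = new LinkedHashMap<>();
        for (RelationshipDef def : defs) {
            addIfFlagged(targets, def.end1, def.end2);
            addIfFlagged(targets, def.end2, def.end1);
        }
        return targets;
    }

    // Assumed semantics: a flagged end is the source; the opposite end is the target.
    private static void addIfFlagged(Map<String, List<String>> targets,
                                     EndDef source, EndDef target) {
        if (source.propagateDelete) {
            targets.computeIfAbsent(source.typeName, k -> new ArrayList<>())
                   .add(target.typeName);
        }
    }

    public static void main(String[] args) {
        List<RelationshipDef> defs = Arrays.asList(
            new RelationshipDef("trino_table_hive_table",
                new EndDef("hive_table", true), new EndDef("trino_table", false)),
            new RelationshipDef("trino_schema_hive_db",
                new EndDef("hive_db", true), new EndDef("trino_schema", false)));
        System.out.println(computeTargets(defs));
    }
}
```

With only the Hive-side ends flagged, trino_table and trino_schema have no entry in the map, so deleting them would not reach hive_table or hive_db. With no ends flagged, the map is empty and delete behavior is unchanged.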



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
