-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/75212/
-----------------------------------------------------------

(Updated Sept. 20, 2024, 5:47 a.m.)


Review request for atlas, Jayendra Parab, Madhan Neethiraj, and Pinal Shah.


Bugs: ATLAS-4903
    https://issues.apache.org/jira/browse/ATLAS-4903


Repository: atlas


Description
-------

The hive_process_execution entity shown below references its parent 
hive_process ("guid": "a2fc8760-8906-454c-8ad8-23b1fffa7fdb", 
"typeName": "hive_process") in its relationshipAttributes:

"entity": {
            "typeName": "hive_process_execution",
            "attributes": {
                "hostName": "",
                "qualifiedName": "cm:6135:db_hive_mig_hive.db_hive_mig_hive_tbl_00@cm:1698112413000:6155:1698112413000:1698112463140",
                "name": "cm:6135:db_hive_mig_hive.db_hive_mig_hive_tbl_00@cm:1698112413000:6155:1698112413000:1698112463140",
                "queryText": "insert into db_hive_mig_hive.db_hive_mig_hive_tbl_00...2023-10-24T01:53:33.000Z",
                "startTime": 1698112413000,
                "queryPlan": "Not Supported",
                "endTime": 1698112463140,
                "userName": "hive/quasar-cpvvvn-1.quasar-cpvvvn.root.hwx.s...@qe-infra-ad.cloudera.com",
                "queryId": "",
                "owner": null,
                "displayName": null,
                "description": null,
                "userDescription": null
            },
            "guid": "b98ef015-6bd9-4343-ad85-24628aa76731",
            "isIncomplete": false,
            "provenanceType": 0,
            "status": "ACTIVE",
            "createTime": 1698112413000,
            "updateTime": 1698112413000,
            "version": 0,
            "relationshipAttributes": {
                "process": {
                    "guid": "a2fc8760-8906-454c-8ad8-23b1fffa7fdb",
                    "typeName": "hive_process"
                }
            },
            "customAttributes": {
                "__nav_engineType": "\"MR\""
            },
            "businessAttributes": {},
            "proxy": false
        }

However, the hive_process entity with "guid": 
"a2fc8760-8906-454c-8ad8-23b1fffa7fdb" does not list the above 
hive_process_execution in its "processExecutions": [] block.

The relationship edge is therefore created only when the 
hive_process_execution is processed. Before that, when the hive_process itself 
is processed, its relationship attribute is found to be empty, the edges are 
assumed to be unused, and an attempt is made to delete them.

When a large migration is restarted, the count of deleted entities keeps 
accumulating because of this issue, which slows the migration down 
considerably and makes it take longer than expected to process the data.

This patch adds a flag so that, only when Atlas is in migration mode, the code 
path that would delete these edges is skipped.
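
The effect of such a flag can be sketched roughly as follows. This is an 
illustration only, not the code in the diff: the class MigrationEdgeSketch, its 
migrationMode field and its methods are hypothetical names, and the example 
assumes an edge to the execution already exists (for instance from an earlier 
run) when the hive_process arrives with an empty "processExecutions" list.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative stand-in only; this is NOT the Atlas EntityGraphMapper.
public class MigrationEdgeSketch {
    // Hypothetical flag standing in for whatever the patch sets in migration mode.
    private final boolean migrationMode;
    // GUIDs of hive_process_execution entities that already have an edge to the process.
    private final Set<String> existingEdges = new HashSet<>();
    // GUIDs whose edges were deleted while mapping the process.
    private final List<String> deletedEdges = new ArrayList<>();

    public MigrationEdgeSketch(boolean migrationMode) {
        this.migrationMode = migrationMode;
    }

    public void addEdge(String executionGuid) {
        existingEdges.add(executionGuid);
    }

    // Maps the "processExecutions" list of an incoming hive_process.
    // Edges not named in the list look unused; they are deleted unless
    // the flag says a migration is running.
    public void mapProcess(List<String> processExecutions) {
        Set<String> unused = new HashSet<>(existingEdges);
        unused.removeAll(processExecutions);

        if (migrationMode) {
            return; // skip the deletion path entirely during migration
        }

        existingEdges.removeAll(unused);
        deletedEdges.addAll(unused);
    }

    public boolean hasEdge(String executionGuid) {
        return existingEdges.contains(executionGuid);
    }

    public int deletedCount() {
        return deletedEdges.size();
    }

    public static void main(String[] args) {
        String executionGuid = "b98ef015-6bd9-4343-ad85-24628aa76731";
        for (boolean mode : new boolean[] {false, true}) {
            MigrationEdgeSketch mapper = new MigrationEdgeSketch(mode);
            mapper.addEdge(executionGuid);
            // The hive_process arrives with an empty processExecutions list.
            mapper.mapProcess(new ArrayList<>());
            System.out.println("migrationMode=" + mode
                    + " edgeKept=" + mapper.hasEdge(executionGuid)
                    + " deleted=" + mapper.deletedCount());
        }
    }
}
```

Keeping the check next to the deletion step, rather than filtering the 
incoming entities, leaves the normal (non-migration) behaviour untouched.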


Diffs (updated)
-----

  
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java 6b395dd17 
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/MigrationImport.java f8c9218c6 
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumer.java b73988fd7 
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumerBuilder.java 7eac8df73 
  server-api/src/main/java/org/apache/atlas/RequestContext.java e144d3650 


Diff: https://reviews.apache.org/r/75212/diff/3/

Changes: https://reviews.apache.org/r/75212/diff/2-3/


Testing
-------

With the migration flag set, the deletion flow is skipped and the migration 
completes within the expected time span.


Thanks,

chaitali
