-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/75212/
-----------------------------------------------------------
(Updated Sept. 20, 2024, 5:47 a.m.)
Review request for atlas, Jayendra Parab, Madhan Neethiraj, and Pinal Shah.
Bugs: ATLAS-4903
https://issues.apache.org/jira/browse/ATLAS-4903
Repository: atlas
Description
-------
In the hive_process_execution entity shown below, the relationship attribute
"process" refers to the entity with "guid":
"a2fc8760-8906-454c-8ad8-23b1fffa7fdb", "typeName": "hive_process":
"entity": {
  "typeName": "hive_process_execution",
  "attributes": {
    "hostName": "",
    "qualifiedName": "cm:6135:db_hive_mig_hive.db_hive_mig_hive_tbl_00@cm:1698112413000:6155:1698112413000:1698112463140",
    "name": "cm:6135:db_hive_mig_hive.db_hive_mig_hive_tbl_00@cm:1698112413000:6155:1698112413000:1698112463140",
    "queryText": "insert into db_hive_mig_hive.db_hive_mig_hive_tbl_00...2023-10-24T01:53:33.000Z",
    "startTime": 1698112413000,
    "queryPlan": "Not Supported",
    "endTime": 1698112463140,
    "userName": "mailto:hive/quasar-cpvvvn-1.quasar-cpvvvn.root.hwx.s...@qe-infra-ad.cloudera.com",
    "queryId": "",
    "owner": null,
    "displayName": null,
    "description": null,
    "userDescription": null
  },
  "guid": "b98ef015-6bd9-4343-ad85-24628aa76731",
  "isIncomplete": false,
  "provenanceType": 0,
  "status": "ACTIVE",
  "createTime": 1698112413000,
  "updateTime": 1698112413000,
  "version": 0,
  "relationshipAttributes": {
    "process": {
      "guid": "a2fc8760-8906-454c-8ad8-23b1fffa7fdb",
      "typeName": "hive_process"
    }
  },
  "customAttributes": {
    "__nav_engineType": "\"MR\""
  },
  "businessAttributes": {},
  "proxy": false
}
However, the hive_process entity with "guid":
"a2fc8760-8906-454c-8ad8-23b1fffa7fdb" does not list the above
hive_process_execution in its "processExecutions": [] block.
The relationship edge would be created when the hive_process_execution is
processed. Before that, however, when the import processes hive_process, it
finds the relationship attribute empty, assumes the edges are unused, and
tries to delete them.
When a large migration is restarted, the count of deleted entities keeps
accumulating because of the above issue. This slows the migration down
considerably, so processing the data takes longer than expected.
This patch adds a flag so that, only when Atlas is in migration mode, the code
path that would delete these edges is skipped.
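The idea can be illustrated with a minimal, self-contained sketch. It is not
the code from the diff: the class MigrationEdgeSketch, the nested Context
class, and the method edgesToDelete are hypothetical names chosen for
illustration, and the real change lives in the files listed under Diffs. The
sketch only shows the guard itself: outside migration, existing edges missing
from the incoming relationship attribute are treated as stale; in migration
mode nothing is deleted, because the referring entity may simply not have
been imported yet.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these names come from the Atlas code base.
final class MigrationEdgeSketch {

    /** Per-request state, standing in for a flag carried on a request context. */
    static final class Context {
        private final boolean migrationInProgress;

        Context(boolean migrationInProgress) {
            this.migrationInProgress = migrationInProgress;
        }

        boolean isMigrationInProgress() {
            return migrationInProgress;
        }
    }

    /**
     * Returns the edges that would be deleted when a relationship attribute
     * is updated with the given incoming set of edges.
     */
    static List<String> edgesToDelete(Context ctx,
                                      Set<String> existingEdges,
                                      Set<String> incomingEdges) {
        List<String> stale = new ArrayList<>();

        // In migration mode an empty or partial attribute (for example
        // "processExecutions": []) does not mean the edges are unused,
        // so the deletion path is skipped entirely.
        if (ctx.isMigrationInProgress()) {
            return stale;
        }

        // Normal mode: any existing edge absent from the payload is stale.
        for (String edge : existingEdges) {
            if (!incomingEdges.contains(edge)) {
                stale.add(edge);
            }
        }
        return stale;
    }
}
```

With an existing edge to b98ef015-6bd9-4343-ad85-24628aa76731 and an empty
incoming set, edgesToDelete returns that edge in normal mode and an empty list
when the Context was created with migrationInProgress set to true.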
Diffs (updated)
-----
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
6b395dd17
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/MigrationImport.java
f8c9218c6
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumer.java
b73988fd7
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumerBuilder.java
7eac8df73
server-api/src/main/java/org/apache/atlas/RequestContext.java e144d3650
Diff: https://reviews.apache.org/r/75212/diff/3/
Changes: https://reviews.apache.org/r/75212/diff/2-3/
Testing
-------
With the migration flag set, the deletion flow is skipped and the migration
completes within the expected time span.
Thanks,
chaitali