[ https://issues.apache.org/jira/browse/ATLAS-4903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17883174#comment-17883174 ]
ASF subversion and git services commented on ATLAS-4903:
--------------------------------------------------------
Commit fe7d79c841b7d294c9990d676672898102d22bb0 in atlas's branch
refs/heads/branch-2.0 from chaitali
[ https://gitbox.apache.org/repos/asf?p=atlas.git;h=fe7d79c84 ]
ATLAS-4903 : When migration restarts it results into deletion of edges and
vertices
Signed-off-by: Pinal Shah <[email protected]>
> When migration restarts it results into deletion of edges and vertices
> -----------------------------------------------------------------------
>
> Key: ATLAS-4903
> URL: https://issues.apache.org/jira/browse/ATLAS-4903
> Project: Atlas
> Issue Type: Improvement
> Affects Versions: 3.0.0
> Reporter: chaitali borole
> Assignee: chaitali borole
> Priority: Major
>
> In the hive_process_execution entity shown below, the "process"
> relationship attribute refers to a hive_process entity with "guid":
> "a2fc8760-8906-454c-8ad8-23b1fffa7fdb":
> "entity": {
>   "typeName": "hive_process_execution",
>   "attributes": {
>     "hostName": "",
>     "qualifiedName": "cm:6135:db_hive_mig_hive.db_hive_mig_hive_tbl_00@cm:1698112413000:6155:1698112413000:1698112463140",
>     "name": "cm:6135:db_hive_mig_hive.db_hive_mig_hive_tbl_00@cm:1698112413000:6155:1698112413000:1698112463140",
>     "queryText": "insert into db_hive_mig_hive.db_hive_mig_hive_tbl_00...2023-10-24T01:53:33.000Z",
>     "startTime": 1698112413000,
>     "queryPlan": "Not Supported",
>     "endTime": 1698112463140,
>     "userName": "mailto:hive/quasar-cpvvvn-1.quasar-cpvvvn.root.hwx.s...@qe-infra-ad.cloudera.com",
>     "queryId": "",
>     "owner": null,
>     "displayName": null,
>     "description": null,
>     "userDescription": null
>   },
>   "guid": "b98ef015-6bd9-4343-ad85-24628aa76731",
>   "isIncomplete": false,
>   "provenanceType": 0,
>   "status": "ACTIVE",
>   "createTime": 1698112413000,
>   "updateTime": 1698112413000,
>   "version": 0,
>   "relationshipAttributes": {
>     "process": {
>       "guid": "a2fc8760-8906-454c-8ad8-23b1fffa7fdb",
>       "typeName": "hive_process"
>     }
>   },
>   "customAttributes": {
>     "__nav_engineType": "\"MR\""
>   },
>   "businessAttributes": {},
>   "proxy": false
> }
> However, the entity with "guid": "a2fc8760-8906-454c-8ad8-23b1fffa7fdb" does
> not list the above hive_process_execution in its "processExecutions": []
> block.
> The relationship edge is therefore created only when the
> hive_process_execution is processed. Before that, when the migration
> processes the hive_process, it finds the relationship attribute empty,
> assumes the edges are unused, and tries to delete them.
> When a large migration is restarted, the deleted entities count keeps
> accumulating because of this issue, which slows the migration down
> considerably and makes it take longer than expected to process the data.
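
The ordering problem described above can be illustrated with a small toy
model. This is only a sketch and not Atlas code: import_entity, run_migration,
and the in-memory edge set are hypothetical names made up for illustration.
It assumes, as the description suggests, that edges not declared on the
hive_process are treated as unused when that entity is processed.

```python
# Toy model only: import_entity, run_migration and the edge set are
# hypothetical and do not correspond to Atlas classes or methods.

PROCESS_GUID = "a2fc8760-8906-454c-8ad8-23b1fffa7fdb"
EXECUTION_GUID = "b98ef015-6bd9-4343-ad85-24628aa76731"

# Entities in the order described in the issue: hive_process first,
# with an empty "processExecutions" list, then hive_process_execution.
ENTITIES = [
    {"guid": PROCESS_GUID, "typeName": "hive_process",
     "relationshipAttributes": {"processExecutions": []}},
    {"guid": EXECUTION_GUID, "typeName": "hive_process_execution",
     "relationshipAttributes": {"process": {"guid": PROCESS_GUID}}},
]


def import_entity(entity, edges, deleted):
    rel = entity["relationshipAttributes"]
    if entity["typeName"] == "hive_process":
        declared = {e["guid"] for e in rel["processExecutions"]}
        # Assumed behaviour: existing edges that the process does not
        # declare are considered unused and get deleted.
        for edge in sorted(edges):
            if edge[0] == entity["guid"] and edge[1] not in declared:
                edges.remove(edge)
                deleted.append(edge)
    else:
        # The execution side declares the relationship, so the edge is created here.
        edges.add((rel["process"]["guid"], entity["guid"]))


def run_migration(edges, deleted):
    for entity in ENTITIES:
        import_entity(entity, edges, deleted)


edges, deleted = set(), []
run_migration(edges, deleted)    # first run: no edges exist yet, nothing is deleted
first_run_deletes = len(deleted)
run_migration(edges, deleted)    # restart: the existing edge is deleted, then recreated
print(first_run_deletes, len(deleted), len(edges))  # prints "0 1 1"
```

In this model every restart adds one more deletion per affected edge, which
mirrors the accumulating deleted count reported above.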
--
This message was sent by Atlassian Jira
(v8.20.10#820010)