-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73917/
-----------------------------------------------------------
(Updated April 20, 2022, 6:20 p.m.)
Review request for atlas, Ashutosh Mestry, Jayendra Parab, Madhan Neethiraj,
Pinal Shah, Radhika Kundam, Sarath Subramanian, and Sidharth Mishra.
Changes
-------
Addressed review comments
Bugs: ATLAS-4572
https://issues.apache.org/jira/browse/ATLAS-4572
Repository: atlas
Description
-------
Previously, soft-deleting relationships invoked the delete methods even on
relationship edges that were already deleted.
This consumed a lot of time for an entity with a long list of soft-deleted
relationships.
This change adds a check on relationship edges that identifies already
deleted edges and skips the delete call for them, so that only active
relationship edges are deleted.
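
The check described above can be illustrated with a minimal, self-contained
sketch. It is not the code from DeleteHandlerV1.java: the class, enum, and
method names below (SoftDeleteSketch, Status, RelationshipEdge,
deleteActiveRelationships) are hypothetical and exist only to show the idea of
skipping edges that are already marked deleted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the real handler; names are assumptions.
class SoftDeleteSketch {
    enum Status { ACTIVE, DELETED }

    // Hypothetical, simplified relationship edge with a mutable state.
    static final class RelationshipEdge {
        final String id;
        Status status;

        RelationshipEdge(String id, Status status) {
            this.id = id;
            this.status = status;
        }
    }

    // Counts how often the (potentially expensive) delete is invoked.
    static int deleteCalls = 0;

    // Stand-in for the costly soft-delete work done per edge.
    static void softDelete(RelationshipEdge edge) {
        deleteCalls++;
        edge.status = Status.DELETED;
    }

    // Soft-deletes only edges that are still active and returns
    // the number of edges that were actually deleted.
    static int deleteActiveRelationships(List<RelationshipEdge> edges) {
        int deleted = 0;
        for (RelationshipEdge edge : edges) {
            if (edge.status == Status.DELETED) {
                continue; // already soft-deleted, skip the delete call
            }
            softDelete(edge);
            deleted++;
        }
        return deleted;
    }

    public static void main(String[] args) {
        List<RelationshipEdge> edges = new ArrayList<>();
        edges.add(new RelationshipEdge("e1", Status.DELETED));
        edges.add(new RelationshipEdge("e2", Status.ACTIVE));
        edges.add(new RelationshipEdge("e3", Status.DELETED));
        System.out.println("edges deleted: " + deleteActiveRelationships(edges));
    }
}
```

With this shape, the cost of a delete grows with the number of active edges
rather than with the full list of edges, which is the effect the patch aims
for on entities with many soft-deleted relationships.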
Diffs (updated)
-----
repository/src/main/java/org/apache/atlas/repository/store/graph/v1/DeleteHandlerV1.java
f118ae69a
Diff: https://reviews.apache.org/r/73917/diff/3/
Changes: https://reviews.apache.org/r/73917/diff/2-3/
Testing
-------
PreCommit:
https://ci-builds.apache.org/job/Atlas/job/PreCommit-ATLAS-Build-Test/1065/
We took two clusters, one with the performance patch applied and the other
without any changes.
We loaded a Kafka dump on both clusters. There were around 170k Spark process
entities.
On the cluster without any changes, it took more than 48 hours.
On the cluster with the performance improvement changes, it took 25 hours to
consume the entire Kafka dump.
Also,
On the cluster without changes, it took approximately 45 seconds to process
each Kafka message.
On the cluster with the performance improvement changes, it took 3 to 5
seconds to process each Kafka message.
We also did sanity testing for Atlas.
Thanks,
Mandar Ambawane