-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/74713/
-----------------------------------------------------------

(Updated Nov. 28, 2023, 3:26 a.m.)


Review request for atlas, Ashutosh Mestry, Jayendra Parab, Mandar Ambawane, Pinal Shah, Sheetal Shah, and Sidharth Mishra.


Bugs: ATLAS-4803
    https://issues.apache.org/jira/browse/ATLAS-4803


Repository: atlas


Description
-------

Kafka lag was not decreasing for the ATLAS_HOOK topics, and the create-entity API was taking 50-60 seconds per request.

The Hive_table typename count was 10 million records.
The Impala_lineage_column typename count was 26 million records.

The issue was reproducible.

Metrics

The difference exists because, previously, even when fromVertex had no edges, the search would iterate through all the edges of toVertex, so the time consumed was high.

Before:
"getRelationshipEdge":{"count":100000,"timeTaken":50000}

After removing the if condition for toVertex.hasEdge:
"getRelationshipEdge":{"count":100000,"timeTaken":80}


Diffs
-----

  graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraph.java 0dd573b89
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasRelationshipStoreV2.java ef0313e02


Diff: https://reviews.apache.org/r/74713/diff/1/


Testing (updated)
-------

What was the relationship type?
__hive_db.table, __hive_table.columns

Which entity types were identified and tested, that is, for which vertex entity types was finding edges slow?
Impala_column_lineage, impala_process, hive_table, hive_column

What was the count of edges corresponding to each entity type?
Hive_column = 28m
Impala_column_lineage = 24m

Timing before and after

Before:
"getRelationshipEdge":{"count":100000,"timeTaken":50000}

After removing the if condition for toVertex.hasEdge:
"getRelationshipEdge":{"count":100000,"timeTaken":80}

Volume testing

Initiated a Kafka dump, and the lag started decreasing.


Thanks,

Paresh Devalia
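

Illustration

The effect described under Description (scanning the edges of a heavily connected toVertex when fromVertex has no edges at all) can be sketched with a small in-memory model. This is only a sketch under stated assumptions, not the code in AtlasRelationshipStoreV2: the Edge and Graph classes, the "tables" label, the vertex ids and the method names are all hypothetical, and no Atlas or JanusGraph API is used.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these types are Atlas or JanusGraph classes.
class RelationshipEdgeLookup {

    /** A directed, labelled edge between two vertex ids. */
    static final class Edge {
        final String label;
        final String fromId;
        final String toId;

        Edge(String label, String fromId, String toId) {
            this.label = label;
            this.fromId = fromId;
            this.toId = toId;
        }
    }

    /** Minimal graph that indexes every edge by both of its endpoints. */
    static final class Graph {
        private final Map<String, List<Edge>> outEdges = new HashMap<>();
        private final Map<String, List<Edge>> inEdges = new HashMap<>();

        void addEdge(String label, String fromId, String toId) {
            Edge e = new Edge(label, fromId, toId);
            outEdges.computeIfAbsent(fromId, k -> new ArrayList<>()).add(e);
            inEdges.computeIfAbsent(toId, k -> new ArrayList<>()).add(e);
        }

        List<Edge> outEdgesOf(String id) {
            return outEdges.getOrDefault(id, Collections.<Edge>emptyList());
        }

        List<Edge> inEdgesOf(String id) {
            return inEdges.getOrDefault(id, Collections.<Edge>emptyList());
        }
    }

    /** Number of edges examined by the most recent lookup. */
    static long visited;

    /** Assumed "before" shape: walk the incoming edges of toVertex. */
    static Edge findViaToVertex(Graph g, String label, String fromId, String toId) {
        visited = 0;
        for (Edge e : g.inEdgesOf(toId)) {
            visited++;
            if (e.label.equals(label) && e.fromId.equals(fromId)) {
                return e;
            }
        }
        return null;
    }

    /** Assumed "after" shape: walk only the outgoing edges of fromVertex. */
    static Edge findViaFromVertex(Graph g, String label, String fromId, String toId) {
        visited = 0;
        for (Edge e : g.outEdgesOf(fromId)) {
            visited++;
            if (e.label.equals(label) && e.toId.equals(toId)) {
                return e;
            }
        }
        return null;
    }

    /** Builds one hub vertex "db" with n incoming "tables" edges from t0..t(n-1). */
    static Graph buildHubGraph(int n) {
        Graph g = new Graph();
        for (int i = 0; i < n; i++) {
            g.addEdge("tables", "t" + i, "db");
        }
        return g;
    }

    public static void main(String[] args) {
        Graph g = buildHubGraph(100_000);

        // "t_new" has no edges yet, as for a freshly created entity.
        findViaToVertex(g, "tables", "t_new", "db");
        System.out.println("via toVertex:   " + visited + " edges examined");

        findViaFromVertex(g, "tables", "t_new", "db");
        System.out.println("via fromVertex: " + visited + " edges examined");
    }
}
```

In this model, a lookup for a vertex that has no edges examines every edge of the hub when it starts from toVertex, and none when it starts from fromVertex. That is the same shape as the before/after difference in the getRelationshipEdge metrics, although the real timings depend on the graph backend.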