-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/74130/#review224700
-----------------------------------------------------------

Ship it!

- Sidharth Mishra


On Sept. 22, 2022, 6:17 p.m., Sheetal Shah wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/74130/
> -----------------------------------------------------------
>
> (Updated Sept. 22, 2022, 6:17 p.m.)
>
>
> Review request for atlas, Jayendra Parab, Mandar Ambawane, and Pinal Shah.
>
>
> Repository: atlas
>
>
> Description
> -------
>
> Problem statement: While working with a Kafka dump that contained messages
> from Spark streaming applications, we observed that when an application is
> updated, most of the time is spent re-indexing its edges, and that "deleted"
> relationship edges were also being re-indexed every time the application was
> updated for an incoming process message.
> With 35k processes, each update took an average of 135 seconds, and this
> time would increase as new processes entered the system.
>
> Changes have been made to consider only active edges when processing
> relationship edges. As a result, only newly added edges are
> processed/indexed, which makes a significant difference in processing time
> when the entity being updated has a large number of deleted edges.
>
>
> Diffs
> -----
>
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java 68d331dfd
>
>
> Diff: https://reviews.apache.org/r/74130/diff/1/
>
>
> Testing
> -------
>
> We replayed the same Kafka dump with the changes applied, and the time taken
> to process messages was significantly lower. With the fix, only non-deleted
> edges were considered for processing/re-indexing, giving a consistent
> processing time of around 1 to 2 seconds.
>
>
> Thanks,
>
> Sheetal Shah
>
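
For illustration only, the idea described in the review request (skip "deleted" relationship edges and process only active ones) could be sketched roughly as below. This is not the actual patch to EntityGraphMapper.java; the Edge and Status types and the activeEdges method are hypothetical stand-ins and are not Atlas APIs.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch, assuming each edge carries a status flag.
// All names here are hypothetical and exist only for this example.
class ActiveEdgeFilter {

    // Hypothetical edge status; the review only distinguishes active vs. deleted.
    enum Status { ACTIVE, DELETED }

    // Hypothetical edge type holding an id and a status.
    static final class Edge {
        final String id;
        final Status status;

        Edge(String id, Status status) {
            this.id = id;
            this.status = status;
        }
    }

    // Returns only the ACTIVE edges, preserving their original order,
    // so that later processing/re-indexing never touches deleted edges.
    static List<Edge> activeEdges(List<Edge> edges) {
        List<Edge> result = new ArrayList<>();
        for (Edge edge : edges) {
            if (edge.status == Status.ACTIVE) {
                result.add(edge);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Edge> edges = new ArrayList<>();
        edges.add(new Edge("e1", Status.DELETED));
        edges.add(new Edge("e2", Status.ACTIVE));
        edges.add(new Edge("e3", Status.DELETED));

        // Only e2 remains to be processed.
        for (Edge edge : activeEdges(edges)) {
            System.out.println(edge.id); // prints "e2"
        }
    }
}
```

With a filter like this in front of the re-indexing step, the work per update would scale with the number of active edges rather than with the total number of edges ever created, which is consistent with the drop in processing time reported in the Testing section.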