-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/74130/#review224700
-----------------------------------------------------------


Ship it!

- Sidharth Mishra


On Sept. 22, 2022, 6:17 p.m., Sheetal Shah wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/74130/
> -----------------------------------------------------------
> 
> (Updated Sept. 22, 2022, 6:17 p.m.)
> 
> 
> Review request for atlas, Jayendra Parab, Mandar Ambawane, and Pinal Shah.
> 
> 
> Repository: atlas
> 
> 
> Description
> -------
> 
> Problem statement: While working with a Kafka dump that contained messages
> from Spark streaming applications, we observed that when an application is
> updated, most of the time is spent re-indexing edges, and that "deleted"
> relationship edges were also being re-indexed every time an application was
> updated for an incoming process message.
> This took a few minutes to process for 35k processes (135 seconds on
> average), and the time would increase as new processes enter the system.
> 
> The change considers only active edges when processing relationship edges,
> so only newly added edges end up being processed/indexed. This makes a
> significant difference in processing time when the entity being updated has
> a very large number of deleted edges.
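
The idea described above can be illustrated with a minimal, self-contained sketch. It does not reproduce the actual change in EntityGraphMapper.java; the names `Edge`, `Status`, and `activeEdges` are hypothetical stand-ins, and the sketch assumes each edge carries a simple active/deleted state.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ActiveEdgeFilter {
    // Hypothetical stand-in for an edge's lifecycle state; not the Atlas type.
    enum Status { ACTIVE, DELETED }

    // Hypothetical minimal edge: an id plus its state.
    static final class Edge {
        final String id;
        final Status status;

        Edge(String id, Status status) {
            this.id = id;
            this.status = status;
        }
    }

    // Returns only the edges still marked ACTIVE, preserving input order.
    static List<Edge> activeEdges(List<Edge> edges) {
        List<Edge> result = new ArrayList<>();
        for (Edge edge : edges) {
            if (edge.status == Status.ACTIVE) {
                result.add(edge);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Edge> edges = Arrays.asList(
                new Edge("e1", Status.DELETED),
                new Edge("e2", Status.ACTIVE),
                new Edge("e3", Status.DELETED));

        // Only e2 would be handed on for processing/re-indexing.
        for (Edge edge : activeEdges(edges)) {
            System.out.println(edge.id);
        }
    }
}
```

In this sketch the cost of the later processing step depends on the number of active edges rather than on the total number of edges ever attached to the entity, which is the effect the description attributes to the fix.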
> 
> 
> Diffs
> -----
> 
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java 68d331dfd 
> 
> 
> Diff: https://reviews.apache.org/r/74130/diff/1/
> 
> 
> Testing
> -------
> 
> We replayed the same Kafka dump with the changes, and the time taken to
> process messages was significantly lower. With the fix, only non-deleted
> edges are considered for processing/re-indexing, which led to a consistent
> processing time of around 1 to 2 seconds.
> 
> 
> Thanks,
> 
> Sheetal Shah
> 
>
