sandy-bes opened a new pull request, #2351:
URL: https://github.com/apache/age/pull/2351

   Motivation / Problem: 
   Load testing revealed a significant performance degradation in insertion 
scenarios. The scenarios were taken from the open-source LDBC SNB Interactive 
benchmark and rewritten in pure SQL. The original Cypher queries can be found 
here:
   1) https://github.com/ldbc/ldbc_snb_interactive_v1_impls/blob/main/cypher/queries/interactive-update-1.cypher
   2) https://github.com/ldbc/ldbc_snb_interactive_v1_impls/blob/main/cypher/queries/interactive-update-6.cypher
   3) https://github.com/ldbc/ldbc_snb_interactive_v1_impls/blob/main/cypher/queries/interactive-update-7.cypher
   
   Analysis showed that the main bottleneck is the `entity_exists` function, 
which uses a sequential scan (`SeqScan`) to check whether an entity exists 
before insertion. A `SeqScan` takes O(N) time, so the cost of each check grows 
linearly with the number of rows in the table. The larger the graph became, the 
longer each individual insertion took. This led to a drop in TPS regardless of 
the concurrency level (the issue was consistently reproduced with both 1 and 30 
threads).
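
   The gap between a linear scan and an index lookup can be sketched in a few 
lines of Python. This is only an illustration and not the code from this PR: 
the function names are hypothetical, a plain list stands in for the table, and 
a binary search over sorted ids stands in for a B-tree index lookup.

```python
def exists_seq_scan(rows, target):
    """Check every row in turn; return (found, rows_examined)."""
    examined = 0
    for row_id in rows:
        examined += 1
        if row_id == target:
            return True, examined
    return False, examined


def exists_index_scan(sorted_ids, target):
    """Binary search over sorted ids; return (found, steps_taken)."""
    lo, hi, steps = 0, len(sorted_ids), 0
    while lo < hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_ids[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    found = lo < len(sorted_ids) and sorted_ids[lo] == target
    return found, steps


ids = list(range(200_000))      # hypothetical table of 200,000 entity ids
missing = len(ids)              # id not present: worst case for the linear scan
print(exists_seq_scan(ids, missing))    # examines all 200,000 rows
print(exists_index_scan(ids, missing))  # needs at most 18 steps
```

   A real index scan also pays for page reads and visibility checks, so the 
step counts only show how the two approaches scale, not actual timings.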
   
   Changes Made:
   - Added an index scan (`IndexScan`, O(log N) lookup time) to the 
`entity_exists` function.
   - Refactored other functions that used `SeqScan`: they were also migrated 
to use `IndexScan` wherever applicable.
   
   Performance Impact: 
   Benchmarks were conducted on a server with 30 CPU cores and 32 GB of RAM, 
using a graph ranging from 20,000 to 200,000 objects over a 2-minute duration. 
The transition to index access completely eliminated the performance 
degradation associated with data volume growth:
   - Before: ~1,500 TPS at peak, degrading further as the table grew.
   - After: stable ~15,000 TPS (about a 10x speedup).
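
   Why the old behaviour degrades with table size while the new one stays 
flat can be seen from a rough cost model. The sketch below is not a 
measurement. It assumes, purely for illustration, that the table grows one row 
at a time from 20,000 to 200,000 rows, that each insertion performs exactly one 
existence check, and that a check costs N steps with a sequential scan and 
about log2(N) steps with an index. The helper name `total_steps` is 
hypothetical.

```python
def total_steps(start_rows, end_rows, cost_per_check):
    """Sum the per-check cost while the table grows one row at a time."""
    return sum(cost_per_check(n) for n in range(start_rows, end_rows))


# Assumed cost models: N steps per sequential check, ~log2(N) per index check.
seq_total = total_steps(20_000, 200_000, lambda n: n)
idx_total = total_steps(20_000, 200_000, lambda n: n.bit_length())

print("sequential scan steps:", seq_total)
print("index scan steps:     ", idx_total)
print("per-check cost at 20,000 vs 200,000 rows (sequential):", 20_000, 200_000)
print("per-check cost at 20,000 vs 200,000 rows (index):     ",
      (20_000).bit_length(), (200_000).bit_length())
```

   Under these assumptions the sequential per-check cost grows tenfold over 
the run while the index cost barely moves. The step counts are not TPS 
figures: the TPS numbers above are end-to-end and include costs this model 
ignores.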
   
   Acknowledgments:
   - Huge thanks to Daria Barsukova for conducting the load testing and 
isolating the issue.
   - Implementation of index scanning: Alexandra Bondar.

