[PR] Add index scan [age]

via GitHub Mon, 02 Mar 2026 23:15:07 -0800


sandy-bes opened a new pull request, #2351:
URL: https://github.com/apache/age/pull/2351

Motivation / Problem:
Load testing revealed significant performance degradation in insertion
scenarios. The scenarios were taken from an open-source benchmark and
rewritten in pure SQL. Examples of the original queries can be found here:
1) https://github.com/ldbc/ldbc_snb_interactive_v1_impls/blob/main/cypher/queries/interactive-update-1.cypher
2) https://github.com/ldbc/ldbc_snb_interactive_v1_impls/blob/main/cypher/queries/interactive-update-6.cypher
3) https://github.com/ldbc/ldbc_snb_interactive_v1_impls/blob/main/cypher/queries/interactive-update-7.cypher

Analysis showed that the main bottleneck is the entity_exists function. The
root cause lies in the use of a Sequential Scan (SeqScan) to check for the
existence of an entity prior to insertion. The time complexity of a `SeqScan`
is O(N), meaning the search time grows linearly as the number of rows in the
table increases. The larger the graph became, the longer each individual
insertion took. This led to a drop in TPS regardless of the concurrency level
(the issue was consistently reproduced with both 1 and 30 threads).
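
To illustrate the complexity argument, here is a minimal sketch in Python (not
the AGE implementation). It assumes a plain in-memory list of sorted integer
ids as a stand-in for the table, and a binary search as a stand-in for an
index lookup; the function names are hypothetical. It counts how many entries
each existence check examines.

```python
def exists_seq_scan(ids, target):
    """Linear existence check. Returns (found, rows_examined)."""
    examined = 0
    for value in ids:
        examined += 1
        if value == target:
            return True, examined
    return False, examined


def exists_index_lookup(sorted_ids, target):
    """Binary-search existence check on sorted ids. Returns (found, probes)."""
    lo, hi = 0, len(sorted_ids)
    probes = 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_ids[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    found = lo < len(sorted_ids) and sorted_ids[lo] == target
    return found, probes


# Hypothetical table of 200,000 ids; look up an id that is not present,
# which is the usual case when checking before an insert.
ids = list(range(200_000))
missing_id = 200_000
seq_found, seq_examined = exists_seq_scan(ids, missing_id)
idx_found, idx_probes = exists_index_lookup(ids, missing_id)
```

In this sketch the linear check examines every entry when the id is absent,
while the binary search needs on the order of log2(N) probes, which is why the
cost of the sequential variant grows with the table and the indexed variant
stays nearly flat.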

Changes Made:
- Added an index scan (IndexScan, time complexity O(log N)) inside the
entity_exists function.
- Refactored other functions that used SeqScan: they were also migrated to
use IndexScan wherever applicable.
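
As a rough analogy for the change, the sketch below uses Python's built-in
sqlite3 module as a stand-in for PostgreSQL (an assumption made only to keep
the example self-contained). It shows the same existence-check query planned
as a full scan without an index and as an index search once one exists. The
table name `vertex`, the index name `vertex_id_idx`, and the helper function
are hypothetical and are not taken from the AGE code.

```python
import sqlite3


def existence_check_plan(conn, entity_id):
    """Return the planner's description of a point existence check."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT 1 FROM vertex WHERE id = ? LIMIT 1",
        (entity_id,),
    ).fetchall()
    # The human-readable plan detail is the fourth column of each row.
    return " ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
# Hypothetical table: the id column deliberately has no index at first.
conn.execute("CREATE TABLE vertex (id INTEGER, properties TEXT)")
conn.executemany(
    "INSERT INTO vertex VALUES (?, '{}')", ((i,) for i in range(1000))
)

plan_before = existence_check_plan(conn, 999)  # planned as a table scan

conn.execute("CREATE INDEX vertex_id_idx ON vertex (id)")
plan_after = existence_check_plan(conn, 999)   # planned as an index search
```

Printing `plan_before` and `plan_after` shows the planner switching from a
scan of the whole table to a search that uses the index. The exact wording of
the plan text depends on the SQLite version.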

Performance Impact:
Benchmarks were conducted on a server with 30 CPU cores and 32 GB of RAM,
using a graph ranging from 20,000 to 200,000 objects over a 2-minute duration.
The transition to index access completely eliminated the performance
degradation associated with data volume growth:
- Before: ~1,500 TPS (at peak, with subsequent degradation as the table
grew).
- After: Stable ~15,000 TPS (a 10x speedup).

Acknowledgments:
- Huge thanks to Daria Barsukova for conducting the load testing and
isolating the issue.
- Implementation of index scanning: Alexandra Bondar.
