Antonio Murgia created PHOENIX-7620:
---------------------------------------
Summary: Sequence cache is ignored when performing SELECT NEXT N VALUES
Key: PHOENIX-7620
URL: https://issues.apache.org/jira/browse/PHOENIX-7620
Project: Phoenix
Issue Type: Improvement
Components: phoenix
Affects Versions: 5.1.3
Reporter: Antonio Murgia
During performance testing under high concurrency, we observed significant
contention around sequence value generation in Apache Phoenix, particularly
with:
SELECT NEXT :n VALUES FOR ...
SELECT NEXT VALUE FOR ...
As expected, using sequences without caching under high parallelism (e.g. 600
clients) leads to severe contention due to the need for synchronized access and
row-level locking in HBase.
To mitigate this, we recreated sequences with an explicit CACHE n clause. This
resulted in a dramatic improvement for SELECT NEXT VALUE FOR, with most
round-trips eliminated and near-zero response times.
However, when testing SELECT NEXT N VALUES FOR, we noticed it doesn't honor the
CACHE setting. Despite being a bulk allocation, it appears to always generate a
remote round-trip per call. Upon inspecting the code path (specifically
SequenceRegionObserver [1]), it seems the number of values to allocate is
passed via NUM_TO_ALLOCATE, but the caching logic is bypassed.
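To make the observed difference concrete, here is a toy model of the behavior described above. It is not Phoenix code: SequenceServer, SequenceClient and round_trips_for are hypothetical names, and the model only assumes what this report states (single-value requests are served from a client-side cache of CACHE values, bulk requests always go to the server).

```python
class SequenceServer:
    """Toy stand-in for the server-side sequence row; counts round-trips."""

    def __init__(self, start=1):
        self.next_free = start
        self.round_trips = 0

    def allocate(self, count):
        """Reserve `count` consecutive values and return the first one."""
        self.round_trips += 1
        first = self.next_free
        self.next_free += count
        return first


class SequenceClient:
    """Hypothetical client: caches single values, bypasses the cache for bulk."""

    def __init__(self, server, cache_size):
        self.server = server
        self.cache_size = cache_size
        self.current = 0
        self.limit = 0  # exclusive upper bound of the cached range

    def next_value(self):
        if self.current >= self.limit:  # cache exhausted: one round-trip
            self.current = self.server.allocate(self.cache_size)
            self.limit = self.current + self.cache_size
        value = self.current
        self.current += 1
        return value

    def next_n_values(self, n):
        # Behavior as reported in this issue: always a remote call.
        return self.server.allocate(n)


def round_trips_for(calls, cache_size, bulk):
    """Count server round-trips for `calls` requests from one client."""
    server = SequenceServer()
    client = SequenceClient(server, cache_size)
    for _ in range(calls):
        if bulk:
            client.next_n_values(5)
        else:
            client.next_value()
    return server.round_trips


print(round_trips_for(1000, cache_size=100, bulk=False))  # 10
print(round_trips_for(1000, cache_size=100, bulk=True))   # 1000
```

In this model, 1000 single-value requests with CACHE 100 cost 10 round-trips, while 1000 bulk requests cost 1000, which matches the pattern we measured.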
We're wondering:
1. Is this behavior intentional for bulk allocations?
2. Would it be feasible to enhance the logic so that SELECT NEXT N VALUES FOR
leverages the sequence cache when available, thereby reducing unnecessary
round-trips?
This change could significantly improve performance in high-throughput
scenarios using batch inserts or parallel writers.
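As a starting point for discussion, the following sketch shows one possible policy for serving bulk requests from the cache. It is a toy model, not a patch: the class and function names are hypothetical, and the policy (serve from the cached range when it holds at least N values, otherwise fetch a fresh block of max(N, CACHE) values and drop the leftovers) is an assumption on our side, not existing Phoenix behavior.

```python
class SequenceServer:
    """Toy stand-in for the server-side sequence row; counts round-trips."""

    def __init__(self, start=1):
        self.next_free = start
        self.round_trips = 0

    def allocate(self, count):
        """Reserve `count` consecutive values and return the first one."""
        self.round_trips += 1
        first = self.next_free
        self.next_free += count
        return first


class CachingBulkClient:
    """Hypothetical client that serves bulk requests from its cached range."""

    def __init__(self, server, cache_size):
        self.server = server
        self.cache_size = cache_size
        self.current = 0
        self.limit = 0  # exclusive upper bound of the cached range

    def next_n_values(self, n):
        """Return the first of `n` consecutive values."""
        if self.limit - self.current < n:
            # Not enough cached values for a contiguous range: fetch a new
            # block. Any leftover cached values are dropped (assumed policy),
            # which leaves a gap in the sequence.
            block = max(n, self.cache_size)
            self.current = self.server.allocate(block)
            self.limit = self.current + block
        first = self.current
        self.current += n
        return first


def bulk_round_trips(calls, n, cache_size):
    """Run `calls` bulk requests; return the range starts and round-trips."""
    server = SequenceServer()
    client = CachingBulkClient(server, cache_size)
    firsts = [client.next_n_values(n) for _ in range(calls)]
    return firsts, server.round_trips


firsts, trips = bulk_round_trips(calls=1000, n=5, cache_size=100)
print(trips)  # 50
```

Under these assumptions, 1000 requests for 5 values each with CACHE 100 need 50 round-trips instead of 1000, and every returned range stays contiguous and disjoint from the others. The trade-off is the same as for single-value caching: unused cached values are lost, so gaps in the sequence become more likely.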
Looking forward to your thoughts and happy to discuss further.