Hari Krishna Dara created PHOENIX-7759:
------------------------------------------
Summary: Preserve buffered mutations when batch size limit is
exceeded
Key: PHOENIX-7759
URL: https://issues.apache.org/jira/browse/PHOENIX-7759
Project: Phoenix
Issue Type: New Feature
Reporter: Hari Krishna Dara
Assignee: Hari Krishna Dara
h3. Summary
When applications \{{UPSERT}} multiple rows with deferred commit, mutations
accumulate in client-side \{{MutationState}}. This occurs when:
* Using \{{executeUpdate()}} with \{{autoCommit=false}}
* Using \{{addBatch()}} with \{{executeBatch()}} (regardless of \{{autoCommit}}
value)
Currently, when the configured limit (\{{phoenix.mutate.maxSize}} or
\{{phoenix.mutate.maxSizeBytes}}) is reached, Phoenix clears all buffered
mutations and throws an exception, causing data loss and requiring applications
to restart batch processing from the beginning.
h3. Problem
Applications have no opportunity to commit partial progress when limits are
reached. Workarounds like setting excessively large limits or implementing
custom batching heuristics are either risky or inefficient.
h3. Solution
Introduce a new configuration property
\{{phoenix.mutate.preserveOnLimitExceeded}} (default: \{{false}}) that, when
enabled:
1. Performs a pre-check before joining mutations to detect if limits would be
exceeded
2. Throws a new \{{MutationLimitReachedException}} without clearing existing
buffered mutations
3. For \{{executeBatch()}}, handles the above exception by trimming the batch to
contain only unprocessed items and translating the exception into a new
\{{MutationLimitBatchException}} that captures the "processed count"
This allows applications to commit existing mutations and continue processing
from where they left off, effectively providing the ability to "dynamically
size" the batch.
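The commit-and-continue flow described above can be sketched with a toy model. This is only an illustration under stated assumptions: \{{ToyConnection}}, \{{writeAll}} and the exception class below are hypothetical stand-ins written for this example, not the Phoenix classes, and the real \{{MutationLimitBatchException}} may expose the processed count differently. The model enforces only a row-count limit (in the spirit of \{{phoenix.mutate.maxSize}}) and ignores the byte limit.

```java
import java.util.ArrayList;
import java.util.List;

public class DynamicBatchSketch {

    /** Hypothetical stand-in for the proposed exception; not the Phoenix class. */
    public static class MutationLimitBatchException extends Exception {
        private final int processedCount;

        public MutationLimitBatchException(int processedCount) {
            super("mutation limit reached after " + processedCount + " batch items");
            this.processedCount = processedCount;
        }

        public int getProcessedCount() {
            return processedCount;
        }
    }

    /** Toy model of a connection: a bounded mutation buffer plus a pending batch. */
    public static class ToyConnection {
        private final int maxSize; // models a row-count limit such as phoenix.mutate.maxSize
        private final List<String> buffered = new ArrayList<>();
        private final List<String> batch = new ArrayList<>();
        public final List<String> committed = new ArrayList<>();
        public int commits = 0;

        public ToyConnection(int maxSize) {
            if (maxSize < 1) {
                throw new IllegalArgumentException("maxSize must be >= 1");
            }
            this.maxSize = maxSize;
        }

        public void addBatch(String row) {
            batch.add(row);
        }

        /** Buffers batch items until the limit would be exceeded, then trims the batch. */
        public void executeBatch() throws MutationLimitBatchException {
            int processed = 0;
            while (processed < batch.size()) {
                if (buffered.size() + 1 > maxSize) {     // pre-check before joining
                    batch.subList(0, processed).clear(); // keep only unprocessed items
                    throw new MutationLimitBatchException(processed);
                }
                buffered.add(batch.get(processed));
                processed++;
            }
            batch.clear();
        }

        /** Flushes the buffered mutations; nothing is discarded. */
        public void commit() {
            committed.addAll(buffered);
            buffered.clear();
            commits++;
        }
    }

    /** Commits and resumes each time the limit is hit, so the batch is sized dynamically. */
    public static ToyConnection writeAll(List<String> rows, int maxSize) {
        ToyConnection conn = new ToyConnection(maxSize);
        for (String row : rows) {
            conn.addBatch(row);
        }
        boolean done = false;
        while (!done) {
            try {
                conn.executeBatch();
                done = true;
            } catch (MutationLimitBatchException e) {
                // Buffered mutations were preserved, so flush them and retry;
                // the batch now holds only the unprocessed items.
                conn.commit();
            }
        }
        conn.commit(); // flush whatever the final executeBatch() buffered
        return conn;
    }
}
```

In this model, writing five rows with a limit of two takes three commits and preserves row order. The caller never has to guess a safe batch size up front, which is the "dynamic sizing" the proposal refers to.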
h3. Backward Compatibility
The new behavior is opt-in. Default behavior (clear mutations on limit
exceeded) is unchanged.
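As a rough sketch of the opt-in, a client could carry the flag in the properties it uses to open a connection. This assumes the proposed property can be supplied as a client connection property like other \{{phoenix.mutate.*}} settings, which the issue does not state; the class name, helper method and the limit value are illustrative only.

```java
import java.util.Properties;

public class PreserveOnLimitConfig {

    /** Builds client properties; the flag stays "false" unless the caller opts in. */
    public static Properties clientProperties(boolean preserveOnLimitExceeded) {
        Properties props = new Properties();
        // Illustrative row-count limit, not a recommended value.
        props.setProperty("phoenix.mutate.maxSize", "1000");
        // Proposed property from this issue; assumed here to be settable per client.
        props.setProperty("phoenix.mutate.preserveOnLimitExceeded",
                Boolean.toString(preserveOnLimitExceeded));
        return props;
    }
}
```

The resulting \{{Properties}} object would then be passed to \{{java.sql.DriverManager.getConnection(url, props)}} with the application's own JDBC URL.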