[
https://issues.apache.org/jira/browse/PHOENIX-7759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hari Krishna Dara updated PHOENIX-7759:
---------------------------------------
Attachment: PHOENIX-7759-design.pdf
> Preserve buffered mutations when batch size limit is exceeded
> -------------------------------------------------------------
>
> Key: PHOENIX-7759
> URL: https://issues.apache.org/jira/browse/PHOENIX-7759
> Project: Phoenix
> Issue Type: New Feature
> Reporter: Hari Krishna Dara
> Assignee: Hari Krishna Dara
> Priority: Minor
> Attachments: PHOENIX-7759-design.pdf
>
>
> h3. Summary
> When applications {{UPSERT}} multiple rows with deferred commit, mutations
> accumulate in client-side {{MutationState}}. This occurs when:
> * Using {{executeUpdate()}} with {{autoCommit=false}}
> * Using {{addBatch()}} with {{executeBatch()}} (regardless of {{autoCommit}}
> value)
> Currently, when the configured limit ({{phoenix.mutate.maxSize}} or
> {{phoenix.mutate.maxSizeBytes}}) is reached, Phoenix clears all buffered
> mutations and throws an exception, causing data loss and requiring
> applications to restart batch processing from the beginning.
> h3. Problem
> Applications have no opportunity to commit partial progress when limits are
> reached. Workarounds like setting excessively large limits or implementing
> custom batching heuristics are either risky or inefficient.
> h3. Solution
> Introduce a new configuration property
> {{phoenix.mutate.preserveOnLimitExceeded}} (default: {{false}}) that,
> when enabled:
> 1. Performs a pre-check before joining mutations to detect if limits would be
> exceeded
> 2. Throws a new {{MutationLimitReachedException}} without clearing existing
> buffered mutations
> 3. For {{executeBatch()}}, handles the above exception by trimming the batch
> to contain only the unprocessed items and translating the exception into the
> new {{MutationLimitBatchException}}, which captures the "processed count"
> This allows applications to commit existing mutations and continue processing
> from where they left off, effectively providing the ability to "dynamically
> size" the batch.
> h3. Backward Compatibility
> The new behavior is opt-in. Default behavior (clear mutations on limit
> exceeded) is unchanged.
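
The difference between the current and the proposed behavior can be sketched with a toy model. This is a minimal illustration only, not Phoenix code: the Buffer class, the LimitReached exception, and the loadAll and pendingAfterOverflow helpers are hypothetical stand-ins for MutationState and the proposed MutationLimitReachedException, and the limit is counted in rows only. The executeBatch() path with its "processed count" is not modelled here.

```java
import java.util.ArrayList;
import java.util.List;

class PreserveOnLimitSketch {

    /** Hypothetical stand-in for the proposed MutationLimitReachedException. */
    static class LimitReached extends RuntimeException {
        LimitReached(String message) { super(message); }
    }

    /** Toy model of a client-side mutation buffer; not Phoenix's MutationState. */
    static class Buffer {
        private final int maxSize;
        private final boolean preserveOnLimitExceeded;
        final List<String> pending = new ArrayList<>();
        final List<String> committed = new ArrayList<>();

        Buffer(int maxSize, boolean preserveOnLimitExceeded) {
            if (maxSize < 1) throw new IllegalArgumentException("maxSize must be >= 1");
            this.maxSize = maxSize;
            this.preserveOnLimitExceeded = preserveOnLimitExceeded;
        }

        void upsert(String row) {
            // Pre-check: would adding this row exceed the limit?
            if (pending.size() + 1 > maxSize) {
                // Legacy mode drops everything buffered so far; opt-in mode keeps it.
                if (!preserveOnLimitExceeded) pending.clear();
                throw new LimitReached("limit of " + maxSize + " rows reached");
            }
            pending.add(row);
        }

        void commit() {
            committed.addAll(pending);
            pending.clear();
        }
    }

    /** Loads all rows in opt-in mode, committing whenever the limit is hit. */
    static List<String> loadAll(List<String> rows, int maxSize) {
        Buffer buf = new Buffer(maxSize, true);
        int i = 0;
        while (i < rows.size()) {
            try {
                buf.upsert(rows.get(i));
                i++; // advance only once the row is actually buffered
            } catch (LimitReached e) {
                buf.commit(); // buffered rows survived, so flush and retry the same row
            }
        }
        buf.commit(); // flush the final partial batch
        return buf.committed;
    }

    /** Returns how many buffered rows remain after one overflow with maxSize = 2. */
    static int pendingAfterOverflow(boolean preserve) {
        Buffer buf = new Buffer(2, preserve);
        try {
            for (String row : List.of("r1", "r2", "r3")) buf.upsert(row);
        } catch (LimitReached e) {
            // expected on the third row
        }
        return buf.pending.size();
    }

    public static void main(String[] args) {
        System.out.println(loadAll(List.of("r1", "r2", "r3", "r4", "r5"), 2));
        System.out.println("legacy pending: " + pendingAfterOverflow(false));
        System.out.println("opt-in pending: " + pendingAfterOverflow(true));
    }
}
```

In the toy model, the legacy path leaves nothing to commit after the overflow, while the opt-in path keeps the rows buffered so far, so the caller can commit and resume at the row that was rejected. A real client would do the same with a Phoenix connection and the exception types proposed in the description.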