[ 
https://issues.apache.org/jira/browse/PHOENIX-7759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Krishna Dara updated PHOENIX-7759:
---------------------------------------
    Description: 
h3. Summary

When applications {{UPSERT}} multiple rows with deferred commit, mutations 
accumulate in the client-side {{MutationState}}. This occurs when:
 * Using {{executeUpdate()}} with {{autoCommit=false}}
 * Using {{addBatch()}} with {{executeBatch()}} (regardless of the 
{{autoCommit}} value), as in the sketch after this list
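
A minimal sketch of the deferred-commit pattern, assuming a hypothetical 
{{T(ID, VAL)}} table and an illustrative connection URL:
{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.Properties;

public class DeferredCommitSketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:phoenix:localhost:2181", new Properties())) {
            conn.setAutoCommit(false); // defer the commit
            try (PreparedStatement stmt = conn.prepareStatement(
                    "UPSERT INTO T (ID, VAL) VALUES (?, ?)")) {
                for (int i = 0; i < 1_000_000; i++) {
                    stmt.setInt(1, i);
                    stmt.setString(2, "v" + i);
                    // Buffers the row in the client-side MutationState;
                    // nothing is sent to the server yet.
                    stmt.executeUpdate();
                }
            }
            conn.commit(); // all buffered mutations are flushed here
        }
    }
}
{code}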

Currently, when the configured limit ({{phoenix.mutate.maxSize}} or 
{{phoenix.mutate.maxSizeBytes}}) is reached, Phoenix clears all buffered 
mutations and throws an exception, causing data loss and requiring applications 
to restart batch processing from the beginning.
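
Both limits are client-side connection properties. For reference, a sketch of 
setting them explicitly (the values shown are illustrative):
{code:java}
Properties props = new Properties();
// Maximum number of rows that may accumulate in MutationState
props.setProperty("phoenix.mutate.maxSize", "500000");
// Maximum total byte size of the accumulated mutations
props.setProperty("phoenix.mutate.maxSizeBytes", "104857600");
Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost:2181", props);
{code}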
h3. Problem

Applications have no opportunity to commit partial progress when limits are 
reached. Workarounds like setting excessively large limits or implementing 
custom batching heuristics are either risky or inefficient.
h3. Solution

Introduce a new configuration property 
{{phoenix.mutate.preserveOnLimitExceeded}} (default: {{false}}) that, when 
enabled:
1. Performs a pre-check before joining mutations to detect whether the limits 
would be exceeded
2. Throws a new {{MutationLimitReachedException}} without clearing the existing 
buffered mutations
3. For {{executeBatch()}}, catches the above exception, trims the batch to 
contain only the unprocessed items, and translates the exception into a new 
{{MutationLimitBatchException}} that captures the "processed count"

This allows applications to commit existing mutations and continue processing 
from where they left off, effectively providing the ability to "dynamically 
size" the batch.
h3. Backward Compatibility

The new behavior is opt-in; the default behavior (clearing buffered mutations 
when a limit is exceeded) is unchanged.
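
Opting in would be a single connection property; a sketch (the connection URL 
is illustrative):
{code:java}
Properties props = new Properties();
// Opt in to preserving buffered mutations when a limit is hit (default: false)
props.setProperty("phoenix.mutate.preserveOnLimitExceeded", "true");
Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost:2181", props);
{code}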

> Preserve buffered mutations when batch size limit is exceeded
> -------------------------------------------------------------
>
>                 Key: PHOENIX-7759
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-7759
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: Hari Krishna Dara
>            Assignee: Hari Krishna Dara
>            Priority: Minor
>


