[
https://issues.apache.org/jira/browse/PHOENIX-7759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hari Krishna Dara updated PHOENIX-7759:
---------------------------------------
Description:
h3. Summary
When applications {{UPSERT}} multiple rows with deferred commit, mutations
accumulate in the client-side {{MutationState}}. This occurs when:
* Using {{executeUpdate()}} with {{autoCommit=false}}
* Using {{addBatch()}} with {{executeBatch()}} (regardless of {{autoCommit}}
value)
Currently, when either configured limit ({{phoenix.mutate.maxSize}} or
{{phoenix.mutate.maxSizeBytes}}) is reached, Phoenix clears all buffered
mutations and throws an exception, discarding the buffered work and forcing
applications to restart batch processing from the beginning.
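For illustration, a minimal JDBC sketch of the current failure mode; the
connection URL and the {{T(ID, V)}} table here are hypothetical:
{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class BulkUpsert {
    public static void main(String[] args) throws SQLException {
        // Hypothetical connection URL and table; for illustration only.
        try (Connection conn =
                DriverManager.getConnection("jdbc:phoenix:localhost")) {
            conn.setAutoCommit(false); // defer commit; rows buffer in MutationState
            try (PreparedStatement ps = conn.prepareStatement(
                    "UPSERT INTO T (ID, V) VALUES (?, ?)")) {
                for (long i = 0; i < 10_000_000L; i++) {
                    ps.setLong(1, i);
                    ps.setString(2, "v" + i);
                    // Buffered client-side; once phoenix.mutate.maxSize or
                    // phoenix.mutate.maxSizeBytes is exceeded, Phoenix throws
                    // and clears the buffer, discarding all buffered rows.
                    ps.executeUpdate();
                }
            }
            conn.commit(); // never reached if a limit is hit first
        }
    }
}
{code}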
h3. Problem
Applications have no opportunity to commit partial progress when limits are
reached. Workarounds like setting excessively large limits or implementing
custom batching heuristics are either risky or inefficient.
h3. Solution
Introduce a new configuration property
{{phoenix.mutate.preserveOnLimitExceeded}} (default: {{false}}) that, when
enabled:
1. Performs a pre-check before joining mutations to detect whether a limit
would be exceeded
2. Throws a new {{MutationLimitReachedException}} without clearing the existing
buffered mutations
3. For {{executeBatch()}}, handles the above exception by trimming the batch to
only the unprocessed items and translating it into a new
{{MutationLimitBatchException}} that captures the "processed count"
This allows applications to commit existing mutations and continue processing
from where they left off, effectively providing the ability to "dynamically
size" the batch.
h3. Backward Compatibility
The new behavior is opt-in. Default behavior (clear mutations on limit
exceeded) is unchanged.
> Preserve buffered mutations when batch size limit is exceeded
> -------------------------------------------------------------
>
> Key: PHOENIX-7759
> URL: https://issues.apache.org/jira/browse/PHOENIX-7759
> Project: Phoenix
> Issue Type: New Feature
> Reporter: Hari Krishna Dara
> Assignee: Hari Krishna Dara
> Priority: Minor
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)