[
https://issues.apache.org/jira/browse/KAFKA-14172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Martin Hørslev updated KAFKA-14172:
-----------------------------------
Description:
h1. State stores lose state when tasks are reassigned under EOS with standby
replicas and default acceptable lag.
I have observed that state stores used in a transform step under exactly-once
semantics (EOS) end up losing state after a rebalance that reassigns tasks to a
previous standby task within the acceptable standby lag.
The problem is reproducible, and an integration test has been created to
showcase the [issue|https://github.com/apache/kafka/pull/12540].
A detailed description of the observed issue is provided
[here|https://github.com/apache/kafka/pull/12540/files?short_path=3ca480e#diff-3ca480ef093a1faa18912e1ebc679be492b341147b96d7a85bda59911228ef45].
Similar issues have been reported on Stack Overflow, for example
[here|https://stackoverflow.com/questions/69038181/kafka-streams-aggregation-data-loss-between-instance-restarts-and-rebalances].
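For orientation, the combination of settings named in the title (EOS, standby replicas, default acceptable lag) can be sketched roughly as follows. This is an illustrative sketch, not the configuration from the linked integration test: the application id, the bootstrap address, the choice of `exactly_once_v2` as the EOS variant, and the standby replica count of 1 are all assumptions. The keys are the Kafka Streams configuration names written as plain strings so the snippet needs only the JDK, and 10000 is the documented default for `acceptable.recovery.lag`.

```java
import java.util.Properties;

public class EosStandbyConfigSketch {

    // Builds a Properties object resembling the setup described in this report.
    // All concrete values are assumptions made for illustration.
    static Properties eosWithStandbyConfig() {
        Properties props = new Properties();
        // Hypothetical application id and broker address.
        props.put("application.id", "eos-standby-sketch");
        props.put("bootstrap.servers", "localhost:9092");
        // Exactly-once processing; the report does not say which EOS variant was used.
        props.put("processing.guarantee", "exactly_once_v2");
        // At least one standby replica per stateful task (assumed count).
        props.put("num.standby.replicas", "1");
        // Left at its default: a standby lagging by at most this many records
        // is treated as caught up when tasks are assigned.
        props.put("acceptable.recovery.lag", "10000");
        return props;
    }

    public static void main(String[] args) {
        // Print the settings so they can be compared against a real deployment.
        eosWithStandbyConfig().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

In a real application these properties would be passed to the `KafkaStreams` constructor together with the topology.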
was:
h1. State stores lose state when tasks are reassigned under EOS with standby
replicas and default acceptable lag.
I have observed that state stores used in a transform step under exactly-once
semantics (EOS) end up losing state after a rebalance that reassigns tasks to a
previous standby task within the acceptable standby lag.
The problem is reproducible, and an integration test has been created to
showcase the [issue|https://github.com/apache/kafka/pull/12540].
Similar issues have been reported on Stack Overflow, for example
[here|https://stackoverflow.com/questions/69038181/kafka-streams-aggregation-data-loss-between-instance-restarts-and-rebalances].
> bug: State stores lose state when tasks are reassigned under EOS wit…
> ----------------------------------------------------------------------
>
> Key: KAFKA-14172
> URL: https://issues.apache.org/jira/browse/KAFKA-14172
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Affects Versions: 3.1.1
> Reporter: Martin Hørslev
> Priority: Major
>