[ https://issues.apache.org/jira/browse/KAFKA-18943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matthias J. Sax reassigned KAFKA-18943:
---------------------------------------

    Assignee: Matthias J. Sax

> Kafka Streams incorrectly commits TX during task revokation
> -----------------------------------------------------------
>
>                 Key: KAFKA-18943
>                 URL: https://issues.apache.org/jira/browse/KAFKA-18943
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 3.4.0
>            Reporter: Matthias J. Sax
>            Assignee: Matthias J. Sax
>            Priority: Blocker
>
> We found a very rare edge case in Kafka Streams commit logic that may lead
> to data loss for EOSv2 under certain circumstances.
>
> Background:
> When tasks are revoked cleanly, we need to commit the progress the revoked
> tasks made before we can release them. For EOSv2, if we commit, a thread
> always needs to commit all tasks it owns. Thus, during revocation, if a
> single revoked task made progress, we need to include all non-revoked
> tasks in the commit.
>
> To avoid unnecessary commits (which are expensive), we have an optimization
> in place that is supposed to skip the commit if no revoked task made
> progress (i.e., revoked tasks without progress can be released safely
> without a commit, and if all tasks can be revoked without a commit, we
> don't commit at all).
>
> This optimization has a bug: it may commit an open TX even if no revoked
> task made progress, and because the commit is done accidentally, we do
> not add offsets to the TX. Thus, we may commit result data of non-revoked
> tasks without including the non-revoked tasks' advanced offsets.
>
> If this happens, and a fatal error happens right afterwards (i.e., before a
> subsequent commit for the same non-revoked tasks was done), we would seek to
> an incorrect (too old) offset after recovery and re-process the same input
> data a second time, producing duplicate results.
>
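To make the failure mode described above concrete, here is a minimal toy simulation in Java. It is only a sketch under stated assumptions: `DuplicateScenarioSketch`, `run`, and the `commitIncludesOffsets` flag are hypothetical names invented for illustration, and nothing in it is Kafka Streams or Kafka client code. It models a single non-revoked task whose open TX is committed during revocation, followed by a fatal error and a restart from the last committed offset.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: nothing here is a Kafka Streams or Kafka client API.
final class DuplicateScenarioSketch {

    // Simulates one non-revoked task: process, commit the TX at revocation
    // time, crash, recover, process again. Returns the committed output log.
    static List<String> run(boolean commitIncludesOffsets) {
        List<String> input = Arrays.asList("a", "b", "c");
        List<String> committedOutput = new ArrayList<>();
        int committedOffset = 0;

        // The task processes all input inside an open transaction.
        List<String> pendingOutput = new ArrayList<>();
        int position = 0;
        for (String rec : input) {
            pendingOutput.add("out-" + rec);
            position++;
        }

        // Commit during revocation: the output always becomes visible, but
        // the offsets are only committed if they were added to the TX.
        committedOutput.addAll(pendingOutput);
        if (commitIncludesOffsets) {
            committedOffset = position;
        }

        // Fatal error, rebalance, "fetch offsets": resume from committedOffset.
        for (int i = committedOffset; i < input.size(); i++) {
            committedOutput.add("out-" + input.get(i));
        }
        return committedOutput;
    }

    public static void main(String[] args) {
        System.out.println("offsets in TX:   " + run(true));
        System.out.println("offsets missing: " + run(false));
    }
}
```

With the offsets included, the restart finds nothing left to process. Without them, the task resumes from offset 0 and every output record appears twice.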
> This issue results in duplicates if the following conditions are all met at
> the same time:
> * a thread has more than one task assigned
> * when tasks are revoked, at least one task is not revoked
> * no revoked task made progress
> * at least one non-revoked task made progress
> * after the incorrect commit, and before the next successful commit, a fatal
> error happens, triggering a rebalance and task re-assignment, and a "fetch
> offsets" happens after the rebalance
>
> The bug was introduced via https://issues.apache.org/jira/browse/KAFKA-14294
> in the 3.4.0 release.
>
> Looking into the details, it actually seems that there is another issue
> that KAFKA-14294 did not address: KAFKA-14294 attempts to commit tasks after
> output was produced by a punctuation, even if offsets did not advance.
> However, it applies this logic only to regular commits; in the
> task-revoked case, we don't commit. Thus, if a revoked task did not advance
> offsets but did produce output data, no commit happens (while it seems we
> should commit the open TX in this case, too; this might even apply to
> ALOS).
>
> Thus, there are two possible fixes:
> # Try to identify all corner cases and patch them up
> # Drop the optimization and just commit all tasks on
> `TaskManager#handleRevocation()`

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
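The commit rule from the issue description, and the second proposed fix, can be sketched as a small decision function. This is a hedged illustration only: `ToyTask`, `RevocationCommitSketch`, `tasksToCommit`, and the `useOptimization` flag are hypothetical names, not the actual `TaskManager` code, and "made progress" is assumed here to mean "advanced offsets or produced output". With `useOptimization` set to `true`, the function models the intended behavior of the optimization; with `false`, it models one reading of the second fix (always commit every owned task on revocation).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model only: these types are hypothetical, not Kafka Streams classes.
final class ToyTask {
    final String id;
    final boolean revoked;       // task is being revoked in this rebalance
    final boolean madeProgress;  // assumed: advanced offsets or produced output

    ToyTask(String id, boolean revoked, boolean madeProgress) {
        this.id = id;
        this.revoked = revoked;
        this.madeProgress = madeProgress;
    }
}

final class RevocationCommitSketch {

    // Returns the ids of the tasks whose offsets go into the commit.
    // An empty list means "do not commit; leave any open TX untouched".
    static List<String> tasksToCommit(List<ToyTask> owned, boolean useOptimization) {
        if (useOptimization) {
            boolean revokedTaskMadeProgress = false;
            for (ToyTask t : owned) {
                if (t.revoked && t.madeProgress) {
                    revokedTaskMadeProgress = true;
                    break;
                }
            }
            if (!revokedTaskMadeProgress) {
                return Collections.emptyList();
            }
        }
        // EOSv2: a thread that commits must commit all tasks it owns.
        List<String> ids = new ArrayList<>();
        for (ToyTask t : owned) {
            ids.add(t.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        // The conditions listed in the issue: the revoked task is idle,
        // the non-revoked task made progress.
        List<ToyTask> owned = Arrays.asList(
            new ToyTask("0_0", true, false),
            new ToyTask("0_1", false, true));
        System.out.println("with optimization:    " + tasksToCommit(owned, true));
        System.out.println("without optimization: " + tasksToCommit(owned, false));
    }
}
```

In this model the two outcomes are "no commit at all" or "commit with the offsets of every owned task". The state described in the issue, a committed TX without the offsets of the non-revoked tasks, is neither of them.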