Matthias J. Sax created KAFKA-13296:
---------------------------------------
Summary: Verify old assignment within StreamsPartitionAssignor
Key: KAFKA-13296
URL: https://issues.apache.org/jira/browse/KAFKA-13296
Project: Kafka
Issue Type: Improvement
Components: streams
Reporter: Matthias J. Sax
`StreamsPartitionAssignor` is responsible to assign partitions and tasks to all
StreamsThreads within an application.
While it ensures to not assign a single partition/task to two threads, there is
limited verification about it. In particular, we had one incident for with a
zombie thread/consumer did not cleanup its own internal state correctly due to
KAFKA-12983. This unclean zombie-state implied that the _old assignment_
reported to `StreamsPartitionAssignor` contained a single partition for two
consumers. As a result, both threads/consumers later revoked the same partition
and the zombie-thread could commit it's unclean work (even if it should have
been fenced), leading to duplicate output under EOS_v2.
We should consider to add a check to `StreamsPartitionAssignor` if the _old
assignment_ is valid, ie, no partition should be missing and no partition
should be assigned to two consumers. For this case, we should log the invalid
_old assignment_ and send an error code back to all consumer that indicates
that they should shut down "unclean" (ie, without and flushing and no
committing any offsets or transactions).
--
This message was sent by Atlassian Jira
(v8.3.4#803005)