[ 
https://issues.apache.org/jira/browse/KAFKA-20198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chomingi reassigned KAFKA-20198:
--------------------------------

    Assignee: chomingi

> StickyPartitionAssignor with group protocol classic is not acting sticky
> ------------------------------------------------------------------------
>
>                 Key: KAFKA-20198
>                 URL: https://issues.apache.org/jira/browse/KAFKA-20198
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 4.1.1
>            Reporter: Jochen Rauschenbusch
>            Assignee: chomingi
>            Priority: Major
>         Attachments: HATaskAssignorLogs.json, StickyTaskAssignorLogs.json
>
>
> h2. Problem
>  During some tests, I noticed that many state stores were closed during group 
> rebalancing triggered by instance scaling. I assumed that the 
> StickyTaskAssignor was supposed to prevent exactly this. However, with each 
> new application instance that started the stream, the rebalancing resulted in 
> a cascade of "Handle new assignments" log entries. Scaling from one to two 
> application instances (each with ten Kafka stream threads) generated 429 such 
> entries, which seems excessive. The log entries showed that almost all tasks 
> were moved to other group members throughout the entire rebalancing phase.
> h2. Setup
>  * Scala application based on Scala 2.13 and Kafka Streams
>  * Application consumes from a single topic with 450 partitions
>  * Stream topology implements some stateful aggregations
>  * Change logging is disabled; only in-memory state stores are used
>  * Each app instance is configured to create 10 stream threads
> *The following libraries are used*
>  * org.apache.kafka:kafka-streams:4.2.0
>  * org.apache.kafka:kafka-streams-scala_2.13:4.2.0
>  * org.apache.kafka:kafka-streams-test-utils:4.2.0
> The Kafka cluster, running v4.1.0, was created with [Strimzi Operator 
> v0.50.0|https://github.com/strimzi/strimzi-kafka-operator/releases/tag/0.50.0].
> I already discussed this behavior with [~lucasbru] and it seems to be a bug:
> [Confluent Slack 
> Channel|https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1770905604912249]
> h2. Further Tests
> A very simple Spring Boot app with an absolutely minimal topology showed the 
> same behavior. The topology in this case didn't use state stores at all. It 
> just consumes from a single topic (again 450 partitions) and logs the 
> key/value pairs. Here too, the rebalancing led to a cascade of task 
> reassignments. Again, I configured the app to use 10 stream threads.
> I also ran another test with the HATaskAssignor. Here the logic seems to first 
> revoke all assigned partitions and then reassign the tasks in a round-robin 
> manner, which seems to be as expected.
> Another test using KIP-1071 showed that sticky task assignment works as 
> expected there.
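
As a rough sense of scale for the numbers in the report, the sketch below estimates how many tasks each stream thread would own, and how few tasks a perfectly sticky, balanced assignment would need to move when scaling from one instance to two. It assumes one task per input partition (as in the minimal single-topic topology) and the same thread count on every instance. The class and method names are hypothetical and are not part of Kafka Streams.

```java
public class StickyRebalanceEstimate {

    // Returns {min, max} tasks per thread when tasks are spread as evenly
    // as possible over the given number of threads.
    public static int[] tasksPerThread(int tasks, int threads) {
        int min = tasks / threads;
        int max = (tasks % threads == 0) ? min : min + 1;
        return new int[] {min, max};
    }

    // Lower bound on the tasks that have to move so that the newly added
    // instances each receive at least an even share. Assumes all tasks
    // start on the old instances and all instances are equally sized.
    public static int minTasksToMove(int tasks, int instancesBefore, int instancesAfter) {
        int added = instancesAfter - instancesBefore;
        return added * (tasks / instancesAfter);
    }

    public static void main(String[] args) {
        int tasks = 450;            // one task per partition (assumption)
        int threadsPerInstance = 10;

        int[] before = tasksPerThread(tasks, 1 * threadsPerInstance);
        int[] after = tasksPerThread(tasks, 2 * threadsPerInstance);

        System.out.println("1 instance:  " + before[0] + ".." + before[1] + " tasks per thread");
        System.out.println("2 instances: " + after[0] + ".." + after[1] + " tasks per thread");
        System.out.println("minimum tasks to move: " + minTasksToMove(tasks, 1, 2));
    }
}
```

Under these assumptions, roughly half of the 450 tasks would need to change owner once. The 429 "Handle new assignments" entries are log lines rather than moved tasks, so the two figures are not directly comparable; the estimate only gives a baseline for what a sticky assignment could achieve.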



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
