[ https://issues.apache.org/jira/browse/KAFKA-16277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
A. Sophie Blee-Goldman resolved KAFKA-16277.
--------------------------------------------
Resolution: Fixed
> CooperativeStickyAssignor does not spread topics evenly among consumer group
> ----------------------------------------------------------------------------
>
> Key: KAFKA-16277
> URL: https://issues.apache.org/jira/browse/KAFKA-16277
> Project: Kafka
> Issue Type: Bug
> Components: clients, consumer
> Reporter: Cameron Redpath
> Priority: Major
> Fix For: 3.8.0
>
> Attachments: image-2024-02-19-13-00-28-306.png
>
>
> Consider the following scenario:
> `topic-1`: 12 partitions
> `topic-2`: 12 partitions
>
> Of note, `topic-1` gets approximately 10 times more messages through it than
> `topic-2`.
>
> Both of these topics are consumed by a single application in a single
> consumer group, which scales under load. Each member of the consumer group
> subscribes to both topics. The `partition.assignment.strategy` being used is
> `org.apache.kafka.clients.consumer.CooperativeStickyAssignor`. The
> application may start with one consumer, which consumes all partitions from
> both topics.
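
For reference, a minimal sketch of the consumer configuration described above, written with plain string keys on `java.util.Properties`. The broker address, group id, and deserializers are assumptions for illustration; only the assignor class name comes from the report, and `ConsumerSetup` is a hypothetical name.

```java
import java.util.Properties;

public class ConsumerSetup {
    // Builds consumer properties that select the cooperative sticky assignor.
    static Properties consumerProps() {
        Properties props = new Properties();
        // Assumed values, not taken from the report.
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("group.id", "example-group");
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        // The strategy named in the report.
        props.setProperty("partition.assignment.strategy",
                "org.apache.kafka.clients.consumer.CooperativeStickyAssignor");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(consumerProps().getProperty("partition.assignment.strategy"));
    }
}
```

These properties would be passed to a `KafkaConsumer`, which would then subscribe to both `topic-1` and `topic-2`.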
>
> The problem begins when the application scales up to two consumers. We then
> see all partitions from `topic-1` go to one consumer and all partitions from
> `topic-2` go to the other. Because one topic receives far more messages than
> the other, the group becomes very imbalanced: one consumer receives 10x the
> traffic of the other purely because of partition assignment.
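
A rough worked example of that imbalance, assuming relative rates of 10 and 1 messages per partition for `topic-1` and `topic-2`. These are illustrative numbers standing in for the approximate 10x ratio in the report, and the class and method names are hypothetical.

```java
public class LoadImbalance {
    // Assumed relative per-partition message rates (topic-1 is ~10x topic-2).
    static final int TOPIC_1_RATE = 10;
    static final int TOPIC_2_RATE = 1;

    // Relative load on one consumer, given how many partitions of each topic it owns.
    static int load(int topic1Partitions, int topic2Partitions) {
        return topic1Partitions * TOPIC_1_RATE + topic2Partitions * TOPIC_2_RATE;
    }

    public static void main(String[] args) {
        // Observed split: one consumer owns all of topic-1, the other all of topic-2.
        System.out.println("observed: " + load(12, 0) + " vs " + load(0, 12)); // 120 vs 12
        // For comparison, a split of 6 partitions from each topic per consumer.
        System.out.println("even:     " + load(6, 6) + " vs " + load(6, 6));   // 66 vs 66
    }
}
```

Both splits give each consumer 12 partitions, so the partition count alone does not reveal the difference in load.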
>
> This is the issue being seen in our cluster at the moment. See this graph of
> the number of messages being processed by each consumer as the group scales
> from one to four consumers:
> [Attached graph: image-2024-02-19-13-00-28-306.png]
> Things to note from this graph:
> * With two consumers, all partitions of each topic go to a single consumer
> * With three consumers, each topic's partitions are split between only two
> consumers
> * With four consumers, each topic's partitions are split between only three
> consumers
> * The total number of messages processed by each consumer in the group is
> very imbalanced throughout the entire period
>
> With regard to the number of _partitions_ being assigned to each consumer,
> the group is balanced. However, the assignment appears to be biased so that
> partitions from the same topic go to the same consumer. In our scenario, this
> leads to very undesirable partition assignment.
>
> I wonder whether the behaviour of the assignor should be revised so that each
> topic's partitions are spread as evenly as possible across all available
> members of the consumer group. In the above scenario, this would result in a
> much more even distribution of load. The behaviour would then be:
> * With two consumers, 6 partitions from each topic go to each consumer
> * With three consumers, 4 partitions from each topic go to each consumer
> * With four consumers, 3 partitions from each topic go to each consumer
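
The distribution listed above can be sketched as a per-topic round-robin deal. This is only an illustration of the requested outcome, not how `CooperativeStickyAssignor` or any other Kafka assignor is implemented; `PerTopicSpread`, `spread`, and the `consumer-N` member names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PerTopicSpread {
    // Hypothetical helper: deals each topic's partitions out round-robin
    // across the consumers, so every topic is spread over all members.
    static Map<String, List<String>> spread(Map<String, Integer> partitionsPerTopic,
                                            int consumers) {
        Map<String, List<String>> assignment = new LinkedHashMap<>();
        for (int c = 0; c < consumers; c++) {
            assignment.put("consumer-" + c, new ArrayList<>());
        }
        for (Map.Entry<String, Integer> topic : partitionsPerTopic.entrySet()) {
            for (int p = 0; p < topic.getValue(); p++) {
                // Partition p of this topic goes to consumer (p mod consumers).
                assignment.get("consumer-" + (p % consumers))
                          .add(topic.getKey() + "-" + p);
            }
        }
        return assignment;
    }

    public static void main(String[] args) {
        Map<String, Integer> topics = new LinkedHashMap<>();
        topics.put("topic-1", 12);
        topics.put("topic-2", 12);
        for (int consumers = 2; consumers <= 4; consumers++) {
            System.out.println(consumers + " consumers: " + spread(topics, consumers));
        }
    }
}
```

With 12 partitions per topic, this yields 6, 4, and 3 partitions of each topic per consumer for two, three, and four consumers. The sketch ignores stickiness and would favour the first consumers when the partition count is not a multiple of the group size.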
>
> Of note, we only saw this behaviour after migrating to the
> `CooperativeStickyAssignor`. It was not an issue with the default partition
> assignment strategy.
>
> It is possible that this is intended behaviour. If so, what is the preferred
> workaround for our scenario? If we decide to go ahead with the update to
> `CooperativeStickyAssignor`, our current workaround may be to limit each
> consumer to a single topic and run two consumer threads per instance of the
> application.