[ https://issues.apache.org/jira/browse/KAFKA-17747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17944899#comment-17944899 ]
PoAn Yang commented on KAFKA-17747:
-----------------------------------

KIP-1101: https://cwiki.apache.org/confluence/x/FouMEw
Discussion thread: https://lists.apache.org/thread/l8ko353v3nn1blgymsty895x6c98oxlx
Vote thread: https://lists.apache.org/thread/j90htmphjk783p697jjfg1xt8thmy33p

> Trigger rebalance on rack topology changes
> ------------------------------------------
>
>                 Key: KAFKA-17747
>                 URL: https://issues.apache.org/jira/browse/KAFKA-17747
>             Project: Kafka
>          Issue Type: Improvement
>          Components: group-coordinator
>            Reporter: David Jacot
>            Assignee: PoAn Yang
>            Priority: Major
>             Fix For: 4.1.0
>
>
> At the moment, we trigger a rebalance of the consumer group only when the
> number of partitions of a topic has changed (e.g. when partitions are added).
> We tried to extend this mechanism to also take racks into consideration (see
> [this|https://github.com/apache/kafka/pull/17233]) but it turned out to be too
> expensive from a memory and CPU perspective. It was also bad because we ended
> up duplicating much of the information already present in the Metadata image.
> We should design a better way to do this and it may require a KIP depending
> on the solution.
> I have two high-level ideas in mind:
> # One way would be to add a new epoch to the topic metadata stored in the
> controller. This new epoch could be incremented whenever the topology of the
> topic has changed (e.g. adding partition, reassignment, etc.). Then we could
> store the epoch in the group coordinator to detect changes and rebalance the
> group. The downside of this approach is that it couples the group coordinator
> to the controller.
> # Another way would be to come up with a way to compute a hash of the current
> topology of the topic(s). The digest would then be stored in the group
> coordinator and used to detect changes. The downside of this is that it
> requires recomputing the hash to determine whether there is a change or not.
> Option 1) would be a bit more efficient because the controller knows when the
> epoch must be bumped.
> We should explore those ideas and possibly other ones.
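
To make the second idea (hashing the topic topology) more concrete, here is a minimal sketch in Python. It is not the design adopted in KIP-1101 and not Kafka code: the function names, the input shape (a mapping from partition id to the racks of its replicas), and the choice of SHA-256 are all assumptions made for illustration. It only shows how a stored digest could be compared against a recomputed one to decide whether a rebalance is needed.

```python
import hashlib
import struct


def topic_topology_hash(topic_name, partition_racks):
    """Return a hex digest covering a topic's partitions and their racks.

    partition_racks is a hypothetical input shape: it maps a partition id
    (non-negative int) to an iterable of rack names for that partition's
    replicas.
    """
    digest = hashlib.sha256()  # arbitrary choice of hash for this sketch
    name = topic_name.encode("utf-8")
    # Length-prefix every string so adjacent fields cannot run together.
    digest.update(struct.pack(">I", len(name)) + name)
    digest.update(struct.pack(">I", len(partition_racks)))
    # Sort partitions and racks so the digest does not depend on input order.
    for partition_id in sorted(partition_racks):
        racks = sorted(set(partition_racks[partition_id]))
        digest.update(struct.pack(">II", partition_id, len(racks)))
        for rack in racks:
            encoded = rack.encode("utf-8")
            digest.update(struct.pack(">I", len(encoded)) + encoded)
    return digest.hexdigest()


def needs_rebalance(stored_hash, topic_name, partition_racks):
    """Recompute the digest and compare it with the one stored earlier."""
    return stored_hash != topic_topology_hash(topic_name, partition_racks)


# Usage: store the digest once, then recompute it when metadata changes.
before = {0: ["rack-a", "rack-b"], 1: ["rack-b", "rack-c"]}
stored = topic_topology_hash("orders", before)

# Partition 0 now has a replica in rack-c instead of rack-b.
after_reassignment = {0: ["rack-a", "rack-c"], 1: ["rack-b", "rack-c"]}
# A third partition was added.
after_expansion = {**before, 2: ["rack-a", "rack-c"]}
```

With these inputs, `needs_rebalance(stored, "orders", before)` is false, while both the reassignment and the added partition change the digest. The cost noted in the description is visible here: every check recomputes the hash over all partitions, whereas an epoch would only need an integer comparison.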