[ https://issues.apache.org/jira/browse/KAFKA-13217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17514505#comment-17514505 ]
Seungchan Ahn commented on KAFKA-13217:
---------------------------------------

Anyone is welcome to assign this issue to themselves and start working on it.
We have agreed on [the KIP|https://cwiki.apache.org/confluence/x/KZvkCw], so you can go straight to implementing the changes it proposes.

I would like to do it myself, but I don't expect to be able to start within a week, and I don't want to be a blocker. So please feel free to take this. If no one has picked it up by the time I am free, I will take it back.

cc: [~guozhang]

> Reconsider skipping the LeaveGroup on close() or add an overload that does so
> -----------------------------------------------------------------------------
>
>                 Key: KAFKA-13217
>                 URL: https://issues.apache.org/jira/browse/KAFKA-13217
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: A. Sophie Blee-Goldman
>            Assignee: Seungchan Ahn
>            Priority: Major
>              Labels: needs-kip, newbie, newbie++
>             Fix For: 3.3.0
>
>
> In Kafka Streams, when an instance is shut down via the close() API, we
> intentionally skip sending a LeaveGroup request. This is because the
> shutdown is often not due to a scale-down event but to some transient
> closure, such as a rolling bounce. In cases where the instance is
> expected to start up again shortly after, we originally wanted to prevent
> that member's tasks from being redistributed across the remaining group
> members, since this would disturb the stable assignment and could cause
> unnecessary state migration and restoration. We also hoped to limit the
> disruption to a single rebalance, rather than forcing the group to
> rebalance once when the member shuts down and again when it comes back up.
> So it's really an optimization for the case in which the shutdown is
> temporary.
>
> That said, many of those optimizations are no longer necessary, or are at
> least much less useful, given recent features and improvements. For example,
> rebalances are now lightweight, so skipping the second rebalance is less
> worth optimizing for, and the new assignor takes into account the actual
> underlying state for each task/partition assignment, rather than just the
> previous assignment, so the assignment should be considerably more stable
> across bounces and rolling restarts.
>
> Given that, it might be time to reconsider this optimization. Alternatively,
> we could introduce another form of the close() API that forces the member to
> leave the group, to be used in the event of an actual scale-down rather than
> a transient bounce.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
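
To make the second option in the description concrete, here is a minimal sketch of how a close() overload with an explicit "leave the group" switch might behave next to the existing close(). It is a toy model, not Kafka code: `LeaveGroupSketch`, `Group`, `Instance`, `CloseOptions` and the `leaveGroup` flag are hypothetical names chosen for illustration and are not taken from the KIP, and the `rebalances` counter is a deliberate simplification of what the group coordinator does.

```java
import java.time.Duration;
import java.util.LinkedHashSet;
import java.util.Set;

// Toy model only: none of these types are Kafka classes.
class LeaveGroupSketch {

    /** Hypothetical stand-in for the coordinator's view of group membership. */
    static final class Group {
        final Set<String> members = new LinkedHashSet<>();
        int rebalances = 0;

        // In this model, every membership change costs one rebalance.
        void join(String memberId) {
            if (members.add(memberId)) rebalances++;
        }

        void leave(String memberId) {
            if (members.remove(memberId)) rebalances++;
        }
    }

    /** Hypothetical options object for the proposed close() overload. */
    static final class CloseOptions {
        Duration timeout = Duration.ofSeconds(30); // carried along, unused in this model
        boolean leaveGroup = false;                // default mirrors the current behaviour

        CloseOptions timeout(Duration timeout) { this.timeout = timeout; return this; }
        CloseOptions leaveGroup(boolean leaveGroup) { this.leaveGroup = leaveGroup; return this; }
    }

    /** Hypothetical stand-in for a Streams instance. */
    static final class Instance {
        private final String memberId;
        private final Group group;

        Instance(String memberId, Group group) {
            this.memberId = memberId;
            this.group = group;
            group.join(memberId);
        }

        /** Current behaviour: shut down without sending LeaveGroup. */
        void close() { close(new CloseOptions()); }

        /** Sketched overload: the caller decides whether to leave the group. */
        void close(CloseOptions options) {
            if (options.leaveGroup) {
                group.leave(memberId); // explicit leave, rebalance happens right away
            }
            // Otherwise the member stays registered, so a quick restart
            // does not disturb the existing assignment in this model.
        }
    }

    public static void main(String[] args) {
        Group group = new Group();
        Instance a = new Instance("a", group);
        Instance b = new Instance("b", group);

        a.close(); // transient bounce: membership and rebalance count unchanged
        System.out.println(group.members + " rebalances=" + group.rebalances);

        b.close(new CloseOptions().leaveGroup(true)); // actual scale-down
        System.out.println(group.members + " rebalances=" + group.rebalances);
    }
}
```

Keeping the no-argument close() as a thin wrapper around the overload means existing callers keep the bounce-friendly default, and only code that knows it is scaling down has to opt in.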