[
https://issues.apache.org/jira/browse/KAFKA-13217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matthias J. Sax updated KAFKA-13217:
------------------------------------
Labels: kip newbie newbie++ (was: needs-kip newbie newbie++)
> Reconsider skipping the LeaveGroup on close() or add an overload that does so
> -----------------------------------------------------------------------------
>
> Key: KAFKA-13217
> URL: https://issues.apache.org/jira/browse/KAFKA-13217
> Project: Kafka
> Issue Type: Improvement
> Components: streams
> Reporter: A. Sophie Blee-Goldman
> Assignee: Sayantanu Dey
> Priority: Major
> Labels: kip, newbie, newbie++
> Fix For: 3.3.0
>
>
> In Kafka Streams, when an instance is shut down via the close() API, we
> intentionally skip sending a LeaveGroup request. This is because the shutdown
> is often not due to a scale-down event but to some transient closure, such as
> during a rolling bounce. In cases where the instance is
> expected to start up again shortly after, we originally wanted to prevent that
> member's tasks from being redistributed across the remaining group members,
> since this would disturb the stable assignment and could cause unnecessary
> state migration and restoration. We also hoped to limit the disruption to
> just a single rebalance, rather than forcing the
> group to rebalance once when the member shuts down and then again when it
> comes back up. So it's really an optimization for the case in which the
> shutdown is temporary.
>
> That said, many of those optimizations are no longer necessary or at least
> much less useful given recent features and improvements. For example,
> rebalances are now lightweight, so skipping the second rebalance is less worth
> optimizing for, and the new assignor will take into account the actual
> underlying state for each task/partition assignment, rather than just the
> previous assignment, so the assignment should be considerably more stable
> across bounces and rolling restarts.
>
> Given that, it might be time to reconsider this optimization. Alternatively,
> we could introduce another form of the close() API that forces the member to
> leave the group, to be used in the event of an actual scale-down rather than
> a transient bounce.
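
The trade-off described in the ticket can be illustrated with a toy model. The sketch below is not Kafka code: `CloseOptions`, `Group`, and `Instance` are hypothetical stand-ins invented for this example, the "coordinator" only counts rebalances, and session timeouts are not modeled. It only shows how a close() overload with an opt-in leaveGroup flag might change the number of rebalances seen during a quick bounce.

```java
/** Toy model only: none of these types are Kafka classes. */
final class LeaveGroupSketch {

    /** Hypothetical options object for a close() overload. */
    static final class CloseOptions {
        // Default mirrors the behavior described in the ticket: stay in the group.
        boolean leaveGroup = false;

        CloseOptions leaveGroup(boolean leaveGroup) {
            this.leaveGroup = leaveGroup;
            return this;
        }
    }

    /** Stand-in for the group coordinator; it only counts rebalances. */
    static final class Group {
        final java.util.Set<String> members = new java.util.HashSet<>();
        int rebalances = 0;

        // In this model every join triggers a rebalance.
        void join(String id) {
            members.add(id);
            rebalances++;
        }

        // An explicit leave triggers a rebalance right away.
        void leave(String id) {
            if (members.remove(id)) {
                rebalances++;
            }
        }
    }

    /** Stand-in for a Streams instance. */
    static final class Instance {
        final String id;
        final Group group;

        Instance(String id, Group group) {
            this.id = id;
            this.group = group;
        }

        void start() {
            group.join(id);
        }

        // Plain close(): no LeaveGroup is sent.
        void close() {
            close(new CloseOptions());
        }

        // Hypothetical overload: leave the group only when asked to.
        void close(CloseOptions options) {
            if (options.leaveGroup) {
                group.leave(id);
            }
            // Otherwise membership lingers until the session timeout (not modeled).
        }
    }

    /** Counts rebalances caused by closing and quickly restarting one member. */
    static int rebalancesForBounce(boolean leaveGroup) {
        Group group = new Group();
        Instance a = new Instance("a", group);
        Instance b = new Instance("b", group);
        a.start();
        b.start();
        int before = group.rebalances;
        b.close(new CloseOptions().leaveGroup(leaveGroup));
        b.start(); // the instance comes back shortly after
        return group.rebalances - before;
    }

    public static void main(String[] args) {
        System.out.println("bounce without LeaveGroup: " + rebalancesForBounce(false));
        System.out.println("bounce with LeaveGroup:    " + rebalancesForBounce(true));
    }
}
```

In this model a bounce without LeaveGroup costs one rebalance and a bounce with LeaveGroup costs two, which is the original motivation for skipping the request. For a real scale-down the explicit leave is what you would want, since the remaining members can take over the tasks without waiting for a timeout.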