[jira] [Commented] (KAFKA-13217) Reconsider skipping the LeaveGroup on close() or add an overload that does so

Seungchan Ahn (Jira) Wed, 30 Mar 2022 00:54:05 -0700


    [ 
https://issues.apache.org/jira/browse/KAFKA-13217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17514505#comment-17514505
 ]


Seungchan Ahn commented on KAFKA-13217:
---------------------------------------

Anyone is welcome to assign this issue to themselves and start working on it. 
We have agreed on [the KIP|https://cwiki.apache.org/confluence/x/KZvkCw], so 
you can go ahead and implement the changes it proposes.

Personally, I wish I could do it myself, but I don't think I can start working 
on it within the next week, and I don't want to be a blocker. So please feel 
free to take this.

I will pick this back up if no one has taken it by the time I can free up time 
for it.

cc: [~guozhang]

> Reconsider skipping the LeaveGroup on close() or add an overload that does so
> -----------------------------------------------------------------------------
>
>                 Key: KAFKA-13217
>                 URL: https://issues.apache.org/jira/browse/KAFKA-13217
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: A. Sophie Blee-Goldman
>            Assignee: Seungchan Ahn
>            Priority: Major
>              Labels: needs-kip, newbie, newbie++
>             Fix For: 3.3.0
>
>
> In Kafka Streams, when an instance is shut down via the close() API, we 
> intentionally skip sending a LeaveGroup request. This is because the 
> shutdown is often not due to a scale-down event but to some transient 
> closure, such as a rolling bounce. In cases where the instance is 
> expected to start up again shortly afterward, we originally wanted to 
> prevent that member's tasks from being redistributed across the remaining 
> group members, since this would disturb the stable assignment and could 
> cause unnecessary state migration and restoration. We also hoped to limit 
> the disruption to a single rebalance, rather than forcing the group to 
> rebalance once when the member shuts down and again when it comes back up. 
> So it's really an optimization for the case in which the shutdown is 
> temporary.
>  
> That said, many of those optimizations are no longer necessary, or are at 
> least much less useful, given recent features and improvements. For 
> example, rebalances are now lightweight, so skipping the second rebalance 
> is less worth optimizing for. The new assignor also takes into account the 
> actual underlying state for each task/partition assignment, rather than 
> just the previous assignment, so the assignment should be considerably 
> more stable across bounces and rolling restarts.
>  
> Given that, it might be time to reconsider this optimization. 
> Alternatively, we could introduce another form of the close() API that 
> forces the member to leave the group, to be used in the event of an actual 
> scale-down rather than a transient bounce.
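
For anyone picking this up, here is a rough, self-contained sketch of the 
"another form of close()" idea from the description above. Every name in it 
(CloseSketch, CloseOptions, Instance, sentRequests) is hypothetical and chosen 
for illustration only. It is not taken from the KIP or from the Kafka code 
base, and it only models which request would be sent, not real shutdown logic.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

public class CloseSketch {

    /** Hypothetical options holder for the proposed close() overload. */
    static final class CloseOptions {
        Duration timeout = Duration.ofSeconds(30);
        boolean leaveGroup = false; // default mirrors current behavior: skip LeaveGroup

        CloseOptions timeout(Duration timeout) {
            this.timeout = timeout;
            return this;
        }

        CloseOptions leaveGroup(boolean leaveGroup) {
            this.leaveGroup = leaveGroup;
            return this;
        }
    }

    /** Hypothetical stand-in for a Streams instance; it only records requests. */
    static final class Instance {
        final List<String> sentRequests = new ArrayList<>();

        // Existing form: treated as a transient shutdown, so no LeaveGroup is sent.
        void close() {
            close(new CloseOptions());
        }

        // Proposed form: the caller opts in to leaving the group on a real scale-down.
        void close(CloseOptions options) {
            if (options.leaveGroup) {
                sentRequests.add("LeaveGroup");
            }
            // A real implementation would stop its threads within options.timeout here.
        }
    }

    public static void main(String[] args) {
        Instance bounced = new Instance();
        bounced.close();

        Instance scaledDown = new Instance();
        scaledDown.close(new CloseOptions().leaveGroup(true).timeout(Duration.ofSeconds(10)));

        System.out.println("bounce sent: " + bounced.sentRequests);
        System.out.println("scale-down sent: " + scaledDown.sentRequests);
    }
}
```

Keeping the no-argument close() on the skip-LeaveGroup path means existing 
callers see no behavior change, while a scale-down has to ask for the 
LeaveGroup explicitly.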



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
