[ 
https://issues.apache.org/jira/browse/KAFKA-12890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17751269#comment-17751269
 ] 

Yang commented on KAFKA-12890:
------------------------------

Hi [~dajac], I know this issue and its PR have been closed for a while. I was 
wondering whether you have any broker-side logs for this issue that you could 
share. Thanks!

> Consumer group stuck in `CompletingRebalance`
> ---------------------------------------------
>
>                 Key: KAFKA-12890
>                 URL: https://issues.apache.org/jira/browse/KAFKA-12890
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 2.7.0, 2.6.1, 2.8.0, 2.7.1, 2.6.2
>            Reporter: David Jacot
>            Assignee: David Jacot
>            Priority: Blocker
>             Fix For: 2.8.1, 3.0.0
>
>
> We have recently seen multiple consumer groups stuck in 
> `CompletingRebalance`. It appears that those groups never receive the 
> assignment from the leader of the group and remain stuck in this state 
> forever.
> When a group transitions to the `CompletingRebalance` state, the group 
> coordinator sets up `DelayedHeartbeat` for each member of the group. It does 
> so to ensure that the member sends a sync request within the session timeout. 
> If it does not, the group coordinator rebalances the group. Note that 
> `DelayedHeartbeat` is used for this purpose, and `DelayedHeartbeat` 
> operations are also completed when a member sends a heartbeat.
> The issue is that https://github.com/apache/kafka/pull/8834 changed the 
> heartbeat logic to allow members to heartbeat while the group is in the 
> `CompletingRebalance` state. This was not allowed before. Now, if a member 
> starts to heartbeat while the group is in the `CompletingRebalance` state, 
> the heartbeat request completes the pending `DelayedHeartbeat` that was set 
> up to detect a missing sync request. Therefore, if the sync request never 
> comes, the group coordinator no longer notices.
> We need to bring back the detection of missing sync requests somehow.
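
The failure mode described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Kafka's group coordinator code: `ToyCoordinator`, `simulate`, and the `allow_heartbeat_in_completing` flag are hypothetical names, time is an integer tick, and "completing the pending `DelayedHeartbeat`" is simplified to pushing a per-member deadline forward.

```python
from dataclasses import dataclass


@dataclass
class Member:
    deadline: int         # tick at which the pending delayed heartbeat expires
    synced: bool = False  # whether the member has sent its sync request


class ToyCoordinator:
    """Hypothetical, heavily simplified stand-in for a group coordinator."""

    def __init__(self, session_timeout, allow_heartbeat_in_completing):
        self.session_timeout = session_timeout
        # Assumption: this flag models the behavior change described above.
        self.allow_heartbeat_in_completing = allow_heartbeat_in_completing
        self.state = "CompletingRebalance"
        self.members = {}

    def add_member(self, member_id, now):
        # Entering CompletingRebalance arms one deadline per member.
        self.members[member_id] = Member(deadline=now + self.session_timeout)

    def heartbeat(self, member_id, now):
        if self.state == "CompletingRebalance" and not self.allow_heartbeat_in_completing:
            return False  # rejected: the pending deadline is left untouched
        # Accepted: the pending deadline is replaced by a fresh one, even
        # though the member has not sent its sync request yet.
        self.members[member_id].deadline = now + self.session_timeout
        return True

    def tick(self, now):
        # An expired deadline for an unsynced member triggers a rebalance.
        if self.state != "CompletingRebalance":
            return
        for member in self.members.values():
            if not member.synced and now >= member.deadline:
                self.state = "PreparingRebalance"
                return


def simulate(allow_heartbeat_in_completing):
    """One member heartbeats every 3 ticks and never sends a sync request."""
    coordinator = ToyCoordinator(10, allow_heartbeat_in_completing)
    coordinator.add_member("member-1", now=0)
    for now in range(1, 31):
        if now % 3 == 0:
            coordinator.heartbeat("member-1", now)
        coordinator.tick(now)
    return coordinator.state


print(simulate(False))  # heartbeats rejected: the deadline expires, group rebalances
print(simulate(True))   # heartbeats accepted: the deadline never expires, group is stuck
```

With heartbeats rejected, the toy group leaves `CompletingRebalance` once the original deadline passes. With heartbeats accepted, every heartbeat moves the deadline, so the missing sync request is never detected and the group stays in `CompletingRebalance`.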



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
