[ 
https://issues.apache.org/jira/browse/KAFKA-17116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17867928#comment-17867928
 ] 

TengYao Chi commented on KAFKA-17116:
-------------------------------------

Hi [~lianetm] 

Currently, we use the ConsumerHeartbeat API to manage both joining and leaving 
for AsyncKafkaConsumer. I propose that we let the consumer generate a 
temporary ID that the broker can use to identify it before a member ID has 
been allocated.

To achieve this, we need to add a new field in ConsumerHeartbeatRequestData to 
attach this ID.

Once the broker receives the initial join heartbeat request, it can generate 
the member ID and put the temporary ID : member ID pair into a map that 
maintains the relationship between temporary IDs and member IDs.

With this map, the broker can resolve a leave heartbeat request that carries 
only the temporary ID to the right member if we encounter the scenario 
described in the issue.

If the consumer receives the allocated member ID normally, we can remove the 
temporary ID : member ID entry from the map to avoid memory leaks.
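
To make the idea concrete, here is a rough sketch of the broker-side 
bookkeeping. It is only an illustration under assumptions: the class 
TemporaryIdRegistry and its method names are hypothetical and are not part of 
Kafka, member IDs are generated as random UUIDs just for the example, and it 
assumes the request that confirms the member ID still lets the broker find the 
temporary ID.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper for illustration only; not an existing Kafka class.
final class TemporaryIdRegistry {
    // temporary ID -> member ID allocated by the broker
    private final Map<String, String> memberIdByTemporaryId = new ConcurrentHashMap<>();

    // Initial join heartbeat: allocate a member ID and remember which
    // temporary ID it was allocated for. A retried join with the same
    // temporary ID gets the same member ID back.
    String onJoin(String temporaryId) {
        return memberIdByTemporaryId.computeIfAbsent(
                temporaryId, id -> UUID.randomUUID().toString());
    }

    // Leave heartbeat that arrived without a member ID: resolve the member
    // through the temporary ID and drop the entry in the same step.
    Optional<String> onLeaveWithoutMemberId(String temporaryId) {
        return Optional.ofNullable(memberIdByTemporaryId.remove(temporaryId));
    }

    // The consumer has received its member ID (assumption: the broker learns
    // this from a later heartbeat), so the mapping is no longer needed.
    void onMemberIdAcknowledged(String temporaryId) {
        memberIdByTemporaryId.remove(temporaryId);
    }

    int size() {
        return memberIdByTemporaryId.size();
    }
}
```

With something like this, the leave heartbeat sent during close could be 
matched to the member that was just registered, and entries are removed on 
both the leave path and the normal path so the map does not grow unbounded.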

> New consumer may not send effective leave group if member ID received after 
> close 
> ----------------------------------------------------------------------------------
>
>                 Key: KAFKA-17116
>                 URL: https://issues.apache.org/jira/browse/KAFKA-17116
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients, consumer
>    Affects Versions: 3.8.0
>            Reporter: Lianet Magrans
>            Assignee: TengYao Chi
>            Priority: Major
>              Labels: kip-848-client-support
>             Fix For: 3.9.0
>
>
> If the new consumer is closed after sending a HB to join, but before 
> receiving the response to it, it will send a leave group request without a 
> member ID (which will simply fail with UNKNOWN_MEMBER_ID). As a result, the 
> broker will have registered a new member for which it will never receive a 
> leave request.
>  # consumer.subscribe -> sends HB to join, transitions to JOINING
>  # consumer.close -> will transition to LEAVING and send HB with epoch -1 
> (without waiting for in-flight requests)
>  # consumer receives response to initial HB, containing the assigned member 
> ID. It will simply ignore it because it's not in the group anymore 
> (UNSUBSCRIBED)
> Note that the expected behavior with the current logic, and the main 
> downsides of it, are:
>  # If the member received partitions on the first HB, those partitions 
> won't be re-assigned (the broker is waiting for the closed consumer to 
> reconcile them) until the rebalance timeout expires. 
>  # Even if no partitions were assigned to it, the member will remain in the 
> group from the broker's point of view (but not from the client's POV). The 
> member will eventually be kicked out for not sending HBs, but only when its 
> session timeout expires.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
