[ 
https://issues.apache.org/jira/browse/KAFKA-18143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17902674#comment-17902674
 ] 

Lianet Magrans commented on KAFKA-18143:
----------------------------------------

Hello [~ravigupta], could you attach your application logs? They would help
us understand what's going on. Thanks!

> Kafka consumer keeps getting records on poll after eviction from group
> ----------------------------------------------------------------------
>
>                 Key: KAFKA-18143
>                 URL: https://issues.apache.org/jira/browse/KAFKA-18143
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer
>    Affects Versions: 3.7.1, 3.9.0
>            Reporter: Ravi Gupta
>            Priority: Major
>
> My application polls records from an MSK Kafka cluster. The application
> maintains the offset of each partition itself and has therefore disabled
> auto-commit. It never commits offsets, as it persists them to an internal
> data store.
> In production, we observed duplicate records. {*}The duplicate records don't
> stop until we restart the instance with the zombie consumer (evicted from
> the group but still polling){*}. A consumer turns into a zombie when it
> fails to send heartbeats. This typically happens due to IAM authentication
> issues in MSK, which sometimes last for a long time.
> On further digging, I found that the partitions that were assigned to the
> zombie consumer are reassigned to other active consumers, but the zombie
> consumer's poll continues to return records.
> The question is - *should a zombie consumer get records on poll?*
> I have been able to reproduce it locally. Here is my local setup with the
> issue reproduced:
>  * A single broker (Docker image) with three different external ports.
>  * Create one topic with two partitions
>  * [Toxiproxy|https://github.com/Shopify/toxiproxy] proxies these ports.
>  * Two consumers (subscribed to the topic), each connecting to one of the
> proxied ports, with the session timeout set to 10 seconds
>  * Introduce latency in one of the proxies to evict one of the consumers
> from the group
>  * Produce some messages for each partition
>  * Both consumers keep getting the messages
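For reference, the consumer settings described in the reproduction steps above could look roughly like the sketch below. It only builds plain dictionaries keyed by the standard Kafka consumer property names (bootstrap.servers, group.id, enable.auto.commit, session.timeout.ms); it does not connect to a broker. The reporter's actual client code is not shown in the issue, so the port numbers, the group id, and the helper name are assumptions made for illustration.

```python
# Sketch of the consumer settings from the reproduction steps.
# Assumptions (not from the issue): ports 19092 and 29092, the group id
# "repro-group", and the helper name build_consumer_config.

def build_consumer_config(proxied_port: int, group_id: str = "repro-group") -> dict:
    """Return consumer properties for one consumer behind one proxied port."""
    return {
        # Each consumer connects through its own Toxiproxy-proxied port.
        "bootstrap.servers": f"localhost:{proxied_port}",
        # Both consumers must share a group for partitions to be split.
        "group.id": group_id,
        # The application persists offsets itself, so auto-commit is off.
        "enable.auto.commit": False,
        # Session timeout of 10 seconds, as stated in the issue.
        "session.timeout.ms": 10_000,
    }


# One config per consumer, each pointing at a different proxied port.
configs = [build_consumer_config(port) for port in (19092, 29092)]
```

With these settings, adding latency to one proxy beyond the session timeout should be enough for the group coordinator to consider that consumer dead and reassign its partition.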



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
