[ https://issues.apache.org/jira/browse/KAFKA-18143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17902674#comment-17902674 ]
Lianet Magrans commented on KAFKA-18143:
----------------------------------------

Hello [~ravigupta], could you attach your application logs? They will help us understand what's going on. Thanks!

> Kafka consumer keeps getting records on poll after eviction from group
> ----------------------------------------------------------------------
>
>                 Key: KAFKA-18143
>                 URL: https://issues.apache.org/jira/browse/KAFKA-18143
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer
>    Affects Versions: 3.7.1, 3.9.0
>            Reporter: Ravi Gupta
>            Priority: Major
>
> My application polls records from an MSK Kafka cluster. The application
> maintains the offset of each partition itself and has therefore disabled
> autocommit. It never commits offsets to Kafka, because it persists them to
> an internal data store.
> In production, we observed duplicate records. {*}The duplicate records don't
> stop until we restart the instance with the zombie consumer (evicted from the
> group but still polling){*}. A consumer turns into a zombie when it fails to
> send heartbeats. This typically happens because of IAM authentication issues
> in MSK, which sometimes last for a long time.
> On further digging, I found that the partitions that were assigned to the
> zombie consumer are reassigned to other active consumers, but the zombie
> consumer's poll continues to return records.
> The question is: *should a zombie consumer get records on poll?*
> I have been able to reproduce it locally. Here is my local setup with the
> issue reproduced:
> * A single broker (docker image) with three different external ports.
> * Create one topic with two partitions.
> * [Toxiproxy|https://github.com/Shopify/toxiproxy] proxies these ports.
> * Two consumers (subscribed to the topic), each connecting to one of the
> proxied ports, with the session timeout set to 10 seconds.
> * Introduce latency in one of the proxies to evict one of the consumers from
> the group.
> * Produce some messages for each partition.
> * Both consumers keep getting the messages.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
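For readers trying to reproduce the setup, the consumer settings described in the report (autocommit disabled, 10-second session timeout) could be sketched roughly as below. This is only an illustration, not the reporter's actual code: the class name, the `localhost:19092` proxy address, the `zombie-repro` group id, and the String deserializers are all assumptions. The sketch builds a plain `java.util.Properties` object using standard Kafka consumer configuration keys, so it runs without the Kafka client on the classpath.

```java
import java.util.Properties;

public class ZombieConsumerConfigSketch {

    // Builds consumer settings matching the report: autocommit off, 10 s session timeout.
    // bootstrapServers and groupId are supplied by the caller (hypothetical values below).
    static Properties consumerProps(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        // The application persists offsets in its own data store, so autocommit is disabled.
        props.setProperty("enable.auto.commit", "false");
        // Session timeout of 10 seconds, as in the local reproduction.
        props.setProperty("session.timeout.ms", "10000");
        // Assumption: String keys and values; the report does not state the record types.
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical Toxiproxy listener address and group id, for illustration only.
        Properties props = consumerProps("localhost:19092", "zombie-repro");
        for (String name : props.stringPropertyNames()) {
            System.out.println(name + "=" + props.getProperty(name));
        }
    }
}
```

The resulting `Properties` object could then be passed to a `KafkaConsumer` constructor, with each of the two consumers pointed at a different proxied port.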