kevin j staiger created KAFKA-12544:
---------------------------------------

             Summary: Particular partitions lagging and consumers intermittently reading
                 Key: KAFKA-12544
                 URL: https://issues.apache.org/jira/browse/KAFKA-12544
             Project: Kafka
          Issue Type: Bug
          Components: consumer
         Environment: production
            Reporter: kevin j staiger
         Attachments: Screen Shot 2021-03-17 at 4.04.50 PM.png, Screen Shot 2021-03-23 at 8.42.13 PM.png

Hi, we are experiencing a strange issue with a Kafka topic where it seems like a particular consumer gets stuck in a bad state. We're running an 8-pod Kubernetes cluster with 2 consumer threads per pod and 16 partitions. Things run smoothly for a while, and then one of the pods (with 2 consumers and 2 partitions) becomes very intermittent in its read rate and its partition lag spikes. Eventually all of the pods switch from reading at a steady rate to this spiky, intermittent rate.

The CPU on the pods seems normal and the byte rate of the events seems fine. Any idea why certain consumers can get into this state, where there seem to be gaps of 0 operations and the lag continually increases?

Thanks



--
This message was sent by Atlassian Jira
(v8.3.4#803005)