[ https://issues.apache.org/jira/browse/KAFKA-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15833429#comment-15833429 ]
Bhavesh Shah commented on KAFKA-4676:
-------------------------------------

Hi [~hachikuji]/[~ijuma],

Even after upgrading Kafka to 0.10.1.1, we found that topics were getting stuck again. [All logs below were captured after upgrading to 0.10.1.1.]

Logs from the consumer application around the time the topics were stuck:

{code}
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - Auto offset commit failed for group dummy-consumer-group: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records.
{code}

Apart from the following, we couldn't find anything significant in the broker/controller logs around the time the topic was stuck:

{code}
[2017-01-22 03:30:05,383] INFO [GroupCoordinator 0]: Preparing to restabilize group dummy-consumer-group with old generation 80 (kafka.coordinator.GroupCoordinator)
[2017-01-22 03:30:07,949] INFO [GroupCoordinator 0]: Stabilized group dummy-consumer-group generation 81 (kafka.coordinator.GroupCoordinator)
[2017-01-22 03:30:07,955] INFO [GroupCoordinator 0]: Assignment received from leader for group dummy-consumer-group for generation 81 (kafka.coordinator.GroupCoordinator)
[2017-01-22 03:30:07,963] INFO [GroupCoordinator 0]: Preparing to restabilize group dummy-consumer-group with old generation 81 (kafka.coordinator.GroupCoordinator)
[2017-01-22 03:30:10,972] INFO [GroupCoordinator 0]: Stabilized group dummy-consumer-group generation 82 (kafka.coordinator.GroupCoordinator)
[2017-01-22 03:30:10,973] INFO [GroupCoordinator 0]: Assignment received from leader for group dummy-consumer-group for generation 82 (kafka.coordinator.GroupCoordinator)
[2017-01-22 03:30:11,373] INFO [GroupCoordinator 0]: Preparing to restabilize group dummy-consumer-group with old generation 82 (kafka.coordinator.GroupCoordinator)
[2017-01-22 03:30:13,986] INFO [GroupCoordinator 0]: Stabilized group dummy-consumer-group generation 83 (kafka.coordinator.GroupCoordinator)
[2017-01-22 03:30:13,987] INFO [GroupCoordinator 0]: Assignment received from leader for group dummy-consumer-group for generation 83 (kafka.coordinator.GroupCoordinator)
[2017-01-22 03:30:14,744] INFO [GroupCoordinator 0]: Preparing to restabilize group dummy-consumer-group with old generation 83 (kafka.coordinator.GroupCoordinator)
[2017-01-22 03:30:15,889] INFO [Group Metadata Manager on Broker 0]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.GroupMetadataManager)
[2017-01-22 03:30:16,992] INFO [GroupCoordinator 0]: Stabilized group dummy-consumer-group generation 84 (kafka.coordinator.GroupCoordinator)
[2017-01-22 03:30:16,993] INFO [GroupCoordinator 0]: Assignment received from leader for group dummy-consumer-group for generation 84 (kafka.coordinator.GroupCoordinator)
[2017-01-22 03:30:18,131] INFO [GroupCoordinator 0]: Preparing to restabilize group dummy-consumer-group with old generation 84 (kafka.coordinator.GroupCoordinator)
[2017-01-22 03:30:19,998] INFO [GroupCoordinator 0]: Stabilized group dummy-consumer-group generation 85 (kafka.coordinator.GroupCoordinator)
[2017-01-22 03:30:19,999] INFO [GroupCoordinator 0]: Assignment received from leader for group dummy-consumer-group for generation 85 (kafka.coordinator.GroupCoordinator)
[2017-01-22 03:30:20,509] INFO [GroupCoordinator 0]: Preparing to restabilize group dummy-consumer-group with old generation 85 (kafka.coordinator.GroupCoordinator)
[2017-01-22 03:30:23,004] INFO [GroupCoordinator 0]: Stabilized group dummy-consumer-group generation 86 (kafka.coordinator.GroupCoordinator)
[2017-01-22 03:30:23,006] INFO [GroupCoordinator 0]: Assignment received from leader for group dummy-consumer-group for generation 86 (kafka.coordinator.GroupCoordinator)
[2017-01-22 03:30:23,079] INFO [GroupCoordinator 0]: Preparing to restabilize group dummy-consumer-group with old generation 86 (kafka.coordinator.GroupCoordinator)
[2017-01-22 03:30:25,379] INFO [GroupCoordinator 0]: Group dummy-consumer-group with generation 87 is now empty (kafka.coordinator.GroupCoordinator)
{code}

We also made the following consumer config values configurable. Please let us know if they are inappropriate or unusual.

{code}
session.timeout.ms=300000
max.poll.interval.ms=300000
max.poll.records=100
request.timeout.ms=3050000
{code}

We are keeping an eye on the broker logs to discover anything helpful regarding the reported issue.

> Kafka consumers gets stuck for some partitions
> ----------------------------------------------
>
>                 Key: KAFKA-4676
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4676
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.10.1.0
>            Reporter: Vishal Shukla
>            Priority: Critical
>              Labels: consumer, reliability
>         Attachments: stuck-topic-thread-dump.log
>
>
> We recently upgraded to Kafka 0.10.1.0. We are frequently facing issue that
> Kafka consumers get stuck suddenly for some partitions.
> Attached thread dump.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)