[ https://issues.apache.org/jira/browse/KAFKA-5016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15958590#comment-15958590 ]
Domenico Di Giulio commented on KAFKA-5016:
-------------------------------------------

More information on Kafka versions (I tried crossing client and server versions):

client 0.10.2.0 - server 0.10.0.0 --> doesn't work
client 0.10.2.0 - server 0.10.2.0 --> doesn't work
client 0.10.0.0 - server 0.10.0.0 --> works
client 0.10.0.0 - server 0.10.2.0 --> works

So the issue may depend on the client, although it looks like it's the "stabilization" that doesn't complete on the server.

Also notice that it works with client 0.10.0.0, but it always takes 30 seconds to "restabilize group":

[2017-04-06 10:18:22,709] INFO [GroupCoordinator 2]: Preparing to restabilize group testOutputTopic with old generation 2 (kafka.coordinator.GroupCoordinator)
[2017-04-06 10:18:52,579] INFO [GroupCoordinator 2]: Stabilized group testOutputTopic generation 3 (kafka.coordinator.GroupCoordinator)

So I repeated the test with the 0.10.2.0 client and server, to get the detailed trace for both (attached: look again for "SERVER HANGS").

Thanks a lot for any indication you may have (including "you're just doing it all wrong").

> Consumer hang in poll method while rebalancing is in progress
> -------------------------------------------------------------
>
>                 Key: KAFKA-5016
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5016
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.10.1.0, 0.10.2.0
>            Reporter: Domenico Di Giulio
>         Attachments: Kafka 0.10.2.0 Issue (TRACE) - Server + Client.txt, Kafka 0.10.2.0 Issue (TRACE).txt
>
>
> After moving to Kafka 0.10.2.0, it looks like I'm experiencing a hang in the
> rebalancing code.
> This is a test case, not (yet) production code. It does the following with
> a single-partition topic and two consumers in the same group:
> 1) a topic with one partition is forced to be created (auto-created)
> 2) a producer is used to write 10 messages
> 3) the first consumer reads all the messages and commits
> 4) the second consumer attempts a poll() and hangs indefinitely
> The same issue can't be found with 0.10.0.0.
> See the attached logs at TRACE level. Look for "SERVER HANGS" to see where
> the hang is found: when this happens, the client keeps failing every heartbeat
> attempt, as the rebalancing is in progress, and the poll method hangs
> indefinitely.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
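
For anyone who wants to check the "30 seconds" figure against the two GroupCoordinator log lines quoted in the comment, the gap can be computed directly from the bracketed timestamps. The following is only a minimal sketch: the class name RestabilizeGap and its helper methods are made up for illustration and are not part of Kafka, and it assumes the log line starts with a timestamp in the "yyyy-MM-dd HH:mm:ss,SSS" layout seen above.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper (not part of Kafka): measures the time between two broker log lines.
class RestabilizeGap {
    // Timestamp layout assumed from the quoted log lines, e.g. "2017-04-06 10:18:22,709".
    private static final DateTimeFormatter LOG_TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    // Parses the first bracketed field of a log line as a timestamp.
    static LocalDateTime timestampOf(String logLine) {
        int open = logLine.indexOf('[');
        int close = logLine.indexOf(']', open);
        return LocalDateTime.parse(logLine.substring(open + 1, close), LOG_TS);
    }

    // Elapsed time from the first log line to the second.
    static Duration gap(String first, String second) {
        return Duration.between(timestampOf(first), timestampOf(second));
    }

    public static void main(String[] args) {
        String preparing = "[2017-04-06 10:18:22,709] INFO [GroupCoordinator 2]: "
                + "Preparing to restabilize group testOutputTopic with old generation 2 "
                + "(kafka.coordinator.GroupCoordinator)";
        String stabilized = "[2017-04-06 10:18:52,579] INFO [GroupCoordinator 2]: "
                + "Stabilized group testOutputTopic generation 3 "
                + "(kafka.coordinator.GroupCoordinator)";
        // Prints "29870 ms", i.e. just under 30 seconds.
        System.out.println(gap(preparing, stabilized).toMillis() + " ms");
    }
}
```

Run against the two lines from the comment, this gives 29.87 seconds between "Preparing to restabilize" and "Stabilized", which matches the 30 seconds reported with client 0.10.0.0.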