[ https://issues.apache.org/jira/browse/KAFKA-6593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16377764#comment-16377764 ]
Jason Gustafson commented on KAFKA-6593:
----------------------------------------

I was wrong about the explanation. The {{ConsumerNetworkClient}} already has some protection from this kind of scenario (e.g. it limits the maximum poll time to 5 seconds). So something else is going on in the case that I'm looking at.

> Coordinator disconnect in heartbeat thread can cause commitSync to block indefinitely
> -------------------------------------------------------------------------------------
>
>                 Key: KAFKA-6593
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6593
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer
>    Affects Versions: 1.0.0, 0.11.0.2
>            Reporter: Jason Gustafson
>            Assignee: Jason Gustafson
>            Priority: Major
>             Fix For: 1.1.0
>
>
> If a coordinator disconnect is observed in the heartbeat thread, it can cause
> a pending offset commit to be cancelled just before the foreground thread
> begins waiting on its response in poll(). Since the poll timeout is
> Long.MAX_VALUE, this will cause the consumer to effectively hang until some
> other network event causes the poll() to return. We try to protect this case
> with a poll condition on the future, but this isn't bulletproof since the
> future can be completed outside of the lock.
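
For readers following the thread, here is a minimal sketch of the check-then-wait race described in the issue, and of why a cap on the poll time bounds the resulting hang. It does not use Kafka's classes: {{BoundedPollSketch}}, {{awaitFuture}}, {{poll}} and {{demo}} are hypothetical names, and the condition variable only stands in for a blocking network poll. The "heartbeat" thread fails the future without signalling the poll, so the waiter notices only when its capped poll times out.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only; this is not Kafka's ConsumerNetworkClient.
public class BoundedPollSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition networkEvent = lock.newCondition();

    /** Stand-in for a blocking network poll: returns on a signalled event or on timeout. */
    private void poll(long timeoutMs) throws InterruptedException {
        lock.lock();
        try {
            networkEvent.await(timeoutMs, TimeUnit.MILLISECONDS);
        } finally {
            lock.unlock();
        }
    }

    /**
     * Waits for the future, re-checking it after every poll of at most maxPollMs.
     * Returns true if the future was seen complete before the overall deadline.
     */
    public boolean awaitFuture(CompletableFuture<?> future, long timeoutMs, long maxPollMs)
            throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (!future.isDone()) {
            long remainingMs = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
            if (remainingMs <= 0) {
                return false;
            }
            // Race window: the future may complete here, after the check and
            // before the wait. The cap limits how long that goes unnoticed.
            poll(Math.min(remainingMs, maxPollMs));
        }
        return true;
    }

    /**
     * Fails a future from a second thread without signalling the poll.
     * Returns the elapsed milliseconds until the waiter noticed, or -1 if it did not.
     */
    public static long demo(long maxPollMs) {
        BoundedPollSketch client = new BoundedPollSketch();
        CompletableFuture<Void> commit = new CompletableFuture<>();
        Thread heartbeat = new Thread(() -> {
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            // Completed outside the lock and without a wakeup.
            commit.completeExceptionally(new IllegalStateException("coordinator disconnected"));
        });
        long start = System.nanoTime();
        heartbeat.start();
        try {
            boolean done = client.awaitFuture(commit, 2_000, maxPollMs);
            heartbeat.join();
            return done ? TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start) : -1;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println("noticed completion after ms: " + demo(50));
    }
}
```

With a 50 ms cap the waiter re-checks the future shortly after it is failed. Passing a cap equal to the overall timeout (for example {{demo(2_000)}}) makes the same wait last roughly the full two seconds, because nothing signals the condition; with an effectively unbounded timeout that becomes the hang described in the issue.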