[ https://issues.apache.org/jira/browse/KAFKA-4739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15854842#comment-15854842 ]
Vipul Singh commented on KAFKA-4739:
------------------------------------

[~hachikuji], I have updated the description of the JIRA with new logs. To answer your questions:
1. Once the consumer reaches this state, this occurs around 10 times every second; the new log lines posted above should help establish this.
2. Yes, both are running on 0.9.0.1 at the moment.
3. I have posted logs with timestamps above to help answer this question.
4. I am afraid not at the moment.

An observation from our side: it looks like the network client never attempts to reconnect to the broker after getting disconnected. It cancels all the in-flight requests, but never attempts to reconnect.

A couple of configs we use:

on broker side:
request_timeout_ms = 300001

on consumer side:
session.timeout.ms = 120000
request.timeout.ms = 120001

Hope this helps!

> KafkaConsumer poll going into an infinite loop
> ----------------------------------------------
>
>                 Key: KAFKA-4739
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4739
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer
>    Affects Versions: 0.9.0.1
>            Reporter: Vipul Singh
>
> We are seeing an issue with our Kafka consumer where it seems to go into an
> infinite loop while polling, trying to fetch data from Kafka. We are seeing
> the heartbeat requests on the broker from the consumer, but nothing else from
> the Kafka consumer.
> We enabled debug level logging on the consumer, and see these logs:
> DEBUG org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient:
> Cancelled FETCH request ClientRequest(metadata info) with correlation id abc
> due to node xyz being disconnected
> DEBUG org.apache.kafka.clients.consumer.internals.Fetcher: Fetch failed
> ! org.apache.kafka.common.errors.DisconnectException: null
> DEBUG org.apache.kafka.clients.NetworkClient: Initiating connection to node
> abc at nodename:port
> DEBUG org.apache.kafka.clients.consumer.internals.Fetcher: Fetch failed
> ! org.apache.kafka.clients.consumer.internals.SendFailedException: null
> DEBUG org.apache.kafka.clients.consumer.internals.Fetcher: Fetch failed
> ! org.apache.kafka.clients.consumer.internals.SendFailedException: null
> DEBUG org.apache.kafka.clients.consumer.internals.Fetcher: Fetch failed
> ! org.apache.kafka.clients.consumer.internals.SendFailedException: null
> DEBUG org.apache.kafka.clients.consumer.internals.Fetcher: Fetch failed
> ! org.apache.kafka.clients.consumer.internals.SendFailedException: null
> DEBUG org.apache.kafka.clients.consumer.internals.Fetcher: Fetch failed
> ! org.apache.kafka.clients.consumer.internals.SendFailedException: null
> DEBUG org.apache.kafka.clients.consumer.internals.Fetcher: Fetch failed
> ! org.apache.kafka.clients.consumer.internals.SendFailedException: null
> DEBUG org.apache.kafka.clients.consumer.internals.Fetcher: Fetch failed
> ! org.apache.kafka.clients.consumer.internals.SendFailedException: null
> DEBUG org.apache.kafka.clients.consumer.internals.Fetcher: Fetch failed
> ! org.apache.kafka.clients.consumer.internals.SendFailedException: null
> DEBUG org.apache.kafka.clients.consumer.internals.Fetcher: Fetch failed
> ! org.apache.kafka.clients.consumer.internals.SendFailedException: null
> DEBUG org.apache.kafka.clients.consumer.internals.Fetcher: Fetch failed
> ! org.apache.kafka.clients.consumer.internals.SendFailedException: null
> DEBUG org.apache.kafka.clients.consumer.internals.Fetcher: Fetch failed
> ! org.apache.kafka.clients.consumer.internals.SendFailedException: null
> DEBUG org.apache.kafka.clients.consumer.internals.Fetcher: Fetch failed
> ! org.apache.kafka.clients.consumer.internals.SendFailedException: null
> DEBUG org.apache.kafka.clients.NetworkClient: Completed connection to node xyz
> DEBUG org.apache.kafka.clients.Metadata: Updated cluster metadata version 4
> to Cluster(cluster_info)
> DEBUG org.apache.kafka.clients.consumer.internals.AbstractCoordinator:
> Received successful heartbeat response.
> DEBUG org.apache.kafka.clients.consumer.internals.AbstractCoordinator:
> Received successful heartbeat response.
> DEBUG org.apache.kafka.clients.consumer.internals.AbstractCoordinator:
> Received successful heartbeat response.
> DEBUG org.apache.kafka.clients.consumer.internals.AbstractCoordinator:
> Received successful heartbeat response.
> And this just goes on. The way we have been able to replicate this issue is
> by restarting the process multiple times in succession.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)