[ 
https://issues.apache.org/jira/browse/KAFKA-4739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15855098#comment-15855098
 ] 

Jason Gustafson commented on KAFKA-4739:
----------------------------------------

[~sagar8192] Unfortunately, there is no such option. Traditionally, Kafka 
clients attempt to handle broker failures internally. This usually means a 
metadata refresh and a reconnect, which is exactly what the client appears to 
be doing here. We normally expect the assigned partitions to be spread 
across multiple brokers, so a failure fetching from any particular broker 
should only affect the availability of the partitions it was hosting. This is 
typically what you want, since a broker failure will cause another broker to 
take over its partitions. There is little an application can do in these cases 
anyway, other than possibly sending an alert. Nevertheless, this behavior is 
often contested and may change, especially as some of the automatic behavior 
(such as topic auto-creation) is retired.

One small request: the logs seem to have sanitized broker ids. Can you ensure 
that they have all been updated consistently? The puzzling thing is that the 
requests appear to be timing out on the client after 30s, yet you've enabled 
120s in the config. Are you sure the 120s is correct? In which config did you 
set "request_timeout_ms = 300001" (the broker doesn't have such a config)? 
It's also strange that multiple fetches are cancelled after a disconnect. The 
consumer should only ever have one fetch in-flight for each broker, so I don't 
have a ready explanation for that. Could there be some details left out of the 
logs? We might get more information if you enable TRACE-level logging.
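
For reference, a minimal sketch of the two points above, assuming a 
log4j.properties-based client setup (file locations and the 120000 value are 
illustrative, not from the reporter's environment). Note that the consumer 
property name is spelled with dots, not underscores:

```
# consumer.properties (client side) -- the client request timeout;
# "request_timeout_ms" is not a recognized key on either client or broker
request.timeout.ms=120000

# log4j.properties (client side) -- TRACE logging for the Kafka client
# packages, to capture the fetch/disconnect detail missing from the gist
log4j.logger.org.apache.kafka.clients=TRACE
```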

> KafkaConsumer poll going into an infinite loop
> ----------------------------------------------
>
>                 Key: KAFKA-4739
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4739
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer
>    Affects Versions: 0.9.0.1
>            Reporter: Vipul Singh
>
> We are seeing an issue with our Kafka consumer where it seems to go into an 
> infinite loop while polling, trying to fetch data from Kafka. We see the 
> heartbeat requests from the consumer on the broker, but nothing else from 
> the consumer.
> We enabled debug-level logging on the consumer and see these logs: 
> https://gist.github.com/neoeahit/757bff7acdea62656f065f4dcb8974b4
> And this just goes on. The way we have been able to reproduce this issue is 
> by restarting the process several times in quick succession.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
