[ 
https://issues.apache.org/jira/browse/KAFKA-7931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16876222#comment-16876222
 ] 

Sam Weston commented on KAFKA-7931:
-----------------------------------

Have you made any progress with this? I have the same problem if I lose more 
than 1 node every 5 minutes or so, and I haven't worked out how to monitor for 
it yet...

 

> Java Client: if all ephemeral brokers fail, client can never reconnect to 
> brokers
> ---------------------------------------------------------------------------------
>
>                 Key: KAFKA-7931
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7931
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients
>    Affects Versions: 2.1.0
>            Reporter: Brian
>            Priority: Critical
>
> Steps to reproduce:
>  * Set up a Kafka cluster in GKE, with the bootstrap server address 
> configured to point to a load balancer that exposes all GKE nodes
>  * Run a producer that emits values into a partition with 3 replicas
>  * Kill every broker in the cluster
>  * Wait for brokers to restart
> Observed result:
> The Java client cannot find any of the nodes even though they have all 
> recovered. I see messages like "Connection to node 30 (/10.6.0.101:9092) 
> could not be established. Broker may not be available.".
> Note, this is *not* a duplicate of 
> https://issues.apache.org/jira/browse/KAFKA-7890. I'm using the client 
> version that contains the fix for 
> https://issues.apache.org/jira/browse/KAFKA-7890.
> Versions:
> Kafka: version 2.1.0, using confluentinc/cp-kafka/5.1.0 docker image
> Client: trunk from a few days ago (git sha 
> 9f7e6b291309286e3e3c1610e98d978773c9d504), to pull in the fix for KAFKA-7890
>  
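
On the monitoring question raised in the comment, one possible starting point is to watch the client log for the warning quoted in the description. The sketch below is only an illustration, not a fix for the bug: it assumes the log line format matches the quoted message, and the names count_failed_connections, stale_addresses and the threshold value are hypothetical.

```python
import re
from collections import Counter

# Pattern modeled on the warning quoted in the issue description:
#   Connection to node 30 (/10.6.0.101:9092) could not be established.
# Assumption: real client logs put this message on a single line.
FAILED_CONNECTION = re.compile(
    r"Connection to node (?P<node>-?\d+) \([^/]*/(?P<addr>[^)]+)\) "
    r"could not be established"
)


def count_failed_connections(lines):
    """Count failed-connection warnings per (node id, address) pair."""
    failures = Counter()
    for line in lines:
        match = FAILED_CONNECTION.search(line)
        if match:
            key = (int(match.group("node")), match.group("addr"))
            failures[key] += 1
    return failures


def stale_addresses(failures, threshold):
    """Return the (node id, address) pairs seen failing at least `threshold` times."""
    return sorted(key for key, count in failures.items() if count >= threshold)
```

Feeding the client's log lines through count_failed_connections and alerting when stale_addresses stays non-empty for several minutes would at least flag a client that keeps retrying old broker addresses. The threshold and time window would need tuning per deployment.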



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
