[ https://issues.apache.org/jira/browse/KAFKA-3068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086717#comment-15086717 ]
Jun Rao commented on KAFKA-3068:
--------------------------------

There are two possible ways of configuring bootstrap servers: (1) using a VIP on a load balancer, in which case we can expect the VIP to map to the current live brokers; or (2) using a list of broker hosts, in which case it is the user's responsibility to keep the list up to date so that it contains at least one live broker. (A configuration sketch illustrating both options appears at the end of this message.)

When we hit a case where none of the current brokers (from the last metadata response) is connectable (e.g., the cluster shrinks to one node, that node dies, and the other brokers are restarted), falling back to the bootstrap servers will help if option (1) is used, since the VIP will let us connect to a live broker. If option (2) is used, falling back to the bootstrap servers may not help when none of the bootstrap servers is reachable; however, in that case it is really the user's responsibility to re-configure the bootstrap servers and restart the producer. So, overall, it seems that falling back to the bootstrap servers when all existing connections are gone will help (see the node-selection sketch at the end of this message).

Now, on caching old brokers longer than the metadata refresh interval: currently we can say that if you ever want to reuse a server in a different Kafka cluster, you should wait at least the metadata refresh interval after taking the broker down. If we cache the old brokers longer, that reasoning becomes more complicated. Also, from the above, I am not sure that old brokers are more useful than the configured bootstrap servers. Finally, we discussed falling back to the bootstrap servers early on in KAFKA-1303 for a different scenario, but didn't pursue it in the end.

> NetworkClient may connect to a different Kafka cluster than originally
> configured
> ---------------------------------------------------------------------------------
>
>                 Key: KAFKA-3068
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3068
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients
>    Affects Versions: 0.9.0.0
>            Reporter: Jun Rao
>
> In https://github.com/apache/kafka/pull/290, we added the logic to cache all
> brokers (id and ip) that the client has ever seen. If we can't find an
> available broker in the current metadata, we pick a broker we have seen
> before (in NetworkClient.leastLoadedNode()).
>
> One potential problem this logic can introduce is the following. Suppose we
> have a broker with id 1 in a Kafka cluster. A producer client remembers this
> broker in nodesEverSeen. At some point, we bring down this broker and reuse
> the host in a different Kafka cluster. The producer client then uses this
> broker from nodesEverSeen to refresh metadata, gets metadata from the other
> Kafka cluster, and starts producing data there.
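
To make options (1) and (2) from the comment above concrete, here is a minimal producer configuration sketch. The host names and topic are placeholders, not real endpoints; only the bootstrap.servers value differs between the two options. metadata.max.age.ms is the real client config for the metadata refresh interval mentioned above (default 5 minutes).

{code:java}
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class BootstrapConfigSketch {
    public static void main(String[] args) {
        Properties props = new Properties();

        // Option (1): a VIP on a load balancer; the VIP is expected to map to
        // the current live brokers. Host name is a placeholder.
        // props.put("bootstrap.servers", "kafka-vip.example.com:9092");

        // Option (2): an explicit broker list; the user must keep it up to
        // date so at least one entry is a live broker. Placeholder hosts.
        props.put("bootstrap.servers",
                  "broker1.example.com:9092,broker2.example.com:9092");

        // Metadata refresh interval: per the reasoning above, a host being
        // reused in a different cluster should stay down at least this long.
        props.put("metadata.max.age.ms", "300000");

        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("test-topic", "key", "value"));
        }
    }
}
{code}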
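
And here is a simplified sketch of the node-selection order being proposed: prefer brokers from the current metadata, then fall back to the configured bootstrap servers rather than to nodesEverSeen. This is illustrative pseudologic, not the actual NetworkClient implementation; the class, its fields, and isConnectable() are assumptions made for the example (only org.apache.kafka.common.Node is a real client type).

{code:java}
import java.util.List;

import org.apache.kafka.common.Node;

// Illustrative sketch only -- not the real NetworkClient. It shows the
// fallback order discussed above: current metadata first, then the
// configured bootstrap servers, never nodesEverSeen.
class NodeSelectionSketch {
    private final List<Node> bootstrapNodes;   // parsed from bootstrap.servers
    private volatile List<Node> metadataNodes; // brokers in the last metadata response

    NodeSelectionSketch(List<Node> bootstrapNodes, List<Node> metadataNodes) {
        this.bootstrapNodes = bootstrapNodes;
        this.metadataNodes = metadataNodes;
    }

    Node leastLoadedNode() {
        // Prefer a connectable broker from the current metadata.
        for (Node node : metadataNodes) {
            if (isConnectable(node))
                return node;
        }
        // Every broker from the last metadata response is gone: fall back to
        // the configured bootstrap servers. With option (1) the VIP resolves
        // to a live broker; with option (2) the user keeps the list current.
        // Crucially, we never dip into nodesEverSeen, so a host reused in a
        // different Kafka cluster is never contacted.
        for (Node node : bootstrapNodes) {
            if (isConnectable(node))
                return node;
        }
        return null; // nothing reachable; the caller retries after a backoff
    }

    private boolean isConnectable(Node node) {
        // Hypothetical check; a real client would consult connection state
        // and the reconnect backoff for the node.
        return node != null;
    }
}
{code}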