[ https://issues.apache.org/jira/browse/KAFKA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16872476#comment-16872476 ]

Sönke Liebau commented on KAFKA-5115:
-------------------------------------

Hi [~MiniMizer],

we discussed this today, and while the change itself would be fairly 
simple, I believe there are a lot of areas that would need investigating and 
testing before this could be recommended for a production deployment.

Specifically, everything around transactions and idempotent producers seems to 
me to be worth a dedicated look.

On the consumer side, the immediate concern, I think, is offsets: stored 
offsets might not create issues (but may also not work), but anything cached 
inside the Fetcher could cause havoc.

Bottom line: it is a good idea that I'd fully support, but it probably needs 
more work than is immediately apparent.

> Use bootstrap.servers to refresh metadata
> -----------------------------------------
>
>                 Key: KAFKA-5115
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5115
>             Project: Kafka
>          Issue Type: Improvement
>    Affects Versions: 0.10.2.0
>            Reporter: Dan
>            Priority: Major
>
> Currently, it seems that the bootstrap.servers list is used only when the 
> producer starts, to discover the cluster, and subsequent metadata refreshes 
> go to the discovered brokers directly.
> We would like to use the bootstrap.servers list for metadata refresh to 
> support a failover mechanism by providing a VIP which can dynamically 
> redirect requests to a secondary Kafka cluster if the primary is down.
> Consider the following use case, where "kafka-cluster.local" is a VIP on a 
> load balancer with priority server pools that point to two different Kafka 
> clusters (so when all servers of cluster #1 are down, it automatically 
> redirects to servers from cluster #2).
> bootstrap.servers: kafka-cluster.local:9092
> 1) Producer starts, connects to kafka-cluster.local and discovers all servers 
> from cluster #1
> 2) Producer starts producing to cluster #1
> 3) cluster #1 goes down
> 4) Producer detects the failure, refreshes metadata from kafka-cluster.local 
> (which now returns nodes from cluster #2)
> 5) Producer starts producing to cluster #2
> 6) cluster #1 is brought back online, and kafka-cluster.local now points to 
> it again
> In the current state, it seems that the producer will never revert to cluster 
> #1 because it continues to refresh its metadata from the brokers of cluster 
> #2, even though kafka-cluster.local no longer points to that cluster.
> If we could force the metadata refresh to happen against 
> "kafka-cluster.local", it would enable automatic failover and failback 
> between the clusters.
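
The failover and failback steps quoted above can be modelled with a small toy simulation. This is only a sketch of the behaviour as the issue describes it, not of the actual Kafka client internals: the names Cluster, Vip, ToyProducer, and run_scenario are hypothetical, and no Kafka API is used. It assumes, as the description does, that the producer falls back to the VIP once every broker it knows about is down.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    name: str
    up: bool = True


class Vip:
    """Toy stand-in for the load-balancer VIP with priority server pools."""

    def __init__(self, pools):
        self.pools = pools  # ordered by priority, highest first

    def resolve(self):
        # Return the highest-priority cluster that is up, or None if all are down.
        return next((c for c in self.pools if c.up), None)


class ToyProducer:
    """Hypothetical model of a client; not the real KafkaProducer."""

    def __init__(self, vip, refresh_via_bootstrap):
        self.vip = vip
        self.refresh_via_bootstrap = refresh_via_bootstrap
        self.cluster = vip.resolve()  # step 1: initial bootstrap through the VIP

    def refresh_metadata(self):
        known_cluster_down = self.cluster is None or not self.cluster.up
        if self.refresh_via_bootstrap or known_cluster_down:
            # Proposed behaviour (or assumed fallback): ask the VIP again.
            self.cluster = self.vip.resolve()
        # Otherwise keep refreshing from the brokers already discovered.
        return self.cluster.name if self.cluster else None


def run_scenario(refresh_via_bootstrap):
    primary, secondary = Cluster("cluster-1"), Cluster("cluster-2")
    producer = ToyProducer(Vip([primary, secondary]), refresh_via_bootstrap)
    trace = [producer.refresh_metadata()]      # steps 1-2: producing to cluster #1
    primary.up = False                         # step 3: cluster #1 goes down
    trace.append(producer.refresh_metadata())  # steps 4-5: fail over to cluster #2
    primary.up = True                          # step 6: cluster #1 is back
    trace.append(producer.refresh_metadata())  # does the producer fail back?
    return trace


if __name__ == "__main__":
    print(run_scenario(refresh_via_bootstrap=False))
    print(run_scenario(refresh_via_bootstrap=True))
```

In this model the first call returns ['cluster-1', 'cluster-2', 'cluster-2'] (the producer stays on cluster #2) and the second returns ['cluster-1', 'cluster-2', 'cluster-1'] (it fails back). The sketch deliberately ignores the concerns raised in the comment: every switch lands on a different cluster, so transactional or idempotent producer state, stored offsets, and anything cached in the Fetcher would refer to the wrong cluster.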



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
