Thank you, Yash. That additional documentation helps to further my 
understanding.

In case it helps in any way, I am currently setting the 
rest.advertised.host.name and listener properties to a private IP address that 
is reachable within each data center. However, the data centers can 
communicate with each other only through a load balancer.

Is there any configuration I can set to help with this setup? 

For example, if the worker sends the request to the load balancer of the data 
center where the leader resides, I believe that it would work network-wise.
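
A minimal sketch of the idea above, assuming each data center's load balancer forwards to a worker's REST port. The IP address, host name, and helper names are hypothetical. `listeners`, `rest.advertised.host.name`, and `rest.advertised.port` are existing Connect worker properties, but whether advertising a load balancer address works in this network layout is an assumption that would need testing.

```python
def worker_rest_properties(bind_ip: str, bind_port: int,
                           lb_host: str, lb_port: int) -> dict:
    """Build REST-related worker properties (hypothetical helper)."""
    return {
        # Address the worker's REST server binds to inside its own data center.
        "listeners": f"http://{bind_ip}:{bind_port}",
        # Address given out to other workers; assumed here to be the
        # load balancer in front of this data center, not the private IP.
        "rest.advertised.host.name": lb_host,
        "rest.advertised.port": str(lb_port),
    }


def render_properties(props: dict) -> str:
    """Render a dict as key=value lines for a .properties file."""
    return "\n".join(f"{key}={value}" for key, value in props.items())


# Example values only; none of these come from a real deployment.
example = worker_rest_properties("10.0.1.15", 8083,
                                 "connect-dc1.example.internal", 8083)
print(render_properties(example))
```

If several workers sit behind the same load balancer, a request sent to the advertised address may reach a worker other than the leader, so this sketch may only make sense with one worker per advertised address.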

Thank you again for taking the time to help.

---- On Tue, 26 Sep 2023 07:44:06 -0400 Yash Mayya <yash.ma...@gmail.com> wrote ---

Hi Yeikel, 
 
> To clarify, who initiates the step that assigns a 
>  connector to a specific worker? If this process 
> is controlled by the leader, wouldn't it result in a 
> failure to assign tasks to workers with whom it 
> cannot communicate? 
 
This happens via the group rebalance process, where each Kafka Connect 
worker communicates with the Kafka broker that has been chosen as the group 
coordinator for the Kafka Connect cluster. The assignment is indeed 
computed by the leader Connect worker, but it is disseminated to the other 
Connect workers via the group coordinator [1]. 
 
> I should not find myself in a situation where a 
> connector is assigned to a worker who cannot 
> communicate with the leader 
 
This can unfortunately happen, since the assignments aren't done directly 
through leader -> non-leader Connect worker communication but via the Kafka 
broker designated as the group coordinator for the Connect cluster. 
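
As a toy illustration of the point above (not Connect's actual assignment algorithm), the sketch below spreads connectors over all group members without checking which workers can reach the leader's REST endpoint. The worker names, data center labels, connector names, and the round-robin strategy are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Worker:
    name: str
    data_center: str


def assign_round_robin(connectors, workers):
    """Toy assignment: spread connectors over all group members in order."""
    return {c: workers[i % len(workers)] for i, c in enumerate(connectors)}


def stranded_connectors(assignment, leader):
    """Connectors placed on a worker in a different data center than the leader.

    Assumes workers in different data centers cannot reach each other directly.
    """
    return sorted(c for c, w in assignment.items()
                  if w.data_center != leader.data_center)


workers = [Worker("w1", "dc-a"), Worker("w2", "dc-b")]
leader = workers[0]
assignment = assign_round_robin(["jdbc-source", "s3-sink", "es-sink"], workers)
print(stranded_connectors(assignment, leader))  # prints ['s3-sink']
```

Because the assignment travels through the group coordinator, nothing in this flow depends on the leader being able to reach "w2", which is why a connector can end up there.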
 
[1] - 
https://medium.com/streamthoughts/apache-kafka-rebalance-protocol-or-the-magic-behind-your-streams-applications-e94baf68e4f2
 
 
On Tue, Sep 26, 2023 at 8:25 AM Yeikel Santana <em...@yeikel.com> wrote: 
 
> Thank you, Yash. Your explanation makes sense.
> 
> To clarify, who initiates the step that assigns a connector to a specific 
> worker? If this process is controlled by the leader, wouldn't it result in 
> a failure to assign tasks to workers with whom it cannot communicate? 
> 
> Although it is not ideal, it is acceptable for now if some workers remain 
> inactive as long as the data center where the leader resides remains active 
> and continues to handle task assignments. I should not find myself in a 
> situation where a connector is assigned to a worker who cannot communicate 
> with the leader, as that would render it useless, as you mentioned.
> 
> Thank you for taking the time.
> 
> ---- On Mon, 25 Sep 2023 11:41:18 -0400 Yash Mayya <yash.ma...@gmail.com> wrote ---
> 
> Hi Yeikel, 
> 
> Heartbeats and group coordination in Kafka Connect do occur through Kafka, 
> but a Kafka Connect cluster in which not all workers can communicate with 
> each other won't work very well. You'll be able to create / update / delete 
> connectors by making requests to any worker that can communicate with the 
> leader, like you noted. However, certain internal operations require network 
> access between Connect workers as well. For instance, after a connector is 
> started, it needs to spawn tasks that do the actual work. The tasks are 
> created via a POST request to the leader worker from the worker that is 
> running the connector. When you issue a create connector request to a 
> worker, a group rebalance ensues and the connector is assigned to a worker 
> in the cluster (it could be any worker, not necessarily the one to 
> which the request was issued). So if the connector that you created lands 
> on a Connect worker that cannot communicate with the leader worker, it 
> won't be able to create its tasks, which will render the connector 
> essentially useless. 
> 
> Thanks, 
> Yash 
> 
> On Mon, Sep 25, 2023 at 7:51 PM Yeikel Santana <em...@yeikel.com> wrote: 
> 
> > Thank you, Nikhil. 
> > 
> > I did notice the challenge you're describing with the REST updates when I 
> > had more than one worker within the same data center. 
> > 
> > Luckily, solving that was relatively simple, as all my workers can 
> > communicate within the same data center, and all I need to do is ensure 
> > that the update is initiated from the same data center as the leader. From 
> > what I have tested so far, this seems to work fine. 
> > 
> > My biggest concern was regarding other operations such as heartbeats or 
> > general coordination. If that happens through Kafka, then I should be 
> > fine. 
> > 
> > Thank you for taking the time. 
> > 
> > ---- On Mon, 25 Sep 2023 09:45:43 -0400 nikhilsrivastava4...@gmail.com wrote ---- 
> > 
> > Hi Yeikel, 
> > 
> > Sharing my two cents. Would let others chime in to add to this. 
> > 
> > Based on my understanding, if Connect workers (which are all part of the 
> > same cluster) can communicate with the Kafka brokers (one of which acts as 
> > the Group Coordinator and facilitates Connect leader election via the Group 
> > Membership Protocol), then only one Connect worker will be elected as 
> > leader among all the workers in the cluster. Outside of that, I believe a 
> > number of REST calls to Connect workers are forwarded to the Connect leader 
> > (if the REST request lands on a Connect worker that isn't the leader). In 
> > case of a non-retriable network partition between the non-leader worker and 
> > the leader worker, those REST requests will fail. I'm referring to REST 
> > requests like CREATE / UPDATE / DELETE. 
> > 
> > Hope this helps a little. 
> > 
> > Thanks, 
> > -Nikhil 
> > 
> > On Sun, 24 Sept 2023 at 06:36, Yeikel Santana <em...@yeikel.com> wrote: 
> > 
> > > Hello everyone, 
> > > 
> > > I'm currently designing a new Kafka Connect cluster, and I'm trying to 
> > > understand how connectivity functions among workers. 
> > > 
> > > In my setup, I have a single Kafka Connect cluster connected to the same 
> > > Kafka topics and Kafka cluster. However, the workers are deployed in 
> > > geographically separated data centers, each of which is fully isolated at 
> > > the network level. 
> > > 
> > > I suspect that this setup might not work with Kafka Connect because my 
> > > current understanding is that ALL workers need to communicate with the 
> > > leader for task coordination and heartbeats. 
> > > 
> > > In terms of leader election, can this result in multiple leaders and other 
> > > potential issues? 
> > > 
> > > Any input and suggestions would be appreciated. 
