[jira] [Commented] (KAFKA-6433) Connect distributed workers should fail if their config is "incompatible" with leader's

2018-02-22 Thread Ewen Cheslack-Postava (JIRA)

[ https://issues.apache.org/jira/browse/KAFKA-6433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373961#comment-16373961 ]

Ewen Cheslack-Postava commented on KAFKA-6433:
--

This needs a lot of thought around upgrades, compatibility, and debuggability. 
There are all sorts of weird issues you can get into with something like this.

I agree that the general goal of checking that important configs are aligned is 
absolutely the right thing to do. Today, unless I'm forgetting something, we 
basically only check that the group and config offset match. Lots of other 
things could potentially mismatch and cause problems.

But things "matching" can be tricky. Topic names are pretty straightforward and 
we can validate easily. Validating anything like "the same set of connectors" 
is tricky given both versioning and upgrading a cluster with a *new* connector. 
Same for converters and transformations. We'd need to define clear rules for 
what "compatibility" means here and when a node is allowed to run a 
connector/task. And who is the source of truth? Who defines what's new?
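
To illustrate the easy case only (topic names), here is a minimal sketch of such a check. The class `WorkerConfigCheck` and its method are hypothetical and are not part of Connect; the three keys are the existing distributed worker settings for the internal topics. It assumes the leader's values are somehow available to the joining worker, which is not the case today.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical helper for illustration; Connect has no such class.
final class WorkerConfigCheck {
    // Distributed worker settings naming the internal topics, which
    // every worker in the same cluster is expected to share.
    static final List<String> SHARED_KEYS = List.of(
        "config.storage.topic",
        "offset.storage.topic",
        "status.storage.topic");

    // Returns one human-readable line per key whose value differs
    // between the leader's config and the joining worker's config.
    static List<String> mismatches(Map<String, String> leader,
                                   Map<String, String> worker) {
        List<String> problems = new ArrayList<>();
        for (String key : SHARED_KEYS) {
            String expected = leader.get(key);
            String actual = worker.get(key);
            if (!Objects.equals(expected, actual)) {
                problems.add(key + ": leader=" + expected + ", worker=" + actual);
            }
        }
        return problems;
    }
}
```

Returning every mismatch rather than failing on the first one keeps the resulting log message specific, which matters for debuggability. Nothing comparable is sketched for connectors, converters, or transformations, because the rules for those are exactly what is undefined.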

Personally, I'd argue it's actually clearer to have a log message saying 
"couldn't start connector X because class not found" from node Y than have to 
determine why all connectors/tasks are running on node Z because node W wasn't 
allowed to join worker group N for some mismatch of connectors. Rejecting the 
worker at join time might fail faster, but the log message tells you exactly 
what the problem is and leads to a clear resolution.

> Connect distributed workers should fail if their config is "incompatible" 
> with leader's
> ---
>
> Key: KAFKA-6433
> URL: https://issues.apache.org/jira/browse/KAFKA-6433
> Project: Kafka
> Issue Type: Improvement
> Components: KafkaConnect
> Affects Versions: 1.0.0
> Reporter: Randall Hauch
> Priority: Major
> Labels: needs-kip
>
> Currently, each distributed worker config must have the same `group.id` and 
> must use the same internal topics for configs, offsets, and status. 
> Additionally, each worker must be configured to have the same connectors, 
> SMTs, and converters; confusing error messages will result when some workers 
> are able to deploy connector tasks with SMTs while others fail because they 
> are missing plugins the other workers do have.
> Ideally, a Connect worker would only be allowed to join the cluster if it 
> were "compatible" with the existing cluster, where "compatible" perhaps 
> includes using the same internal topics and having the same set of plugins.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-6433) Connect distributed workers should fail if their config is "incompatible" with leader's

2018-01-08 Thread Randall Hauch (JIRA)

[ https://issues.apache.org/jira/browse/KAFKA-6433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16317329#comment-16317329 ]

Randall Hauch commented on KAFKA-6433:
--

KAFKA-5505 will likely be implemented as a change/evolution of Connect's 
rebalance subprotocol, and the requirements to solve this issue should be 
considered as part of that effort.



[jira] [Commented] (KAFKA-6433) Connect distributed workers should fail if their config is "incompatible" with leader's

2018-01-08 Thread Randall Hauch (JIRA)

[ https://issues.apache.org/jira/browse/KAFKA-6433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16317325#comment-16317325 ]

Randall Hauch commented on KAFKA-6433:
--

Incidentally, it would not help to change what Connect stores in the status 
topic, because if the workers are using different status topics they would read 
different status information. A better option might be to ship this additional 
information in the metadata used in Connect's rebalance subprotocol. We can't 
do this today, but we're talking about evolving the protocol for incremental 
rebalance. It would be great to add some worker metadata during that evolution 
and to tolerate optional metadata, so that more fields can be added later 
without requiring every worker in the cluster to supply them.
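
A rough sketch of what "tolerating optional metadata" could mean follows. It is not the actual rebalance subprotocol: the `WorkerMetadata` class, its field names, and the string-map representation are all assumptions made for illustration. A field counts as a conflict only when both workers sent it and the values differ, so a worker that omits a field, or sends one the leader does not check, is still accepted.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical container for per-worker metadata; the real subprotocol
// uses its own serialized format, which is not modeled here.
final class WorkerMetadata {
    private final Map<String, String> fields;

    WorkerMetadata(Map<String, String> fields) {
        this.fields = Map.copyOf(fields);
    }

    // Empty when the worker did not send this field at all.
    Optional<String> get(String name) {
        return Optional.ofNullable(fields.get(name));
    }

    // Compares only the named fields. A field that either side omitted
    // is skipped, and fields outside `checked` are ignored entirely.
    static boolean compatible(WorkerMetadata leader,
                              WorkerMetadata member,
                              Set<String> checked) {
        for (String name : checked) {
            Optional<String> a = leader.get(name);
            Optional<String> b = member.get(name);
            if (a.isPresent() && b.isPresent() && !a.get().equals(b.get())) {
                return false;
            }
        }
        return true;
    }
}
```

Under these assumptions, a worker that predates a given field would not be rejected during a rolling upgrade, while two workers that both report different status topics would be.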
