Hi Paul,

Thanks for the KIP. A few comments.
To me, the biggest question here is whether we can fix this behavior without adding a config. In particular, today we don't even set the client.id for the producer and consumer at all, right? The *only* way it is set is if you include an override in the worker config, but in that case you are explicitly opting in with a `producer.` or `consumer.` prefix, i.e. the settings are `producer.client.id` and `consumer.client.id`. Otherwise, I think we're getting the default behavior, where we generate unique, per-process IDs, via this logic:

https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java#L662-L664

If that's the case, would it maybe be possible to compatibly change the default to use task IDs in the client ID, but only if we don't see an existing override from the worker config? This would only change the behavior when someone is using the default, but since the default is effectively a random ID that is useless for monitoring metrics, presumably this wouldn't affect any existing users. I think that would avoid having to introduce the config, give better out-of-the-box behavior, and still be a safe, compatible change to make.

Other than that, just two minor comments.

On the config naming: I'm not sure about a better name, but the config name could be a bit clearer if we need to have it. Maybe something including "task", like "task.based.client.ids" (or change the type to an enum, make it something like task.client.ids=[default|task], and leave it open for extension in the future if needed).

Finally, you have this:

> *"Allow overriding client.id on a per-connector basis"*
>
> This is a much more complex change, and would require individual
> connectors to be updated to support the change. In contrast, the proposed
> approach would immediately allow detailed consumer/producer monitoring for
> all existing connectors.
I don't think this is quite accurate. I think the reason to reject is that, for your particular requirement for metrics, it simply doesn't give enough granularity (there's only one value per entire connector), but it doesn't require any changes to connectors. The framework allocates all of these, and there are already framework-defined config values that all connectors share (some for only sinks or sources), so the framework can handle all of this without changes to connectors. Further, with connector-specific overrides you could get task-specific values if interpolation were supported in the config value (as we now do with managed secrets). For example, it could support something like client.id=connector-${taskId}, and the task ID would be substituted automatically into the string. I don't necessarily like that solution (it seems complicated and not a great user experience), but it could work.

-Ewen

On Thu, Dec 20, 2018 at 5:05 PM Paul Davidson <pdavid...@salesforce.com> wrote:

> Hi everyone,
>
> I would like to start a discussion around the following KIP:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-411%3A+Add+option+to+make+Kafka+Connect+task+client+ID+values+unique
>
> This proposes a small change to allow Kafka Connect the option to
> auto-generate unique client IDs for each task. This enables granular
> monitoring of the producer / consumer client in each task.
>
> Feedback is appreciated, thanks in advance!
>
> Paul