Chris Egerton created KAFKA-10876:
-------------------------------------
Summary: Duplicate connector/task create requests lead to
incorrect FAILED status
Key: KAFKA-10876
URL: https://issues.apache.org/jira/browse/KAFKA-10876
Project: Kafka
Issue Type: Bug
Components: KafkaConnect
Reporter: Chris Egerton
If a Connect worker tries to start a connector or task that it is already
running, an error will be logged and the connector/task will be marked as
{{FAILED}}. This logic is implemented in several places:
* [https://github.com/apache/kafka/blob/300909d9e60eb1d5e80f4d744d3662a105ac0c15/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/Worker.java#L257-L262]
* [https://github.com/apache/kafka/blob/300909d9e60eb1d5e80f4d744d3662a105ac0c15/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/Worker.java#L299-L306]
* [https://github.com/apache/kafka/blob/300909d9e60eb1d5e80f4d744d3662a105ac0c15/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/Worker.java#L511-L512]
* [https://github.com/apache/kafka/blob/300909d9e60eb1d5e80f4d744d3662a105ac0c15/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/Worker.java#L570-L572]
Although it's certainly abnormal for a worker to run into this case and an
{{ERROR}}-level log message is warranted when it occurs, the connector/task
should not be marked as {{FAILED}}, as an instance of that connector/task is
still running on the worker.
Either the worker logic should be updated to avoid marking connectors/tasks as
{{FAILED}} in this case, or it should manually halt the existing connector/task
before creating a new instance in its place. The first option is easier and
more intuitive. However, if it's ever possible that the already-running
connector/task instance has an outdated configuration while the to-be-created
connector/task has an up-to-date configuration, only the second option would
behave correctly.
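
The difference between the two options can be sketched roughly as follows. This is a simplified, hypothetical model and not the actual {{Worker}} code: the class {{DuplicateStartSketch}}, its methods, and the use of plain strings for configurations are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical, simplified stand-in for a Connect worker; not the real Worker class. */
public class DuplicateStartSketch {
    public enum Status { RUNNING, FAILED }

    // name -> configuration of the instance currently running on this worker
    private final Map<String, String> running = new ConcurrentHashMap<>();
    // name -> last reported status
    private final Map<String, Status> statuses = new ConcurrentHashMap<>();

    /** Option 1: log an error, keep the existing instance, and leave its status untouched. */
    public boolean startKeepExisting(String name, String config) {
        String existing = running.putIfAbsent(name, config);
        if (existing != null) {
            System.err.println("ERROR: " + name + " is already running; ignoring duplicate start request");
            return false; // status is deliberately NOT set to FAILED
        }
        statuses.put(name, Status.RUNNING);
        return true;
    }

    /** Option 2: halt the existing instance (if any) and start a new one in its place. */
    public boolean startReplaceExisting(String name, String config) {
        // put() returns the previous value, so the swap is a single atomic map operation
        String existing = running.put(name, config);
        if (existing != null) {
            System.err.println("ERROR: " + name + " is already running; halting it before restart");
            // Assumption: a real worker would stop the old instance and await its shutdown here.
        }
        statuses.put(name, Status.RUNNING);
        return true;
    }

    public Status status(String name) { return statuses.get(name); }

    public String runningConfig(String name) { return running.get(name); }

    public static void main(String[] args) {
        DuplicateStartSketch worker = new DuplicateStartSketch();
        worker.startKeepExisting("my-connector", "config-v1");
        worker.startKeepExisting("my-connector", "config-v2"); // duplicate request
        System.out.println(worker.status("my-connector") + " with " + worker.runningConfig("my-connector"));
        worker.startReplaceExisting("my-connector", "config-v2");
        System.out.println(worker.status("my-connector") + " with " + worker.runningConfig("my-connector"));
    }
}
```

In this sketch, the first option leaves the instance running with its original configuration, while the second ends up running the newer one, which is the trade-off described above.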
--
This message was sent by Atlassian Jira
(v8.3.4#803005)