Anil Dasari created KAFKA-18669:
-----------------------------------

             Summary: Connector is restarted before it is started
                 Key: KAFKA-18669
                 URL: https://issues.apache.org/jira/browse/KAFKA-18669
             Project: Kafka
          Issue Type: Bug
          Components: connect
    Affects Versions: 3.3.1
            Reporter: Anil Dasari
         Attachments: trace-failed-9e08297d427e4e36be3909a393f1a4b4.csv, trace-successful-2465899102e54ff4bde6bd03b5c72a66.csv

I am setting up multiple Kafka Connect (KC) clusters on Amazon ECS, each with 2 nodes and one connector per cluster. When all KC clusters and connectors are created simultaneously, one of the connectors unexpectedly restarts. The issue is reproducible with a minimum of 4 clusters. Only one cluster's connector is restarted, even when the cluster count is increased to 10.

I am unable to determine the root cause of this restart. It seems that the incremental cooperative rebalancing process is handling the request as if a configuration update were present, even though no actual config changes occurred during the connector's startup phase.

Environment details
# ECS KC cluster with a node count of 2
# One connector per KC cluster
# KC container Kafka version is 7.3.1-ccs (i.e. Kafka 3.3.1)
# Kafka version: 3.7.x

Logs: (timestamp, log message, level, thread)
{code:java}
"Jan 29, 2025 @ 09:57:35.936","[Worker clientId=connect-1, groupId=pg-group-cdc-9e08297d427e4e36be3909a393f1a4b4] Starting connector c3ff31cf87ff4de0b63f3669546bf283",INFO,"StartAndStopExecutor-connect-1-1","i-0c2dd435cdbcec182"
"Jan 29, 2025 @ 09:57:35.938","Resolving value for path dbconfig/cdc/9e08297d427e4e36be3909a393f1a4b4 & property aliases: database_user,database_name,database_port,database_password,database_hostname",INFO,"StartAndStopExecutor-connect-1-1","i-0c2dd435cdbcec182"
"Jan 29, 2025 @ 09:57:38.414","Creating connector c3ff31cf87ff4de0b63f3669546bf283 of type com.abc.cdc.connect.postgres.CdcPostgresConnector",INFO,"StartAndStopExecutor-connect-1-1","i-0c2dd435cdbcec182"
"Jan 29, 2025 @ 09:57:38.426","Instantiated connector c3ff31cf87ff4de0b63f3669546bf283 with version 2.4.2.Final-cdc of type class com.abc.cdc.connect.postgres.CdcPostgresConnector",INFO,"StartAndStopExecutor-connect-1-1","i-0c2dd435cdbcec182"
"Jan 29, 2025 @ 09:57:38.432","Finished creating connector c3ff31cf87ff4de0b63f3669546bf283",INFO,"StartAndStopExecutor-connect-1-1","i-0c2dd435cdbcec182"
"Jan 29, 2025 @ 09:57:38.432","[Worker clientId=connect-1, groupId=pg-group-cdc-9e08297d427e4e36be3909a393f1a4b4] Finished starting connectors and tasks",INFO,"DistributedHerder-connect-1-1","i-0c2dd435cdbcec182"
"Jan 29, 2025 @ 09:57:38.432","[Worker clientId=connect-1, groupId=pg-group-cdc-9e08297d427e4e36be3909a393f1a4b4] Handling config updates with incremental cooperative rebalancing",TRACE,"DistributedHerder-connect-1-1","i-0c2dd435cdbcec182"
"Jan 29, 2025 @ 09:57:38.432","[Worker clientId=connect-1, groupId=pg-group-cdc-9e08297d427e4e36be3909a393f1a4b4] Requesting rebalance due to reconfiguration of tasks (needsReconfigRebalance: true)",DEBUG,"DistributedHerder-connect-1-1","i-0c2dd435cdbcec182"
"Jan 29, 2025 @ 09:57:38.432","[Worker clientId=connect-1, groupId=pg-group-cdc-9e08297d427e4e36be3909a393f1a4b4] Request joining group due to: connect worker requested rejoin",DEBUG,"DistributedHerder-connect-1-1","i-0c2dd435cdbcec182"
"Jan 29, 2025 @ 09:57:38.432","Running with cdc-debezium 2.4.2.Final...",INFO,"connector-thread-c3ff31cf87ff4de0b63f3669546bf283","i-0c2dd435cdbcec182"
"Jan 29, 2025 @ 09:57:38.432","[Worker clientId=connect-1, groupId=pg-group-cdc-9e08297d427e4e36be3909a393f1a4b4] Processing connector config updates; currently-owned connectors are [c3ff31cf87ff4de0b63f3669546bf283], and to-be-updated connectors are [c3ff31cf87ff4de0b63f3669546bf283]",TRACE,"DistributedHerder-connect-1-1","i-0c2dd435cdbcec182"
"Jan 29, 2025 @ 09:57:38.433","[Worker clientId=connect-1, groupId=pg-group-cdc-9e08297d427e4e36be3909a393f1a4b4] Handling connector-only config update by restarting connector c3ff31cf87ff4de0b63f3669546bf283",INFO,"DistributedHerder-connect-1-1","i-0c2dd435cdbcec182"
"Jan 29, 2025 @ 09:57:38.433","Stopping connector c3ff31cf87ff4de0b63f3669546bf283",INFO,"DistributedHerder-connect-1-1","i-0c2dd435cdbcec182"
{code}

Both the successful and failed KC connector logs are attached.
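
For anyone comparing the two attached traces, the pattern in the failed run (a connector-only config update restart arriving right after "Finished creating connector") can be flagged mechanically. The following is a rough sketch only, not part of Kafka Connect. It assumes the CSV columns follow the order shown in the log excerpt, with the timestamp in column 1 and the message in column 2. The function name `find_early_restarts`, the 5-second window, and the embedded `SAMPLE` rows (two lines copied from the excerpt) are illustrative choices.

```python
import csv
import io
import re
from datetime import datetime

# Timestamp layout as it appears in the excerpt, e.g. "Jan 29, 2025 @ 09:57:38.432".
TS_FORMAT = "%b %d, %Y @ %H:%M:%S.%f"

# Message fragments taken verbatim from the log excerpt.
CREATED = re.compile(r"Finished creating connector (\S+)")
RESTARTED = re.compile(
    r"Handling connector-only config update by restarting connector (\S+)"
)


def find_early_restarts(rows, window_seconds=5.0):
    """Return (connector, seconds_after_creation) for each config-update
    restart logged within window_seconds of that connector's creation."""
    parsed = []
    for row in rows:
        if len(row) < 2:
            continue  # blank or malformed row
        try:
            ts = datetime.strptime(row[0], TS_FORMAT)
        except ValueError:
            continue  # header row or unexpected timestamp layout
        parsed.append((ts, row[1]))
    parsed.sort(key=lambda item: item[0])  # exports may not be chronological

    created_at = {}
    hits = []
    for ts, message in parsed:
        match = CREATED.search(message)
        if match:
            created_at[match.group(1)] = ts
            continue
        match = RESTARTED.search(message)
        if match and match.group(1) in created_at:
            delta = (ts - created_at[match.group(1)]).total_seconds()
            if 0 <= delta <= window_seconds:
                hits.append((match.group(1), delta))
    return hits


# Two rows copied from the excerpt above, used as a small self-check.
SAMPLE = '''
"Jan 29, 2025 @ 09:57:38.432","Finished creating connector c3ff31cf87ff4de0b63f3669546bf283",INFO,"StartAndStopExecutor-connect-1-1","i-0c2dd435cdbcec182"
"Jan 29, 2025 @ 09:57:38.433","[Worker clientId=connect-1, groupId=pg-group-cdc-9e08297d427e4e36be3909a393f1a4b4] Handling connector-only config update by restarting connector c3ff31cf87ff4de0b63f3669546bf283",INFO,"DistributedHerder-connect-1-1","i-0c2dd435cdbcec182"
'''

if __name__ == "__main__":
    for name, delta in find_early_restarts(csv.reader(io.StringIO(SAMPLE))):
        print(f"{name} restarted {delta:.3f}s after creation")
```

To run it against an attachment, replace `io.StringIO(SAMPLE)` with an open file handle for the CSV (for example `open(path, newline="")`). The script only locates the symptom in a trace; it does not explain why the herder saw a config update.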