Anil Dasari created KAFKA-18669:
-----------------------------------

             Summary: Connector is restarted before it is started
                 Key: KAFKA-18669
                 URL: https://issues.apache.org/jira/browse/KAFKA-18669
             Project: Kafka
          Issue Type: Bug
          Components: connect
    Affects Versions: 3.3.1
            Reporter: Anil Dasari
         Attachments: trace-failed-9e08297d427e4e36be3909a393f1a4b4.csv, 
trace-successful-2465899102e54ff4bde6bd03b5c72a66.csv

I am setting up multiple Kafka Connect (KC) clusters on Amazon ECS, each with 2 
nodes and one connector per cluster.

When all KC clusters and connectors are created simultaneously, one of the 
connectors unexpectedly restarts. The issue is reproducible with a minimum of 4 
clusters. Only one cluster's connector is restarted, even when the cluster 
count is increased to 10. 

I am unable to determine the root cause of this restart. It seems that the 
incremental cooperative rebalancing logic treats the connector as if its 
configuration had been updated, even though no config changes occurred during 
the connector's startup phase.

Environment details
 # ECS KC cluster with node count 2
 # One connector per KC cluster
 # KC container Kafka version is 7.3.1-ccs (i.e. Kafka 3.3.1)
 # Kafka version: 3.7.x

 

Logs (columns: timestamp, log message, level, thread, instance ID):

 
{code:java}
"Jan 29, 2025 @ 09:57:35.936","[Worker clientId=connect-1, 
groupId=pg-group-cdc-9e08297d427e4e36be3909a393f1a4b4] Starting connector 
c3ff31cf87ff4de0b63f3669546bf283",INFO,"StartAndStopExecutor-connect-1-1","i-0c2dd435cdbcec182"

"Jan 29, 2025 @ 09:57:35.938","Resolving value for path 
dbconfig/cdc/9e08297d427e4e36be3909a393f1a4b4 & property aliases: 
database_user,database_name,database_port,database_password,database_hostname",INFO,"StartAndStopExecutor-connect-1-1","i-0c2dd435cdbcec182"

"Jan 29, 2025 @ 09:57:38.414","Creating connector 
c3ff31cf87ff4de0b63f3669546bf283 of type 
com.abc.cdc.connect.postgres.CdcPostgresConnector",INFO,"StartAndStopExecutor-connect-1-1","i-0c2dd435cdbcec182"
 

"Jan 29, 2025 @ 09:57:38.426","Instantiated connector 
c3ff31cf87ff4de0b63f3669546bf283 with version 2.4.2.Final-cdc of type class 
com.abc.cdc.connect.postgres.CdcPostgresConnector",INFO,"StartAndStopExecutor-connect-1-1","i-0c2dd435cdbcec182"

"Jan 29, 2025 @ 09:57:38.432","Finished creating connector 
c3ff31cf87ff4de0b63f3669546bf283",INFO,"StartAndStopExecutor-connect-1-1","i-0c2dd435cdbcec182"

"Jan 29, 2025 @ 09:57:38.432","[Worker clientId=connect-1, 
groupId=pg-group-cdc-9e08297d427e4e36be3909a393f1a4b4] Finished starting 
connectors and tasks",INFO,"DistributedHerder-connect-1-1","i-0c2dd435cdbcec182"

"Jan 29, 2025 @ 09:57:38.432","[Worker clientId=connect-1, 
groupId=pg-group-cdc-9e08297d427e4e36be3909a393f1a4b4] Handling config updates 
with incremental cooperative 
rebalancing",TRACE,"DistributedHerder-connect-1-1","i-0c2dd435cdbcec182"

"Jan 29, 2025 @ 09:57:38.432","[Worker clientId=connect-1, 
groupId=pg-group-cdc-9e08297d427e4e36be3909a393f1a4b4] Requesting rebalance due 
to reconfiguration of tasks (needsReconfigRebalance: 
true)",DEBUG,"DistributedHerder-connect-1-1","i-0c2dd435cdbcec182"

"Jan 29, 2025 @ 09:57:38.432","[Worker clientId=connect-1, 
groupId=pg-group-cdc-9e08297d427e4e36be3909a393f1a4b4] Request joining group 
due to: connect worker requested 
rejoin",DEBUG,"DistributedHerder-connect-1-1","i-0c2dd435cdbcec182"

"Jan 29, 2025 @ 09:57:38.432","Running with cdc-debezium 
2.4.2.Final...",INFO,"connector-thread-c3ff31cf87ff4de0b63f3669546bf283","i-0c2dd435cdbcec182"

"Jan 29, 2025 @ 09:57:38.432","[Worker clientId=connect-1, 
groupId=pg-group-cdc-9e08297d427e4e36be3909a393f1a4b4] Processing connector 
config updates; currently-owned connectors are 
[c3ff31cf87ff4de0b63f3669546bf283], and to-be-updated connectors are 
[c3ff31cf87ff4de0b63f3669546bf283]",TRACE,"DistributedHerder-connect-1-1","i-0c2dd435cdbcec182"

"Jan 29, 2025 @ 09:57:38.433","[Worker clientId=connect-1, 
groupId=pg-group-cdc-9e08297d427e4e36be3909a393f1a4b4] Handling connector-only 
config update by restarting connector 
c3ff31cf87ff4de0b63f3669546bf283",INFO,"DistributedHerder-connect-1-1","i-0c2dd435cdbcec182"

"Jan 29, 2025 @ 09:57:38.433","Stopping connector 
c3ff31cf87ff4de0b63f3669546bf283",INFO,"DistributedHerder-connect-1-1","i-0c2dd435cdbcec182"{code}
I have attached logs from both a successful and a failed KC connector startup.
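
To make the pattern easier to spot in larger traces, here is a minimal sketch that scans log messages for a connector that is restarted by a "connector-only config update" after it was started. It is not part of Kafka Connect; the class name EarlyRestartFinder and its method are hypothetical, and it assumes each log message is available as one unwrapped string, in chronological order, with the same wording as the excerpt above.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading the traces; not a Kafka Connect API.
public class EarlyRestartFinder {
    // Message wording copied from the log excerpt in this report.
    private static final Pattern STARTING =
        Pattern.compile("Starting connector ([\\w-]+)");
    private static final Pattern RESTARTING = Pattern.compile(
        "Handling connector-only config update by restarting connector ([\\w-]+)");

    // Returns the connectors that were started and later restarted
    // through the config-update path, in order of first restart.
    public static List<String> findRestartedAfterStart(List<String> messages) {
        Set<String> started = new HashSet<>();
        List<String> restarted = new ArrayList<>();
        for (String message : messages) {
            Matcher restart = RESTARTING.matcher(message);
            if (restart.find()) {
                String id = restart.group(1);
                if (started.contains(id) && !restarted.contains(id)) {
                    restarted.add(id);
                }
                continue;
            }
            Matcher start = STARTING.matcher(message);
            if (start.find()) {
                started.add(start.group(1));
            }
        }
        return restarted;
    }

    public static void main(String[] args) {
        // Shortened messages taken from the excerpt above.
        List<String> messages = List.of(
            "Starting connector c3ff31cf87ff4de0b63f3669546bf283",
            "Finished creating connector c3ff31cf87ff4de0b63f3669546bf283",
            "Handling connector-only config update by restarting connector "
                + "c3ff31cf87ff4de0b63f3669546bf283");
        System.out.println(findRestartedAfterStart(messages));
    }
}
```

Running the same check over the successful trace should report no connectors if the restart is indeed the only difference between the two runs.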

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
