[ https://issues.apache.org/jira/browse/KAFKA-17751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Juha Mynttinen updated KAFKA-17751:
-----------------------------------
    Description:
Hey,

I'm using 3.9.0 RC0.

I noticed that formatting a simple three-node controller cluster with --initial-controllers and then starting the controllers leads to a situation where the non-leader voters consume a lot of CPU.

Here are the steps to reproduce. The needed configuration files are attached.

Clean up and set up the environment.

rm -rf /tmp/controllers && \
mkdir -p /tmp/controllers/c1 && \
mkdir -p /tmp/controllers/c2 && \
mkdir -p /tmp/controllers/c3

export KAFKA_HOME=<your_kafka_3_9_home>

Format the controllers.

$KAFKA_HOME/bin/kafka-storage.sh format --cluster-id 00000000-0000-0000-0000-000000000001 --initial-controllers 1001@localhost:10001:AAAAAAAAAAEAAAAAAAAAAA,1002@localhost:10002:AAAAAAAAAAEAAAAAAAAAAA,1003@localhost:10003:AAAAAAAAAAEAAAAAAAAAAA --config c1.properties

$KAFKA_HOME/bin/kafka-storage.sh format --cluster-id 00000000-0000-0000-0000-000000000001 --initial-controllers 1001@localhost:10001:AAAAAAAAAAEAAAAAAAAAAA,1002@localhost:10002:AAAAAAAAAAEAAAAAAAAAAA,1003@localhost:10003:AAAAAAAAAAEAAAAAAAAAAA --config c2.properties

$KAFKA_HOME/bin/kafka-storage.sh format --cluster-id 00000000-0000-0000-0000-000000000001 --initial-controllers 1001@localhost:10001:AAAAAAAAAAEAAAAAAAAAAA,1002@localhost:10002:AAAAAAAAAAEAAAAAAAAAAA,1003@localhost:10003:AAAAAAAAAAEAAAAAAAAAAA --config c3.properties

Start the controllers, each in a separate terminal.

$KAFKA_HOME/bin/kafka-run-class.sh -name kafkaService kafka.Kafka c1.properties

$KAFKA_HOME/bin/kafka-run-class.sh -name kafkaService kafka.Kafka c2.properties

$KAFKA_HOME/bin/kafka-run-class.sh -name kafkaService kafka.Kafka c3.properties

Observe that two of the controllers have CPU usage at 100%. If you check which PID is which, you can see that the two processes with elevated CPU are the non-leader voters. The CPU usage of the leader is fine.

I did some profiling in a slightly different environment. The screenshot is attached.
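The "check which PID is which" step in the description can be sketched as below. This is only an illustration, not part of the original report: summarize_controllers is a hypothetical helper name, and it assumes each controller was started with its properties file as the last command-line argument, as in the start commands in the description.

```shell
#!/bin/sh
# Hypothetical helper (not from the report): reads `ps -A -o pid,pcpu,args`
# style lines on stdin and prints "PID %CPU CONFIG" for every process whose
# command line contains kafka.Kafka. CONFIG is taken from the last argument,
# which assumes the properties file is passed last.
summarize_controllers() {
    awk '/kafka\.Kafka/ { print $1, $2, $NF }'
}

# Live usage against the running controllers:
#   ps -A -o pid,pcpu,args | summarize_controllers
```

Note that the %CPU column reported by ps is an average on many systems rather than an instantaneous value, so a tool such as top may be a better fit for watching the busy processes over time. To see which node is currently the leader, kafka-metadata-quorum.sh with describe --status should report it.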
  was:
Hey,

I'm using 3.9.0 RC0.

I noticed that formatting a simple three-node controller cluster with --initial-controllers and then starting the controllers leads to a situation where the non-leader voters consume a lot of CPU.

Here are the steps to reproduce. The needed configuration files are attached.

Clean up and set up the environment.

rm -rf /tmp/controllers && \
mkdir -p /tmp/controllers/c1 && \
mkdir -p /tmp/controllers/c2 && \
mkdir -p /tmp/controllers/c3 && \
mkdir -p /tmp/controllers/c4

export KAFKA_HOME=<your_kafka_3_9_home>

Format the controllers.

$KAFKA_HOME/bin/kafka-storage.sh format --cluster-id 00000000-0000-0000-0000-000000000001 --initial-controllers 1001@localhost:10001:AAAAAAAAAAEAAAAAAAAAAA,1002@localhost:10002:AAAAAAAAAAEAAAAAAAAAAA,1003@localhost:10003:AAAAAAAAAAEAAAAAAAAAAA --config c1.properties

$KAFKA_HOME/bin/kafka-storage.sh format --cluster-id 00000000-0000-0000-0000-000000000001 --initial-controllers 1001@localhost:10001:AAAAAAAAAAEAAAAAAAAAAA,1002@localhost:10002:AAAAAAAAAAEAAAAAAAAAAA,1003@localhost:10003:AAAAAAAAAAEAAAAAAAAAAA --config c2.properties

$KAFKA_HOME/bin/kafka-storage.sh format --cluster-id 00000000-0000-0000-0000-000000000001 --initial-controllers 1001@localhost:10001:AAAAAAAAAAEAAAAAAAAAAA,1002@localhost:10002:AAAAAAAAAAEAAAAAAAAAAA,1003@localhost:10003:AAAAAAAAAAEAAAAAAAAAAA --config c3.properties

Start the controllers, each in a separate terminal.

$KAFKA_HOME/bin/kafka-run-class.sh -name kafkaService kafka.Kafka c1.properties

$KAFKA_HOME/bin/kafka-run-class.sh -name kafkaService kafka.Kafka c2.properties

$KAFKA_HOME/bin/kafka-run-class.sh -name kafkaService kafka.Kafka c3.properties

Observe that two of the controllers have CPU usage at 100%. If you check which PID is which, you can see that the two processes with elevated CPU are the non-leader voters. The CPU usage of the leader is fine.

I did some profiling in a slightly different environment. The screenshot is attached.
> Contoller high CPU when formatted with --initial-controllers
> -------------------------------------------------------------
>
>                 Key: KAFKA-17751
>                 URL: https://issues.apache.org/jira/browse/KAFKA-17751
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 3.9.0
>            Reporter: Juha Mynttinen
>            Priority: Major
>         Attachments: Screenshot 2024-10-09 at 9.15.06.png, c1.properties, c2.properties, c3.properties