David Arthur created KAFKA-16007:
------------------------------------

             Summary: ZK migrations can be slow for large clusters
                 Key: KAFKA-16007
                 URL: https://issues.apache.org/jira/browse/KAFKA-16007
             Project: Kafka
          Issue Type: Improvement
          Components: controller, kraft
            Reporter: David Arthur
            Assignee: David Arthur
             Fix For: 3.7.0, 3.6.2

On a large cluster with many single-partition topics, the ZK to KRaft migration took about 37 minutes (2245862 ms):

{code}
[KRaftMigrationDriver id=9990] Completed migration of metadata from ZooKeeper to KRaft. 157396 records were generated in 2245862 ms across 67132 batches. The record types were {TOPIC_RECORD=66282, PARTITION_RECORD=72067, CONFIG_RECORD=17116, PRODUCER_IDS_RECORD=1, ACCESS_CONTROL_ENTRY_RECORD=1930}. The current metadata offset is now 332267 with an epoch of 19. Saw 36 brokers in the migrated metadata [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35].
{code}

This is a result of how we generate batches of records when traversing the ZK tree. Since we are now using metadata transactions for the migration, we can re-batch these without any consistency problems.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
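
For illustration only: the log in this issue reports 157396 records across 67132 batches, an average of roughly 2.3 records per batch. The sketch below shows one way such small batches could be coalesced into larger ones while preserving record order. The class name `ReBatcher`, the method `rebatch`, and the `minBatchSize` parameter are hypothetical and are not taken from the Kafka code base; the actual change for this issue may be structured differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Kafka: regroups many small batches into
// batches holding at least minBatchSize records, preserving record order.
final class ReBatcher {
    static <T> List<List<T>> rebatch(List<List<T>> smallBatches, int minBatchSize) {
        if (minBatchSize <= 0) {
            throw new IllegalArgumentException("minBatchSize must be positive");
        }
        List<List<T>> out = new ArrayList<>();
        List<T> current = new ArrayList<>();
        for (List<T> batch : smallBatches) {
            // Incoming batches are never split, only merged with their neighbours.
            current.addAll(batch);
            if (current.size() >= minBatchSize) {
                out.add(current);
                current = new ArrayList<>();
            }
        }
        // Emit whatever is left over as a final, possibly smaller, batch.
        if (!current.isEmpty()) {
            out.add(current);
        }
        return out;
    }
}
```

Merging across the original batch boundaries in this way assumes, as the description states, that the whole migration is wrapped in a metadata transaction, so the boundaries between individual batches no longer matter for consistency.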