Hi Kevin,

Thanks for the KIP, and excuse my delayed response.

JS1: Can you clarify that this KIP removes the need for all Kafka nodes to be formatted prior to starting Kafka, but doesn't prevent users from formatting their nodes with a cluster ID if they prefer? This is especially needed for Kafka nodes formatted for a cluster using an MV that doesn't support this feature.
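For context, this is the formatting step that the KIP makes optional, as it looks with the current kafka-storage tool (the cluster ID here is just a generated example value):

```shell
# Generate a cluster ID and format this node's log directories with it.
# Today every node must run this before Kafka can start.
KAFKA_CLUSTER_ID="$(bin/kafka-storage.sh random-uuid)"
bin/kafka-storage.sh format -t "$KAFKA_CLUSTER_ID" -c config/kraft/server.properties
```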
JS2: How are you planning to implement "kafka-storage format --clusterid YYY --standalone"? Is that going to behave like it does today, by writing the cluster ID to the meta.properties files? Or are you planning to write the cluster ID using a ClusterIdRecord to the bootstrap.checkpoint (or the 0-0.checkpoint after KIP-1170)?

JS3: In one of your replies you say "Discovering the cluster id value for the first time would only require a single FetchSnapshot or a Fetch of the bootstrap metadata records." This is not entirely accurate. The best we can say is that a broker needs to catch up to the HWM before it can send a registration request to the active controller or start some of its internal components. However, the broker already had this requirement prior to this KIP, so it is not new.

JS4: In the KIP you mention "if meta.properties does not exist and the node is a bootstrap controller, throw a runtime exception." Can you explain how you plan to implement this? One important aspect to consider is that in KRaft, voters (controllers) are identified by the node ID and directory ID. A node can recover from a disk failure by coming back with the same node ID but a different directory ID. In this case, the controller should auto-recover if the auto-join feature is enabled.

JS5: In the KIP you mention "One detail here is that observer controllers with auto-join must wait until they have a cluster id before trying to add or remove themselves." I understand the reason for this requirement: if a node auto-joins the controller cluster, you must guarantee that it knows the cluster ID in case it becomes the leader and needs to write the ClusterIdRecord. Can you elaborate on your implementation plan?

JS6: In the KIP you mention "This can be implemented as a MetadataPublisher that registers to the raft client alongside the MetadataLoader." Metadata publishers don't register with the KRaft client; RaftClient.Listeners do. Metadata publishers register with the metadata loader instead.

JS7: One complexity is that there is a meta.properties file per log directory and metadata log directory. This means that in the stable case the cluster ID exists in all of the meta.properties files. Unfortunately, this may not be the case for several reasons: 1) the disk was replaced, 2) a new disk was added, or 3) the write operation was only partially successful. How do you plan to handle this case? Consider that the controller and the broker can run in the same JVM and use a log directory different from the metadata log directory. Controllers only read and write the metadata log directory.

JS8: In the KIP you mention "Learning of a HWM from the leader, which the leader allows for because it will send valid fetch responses back to nodes that do not have a cluster id." One implementation complexity is that KRaft can discover the HWM and send a handleCommit event without having fetched all of the data up to the HWM. What KRaft guarantees is that the active leader will not receive a handleLeaderChange event until it has caught up to the leader's epoch. How do you plan to implement this?

Thanks,
-- 
-José
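P.S. For JS7, here is a rough sketch of the kind of reconciliation I think the KIP needs to spell out. The names are mine, not from the KIP or the codebase; it just shows the decision given the cluster.id value found (or not found) in each directory's meta.properties:

```java
import java.util.*;

// Hypothetical sketch: given the cluster.id found in each log
// directory's meta.properties (empty when the file or the property
// is missing), decide what the node should do at startup.
public class ClusterIdResolver {
    public static Optional<String> resolve(Map<String, Optional<String>> metaProps) {
        Set<String> ids = new HashSet<>();
        for (Optional<String> id : metaProps.values()) {
            id.ifPresent(ids::add);
        }
        if (ids.size() > 1) {
            // Conflicting cluster IDs across directories: refuse to start.
            throw new IllegalStateException("Inconsistent cluster.id values: " + ids);
        }
        // One distinct ID: use it (and presumably backfill the missing
        // meta.properties files). Zero: discover the ID from the leader.
        return ids.stream().findFirst();
    }
}
```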
