Hi Luke and Colin,

On Mon, Apr 7, 2025 at 10:29 PM Luke Chen <show...@gmail.com> wrote:
> That's why we were discussing if there's any way to "force" recover the
> scenario, even if it's possible to have data loss.

Yes. There is a way. They need to configure a controller cluster that matches the voter set in the cluster metadata partition. That means a controller cluster whose node ids, directory ids, snapshot, and log segments all match a consistent cluster metadata partition. They can do that manually today. I think that Colin is suggesting a tool to make this easier.

The user should understand that these manual operations are extremely dangerous and can result in data loss in the cluster metadata partition. A Kafka cluster cannot recover from loss of data in the cluster metadata partition. For example, partition leader epochs can decrease because of data loss in the cluster metadata partition, and Kafka brokers don't handle decreasing partition leader epochs.

If the user doesn't understand KRaft's protocol to some degree, it is unlikely that they can blindly follow a set of instructions and be successful in their recovery. I am hesitant to give users the impression that Kafka can tolerate and recover from data loss in the cluster metadata partition.

What do you think?

--
-José