> Re: JS2

I am not sure why you are saying that Strimzi has a limitation and doesn't provide a stable network identity. Strimzi uses a headless service, so all brokers and controllers get a network identity and are directly reachable through a regular DNS name such as my-cluster-controller-0.my-namespace.svc.dns-domain (where the DNS domain is usually cluster.local by default, but can be defined by the user depending on the infrastructure).

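To make the naming concrete, here is a minimal sketch of how the per-pod DNS names behind a headless service can be composed and joined into a value for controller.quorum.bootstrap.servers. It follows the Kubernetes StatefulSet convention, which places the governing service name between the pod name and the namespace; the name quoted above leaves that label out. The service name, the port and the helper names are illustrative assumptions, not taken from Strimzi.

```python
from typing import List


def pod_dns_name(pod: str, service: str, namespace: str,
                 cluster_domain: str = "cluster.local") -> str:
    """Stable DNS name of a StatefulSet pod governed by a headless service."""
    return f"{pod}.{service}.{namespace}.svc.{cluster_domain}"


def controller_endpoints(cluster: str, replicas: int, service: str,
                         namespace: str, port: int,
                         cluster_domain: str = "cluster.local") -> List[str]:
    """One host:port endpoint per controller pod, ordered by pod ordinal."""
    endpoints = []
    for ordinal in range(replicas):
        # Pod naming "<cluster>-controller-<ordinal>" mirrors the example name above.
        pod = f"{cluster}-controller-{ordinal}"
        host = pod_dns_name(pod, service, namespace, cluster_domain)
        endpoints.append(f"{host}:{port}")
    return endpoints


def bootstrap_servers(endpoints: List[str]) -> str:
    """Comma-separated list, the shape used by controller.quorum.bootstrap.servers."""
    return ",".join(endpoints)


if __name__ == "__main__":
    # "my-service" and port 9090 are placeholders chosen for illustration only.
    eps = controller_endpoints("my-cluster", 3, "my-service", "my-namespace", 9090)
    print(bootstrap_servers(eps))
```

The names depend only on the pod ordinal, the service, the namespace and the cluster domain, so they survive pod restarts; they change only if one of those inputs changes.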
> Re: LC2

About "... some k8s operators format ...", can you give me more information about which Kafka operators you are referring to? I think an operator that behaves as you describe really lacks idempotency: when a controller rolls, it has to distinguish whether the new pod is starting up as part of a new controller cluster (the first controller or an additional one) or is just being rolled for other reasons (e.g. config change, manual restart, ...). Strimzi starts all the controller nodes together using the bootstrap with multiple controllers, which works fine.

> JS4
> What happens if a snapshot already exists at that log end offset?

What do you mean by that? The code looks at the end offset of the log; if it is the end offset, how could a snapshot already exist there? Maybe there is a lack of knowledge on my side here.

> JS5
> This is not entirely accurate. Kafka nodes discover the active
> controller using controller.quorum.bootstrap.servers if defined. If it
> is not defined they fall back to using "controller.quorum.voters". In
> general, the endpoints in the voter set are used by the controllers
> (voters) to send KRaft election RPCs like VOTE, BEGIN_QUORUM_EPOCH,
> etc.

To be honest, that is not what I experienced. If it worked this way, my proposal would be totally useless, because the Strimzi operator already updates and rolls all the controller nodes with a new controller.quorum.bootstrap.servers configuration containing the new DNS names. But on restart, each controller still looks at the VotersRecord with the old DNS names and ignores the new controller.quorum.bootstrap.servers configuration. Maybe Luke can confirm (or not?) what I just mentioned.

> JS6
> I still don't understand why the Kafka k8s operators can't take
> advantage of k8s' Headless Service to have multiple DNS names for the
> same Kafka controller pods. Based on my research this is exactly how
> the etcd-operator manages etcd clusters hosted by k8s. At a high
> level, KRaft and etcd have very similar designs and configurations
> because they are both inspired by Raft.

As already mentioned for JS2, Strimzi uses a headless service for the brokers and controllers, and they all get a DNS name like my-cluster-controller-0.my-namespace.svc.dns-domain. I am not sure what you mean by having a headless service with "multiple DNS names"; it's not possible. Or I am misunderstanding what you mean. Can you please provide a reference for what you found about the etcd-operator? Maybe it will help me understand what you mean.

Thanks!

Thanks,
Paolo

On Thu, 18 Jun 2026 at 21:18, José Armando García Sancio via dev <[email protected]> wrote:

> Hi Paolo,
>
> Re: JS2
> The solution you propose to address Strimzi's limitation, not providing
> a stable network layer to the Kafka StatefulSet, is incompatible with
> KRaft's replication and dynamic reconfiguration. In short, if the KIP
> overrides the voters per node, it will cause diverging states across
> the nodes when dynamic reconfiguration is present.
>
> It is important to distinguish between required and standard
> operations like formatting the bootstrapping controller(s), and
> dangerous recovery operations like overriding the voter set endpoints
> without KRaft's validations and invariants.
>
> Re: LC2
> As Luke mentioned, we are making a concerted effort to remove the need
> to format the Kafka nodes. With KIP-1262 users and k8s operators are
> only required to run format on the initial/bootstrapping controllers.
> For example, some k8s operators format the kafka cluster by formatting
> only one controller with --standalone and then increasing the
> controller cluster by adding the other controllers using the
> mechanisms provided by KIP-853.
>
> JS4
> > Create a new snapshot at the current log end offset containing:
> What happens if a snapshot already exists at that log end offset?
>
> JS5
> In the "Broker considerations" section you have:
> "It uses these endpoints to connect to the KRaft controller quorum.
> The controller.quorum.bootstrap.servers configuration is not used to
> reach out the controllers."
>
> This is not entirely accurate. Kafka nodes discover the active
> controller using controller.quorum.bootstrap.servers if defined. If it
> is not defined they fall back to using "controller.quorum.voters". In
> general, the endpoints in the voter set are used by the controllers
> (voters) to send KRaft election RPCs like VOTE, BEGIN_QUORUM_EPOCH,
> etc.
>
> JS6
> I still don't understand why the Kafka k8s operators can't take
> advantage of k8s' Headless Service to have multiple DNS names for the
> same Kafka controller pods. Based on my research this is exactly how
> the etcd-operator manages etcd clusters hosted by k8s. At a high
> level, KRaft and etcd have very similar designs and configurations
> because they are both inspired by Raft.
>
> Thanks,
> -Jose
>
> On Mon, May 18, 2026 at 9:56 AM Paolo Patierno <[email protected]> wrote:
> >
> > Hi all,
> > I would like to start a discussion on KIP-1347
> > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-1347%3A+Overriding+voter+set+on+storage+formatting>
> > which is about allowing the override of the voter set through the
> > storage formatting tool to recover a disaster scenario where the
> > KRaft quorum can't be formed anymore. This KIP aims to fix KAFKA-20427
> > <https://issues.apache.org/jira/browse/KAFKA-20427>.
> > Any feedback is very welcome.
> >
> > Thanks,
> > Paolo.

--
Paolo Patierno
Senior Principal Software Engineer @ IBM
CNCF Ambassador

Twitter : @ppatierno <http://twitter.com/ppatierno>
Linkedin : paolopatierno <http://it.linkedin.com/in/paolopatierno>
GitHub : ppatierno <https://github.com/ppatierno>
