Kevin, thanks for the KIP! This one seems obvious now that it's written :)

Overall I think it looks good, I just have a few questions.

DA1) What happens if you unregister a running controller that is in the
voter set? Will this fail?

DA2) What about an offline controller that is in the voter set?

DA3) Do we have error codes spelled out for these cases?

DA4) Maybe not for this KIP, but do we have an incarnation ID for a
controller's registration? Since a controller can now be registered,
unregistered, and then registered again, it might be useful to see the
controller's current "incarnation." In general, when objects/resources can
be deleted and recreated with the same ID, there can sometimes be weird
side effects. We've had plenty of bugs in Kafka related to this, which we
usually solve by adding an epoch.
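
To illustrate the epoch idea, here is a minimal sketch. It is not Kafka code
and not the KIP's design: the class name `ControllerRegistrySketch` and its
methods are hypothetical. Each registration of a node ID gets a fresh epoch,
so a request that names an older epoch can be rejected as stale.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not Kafka code: every registration of a node ID gets a
// fresh, monotonically increasing epoch, so requests that carry an older epoch
// can be recognized as coming from a previous incarnation.
public class ControllerRegistrySketch {
    private final Map<Integer, Long> epochByNodeId = new HashMap<>();
    private long nextEpoch = 0L;

    /** Registers (or re-registers) a node and returns its new epoch. */
    public long register(int nodeId) {
        long epoch = nextEpoch++;
        epochByNodeId.put(nodeId, epoch);
        return epoch;
    }

    /** Unregisters only if the caller names the current incarnation. */
    public boolean unregister(int nodeId, long expectedEpoch) {
        Long current = epochByNodeId.get(nodeId);
        if (current == null || current.longValue() != expectedEpoch) {
            return false; // unknown node or stale incarnation
        }
        epochByNodeId.remove(nodeId);
        return true;
    }

    public static void main(String[] args) {
        ControllerRegistrySketch registry = new ControllerRegistrySketch();
        long first = registry.register(3);   // first incarnation of node 3
        registry.unregister(3, first);       // removed by the operator
        long second = registry.register(3);  // same ID, new incarnation
        // A delayed request that still names the first incarnation is rejected.
        System.out.println(registry.unregister(3, first));
        // A request that names the current incarnation succeeds.
        System.out.println(registry.unregister(3, second));
    }
}
```

With something like this, a delayed unregister aimed at the old incarnation
could not remove the new registration of the same node ID.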

Thanks!
David A

On Tue, May 5, 2026 at 12:34 PM Kevin Wu <[email protected]> wrote:

> Hi TaiJu,
>
> Thanks for the question.
>
> TJ0: Do you mean remove the controller as a voter or unregister the
> controller? As long as a controller is able to contact the leader, it
> should attempt to register with the cluster if it does not see
> its registration in the log. This is a similar behavior to how auto-join
> works for KRaft voters.
>
> Best,
> Kevin Wu
>
> On Tue, May 5, 2026 at 9:26 AM Kevin Wu <[email protected]> wrote:
>
> > Hi Paolo,
> >
> > Yes, you would need to bring up a controller with the same node ID. I have
> > updated the KIP to be explicit about that.
> >
> > Best,
> > Kevin Wu
> >
> > On Tue, May 5, 2026 at 8:23 AM TaiJu Wu <[email protected]> wrote:
> >
> >> Hi Kevin,
> >>
> >> TJ0:
> >> Could we remove a controller if auto-join is enabled? The controller
> >> would immediately rejoin the cluster as a voter.
> >>
> >> Best regards,
> >> TaiJuWu
> >>
> >>
> >> On Tue, May 5, 2026 at 5:52 PM Paolo Patierno <[email protected]> wrote:
> >>
> >> > Hi Kevin,
> >> > I know you started the vote but I have just one little addition.
> >> > In the "Compatibility, Deprecation, and Migration Plan" section you
> >> > mention:
> >> >
> >> > > The main reason for not supporting this in existing clusters is that in
> >> > > many environments, operators can simply bring up another controller node to
> >> > > "refresh" its registration.
> >> >
> >> > I think you should specify that the new controller has to have the same ID
> >> > (to "refresh" its registration); it can't be a new controller with any ID.
> >> > Is that right?
> >> >
> >> > Thanks,
> >> > Paolo
> >> >
> >> > On Fri, 1 May 2026 at 20:49, Kevin Wu <[email protected]> wrote:
> >> >
> >> > > Hi Jun,
> >> > >
> >> > > Thanks for the reply.
> >> > >
> >> > > RE JR4: Sounds good. I have added this user experience to the KIP.
> >> > >
> >> > > Best,
> >> > > Kevin Wu
> >> > >
> >> > > On Fri, May 1, 2026 at 12:33 PM Jun Rao via dev <[email protected]>
> >> > > wrote:
> >> > >
> >> > > > Hi, Kevin,
> >> > > >
> >> > > > Thanks for the reply.
> >> > > >
> >> > > > JR4. Your explanation makes sense. Perhaps we could add another user
> >> > > > experience: "Remove a KRaft voter in a dynamic quorum and keep it
> >> > > > registered as an observer controller". In this case, the user will run
> >> > > > `kafka-metadata-quorum remove-controller` without the `--unregister`
> >> > > > flag.
> >> > > >
> >> > > > Jun
> >> > > >
> >> > > > On Thu, Apr 30, 2026 at 4:59 PM Kevin Wu <[email protected]> wrote:
> >> > > >
> >> > > > > Hi Jun,
> >> > > > >
> >> > > > > Thanks for the reply.
> >> > > > >
> >> > > > > RE JR4: To me, the main motivation for having an explicit `--unregister`
> >> > > > > flag is that `remove-controller` and `unregister-controller` assume two
> >> > > > > different things about the supplied node. For removing a node from the
> >> > > > > KRaft voter set, no assumption is made about whether the node is still
> >> > > > > running; Kafka supports either case. However, the act of unregistering a
> >> > > > > controller requires assuming that the node will "not be around soon."
> >> > > > > This is because subsequent feature upgrades will no longer consider the
> >> > > > > supported levels of an unregistered controller.
> >> > > > >
> >> > > > > An operator may decide to keep a node around as an observer, possibly
> >> > > > > with the intention to make it a voter in the future. Making the
> >> > > > > unregistration always occur alongside voter removal would make the
> >> > > > > observer controller in the example above unregister and then re-register
> >> > > > > because the node is still around. This allows for the feature upgrade
> >> > > > > race I mentioned previously (i.e. controller unregisters, operator
> >> > > > > upgrades a feature that should not be supported, controller
> >> > > > > re-registers). Therefore, I think we should have an explicit
> >> > > > > `--unregister` flag for `remove-controller` since the assumptions around
> >> > > > > the state of the cluster change compared to the base command. What do
> >> > > > > you think?
> >> > > > >
> >> > > > > RE JR5: Yeah, I believe so. Thanks for catching this case. One could
> >> > > > > specify controller.quorum.bootstrap.servers instead of
> >> > > > > controller.quorum.voters on a controller in a static quorum. This would
> >> > > > > be a valid static config that passes the check in
> >> > > > > `KafkaConfig#validateControllerQuorumVotersMustContainNodeIdForKRaftController`.
> >> > > > > I have updated the KIP with these changes.
> >> > > > >
> >> > > > > RE JR6: Yes, it should say "every Kafka node." I have updated the KIP to
> >> > > > > fix this.
> >> > > > >
> >> > > > > Best,
> >> > > > > Kevin Wu
> >> > > > >
> >> > > > > On Thu, Apr 30, 2026 at 6:12 PM Jun Rao via dev <[email protected]>
> >> > > > > wrote:
> >> > > > >
> >> > > > > > Hi, Kevin,
> >> > > > > >
> >> > > > > > Thanks for the reply.
> >> > > > > >
> >> > > > > > JR4. Is there a use case for `kafka-metadata-quorum remove-controller`
> >> > > > > > without the `--unregister` flag? If not, could we remove the
> >> > > > > > `--unregister` flag?
> >> > > > > >
> >> > > > > > JR5. For the second user experience "Unregister an observer controller
> >> > > > > > in a dynamic quorum", one can have and remove an observer controller in
> >> > > > > > the static quorum too, right?
> >> > > > > >
> >> > > > > > JR6. "Ensure the stopped voter is not part of controller.quorum.voters
> >> > > > > > on any other Kafka nodes"
> >> > > > > > "any other Kafka nodes" should be "every Kafka node", right?
> >> > > > > >
> >> > > > > > Jun
> >> > > > > >
> >> > > > > > On Mon, Apr 27, 2026 at 1:33 PM Kevin Wu <[email protected]> wrote:
> >> > > > > >
> >> > > > > > > Hi Jun,
> >> > > > > > >
> >> > > > > > > Thanks for the feedback.
> >> > > > > > >
> >> > > > > > > I have updated the KIP to make a separate section detailing the user
> >> > > > > > > experience.
> >> > > > > > >
> >> > > > > > > Best,
> >> > > > > > > Kevin Wu
> >> > > > > > >
> >> > > > > > > On Mon, Apr 27, 2026 at 12:05 PM Jun Rao via dev <[email protected]>
> >> > > > > > > wrote:
> >> > > > > > >
> >> > > > > > > > Hi, Kevin,
> >> > > > > > > >
> >> > > > > > > > Thanks for the reply.
> >> > > > > > > >
> >> > > > > > > > It would be useful to have a separate user experience section that
> >> > > > > > > > documents the steps for common scenarios involving the tools.
> >> > > > > > > >
> >> > > > > > > > The scenarios are:
> >> > > > > > > > 1. Remove a voter in dynamic KRaft quorum
> >> > > > > > > >    stop the voter
> >> > > > > > > >    run kafka-metadata-quorum remove-controller with --unregister
> >> > > > > > > > 2. Unregister an observer controller
> >> > > > > > > >    stop the observer
> >> > > > > > > >    run kafka-cluster unregister-controller
> >> > > > > > > > 3. Unregister a voter in a static KRaft quorum when the static voter
> >> > > > > > > >    set is mistakenly configured.
> >> > > > > > > >    stop the voter
> >> > > > > > > >    run kafka-cluster unregister-controller
> >> > > > > > > >    remove voter from controller.quorum.voters ?
> >> > > > > > > >
> >> > > > > > > > Jun
> >> > > > > > > >
> >> > > > > > > > On Fri, Apr 24, 2026 at 11:49 AM Kevin Wu <[email protected]> wrote:
> >> > > > > > > >
> >> > > > > > > > > Hi Jun,
> >> > > > > > > > >
> >> > > > > > > > > Thanks for the discussion.
> >> > > > > > > > > Yeah, those are the scenarios for using these tools. I have
> >> > > > > > > > > documented their usage in the KIP.
> >> > > > > > > > >
> >> > > > > > > > > Best,
> >> > > > > > > > > Kevin Wu
> >> > > > > > > > >
> >> > > > > > > > > On Thu, Apr 23, 2026 at 11:51 AM Jun Rao via dev <[email protected]>
> >> > > > > > > > > wrote:
> >> > > > > > > > >
> >> > > > > > > > > > Hi, Kevin,
> >> > > > > > > > > >
> >> > > > > > > > > > Thanks for the reply.
> >> > > > > > > > > >
> >> > > > > > > > > > Your suggestion sounds good to me. It would be useful to document
> >> > > > > > > > > > the usage of those tools. The scenarios are:
> >> > > > > > > > > > 1. Remove a voter in dynamic KRaft quorum
> >> > > > > > > > > > 2. Unregister an observer controller
> >> > > > > > > > > > 3. Unregister a voter in a static KRaft quorum when the static
> >> > > > > > > > > >    voter set is mistakenly configured.
> >> > > > > > > > > >
> >> > > > > > > > > > For item 3, could you document how it works? Does one need to stop
> >> > > > > > > > > > the misconfigured voter first and then unregister it?
> >> > > > > > > > > >
> >> > > > > > > > > > Are there other scenarios?
> >> > > > > > > > > >
> >> > > > > > > > > > Jun
> >> > > > > > > > > >
> >> > > > > > > > > > On Thu, Apr 23, 2026 at 8:22 AM Kevin Wu <[email protected]> wrote:
> >> > > > > > > > > >
> >> > > > > > > > > > > Hi Jun,
> >> > > > > > > > > > >
> >> > > > > > > > > > > Thanks for the replies.
> >> > > > > > > > > > >
> >> > > > > > > > > > > RE JR3: I would like the design of this feature to not introduce
> >> > > > > > > > > > > more coupling of the KRaft and metadata layers. Observer
> >> > > > > > > > > > > controllers are supported, but they are a KRaft concept, so it
> >> > > > > > > > > > > should not be known to the metadata layer whether or not a given
> >> > > > > > > > > > > controller is a voter or observer.
> >> > > > > > > > > > >
> >> > > > > > > > > > > What do you think about the following documentation and execution
> >> > > > > > > > > > > pattern regarding these CLI commands?
> >> > > > > > > > > > >
> >> > > > > > > > > > > `kafka-cluster unregister-controller` is a command for users when
> >> > > > > > > > > > > they want to unregister a controller from the cluster. We can
> >> > > > > > > > > > > document that this is potentially unsafe and should only be done
> >> > > > > > > > > > > if the operator does not intend to bring back up that controller.
> >> > > > > > > > > > > `kafka-cluster unregister-controller` works irrespective of the
> >> > > > > > > > > > > quorum mode.
> >> > > > > > > > > > >
> >> > > > > > > > > > > Going forward, running `kafka-metadata-quorum remove-controller`
> >> > > > > > > > > > > removes a controller as a KRaft voter, and continues to only be
> >> > > > > > > > > > > supported in a dynamic quorum cluster. I still think the
> >> > > > > > > > > > > unregistering behavior should be an additional flag, because
> >> > > > > > > > > > > having an observer controller that is still registered to the
> >> > > > > > > > > > > cluster is a valid configuration in Kafka. I think of
> >> > > > > > > > > > > `kafka-metadata-quorum remove-controller --unregister` as a
> >> > > > > > > > > > > "built-in" CLI script, since removing a voter and unregistering it
> >> > > > > > > > > > > from the cluster is probably a very common usage pattern. This
> >> > > > > > > > > > > command will only send the UnregisterController RPC if the cluster
> >> > > > > > > > > > > supports dynamic quorum, so the overall command behavior is
> >> > > > > > > > > > > consistent with how it is today with respect to the kraft.version
> >> > > > > > > > > > > level of the cluster. If the cluster does not support dynamic
> >> > > > > > > > > > > quorum, the CLI can direct the user to instead run the
> >> > > > > > > > > > > `kafka-cluster unregister-controller` command.
> >> > > > > > > > > > >
> >> > > > > > > > > > > Best,
> >> > > > > > > > > > > Kevin Wu
> >> > > > > > > > > > >
> >> > > > > > > > > > > On Tue, Apr 21, 2026 at 5:39 PM Jun Rao via dev <[email protected]>
> >> > > > > > > > > > > wrote:
> >> > > > > > > > > > >
> >> > > > > > > > > > > > Hi, Kevin,
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > Thanks for the reply.
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > JR2. Good point on auto-join. I think we can introduce the new
> >> > > > > > > > > > > > UnregisterControllerRequest and keep the auto-join behavior as
> >> > > > > > > > > > > > is (i.e., without unregistering the controller when removing the
> >> > > > > > > > > > > > old instance from the voter set). The command
> >> > > > > > > > > > > > "kafka-metadata-quorum remove-controller" will send two separate
> >> > > > > > > > > > > > RPC requests, RemoveRaftVoterRequest and
> >> > > > > > > > > > > > UnregisterControllerRequest, as documented in the KIP.
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > JR3. When will a user use the command "kafka-cluster
> >> > > > > > > > > > > > unregister-controller"? Is this only for unregistering an
> >> > > > > > > > > > > > observer controller? If the observer controller is currently
> >> > > > > > > > > > > > supported, we can add that command. It would be useful to
> >> > > > > > > > > > > > document the usage for both commands.
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > Jun
> >> > > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > On Tue, Apr 21, 2026 at 9:25 AM Kevin Wu <[email protected]> wrote:
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > > Hi Jun,
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > Thanks for the reply.
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > RE JR1: Yeah, I will update the KIP to touch on this static
> >> > > > > > > > > > > > > quorum edge case.
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > RE JR2: That seems reasonable to me, since we would avoid two
> >> > > > > > > > > > > > > RPC hops (one for RemoveVoter, one for UnregisterController).
> >> > > > > > > > > > > > > One thing to note is that with KIP-1186
> >> > > > > > > > > > > > > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-1186%3A+Update+AddRaftVoterRequest+RPC+to+support+auto-join>,
> >> > > > > > > > > > > > > besides operators manually removing controllers, observer
> >> > > > > > > > > > > > > controllers themselves can send `RemoveRaftVoter` to remove
> >> > > > > > > > > > > > > their old incarnations from the voter set as part of the
> >> > > > > > > > > > > > > auto-join feature. With auto-join and this proposed behavior,
> >> > > > > > > > > > > > > explicitly removing a controller's old registration alongside
> >> > > > > > > > > > > > > its old voter set entry can lead to "unsupported" upgrades in
> >> > > > > > > > > > > > > the cluster. An operator doing these steps manually can be
> >> > > > > > > > > > > > > argued as misconfiguring the cluster, but the auto-join feature
> >> > > > > > > > > > > > > allowing for this scenario seems like a bug.
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > Consider the below example with auto-join enabled: 3
> >> > > > > > > > > > > > > controllers in the voter set (A,B,C) where A supports feature
> >> > > > > > > > > > > > > levels X=[0-1], B supports feature levels X=[0-1], but C only
> >> > > > > > > > > > > > > supports X=0. Currently, node A is the active controller, all 3
> >> > > > > > > > > > > > > controllers are registered, but upgrading feature X to feature
> >> > > > > > > > > > > > > level 1 is not supported because C does not support it.
> >> > > > > > > > > > > > > Controller C restarts with a new disk (now represented as C').
> >> > > > > > > > > > > > > The auto-join code runs to first remove C from the voter set,
> >> > > > > > > > > > > > > and then remove the registration for C. These records are
> >> > > > > > > > > > > > > committed via nodes A and B. Now, from the active controller's
> >> > > > > > > > > > > > > perspective, the cluster does support upgrading feature X to
> >> > > > > > > > > > > > > level 1. There is a race between C' adding itself back to the
> >> > > > > > > > > > > > > KRaft voter set and re-registering itself, and a potential
> >> > > > > > > > > > > > > feature level upgrade. Another interesting thing to note after
> >> > > > > > > > > > > > > looking at the code is that controllers can register even if
> >> > > > > > > > > > > > > they do not support the finalized features of the cluster,
> >> > > > > > > > > > > > > which is different from broker registration. In Kafka's current
> >> > > > > > > > > > > > > code, the original registration for C stays in the log after C
> >> > > > > > > > > > > > > is removed as a voter by auto-join, which prevents an upgrade
> >> > > > > > > > > > > > > of feature X. At some point, the registration for C is updated
> >> > > > > > > > > > > > > by C' because C' is a different process incarnation, but a
> >> > > > > > > > > > > > > registration that blocks X's upgrade is always in the log.
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > Therefore, Kafka should not unregister a controller when
> >> > > > > > > > > > > > > auto-join removes a controller from the voter set. This means
> >> > > > > > > > > > > > > including a new RPC version for `RemoveRaftVoter` that
> >> > > > > > > > > > > > > introduces a boolean field telling the active controller
> >> > > > > > > > > > > > > whether to also unregister the controller. This field would be
> >> > > > > > > > > > > > > completely ignored by the raft layer, and instead would be
> >> > > > > > > > > > > > > handled at the ControllerApis level. I think it is fine to
> >> > > > > > > > > > > > > unregister a controller whenever the operator runs
> >> > > > > > > > > > > > > `kafka-metadata-quorum remove-controller` for a smooth UX with
> >> > > > > > > > > > > > > dynamic quorum. What do you think?
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > RE JR3: Maybe we can document this better as part of the code
> >> > > > > > > > > > > > > changes to this KIP, but in my opinion, the kafka-cluster tool
> >> > > > > > > > > > > > > deals with cluster membership (brokers and controllers), which
> >> > > > > > > > > > > > > is a metadata layer concept. If you look at the
> >> > > > > > > > > > > > > `list-endpoints` command, you can list out the registered
> >> > > > > > > > > > > > > controller endpoints. By contrast, the kafka-metadata-quorum
> >> > > > > > > > > > > > > tool deals with KRaft, which knows about concepts like leader,
> >> > > > > > > > > > > > > voter, and observers. The `add-controller` and
> >> > > > > > > > > > > > > `remove-controller` sub-commands inadvertently deal with
> >> > > > > > > > > > > > > controllers (since controllers can be voters), but the
> >> > > > > > > > > > > > > `describe` sub-command tree also shows information about
> >> > > > > > > > > > > > > brokers, which are observers to KRaft. My decision to include
> >> > > > > > > > > > > > > the `unregister-controller` command in the `kafka-cluster` tool
> >> > > > > > > > > > > > > is mainly motivated by this distinction. Additionally, if we
> >> > > > > > > > > > > > > only send `RemoveRaftVoterRequest` in `remove-controller`, it
> >> > > > > > > > > > > > > seems hacky to direct users to use that command for
> >> > > > > > > > > > > > > unregistering any controller, since for observers, the remove
> >> > > > > > > > > > > > > voter logic of that request will always fail in the raft layer.
> >> > > > > > > > > > > > > What do you think?
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > Best,
> >> > > > > > > > > > > > > Kevin Wu
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > On Tue, Apr 21, 2026 at 8:17 AM Paolo Patierno <[email protected]>
> >> > > > > > > > > > > > > wrote:
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > > Hi Kevin,
> >> > > > > > > > > > > > > > thanks for the KIP.
> >> > > > > > > > > > > > > > The KIP does not say so explicitly, but I assume you are going
> >> > > > > > > > > > > > > > to expose a new unregisterController method through the
> >> > > > > > > > > > > > > > AdminClient API as well. Is my assumption right?
> >> > > > > > > > > > > > > > I expect it would be used underneath by the tools you are
> >> > > > > > > > > > > > > > going to modify.
> >> > > > > > > > > > > > > > Having such support within the AdminClient API is important
> >> > > > > > > > > > > > > > when the operator is not a human running the tool but a
> >> > > > > > > > > > > > > > Kubernetes operator (e.g. Strimzi) that needs to unregister a
> >> > > > > > > > > > > > > > controller.
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > Thanks,
> >> > > > > > > > > > > > > > Paolo.
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > On Mon, 20 Apr 2026 at 21:57, Kevin Wu <[email protected]> wrote:
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > Hi Jun,
> >> > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > Thanks for the reply.
> >> > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > RE JR1: I would say the main use case is dynamic quorums,
> >> > > > > > > > > > > > > > > since the concept of the observer controller becomes a thing
> >> > > > > > > > > > > > > > > in that world. However, there is a static quorum edge case
> >> > > > > > > > > > > > > > > if the operator misconfigures `controller.quorum.voters`. If
> >> > > > > > > > > > > > > > > a new controller voter mistakenly joins the cluster, it will
> >> > > > > > > > > > > > > > > also persist a registration record. In my opinion, there
> >> > > > > > > > > > > > > > > should be a way to remove a controller registration via
> >> > > > > > > > > > > > > > > AdminClient CLI in all quorum modes.
> >> > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > RE JR2: Yes, the existing command only removes the voter,
> >> > > > > > > > > > > > > > > but does not unregister the controller. I left it as a
> >> > > > > > > > > > > > > > > separate flag for now because they are "separate" operations
> >> > > > > > > > > > > > > > > in that being a raft voter is a subset of being a controller
> >> > > > > > > > > > > > > > > in dynamic quorums, but I am not opposed to making this
> >> > > > > > > > > > > > > > > command try to do both (remove voter and unregister the
> >> > > > > > > > > > > > > > > controller) by default. In my opinion, an observer
> >> > > > > > > > > > > > > > > controller is "useless" in that it does not participate in
> >> > > > > > > > > > > > > > > the leader election or replication parts of the KRaft
> >> > > > > > > > > > > > > > > protocol, so I see no issue with doing both operations
> >> > > > > > > > > > > > > > > always. However, an operator may want observer controllers
> >> > > > > > > > > > > > > > > around for other reasons like redundancy. Do you (or others)
> >> > > > > > > > > > > > > > > have any insight into how users may be configuring clusters
> >> > > > > > > > > > > > > > > with observer controllers? If not, I think it is okay to
> >> > > > > > > > > > > > > > > remove the flag and make it the default behavior of
> >> > > > > > > > > > > > > > > `kafka-metadata-quorum remove-controller`.
> >> > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > RE JR3: Not exactly. The
> >> > > > > > > > > > > > > > > `kafka-metadata-quorum remove-controller ...
> >> > > > > > > > > > > > > > > --unregister` command sends 2 RPCs to the active
> >> > > > > > > > > > > > > > > controller: one to remove a node from the voter
> >> > > > > > > > > > > > > > > set, and another to unregister the node. The
> >> > > > > > > > > > > > > > > `kafka-cluster unregister-controller` command
> >> > > > > > > > > > > > > > > sends just 1 RPC to the active controller, to
> >> > > > > > > > > > > > > > > unregister the node. My motivation for having
> >> > > > > > > > > > > > > > > two separate commands is that
> >> > > > > > > > > > > > > > > `remove-controller` is associated with dynamic
> >> > > > > > > > > > > > > > > quorum, since the `RemoveRaftVoter` RPC will
> >> > > > > > > > > > > > > > > fail if kraft.version=0. What do you think?
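The difference between the two commands described in RE JR3 can be sketched as a toy model. This is only an illustration under stated assumptions, not Kafka code: `ToyActiveController`, `UnsupportedVersionError`, and the RPC label `UnregisterController` are hypothetical names introduced here, and the sketch does not model error handling for unregistering a node that is still a voter.

```python
from dataclasses import dataclass, field


class UnsupportedVersionError(Exception):
    """Raised by the toy controller when an RPC needs a dynamic quorum."""


@dataclass
class ToyActiveController:
    # Hypothetical stand-in for the active controller's state.
    kraft_version: int
    voters: set = field(default_factory=set)
    registered: set = field(default_factory=set)
    rpc_log: list = field(default_factory=list)

    def remove_raft_voter(self, node_id):
        # Per the thread, this RPC fails when kraft.version=0.
        self.rpc_log.append("RemoveRaftVoter")
        if self.kraft_version == 0:
            raise UnsupportedVersionError("kraft.version=0 has no dynamic quorum")
        self.voters.discard(node_id)

    def unregister_controller(self, node_id):
        # "UnregisterController" is an assumed label, not a confirmed RPC name.
        self.rpc_log.append("UnregisterController")
        self.registered.discard(node_id)


def remove_controller_with_unregister(controller, node_id):
    # Models `remove-controller ... --unregister`: two RPCs, in order.
    controller.remove_raft_voter(node_id)
    controller.unregister_controller(node_id)


def unregister_controller_only(controller, node_id):
    # Models `kafka-cluster unregister-controller`: one RPC.
    controller.unregister_controller(node_id)
```

In this model, the single-RPC path still works on a static quorum (kraft.version=0), while the two-RPC path stops at the first RPC, which matches the stated motivation for keeping the commands separate.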
> >> > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > RE JR4: I have updated the sections for the CLI
> >> > > > > > > > > > > > > > > commands in the KIP to add this information.
> >> > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > RE JR5: This is describing the current
> >> > > > > > > > > > > > > > > implementation of the
> >> > > > > > > > > > > > > > > ControllerRegistrationManager, which will listen
> >> > > > > > > > > > > > > > > to the metadata log and send
> >> > > > > > > > > > > > > > > ControllerRegistrationRequest when the local
> >> > > > > > > > > > > > > > > node id is not registered in the log. It looks
> >> > > > > > > > > > > > > > > like this is slightly different from how we
> >> > > > > > > > > > > > > > > handle broker registration in
> >> > > > > > > > > > > > > > > BrokerLifecycleManager. Currently, this code
> >> > > > > > > > > > > > > > > path never executes because controller
> >> > > > > > > > > > > > > > > registrations cannot be removed.
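The re-registration behavior described in RE JR5 can be sketched as follows. This is a minimal model, assuming a hypothetical `ToyRegistrationManager` class and callback; it is not the actual ControllerRegistrationManager code, and it only captures the rule "send a registration request when the local node id is absent from the registrations seen in the log".

```python
class ToyRegistrationManager:
    """Hypothetical model of a controller's registration manager."""

    def __init__(self, node_id, send_registration):
        self.node_id = node_id
        # Callback standing in for sending a ControllerRegistrationRequest.
        self.send_registration = send_registration

    def on_metadata_update(self, registered_controller_ids):
        # Called whenever new metadata log state is observed.
        if self.node_id in registered_controller_ids:
            return False  # Already registered, nothing to send.
        self.send_registration(self.node_id)
        return True
```

Once registrations can be removed, a running controller that is unregistered would take the second branch on its next metadata update, which is why an unregistered but still-running controller re-registers in this model.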
> >> > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > Best,
> >> > > > > > > > > > > > > > > Kevin Wu
> >> > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > On Fri, Apr 17, 2026 at 2:08 PM Jun Rao via dev
> >> > > > > > > > > > > > > > > <[email protected]> wrote:
> >> > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > Hi, Kevin,
> >> > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > Thanks for the KIP. A few comments.
> >> > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > JR1. I guess this is only intended for dynamic
> >> > > > > > > > > > > > > > > > KRaft quorums? If so, it would be useful to
> >> > > > > > > > > > > > > > > > clarify that.
> >> > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > JR2. kafka-metadata-quorum remove-controller
> >> > > > > > > > > > > > > > > > --controller-id 9990
> >> > > > > > > > > > > > > > > > --controller-directory-id EXAMPLE_UUID
> >> > > > > > > > > > > > > > > > --unregister
> >> > > > > > > > > > > > > > > > So, the existing remove-controller logic only
> >> > > > > > > > > > > > > > > > changes the voter set, but doesn't unregister
> >> > > > > > > > > > > > > > > > the controller? Should we just always do these
> >> > > > > > > > > > > > > > > > two together? Is there a use case for only
> >> > > > > > > > > > > > > > > > removing a controller from the voter set, but
> >> > > > > > > > > > > > > > > > not unregistering?
> >> > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > JR3. Is kafka-cluster unregister-controller
> >> > > > > > > > > > > > > > > > equivalent to kafka-metadata-quorum
> >> > > > > > > > > > > > > > > > remove-controller --controller-id 9990
> >> > > > > > > > > > > > > > > > --controller-directory-id EXAMPLE_UUID
> >> > > > > > > > > > > > > > > > --unregister?
> >> > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > JR4. Could you describe the underlying
> >> > > > > > > > > > > > > > > > workflow for each new command (RPCs sent,
> >> > > > > > > > > > > > > > > > metadata records generated, actions taken by
> >> > > > > > > > > > > > > > > > the controller, etc)?
> >> > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > JR5. "The registration manager of an
> >> > > > > > > > > > > > > > > > unregistered controller already attempts to
> >> > > > > > > > > > > > > > > > re-register with the active controller. This
> >> > > > > > > > > > > > > > > > is to prevent accidental unregistrations."
> >> > > > > > > > > > > > > > > > I don't quite understand this. Why will an
> >> > > > > > > > > > > > > > > > unregistered controller attempt to
> >> > > > > > > > > > > > > > > > re-register?
> >> > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > Jun
> >> > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > On Fri, Apr 3, 2026 at 11:31 AM Kevin Wu
> >> > > > > > > > > > > > > > > > <[email protected]> wrote:
> >> > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > > Hi all,
> >> > > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > > I would like to start a discussion on
> >> > > > > > > > > > > > > > > > > KIP-1312: Support unregistering controllers.
> >> > > > > > > > > > > > > > > > > Below is the KIP link.
> >> > > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > > https://urldefense.com/v3/__https://cwiki.apache.org/confluence/display/KAFKA/KIP-1312*3A*Support*unregistering*controllers__;JSsrKw!!Ayb5sqE7!phwOrPrBZoQb1P44rCfpPBt74v80NjCTOGhgaRQx1XFXCy1x61QR9b9xw3zfvo-aFvVsFYczOxbTVtFeUg-7gg$
> >> > > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > > Thanks,
> >> > > > > > > > > > > > > > > > > Kevin Wu
> >> > > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > --
> >> > > > > > > > > > > > > > Paolo Patierno
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > *Senior Principal Software Engineer @ IBM**CNCF Ambassador*
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > Twitter : @ppatierno <https://urldefense.com/v3/__http://twitter.com/ppatierno__;!!Ayb5sqE7!phwOrPrBZoQb1P44rCfpPBt74v80NjCTOGhgaRQx1XFXCy1x61QR9b9xw3zfvo-aFvVsFYczOxbTVtHGG-mS-Q$>
> >> > > > > > > > > > > > > > Linkedin : paolopatierno <https://urldefense.com/v3/__http://it.linkedin.com/in/paolopatierno__;!!Ayb5sqE7!phwOrPrBZoQb1P44rCfpPBt74v80NjCTOGhgaRQx1XFXCy1x61QR9b9xw3zfvo-aFvVsFYczOxbTVtFcWWCD5g$>
> >> > > > > > > > > > > > > > GitHub : ppatierno <https://urldefense.com/v3/__https://github.com/ppatierno__;!!Ayb5sqE7!phwOrPrBZoQb1P44rCfpPBt74v80NjCTOGhgaRQx1XFXCy1x61QR9b9xw3zfvo-aFvVsFYczOxbTVtEK-wncPw$>
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >> >
> >> > --
> >> > Paolo Patierno
> >> >
> >> > *Senior Principal Software Engineer @ IBM**CNCF Ambassador*
> >> >
> >> > Twitter : @ppatierno <https://urldefense.com/v3/__http://twitter.com/ppatierno__;!!Ayb5sqE7!oZRzwnaATPfi2VSJp6vrqOWg2-cS1dzm3yMf7omBnhC3cpAuwrIfbNJWLtTE8w9D90jKF93cA9wCXqWAuKpo274Y8wKw$>
> >> > Linkedin : paolopatierno <https://urldefense.com/v3/__http://it.linkedin.com/in/paolopatierno__;!!Ayb5sqE7!oZRzwnaATPfi2VSJp6vrqOWg2-cS1dzm3yMf7omBnhC3cpAuwrIfbNJWLtTE8w9D90jKF93cA9wCXqWAuKpo2zgf1gJm$>
> >> > GitHub : ppatierno <https://urldefense.com/v3/__https://github.com/ppatierno__;!!Ayb5sqE7!oZRzwnaATPfi2VSJp6vrqOWg2-cS1dzm3yMf7omBnhC3cpAuwrIfbNJWLtTE8w9D90jKF93cA9wCXqWAuKpo2zoIqt5M$>
> >> >
> >>
> >
>


-- 
-David
