Hi everyone,

I've updated the KIP with the following changes:

- Using the in-memory LeaderState and ClusterImage to
  retrieve the directory UUID and controller endpoints. The previous
  approach has been moved to the "Rejected Alternatives" section.

- When both --command-config and --controller-id are provided in
  the add-controller command, the config file will only be applied
  during Admin initialization.

One point for discussion:
When adding controllers, there could be a scenario where observers
have different directory UUIDs but share the same node ID. In such
cases, I propose returning a non-retryable IllegalStateException and
notifying the user that they should explicitly specify the directory
UUID when adding a voter.

In KIP-853, this scenario can occur during a node disk failure and is
also permitted during Kafka controller startup. For example, we can
start a standalone controller with node ID 3000, then start two
observers with the same node ID 3001 but different directory UUIDs.

Best,
Kuan-Po Tseng

On 2025/08/01 07:47:43 Luke Chen wrote:
> Hi Kuan-Po,
> 
> > L03
> > In KAFKA-19563, you mentioned introducing a new
> --add-controller-command-config option to separate admin and
> controller configs. That sounds like a solid approach.
> 
> Yes, that can resolve the issue in KAFKA-19563, but it conflicts with the
> motivation in the KIP:
>    1. Limited Accessibility: The node executing the tool must have direct
> access to the metadata path of the node being added or removed. This
> restricts the ability to use node A to manage node B, as node A may not
> have access to the metadata folder on node B.
>    2. Dependency on Node Configuration: The tool requires access to the
> configuration of the node being managed.
> 
> > That said, I was thinking of an alternative:
>  what if we change the logic so that *when both* --command-config
> *and* --controller-id are provided, the command config only applies
> to admin initialization? Users can add extra admin properties in
> --command-config.
> 
> I'm good with it.
> 
> >L04
> > As jose mentioned, the controller (observer) endpoints are stored in the
> ClusterImage
> 
> Yes, I like this idea since we already have endpoint information stored in
> the active controller.
> 
> Thank you.
> Luke
> 
> 
> On Fri, Aug 1, 2025 at 12:40 AM Chia-Ping Tsai <chia7...@gmail.com> wrote:
> 
> > hi Kuan-Po
> >
> > Yeah, I agree — to make things easier, we could enhance the logic.
> > If the user provides --bootstrap-server, we can retrieve
> > controller.quorum.bootstrap.servers from the broker configs
> > using admin.describeConfigs, and use that to get the required
> > controller configs.
> >
> >
> > As jose mentioned, the controller (observer) endpoints are stored in the
> > ClusterImage.
> >
> > Another (more complicated) approach would be to make the endpoints in
> > AddRaftVoterRequest optional.
> >
> > Initially, I thought this KIP could address this issue in a simple way,
> > without requiring any changes to the RPC :)
> >
> > Best,
> > Chia-Ping
> >
> > Kuan-Po Tseng <brandb...@gmail.com> 於 2025年8月1日 週五 上午12:05寫道:
> >
> > > Hi Luke,
> > >
> > > Appreciate the feedback — glad to finally see some voices in this thread!
> > > :)
> > >
> > > *L01, L02:*
> > > Sure thing!
> > >
> > > *L03:*
> > > Thanks for the heads-up — I hadn’t thought of that before.
> > > In KAFKA-19563, you mentioned introducing a new
> > > --add-controller-command-config option to separate admin and
> > > controller configs. That sounds like a solid approach.
> > >
> > > That said, I was thinking of an alternative:
> > > what if we change the logic so that *when both* --command-config
> > > *and* --controller-id are provided, the command config only applies
> > > to admin initialization? Users can add extra admin properties in
> > > --command-config.
> > >
> > > What do you think?
> > >
> > > *L04:*
> > > Yeah, I agree — to make things easier, we could enhance the logic.
> > > If the user provides --bootstrap-server, we can retrieve
> > > controller.quorum.bootstrap.servers from the broker configs
> > > using admin.describeConfigs, and use that to get the required
> > > controller configs.
> > >
> > > Then we could allow users to run something like:
> > >
> > > bin/kafka-metadata-quorum.sh --bootstrap-server localhost:9092 \
> > >   add-controller --controller-id <id>
> > >
> > >
> > > Best,
> > > Kuan-Po
> > >
> > > On Thu, Jul 31, 2025 at 3:24 PM Luke Chen <show...@gmail.com> wrote:
> > >
> > > > Hi Kuan-Po,
> > > >
> > > > Thanks for the KIP!
> > > > We also faced the similar issue recently.
> > > >
> > > > Some suggestions/questions:
> > > > 1. Could you also include the issue in  KAFKA-19563
> > > > <https://issues.apache.org/jira/browse/KAFKA-19563> as the motivation
> > in
> > > > the KIP?
> > > > 2. Could you make it clear in the KIP that after the KIP, the
> > > > add-controller request from kafka-metadata-quorum.sh will take more
> > time
> > > > than before because of more API calls needed?
> > > > 3. "If —-command-config is provided, fallback to the existing behavior,
> > > and
> > > > —-controller-id option will be ignored."
> > > > On this, it looks like it still doesn't resolve the issue described in
> > > > KAFKA-19563 <https://issues.apache.org/jira/browse/KAFKA-19563>
> > because
> > > if
> > > > the client wants to pass the "client.id", and provide a properties
> > file
> > > in
> > > > --command-config, it 'll be treated as the controller.properties and
> > fail
> > > > the request, right? Ex:
> > > > `bin/kafka-metadata-quorum.sh --bootstrap-controller localhost:9093
> > > > add-controller --controller-id <id> --command-config admin.properties`
> > > > 4. "This can only be done with bootstrap controller option since we
> > can‘t
> > > > use bootstrap.server in Admin#describeConfigs to get controller
> > configs."
> > > > I understand the reason but it makes the script more confusing in my
> > > > opinion.
> > > > I'm wondering why we need the controller endpoint when talking with the
> > > > broker with --bootstrap-server? The controller endpoint should already
> > > > exist in broker itself, right?
> > > >
> > > > Thank you.
> > > > Luke
> > > >
> > > > On Sat, Apr 19, 2025 at 11:32 PM Kuan Po Tseng <brandb...@apache.org>
> > > > wrote:
> > > >
> > > > > Hi José,
> > > > >
> > > > > Apologies for the delayed response. Do you still have any questions
> > > > > regarding the describeConfig part? As Chia-Ping mentioned, we are
> > able
> > > to
> > > > > retrieve all broker configurations through that method. (
> > > > >
> > > >
> > >
> > https://github.com/apache/kafka/blob/da46cf6e79afbbed1da2bae831e0f70992e85f9b/core/src/main/scala/kafka/server/ConfigHelper.scala#L121-L123
> > > > > )
> > > > >
> > > > > Please feel free to reach out if you have any further questions or
> > need
> > > > > clarification. Thank you again for your valuable feedback!
> > > > >
> > > > > Best,
> > > > > Kuan-Po Tseng
> > > > >
> > > > > On 2025/04/03 16:04:36 José Armando García Sancio wrote:
> > > > > > Hi Chia,
> > > > > >
> > > > > > On Thu, Apr 3, 2025 at 10:24 AM Chia-Ping Tsai <
> > chia7...@apache.org>
> > > > > wrote:
> > > > > > > We propose to use `Admin#describeConfigs` to get the configs for
> > > > > specific controller if the bootstrap.controllers is configured. This
> > > > > approach is similar to what `MetadataQuorumCommand` does, and the
> > > > > difference is `MetadataQuorumCommand` read those configs from local
> > > file
> > > > > and this KIP gets those configs by `Admin#describeConfigs`
> > > > > >
> > > > > > I am not sure. I have to look at that code but doesn't
> > > > > > "Admin#describeConfigs" only return dynamic configuration for the
> > > > > > controller? Most users configure the controller using the server
> > > > > > properties file. My current understanding is that values coming
> > from
> > > > > > the properties file won't show up in Admin#describeConfigs.
> > > > > >
> > > > > > Thanks,
> > > > > > --
> > > > > > -José
> > > > > >
> > > > >
> > > >
> > >
> >
> 

Reply via email to