Hi Paolo,

When the controller nodes are formatted as `--initial-controllers`, they
will treat themselves as voters. And because of this, they will try to send
out vote requests to other voters to try to form a quorum when no quorum is
formed yet. If there are 3 controllers [0,1,2] format with this like
documented in the kafka doc
<https://kafka.apache.org/42/operations/kraft/#bootstrap-with-multiple-controllers>
,

bin/kafka-storage.sh format --cluster-id ${CLUSTER_ID} \
                     --initial-controllers
"0@controller-0:1234:${CONTROLLER_0_UUID},1@controller-1:1234:${CONTROLLER_1_UUID},2@controller-2:1234:${CONTROLLER_2_UUID}"
\
                     --config config/controller.properties

When the nodes 0, 1, 2 start up, they will form a quorum because no
active quorum is formed yet. But if the 4th node is formatted as
`--initial-controllers`, when starts up,
it'll talk to a bootstrap server and get the current quorum status and
later fetch from the quorum leader to get the current voters list,
which only contains [0,1,2].

But if, for example, the controller 0 already formed a quorum. Later,
we try to format controller [1, 2] with the `--initial-controllers` as
above. Then, what might happen is:

1. Controller 0 is up and healthy, the controller 1, 2 finds out the
controller 0 is up and it is the quorum leader, so they both fetch
from controller 0 to get the current metadata logs.

2. Controller 0 is down, or inaccessible, the controller 1, 2 don't
know the existence of quorum formed by controller 0, so the controller
[1, 2] forms a quorum successfully. The split brain issue happens.

In summary, it's risky to format new added controllers by
`--initial-controllers`. We should follow what the document said to
format with `--no-initial-controllers`.

Thank you,

Luke



On Wed, Feb 18, 2026 at 8:47 PM Paolo Patierno <[email protected]>
wrote:

> Hi all,
> the current official documentation differentiates between formatting
> storage for multiple controllers when starting a new cluster versus adding
> a new controller during scale-up operations.
> For a new cluster, the "initial controllers" must be provided (-I flag in
> the storage tool), which creates a checkpoint file that controllers use to
> discover each other and form the quorum.
> For scale-up, the documentation indicates formatting should be done without
> providing initial controllers (using the -N flag instead).
>
> What I've discovered is that using --initial-controllers during scale-up
> also works correctly. For example:
> - Start with 3 controllers (0, 1, 2) formatted with --initial-controllers
> 0,1,2
> - When scaling to 4 controllers, format controller 3 with
> --initial-controllers 0,1,2,3
> - The new controller successfully discovers the quorum by fetching metadata
> (the local checkpoint file is created but not used for discovery)
> - The add-controller operation via admin API is still required
>
> As you can see, the new controller starts, fetches the metadata and becomes
> an observer. It doesn't join the quorum automatically until the
> add-controller operation.
>
> This approach provides consistency, especially valuable in automated
> deployment environments, as there's no need to distinguish between initial
> cluster setup and scale-up operations.
>
> I've opened PR #21507 <https://github.com/apache/kafka/pull/21507> with a
> test demonstrating this works as expected.
>
> Are there any thoughts about this or concerns from the community to
> consider this procedure still valid?
>
> Thanks,
> Paolo Patierno
>
> --
> Paolo Patierno
>
> *Senior Principal Software Engineer @ IBM**CNCF Ambassador*
>
> Twitter : @ppatierno <http://twitter.com/ppatierno>
> Linkedin : paolopatierno <http://it.linkedin.com/in/paolopatierno>
> GitHub : ppatierno <https://github.com/ppatierno>
>

Reply via email to