Re: [DISCUSS] KIP-1262: Enable auto-formatting directories

Kevin Wu Wed, 25 Feb 2026 14:23:22 -0800

Hi Alyssa,

Thanks for the replies and questions.


RE AH1: The goal of the KIP (at least for now) is to remove the need to run
`kafka-storage format` before running kafka on brokers and observer
controllers. Sure, I can change the wording.

RE AH2: I don't think you would need to? Mainly because going forward,
meta.properties V2 will be written on both newly formatted nodes and nodes
that skip formatting. Additionally, starting kafka on any node requires
cluster.id to be present in meta.properties. Otherwise, the node crashes.

RE AH5: I specifically mentioned the `.checkpoint` file created from
formatting as part of this write path because the active controller writes
its contents to the log when it becomes leader. If there is no
bootstrap/0-0.checkpoint on the leader, it will default to the latest MV,
and after this KIP, that would mean the active controller would write
ClusterIdRecord alongside the MV during the bootstrap metadata write.
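
A rough sketch of that bootstrap write in Java, in case it helps. This is
not the actual controller code: `BootstrapWriteSketch`, the integer MV
levels, and the value picked for "X" (the first MV that supports
ClusterIdRecord) are placeholders for illustration, and records are
modeled as plain strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch, not Kafka code. MV levels are modeled as plain ints.
final class BootstrapWriteSketch {
    // Placeholder values: "X" is the first MV that supports ClusterIdRecord.
    static final int FIRST_MV_WITH_CLUSTER_ID = 30;
    static final int LATEST_MV = 30;

    // Records the first active controller would write when it becomes leader.
    static List<String> bootstrapRecords(Optional<Integer> checkpointMv,
                                         String clusterId) {
        // No bootstrap checkpoint on the leader: default to the latest MV.
        int mv = checkpointMv.orElse(LATEST_MV);
        List<String> records = new ArrayList<>();
        records.add("FeatureLevelRecord(metadata.version=" + mv + ")");
        // The cluster id record is only written if the MV supports it.
        if (mv >= FIRST_MV_WITH_CLUSTER_ID) {
            records.add("ClusterIdRecord(clusterId=" + clusterId + ")");
        }
        return records;
    }
}
```

With no checkpoint, `bootstrapRecords(Optional.empty(), id)` returns the MV
record plus the ClusterIdRecord. With a checkpoint MV below X, it returns
only the MV record.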

RE AH6: Right now, starting kafka on any node requires cluster.id to be
present in meta.properties. Otherwise, the node crashes. As part of this
KIP, we no longer want that to be the case for every node. There are two
approaches for this, one that I think is more suitable for a minor release,
and one I think is better for 5.0 due to compatibility cases (although this
is definitely still up for discussion and I'd like to know everyone's
thoughts).
The former is what I described in the KIP, which is that any kafka node
that can become the active controller must have meta.properties with a
cluster.id to start kafka successfully. This removes the formatting
requirement for broker-only nodes and controllers who are not part of the
KRaft voter set. In this case, we do not need to handle the case where
the leader does not have a meta.properties, because this node would have
crashed before becoming leader.
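
As a sketch of that startup rule in Java (hypothetical names, not Kafka
APIs; `canBecomeActiveController` is an assumed flag standing for "this
node is in the KRaft voter set when it starts"):

```java
import java.util.Optional;

// Hypothetical sketch of the startup rule in the first approach.
final class StartupRuleSketch {
    enum Outcome { START, START_THEN_DISCOVER_CLUSTER_ID, CRASH }

    // canBecomeActiveController: true for controllers in the KRaft voter set,
    // false for brokers and for controllers outside the voter set.
    static Outcome onStartup(boolean canBecomeActiveController,
                             Optional<String> clusterIdInMetaProperties) {
        if (clusterIdInMetaProperties.isPresent()) {
            return Outcome.START;
        }
        // Without a cluster.id, only nodes that can never lead may start.
        return canBecomeActiveController
            ? Outcome.CRASH
            : Outcome.START_THEN_DISCOVER_CLUSTER_ID;
    }
}
```

Since a voter without cluster.id never gets past startup, the elected
leader always has one on disk.
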
The latter is to remove the requirement for all nodes to format (keep in
mind that dynamic quorums currently must have at least one node format to
elect a leader in a proper configuration), but formatting would still be
optional to allow for control over feature levels, etc. In this case, kafka
would need to write its own cluster id upon electing an active controller.
However, we would then need to support dynamic quorum bootstrapping during
startup too. I don't think this is a good idea at the moment, as it would
mean introducing a static config to determine who are the "bootstrap
controllers," and the formatting logic for dynamic quorum is then
duplicated at startup. What happens if the static config exists and the
node was also formatted? You can imagine this same problem for every other
feature too. Basically, I think it is really confusing to users and bad
design if kafka can work without formatting any nodes, but also you can
format if you want. If we want to remove the requirement of any node having
to format, we should just remove the `kafka-storage format` and probably
the whole storage tool altogether, but that should be in a major release.
If we go with this approach now, another concern I have is about older MV
clusters whose nodes have software versions that support this feature. How
would a new software version broker persist the cluster id if the operator
did not format the node? I think it is easier to communicate how the latter
approach works in a major release.

Best,
Kevin Wu

On Wed, Feb 25, 2026 at 9:57 PM Alyssa Huang via dev <[email protected]>
wrote:

> Hey Kevin, thanks for the KIP!
>
> AH1:
>
> > The reason for this KIP is to remove the requirement of brokers needed to
> > run the storage tool before starting Kafka.
>
> I'd like to clarify whether you're saying the goal of the KIP is to remove
> the need of running just the storage format command or the storage tool at
> all. Also, I misunderstood
>
> > Persisting this data does not need to be done before starting kafka.
>
>  as saying we don't need to persist this data at all and that actually
> messed up my understanding of the rest of your KIP on my first read - can
> we change the wording to "Persisting this data does not have to be done
> during storage formatting and can be done later during startup"?
>
> AH2: In the MV upgrade case, do we see any value in updating the
> meta.properties file to v2?
>
> AH3:
>
> > Since this feature is associated with a new metadata record and
> > MetadataVersion, broker bootstrapping with cluster ID is required on all
> > MVs < X where X is the first MV that supports this feature. Because some
> > MetadataVersion is resolved during each node's formatting, we can determine
> > at format time if a ClusterIdRecord is needed as part of a controller's
> > 0-0/bootstrap.checkpoint.
>
> Can you go into some detail on how the MV is resolved during a node's
> formatting? This might be interesting for readers to know in the
> provisioning and re-formatting/adding a single node case.
>
> AH4: Under Proposed Changes the first header is:
>
> > meta.properties can be written during kafka broker/controller startup if
> > it doesn't exist already (from formatting)
>
> Nit, but should this be 'meta.properties' *will* be written during kafka
> broker/controller startup if it doesn't exist already?
>
> AH5: Could the following be expanded a bit? "write-path" might be too
> generic since all nodes will write the MV they are formatted with to the
> bootstrap metadata (but only on leader election will the elected node
> attempt to write the bootstrap metadata MV to the metadata log). I found
> "formatting a node (specifically controllers who can become leader)"
> especially confusing, since I'm assuming you're referring to the
> write-to-metadata-log path for MV.
>
> > This is enforceable along the write-path for MV, which occurs at two
> > points: formatting a node (specifically controllers who can become leader)
> > and upgrading the MV using kafka-features upgrade
>
>
> AH6:
>
> > However, kafka should still be able to handle the case where a leader is
> > elected who does not have clusterId in meta.properties, which can occur if
> > a majority of voters do not have a clusterId
>
> The wording makes it sound like this is the exceptional case, but before I
> read JR6 and your response I had assumed this would be the normal case for a
> cluster provisioned with KIP-1262 support. Can we make this more clear in
> the KIP? Why are we making cluster id optional in meta.properties v2 if we
> expect to still write it?
>
> > We can still enforce that bootstrap controllers must have formatted (and
> > therefore persisted a cluster id to meta.properties) prior to starting
> > kafka.
>
> I'm confused if this is saying we will continue to enforce persisting
> cluster id to meta.properties (given that you've also said kafka will
> handle the case where a leader is elected w/o clusterid in its
> meta.properties)
>
> AH7:
>
> > Nodes discover persist cluster id to meta.properties from metadata
> > publishing pipeline
> >
> Seems like there's an extra word in this header
>
> Thanks in advance for your responses :)
> Alyssa
>
> On Wed, Feb 18, 2026 at 8:49 AM Kevin Wu <[email protected]> wrote:
>
> > Hi Jun,
> >
> > Thanks for the replies and questions.
> >
> > RE JR1: Updated the KIP with the record schema for ClusterIdRecord. One
> > thing I'm not sure about yet is whether or not the record field should be
> > of UUID or String type. This is because kafka's quickstart docs refer to
> > setting `--cluster-id` to a UUID in the storage tool. However, many places
> > in kafka broker/controller code (e.g. the raft client, broker lifecycle
> > manager, and even the formatter itself) only require this type to be a
> > String. Since not all Strings are valid UUIDs, making this record field of
> > type UUID might be too restrictive and complicate upgrading the MV for
> > existing clusters, since they might have a non-UUID cluster id string, but
> > need to write this record when upgrading to an MV that supports this
> > feature. Let me know what you think.
> >
> > RE JR2: Any controller node formatted with `--standalone` or
> > `--initial-controllers`, or that is part of the static voter set defined by
> > `controller.quorum.voters` can write the ClusterIdRecord by including the
> > `--cluster-id` argument to `kafka-storage format`. However, if the MV of
> > the cluster supports it, there is exactly one writer of this record to the
> > cluster metadata partition. The writer is the first active controller, who
> > writes this record alongside other bootstrap metadata records (e.g.
> > metadata version) during controller activation. At this point, we already
> > depend on MV existing, since the active controller writes these bootstrap
> > metadata records as a transaction if the MV supports it. I think writing
> > the cluster id record would follow a similar pattern.
> >
> > RE JR3: When a node formats, it will write the meta.properties file. During
> > formatting, a node must resolve the MV it wants to format with, which is
> > explained more in RE JR5. I need to think about this more, but I think we
> > should keep `--cluster-id` as a required flag for invoking the format
> > command. If a broker/observer controller does not format, meta.properties
> > is written without cluster id immediately after startup (i.e. where we read
> > it from disk now in KafkaRaftServer).
> >
> > RE JR4: Yeah, will do. In this context, when I say observers I'm referring
> > to any controllers who are not part of the KRaft voter set when they start
> > kafka, or any brokers. I will make this explicit in the KIP. From the
> > perspective of this feature and KRaft leader election, controller nodes who
> > format with `--no-initial-controllers`, controller nodes who are not part
> > of `controller.quorum.voters`, and brokers, all do not "need" to format,
> > since they cannot become the active controller. This means they can resolve
> > metadata like the cluster id after discovering the leader. We have a
> > similar pattern with how controller nodes who format with
> > `--no-initial-controllers` discover the kraft version of the cluster.
> >
> > RE JR5: If a node formats, it must resolve a metadata version with which to
> > format. This comes from the `--release-version/--feature` flag and defaults
> > to the latest production MV. Therefore, when a node formats with a metadata
> > version that supports this feature, it will write the ClusterIdRecord to
> > its `0-0/bootstrap.checkpoint`. If the node formats with a metadata
> version
> > that does not support this feature, it does not write ClusterIdRecord to
> > its `0-0/bootstrap.checkpoint`. If a node skips formatting, it is assumed
> > that this node is part of a cluster whose MV supports this. Otherwise, this
> > is a misconfiguration and the node will fail to register with the leader
> > since there is no way for it to persist cluster id to its meta.properties
> > without formatting.
> >
> > Although I did not specify this yet on the KIP explicitly, after some
> > offline discussion I think it makes sense to enforce the following
> > invariant as part of the feature design: if the persisted metadata version
> > supports this feature, the ClusterId record must also be persisted. This is
> > enforceable on the write-path for MV, which occurs at two points-- during
> > formatting and during feature upgrades. There is a similar pattern with
> > kraft.version, as it gets written to disk at the same two points.
> >
> > RE JR6: The main motivation for writing cluster id to meta.properties as
> > well is because it can act as a projection of the cluster metadata
> > partition which essentially only exposes the cluster id to readers. For
> > example, the raft layer needs to be aware of the cluster id for its own RPC
> > handling/validation, but raft cannot read metadata records. There are many
> > readers of this cluster id value during the startup of the cluster.
> > Therefore, avoiding a read of the metadata partition to discover the value
> > of this metadata will prevent more complications of the startup code.
> >
> > Best,
> > Kevin Wu
> >
> >
> > On Tue, Feb 17, 2026 at 7:35 PM Jun Rao via dev <[email protected]>
> > wrote:
> >
> > > Hi, Kevin,
> > >
> > > Thanks for the KIP. A few comments.
> > >
> > > JR1. ClusterIdRecord : Could you define the record format?
> > >
> > > JR2. "a new MetadataVersion that supports encoding/decoding this record.
> > > This means that during formatting, the bootstrap ClusterIdRecord is only
> > > written if the cluster is formatted with a MV that supports this feature."
> > > Could you describe who writes the ClusterIdRecord? Is it the leader
> > > controller? Also, when is the record written? Do we guarantee that MV is
> > > available at that time?
> > >
> > > JR3. "meta.properties can be written during kafka broker/controller startup
> > > if it doesn't exist already (from formatting)"
> > > Could you describe when meta.properties is written? Is MV available at that
> > > time?
> > >
> > > JR4. "Introduce a metadata record for cluster id + observers persist
> > > cluster id to meta.properties from metadata publishing pipeline"
> > > Could you clarify what observers are? Are they observer controllers or are
> > > they brokers (which are referred to as observers to the controller)?
> > >
> > > JR5. "Bootstrap controllers can add a mandatory “cluster id” record during
> > > formatting"
> > > This sounds like adding a ClusterIdRecord is optional. If so, could you
> > > describe when a record will be added and when a record will not be added?
> > >
> > > JR6. "However, kafka should still be able to handle the case where a leader
> > > is elected without a cluster id in meta.properties, since KRaft does not
> > > need cluster.id in order to elect a leader.
> > > In this case, the active controller will write a cluster id
> > > record during the bootstrap metadata write."
> > > Hmm, earlier, the KIP says "Upon discovering the cluster ID for the first
> > > time, these nodes need to persist this to meta.properties". Why do we need
> > > to introduce a separate place to write the cluster id to meta.properties?
> > >
> > > Jun
> > >
> > >
> > > On Wed, Feb 11, 2026 at 10:21 AM Kevin Wu <[email protected]> wrote:
> > >
> > > > Hi all,
> > > >
> > > > Manually bumping this thread after finalizing a design.
> > > > KIP link:
> > > >
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1262%3A+Enable+auto-formatting+directories
> > > >
> > > > Best,
> > > > Kevin Wu
> > > >
> > > > On Tue, Jan 6, 2026 at 7:18 AM Kevin Wu <[email protected]> wrote:
> > > >
> > > > > Hello all,
> > > > >
> > > > > I would like to start a discussion on KIP-1262, which proposes removing
> > > > > the formatting requirement for brokers and observer controllers. Currently,
> > > > > I am considering two high-level designs, and would appreciate community
> > > > > feedback on both approaches to decide on a final design.
> > > > >
> > > > > KIP link:
> > > > >
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1262%3A+Enable+auto-formatting+directories
> > > > >
> > > > > Best,
> > > > > Kevin Wu
> > > > >
> > > >
> > >
> >
>
