Thinking about this some more, I have changed the error code returned on
receipt of an incorrect cluster ID to REBOOTSTRAP_REQUIRED, matching the
handling of an incorrect node ID. This is because I have heard of situations
in which people use rebootstrapping to switch clusters for recovery purposes,
so it's important that a retriable error is used. Logging on client and server
will indicate when the checks fail, so the KIP's aim of making
misconfiguration diagnosis easier will be satisfied, while the clients remain
tolerant of intentional changes which should drive rebootstrapping.
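
To make the intended behaviour concrete, here is a rough Java sketch. It is
an illustration only, not code from the KIP or from the Kafka codebase: apart
from the error name REBOOTSTRAP_REQUIRED, every class, method and field name
is hypothetical, and tracking a single node ID per client is a simplification.
It assumes the broker checks only the values the client actually sends, and
that the client discards what it has learned when it is told to rebootstrap.

```java
import java.util.Optional;

// Hypothetical sketch: apart from the error name REBOOTSTRAP_REQUIRED, every
// name here is invented for illustration and is not a Kafka API.
public class MisroutedConnectionSketch {

    public enum Outcome { NONE, REBOOTSTRAP_REQUIRED }

    // Broker side: compare whatever the client supplied with the broker's own
    // identity. A bootstrapping client supplies nothing, so nothing is checked.
    public static Outcome validate(String brokerClusterId, int brokerNodeId,
                                   Optional<String> sentClusterId,
                                   Optional<Integer> sentNodeId) {
        boolean clusterOk = sentClusterId.map(brokerClusterId::equals).orElse(true);
        boolean nodeOk = sentNodeId.map(id -> id == brokerNodeId).orElse(true);
        // The same retriable error is used for both kinds of mismatch.
        return (clusterOk && nodeOk) ? Outcome.NONE : Outcome.REBOOTSTRAP_REQUIRED;
    }

    // Client side: what the client has learned from its last bootstrap.
    public static final class ClientState {
        public Optional<String> clusterId = Optional.empty();
        public Optional<Integer> nodeId = Optional.empty();

        // Record the identity learned from a Metadata response.
        public void onMetadata(String learnedClusterId, int learnedNodeId) {
            clusterId = Optional.of(learnedClusterId);
            nodeId = Optional.of(learnedNodeId);
        }

        // On REBOOTSTRAP_REQUIRED, forget the learned identity so that the next
        // request looks exactly like a first bootstrap.
        public void onOutcome(Outcome outcome) {
            if (outcome == Outcome.REBOOTSTRAP_REQUIRED) {
                clusterId = Optional.empty();
                nodeId = Optional.empty();
            }
        }
    }
}
```

In this sketch, a client that has learned one cluster's ID and is then
deliberately pointed at a different cluster gets REBOOTSTRAP_REQUIRED once,
clears its state, and its next request passes validation against the new
cluster.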

Unless there are further comments, I will start voting on this KIP next week.

Thanks,
Andrew

On 2026/03/02 18:36:07 Rajini Sivaram wrote:
> Hi Andrew,
> 
> Thanks for the update, looks good.
> 
> Regards,
> 
> Rajini
> 
> On Mon, Mar 2, 2026 at 1:57 PM Andrew Schofield <[email protected]>
> wrote:
> 
> > Hi Rajini,
> > Thanks for your comments.
> >
> > I have changed the KIP such that the client discards cluster ID and node
> > information when rebootstrapping begins.
> >
> > I have also added a common client configuration to disable sending of the
> > cluster ID and node ID information, just in case there is a situation in
> > which the assumptions behind this KIP do not apply to an existing
> > deployment.
> >
> > Thanks,
> > Andrew
> >
> > On 2026/03/02 12:09:46 Rajini Sivaram wrote:
> > > Hi Andrew,
> > >
> > > Thanks for the KIP.
> > >
> > > The KIP says:
> > > > If the client is bootstrapping, it does not supply ClusterId or NodeId.
> > > > After bootstrapping, during which it learns the information from its
> > > > initial Metadata response, it supplies both.
> > >
> > > > It will be good to clarify the behaviour during re-bootstrapping. We
> > > > clear the current metadata during re-bootstrap and revert to bootstrap
> > > > metadata.
> > > At this point, we don't retain node ids or cluster id from previous
> > > metadata responses. I think this makes sense because we want
> > > re-bootstrapping to behave in the same way as the first bootstrap. If we
> > > retain this behaviour, validation of cluster id and node-id will be based
> > > on the Metadata response of the last bootstrap, which is not necessarily
> > > the initial Metadata response. I think this is the desired behaviour, can
> > > we clarify in the KIP?
> > >
> > > Kafka clients have always supported cluster id change without requiring
> > > restart. Do we need an opt-out in case some deployments rely on this
> > > feature? If re-bootstrapping is enabled, clients would re-bootstrap if
> > > connections consistently fail. So as long as we continue to clear old
> > > metadata on re-bootstrap, we should be fine. Not sure if we need an
> > > explicit opt-out for the case where re-bootstrapping is disabled.
> > >
> > > Thanks,
> > >
> > > Rajini
> > >
> > >
> > > On Thu, Feb 12, 2026 at 1:43 PM Andrew Schofield <[email protected]>
> > > wrote:
> > >
> > > > Hi David,
> > > > Thanks for your question.
> > > >
> > > > Here's one elderly JIRA I've unearthed which is related
> > > > https://issues.apache.org/jira/browse/KAFKA-15828.
> > > >
> > > > > I am also aware of suspected problems in the networking for cloud
> > > > > providers which occasionally seem to route connections to the wrong
> > > > > place.
> > > >
> > > > > The KIP is aiming to get some basic diagnosis and recovery into the
> > > > > Kafka protocol where today there is none. As you can imagine, there is
> > > > > total mayhem when a client confidently thinks it's talking to one broker
> > > > > when actually it's talking to quite another. Diagnosis of this kind of
> > > > > problem would really help in getting to the bottom of rare issues such
> > > > > as these.
> > > >
> > > > Thanks,
> > > > Andrew
> > > >
> > > > On 2026/02/11 16:12:50 David Arthur wrote:
> > > > > > Thanks for the KIP, Andrew. I'm all for making the client more robust
> > > > > > against networking and deployment weirdness.
> > > > > >
> > > > > > I'm not sure I fully grok the scenario you are covering here. It
> > > > > > sounds like you're guarding against a hostname being reused by a
> > > > > > different broker. Does the client not learn about the new broker
> > > > > > hostnames when it refreshes metadata periodically?
> > > > >
> > > > > -David
> > > > >
> > > > > > On Thu, Nov 20, 2025 at 5:59 AM Andrew Schofield <[email protected]>
> > > > > > wrote:
> > > > >
> > > > > > Hi,
> > > > > > > I would like to start discussion of a new KIP for detecting and
> > > > > > > handling misrouted connections from Kafka clients. The Kafka
> > > > > > > protocol does not contain any information for working out when the
> > > > > > > broker metadata information in a client is inconsistent or stale.
> > > > > > > This KIP proposes a way to address this.
> > > > > >
> > > > > >
> > > > > >
> > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1242%3A+Detection+and+handling+of+misrouted+connections
> > > > > >
> > > > > > Thanks,
> > > > > > Andrew
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > David Arthur
> > > > >
> > > >
> > >
> >
> 
