Hey all,

While I understand your points Divij, I am also not in favor of having two
official release branches being developed at the same time.
If we are really concerned about the metrics change or any other JIRA
ticket, we can have a separate branch for that, rather than a new release
branch. I think it will be difficult to merge with trunk when it is time to
do a real release.

I'm interested to hear Ismael's new proposal :)

Justine



On Thu, Dec 21, 2023 at 8:00 AM Divij Vaidya <divijvaidy...@gmail.com>
wrote:

> Fair point, David. The point of an experimental release was to allow users to
> test the initial major version and allow for developers to start working on
> the major version. Even if we don't release, I think that there is value in
> starting a 4.x branch (separate from trunk).
>
> Having a 4.x branch will allow us to start developing (or removing) things
> that we are currently unable to do due to constraints of having to maintain
> backward compatibility of JDK 8 and other deprecated APIs/dependencies. If
> we don't do it right now and instead choose to do it after 3.8, there is
> very limited time (~3-4 months) for that branch to bake and make the
> required changes.
>
> As an example, our metrics library (metrics-core) is still running a
> version (2.2.0) from 2012. Upgrading it is a breaking change (long story,
> not relevant to this thread) and hence, we can't merge it to trunk right
> now. So, we will have to schedule this change between 3.8 & 4.0. What if we
> don't have developer bandwidth to work on this change during that 3 month
> window? With a 4.x branch, we can start building (and more importantly,
> testing!) changes for the next major version right away. There are numerous
> other things (I came across another one
> https://issues.apache.org/jira/browse/KAFKA-16041) that we can start doing
> now for 4.x.
>
> What do you think?
>
> --
> Divij Vaidya
>
>
>
> On Thu, Dec 21, 2023 at 4:30 PM David Jacot <dja...@confluent.io.invalid>
> wrote:
>
> > Hi Divij,
> >
> > > Release 4.0 as an "experimental" release
> >
> > I don't think that this is something that we should do. If we need more
> > time, we should just do a 3.8 release and then release 4.0 when we are
> > ready. An experimental major release will be more confusing than anything
> > else. We should also keep in mind that major releases are also adopted with
> > more scrutiny in general. I don't think that many users will jump to 4.0
> > anyway. They will likely wait for 4.0.1 or even 4.1.
> >
> > Best,
> > David
> >
> > On Thu, Dec 21, 2023 at 3:59 PM Divij Vaidya <divijvaidy...@gmail.com>
> > wrote:
> >
> > > Hi folks
> > >
> > > I am late to the conversation but I would like to add my point of view
> > > here.
> > >
> > > I have three main concerns:
> > >
> > > 1\ Durability/availability bugs in kraft - Even though kraft has been
> > > around for a while, we keep finding bugs that impact availability and
> > > data durability in it almost with every release [1] [2]. It's a complex
> > > feature and such bugs are expected during the stabilization phase. But we
> > > can't remove the alternative until we see stabilization in kraft i.e. no
> > > new stability/durability bugs for at least 2 releases.
> > > 2\ Parity with Zk - There are also pending bugs [3] which are in the
> > > category of Zk parity. Removing Zk from Kafka without having full feature
> > > parity with Zk will leave some Kafka users with no upgrade path.
> > > 3\ Test coverage - We also don't have sufficient test coverage for kraft
> > > since quite a few tests are Zk only at this stage.
> > >
> > > Given these concerns, I believe we need to reach 100% Zk parity and allow
> > > new feature stabilisation (such as scram, JBOD) for at least 1 version
> > > (maybe more if we find bugs in that feature) before we remove Zk. I also
> > > agree with the point of view that we can't delay 4.0 indefinitely and we
> > > need a clear cut line.
> > >
> > > Hence, I propose the following:
> > > 1\ Keep trunk with 3.x. Release 3.8 and potentially 3.9 if we find major
> > > (durability/availability related) bugs in 3.8. This will help users
> > > continue to use their tried and tested Kafka setup until we have a proven
> > > alternative from feature parity & stability point of view.
> > > 2\ Release 4.0 as an "experimental" release along with 3.8 "stable"
> > > release. This will help get user feedback on the feasibility of removing
> > > Zk completely right now.
> > > 3\ Create criteria for moving 4.1 to a "stable" release instead of
> > > "experimental". This list should include 100% Zk parity and 100% Kafka
> > > tests operating with kraft. It will also include other community feedback
> > > from this & other threads.
> > > 4\ When the 4.x version is "stable", move the trunk to 4.x and stop all
> > > development on the 3.x branch.
> > >
> > > I acknowledge that earlier in the community, we have decided to make 3.7
> > > the last release in the 3.x series. But, IMO we have learnt a lot since
> > > then based on the continuous improvements in kraft. I believe we should be
> > > flexible with our earlier stance here and allow for greater stability
> > > before forcing users to a completely new functionality.
> > >
> > > [1] https://issues.apache.org/jira/browse/KAFKA-15495
> > > [2] https://issues.apache.org/jira/browse/KAFKA-15489
> > > [3] https://issues.apache.org/jira/browse/KAFKA-14874
> > >
> > > --
> > > Divij Vaidya
> > >
> > >
> > >
> > > On Wed, Dec 20, 2023 at 4:59 PM Josep Prat <josep.p...@aiven.io.invalid>
> > > wrote:
> > >
> > > > Hi Justine, Luke, and others,
> > > >
> > > > I believe a 3.8 version would make sense, and I would say KIP-853
> > should
> > > be
> > > > part of it as well.
> > > >
> > > > Best,
> > > >
> > > > On Wed, Dec 20, 2023 at 4:11 PM Justine Olshan
> > > > <jols...@confluent.io.invalid>
> > > > wrote:
> > > >
> > > > > Hey Luke,
> > > > >
> > > > > I think your point is valid. This is another good reason to have a
> > 3.8
> > > > > release.
> > > > > Would you say that implementing KIP-966 in 3.8 would be an
> acceptable
> > > way
> > > > > to move forward?
> > > > >
> > > > > Thanks,
> > > > > Justine
> > > > >
> > > > >
> > > > > On Tue, Dec 19, 2023 at 4:35 AM Luke Chen <show...@gmail.com>
> wrote:
> > > > >
> > > > > > Hi Justine,
> > > > > >
> > > > > > Thanks for your reply.
> > > > > >
> > > > > > > I think that for folks that want to prioritize availability
> over
> > > > > > durability, the aggressive recovery strategy from KIP-966 should
> be
> > > > > > preferable to the old unclean leader election configuration.
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-966%3A+Eligible+Leader+Replicas#KIP966:EligibleLeaderReplicas-Uncleanrecovery
> > > > > >
> > > > > > Yes, I'm aware that we're going to implement the new way of
> leader
> > > > > election
> > > > > > in KIP-966.
> > > > > > But obviously, KIP-966 is not included in v3.7.0.
> > > > > > What I'm worried about is the users who prioritize availability
> > over
> > > > > > durability and enable the unclean leader election in ZK mode.
> > > > > > Once they migrate to KRaft, there will be availability impact
> when
> > > > > unclean
> > > > > > leader election is needed.
> > > > > > And like you said, they can run unclean leader election via CLI,
> > but
> > > > > again,
> > > > > > the availability is already impacted, which might be unacceptable
> > in
> > > > some
> > > > > > cases.
> > > > > >
> > > > > > IMO, we should prioritize this missing feature and include it in
> > 3.x
> > > > > > release.
> > > > > > Including in 3.x release means users can migrate to KRaft in
> > > dual-write
> > > > > > mode, and run it for a while to make sure everything works fine,
> > > before
> > > > > > they decide to upgrade to 4.0.
> > > > > >
> > > > > > Does that make sense?
> > > > > >
> > > > > > Thanks.
> > > > > > Luke
> > > > > >
> > > > > > On Tue, Dec 19, 2023 at 12:15 AM Justine Olshan
> > > > > > <jols...@confluent.io.invalid> wrote:
> > > > > >
> > > > > > > Hey Luke --
> > > > > > >
> > > > > > > There were some previous discussions on the mailing list about
> > this
> > > > but
> > > > > > > looks like we didn't file the ticket
> > > > > > >
> https://lists.apache.org/thread/sqsssos1d9whgmo92vdn81n9r5woy1wk
> > > > > > >
> > > > > > > When I asked some of the folks who worked on Kraft about this,
> > they
> > > > > > > communicated to me that it was intentional to make unclean
> leader
> > > > > > election
> > > > > > > a manual action.
> > > > > > >
> > > > > > > I think that for folks that want to prioritize availability
> over
> > > > > > > durability, the aggressive recovery strategy from KIP-966
> should
> > be
> > > > > > > preferable to the old unclean leader election configuration.
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-966%3A+Eligible+Leader+Replicas#KIP966:EligibleLeaderReplicas-Uncleanrecovery
> > > > > > >
> > > > > > > Let me know if we don't think this is sufficient.
> > > > > > >
> > > > > > > Justine
> > > > > > >
> > > > > > > On Mon, Dec 18, 2023 at 4:39 AM Luke Chen <show...@gmail.com>
> > > wrote:
> > > > > > >
> > > > > > > > Hi all,
> > > > > > > >
> > > > > > > > We found that currently (the latest trunk branch), the
> unclean
> > > > leader
> > > > > > > > election is not supported in KRaft mode.
> > > > > > > > That is, when users enable `unclean.leader.election.enable`
> in
> > > > KRaft
> > > > > > > mode,
> > > > > > > > > the config won't take effect and just behaves as if
> > > > > > > > `unclean.leader.election.enable` is disabled.
> > > > > > > > KAFKA-12670 <
> https://issues.apache.org/jira/browse/KAFKA-12670
> > >
> > > > was
> > > > > > > opened
> > > > > > > > for this and is still not resolved.
> > > > > > > >
> > > > > > > > I think this is a regression issue in KRaft mode, and we
> should
> > > > > > complete
> > > > > > > > this missing feature in 3.x release, instead of adding it in
> > 4.0.
> > > > > > > > > Does anyone know the status of this issue?
> > > > > > > >
> > > > > > > > Thanks.
> > > > > > > > Luke
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Mon, Nov 27, 2023 at 4:38 PM Colin McCabe <
> > cmcc...@apache.org
> > > >
> > > > > > wrote:
> > > > > > > >
> > > > > > > > > On Fri, Nov 24, 2023, at 03:47, Anton Agestam wrote:
> > > > > > > > > > In your last message you wrote:
> > > > > > > > > >
> > > > > > > > > > > But, on the KRaft side, I still maintain that nothing
> is
> > > > > missing
> > > > > > > > except
> > > > > > > > > > > JBOD, which we already have a plan for.
> > > > > > > > > >
> > > > > > > > > > But earlier in this thread you mentioned an issue with
> > "torn
> > > > > > writes",
> > > > > > > > > > possibly missing tests, as well as the fact that the
> > > > recommended
> > > > > > > method
> > > > > > > > > of
> > > > > > > > > > replacing controller nodes is undocumented. Would you
> mind
> > > > > > clarifying
> > > > > > > > > what
> > > > > > > > > > your stance is on these three issues? Do you think that
> > they
> > > > are
> > > > > > > > > important
> > > > > > > > > > enablers of upgrade paths or not?
> > > > > > > > >
> > > > > > > > > Hi Anton,
> > > > > > > > >
> > > > > > > > > There shouldn't be anything blocking controller disk
> > > replacement
> > > > > now.
> > > > > > > > From
> > > > > > > > > memory (not looking at the code now), we do log recovery on
> > our
> > > > > > single
> > > > > > > > log
> > > > > > > > > directory every time we start the controller, so it should
> > > handle
> > > > > > > partial
> > > > > > > > > records there. I do agree that a test would be good, and
> some
> > > > > > > > > documentation. I'll probably take a look at that this week
> > if I
> > > > get
> > > > > > > some
> > > > > > > > > time.
> > > > > > > > >
> > > > > > > > > > > Well, the line was drawn in KIP-833. If we redraw it,
> > what
> > > is
> > > > > to
> > > > > > > stop
> > > > > > > > > us
> > > > > > > > > > > from redrawing it again and again?
> > > > > > > > > >
> > > > > > > > > > I'm fairly new to the Kafka community so please forgive
> me
> > if
> > > > I'm
> > > > > > > > missing
> > > > > > > > > > things that have been said in earlier discussions, but
> > > reading
> > > > up
> > > > > > on
> > > > > > > > that
> > > > > > > > > > KIP I see it has language like "Note: this timeline is
> very
> > > > rough
> > > > > > and
> > > > > > > > > > subject to change." in the section of versions, but it
> also
> > > > says
> > > > > > "As
> > > > > > > > > > outlined above, we expect to close these gaps soon" with
> > > > relation
> > > > > > to
> > > > > > > > the
> > > > > > > > > > outstanding features. From my perspective this doesn't
> > really
> > > > > look
> > > > > > > like
> > > > > > > > > an
> > > > > > > > > > agreement that dynamic quorum membership changes shall
> not
> > > be a
> > > > > > > blocker
> > > > > > > > > for
> > > > > > > > > > 4.0.
> > > > > > > > >
> > > > > > > > > The timeline was rough because we wrote that in 2022,
> trying
> > to
> > > > > look
> > > > > > > > > forward multiple releases. The gaps that were discussed
> have
> > > all
> > > > > been
> > > > > > > > > closed -- except for JBOD, which we are working on this
> > > quarter.
> > > > > > > > >
> > > > > > > > > The set of features needed for 4.0 is very clearly
> described
> > in
> > > > > > > KIP-833.
> > > > > > > > > There's no uncertainty on that point.
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > To answer the specific question you pose here, "what is
> to
> > > stop
> > > > > us
> > > > > > > from
> > > > > > > > > > redrawing it again and again?", wouldn't the suggestion
> of
> > > > > parallel
> > > > > > > > work
> > > > > > > > > > lanes brought up by Josep address this concern?
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > It's very important not to fragment the community by
> > supporting
> > > > > > > multiple
> > > > > > > > > long-running branch lines. At the end of the day, once
> branch
> > > 3's
> > > > > > time
> > > > > > > > has
> > > > > > > > > come, it needs to fade away, just like JDK 6 support or the
> > old
> > > > > Scala
> > > > > > > > > producer.
> > > > > > > > >
> > > > > > > > > best,
> > > > > > > > > Colin
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > BR,
> > > > > > > > > > Anton
> > > > > > > > > >
> > > > > > > > > > Den tors 23 nov. 2023 kl 05:48 skrev Colin McCabe <
> > > > > > > cmcc...@apache.org
> > > > > > > > >:
> > > > > > > > > >
> > > > > > > > > >> On Tue, Nov 21, 2023, at 19:30, Luke Chen wrote:
> > > > > > > > > >> > Yes, KIP-853 and disk failure support are both very
> > > > important
> > > > > > > > missing
> > > > > > > > > >> > features. For the disk failure support, I don't think
> > this
> > > > is
> > > > > a
> > > > > > > > > >> > "good-to-have-feature", it should be a "must-have"
> IMO.
> > We
> > > > > can't
> > > > > > > > > announce
> > > > > > > > > >> > the 4.0 release without a good solution for disk
> failure
> > > in
> > > > > > KRaft.
> > > > > > > > > >>
> > > > > > > > > >> Hi Luke,
> > > > > > > > > >>
> > > > > > > > > >> Thanks for the reply.
> > > > > > > > > >>
> > > > > > > > > >> Controller disk failure support is not missing from
> > KRaft. I
> > > > > > > described
> > > > > > > > > how
> > > > > > > > > >> to handle controller disk failures earlier in this
> thread.
> > > > > > > > > >>
> > > > > > > > > >> I should note here that the broker in ZooKeeper mode
> also
> > > > > requires
> > > > > > > > > manual
> > > > > > > > > >> handling of disk failures. Restarting a broker with the
> > same
> > > > ID,
> > > > > > but
> > > > > > > > an
> > > > > > > > > >> empty disk, breaks the invariants of replication when in
> > ZK
> > > > > mode.
> > > > > > > > > Consider:
> > > > > > > > > >>
> > > > > > > > > >> 1. Broker 1 goes down. A ZK state change notification
> for
> > > > > /brokers
> > > > > > > > fires
> > > > > > > > > >> and goes on the controller queue.
> > > > > > > > > >>
> > > > > > > > > >> 2. Broker 1 comes back up with an empty disk.
> > > > > > > > > >>
> > > > > > > > > >> 3. The controller processes the zk state change
> > notification
> > > > for
> > > > > > > > > /brokers.
> > > > > > > > > > >> Since broker 1 is up, no action is taken.
> > > > > > > > > >>
> > > > > > > > > >> 4. Now broker 1 is in the ISR for any partitions it was
> > > > > > previously,
> > > > > > > > but
> > > > > > > > > >> has no data. If it is or becomes leader for any
> > partitions,
> > > > > > > > > irreversible
> > > > > > > > > >> data loss will occur.
> > > > > > > > > >>
> > > > > > > > > >> This problem is more than theoretical. We at Confluent
> > have
> > > > > > observed
> > > > > > > > it
> > > > > > > > > in
> > > > > > > > > >> production and put in place special workarounds for the
> ZK
> > > > > > clusters
> > > > > > > we
> > > > > > > > > >> still have.
> > > > > > > > > >>
> > > > > > > > > >> KRaft has never had this problem because brokers are
> > removed
> > > > > from
> > > > > > > ISRs
> > > > > > > > > >> when a new incarnation of the broker registers.
> > > > > > > > > >>
> > > > > > > > > >> So perhaps ZK mode is not ready for production for
> Aiven?
> > > > Since
> > > > > > disk
> > > > > > > > > >> failures do in fact require special handling there.
> > (And/or
> > > > > > bringing
> > > > > > > > up
> > > > > > > > > new
> > > > > > > > > >> nodes with empty disks, which seems to be their main
> > > concern.)
> > > > > > > > > >>
> > > > > > > > > >> >
> > > > > > > > > >> > It’s also worth thinking about how Apache Kafka users
> > who
> > > > > depend
> > > > > > > on
> > > > > > > > > JBOD
> > > > > > > > > >> > might look at the risks of not having a 3.8 release.
> > JBOD
> > > > > > support
> > > > > > > on
> > > > > > > > > >> KRaft
> > > > > > > > > >> > is planned to be added in 3.7, and is still in
> progress
> > so
> > > > > far.
> > > > > > So
> > > > > > > > > it’s
> > > > > > > > > >> > hard to say it’s a blocker or not. But in practice,
> even
> > > if
> > > > > the
> > > > > > > > > feature
> > > > > > > > > >> is
> > > > > > > > > >> > made into 3.7 in time, a lot of new code for this
> > feature
> > > is
> > > > > > > > unlikely
> > > > > > > > > to
> > > > > > > > > >> be
> > > > > > > > > >> > entirely bug free. We need to maintain the confidence
> of
> > > > those
> > > > > > > > users,
> > > > > > > > > and
> > > > > > > > > >> > forcing them to migrate through 3.7 where this new
> code
> > is
> > > > > > hardly
> > > > > > > > > >> > battle-tested doesn’t appear to do that.
> > > > > > > > > >> >
> > > > > > > > > >>
> > > > > > > > > >> As Ismael said, if there are JBOD bugs in 3.7, we will
> do
> > > > > > follow-on
> > > > > > > > > point
> > > > > > > > > >> releases to address them.
> > > > > > > > > >>
> > > > > > > > > >> > Our goal for 4.0 should be that all the “main”
> features
> > in
> > > > > KRaft
> > > > > > > are
> > > > > > > > > in
> > > > > > > > > >> > production ready state. To reach the goal, I think
> > having
> > > > one
> > > > > > more
> > > > > > > > > >> release
> > > > > > > > > >> > makes sense. We can have different opinions about what
> > the
> > > > > “main
> > > > > > > > > >> features”
> > > > > > > > > >> > in KRaft are, but we should all agree, JBOD is one of
> > > them.
> > > > > > > > > >>
> > > > > > > > > >> The current plan is for JBOD to be production-ready in
> the
> > > 3.7
> > > > > > > branch.
> > > > > > > > > >>
> > > > > > > > > >> The other features of KRaft have been in
> production-ready
> > > > state
> > > > > > > since
> > > > > > > > > the
> > > > > > > > > >> 3.3 release. (Well, except for delegation tokens and
> > SCRAM,
> > > > > which
> > > > > > > were
> > > > > > > > > >> implemented in 3.5 and 3.6)
> > > > > > > > > >>
> > > > > > > > > >> > I totally agree with you. We can keep delaying the 4.0
> > > > release
> > > > > > > > > forever.
> > > > > > > > > >> I'd
> > > > > > > > > >> > also like to draw a line to it. So, in my opinion, the
> > 3.8
> > > > > > release
> > > > > > > > is
> > > > > > > > > the
> > > > > > > > > >> > line. No 3.9, 3.10 releases after that. If this is the
> > > > > decision,
> > > > > > > > will
> > > > > > > > > >> your
> > > > > > > > > >> > concern about this infinite loop disappear?
> > > > > > > > > >>
> > > > > > > > > >> Well, the line was drawn in KIP-833. If we redraw it,
> what
> > > is
> > > > to
> > > > > > > stop
> > > > > > > > us
> > > > > > > > > >> from redrawing it again and again?
> > > > > > > > > >>
> > > > > > > > > >> >
> > > > > > > > > >> > Final note: Speaking of the missing features, I can
> > always
> > > > > > > cooperate
> > > > > > > > > with
> > > > > > > > > >> > you and all other community contributors to make them
> > > > happen,
> > > > > > like
> > > > > > > > we
> > > > > > > > > >> have
> > > > > > > > > >> > discussed earlier. Just let me know.
> > > > > > > > > >> >
> > > > > > > > > >>
> > > > > > > > > >> Thanks, Luke. I appreciate the offer.
> > > > > > > > > >>
> > > > > > > > > >> But, on the KRaft side, I still maintain that nothing is
> > > > missing
> > > > > > > > except
> > > > > > > > > >> JBOD, which we already have a plan for.
> > > > > > > > > >>
> > > > > > > > > >> best,
> > > > > > > > > >> Colin
> > > > > > > > > >>
> > > > > > > > > >>
> > > > > > > > > >> > Thank you.
> > > > > > > > > >> > Luke
> > > > > > > > > >> >
> > > > > > > > > >> > On Wed, Nov 22, 2023 at 2:54 AM Colin McCabe <
> > > > > > cmcc...@apache.org>
> > > > > > > > > wrote:
> > > > > > > > > >> >
> > > > > > > > > >> >> On Tue, Nov 21, 2023, at 03:47, Josep Prat wrote:
> > > > > > > > > >> >> > Hi Colin,
> > > > > > > > > >> >> >
> > > > > > > > > >> >> > I think it's great that Confluent runs KRaft
> clusters
> > > in
> > > > > > > > > production,
> > > > > > > > > >> >> > and it means that it is production ready for
> > Confluent
> > > > and
> > > > > > > its
> > > > > > > > > users.
> > > > > > > > > >> >> > But luckily for Kafka, the community is bigger than
> > > this
> > > > > > (self
> > > > > > > > > managed
> > > > > > > > > > >> >> > in the cloud or on-prem, or customers of other SaaS
> > > > > > companies).
> > > > > > > > > >> >>
> > > > > > > > > >> >> Hi Josep,
> > > > > > > > > >> >>
> > > > > > > > > >> >> Confluent is not the only company using or developing
> > > > KRaft.
> > > > > > Most
> > > > > > > > of
> > > > > > > > > the
> > > > > > > > > >> >> big organizations developing Kafka are involved. I
> > > > mentioned
> > > > > > > > > Confluent's
> > > > > > > > > >> >> deployments because I wanted to be clear that KRaft
> > mode
> > > is
> > > > > not
> > > > > > > > > >> >> experimental or new. Talking about software in
> > production
> > > > is
> > > > > a
> > > > > > > good
> > > > > > > > > way
> > > > > > > > > >> to
> > > > > > > > > >> >> clear up these misconceptions.
> > > > > > > > > >> >>
> > > > > > > > > >> >> Indeed, KRaft mode is many years old. It started
> around
> > > > 2020,
> > > > > > and
> > > > > > > > > became
> > > > > > > > > > >> >> production-ready in AK 3.3 in 2022. ZK mode was
> > > deprecated
> > > > in
> > > > > > AK
> > > > > > > > 3.5,
> > > > > > > > > >> which
> > > > > > > > > >> >> was released June 2023. If we release AK 4.0 around
> > April
> > > > (or
> > > > > > > > maybe a
> > > > > > > > > >> month
> > > > > > > > > >> >> or two later) then that will be almost a full year
> > > between
> > > > > > > > > deprecation
> > > > > > > > > >> and
> > > > > > > > > >> >> removal of ZK mode. We've talked about this a lot, in
> > > KIPs,
> > > > > in
> > > > > > > > Apache
> > > > > > > > > >> blog
> > > > > > > > > >> >> posts, at conferences, and so forth.
> > > > > > > > > >> >>
> > > > > > > > > >> >> > We've heard at least from 1 SaaS company, Aiven
> > > > > (disclaimer,
> > > > > > it
> > > > > > > > is
> > > > > > > > > my
> > > > > > > > > >> >> > employer) where the current feature set makes it
> not
> > > > > trivial
> > > > > > to
> > > > > > > > > >> >> > migrate. This same issue might happen not only at
> > Aiven
> > > > but
> > > > > > > with
> > > > > > > > > any
> > > > > > > > > >> >> > user of Kafka who uses immutable infrastructure.
> > > > > > > > > >> >>
> > > > > > > > > >> >> Can you discuss why you feel it is "not trivial to
> > > > migrate"?
> > > > > > From
> > > > > > > > the
> > > > > > > > > >> >> discussion above, the main gap is that we should
> > improve
> > > > the
> > > > > > > > > >> documentation
> > > > > > > > > >> >> for handling failed disks.
> > > > > > > > > >> >>
> > > > > > > > > >> >> > Another case is for
> > > > > > > > > >> >> > users that have hundreds (or more) of clusters and
> > more
> > > > > than
> > > > > > > 100k
> > > > > > > > > >> nodes
> > > > > > > > > >> >> > experience node failures multiple times during a
> > single
> > > > > day.
> > > > > > In
> > > > > > > > > this
> > > > > > > > > >> >> > situation, not having KIP 853 makes these power
> users
> > > > > unable
> > > > > > to
> > > > > > > > > join
> > > > > > > > > > >> >> > the game as introducing a new error-prone manual
> (or
> > > > > needed
> > > > > > to
> > > > > > > > > >> >> > automate) operation is usually a huge no-go.
> > > > > > > > > >> >>
> > > > > > > > > >> >> We have thousands of KRaft clusters in production and
> > > > haven't
> > > > > > > seen
> > > > > > > > > these
> > > > > > > > > >> >> problems, as I described above.
> > > > > > > > > >> >>
> > > > > > > > > >> >> best,
> > > > > > > > > >> >> Colin
> > > > > > > > > >> >>
> > > > > > > > > >> >> >
> > > > > > > > > >> >> > But I hear the concerns of delaying 4.0 for
> another 3
> > > to
> > > > 4
> > > > > > > > months.
> > > > > > > > > >> >> > Would it help if we would aim at shortening the
> > > timeline
> > > > > for
> > > > > > > > 3.8.0
> > > > > > > > > and
> > > > > > > > > > >> >> > start with the 4.0.0 a bit earlier?
> > > > > > > > > >> >> > Maybe we could work on 3.8.0 almost in parallel
> with
> > > > 4.0.0:
> > > > > > > > > >> >> > - Start with 3.8.0 release process
> > > > > > > > > >> >> > - After a small time (let's say a week) create the
> > > > release
> > > > > > > branch
> > > > > > > > > >> >> > - Start with 4.0.0 release process as usual
> > > > > > > > > >> >> > - Cherry pick KRaft related issues to 3.8.0
> > > > > > > > > >> >> > - Release 3.8.0
> > > > > > > > > >> >> > I suspect 4.0.0 will need a bit more time than
> usual
> > to
> > > > > > ensure
> > > > > > > > the
> > > > > > > > > >> code
> > > > > > > > > >> >> > is cleaned up of deprecated classes and methods on
> > top
> > > of
> > > > > the
> > > > > > > > usual
> > > > > > > > > >> >> > work we have. For this reason I think there would
> be
> > > > enough
> > > > > > > time
> > > > > > > > > >> >> > between releasing 3.8.0 and 4.0.0.
> > > > > > > > > >> >> >
> > > > > > > > > >> >> > What do you all think?
> > > > > > > > > >> >> >
> > > > > > > > > >> >> > Best,
> > > > > > > > > >> >> > Josep Prat
> > > > > > > > > >> >> >
> > > > > > > > > >> >> > On 2023/11/20 20:03:18 Colin McCabe wrote:
> > > > > > > > > >> >> >> Hi Josep,
> > > > > > > > > >> >> >>
> > > > > > > > > >> >> >> I think there is some confusion here. Quorum
> > > > > reconfiguration
> > > > > > > is
> > > > > > > > > not
> > > > > > > > > >> >> needed for KRaft to become production ready.
> Confluent
> > > runs
> > > > > > > > > thousands of
> > > > > > > > > >> >> KRaft clusters without quorum reconfiguration, and
> has
> > > for
> > > > > > years.
> > > > > > > > > While
> > > > > > > > > >> >> dynamic quorum reconfiguration is a nice feature, it
> > > > doesn't
> > > > > > > block
> > > > > > > > > >> >> anything: not migration, not deployment. As best as I
> > > > > > understand
> > > > > > > > it,
> > > > > > > > > the
> > > > > > > > > >> >> use-case Aiven has isn't even reconfiguration per se,
> > > just
> > > > > > > wiping a
> > > > > > > > > >> disk.
> > > > > > > > > >> >> There are ways to handle this -- I discussed some
> > earlier
> > > > in
> > > > > > the
> > > > > > > > > >> thread. I
> > > > > > > > > >> >> think it would be productive to continue that
> > discussion
> > > --
> > > > > > > > > especially
> > > > > > > > > >> the
> > > > > > > > > >> >> part around documentation and testing of these cases.
> > > > > > > > > >> >> >>
> > > > > > > > > >> >> >> A lot of people have done a lot of work to get
> Kafka
> > > 4.0
> > > > > > > ready.
> > > > > > > > I
> > > > > > > > > >> would
> > > > > > > > > >> >> not want to delay that because we want an additional
> > > > feature.
> > > > > > And
> > > > > > > > we
> > > > > > > > > >> will
> > > > > > > > > >> >> always want additional features. So I am concerned we
> > > will
> > > > > end
> > > > > > up
> > > > > > > > in
> > > > > > > > > an
> > > > > > > > > >> >> infinite loop of people asking for "just one more
> > > feature"
> > > > > > before
> > > > > > > > > they
> > > > > > > > > >> >> migrate.
> > > > > > > > > >> >> >>
> > > > > > > > > >> >> >> best,
> > > > > > > > > >> >> >> Colin
> > > > > > > > > >> >> >>
> > > > > > > > > >> >> >>
> > > > > > > > > >> >> >> On Mon, Nov 20, 2023, at 04:15, Josep Prat wrote:
> > > > > > > > > >> >> >> > Hi all,
> > > > > > > > > >> >> >> >
> > > > > > > > > >> >> >> > I wanted to share my opinion regarding this
> > topic. I
> > > > > know
> > > > > > > some
> > > > > > > > > >> >> >> > discussions happened some time ago (over a year)
> > > but I
> > > > > > > believe
> > > > > > > > > it's
> > > > > > > > > >> >> >> > wise to reflect and re-evaluate if those
> decisions
> > > are
> > > > > > still
> > > > > > > > > valid.
> > > > > > > > > > >> >> >> > KRaft, as of Kafka 3.6.x and 3.7.x, does not yet have
> > > > feature
> > > > > > > parity
> > > > > > > > > with
> > > > > > > > > >> >> >> > Zookeeper. By dropping Zookeeper altogether
> before
> > > > > > achieving
> > > > > > > > > such
> > > > > > > > > >> >> >> > parity, we are opening the door to leaving a
> chunk
> > > of
> > > > > > Apache
> > > > > > > > > Kafka
> > > > > > > > > >> >> >> > users without an easy way to upgrade to 4.0.
> > > > > > > > > > >> >> >> > In favor of making upgrades as smooth as
> possible, I
> > > > > propose
> > > > > > > to
> > > > > > > > > have
> > > > > > > > > >> a
> > > > > > > > > >> >> >> > Kafka version where KIP-853 is merged and
> > Zookeeper
> > > > > still
> > > > > > is
> > > > > > > > > >> >> supported.
> > > > > > > > > >> >> >> > This will enable community members who can't
> > migrate
> > > > yet
> > > > > > to
> > > > > > > > > KRaft
> > > > > > > > > >> to
> > > > > > > > > >> >> do
> > > > > > > > > > >> >> >> > so in a safe way (rolling back if something goes
> > > > wrong).
> > > > > > > > > >> >> Additionally,
> > > > > > > > > >> >> >> > this will give us more confidence on having
> KRaft
> > > > > > replacing
> > > > > > > > > >> >> >> > successfully Zookeeper without any big problems
> by
> > > > > > > discovering
> > > > > > > > > and
> > > > > > > > > >> >> >> > fixing bugs or by confirming that KRaft works as
> > > > > expected.
> > > > > > > > > >> >> >> > For this I strongly believe we should have a
> 3.8.x
> > > > > version
> > > > > > > > > before
> > > > > > > > > >> >> 4.0.x.
> > > > > > > > > >> >> >> >
> > > > > > > > > > >> >> >> > What do others think in this regard?
> > > > > > > > > >> >> >> >
> > > > > > > > > >> >> >> > Best,
> > > > > > > > > >> >> >> >
> > > > > > > > > >> >> >> > On 2023/11/14 20:47:10 Colin McCabe wrote:
> > > > > > > > > >> >> >> >> On Tue, Nov 14, 2023, at 04:37, Anton Agestam
> > > wrote:
> > > > > > > > > >> >> >> >> > Hi Colin,
> > > > > > > > > >> >> >> >> >
> > > > > > > > > >> >> >> >> > Thank you for your thoughtful and
> comprehensive
> > > > > > response.
> > > > > > > > > >> >> >> >> >
> > > > > > > > > >> >> >> >> >> KIP-853 is not a blocker for either 3.7 or
> > 4.0.
> > > We
> > > > > > > > discussed
> > > > > > > > > >> this
> > > > > > > > > >> >> in
> > > > > > > > > >> >> >> >> >> several KIPs that happened this year and
> last
> > > > year.
> > > > > > The
> > > > > > > > most
> > > > > > > > > >> >> notable was
> > > > > > > > > >> >> >> >> >> probably KIP-866, which was approved in May
> > > 2022.
> > > > > > > > > >> >> >> >> >
> > > > > > > > > >> >> >> >> > I understand this is the case, I'm raising my
> > > > concern
> > > > > > > > > because I
> > > > > > > > > >> was
> > > > > > > > > >> >> >> >> > foreseeing some major pain points as a
> > > consequence
> > > > of
> > > > > > > this
> > > > > > > > > >> >> decision. Just
> > > > > > > > > >> >> >> >> > to make it clear though: I am not asking for
> > > anyone
> > > > > to
> > > > > > do
> > > > > > > > > work
> > > > > > > > > >> for
> > > > > > > > > >> >> me, and
> > > > > > > > > >> >> >> >> > I understand the limitations of resources
> > > available
> > > > > to
> > > > > > > > > implement
> > > > > > > > > >> >> features.
> > > > > > > > > >> >> >> >> > What I was asking is rather to consider the
> > > > > > implications
> > > > > > > of
> > > > > > > > > >> >> _removing_
> > > > > > > > > >> >> >> >> > features before there exists a replacement
> for
> > > > them.
> > > > > > > > > >> >> >> >> >
> > > > > > > > > >> >> >> >> > I understand that the timeframe for 3.7 isn't
> > > > > feasible,
> > > > > > > and
> > > > > > > > > >> >> because of that
> > > > > > > > > >> >> >> >> > I think what I was asking is rather: can we
> > make
> > > > sure
> > > > > > > that
> > > > > > > > > there
> > > > > > > > > >> >> are more
> > > > > > > > > >> >> >> >> > 3.x releases until controller quorum online
> > > > resizing
> > > > > is
> > > > > > > > > >> >> implemented?
> > > > > > > > > >> >> >> >> >
> > > > > > > > > >> >> >> >> > From your response, I gather that your stance
> > is
> > > > that
> > > > > > > it's
> > > > > > > > > >> >> important to
> > > > > > > > > >> >> >> >> > drop ZK support sooner rather than later and
> > that
> > > > the
> > > > > > > > > necessary
> > > > > > > > > >> >> pieces for
> > > > > > > > > >> >> >> >> > doing so are already in place.
> > > > > > > > > >> >> >> >>
> > > > > > > > > >> >> >> >> Hi Anton,
> > > > > > > > > >> >> >> >>
> > > > > > > > > >> >> >> >> Yes. I'm basically just repeating what we
> agreed
> > > upon
> > > > > in
> > > > > > > 2022
> > > > > > > > > as
> > > > > > > > > >> >> part of KIP-833.
> > > > > > > > > >> >> >> >>
> > > > > > > > > >> >> >> >> >
> > > > > > > > > >> >> >> >> > ---
> > > > > > > > > >> >> >> >> >
> > > > > > > > > >> >> >> >> > I want to make sure I've understood your
> > > suggested
> > > > > > > sequence
> > > > > > > > > for
> > > > > > > > > >> >> controller
> > > > > > > > > >> >> >> >> > node replacement. I hope the mentions of
> > > Kubernetes
> > > > > are
> > > > > > > > > rather
> > > > > > > > > >> for
> > > > > > > > > >> >> examples
> > > > > > > > > >> >> >> >> > of how to carry things out, rather than
> saying
> > > > "this
> > > > > is
> > > > > > > > only
> > > > > > > > > >> >> supported on
> > > > > > > > > >> >> >> >> > Kubernetes"?
> > > > > > > > > >> >> >> >>
> > > > > > > > > >> >> >> >> Apache Kafka is supported in lots of
> > environments,
> > > > > > > including
> > > > > > > > > >> non-k8s
> > > > > > > > > >> >> ones. I was just pointing out that using k8s means
> that
> > > you
> > > > > > > control
> > > > > > > > > your
> > > > > > > > > >> >> own DNS resolution, which simplifies matters. If you
> > > don't
> > > > > > > control
> > > > > > > > > DNS
> > > > > > > > > >> >> there are some extra steps for changing the quorum
> > > voters.
> > > > > > > > > >> >> >> >>
> > > > > > > > > >> >> >> >> >
> > > > > > > > > >> >> >> >> > Given we have three existing nodes as such:
> > > > > > > > > >> >> >> >> >
> > > > > > > > > >> >> >> >> > - a.local -> 192.168.0.100
> > > > > > > > > >> >> >> >> > - b.local -> 192.168.0.101
> > > > > > > > > >> >> >> >> > - c.local -> 192.168.0.102
> > > > > > > > > >> >> >> >> >
> > > > > > > > > >> >> >> >> > As well as a candidate node 192.168.0.103
> that
> > we
> > > > > want
> > > > > > to
> > > > > > > > > >> replace
> > > > > > > > > >> >> for the
> > > > > > > > > >> >> >> >> > role of c.local.
> > > > > > > > > >> >> >> >> >
> > > > > > > > > >> >> >> >> > 1. Shut down controller process on node .102
> > (to
> > > > make
> > > > > > > sure
> > > > > > > > we
> > > > > > > > > >> >> don't "go
> > > > > > > > > >> >> >> >> > back in time").
> > > > > > > > > >> >> >> >> > 2. rsync state from leader to .103.
> > > > > > > > > >> >> >> >> > 3. Start controller process on .103.
> > > > > > > > > >> >> >> >> > 4. Point the c.local entry at .103.
> > > > > > > > > >> >> >> >> >
> > > > > > > > > >> >> >> >> > I have a few questions about this sequence:
> > > > > > > > > >> >> >> >> >
> > > > > > > > > >> >> >> >> > 1. Would this sequence be safe against
> > leadership
> > > > > > > changes?
> > > > > > > > > >> >> >> >> >
> > > > > > > > > >> >> >> >>
> > > > > > > > > >> >> >> >> If the leader changes, the new leader should
> have
> > > all
> > > > > of
> > > > > > > the
> > > > > > > > > >> >> committed entries that the old leader had.
> > > > > > > > > >> >> >> >>
> > > > > > > > > >> >> >> >> > 2. Does it work
> > > > > > > > > >> >> >> >>
> > > > > > > > > >> >> >> >> Probably the biggest issue is dealing with
> "torn
> > > > > writes"
> > > > > > > that
> > > > > > > > > >> happen
> > > > > > > > > >> >> because you're copying the current log segment while
> > it's
> > > > > being
> > > > > > > > > written
> > > > > > > > > >> to.
> > > > > > > > > >> >> The system should be robust against this. However, we
> > > don't
> > > > > > > > > regularly do
> > > > > > > > > >> >> this, so there hasn't been a lot of testing.
> > > > > > > > > >> >> >> >>
> > > > > > > > > >> >> >> >> I think Jose had a PR for improving the
> handling
> > of
> > > > > this
> > > > > > > > which
> > > > > > > > > we
> > > > > > > > > >> >> might want to dig up. We'd want the system to
> > > auto-truncate
> > > > > the
> > > > > > > > > partial
> > > > > > > > > >> >> record at the end of the log, if there is one.
> > > > > > > > > >> >> >> >>
> > > > > > > > > >> >> >> >> > 3. By "state", do we mean `metadata.log.dir`?
> > > > > Something
> > > > > > > > else?
> > > > > > > > > >> >> >> >>
> > > > > > > > > >> >> >> >> Yes, the state of the metadata.log.dir. Keep in
> > > mind
> > > > > you
> > > > > > > will
> > > > > > > > > need
> > > > > > > > > >> >> to change the node ID in meta.properties after
> copying,
> > > of
> > > > > > > course.
> > > > > > > > > >> >> >> >>
> > > > > > > > > >> >> >> >> > 4. What are the effects on cluster
> > availability?
> > > (I
> > > > > > think
> > > > > > > > > this
> > > > > > > > > >> is
> > > > > > > > > >> >> the same
> > > > > > > > > >> >> >> >> > as asking what happens if a or b crashes
> during
> > > the
> > > > > > > > process,
> > > > > > > > > or
> > > > > > > > > >> if
> > > > > > > > > >> >> network
> > > > > > > > > >> >> >> >> > partitions occur).
> > > > > > > > > >> >> >> >>
> > > > > > > > > >> >> >> >> Cluster metadata state tends to be pretty
> small,
> > > > > > typically
> > > > > > > a
> > > > > > > > > >> hundred
> > > > > > > > > >> >> megabytes or so. Therefore, I do not think it will
> take
> > > > more
> > > > > > > than a
> > > > > > > > > >> second
> > > > > > > > > >> >> or two to copy from one node to another. However, if
> > you
> > > do
> > > > > > > > > experience a
> > > > > > > > > >> >> crash when one node out of three is down, then you
> will
> > > be
> > > > > > > > > unavailable
> > > > > > > > > >> >> until you can bring up a second node to regain a
> > > majority.
> > > > > > > > > >> >> >> >>
> > > > > > > > > >> >> >> >> >
> > > > > > > > > >> >> >> >> > ---
> > > > > > > > > >> >> >> >> >
> > > > > > > > > >> >> >> >> > If this is considered the official way of
> > > handling
> > > > > > > > controller
> > > > > > > > > >> node
> > > > > > > > > >> >> >> >> > replacements, does it make sense to improve
> > > > > > documentation
> > > > > > > > in
> > > > > > > > > >> this
> > > > > > > > > >> >> area? Is
> > > > > > > > > >> >> >> >> > there already a plan for this documentation
> > laid
> > > > out
> > > > > > in
> > > > > > > > some
> > > > > > > > > >> >> KIPs? This is
> > > > > > > > > >> >> >> >> > something I'd be happy to contribute to.
> > > > > > > > > >> >> >> >> >
> > > > > > > > > >> >> >> >>
> > > > > > > > > >> >> >> >> Yes, I think we should have official
> > documentation
> > > > > about
> > > > > > > > this.
> > > > > > > > > >> We'd
> > > > > > > > > >> >> be happy to review anything in that area.
> > > > > > > > > >> >> >> >>
> > > > > > > > > >> >> >> >> >> To circle back to KIP-853, I think it
> stands a
> > > > good
> > > > > > > chance
> > > > > > > > > of
> > > > > > > > > >> >> making it
> > > > > > > > > >> >> >> >> >> into AK 4.0.
> > > > > > > > > >> >> >> >> >
> > > > > > > > > >> >> >> >> > This sounds good, but the point I was making
> > was
> > > if
> > > > > we
> > > > > > > > could
> > > > > > > > > >> have
> > > > > > > > > >> >> a release
> > > > > > > > > >> >> >> >> > with both KRaft and ZK supporting this
> feature
> > to
> > > > > ease
> > > > > > > the
> > > > > > > > > >> >> migration out of
> > > > > > > > > >> >> >> >> > ZK.
> > > > > > > > > >> >> >> >> >
> > > > > > > > > >> >> >> >>
> > > > > > > > > >> >> >> >> The problem is, supporting multiple controller
> > > > > > > > implementations
> > > > > > > > > is
> > > > > > > > > >> a
> > > > > > > > > >> >> huge burden. So we don't want to extend the 3.x
> release
> > > > past
> > > > > > the
> > > > > > > > > point
> > > > > > > > > >> >> that's needed to complete all the must-dos (SCRAM,
> > > > delegation
> > > > > > > > tokens,
> > > > > > > > > >> JBOD)
> > > > > > > > > >> >> >> >>
> > > > > > > > > >> >> >> >> best,
> > > > > > > > > >> >> >> >> Colin
> > > > > > > > > >> >> >> >>
> > > > > > > > > >> >> >> >>
> > > > > > > > > >> >> >> >> > BR,
> > > > > > > > > >> >> >> >> > Anton
> > > > > > > > > >> >> >> >> >
> > > > > > > > > >> >> >> >> > Den tors 9 nov. 2023 kl 23:04 skrev Colin
> > McCabe
> > > <
> > > > > > > > > >> >> cmcc...@apache.org>:
> > > > > > > > > >> >> >> >> >
> > > > > > > > > >> >> >> >> >> Hi Anton,
> > > > > > > > > >> >> >> >> >>
> > > > > > > > > >> >> >> >> >> It rarely makes sense to scale up and down
> the
> > > > > number
> > > > > > of
> > > > > > > > > >> >> controller nodes
> > > > > > > > > >> >> >> >> >> in the cluster. Only one controller node
> will
> > be
> > > > > > active
> > > > > > > at
> > > > > > > > > any
> > > > > > > > > >> >> given time.
> > > > > > > > > >> >> >> >> >> The main reason to use 5 nodes would be to
> be
> > > able
> > > > > to
> > > > > > > > > tolerate
> > > > > > > > > >> 2
> > > > > > > > > >> >> failures
> > > > > > > > > >> >> >> >> >> instead of 1.
> > > > > > > > > >> >> >> >> >>
> > > > > > > > > >> >> >> >> >> At Confluent, we generally run KRaft with 3
> > > > > > controllers.
> > > > > > > > We
> > > > > > > > > >> have
> > > > > > > > > >> >> not seen
> > > > > > > > > >> >> >> >> >> problems with this setup, even with
> thousands
> > of
> > > > > > > clusters.
> > > > > > > > > We
> > > > > > > > > >> have
> > > > > > > > > >> >> >> >> >> discussed using 5 node controller clusters
> on
> > > > > certain
> > > > > > > very
> > > > > > > > > big
> > > > > > > > > >> >> clusters,
> > > > > > > > > >> >> >> >> >> but we haven't done that yet. This is all
> very
> > > > > similar
> > > > > > > to
> > > > > > > > > ZK,
> > > > > > > > > >> >> where most
> > > > > > > > > >> >> >> >> >> deployments were 3 nodes as well.
> > > > > > > > > >> >> >> >> >>
> > > > > > > > > >> >> >> >> >> KIP-853 is not a blocker for either 3.7 or
> > 4.0.
> > > We
> > > > > > > > discussed
> > > > > > > > > >> this
> > > > > > > > > >> >> in
> > > > > > > > > >> >> >> >> >> several KIPs that happened this year and
> last
> > > > year.
> > > > > > The
> > > > > > > > most
> > > > > > > > > >> >> notable was
> > > > > > > > > >> >> >> >> >> probably KIP-866, which was approved in May
> > > 2022.
> > > > > > > > > >> >> >> >> >>
> > > > > > > > > >> >> >> >> >> Many users these days run in a Kubernetes
> > > > > environment
> > > > > > > > where
> > > > > > > > > >> >> Kubernetes
> > > > > > > > > >> >> >> >> >> actually controls the DNS. This makes
> changing
> > > the
> > > > > set
> > > > > > > of
> > > > > > > > > >> voters
> > > > > > > > > >> >> less
> > > > > > > > > >> >> >> >> >> important than it was historically.
> > > > > > > > > >> >> >> >> >>
> > > > > > > > > >> >> >> >> >> For example, in a world with static DNS, you
> > > might
> > > > > > have
> > > > > > > to
> > > > > > > > > >> change
> > > > > > > > > >> >> the
> > > > > > > > > >> >> >> >> >> controller.quorum.voters setting from:
> > > > > > > > > >> >> >> >> >>
> > > > > > > > > > >> >> >> >> >> 100@a.local:9073,101@b.local:9073,102@c.local:9073
> > > > > > > > > >> >> >> >> >>
> > > > > > > > > >> >> >> >> >> to:
> > > > > > > > > >> >> >> >> >>
> > > > > > > > > > >> >> >> >> >> 100@a.local:9073,101@b.local:9073,102@d.local:9073
> > > > > > > > > >> >> >> >> >>
> > > > > > > > > >> >> >> >> >> In a world with k8s controlling the DNS, you
> > > > simply
> > > > > > > remap
> > > > > > > > > >> c.local
> > > > > > > > > >> >> to point
> > > > > > > > > > >> >> >> >> >> to the IP address of your new pod for
> > controller
> > > > > 102,
> > > > > > > and
> > > > > > > > > >> you're
> > > > > > > > > >> >> done. No
> > > > > > > > > >> >> >> >> >> need to update controller.quorum.voters.
> > > > > > > > > >> >> >> >> >>
> > > > > > > > > >> >> >> >> >> Another question is whether you re-create
> the
> > > pod
> > > > > data
> > > > > > > > from
> > > > > > > > > >> >> scratch every
> > > > > > > > > >> >> >> >> >> time you add a new node. If you store the
> > > > controller
> > > > > > > data
> > > > > > > > > on an
> > > > > > > > > >> >> EBS volume
> > > > > > > > > >> >> >> >> >> (or cloud-specific equivalent), you really
> > only
> > > > have
> > > > > > to
> > > > > > > > > detach
> > > > > > > > > >> it
> > > > > > > > > >> >> from the
> > > > > > > > > >> >> >> >> >> previous pod and re-attach it to the new
> pod.
> > > k8s
> > > > > also
> > > > > > > > > handles
> > > > > > > > > >> >> this
> > > > > > > > > >> >> >> >> >> automatically, of course.
> > > > > > > > > >> >> >> >> >>
> > > > > > > > > >> >> >> >> >> If you want to reconstruct the full
> controller
> > > pod
> > > > > > state
> > > > > > > > > each
> > > > > > > > > >> >> time you
> > > > > > > > > >> >> >> >> >> create a new pod (for example, so that you
> can
> > > use
> > > > > > only
> > > > > > > > > >> instance
> > > > > > > > > >> >> storage),
> > > > > > > > > >> >> >> >> >> you should be able to rsync that state from
> > the
> > > > > > leader.
> > > > > > > In
> > > > > > > > > >> >> general, the
> > > > > > > > > >> >> >> >> >> invariant that we want to maintain is that
> the
> > > > state
> > > > > > > > should
> > > > > > > > > not
> > > > > > > > > >> >> "go back in
> > > > > > > > > >> >> >> >> >> time" -- if controller 102 promised to hold
> > all
> > > > log
> > > > > > data
> > > > > > > > up
> > > > > > > > > to
> > > > > > > > > >> >> offset X, it
> > > > > > > > > > >> >> >> >> >> should come back with committed data at
> > least
> > > > > that
> > > > > > > > > offset.
> > > > > > > > > >> >> >> >> >>
> > > > > > > > > >> >> >> >> >> There are lots of new features we'd like to
> > > > > implement
> > > > > > > for
> > > > > > > > > >> KRaft,
> > > > > > > > > >> >> and Kafka
> > > > > > > > > >> >> >> >> >> in general. If you have some you really
> would
> > > like
> > > > > to
> > > > > > > > see, I
> > > > > > > > > >> >> think everyone
> > > > > > > > > >> >> >> >> >> in the community would be happy to work with
> > > you.
> > > > > The
> > > > > > > flip
> > > > > > > > > >> side,
> > > > > > > > > >> >> of course,
> > > > > > > > > >> >> >> >> >> is that since there are an unlimited number
> of
> > > > > > features
> > > > > > > we
> > > > > > > > > >> could
> > > > > > > > > >> >> do, we
> > > > > > > > > >> >> >> >> >> can't really block the release for any one
> > > > feature.
> > > > > > > > > >> >> >> >> >>
> > > > > > > > > >> >> >> >> >> To circle back to KIP-853, I think it
> stands a
> > > > good
> > > > > > > chance
> > > > > > > > > of
> > > > > > > > > >> >> making it
> > > > > > > > > >> >> >> >> >> into AK 4.0. Jose, Alyssa, and some other
> > people
> > > > > have
> > > > > > > > > worked on
> > > > > > > > > >> >> it. It
> > > > > > > > > >> >> >> >> >> definitely won't make it into 3.7, since we
> > have
> > > > > only
> > > > > > a
> > > > > > > > few
> > > > > > > > > >> weeks
> > > > > > > > > >> >> left
> > > > > > > > > >> >> >> >> >> before that release happens.
> > > > > > > > > >> >> >> >> >>
> > > > > > > > > >> >> >> >> >> best,
> > > > > > > > > >> >> >> >> >> Colin
> > > > > > > > > >> >> >> >> >>
> > > > > > > > > >> >> >> >> >>
> > > > > > > > > >> >> >> >> >> On Thu, Nov 9, 2023, at 00:20, Anton Agestam
> > > > wrote:
> > > > > > > > > >> >> >> >> >> > Hi Luke,
> > > > > > > > > >> >> >> >> >> >
> > > > > > > > > >> >> >> >> >> > We have been looking into what switching
> > from
> > > ZK
> > > > > to
> > > > > > > > KRaft
> > > > > > > > > >> will
> > > > > > > > > >> >> mean for
> > > > > > > > > >> >> >> >> >> > Aiven.
> > > > > > > > > >> >> >> >> >> >
> > > > > > > > > >> >> >> >> >> > We heavily depend on an “immutable
> > > > infrastructure”
> > > > > > > model
> > > > > > > > > for
> > > > > > > > > >> >> deployments.
> > > > > > > > > >> >> >> >> >> > This means that, when we perform upgrades,
> > we
> > > > > > > introduce
> > > > > > > > > new
> > > > > > > > > >> >> nodes to our
> > > > > > > > > >> >> >> >> >> > clusters, scale the cluster up to
> > incorporate
> > > > the
> > > > > > new
> > > > > > > > > nodes,
> > > > > > > > > >> >> and then
> > > > > > > > > >> >> >> >> >> phase
> > > > > > > > > >> >> >> >> >> > the old ones out once all partitions are
> > moved
> > > > to
> > > > > > the
> > > > > > > > new
> > > > > > > > > >> >> generation.
> > > > > > > > > >> >> >> >> >> This
> > > > > > > > > >> >> >> >> >> > allows us, and anyone else using a similar
> > > > model,
> > > > > to
> > > > > > > do
> > > > > > > > > >> >> upgrades as well
> > > > > > > > > >> >> >> >> >> as
> > > > > > > > > >> >> >> >> >> > cluster resizing with zero downtime.
> > > > > > > > > >> >> >> >> >> >
> > > > > > > > > >> >> >> >> >> > Reading up on KRaft and the ZK-to-KRaft
> > > > migration
> > > > > > > path,
> > > > > > > > > this
> > > > > > > > > >> is
> > > > > > > > > >> >> somewhat
> > > > > > > > > >> >> >> >> >> > worrying for us. It seems like, if KIP-853
> > is
> > > > not
> > > > > > > > included
> > > > > > > > > >> >> prior to
> > > > > > > > > >> >> >> >> >> > dropping support for ZK, we will
> essentially
> > > > have
> > > > > no
> > > > > > > > > >> satisfying
> > > > > > > > > >> >> upgrade
> > > > > > > > > >> >> >> >> >> > path. Even if KIP-853 is included in 4.0,
> > I’m
> > > > > unsure
> > > > > > > if
> > > > > > > > > that
> > > > > > > > > >> >> would allow
> > > > > > > > > >> >> >> >> >> a
> > > > > > > > > >> >> >> >> >> > migration path for us, since a new cluster
> > > > > > generation
> > > > > > > > > would
> > > > > > > > > >> not
> > > > > > > > > >> >> be able
> > > > > > > > > >> >> >> >> >> to
> > > > > > > > > >> >> >> >> >> > use ZK during the migration step.
> > > > > > > > > >> >> >> >> >> > On the other hand, if KIP-853 was released
> > in
> > > a
> > > > > > > version
> > > > > > > > > prior
> > > > > > > > > >> >> to dropping
> > > > > > > > > >> >> >> >> >> > ZK support, because it allows online
> > resizing
> > > of
> > > > > > KRaft
> > > > > > > > > >> >> clusters, this
> > > > > > > > > >> >> >> >> >> would
> > > > > > > > > >> >> >> >> >> > allow us and others that use an immutable
> > > > > > > infrastructure
> > > > > > > > > >> >> deployment
> > > > > > > > > >> >> >> >> >> model,
> > > > > > > > > >> >> >> >> >> > to provide a zero downtime migration path.
> > > > > > > > > >> >> >> >> >> >
> > > > > > > > > >> >> >> >> >> > For that reason, we’d like to raise
> > awareness
> > > > > around
> > > > > > > > this
> > > > > > > > > >> issue
> > > > > > > > > >> >> and
> > > > > > > > > >> >> >> >> >> > encourage considering the implementation
> of
> > > > > KIP-853
> > > > > > or
> > > > > > > > > >> >> equivalent a
> > > > > > > > > >> >> >> >> >> blocker
> > > > > > > > > >> >> >> >> >> > not only for 4.0, but for the last version
> > > prior
> > > > > to
> > > > > > > 4.0.
> > > > > > > > > >> >> >> >> >> >
> > > > > > > > > >> >> >> >> >> > BR,
> > > > > > > > > >> >> >> >> >> > Anton
> > > > > > > > > >> >> >> >> >> >
> > > > > > > > > >> >> >> >> >> > On 2023/10/11 12:17:23 Luke Chen wrote:
> > > > > > > > > >> >> >> >> >> >> Hi all,
> > > > > > > > > >> >> >> >> >> >>
> > > > > > > > > >> >> >> >> >> >> While Kafka 3.6.0 is released, I’d like
> to
> > > > start
> > > > > > the
> > > > > > > > > >> >> discussion for the
> > > > > > > > > >> >> >> >> >> >> “road to Kafka 4.0”. Based on the plan in
> > > > KIP-833
> > > > > > > > > >> >> >> >> >> >> <
> > > > > > > > > >> >> >> >> >> >
> > > > > > > > > >> >> >> >> >>
> > > > > > > > > >> >>
> > > > > > > > > >>
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-833%3A+Mark+KRaft+as+Production+Ready#KIP833:MarkKRaftasProductionReady-Kafka3.7
> > > > > > > > > >> >> >> >> >> >>,
> > > > > > > > > >> >> >> >> >> >> the next release 3.7 will be the final
> > > release
> > > > > > before
> > > > > > > > > moving
> > > > > > > > > >> >> to Kafka
> > > > > > > > > >> >> >> >> >> 4.0
> > > > > > > > > >> >> >> >> >> >> to remove the Zookeeper from Kafka.
> Before
> > > > making
> > > > > > > this
> > > > > > > > > major
> > > > > > > > > >> >> change, I'd
> > > > > > > > > >> >> >> >> >> >> like to get consensus on the "must-have
> > > > > > > features/fixes
> > > > > > > > > for
> > > > > > > > > >> >> Kafka 4.0",
> > > > > > > > > >> >> >> >> >> to
> > > > > > > > > >> >> >> >> >> >> avoid some users being surprised when
> > > upgrading
> > > > > to
> > > > > > > > Kafka
> > > > > > > > > >> 4.0.
> > > > > > > > > >> >> The intent
> > > > > > > > > >> >> >> >> >> > is
> > > > > > > > > >> >> >> >> >> >> to have a clear communication about what
> to
> > > > > expect
> > > > > > in
> > > > > > > > the
> > > > > > > > > >> >> following
> > > > > > > > > >> >> >> >> >> > months.
> > > > > > > > > >> >> >> >> >> >> In particular we should be signaling what
> > > > > features
> > > > > > > and
> > > > > > > > > >> >> configurations
> > > > > > > > > >> >> >> >> >> are
> > > > > > > > > >> >> >> >> >> >> not supported, or at risk (if no one is
> > able
> > > to
> > > > > add
> > > > > > > > > support
> > > > > > > > > >> or
> > > > > > > > > >> >> fix known
> > > > > > > > > >> >> >> >> >> >> bugs).
> > > > > > > > > >> >> >> >> >> >>
> > > > > > > > > >> >> >> >> >> >> Here is the JIRA tickets list
> > > > > > > > > >> >> >> >> >> >> <
> > > > > > > > > >> >>
> > > > > > > > >
> > > > > >
> > > https://issues.apache.org/jira/issues/?jql=labels%20%3D%204.0-blocker>
> > > > > > > > > >> >> >> >> >> I
> > > > > > > > > >> >> >> >> >> >> labeled for "4.0-blocker". The criteria I
> > > > labeled
> > > > > > as
> > > > > > > > > >> >> “4.0-blocker” are:
> > > > > > > > > >> >> >> >> >> >> 1. The feature is supported in Zookeeper
> > > Mode,
> > > > > but
> > > > > > > not
> > > > > > > > > >> >> supported in
> > > > > > > > > >> >> >> >> >> KRaft
> > > > > > > > > >> >> >> >> >> >> mode, yet (ex: KIP-858: JBOD in KRaft)
> > > > > > > > > >> >> >> >> >> >> 2. Critical bugs in KRaft, (ex:
> > KAFKA-15489 :
> > > > > split
> > > > > > > > > brain in
> > > > > > > > > >> >> KRaft
> > > > > > > > > >> >> >> >> >> >> controller quorum)
> > > > > > > > > >> >> >> >> >> >>
> > > > > > > > > >> >> >> >> >> >> If you disagree with my current list,
> > welcome
> > > > to
> > > > > > have
> > > > > > > > > >> >> discussion in the
> > > > > > > > > >> >> >> >> >> >> specific JIRA ticket. Or, if you think
> > there
> > > > are
> > > > > > some
> > > > > > > > > >> tickets
> > > > > > > > > >> >> I missed,
> > > > > > > > > >> >> >> >> >> >> welcome to start a discussion in the JIRA
> > > > ticket
> > > > > > and
> > > > > > > > > ping me
> > > > > > > > > >> >> or other
> > > > > > > > > >> >> >> >> >> >> people. After we get the consensus, we
> can
> > > > > > > > label/unlabel
> > > > > > > > > it
> > > > > > > > > >> >> afterwards.
> > > > > > > > > >> >> >> >> >> >> Again, the goal is to have an open
> > > > communication
> > > > > > with
> > > > > > > > the
> > > > > > > > > >> >> community
> > > > > > > > > >> >> >> >> >> about
> > > > > > > > > >> >> >> >> >> >> what will be coming in 4.0.
> > > > > > > > > >> >> >> >> >> >>
> > > > > > > > > >> >> >> >> >> >> Below is the high level category of the
> > list
> > > > > > content:
> > > > > > > > > >> >> >> >> >> >>
> > > > > > > > > >> >> >> >> >> >> 1. Recovery from disk failure
> > > > > > > > > >> >> >> >> >> >> KIP-856
> > > > > > > > > >> >> >> >> >> >> <
> > > > > > > > > >> >> >> >> >> >
> > > > > > > > > >> >> >> >> >>
> > > > > > > > > >> >>
> > > > > > > > > >>
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-856:+KRaft+Disk+Failure+Recovery
> > > > > > > > > >> >> >> >> >> >>:
> > > > > > > > > >> >> >> >> >> >> KRaft Disk Failure Recovery
> > > > > > > > > >> >> >> >> >> >>
> > > > > > > > > >> >> >> >> >> >> 2. Prevote to support controllers more
> > than 3
> > > > > > > > > >> >> >> >> >> >> KIP-650
> > > > > > > > > >> >> >> >> >> >> <
> > > > > > > > > >> >> >> >> >> >
> > > > > > > > > >> >> >> >> >>
> > > > > > > > > >> >>
> > > > > > > > > >>
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-650%3A+Enhance+Kafkaesque+Raft+semantics
> > > > > > > > > >> >> >> >> >> >>:
> > > > > > > > > >> >> >> >> >> >> Enhance Kafkaesque Raft semantics
> > > > > > > > > >> >> >> >> >> >>
> > > > > > > > > >> >> >> >> >> >> 3. JBOD support
> > > > > > > > > >> >> >> >> >> >> KIP-858
> > > > > > > > > >> >> >> >> >> >> <
> > > > > > > > > >> >> >> >> >> >
> > > > > > > > > >> >> >> >> >>
> > > > > > > > > >> >>
> > > > > > > > > >>
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-858%3A+Handle+JBOD+broker+disk+failure+in+KRaft
> > > > > > > > > >> >> >> >> >> >>:
> > > > > > > > > >> >> >> >> >> >> Handle
> > > > > > > > > >> >> >> >> >> >> JBOD broker disk failure in KRaft
> > > > > > > > > >> >> >> >> >> >>
> > > > > > > > > >> >> >> >> >> >> 4. Scale up/down Controllers
> > > > > > > > > >> >> >> >> >> >> KIP-853
> > > > > > > > > >> >> >> >> >> >> <
> > > > > > > > > >> >> >> >> >> >
> > > > > > > > > >> >> >> >> >>
> > > > > > > > > >> >>
> > > > > > > > > >>
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-853%3A+KRaft+Controller+Membership+Changes
> > > > > > > > > >> >> >> >> >> >>:
> > > > > > > > > >> >> >> >> >> >> KRaft Controller Membership Changes
> > > > > > > > > >> >> >> >> >> >>
> > > > > > > > > >> >> >> >> >> >> 5. Modifying dynamic configurations on
> the
> > > > KRaft
> > > > > > > > > controller
> > > > > > > > > >> >> >> >> >> >>
> > > > > > > > > >> >> >> >> >> >> 6. Critical bugs in KRaft
> > > > > > > > > >> >> >> >> >> >>
> > > > > > > > > >> >> >> >> >> >> Does this make sense?
> > > > > > > > > >> >> >> >> >> >> Any feedback is welcomed.
> > > > > > > > > >> >> >> >> >> >>
> > > > > > > > > >> >> >> >> >> >> Thank you.
> > > > > > > > > >> >> >> >> >> >> Luke
> > > > > > > > > >> >> >> >> >> >>
> > > > > > > > > >> >> >> >> >>
> > > > > > > > > >> >> >> >>
> > > > > > > > > >> >> >>
> > > > > > > > > >> >>
> > > > > > > > > >>
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > >
> > > > *Josep Prat*
> > > > Open Source Engineering Director, *Aiven*
> > > > josep.p...@aiven.io   |   +491715557497
> > > > aiven.io <https://www.aiven.io>   |   <
> > > https://www.facebook.com/aivencloud
> > > > >
> > > >   <https://www.linkedin.com/company/aiven/>   <
> > > > https://twitter.com/aiven_io>
> > > > *Aiven Deutschland GmbH*
> > > > Alexanderufer 3-7, 10117 Berlin
> > > > Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> > > > Amtsgericht Charlottenburg, HRB 209739 B
> > > >
> > >
> >
>
