Thanks, Jun.

I agree with the assessment.
I've added it to the document:
https://cwiki.apache.org/confluence/display/KAFKA/The+Path+Forward+for+Saving+Cross-AZ+Replication+Costs+KIPs

Thank you.
Luke



On Thu, Aug 7, 2025 at 1:06 AM Jun Rao <j...@confluent.io.invalid> wrote:

> Hi, Luke,
>
> Thanks for starting the discussion. I took a look at all three proposals
> and the following is my assessment.
>
> KIP-1150 (diskless):
> Pros:
> * has the most benefits to the users.
> -- most complete saving of cross-zone network cost (enabled by the
> leaderless design)
> -- better durability (by leveraging block storage)
> -- best scalability (by separating data from the metadata)
> * clean architecture (no unnatural intrusive changes to existing code base)
> Cons: large effort, but arguably this is what's needed to build a true
> cloud-native architecture
>
> KIP-1176 (tier active segment)
> Pros:
> * limited benefits to the users
> -- saving of cross-zone network cost (limited saving on the producer side)
> * small effort
> Cons:
> * the current availability story is weak
> * it's not clear if the effort is still small once details on correctness,
> cost, cleanness are figured out
>
> KIP-1183 (shared storage)
> Pros:
> * moderate benefits to the users
> -- saving of cross zone network cost (limited saving on the producer side
> and the consumer side)
> -- better durability (by leveraging block storage)
> -- improved scalability
> Cons:
> * weaker availability (no hot standby)
> * scalability not as good as KIP-1150
> * effort to build the plugin is too large
>
> Thanks,
>
> Jun
>
> On Wed, Aug 6, 2025 at 12:58 AM Luke Chen <show...@gmail.com> wrote:
>
> > Hi Josep,
> >
> > Thanks for the update.
> >
> > > Luke, thank you for being proactive and caring about this topic!
> > I believe many community users also care about this topic! :)
> >
> > Look forward to seeing the updated KIP!
> >
> >
> > Hi Stanislav,
> >
> > Yes, it'd be good for the community to decide which way we want to go.
> > Leaderless or leader-based is absolutely one of the decisions.
> > And yes, more than one KIP is also fine with me. It's just that we need a
> > way to move them forward.
> > Otherwise, when one of the KIPs is ready for voting, we can anticipate
> > requests to wait for the other two related KIPs.
> > Any good suggestions?
> >
> > Hi Xinyu,
> >
> > Thanks for the reply.
> > Look forward to seeing the updated KIP!
> >
> > > If the community plans to adopt a leaderless architecture, will the
> > > focus be on a complete transition to leaderless, or will both
> > > architectures coexist in the long term?
> >
> > I don't think we will abandon the leader-based design, as a lot of users
> > are still relying on it.
> > Besides, KIP-1150 also claims the existing leader-based protocol works as
> > usual.
> > So, I think they should coexist in the long term.
> >
> >
> > Thank you.
> > Luke
> >
> >
> > On Wed, Aug 6, 2025 at 10:13 AM Xinyu Zhou <yu...@apache.org> wrote:
> >
> > > Hi Luke,
> > >
> > > Thank you for creating this dedicated thread; we definitely need a
> > > space to discuss future steps for these topics. I apologize for my
> > > delay on KIP-1183 and will provide more details in the coming weeks.
> > >
> > > I agree with Stanislav that we should first focus on the community's
> > > direction. Specifically, should we consider introducing a leaderless
> > > architecture to Kafka, given that it currently relies on a partitioned,
> > > leader-based model?
> > >
> > > From my own perspective, I’m particularly interested in how Leaderless
> > > and Leader-based architectures differ when it comes to handling data
> > > locality, which directly affects batching and fetch efficiency, and in
> > > the way core features are implemented. For instance, ordering,
> > > compaction, transactions, idempotent producers, and queues all have to
> > > be realized on the Coordinator in a Leaderless design, whereas in a
> > > Leader-based design they are handled by the partition leader.
> > >
> > > If the community plans to adopt a leaderless architecture, will the
> > > focus be on a complete transition to leaderless, or will both
> > > architectures coexist in the long term?
> > >
> > > I welcome discussions on this topic and am eager to hear diverse
> > > opinions.
> > >
> > > Regards,
> > > Xinyu
> > >
> > > On Wed, Aug 6, 2025 at 3:05 AM Stanislav Kozlovski <
> > > stanislavkozlov...@apache.org> wrote:
> > >
> > > > Thank you, Luke, for this wonderful summary and for taking the
> > > > initiative.
> > > >
> > > > To me, it seems like a large differentiator between KIP-1150 and the
> > > > others is the leaderless design. The other two don’t allow for it.
> > > >
> > > > It sounds productive to focus the discussion on whether the
> > > > leaderless design is worth it on top of the replication cost savings.
> > > >
> > > > I’m of the opinion that it’s worth pursuing, both for the truly zero
> > > > network cost (no cross-AZ producer traffic) and, perhaps even more
> > > > importantly, for the zero-state architecture that promises to
> > > > significantly simplify operations, including auto-scaling brokers and
> > > > scaling throughput per partition.
> > > > It would be great if the folks at Aiven could address the concerns
> > > > regarding queue and transaction support. I’m not of the opinion that
> > > > these things need to ship with v1, but it would be wise to ensure
> > > > nothing in the architecture blocks these features from being shipped
> > > > in the future.
> > > > KIP-1176 is also very cool; addressing the acks=1 case will still be
> > > > necessary. I think it’s a necessary feature to implement, but I’d be
> > > > disappointed if that’s the only diskless solution the community
> > > > agrees on.
> > > >
> > > > A good path, if possible, may be to merge KIP-1150 and KIP-1176.
> > > >
> > > > If instead the community decides leaderless isn’t necessary, then
> > > > KIP-1183 seems like a good fit.
> > > >
> > > > That’s my opinion. Happy to hear if anyone disagrees.
> > > >
> > > > On 2025/08/05 14:30:45 Josep Prat wrote:
> > > > > Hi Luke and community!
> > > > >
> > > > > Luke, thank you for being proactive and caring about this topic!
> > > > >
> > > > > In the meantime we have been keeping ourselves busy pushing our
> > > > > implementation of KIP-1150 to production to validate our
> > > > > assumptions and confirm its strengths while discovering its
> > > > > weaknesses.
> > > > > Now, after gathering some experience running it, we are (as I'm
> > > > > writing this, gathered in the same room) working on an improved
> > > > > proposal for KIP-1150 that also addresses the concerns from the
> > > > > community.
> > > > > We expect to share the updated KIP in the next couple of weeks.
> > > > >
> > > > > We apologize for the recent period of silence and are committed to
> > > > > more regular communication as we move forward.
> > > > >
> > > > > Best,
> > > > >
> > > > >
> > > > > On Tue, Aug 5, 2025 at 10:31 AM Luke Chen <show...@gmail.com> wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > The Kafka community is currently seeing an unprecedented situation
> > > > > > with three KIPs (KIP-1150, KIP-1176, KIP-1183) simultaneously
> > > > > > addressing the same challenge of high replication costs when
> > > > > > running Kafka across multiple cloud availability zones. Each KIP
> > > > > > offers a different solution to this issue. While diversity of
> > > > > > innovative ideas is a key strength of open-source projects, it
> > > > > > creates a burden for reviewers and users who must compare and
> > > > > > comment on multiple proposals simultaneously. Furthermore,
> > > > > > discussion around the three KIPs has stalled for over two months
> > > > > > now. This could be because the authors are hesitant to proceed
> > > > > > given the existence of alternative, potentially conflicting,
> > > > > > solutions. Addressing replication cost is a key concern of Kafka’s
> > > > > > userbase and we should try to move the conversation forward if we
> > > > > > can.
> > > > > >
> > > > > > From what I understand, these three KIPs are not mutually
> > > > > > exclusive. But adopting all three KIPs in the community might not
> > > > > > be what we expect. Thus, I would like to *start a discussion on
> > > > > > how we could move the conversation forward*.
> > > > > >
> > > > > > To save time for the KIP readers/reviewers, I have created this
> > > > > > document [1] to help summarize each of the KIPs and describe
> > > > > > their current status. *Hope to get some suggestions/feedback from
> > > > > > the community*.
> > > > > >
> > > > > >
> > > > > > [1]
> > > > > > https://cwiki.apache.org/confluence/display/KAFKA/The+Path+Forward+for+Saving+Cross-AZ+Replication+Costs+KIPs
> > > > > >
> > > > > > KIP-1150:
> > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1150%3A+Diskless+Topics
> > > > > > KIP-1176:
> > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1176%3A+Tiered+Storage+for+Active+Log+Segment
> > > > > > KIP-1183:
> > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1183%3A+Unified+Shared+Storage
> > > > > >
> > > > > >
> > > > > > Thank you.
> > > > > > Luke
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > *Josep Prat*
> > > > > Sr. Engineering Director, Streaming Services, *Aiven*
> > > > > josep.p...@aiven.io   |   +491715557497
> > > > > aiven.io <https://www.aiven.io>
> > > > > *Aiven Deutschland GmbH*
> > > > > Alexanderufer 3-7, 10117 Berlin
> > > > > Geschäftsführer: Oskari Saarenmaa, Hannu Valtonen, Kenneth Chen
> > > > > Amtsgericht Charlottenburg, HRB 209739 B
> > > > >
> > > >
> > >
> >
>
