Thanks for the KIP.
I've reviewed the updated KIP and agree with the motivation behind
KIP-1150. Overall, LGTM.
It seems KIP-1163 and KIP-1164 require more details, which we can discuss
in those respective threads.

+1 (binding) for KIP-1150.

~Satish.

On Fri, 27 Feb 2026 at 23:28, Jun Rao via dev <[email protected]> wrote:

> Hi, Anatolii,
>
> Thanks for the KIP. The link you posted for KIP-1150 seems incorrect and it
> points to KIP-1163. Otherwise, +1.
>
> Jun
>
> On Wed, Feb 25, 2026 at 2:59 PM vaquar khan <[email protected]> wrote:
>
> > Fair point, Chris. I agree with that architectural boundary. KIP-1150
> > successfully sets the high-level mandate, and we can rigorously tackle the
> > exact EOS and RPC mechanics over in the KIP-1164 thread.
> >
> > Andrew, I am fully aligned with you on the massive operational value of
> > eliminating those cross-AZ replication costs. It is absolutely the right
> > strategic direction for Kafka.
> >
> > Since my initial concerns on the storage side are resolved, and we are
> > aligned on where the transactional interfaces will be finalized, I am
> > officially withdrawing my objection.
> > +1 (non-binding) for KIP-1150.
> >
> > I will migrate my open questions over to the KIP-1164 discussion thread so
> > we can lock down the data safety details there.
> >
> > Regards,
> > Vaquar Khan
> >
> > On Wed, 25 Feb 2026 at 15:24, Chris Egerton <[email protected]>
> > wrote:
> >
> > > Hi Vaquar,
> > >
> > > > Let me know what you guys think about locking down the text for these
> > > > interfaces.
> > >
> > > I think this KIP has the appropriate level of detail and any concerns about
> > > EOS can be addressed in the relevant sub-KIP.
> > >
> > > Chris
> > >
> > > On Wed, Feb 25, 2026 at 4:20 PM vaquar khan <[email protected]> wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > First off, thanks to the authors for the Feb 12th updates to KIP-1163.
> > > > Adding the periodic reconciliation loop clears up my concerns about the
> > > > orphaned "Upload-then-Commit" segments, so I'm officially withdrawing my
> > > > objection on the storage leak issue.
> > > >
> > > > Chris and Greg, since you both mentioned digging into the 1164 details, I
> > > > wanted to pick your brains on how Exactly-Once Semantics (EOS) is going to
> > > > safely operate here. In standard Kafka, the Partition Leader is our single
> > > > serialization point. It receives the data, tracks ongoing transactions via
> > > > the ProducerStateManager, and calculates the Last Stable Offset (LSO)
> > > > locally. Since KIP-1150 removes the leader, the Batch Coordinator takes
> > > > over. But as I read through the current text, a few critical
> > > > synchronization barriers seem to me to be missing:
> > > >
> > > > 1. LSO Calculation: How exactly will the Batch Coordinator maintain and
> > > > calculate the LSO? Justine Olshan brought this up earlier too. Will the
> > > > coordinator run its own ProducerStateManager to track ongoing transactions,
> > > > or is there a totally different state machine planned?
> > > >
> > > > 2. RPC Protocol: What's the exact synchronization protocol between the
> > > > legacy Transaction Coordinator and the new Batch Coordinator? When the Txn
> > > > Coordinator sends a commit marker, how does the Batch Coordinator actually
> > > > verify it has received all the prerequisite data batches for that specific
> > > > transaction epoch?
> > > >
> > > > 3. Delayed Data Race Condition: Let's say a broker hits a GC pause right
> > > > *after* uploading a batch to object storage, but *before* committing the
> > > > coordinates. If the transaction commit marker arrives at the Coordinator
> > > > first, what happens? Does the Coordinator wait? If not, couldn't the
> > > > transaction commit with missing data, completely violating read_committed
> > > > isolation?
> > > >
> > > > The KIP vaguely mentions *transactional checks* but leaves the actual
> > > > commit protocol and public interfaces undefined right now. I'm not saying
> > > > the design itself is broken, but I really think others and I need to
> > > > see these RPC flows explicitly documented before we implement and adopt
> > > > this. Otherwise, we risk baking in some severe data isolation headaches
> > > > down the line.
> > > >
> > > > Let me know what you guys think about locking down the text for these
> > > > interfaces.
> > > >
> > > > Regards,
> > > > Vaquar Khan
> > > >
> > > > On Wed, 25 Feb 2026 at 10:33, Greg Harris via dev <[email protected]>
> > > > wrote:
> > > >
> > > > > Hey all,
> > > > >
> > > > > I'm excited to discuss more details in 1163 and 1164 with everyone.
> > > > >
> > > > > +1 (binding)
> > > > >
> > > > > Thanks!
> > > > > Greg
> > > > >
> > > > > On Wed, Feb 25, 2026 at 1:08 AM Anatolii Popov via dev <[email protected]>
> > > > > wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > Given the importance of this KIP, we want to keep the vote open for a few
> > > > > > more days to give time to people who had comments in the DISCUSS thread to
> > > > > > cast their vote if they want.
> > > > > >
> > > > > > On Wed, Feb 25, 2026 at 10:47 AM Josep Prat via dev <[email protected]>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi all,
> > > > > > > As a co-author of the KIP, I want to explicitly cast my vote for this
> > > > > > > KIP.
> > > > > > >
> > > > > > > +1 (binding)
> > > > > > >
> > > > > > >
> > > > > > > On Wed, Feb 25, 2026 at 9:02 AM Luke Chen <[email protected]> wrote:
> > > > > > >
> > > > > > > > I've re-read KIP-1150, and still agree this is what we need for Apache
> > > > > > > > Kafka.
> > > > > > > >
> > > > > > > > +1 (binding) from me.
> > > > > > > >
> > > > > > > > Thank you,
> > > > > > > > Luke
> > > > > > > >
> > > > > > > > On Wed, Feb 25, 2026 at 12:10 PM Chris Egerton <[email protected]>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > >> Hi all,
> > > > > > > >>
> > > > > > > >> Thanks for the KIP. I've reviewed 1150, 1163, and 1164, as well as the
> > > > > > > >> relevant discussion threads. I may have granular comments about 1163 and
> > > > > > > >> 1164 but the overall approach suggested in 1150 looks good to me. I
> > > > > > > >> especially like that the approach covers two main pain points of operating
> > > > > > > >> and paying for Kafka today: it allows cross-AZ traffic to be reduced (even
> > > > > > > >> eliminated in some cases), and it also allows local disk usage by brokers
> > > > > > > >> to be reduced (if operators opt for a small local cache on follower
> > > > > > > >> brokers for non-tiered segments).
> > > > > > > >>
> > > > > > > >> +1 (binding)
> > > > > > > >>
> > > > > > > >> Cheers,
> > > > > > > >>
> > > > > > > >> Chris
> > > > > > > >>
> > > > > > > >> On Mon, Jan 26, 2026 at 3:36 PM vaquar khan <[email protected]>
> > > > > > > >> wrote:
> > > > > > > >>
> > > > > > > >> > Hi Josep,
> > > > > > > >> >
> > > > > > > >> > Thank you for the detailed response. I appreciate the clarification
> > > > > > > >> > regarding the distinction between the Inkless POC and the KIP design.
> > > > > > > >> >
> > > > > > > >> > However, my objection is not based on temporary bugs in the fork, but
> > > > > > > >> > *on architectural gaps in the KIPs themselves* that these implementation
> > > > > > > >> > issues highlighted. If we are voting to approve the design, the design
> > > > > > > >> > documents must be structurally complete regarding data safety.
> > > > > > > >> >
> > > > > > > >> > *1. Regarding Storage Leaks (The Missing Design)*
> > > > > > > >> > You mentioned that cleanup logic "can be defined later." However, KIP-1163
> > > > > > > >> > explicitly delegates this responsibility to a separate process, and
> > > > > > > >> > KIP-1165 (Object Compaction/GC) is currently marked as "Discarded" in the
> > > > > > > >> > wiki.
> > > > > > > >> >
> > > > > > > >> > We cannot vote to approve a storage engine that has no specified mechanism
> > > > > > > >> > for garbage collection. The "Upload-then-Commit" pattern described in
> > > > > > > >> > KIP-1163 structurally creates orphaned segments during broker failures.
> > > > > > > >> > Without an active KIP defining the reconciliation protocol (since KIP-1165
> > > > > > > >> > was withdrawn), the proposal effectively describes a system with unbounded
> > > > > > > >> > storage growth during failure modes. This is a blocking design gap, not an
> > > > > > > >> > implementation detail.
> > > > > > > >> >
> > > > > > > >> > *2. Regarding EOS (The Coordinator Synchronization Gap)*
> > > > > > > >> > This is not a misunderstanding of standard Kafka transactions; it is a
> > > > > > > >> > critique of how KIP-1150 changes them. Standard EOS relies on the Partition
> > > > > > > >> > Leader to sequence markers and calculate the LSO (Last Stable Offset) in
> > > > > > > >> > memory. KIP-1150 removes the Leader.
> > > > > > > >> >
> > > > > > > >> > KIP-1164 (Batch Coordinator) must explicitly define the RPC flow between
> > > > > > > >> > the Transaction Coordinator and the Batch Coordinator to replace the
> > > > > > > >> > leader's role. Currently, the KIP does not specify how the system prevents
> > > > > > > >> > a "Split Brain" scenario where a consumer reads ahead of a transaction
> > > > > > > >> > marker that hasn't yet been sequenced by the Batch Coordinator. This is a
> > > > > > > >> > protocol-level correctness issue that must be resolved in the text before
> > > > > > > >> > adoption.
> > > > > > > >> >
> > > > > > > >> > Please note: I am maintaining my objection based on missing
> > > > > > > >> > specifications, not code bugs.
> > > > > > > >> >
> > > > > > > >> > I respectfully request that we pause the vote until:
> > > > > > > >> >
> > > > > > > >> >     A valid design for Garbage Collection (replacing the discarded
> > > > > > > >> >     KIP-1165) is added to the proposal.
> > > > > > > >> >
> > > > > > > >> >     The Transaction/LSO synchronization protocol is explicitly documented
> > > > > > > >> >     in KIP-1164.
> > > > > > > >> >
> > > > > > > >> > Regards,
> > > > > > > >> >
> > > > > > > >> > Vaquar Khan
> > > > > > > >> > Sr Data Architect
> > > > > > > >> > https://www.linkedin.com/in/vaquar-khan-b695577/
> > > > > > > >> >
> > > > > > > >>
> > > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > *Josep Prat*
> > > > > > > Sr. Engineering Director, Streaming Services, *Aiven*
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Anatolii Popov
> > > > > > Senior Software Developer, *Aiven OY*
> > > > > >
> > > > >
> > > >
> > >
> >
>
