Re: [SPAM]Re: [DISCUSS] KIP-1150 Diskless Topics

Ivan Yurchenko Wed, 23 Apr 2025 05:14:52 -0700

Hi Yuxia!

Thank you for the question. We've just opened the discussion thread for
KIP-1164 [1]. If you don't mind, could you please repost your question there?
This would help a lot to keep the branchy discussion manageable.


Best,
Ivan

[1] https://lists.apache.org/thread/m9l6lbqv2cffxtz5frypylmqjd7bsqoz

On Wed, Apr 23, 2025, at 09:39, yuxia wrote:
> Hi!
> 
> Thanks for the great work; I'm excited to see it happen. These KIPs
> look good to me.
> I have a question about the Batch Coordinator in KIP-1164.
> The Batch Coordinator seems very important in the diskless implementation;
> could you explain its implementation in more detail?
> In particular, I'm wondering how it "chooses the total ordering for writes" and
> what the "information necessary to support idempotent producers" is.
> I'm thinking about the following case:
> 1: the client is going to send messages A, B, C to Kafka
> 2: the client sends A, B to broker1, and broker1 receives A, B
> 3: broker1 goes down, and the client sends C to broker2
> 4: since broker1 is down, the client's send of A, B fails, and it retries
> sending A, B to broker2
> Then how can the Batch Coordinator choose the total order to be A, B, C?
> 
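For readers following the scenario above, here is a minimal sketch of how per-producer sequence numbers can let a coordinator reject out-of-order batches and drop duplicate retries. It only illustrates the general idempotent-producer idea and is not the KIP-1164 design; `PartitionLog`, `try_commit`, and the result strings are hypothetical names, and sequence numbers are assumed to start at 0 for each producer.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PartitionLog:
    """Hypothetical coordinator-side state for a single partition."""
    records: List[str] = field(default_factory=list)        # committed records, in order
    next_seq: Dict[int, int] = field(default_factory=dict)  # producer id -> next expected sequence

    def try_commit(self, producer_id: int, first_seq: int, batch: List[str]) -> str:
        expected = self.next_seq.get(producer_id, 0)
        if first_seq + len(batch) <= expected:
            # Every record in the batch was committed earlier: this is a retry.
            return "DUPLICATE"
        if first_seq != expected:
            # A gap (or partial overlap): refuse, so the producer retries later.
            return "OUT_OF_ORDER"
        self.records.extend(batch)
        self.next_seq[producer_id] = expected + len(batch)
        return "APPENDED"


# The scenario from the question, assuming broker1 failed before A, B were committed.
log = PartitionLog()
results = [
    log.try_commit(7, 2, ["C"]),       # C arrives via broker2 first
    log.try_commit(7, 0, ["A", "B"]),  # retried A, B arrive via broker2
    log.try_commit(7, 2, ["C"]),       # the producer retries C
    log.try_commit(7, 0, ["A", "B"]),  # a late duplicate of A, B
]
print(results, log.records)
```

Under these assumptions, C cannot be committed ahead of A and B regardless of which broker forwards it, and a late retry of A, B is acknowledged without being appended twice.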
> 
> Best regards,
> Yuxia
> 
> ----- Original Message -----
> From: "Christo Lolov" <[email protected]>
> To: [email protected]
> Sent: Tuesday, April 22, 2025, 9:04:06 PM
> Subject: [SPAM]Re: [DISCUSS] KIP-1150 Diskless Topics
> 
> Hello!
> 
> I want to start by saying that this is a big and impressive undertaking
> and I am really excited to see its progression! I am posting my initial
> comments in this thread, but they span a few of the child KIPs. Let me know
> which questions you would like to move elsewhere. I understand that you
> first want a consensus on the direction, but I think I still need designs
> on a few of the core areas to form an opinion.
> 
> CL - 1: In the same vein as Luke's comment, it would be very useful to see
> explicitly what will stay on disk and what won't.
> 
> CL - 2: It would also be very useful to explicitly say what the
> interactions will be with the KRaft-related topic - would it be diskless or
> on disk?
> 
> CL - 3: Do you envision that this feature will work with KIP-932?
> 
> CL - 4: KIP-1163 says that there won't be a production-grade implementation
> of the Batch Coordinator and KIP-1164 says the opposite. Which one would it
> be?
> 
> CL - 5: KIP-1163 says that the Batch Coordinator doesn't need to concern
> itself with object storage and KIP-1164 says that it will manage the object
> physical deletion. Which one would it be?
> 
> CL - 6: Could you go into a bit more detail on whether we would need changes
> to the Kafka clients to achieve what you are proposing? If no changes are
> necessary to the clients, then what changes would be necessary to brokers to
> make clients believe they are communicating with the "right" brokers? Would
> those make it into KIP-1163?
> 
> CL - 7: Where and how would indexes (offset, time, producer snapshot) live?
> In particular, I am interested in how the reference Batch Coordinator will
> quickly (for a certain definition of quickly) rebuild state?
> 
> CL - 8: I think that we try to have as few Kafka dependencies as possible.
> The closure of compile + runtime broker-only dependencies is currently 16
> (if I have done my analysis correctly). What problem(s) do you envision
> w.r.t. spilling to disk which we wouldn't be able to solve with our own
> implementation that require SQLite?
> 
> Once again, great work so far!
> 
> Best,
> Christo
> 
> On Sun, 20 Apr 2025 at 23:04, Stanislav Kozlovski <
> [email protected]> wrote:
> 
> > This is an amazing initiative. Huge kudos for driving it. We should
> > incorporate it one way or another.
> >
> > I have a suggestion I'd like to hear your thoughts on. I'm cognizant of
> > the effort required for KIP-1150 so I don't necessarily want to increase
> > the scope - but thinking about this early on can help design later on, plus
> > shape the motivation.
> >
> > The idea is to introduce support for replicationless acks=1 writes. This
> > would be very similar to how AutoMQ's WAL+S3 feature works, as far as I
> > understand it.
> >
> > Could we have Diskless Brokers serve acks=1 produce requests by
> > immediately persisting the data on disk (not sure if we should use fsync or
> > not), responding to the request, and then still asynchronously batching
> > said data with regular acks=all data via the
> > "diskless.append.commit.interval.ms" / "diskless.append.buffer.max.bytes"
> > configs?
> >
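To make the batching part of this suggestion concrete, here is a rough sketch of a buffer that flushes when either a time or a size threshold is reached. The two constructor parameters mirror the config names quoted above, but their semantics here (interval measured from the first buffered record, flush once buffered bytes reach the limit) are assumptions, and `AppendBuffer` is a hypothetical class, not code from the KIP.

```python
import time
from typing import Callable, List, Optional


class AppendBuffer:
    """Hypothetical buffer that flushes on a time or size threshold."""

    def __init__(self, commit_interval_ms: int, max_bytes: int,
                 clock: Callable[[], float] = time.monotonic):
        self.commit_interval_s = commit_interval_ms / 1000.0
        self.max_bytes = max_bytes
        self.clock = clock                      # injectable so tests can control time
        self.pending: List[bytes] = []
        self.pending_bytes = 0
        self.first_append_at: Optional[float] = None

    def append(self, payload: bytes) -> Optional[List[bytes]]:
        """Buffer a payload; return the flushed batch if a threshold was hit."""
        if not self.pending:
            self.first_append_at = self.clock()
        self.pending.append(payload)
        self.pending_bytes += len(payload)
        return self.flush_if_due()

    def flush_if_due(self) -> Optional[List[bytes]]:
        """Return and clear the pending batch if it is too big or too old."""
        if not self.pending:
            return None
        too_big = self.pending_bytes >= self.max_bytes
        too_old = self.clock() - self.first_append_at >= self.commit_interval_s
        if not (too_big or too_old):
            return None
        batch, self.pending, self.pending_bytes = self.pending, [], 0
        self.first_append_at = None
        return batch
```

In the acks=1 variant proposed here, the producer would already have been answered after the local write, so a flushed batch would only need to be uploaded together with the regular acks=all data. A caller would also need a timer that invokes `flush_if_due()` periodically, since `append()` alone never fires on an idle buffer.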
> > If I'm not mistaken, this would offer guarantees very similar to today's
> > acks=1 requests, where a period of low durability exists between the time the
> > leader persists to its local disk and the time all followers persist to
> > their disk. Granted, in traditional Kafka this period is probably no more
> > than a hundred milliseconds, and here it'd be at least 2x higher. But I
> > believe that given the major savings, many acks=1 users will be happy to
> > make the tradeoff.
> >
> > While on the topic of cost, I hastily ran some cost calculations and found
> > that the KIP should reduce replication costs by more than 80x
> > (https://topicpartition.io/blog/kip-1150-diskless-topics-in-apache-kafka).
> > There may be some errors there, as the batch coordinator RPC and merging
> > aren't fully fleshed out, but I believe it's directionally correct. It may
> > be worth adding that to the motivation in one way or another, so as to be
> > able to quantify the savings.
> >
> > Best,
> > Stanislav
> >
> > On 2025/04/19 11:02:30 Ivan Yurchenko wrote:
> > > Hi Ziming,
> > >
> > > > 1. Is this feature available through just a minor config adjustment, or will
> > > > it intrude heavily on the current code? For example, AutoMQ is 100% compatible
> > > > with Kafka and doesn't intrude on the code heavily.
> > >
> > > If we speak about the part visible to the user, we expect:
> > >  1. Minimal changes to the client code (with a potential fallback that
> > >     requires zero changes for older clients).
> > >  2. A limited set of new configurations for broker and topics.
> > > Otherwise, this should be a perfectly normal Apache Kafka.
> > >
> > > > 2. Though we are not discussing implementation details, it's worth giving
> > > > some high-level architecture ideas, and it would be better to compare with
> > > > AutoMQ-like systems.
> > >
> > > There's quite a bit of high-level architecture in sub-KIP-1163 [1].
> > > We didn't do a comparison with AutoMQ (to the best of our knowledge, they
> > > have a fairly different approach), but if this helps the community to get
> > > the idea then sure, we should do this.
> > >
> > > > 3. What will we provide through it? I think we will just provide a
> > > > common interface and put implementations in other repos, just as we did
> > > > for Kafka Connect and Kafka Tiered Storage.
> > >
> > > This is true for the component that does CRUD operations on object
> > > storage. However, for the batch coordinator we would like to provide a
> > > decent out-of-the-box, self-contained (i.e. no external deps like a database)
> > > implementation that many Kafka users who don't have challenging scaling
> > > requirements would benefit from. There's sub-KIP-1164 [2] for this.
> > >
> > > > 4. How will this deal with the KRaft-related protocol? The metadata topic,
> > > > __cluster_metadata, is managed differently from data topics. Through this KIP,
> > > > will we close the gap between __cluster_metadata and data topics by putting
> > > > metadata in object storage? If so, will there be no standby controllers, since
> > > > standby controllers are the __cluster_metadata followers and there will be no
> > > > followers?
> > >
> > > The current plan is not to work directly with KRaft and
> > > __cluster_metadata. What we need from KRaft is three types of events:
> > > topic/partition creation, topic deletion, and topic configuration changes
> > > (with the possibility to limit this set to topic deletion only). We think
> > > that'd be enough if we have a "bridge" that watches for these events in
> > > __cluster_metadata and reflects them in the batch coordinator (basically,
> > > by sending requests).
> > > Does this answer the question, or did I misunderstand?
> > >
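The "bridge" described above could look roughly like the sketch below: it filters a stream of metadata events and forwards only the relevant ones to the batch coordinator. The event kinds, `MetadataEvent`, `RecordingCoordinator`, and `run_bridge` are all hypothetical names; the actual record types in __cluster_metadata and the coordinator's request API are not specified in this thread.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List

# Hypothetical event kinds standing in for the three event types named above.
DEFAULT_KINDS = frozenset({"TOPIC_CREATED", "TOPIC_DELETED", "TOPIC_CONFIG_CHANGED"})


@dataclass(frozen=True)
class MetadataEvent:
    kind: str
    topic: str


class RecordingCoordinator:
    """Stand-in for the batch coordinator; it only records the requests it gets."""

    def __init__(self) -> None:
        self.requests: List[MetadataEvent] = []

    def send(self, event: MetadataEvent) -> None:
        self.requests.append(event)


def run_bridge(events: Iterable[MetadataEvent],
               coordinator: RecordingCoordinator,
               kinds: FrozenSet[str] = DEFAULT_KINDS) -> int:
    """Forward events of the selected kinds, in order; return how many were sent."""
    forwarded = 0
    for event in events:
        if event.kind in kinds:
            coordinator.send(event)
            forwarded += 1
    return forwarded
```

Passing `kinds=frozenset({"TOPIC_DELETED"})` corresponds to the reduced variant mentioned above, where only topic deletions are reflected in the coordinator.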
> > > Best,
> > > Ivan
> > >
> > > [1] https://cwiki.apache.org/confluence/display/KAFKA/KIP-1163%3A+Diskless+Core
> > > [2] https://cwiki.apache.org/confluence/display/KAFKA/KIP-1164%3A+Topic+Based+Batch+Coordinator
> > >
> > > On Fri, Apr 18, 2025, at 12:42, Ziming Deng wrote:
> > > > Hi Josep,
> > > >
> > > > This would be a fascinating feature; some well-known Kafka users are
> > > > using Kafka in a cloud-native environment. As far as I know, there are
> > > > already some derivative versions of Kafka that provide this feature. For
> > > > example, I am using AutoMQ (https://github.com/AutoMQ/automq) in my
> > > > environment, which has significantly helped me reduce costs, so I think
> > > > it's worthwhile to clarify some related details:
> > > > 1. Is this feature available through just a minor config adjustment, or will
> > > > it intrude heavily on the current code? For example, AutoMQ is 100% compatible
> > > > with Kafka and doesn't intrude on the code heavily.
> > > > 2. Though we are not discussing implementation details, it's worth giving
> > > > some high-level architecture ideas, and it would be better to compare with
> > > > AutoMQ-like systems.
> > > > 3. What will we provide through it? I think we will just provide a
> > > > common interface and put implementations in other repos, just as we did
> > > > for Kafka Connect and Kafka Tiered Storage.
> > > > 4. How will this deal with the KRaft-related protocol? The metadata topic,
> > > > __cluster_metadata, is managed differently from data topics. Through this KIP,
> > > > will we close the gap between __cluster_metadata and data topics by putting
> > > > metadata in object storage? If so, will there be no standby controllers, since
> > > > standby controllers are the __cluster_metadata followers and there will be no
> > > > followers?
> > > >
> > > > —
> > > > Ziming
> > > >
> > > > > On Apr 16, 2025, at 19:58, Josep Prat <[email protected]> wrote:
> > > > >
> > > > > Hi Kafka Devs!
> > > > >
> > > > > We want to start a new KIP discussion about introducing a new type of
> > > > > topic that would use Object Storage as the primary source of
> > > > > storage. However, as this KIP is big, we decided to split it into
> > > > > multiple related KIPs.
> > > > > We have the motivational KIP-1150
> > > > > (https://cwiki.apache.org/confluence/display/KAFKA/KIP-1150%3A+Diskless+Topics)
> > > > > that aims to discuss if Apache Kafka should aim to have this type of
> > > > > feature at all. This KIP doesn't go into details on how to implement
> > > > > it. This follows the same approach used when we discussed KRaft.
> > > > >
> > > > > But as we know that it is sometimes really hard to discuss at that
> > > > > meta level, we also created several sub-KIPs (linked in KIP-1150) that
> > > > > offer an implementation of this feature.
> > > > >
> > > > > We kindly ask you to use the proper DISCUSS threads for each type of
> > > > > concern and keep this one to discuss whether Apache Kafka wants to
> > > > > have this feature or not.
> > > > >
> > > > > Thanks in advance on behalf of all the authors of this KIP.
> > > > >
> > > > > ------------------
> > > > > Josep Prat
> > > > > Open Source Engineering Director, Aiven
> > > > > [email protected]   |   +491715557497 | aiven.io
> > > > > Aiven Deutschland GmbH
> > > > > Alexanderufer 3-7, 10117 Berlin
> > > > > Geschäftsführer: Oskari Saarenmaa, Hannu Valtonen,
> > > > > Anna Richardson, Kenneth Chen
> > > > > Amtsgericht Charlottenburg, HRB 209739 B
> > > >
> > > >
> > >
> >
>
