Hi Josep,

Thanks for the KIP.
I think there's a bit of confusion in the motivation and naming here. As Jun said, what's being proposed here is not truly "diskless" -- we're still storing a fair amount of metadata on local disks.

The proposal lists "Unification/Relationship with Tiered Storage: Identifying a long-term vision for Diskless and Tiered Storage plugins" as "future work." But when we're adding a new feature, we should consider how it interacts with existing features before we add it, not after it's already in place.

To that end, it's useful to compare this KIP against KIP-1176: Tiered Storage for Active Log Segment. In their current forms, both KIP-1176 and KIP-1150 require small disks on each broker. Traditional Kafka tiered storage essentially lets us treat S3 (or another blobstore) as cold storage for older data. KIP-1176 is essentially a refinement of that model that allows us to tier the active log segments as well.

As it stands currently, the big advantage of KIP-1150 over traditional tiered storage is that with KIP-1150, you don't have to send most of your data through normal Kafka replication. This, in turn, is mainly about saving costs on clouds where cross-AZ replication traffic is expensive.

When I read KIP-1163, I see the following:

> 1. Producers send Produce requests to any broker.
> 2. The broker accumulates Produce requests in a buffer until exceeding some
>    size or time limit.
> 3. When enough data accumulates or the timeout elapses, the broker creates a
>    shared log segment and batch coordinates for all of the buffered batches.
> 4. The shared log segment is uploaded to object storage and is written
>    durably.
> 5. The broker commits the batch coordinates with the Batch Coordinator
>    (described in detail in KIP-1164).
> 6. The Batch Coordinator assigns offsets to the written batches, persists the
>    batch coordinates, and responds to the broker.
> 7. The broker sends responses to all Produce requests that are associated
>    with the committed object.
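For concreteness, here is a rough Python sketch of how I read those seven steps. Everything in it is my own illustrative invention (the class names, the `commit` signature, the size/time thresholds) -- it is not from the KIPs, just a toy model of buffer-until-threshold, upload, then commit-and-ack:

```python
# Toy model of the KIP-1163 produce path as quoted above.
# All names and thresholds here are hypothetical, not from the KIP.

class BatchCoordinator:
    """Stand-in for the KIP-1164 coordinator: assigns offsets, persists coordinates."""

    def __init__(self):
        self.next_offset = 0
        self.committed = []  # batch coordinates, in commit order

    def commit(self, object_key, batch_sizes):
        # Steps 5-6: assign offsets to each batch in the uploaded object
        # and durably record (object, offset, size) coordinates.
        coords = []
        for size in batch_sizes:
            coords.append((object_key, self.next_offset, size))
            self.next_offset += size
        self.committed.extend(coords)
        return coords


class DisklessProducePath:
    def __init__(self, coordinator, object_store, max_bytes, max_wait_ms):
        self.coordinator = coordinator
        self.object_store = object_store  # dict standing in for S3
        self.max_bytes = max_bytes
        self.max_wait_ms = max_wait_ms
        self.buffer = []  # payloads of pending, un-acked Produce requests
        self.buffered_bytes = 0
        self.first_append_ms = None

    def on_produce(self, payload, now_ms):
        # Step 2: accumulate Produce requests in a buffer.
        if self.first_append_ms is None:
            self.first_append_ms = now_ms
        self.buffer.append(payload)
        self.buffered_bytes += len(payload)
        return self.maybe_flush(now_ms)

    def maybe_flush(self, now_ms):
        # Step 3: flush only once the size or time limit is exceeded.
        over_size = self.buffered_bytes >= self.max_bytes
        over_time = (self.first_append_ms is not None
                     and now_ms - self.first_append_ms >= self.max_wait_ms)
        if not (over_size or over_time):
            return None  # callers keep waiting; nothing is acked yet
        # Step 4: write one shared log segment durably to object storage.
        object_key = "segment-%d" % len(self.object_store)
        self.object_store[object_key] = b"".join(self.buffer)
        # Steps 5-6: commit coordinates; the coordinator assigns offsets.
        coords = self.coordinator.commit(
            object_key, [len(p) for p in self.buffer])
        self.buffer, self.buffered_bytes, self.first_append_ms = [], 0, None
        # Step 7: the caller would now ack every buffered Produce request.
        return coords
```

The point of the sketch is that the end-to-end produce latency is bounded below by the buffering window plus the object-store PUT, which is what question A below is about.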
To me this raises a few questions:

A. What kind of latencies should we expect here? It seems like we're both buffering lots of Produce requests and waiting until they're written to S3.

B. Could we do something similar with KIP-1176 by not acking the ProduceRequest until tiering had caught up to what we produced? That would have higher latency, but maybe not higher than KIP-1150 (see point A). If we could do that, then maybe the cost advantage of KIP-1150 disappears, since I could put all the replicas of my topic in one AZ and ensure durability by waiting for S3.

Another piece of feedback I would give is that I do not think the Batch Coordinator should be pluggable. Since this is a central part of the system, we should focus our efforts on designing a single good one, rather than having lots of pluggable ones. Making this pluggable will also make it difficult to evolve the system in the future. We should present a compelling use case for pluggability before introducing it. (In the case of supporting all the different blobstores, the need for pluggability is obvious, of course.)

For "Compatibility, Deprecation, and Migration Plan," we just have some text saying that this feature didn't exist before, and now it will. But that isn't very helpful. Instead, we should try to spell out which parts of the system will come with compatibility guarantees. For example, will the format in which we write data to S3 (or another blobstore) be stable and documented, so that third-party tools can work with it? Or will we keep it internal and unstable?

best,
Colin

On Wed, Apr 16, 2025, at 04:58, Josep Prat wrote:
> Hi Kafka Devs!
>
> We want to start a new KIP discussion about introducing a new type of
> topics that would make use of Object Storage as the primary source of
> storage. However, as this KIP is big we decided to split it into multiple
> related KIPs.
> We have the motivational KIP-1150 (
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1150%3A+Diskless+Topics)
> that aims to discuss if Apache Kafka should aim to have this type of
> feature at all. This KIP doesn't go into details on how to implement it.
> This follows the same approach used when we discussed KRaft.
>
> But as we know that it is sometimes really hard to discuss on that meta
> level, we also created several sub-KIPs (linked in KIP-1150) that offer an
> implementation of this feature.
>
> We kindly ask you to use the proper DISCUSS threads for each type of
> concern and keep this one to discuss whether Apache Kafka wants to have
> this feature or not.
>
> Thanks in advance on behalf of all the authors of this KIP.
>
> ------------------
> Josep Prat
> Open Source Engineering Director, Aiven
> josep.p...@aiven.io | +491715557497 | aiven.io
> Aiven Deutschland GmbH
> Alexanderufer 3-7, 10117 Berlin
> Geschäftsführer: Oskari Saarenmaa, Hannu Valtonen,
> Anna Richardson, Kenneth Chen
> Amtsgericht Charlottenburg, HRB 209739 B