Hello Kafka devs! I finally found some time to sit down with this topic, I hope I'm not too late to the party with my thoughts ;)
I like the proposed solution a lot, both how thoroughly it is explained and how well it is thought through. In essence it does more or less what Bufstream and WarpStream are doing, so the high-level concept is already validated, and the proposed solution is well integrated into the Kafka architecture, leveraging KRaft and Kafka topics for coordination, whereas Bufstream and WarpStream use independent components to achieve similar functionality. I also like the idea of a pluggable coordinator service in case the need for other implementations emerges over time.

I have gone through the documented KIPs. Some of my thoughts are concrete change proposals, others are just ideas I'm putting forward. I'd love to hear what you think about all of this.

KIP-1163 (Diskless Core):

1163.1: "Brokers will be able to process requests containing partitions from both classic and new diskless topics in the same batch. When responding to these mixed-typed requests, the broker will be delayed until the diskless partitions are committed."
I'm not sure if this was already proposed, but maybe it would be feasible to make the producer behaviour configurable as part of this KIP? The latency of diskless and regular topics will be very different, and this might result in some unexpected behaviour.

KIP-1164 (Batch Coordinator):

1. Usage of topics for metadata management and the outlined scalability issues. The "Future work" section of KIP-1164 mentions that tiered storage or snapshots (or both) will be used to manage the growing state of the metadata topic. Is it a good idea to accept the design without this? Diskless Topics can't be production ready without this problem solved, right? Tiered Storage and snapshots in S3 seem like a comprehensive solution to this issue: Tiered Storage addresses the size of the metadata topic, while snapshots address Batch Coordinator performance. I'd love to see a discussion of this be part of KIP-1164 as well.

2. "If the topic becomes fully unavailable, Batch Coordinator instances would be able to serve read-only operations from their local materialized state, which isn't guaranteed to be up-to-date."
Can we enforce acks=all-like behaviour here, so that the acknowledgement is only returned to the producer once the metadata about the written object has been replicated to the required number of in-sync replicas? I assumed this behaviour, but if it's not in the design, maybe it should be part of the specification? If I understand the problem correctly, this should solve the issue.

KIP-1165 (Object Compaction):

Intuitively I feel some resistance to the idea of keeping shared log segments after object compaction, but I don't understand the process deeply enough to have a strong opinion here. Still, I would like to propose three ideas for discussion:

1. Design the algorithm so that compacted objects prefer grouping by topic-partition and never mix different topics together. Grouping topics in shared log segments after compaction seems like a serious mistake to me. This approach would also resolve the proposed dilemma of whether it would make sense to keep compacted topics separate, as each topic would already be separate. (A rough sketch of what I have in mind follows this list.)

2. I would suggest excluding topic compaction from the first iteration. If I understand correctly, it's being discussed here, but AFAIK topic compaction isn't supported for Tiered Storage either. I see KIP-O mentioning garbage collection; I would definitely propose starting with cleanup.policy=delete only. Maybe I misunderstood the intention here? Which takes me to my third point:

3. I would prefer not to call this feature Object Compaction. I think the name can create confusion with the existing Kafka feature of Log Compaction.
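To make point 1 a bit more concrete, here is a minimal sketch of the grouping constraint I have in mind. This is plain illustrative Java, not code from the KIPs; all of the names (BatchRef, TopicScopedCompactionPlanner, and so on) are made up for the example.

import java.util.*;
import java.util.stream.*;

// Illustrative only: these types do not exist in the KIPs, they are
// placeholders to show the grouping constraint I'm proposing.
record BatchRef(String topic, int partition, long baseOffset, long sizeBytes) {}

class TopicScopedCompactionPlanner {
    // Group candidate WAL batches so that one compacted object only ever
    // contains batches from a single topic; within a topic, keep batches of
    // the same partition adjacent and in offset order.
    static Map<String, List<BatchRef>> plan(List<BatchRef> candidates) {
        return candidates.stream()
                .sorted(Comparator.comparing(BatchRef::topic)
                        .thenComparingInt(BatchRef::partition)
                        .thenComparingLong(BatchRef::baseOffset))
                .collect(Collectors.groupingBy(
                        BatchRef::topic, LinkedHashMap::new, Collectors.toList()));
        // Each map entry would then be written out as one or more compacted
        // objects (possibly split further by a target object size), so topics
        // are never mixed in the same object.
    }
}

The only point of the sketch is that the planner's output is keyed by topic, so a compacted object can never contain batches from more than one topic; a real implementation would of course also have to respect object size targets and ordering guarantees.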
All in all, kudos to Aiven for the work on this feature. It's great to see this happening so quickly and I like the design a lot. I hope my comments are helpful and I'm looking forward to seeing this feature in Kafka soon!

Kind regards,
Jan Siekierski

On 2025/04/16 11:58:22 Josep Prat wrote:
> Hi Kafka Devs!
>
> We want to start a new KIP discussion about introducing a new type of
> topics that would make use of Object Storage as the primary source of
> storage. However, as this KIP is big we decided to split it into multiple
> related KIPs.
> We have the motivational KIP-1150 (
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1150%3A+Diskless+Topics)
> that aims to discuss if Apache Kafka should aim to have this type of
> feature at all. This KIP doesn't go onto details on how to implement it.
> This follows the same approach used when we discussed KRaft.
>
> But as we know that it is sometimes really hard to discuss on that meta
> level, we also created several sub-kips (linked in KIP-1150) that offer an
> implementation of this feature.
>
> We kindly ask you to use the proper DISCUSS threads for each type of
> concern and keep this one to discuss whether Apache Kafka wants to have
> this feature or not.
>
> Thanks in advance on behalf of all the authors of this KIP.
>
> ------------------
> Josep Prat
> Open Source Engineering Director, Aiven
> josep.p...@aiven.io | +491715557497 | aiven.io
> Aiven Deutschland GmbH
> Alexanderufer 3-7, 10117 Berlin
> Geschäftsführer: Oskari Saarenmaa, Hannu Valtonen,
> Anna Richardson, Kenneth Chen
> Amtsgericht Charlottenburg, HRB 209739 B
>