Hello Kafka devs! I finally found some time to sit down with this topic, I hope I'm not too late to the party with my thoughts ;)
I like the proposed solution a lot, both how thoroughly it is explained and how well it is thought through. In essence it does more or less what Bufstream and WarpStream are doing, so the high-level concept is already validated, and the proposed solution is well integrated into the Kafka architecture, leveraging KRaft and Kafka topics for coordination, whereas Bufstream and WarpStream use independent components to achieve similar functionality. I also like the idea of a pluggable coordinator service in case the need for other implementations emerges over time.

I have gone through the documented KIPs. Some of my thoughts are concrete change proposals, others are just ideas I'm putting forward. I'd love to hear what you think about all of this.

KIP-1163 (Diskless Core):

1163.1: "Brokers will be able to process requests containing partitions from both classic and new diskless topics in the same batch. When responding to these mixed-typed requests, the broker will be delayed until the diskless partitions are committed."
I'm not sure if this was already proposed, but maybe it would be feasible to make the producer behaviour configurable as part of this KIP? The latency of diskless and regular topics will be very different, and this might result in some unexpected behaviour.

KIP-1164 (Batch Coordinator):

1. Usage of topics for metadata management and the outlined scalability issues. The "Future work" section of KIP-1164 mentions that tiered storage or snapshots (or both) will be used to manage the growing state of the metadata topic. Is it a good idea to accept the design without this? Diskless Topics can't be production ready without this problem solved, right? Tiered Storage and snapshots in S3 seem like a comprehensive solution to this issue: Tiered Storage addresses the size of the metadata topic, while snapshots address Batch Coordinator performance. I'd love to see a discussion of this be part of KIP-1164 as well.

2. "If the topic becomes fully unavailable, Batch Coordinator instances would be able to serve read-only operations from their local materialized state, which isn't guaranteed to be up-to-date."
Can we enforce acks=all-like behaviour here, so that the acknowledgement is only returned to the producer once the metadata about the written object has been replicated to the required number of in-sync replicas? I assumed this behaviour, but if it's not in the design, maybe it should be part of the specification? If I understand the problem correctly, this should solve the issue.

KIP-1165 (Object Compaction):

Intuitively I feel some resistance to the idea of keeping shared log segments after object compaction, but I don't understand the process deeply enough to have a strong opinion here. Still, I would like to propose three ideas for discussion:

1. Design the algorithm so that compacted objects prefer grouping by topic-partition and never mix different topics together. Grouping topics in shared log segments after compaction seems like a serious mistake to me. This approach would also resolve the proposed dilemma of whether it would make sense to keep compacted topics separate, as each topic would already be separate. (A rough sketch of what I have in mind follows this list.)

2. I would suggest excluding topic compaction from the first iteration. If I understand correctly, it's being discussed here, but AFAIK topic compaction isn't supported for Tiered Storage either. I see KIP-O mentioning garbage collection; I would definitely propose starting with cleanup.policy=delete only. Maybe I misunderstood the intention here? Which takes me to my third point:

3. I would prefer not to call this feature Object Compaction. I think the name can create confusion with the existing Kafka feature of Log Compaction.
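To make point 1 a bit more concrete, here is a minimal sketch of the grouping constraint I have in mind. This is plain illustrative Java, not code from the KIPs; all of the names (BatchRef, TopicScopedCompactionPlanner, and so on) are made up for the example.

import java.util.*;
import java.util.stream.*;

// Illustrative only: these types do not exist in the KIPs, they are
// placeholders to show the grouping constraint I'm proposing.
record BatchRef(String topic, int partition, long baseOffset, long sizeBytes) {}

class TopicScopedCompactionPlanner {
    // Group candidate WAL batches so that one compacted object only ever
    // contains batches from a single topic; within a topic, keep batches of
    // the same partition adjacent and in offset order.
    static Map<String, List<BatchRef>> plan(List<BatchRef> candidates) {
        return candidates.stream()
                .sorted(Comparator.comparing(BatchRef::topic)
                        .thenComparingInt(BatchRef::partition)
                        .thenComparingLong(BatchRef::baseOffset))
                .collect(Collectors.groupingBy(
                        BatchRef::topic, LinkedHashMap::new, Collectors.toList()));
        // Each map entry would then be written out as one or more compacted
        // objects (possibly split further by a target object size), so topics
        // are never mixed in the same object.
    }
}

The only point of the sketch is that the planner's output is keyed by topic, so a compacted object can never contain batches from more than one topic; a real implementation would of course also have to respect object size targets and ordering guarantees.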
All in all, kudos to Aiven for the work on this feature. It's great to see this happening so quickly and I like the design a lot. I hope my comments are helpful and I'm looking forward to seeing this feature in Kafka soon!

Kind regards,
Jan Siekierski

On 2025/04/16 11:58:22 Josep Prat wrote:
> Hi Kafka Devs!
>
> We want to start a new KIP discussion about introducing a new type of
> topics that would make use of Object Storage as the primary source of
> storage. However, as this KIP is big we decided to split it into multiple
> related KIPs.
> We have the motivational KIP-1150 (
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1150%3A+Diskless+Topics)
> that aims to discuss if Apache Kafka should aim to have this type of
> feature at all. This KIP doesn't go onto details on how to implement it.
> This follows the same approach used when we discussed KRaft.
>
> But as we know that it is sometimes really hard to discuss on that meta
> level, we also created several sub-kips (linked in KIP-1150) that offer an
> implementation of this feature.
>
> We kindly ask you to use the proper DISCUSS threads for each type of
> concern and keep this one to discuss whether Apache Kafka wants to have
> this feature or not.
>
> Thanks in advance on behalf of all the authors of this KIP.
>
> ------------------
> Josep Prat
> Open Source Engineering Director, Aiven
> josep.p...@aiven.io | +491715557497 | aiven.io
> Aiven Deutschland GmbH
> Alexanderufer 3-7, 10117 Berlin
> Geschäftsführer: Oskari Saarenmaa, Hannu Valtonen,
> Anna Richardson, Kenneth Chen
> Amtsgericht Charlottenburg, HRB 209739 B
>