Hi, Josep,

Thanks for the KIP. At a high level, the KIP is well thought through and provides multiple benefits for Kafka in the Cloud. A few comments below.

JR1. One of the key motivations is to eliminate inter-zone data transfer costs from Kafka replication. It would be useful to provide a short summary of the cost savings for each of the major Cloud providers. As people mentioned in another email, Azure currently doesn't charge for inter-zone data transfer.

JR2. Transactions on Diskless Topics is listed as future work. Currently, we try to support all existing client APIs for every new feature. For example, remote storage (KIP-405) supported transactions in its very first release. Similarly, Queues for Kafka (KIP-932) will support transactions and remote storage in its first release. The reasoning is that without full support for all client APIs, it is hard for a Kafka admin to adopt the new feature, since it has the potential to break existing or new users. So, it would be better if this KIP followed the current convention and supported all existing client APIs, such as transactions and Queues for Kafka. The current implementations of both transactions and queues depend on a partition leader. Since this KIP no longer has partition leaders, it would be useful to think through how those APIs can be supported in the new architecture.

JR3. "Permit multi-region active-active topics with automatic failover". Could you elaborate on the benefit of this? Cloud providers still charge for cross-region data transfer in object stores, right?

JR4. "Balance traffic among brokers and eliminate broker hotspots with per-client granularity". Does that mean all traffic from a client is served by a single broker? This seems to reduce scalability from the client's perspective.

JR5. Regarding the name "diskless": it might be OK, but people may associate it with lower durability. Under the covers, the Cloud storage will still store the data on some disks. I am wondering if there is another name that captures the essence without the potential negative impression.

Thanks,

Jun

On Wed, Apr 16, 2025 at 5:00 AM Josep Prat <josep.p...@aiven.io.invalid> wrote:

> Hi Kafka Devs!
>
> We want to start a new KIP discussion about introducing a new type of
> topics that would make use of Object Storage as the primary source of
> storage. However, as this KIP is big we decided to split it into multiple
> related KIPs.
> We have the motivational KIP-1150 (
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1150%3A+Diskless+Topics
> )
> that aims to discuss if Apache Kafka should aim to have this type of
> feature at all. This KIP doesn't go onto details on how to implement it.
> This follows the same approach used when we discussed KRaft.
>
> But as we know that it is sometimes really hard to discuss on that meta
> level, we also created several sub-kips (linked in KIP-1150) that offer an
> implementation of this feature.
>
> We kindly ask you to use the proper DISCUSS threads for each type of
> concern and keep this one to discuss whether Apache Kafka wants to have
> this feature or not.
>
> Thanks in advance on behalf of all the authors of this KIP.
>
> ------------------
> Josep Prat
> Open Source Engineering Director, Aiven
> josep.p...@aiven.io | +491715557497 | aiven.io
> Aiven Deutschland GmbH
> Alexanderufer 3-7, 10117 Berlin
> Geschäftsführer: Oskari Saarenmaa, Hannu Valtonen,
> Anna Richardson, Kenneth Chen
> Amtsgericht Charlottenburg, HRB 209739 B