Hi, Josep,

Thanks for the KIP. At a high level, the KIP is well thought through and
provides multiple benefits for Kafka in the Cloud. A few comments below.

JR1. One of the key motivations is to eliminate inter-zone data transfer
costs from Kafka replication. It would be useful to provide a short summary
of the cost savings on each of the major Cloud providers. As people
mentioned in another email, Azure currently doesn't charge for inter-zone
data transfer.

JR2. Transactions on Diskless Topics is listed under future work.
Currently, we try to support all existing client APIs for every new
feature. For example, remote storage (KIP-405) supported transactions in
its very first release. Similarly, Queues for Kafka (KIP-932) will support
transactions and remote storage in its first release. The reasoning is that
without full support for all client APIs, it's going to be hard for a
Kafka admin to adopt the new feature, since it has the potential to break
existing or new users. So, it would be better if this KIP followed the
current convention and supported all existing client APIs, such as
transactions and Queues for Kafka. The current implementations of both
transactions and queues depend on a partition leader. Since this KIP no
longer has partition leaders, it would be useful to think through how those
APIs can be supported in the new architecture.

JR3. "Permit multi-region active-active topics with automatic failover".
Could you elaborate on the benefit of this? Cloud providers still charge
for cross-region data transfer in object stores, right?

JR4. "Balance traffic among brokers and eliminate broker hotspots with
per-client granularity". Does that mean all traffic from a client is served
by a single broker? This seems to reduce scalability from the client's
perspective.

JR5. Regarding the name "diskless": it might be OK, but people may associate
it with reduced durability. Under the covers, the Cloud storage will still
store the data on some disks. I am wondering if there is another name that
captures the essence without the potential negative impression.

Thanks,

Jun


On Wed, Apr 16, 2025 at 5:00 AM Josep Prat <josep.p...@aiven.io.invalid>
wrote:

> Hi Kafka Devs!
>
> We want to start a new KIP discussion about introducing a new type of
> topics that would make use of Object Storage as the primary source of
> storage. However, as this KIP is big we decided to split it into multiple
> related KIPs.
> We have the motivational KIP-1150 (
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1150%3A+Diskless+Topics
> )
> that aims to discuss if Apache Kafka should aim to have this type of
> feature at all. This KIP doesn't go onto details on how to implement it.
> This follows the same approach used when we discussed KRaft.
>
> But as we know that it is sometimes really hard to discuss on that meta
> level, we also created several sub-kips (linked in KIP-1150) that offer an
> implementation of this feature.
>
> We kindly ask you to use the proper DISCUSS threads for each type of
> concern and keep this one to discuss whether Apache Kafka wants to have
> this feature or not.
>
> Thanks in advance on behalf of all the authors of this KIP.
>
> ------------------
> Josep Prat
> Open Source Engineering Director, Aiven
> josep.p...@aiven.io   |   +491715557497 | aiven.io
> Aiven Deutschland GmbH
> Alexanderufer 3-7, 10117 Berlin
> Geschäftsführer: Oskari Saarenmaa, Hannu Valtonen,
> Anna Richardson, Kenneth Chen
> Amtsgericht Charlottenburg, HRB 209739 B
>
