Hi Jian, sorry for the late review. I have some questions below.

Is this broker-level configuration dynamic? We should clarify this in the KIP.

Also, should we add a metric to track the 'delayed size'?

Best,
Chia-Ping

On 2025/11/19 13:29:11 jian fu wrote:
> Hi everyone,
>
> I'd like to start a discussion on KIP-1241. The goal is to reduce
> remote storage usage.
>
> KIP:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1241%3A+Reduce+tiered+storage+redundancy+with+delayed+upload
>
> The Draft PR: https://github.com/apache/kafka/pull/20913
>
> Problem:
> Currently, Kafka's tiered storage implementation uploads all non-active
> local log segments to remote storage immediately, even when they are
> still within the local retention period. This results in redundant
> storage of the same data in both local and remote tiers.
>
> When there is no requirement for real-time analytics or immediate
> consumption based on remote storage, this has the following drawbacks:
>
> 1. Wastes storage capacity and costs: The same data is stored twice
>    during the local retention window
> 2. Provides no immediate benefit: During the local retention period,
>    reads prioritize local data, making the remote copy unnecessary
>
> So, this KIP aims to reduce tiered storage redundancy with delayed
> upload.
>
> You can check the test result example here directly:
> https://github.com/apache/kafka/pull/20913#issuecomment-3547156286
>
> Looking forward to your feedback!
>
> Best regards,
> Jian
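
For illustration only, here is a rough Python sketch of the redundancy described in the quoted problem statement, plus one possible reading of the 'delayed size' metric asked about above. All names (`Segment`, `upload_delay_ms`, `redundant_bytes`, `delayed_bytes`) are made up for this sketch and are not taken from the KIP or the draft PR. It assumes a simple age-based delay, which may not match what the KIP actually proposes.

```python
from dataclasses import dataclass

HOUR_MS = 60 * 60 * 1000


@dataclass
class Segment:
    size_bytes: int
    age_ms: int           # time since the segment was rolled (assumed model)
    active: bool = False  # the active segment is never uploaded


def eligible_for_upload(seg: Segment, upload_delay_ms: int) -> bool:
    # Hypothetical rule: upload only non-active segments older than the delay.
    # A delay of 0 models today's "upload immediately" behaviour.
    return not seg.active and seg.age_ms >= upload_delay_ms


def redundant_bytes(segments, local_retention_ms: int, upload_delay_ms: int) -> int:
    # Bytes stored in both tiers: already uploaded, but still retained locally.
    return sum(
        s.size_bytes
        for s in segments
        if eligible_for_upload(s, upload_delay_ms) and s.age_ms < local_retention_ms
    )


def delayed_bytes(segments, upload_delay_ms: int) -> int:
    # One possible 'delayed size': non-active bytes held back by the delay.
    return sum(
        s.size_bytes
        for s in segments
        if not s.active and s.age_ms < upload_delay_ms
    )


# Four 100-byte segments: one active, the others 1h, 2h and 3h old.
segments = [
    Segment(100, 0, active=True),
    Segment(100, 1 * HOUR_MS),
    Segment(100, 2 * HOUR_MS),
    Segment(100, 3 * HOUR_MS),
]
```

With a 4h local retention in this toy model, an immediate upload (delay 0) keeps 300 bytes in both tiers, while a 3h delay keeps only 100 bytes duplicated and reports 200 bytes as delayed.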
