1. Do you also have to update the RemoteCopy lag segments and bytes metric?
2. As Haiying mentioned, the segments eventually get uploaded to remote, so
I'm not sure about the benefit of this proposal. Also, remote storage cost
is considered low compared to broker local disk, and eager upload provides
some cushion during third-party object-storage downtime.
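
As a rough sanity check on the numbers discussed downthread, the steady-state
remote-storage saving from delayed upload reduces to local retention divided by
total retention (this assumes a uniform ingest rate; `remote_storage_savings`
is just an illustrative helper I made up, not anything from the KIP or PR):

```python
def remote_storage_savings(local_retention_hours: float,
                           total_retention_hours: float) -> float:
    """Approximate fraction of remote storage saved by delaying upload
    until the local retention window expires.

    Assumes a uniform ingest rate in steady state; purely illustrative."""
    if total_retention_hours <= 0:
        raise ValueError("total retention must be positive")
    # Segments spend local_retention_hours fewer hours in remote storage.
    return min(local_retention_hours / total_retention_hours, 1.0)

# Haiying's setup: 3 hours local, 3 days total retention
print(f"{remote_storage_savings(3, 72):.1%}")    # → 4.2%

# Jian's KIP example: 1 day local, 2 days total retention
print(f"{remote_storage_savings(24, 48):.0%}")   # → 50%
```

This matches both observations in the thread: the saving is significant only
when local retention is a large fraction of total retention.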


On Tue, Dec 2, 2025 at 2:45 PM Kamal Chandraprakash <
[email protected]> wrote:

> Hi Jian,
>
> Thanks for the KIP!
>
> When remote storage is unavailable for a few hours, lazy upload risks
> filling the broker disk quickly, so the Admin has to configure the local
> retention configs carefully.  With eager upload, disk utilization won't
> grow until the local retention time is reached (the expectation being that
> all the passive segments are already uploaded), which gives the Admin some
> time to take action based on the situation.
>
> --
> Kamal
>
> On Tue, Dec 2, 2025 at 10:28 AM Haiying Cai via dev <[email protected]>
> wrote:
>
>> Jian,
>>
>> I understand this is an optional feature and that the cost saving depends
>> on the ratio between local.retention.ms and total retention.ms.
>>
>> In our setup, we have local.retention set to 3 hours and total retention
>> set to 3 days, so the saving is not going to be significant.
>>
>> On 2025/12/01 05:33:11 jian fu wrote:
>> > Hi Haiying Cai,
>> >
>> > Thanks for joining the discussion for this KIP. All of your concerns are
>> > valid, and that is exactly why I introduced a topic-level configuration
>> to
>> > make this feature optional. This means that, by default, the behavior
>> > remains unchanged. Only when users are not pursuing faster broker boot
>> time
>> > or other optimizations — and care more about cost — would they enable
>> this
>> > option to some topics to save resources.
>> >
>> > Regarding the cost itself: the actual savings depend on the ratio
>> > between local retention and remote retention. In the KIP/PR, I provided
>> > a test example: if we configure 1 day of local retention and 2 days of
>> > remote retention, we can save about 50%. Realistically, I don't think
>> > anyone would set local retention to a very small value (such as minutes)
>> > due to the latency concerns associated with remote storage. So in short,
>> > the feature will help reduce cost, and the amount saved simply depends
>> > on the ratio. Taking my company's usage as a real example: we configure
>> > most topics with 1 day of local retention and 3–7 days of remote
>> > retention (3 days for logging/metrics topics, 7 days for normal business
>> > topics). We don't care about boot speed or anything else, so this KIP
>> > allows us to save 1/7 to 1/3 of the total disk usage for remote storage.
>> >
>> > Anyway, this is just an optional topic-level feature which doesn't take
>> > away the benefits of the current design. Thanks again for the
>> > discussion. I can update the KIP to better classify the scenarios where
>> > this optional feature is not suitable; currently, I only listed
>> > real-time analytics as a negative example.
>> >
>> > Further discussion to help make this KIP more complete is welcome.
>> > Thanks!
>> >
>> > Regards,
>> > Jian
>> >
>> > Haiying Cai via dev <[email protected]> wrote on Mon, Dec 1, 2025, at 12:40:
>> >
>> > > Jian,
>> > >
>> > > Thanks for the contribution.  But I feel that uploading the local
>> > > segment file to remote storage ASAP is advantageous in several
>> > > scenarios:
>> > >
>> > > 1. It enables fast bootstrapping of a new broker.  A new broker
>> > > doesn't have to replicate all the data from the leader broker; it only
>> > > needs to replicate the data from the tail of the remote log segments
>> > > to the current end of the topic (LSO), since all the other data is in
>> > > remote tiered storage and can be downloaded lazily later. This is what
>> > > KIP-1023 is trying to solve.
>> > > 2. Although nobody has proposed a KIP to allow a consumer client to
>> > > read from remote tiered storage directly, such a path would help a
>> > > fall-behind consumer do catch-up reads or perform a backfill without
>> > > polluting the broker's page cache.  The earlier the data is in remote
>> > > tiered storage, the more advantageous it is for the client.
>> > >
>> > > I think in your proposal you are delaying the upload, but the file
>> > > will still be uploaded at a later time. I guess this can save a few
>> > > hours of storage cost for that file in remote storage; I'm not sure
>> > > whether that is a significant saving (if the file needs to stay in
>> > > remote tiered storage for several days or weeks due to the retention
>> > > policy).
>> > >
>> > > On 2025/11/19 13:29:11 jian fu wrote:
>> > > > Hi everyone, I'd like to start a discussion on KIP-1241. The goal
>> > > > is to reduce remote storage usage. KIP:
>> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1241%3A+Reduce+tiered+storage+redundancy+with+delayed+upload
>> > > >
>> > > > The draft PR: https://github.com/apache/kafka/pull/20913
>> > > >
>> > > > Problem: currently, Kafka's tiered storage implementation uploads
>> > > > all non-active local log segments to remote storage immediately,
>> > > > even when they are still within the local retention period. This
>> > > > results in redundant storage of the same data in both the local and
>> > > > remote tiers.
>> > > >
>> > > > When there is no requirement for real-time analytics or immediate
>> > > > consumption from remote storage, this has the following drawbacks:
>> > > >
>> > > > 1. It wastes storage capacity and cost: the same data is stored
>> > > > twice during the local retention window.
>> > > > 2. It provides no immediate benefit: during the local retention
>> > > > period, reads prioritize local data, making the remote copy
>> > > > unnecessary.
>> > > >
>> > > > So this KIP proposes to reduce tiered storage redundancy with
>> > > > delayed upload. You can check a test result example here:
>> > > > https://github.com/apache/kafka/pull/20913#issuecomment-3547156286
>> > > > Looking forward to your feedback! Best regards, Jian
>> > > >
>> >
>
>
