Hi Henry,

Thanks for the KIP!

LC1: Does the "Bloom Filter Index" upload to the remote storage?
It looks like we don't? Could I know why not?

LC2: Do we have any "bandwidth control" mechanism for the remote log
compaction?
Ideally, it should not impact other high priority tasks like
produce/consume.

LC3: Will tiered topics have its own compact thread?
Is it possible the tiered topics compaction takes long and impacts normal
topics compaction?

LC4: Will we have a new method in RemoteStorageManager to upload/fetch
the cleaner-offset-checkpoint?

LC5: I think we should add some metrics to monitor the tiered storage
compaction progress.


Thank you,
Luke



On Thu, Jan 15, 2026 at 4:44 AM Henry Haiying Cai via dev <
[email protected]> wrote:

> Hi Ivan,
>
> Thanks for the interest and comment in the KIP. As you pointed out, there
> is potential savings from reusing the cache in RemoteStorageManager's
> plugin and avoid creating more copies of data in files.
>
> For IY1: For the remote log segment file downloaded for cleaning, there
> are actually 2 types of log segment: the segment file which were already
> cleaned before (segment A in the diagram) and the segment file which is
> dirty and plan to be cleaned in this cycle (segment B1 in the diagram).
> Data in segment B1 needs to read twice (the first time to build the index
> lookup table and second time to be read and check against the index lookup
> table). Data in Segment A we only need to read it for one time since it is
> not needed to build the index table. So we do not need to store the data
> from segment A in a temp file and reading in streaming fashion from
> RemoteStorageManager is fine. For segment B1, there is pros and cons on
> whether we store it first as a temp file as you pointed out. I think
> overtime the messages in cleaned section (segment A like) will become more
> and bigger so the time spend on reading segment B1 will become
> comparatively smaller and payback on the optimization might not be
> significant overall.
> And also as you pointed out, the output segment file would have to be
> stored as a file first for the upload to work.
>
> For IY2: If we want to optimize the reading from segment B1 by reading
> from RemoteStorageManager twice and not generating temporary file, we would
> need some some kind of guarantee that RemoteStorageManager would be able to
> reuse its internal file/chunk cache for the second read. Otherwise the
> saving on local file resource (and less page cache) is not worth the cost
> of remote reading twice. So I like your idea of adding more methods in
> RemoteStorageManager interface, but I would want to add 2 methods to the
> interface:
>
>   public interface RemoteStorageManager {
>     boolean supportsInputLogSegmentCaching();
>     InputStream fetchLogSegment(RemoteLogSegmentMetadata,
> remoteLogSegmentMetadata, int startPosition, boolean keepForReuse);
>   }
>
> Through the first
> method supportsInputLogSegmentCaching RemoteStorageManager plugin tells
> Kafka that it supports caching/reuse of LogSegment for download; This way
> Kafka code can decide to read from RemoteStorageManager twice for dirty log
> segment B1. Kafka will call RemoteStorageManager's
> new fetchLogSegment method with keepForReuseset to true when it needs to
> build the index lookup table for segment B1 and the RemoteStorageManager
> code will keep the data longer in the cache. The cache can be a time/size
> bound LRU cache to delay the entry eviction.
>

Reply via email to