[
https://issues.apache.org/jira/browse/KAFKA-19970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nandini Singhal reassigned KAFKA-19970:
---------------------------------------
Assignee: Nandini Singhal
> Add time-based eviction to tiered storage index cache to prevent stale
> entries from accumulating
> ------------------------------------------------------------------------------------------------
>
> Key: KAFKA-19970
> URL: https://issues.apache.org/jira/browse/KAFKA-19970
> Project: Kafka
> Issue Type: Improvement
> Components: Tiered-Storage
> Reporter: Nandini Singhal
> Assignee: Nandini Singhal
> Priority: Major
>
> The remote log index cache (RemoteIndexCache) currently uses only
> weight-based eviction. As a result, old, smaller index files can remain
> cached indefinitely while newer indices thrash the cache, which leads to
> inefficient cache utilization and increased remote fetch failures.
> The current cache configuration (RemoteIndexCache.java:142-144):
> {code:java}
> return Caffeine.newBuilder()
>     .maximumWeight(maxSize)
>     .weigher((Uuid key, Entry entry) -> (int) entry.entrySizeBytes)
>     .evictionListener(...)
>     .build();
> {code}
> The problem is most visible in environments with:
> - Heavy backfill workloads (reading old data once, then moving to newer
> data)
> - Sequential read patterns through tiered storage
> - Variable index file sizes
> The cache can end up in a state where:
> - Old index files from completed backfills remain cached (small, low
> frequency)
> - Newer index files thrash continuously (larger, similar frequency)
> - Cache hit rate degrades over time
> - Remote storage fetch errors increase because of cache misses
>
> Proposed change: add time-based eviction to the cache configuration
> using Caffeine's expireAfterAccess, so that an entry is evicted once it
> has not been accessed for the configured duration, even if it remains in
> a favorable frequency bucket.
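
To make the intended behavior concrete, here is a minimal sketch of the expire-after-access rule. It is a plain-JDK model, not Caffeine and not the RemoteIndexCache code: the class name IdleExpiryModel, its methods, the injectable clock, and the 30-minute idle limit are all hypothetical and chosen only for illustration. The comment at the top shows roughly where expireAfterAccess would sit in the builder chain quoted above, again with an assumed duration.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.function.LongSupplier;

// With Caffeine, the proposal would amount to one extra builder call, roughly:
//
//   Caffeine.newBuilder()
//       .maximumWeight(maxSize)
//       .weigher(...)
//       .expireAfterAccess(Duration.ofMinutes(30))   // duration is an assumed value
//       .evictionListener(...)
//       .build();
//
// The class below only models the idle-expiry rule itself (hypothetical names).
final class IdleExpiryModel<K> {
    private final long maxIdleNanos;
    private final LongSupplier nanoClock; // injectable so time can be controlled
    private final Map<K, Long> lastAccessNanos = new HashMap<>();

    IdleExpiryModel(Duration maxIdle, LongSupplier nanoClock) {
        this.maxIdleNanos = maxIdle.toNanos();
        this.nanoClock = nanoClock;
    }

    // Records an access; every access restarts the idle timer for the key.
    void touch(K key) {
        lastAccessNanos.put(key, nanoClock.getAsLong());
    }

    // Removes and returns every key that has been idle for at least maxIdle.
    List<K> evictIdle() {
        long now = nanoClock.getAsLong();
        List<K> evicted = new ArrayList<>();
        Iterator<Map.Entry<K, Long>> it = lastAccessNanos.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<K, Long> entry = it.next();
            if (now - entry.getValue() >= maxIdleNanos) {
                evicted.add(entry.getKey());
                it.remove();
            }
        }
        return evicted;
    }

    boolean contains(K key) {
        return lastAccessNanos.containsKey(key);
    }
}
```

In this model, an index entry read once during a backfill is dropped after the idle limit passes regardless of its size, while an entry that keeps being read stays cached. The weight bound would still apply independently of this rule.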
--
This message was sent by Atlassian Jira
(v8.20.10#820010)