Nandini Singhal created KAFKA-19970:
---------------------------------------
Summary: Add time-based eviction to tiered storage index cache to
prevent stale entries from accumulating
Key: KAFKA-19970
URL: https://issues.apache.org/jira/browse/KAFKA-19970
Project: Kafka
Issue Type: Improvement
Components: Tiered-Storage
Reporter: Nandini Singhal
The remote log index cache (RemoteIndexCache) currently uses only weight-based
eviction. As a result, old, smaller index files can remain cached indefinitely
while newer indices thrash the cache, leading to inefficient cache utilization
and increased remote fetch failures.
Current cache configuration (RemoteIndexCache.java:142-144):
{code:java}
return Caffeine.newBuilder()
    .maximumWeight(maxSize)
    .weigher((Uuid key, Entry entry) -> (int) entry.entrySizeBytes)
    .evictionListener(...)
    .build();
{code}
In environments with:
- Heavy backfill workloads (reading old data once, then moving to newer data)
- Sequential read patterns through tiered storage
- Variable index file sizes
The cache can end up in a state where:
- Old index files from completed backfills remain cached (small, low
frequency)
- Newer index files thrash continuously (larger, similar frequency)
- Cache hit rate degrades over time
- Increased remote storage fetch errors due to cache misses
Proposed change: add time-based eviction to the cache configuration using
Caffeine's expireAfterAccess, so that an entry is evicted once it has not been
accessed for the configured duration, even if it remains in a favorable
frequency bucket.
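
A minimal sketch of what the proposed configuration might look like, assuming
Caffeine 3.x is on the classpath. The class name, the IndexEntry type, the
java.util.UUID key, the buildCache parameters, and the 15-minute timeout are
all illustrative stand-ins, not existing Kafka code or configs; the real cache
is keyed by Kafka's Uuid and stores RemoteIndexCache.Entry.

```java
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.RemovalCause;

import java.time.Duration;
import java.util.UUID;

public class ExpiringIndexCacheSketch {

    // Hypothetical stand-in for RemoteIndexCache.Entry; only the size matters here.
    static final class IndexEntry {
        final long entrySizeBytes;

        IndexEntry(long entrySizeBytes) {
            this.entrySizeBytes = entrySizeBytes;
        }
    }

    // maxSizeBytes and idleTimeout are illustrative parameters, not existing Kafka configs.
    static Cache<UUID, IndexEntry> buildCache(long maxSizeBytes, Duration idleTimeout) {
        return Caffeine.newBuilder()
                // Existing behavior: bound the cache by total entry weight.
                .maximumWeight(maxSizeBytes)
                .weigher((UUID key, IndexEntry entry) -> (int) entry.entrySizeBytes)
                // Proposed addition: expire entries that have not been read or
                // written for idleTimeout, regardless of weight or frequency.
                .expireAfterAccess(idleTimeout)
                // Notified for size-based (SIZE) and time-based (EXPIRED) evictions.
                .evictionListener((UUID key, IndexEntry entry, RemovalCause cause) ->
                        System.out.printf("evicted %s (%s)%n", key, cause))
                .build();
    }

    public static void main(String[] args) {
        // 1 GiB weight limit and a 15-minute idle timeout, both chosen arbitrarily.
        Cache<UUID, IndexEntry> cache =
                buildCache(1024L * 1024 * 1024, Duration.ofMinutes(15));
        cache.put(UUID.randomUUID(), new IndexEntry(10L * 1024 * 1024));
        System.out.println("entries: " + cache.estimatedSize());
    }
}
```

Caffeine removes expired entries lazily, during maintenance triggered by
reads and writes. If idle entries should be evicted promptly even when the
cache sees no traffic, a scheduler can be configured on the builder, for
example Scheduler.systemScheduler().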