blcksrx opened a new pull request, #14440:
URL: https://github.com/apache/iceberg/pull/14440

   ### Allow dual catalog cache expiration policies (expire-after-access and expire-after-write)
   ---
   This change enhances Iceberg’s catalog-level cache by allowing 
`expire-after-access` and `expire-after-write` policies to be used 
concurrently. This provides more flexible and powerful cache-tuning strategies 
by enabling both time-since-last-access and time-since-creation eviction 
policies on the same cache.
   
   This is achieved by introducing a new property, 
`cache.expiration.expire-after-write-interval-ms`, which works in conjunction 
with the existing `cache.expiration-interval-ms` (which controls 
`expire-after-access`).
   
   This addresses the pain point described in Issue #14417, where long-running 
streaming jobs could prevent cache entries from ever expiring under a pure 
access-based policy. By combining both policies, users can ensure periodic data 
refreshes while still efficiently caching frequently accessed tables.
   
    ---
   #### Background
   
   In many scenarios, especially with long-running services, you want to 
balance performance with data freshness. For example:
   
   - Performance: Caching is essential to avoid the high cost of reloading 
table metadata for every query.
   - Freshness: Cached entries must be periodically refreshed to pick up new 
snapshots or to prevent issues with expired credentials.
   
   Using `expire-after-access` alone is not sufficient for continuous 
workloads, as frequently accessed entries may never be evicted. Using 
`expire-after-write` alone can be inefficient if it evicts "hot" entries that 
are still in active use.
   
   By using both, you can configure a "best-of-both-worlds" strategy:
   - A shorter `expire-after-access` duration to quickly evict inactive tables.
   - A longer `expire-after-write` duration to act as a safety net, ensuring that 
even "hot" tables are refreshed periodically (e.g., before underlying 
credentials expire).
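
The combined rule can be modeled in a few lines. The following is a minimal sketch, not the PR's implementation (which relies on Caffeine): the class `DualExpirySketch`, its fields, and the injectable clock are all hypothetical names introduced for illustration. It only shows that an entry is dropped as soon as either its idle time or its age reaches the configured limit. It does not model the "both non-positive disables caching" rule.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Illustrative model of dual-policy expiration; hypothetical, not Iceberg or Caffeine code. */
final class DualExpirySketch<K, V> {
  private static final class Entry<V> {
    final V value;
    final long writtenAtMs; // set once, when the entry is created
    long lastAccessMs;      // refreshed on every successful read

    Entry(V value, long nowMs) {
      this.value = value;
      this.writtenAtMs = nowMs;
      this.lastAccessMs = nowMs;
    }
  }

  private final Map<K, Entry<V>> entries = new HashMap<>();
  private final long expireAfterAccessMs; // non-positive disables the access policy
  private final long expireAfterWriteMs;  // non-positive disables the write policy
  private final LongSupplier clockMs;     // injectable so time can be controlled in tests

  DualExpirySketch(long expireAfterAccessMs, long expireAfterWriteMs, LongSupplier clockMs) {
    this.expireAfterAccessMs = expireAfterAccessMs;
    this.expireAfterWriteMs = expireAfterWriteMs;
    this.clockMs = clockMs;
  }

  void put(K key, V value) {
    entries.put(key, new Entry<>(value, clockMs.getAsLong()));
  }

  /** Returns the cached value, or null if it is absent or either policy has expired it. */
  V get(K key) {
    Entry<V> entry = entries.get(key);
    if (entry == null) {
      return null;
    }
    long now = clockMs.getAsLong();
    if (isExpired(entry, now)) {
      entries.remove(key);
      return null;
    }
    entry.lastAccessMs = now; // a read resets the idle timer but never the age
    return entry.value;
  }

  private boolean isExpired(Entry<V> entry, long nowMs) {
    boolean idleTooLong =
        expireAfterAccessMs > 0 && nowMs - entry.lastAccessMs >= expireAfterAccessMs;
    boolean tooOld =
        expireAfterWriteMs > 0 && nowMs - entry.writtenAtMs >= expireAfterWriteMs;
    return idleTooLong || tooOld;
  }
}
```

With an access limit of 1000 ms and a write limit of 5000 ms, an entry read every 900 ms stays cached until it is 5000 ms old and is then dropped, which is the "safety net" behavior described above.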
    
    ---
   #### What changed in this PR
   ##### ✅ New catalog property for dual-policy expiration
   Two distinct properties now control cache expiration. An entry is evicted if 
either condition is met. If both are set to a non-positive value, caching is 
disabled.
   
   
   | Property | Default | Description |
   | --- | --- | --- |
   | `cache.expire-after-write-ms` | `0` | Duration in milliseconds after which a table is expired from the cache, measured from when its entry was created. Tables will not refresh on write. `0` disables this policy, which is the default. |
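
As a rough illustration of how the two settings might be resolved from a catalog property map, here is a hedged sketch. The class `CacheExpirationConfigSketch` and its methods are hypothetical and not part of Iceberg. The access key is the existing `cache.expiration-interval-ms`; the write key is assumed to be the spelling shown in the table above, although this description also gives a longer spelling. The default for the access interval is not stated here, so the sketch takes it as a parameter.

```java
import java.util.Map;

/** Hypothetical helper showing one way to interpret the two properties; not Iceberg code. */
final class CacheExpirationConfigSketch {
  // Existing property that controls expire-after-access, per this description.
  static final String ACCESS_KEY = "cache.expiration-interval-ms";
  // Assumption: the spelling from the table; the description also uses a longer spelling.
  static final String WRITE_KEY = "cache.expire-after-write-ms";

  final long expireAfterAccessMs;
  final long expireAfterWriteMs;

  private CacheExpirationConfigSketch(long expireAfterAccessMs, long expireAfterWriteMs) {
    this.expireAfterAccessMs = expireAfterAccessMs;
    this.expireAfterWriteMs = expireAfterWriteMs;
  }

  /** Reads both intervals; the write interval falls back to 0 (disabled), as in the table. */
  static CacheExpirationConfigSketch fromProperties(
      Map<String, String> props, long defaultAccessMs) {
    long accessMs = parse(props, ACCESS_KEY, defaultAccessMs);
    long writeMs = parse(props, WRITE_KEY, 0L);
    return new CacheExpirationConfigSketch(accessMs, writeMs);
  }

  private static long parse(Map<String, String> props, String key, long defaultValue) {
    String raw = props.get(key);
    return raw == null ? defaultValue : Long.parseLong(raw.trim());
  }

  boolean accessPolicyEnabled() {
    return expireAfterAccessMs > 0;
  }

  boolean writePolicyEnabled() {
    return expireAfterWriteMs > 0;
  }

  /** Caching is off only when both intervals are non-positive, following the text above. */
  boolean cachingEnabled() {
    return accessPolicyEnabled() || writePolicyEnabled();
  }
}
```

Keeping the two checks separate makes it straightforward to apply only the enabled policies when the cache is built.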
   
   ##### ✅ CachingCatalog supports dual policies
   `CachingCatalog.wrap(...)` has been updated to configure the underlying 
Caffeine cache with both expiration policies when both properties are provided.
   ##### ✅ Tests cover dual-policy scenarios
   Tests have been added to validate that when both policies are active, an 
entry is evicted based on whichever condition is met first, and that each 
policy works correctly on its own.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

