abhita opened a new issue, #18405:
URL: https://github.com/apache/datafusion/issues/18405

   Currently, for a given cache such as `DefaultFilesMetadataCache`, the cache 
storage and retrieval mechanisms are tightly coupled with the eviction strategy. 
   
   ```
   struct DefaultFilesMetadataCacheState {
       lru_queue: LruQueue<Path, (ObjectMeta, Arc<dyn FileMetadata>)>,
       memory_limit: usize,
       memory_used: usize,
       cache_hits: HashMap<Path, usize>,
   }
   ```
   
   Any change to the eviction policy currently requires implementing a new 
data structure with its own implementation of the cache access methods.
   
   Instead, we can decouple the cache data structure from the eviction 
strategy with something like the following:
   ```
   pub struct FileMetadataCache {
       /// (DashMap-based, already thread-safe)
       inner_cache: DashMap<ObjectMeta, FileMetadata>,
       /// The eviction policy (thread-safe)
       eviction_strategy: Arc<Mutex<Box<dyn EvictionStrategy>>>,
   .
   .
   .
   .
   ```
   
   This would be accompanied by a pluggable eviction strategy that listens to 
events from the cache storage and selects items for eviction accordingly:
   ```
   // Core trait for cache eviction strategy
   pub trait EvictionStrategy: Send + Sync {
       /// Called when a cache entry is accessed
       fn on_access(&mut self, key: &str, size: usize);
   
       /// Called when a cache entry is inserted
       fn on_insert(&mut self, key: &str, size: usize);
   
       /// Called when a cache entry is removed
       fn on_remove(&mut self, key: &str);
   
       /// Select entries for eviction to reach target size
       /// Returns keys to evict, ordered by eviction priority
       fn select_for_eviction(&self, target_size: usize) -> Vec<String>;
   
       /// Reset policy state
       fn clear(&mut self);
   
       /// Get the name of this strategy
       fn strategy_name(&self) -> &'static str;
   }
   ```
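
To illustrate how a policy could plug into this trait, here is a minimal sketch of an LRU strategy. `LruStrategy`, its `tick` counter, and its `touch` helper are hypothetical names that are not part of DataFusion, and the sketch assumes `target_size` means the total tracked size (in bytes) that the cache should shrink to, which the trait above does not spell out.

```rust
use std::collections::HashMap;

// Trait as proposed in this issue (doc comments omitted for brevity).
pub trait EvictionStrategy: Send + Sync {
    fn on_access(&mut self, key: &str, size: usize);
    fn on_insert(&mut self, key: &str, size: usize);
    fn on_remove(&mut self, key: &str);
    fn select_for_eviction(&self, target_size: usize) -> Vec<String>;
    fn clear(&mut self);
    fn strategy_name(&self) -> &'static str;
}

/// Hypothetical LRU policy: remembers (last-use tick, size) for each key.
#[derive(Default)]
pub struct LruStrategy {
    tick: u64,
    entries: HashMap<String, (u64, usize)>,
}

impl LruStrategy {
    /// Record `key` as the most recently used entry.
    fn touch(&mut self, key: &str, size: usize) {
        self.tick += 1;
        self.entries.insert(key.to_owned(), (self.tick, size));
    }
}

impl EvictionStrategy for LruStrategy {
    fn on_access(&mut self, key: &str, size: usize) {
        self.touch(key, size);
    }
    fn on_insert(&mut self, key: &str, size: usize) {
        self.touch(key, size);
    }
    fn on_remove(&mut self, key: &str) {
        self.entries.remove(key);
    }
    // Assumption: `target_size` is the total size the cache should shrink to.
    fn select_for_eviction(&self, target_size: usize) -> Vec<String> {
        let mut total: usize = self.entries.values().map(|(_, s)| *s).sum();
        // Oldest tick first, so the least recently used keys are evicted first.
        let mut by_age: Vec<_> = self.entries.iter().collect();
        by_age.sort_by_key(|(_, (tick, _))| *tick);
        let mut victims = Vec::new();
        for (key, (_, size)) in by_age {
            if total <= target_size {
                break;
            }
            total -= *size;
            victims.push(key.clone());
        }
        victims
    }
    fn clear(&mut self) {
        self.entries.clear();
        self.tick = 0;
    }
    fn strategy_name(&self) -> &'static str {
        "lru"
    }
}
```

Sorting on every call keeps the sketch short; a real implementation would more likely keep an ordered queue, as the existing `LruQueue` does.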
   
   Benefits
   
   - Separation of Concerns: Cache storage logic is independent of eviction 
policy
   - Strategy Hot-Swapping: Change eviction strategies at runtime without 
modifying the cache implementation
   - Multiple Implementations: Support LRU, LFU, FIFO, ARC, LIRS, or custom 
strategies out-of-the-box
   - Per-Cache Policies and Code Re-usability: Different cache instances can 
use different strategies
   - Reduced Duplication: Eliminate duplicated cache access code across 
implementations
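
As a rough sketch of how the storage side could drive a strategy and swap it at runtime, the example below wires a store to the proposed trait. `PluggableCache`, `Inner`, `FifoStrategy`, and `set_strategy` are hypothetical names used only for illustration. To stay self-contained it uses `Mutex<HashMap<String, Vec<u8>>>` instead of the `DashMap` keyed on `ObjectMeta` from the proposal, and it again assumes `target_size` is the total size to shrink to.

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::Mutex;

// Trait as proposed in this issue (doc comments omitted for brevity).
pub trait EvictionStrategy: Send + Sync {
    fn on_access(&mut self, key: &str, size: usize);
    fn on_insert(&mut self, key: &str, size: usize);
    fn on_remove(&mut self, key: &str);
    fn select_for_eviction(&self, target_size: usize) -> Vec<String>;
    fn clear(&mut self);
    fn strategy_name(&self) -> &'static str;
}

/// Hypothetical FIFO policy: evicts in insertion order and ignores accesses.
#[derive(Default)]
pub struct FifoStrategy {
    queue: VecDeque<(String, usize)>,
}

impl EvictionStrategy for FifoStrategy {
    fn on_access(&mut self, _key: &str, _size: usize) {}
    fn on_insert(&mut self, key: &str, size: usize) {
        self.on_remove(key); // re-inserting a key moves it to the back
        self.queue.push_back((key.to_owned(), size));
    }
    fn on_remove(&mut self, key: &str) {
        self.queue.retain(|(k, _)| k.as_str() != key);
    }
    // Assumption: `target_size` is the total size the cache should shrink to.
    fn select_for_eviction(&self, target_size: usize) -> Vec<String> {
        let mut total: usize = self.queue.iter().map(|(_, s)| *s).sum();
        let mut victims = Vec::new();
        for (key, size) in &self.queue {
            if total <= target_size {
                break;
            }
            total -= *size;
            victims.push(key.clone());
        }
        victims
    }
    fn clear(&mut self) {
        self.queue.clear();
    }
    fn strategy_name(&self) -> &'static str {
        "fifo"
    }
}

/// Store and policy share one lock so they cannot drift apart.
struct Inner {
    store: HashMap<String, Vec<u8>>,
    strategy: Box<dyn EvictionStrategy>,
}

/// Hypothetical cache whose storage code knows nothing about the policy.
pub struct PluggableCache {
    memory_limit: usize,
    inner: Mutex<Inner>,
}

impl PluggableCache {
    pub fn new(memory_limit: usize, strategy: Box<dyn EvictionStrategy>) -> Self {
        let inner = Inner { store: HashMap::new(), strategy };
        Self { memory_limit, inner: Mutex::new(inner) }
    }
    pub fn get(&self, key: &str) -> Option<Vec<u8>> {
        let mut guard = self.inner.lock().unwrap();
        let inner = &mut *guard;
        let value = inner.store.get(key)?.clone();
        inner.strategy.on_access(key, value.len());
        Some(value)
    }
    pub fn put(&self, key: &str, value: Vec<u8>) {
        let mut guard = self.inner.lock().unwrap();
        let inner = &mut *guard;
        inner.strategy.on_insert(key, value.len());
        inner.store.insert(key.to_owned(), value);
        // Ask the policy which keys to drop to get back under the limit.
        for victim in inner.strategy.select_for_eviction(self.memory_limit) {
            inner.store.remove(&victim);
            inner.strategy.on_remove(&victim);
        }
    }
    /// Swap the policy at runtime, replaying current entries (in arbitrary order).
    pub fn set_strategy(&self, mut strategy: Box<dyn EvictionStrategy>) {
        let mut guard = self.inner.lock().unwrap();
        for (key, value) in &guard.store {
            strategy.on_insert(key, value.len());
        }
        guard.strategy = strategy;
    }
    pub fn len(&self) -> usize {
        self.inner.lock().unwrap().store.len()
    }
}
```

Because the cache only calls the trait hooks, two `PluggableCache` instances can be built with different strategies, and `set_strategy` replaces the policy without touching the storage code.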


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

