abhita commented on issue #18405:
URL: https://github.com/apache/datafusion/issues/18405#issuecomment-3484023433

   @nuno-faria 
   @alamb 
   Thanks for the feedback! I understand the concern about balancing simplicity 
with advanced use cases.
   
   Regarding your suggestion to provide a custom cache manager - I've actually 
looked into this, and the issue is that with the current design, I would need 
to reimplement the entire cache accessor logic and all cache operations for 
each eviction strategy. This leads to significant code duplication.
   
   As I mentioned earlier,
   >For any change in eviction policy, we would end up implementing a new 
data structure with its own implementation of the cache access methods.
   
   For example, if I want to support both LRU and LFU eviction policies, I'd 
need to duplicate everything except the eviction selection logic, which is 
the only part that actually differs:
   1. All the cache access methods (get, put, remove, etc.)
   2. The thread-safety mechanisms
   
   The proposal in this issue aims to solve exactly this problem by:
   - Keeping the cache storage logic (DashMap, memory tracking, etc.) in one 
place
   - Making only the eviction strategy pluggable via the `EvictionStrategy` 
trait
   - Allowing users to swap strategies without reimplementing the entire cache
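
   To make the split concrete, here is a minimal sketch of the idea, not the 
actual DataFusion code. Only the name `EvictionStrategy` comes from this 
proposal; the trait methods, the `Cache`, `Lru` and `Lfu` types, and the 
entry-count capacity are assumptions for illustration. It uses a plain 
`HashMap` and leaves out DashMap, memory tracking and locking.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Hypothetical trait: the strategy only tracks usage and picks a victim.
pub trait EvictionStrategy<K> {
    fn on_insert(&mut self, key: &K);
    fn on_access(&mut self, key: &K);
    fn on_remove(&mut self, key: &K);
    fn select_victim(&self) -> Option<K>;
}

/// LRU via a logical clock: the victim is the key with the oldest tick.
pub struct Lru<K> { tick: u64, last_used: HashMap<K, u64> }

impl<K> Lru<K> {
    pub fn new() -> Self { Self { tick: 0, last_used: HashMap::new() } }
}

impl<K: Clone + Eq + Hash> EvictionStrategy<K> for Lru<K> {
    fn on_insert(&mut self, key: &K) {
        self.tick += 1;
        self.last_used.insert(key.clone(), self.tick);
    }
    fn on_access(&mut self, key: &K) { self.on_insert(key); }
    fn on_remove(&mut self, key: &K) { self.last_used.remove(key); }
    fn select_victim(&self) -> Option<K> {
        self.last_used.iter().min_by_key(|(_, t)| **t).map(|(k, _)| k.clone())
    }
}

/// LFU: the victim is the key with the lowest use count (ties are arbitrary).
pub struct Lfu<K> { counts: HashMap<K, u64> }

impl<K> Lfu<K> {
    pub fn new() -> Self { Self { counts: HashMap::new() } }
}

impl<K: Clone + Eq + Hash> EvictionStrategy<K> for Lfu<K> {
    fn on_insert(&mut self, key: &K) {
        *self.counts.entry(key.clone()).or_insert(0) += 1;
    }
    fn on_access(&mut self, key: &K) { self.on_insert(key); }
    fn on_remove(&mut self, key: &K) { self.counts.remove(key); }
    fn select_victim(&self) -> Option<K> {
        self.counts.iter().min_by_key(|(_, c)| **c).map(|(k, _)| k.clone())
    }
}

/// Storage and access methods are written once, generic over the strategy.
pub struct Cache<K, V, S> { entries: HashMap<K, V>, capacity: usize, strategy: S }

impl<K: Clone + Eq + Hash, V, S: EvictionStrategy<K>> Cache<K, V, S> {
    pub fn new(capacity: usize, strategy: S) -> Self {
        Self { entries: HashMap::new(), capacity, strategy }
    }

    pub fn get(&mut self, key: &K) -> Option<&V> {
        if self.entries.contains_key(key) {
            self.strategy.on_access(key);
        }
        self.entries.get(key)
    }

    pub fn put(&mut self, key: K, value: V) {
        // Evict only when a new key would exceed the capacity.
        if !self.entries.contains_key(&key) {
            while self.entries.len() >= self.capacity {
                match self.strategy.select_victim() {
                    Some(victim) => { self.remove(&victim); }
                    None => break,
                }
            }
        }
        self.strategy.on_insert(&key);
        self.entries.insert(key, value);
    }

    pub fn remove(&mut self, key: &K) -> Option<V> {
        self.strategy.on_remove(key);
        self.entries.remove(key)
    }
}
```

   With this shape, `Cache::new(2, Lru::<&str>::new())` and 
`Cache::new(2, Lfu::<&str>::new())` share the same `get`/`put`/`remove` code 
and differ only in which entry gets evicted.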
   
   This way:
   - Simple use cases can still use the default LRU strategy with minimal 
configuration (`DefaultFilesMetadataCache`)
   - Advanced users can plug in custom eviction strategies (LFU, ARC, LIRS, 
etc.) without duplicating cache infrastructure code (`CustomMetadataCache`)
   - The trait interface remains stable for those who need completely custom 
implementations
   
   I think this strikes a good balance - we maintain simplicity for the default 
case while enabling flexibility without code duplication. What do you think 
about this approach?
   
   I'm happy to work on a PR to demonstrate this if it sounds reasonable.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

