abhita commented on issue #18405: URL: https://github.com/apache/datafusion/issues/18405#issuecomment-3484023433
@nuno-faria @alamb Thanks for the feedback! I understand the concern about balancing simplicity with advanced use cases.

Regarding your suggestion to provide a custom cache manager: I've looked into this, and the issue is that with the current design I would need to reimplement the entire cache accessor logic and all cache operations for each eviction strategy. This leads to significant code duplication. As I mentioned earlier:

> For any change in Eviction Policy strategy, we would have to end up implementing a new DataStructure having its own implementation of Cache Accessing methods.

For example, if I want to support both LRU and LFU eviction policies, I'd need to:

1. Duplicate all the cache access methods (get, put, remove, etc.)
2. Duplicate the thread-safety mechanisms
3. Write the eviction selection logic, which is the only part that would actually differ

The proposal in this issue aims to solve exactly this problem by:

- Keeping the cache storage logic (DashMap, memory tracking, etc.) in one place
- Making only the eviction strategy pluggable via the `EvictionStrategy` trait
- Allowing users to swap strategies without reimplementing the entire cache

This way:

- Simple use cases can still use the default LRU strategy with minimal configuration (`DefaultFilesMetadataCache`)
- Advanced users can plug in custom eviction strategies (LFU, ARC, LIRS, etc.) without duplicating cache infrastructure code (`CustomMetadataCache`)
- The trait interface remains stable for those who need completely custom implementations

I think this strikes a good balance: we maintain simplicity for the default case while enabling flexibility without code duplication.

What do you think about this approach? I'm happy to work on a PR to demonstrate this if it sounds reasonable.
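To make the proposed split concrete, here is a minimal sketch of shared cache storage with a pluggable eviction strategy. Only the trait name `EvictionStrategy` comes from the proposal; the trait methods, `SketchCache`, `LruStrategy`, and `LfuStrategy` are hypothetical names, not DataFusion APIs. It also assumes a plain `HashMap` behind a `Mutex` in place of DashMap so that it builds with the standard library alone.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical trait: it only tracks usage and picks which key to evict.
pub trait EvictionStrategy: Send {
    fn on_insert(&mut self, key: &str);
    fn on_access(&mut self, key: &str);
    fn on_remove(&mut self, key: &str);
    fn select_victim(&self) -> Option<String>;
}

/// LRU: evict the key with the oldest logical timestamp.
#[derive(Default)]
pub struct LruStrategy { tick: u64, last_used: HashMap<String, u64> }

impl LruStrategy {
    fn touch(&mut self, key: &str) {
        self.tick += 1;
        self.last_used.insert(key.to_owned(), self.tick);
    }
}

impl EvictionStrategy for LruStrategy {
    fn on_insert(&mut self, key: &str) { self.touch(key); }
    fn on_access(&mut self, key: &str) { self.touch(key); }
    fn on_remove(&mut self, key: &str) { self.last_used.remove(key); }
    fn select_victim(&self) -> Option<String> {
        self.last_used.iter().min_by_key(|(_, t)| **t).map(|(k, _)| k.clone())
    }
}

/// LFU: evict the key with the fewest recorded hits (ties broken arbitrarily).
#[derive(Default)]
pub struct LfuStrategy { hits: HashMap<String, u64> }

impl EvictionStrategy for LfuStrategy {
    fn on_insert(&mut self, key: &str) { self.hits.insert(key.to_owned(), 0); }
    fn on_access(&mut self, key: &str) {
        if let Some(h) = self.hits.get_mut(key) { *h += 1; }
    }
    fn on_remove(&mut self, key: &str) { self.hits.remove(key); }
    fn select_victim(&self) -> Option<String> {
        self.hits.iter().min_by_key(|(_, h)| **h).map(|(k, _)| k.clone())
    }
}

struct Inner {
    entries: HashMap<String, Vec<u8>>,
    used_bytes: usize,
    strategy: Box<dyn EvictionStrategy>,
}

/// Storage, memory tracking, and locking live here once, for every strategy.
pub struct SketchCache { limit_bytes: usize, inner: Mutex<Inner> }

impl SketchCache {
    pub fn new(limit_bytes: usize, strategy: Box<dyn EvictionStrategy>) -> Self {
        let inner = Inner { entries: HashMap::new(), used_bytes: 0, strategy };
        Self { limit_bytes, inner: Mutex::new(inner) }
    }

    pub fn contains(&self, key: &str) -> bool {
        self.inner.lock().unwrap().entries.contains_key(key)
    }

    pub fn get(&self, key: &str) -> Option<Vec<u8>> {
        let mut guard = self.inner.lock().unwrap();
        let value = guard.entries.get(key).cloned();
        if value.is_some() { guard.strategy.on_access(key); }
        value
    }

    pub fn remove(&self, key: &str) -> Option<Vec<u8>> {
        let mut guard = self.inner.lock().unwrap();
        let inner = &mut *guard;
        let old = inner.entries.remove(key)?;
        inner.used_bytes -= old.len();
        inner.strategy.on_remove(key);
        Some(old)
    }

    pub fn put(&self, key: &str, value: Vec<u8>) {
        if value.len() > self.limit_bytes { return; } // can never fit
        self.remove(key); // drop any previous value for this key
        let mut guard = self.inner.lock().unwrap();
        let inner = &mut *guard;
        // Ask the strategy for victims until the new value fits.
        while inner.used_bytes + value.len() > self.limit_bytes {
            let Some(victim) = inner.strategy.select_victim() else { break };
            if let Some(old) = inner.entries.remove(&victim) {
                inner.used_bytes -= old.len();
            }
            inner.strategy.on_remove(&victim);
        }
        inner.used_bytes += value.len();
        inner.entries.insert(key.to_owned(), value);
        inner.strategy.on_insert(key);
    }
}
```

Switching policies is then a one-argument change, for example `SketchCache::new(limit, Box::new(LfuStrategy::default()))`, while `get`, `put`, and `remove` are written once. A real implementation would likely use more efficient bookkeeping than the linear scans in `select_victim`.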
