BlakeOrth commented on code in PR #18855:
URL: https://github.com/apache/datafusion/pull/18855#discussion_r2561940077


##########
datafusion/execution/src/cache/cache_manager.rs:
##########
@@ -189,13 +228,20 @@ pub struct CacheManagerConfig {
     /// Avoid get same file statistics repeatedly in same datafusion session.
     /// Default is disable. Fow now only supports Parquet files.
     pub table_files_statistics_cache: Option<FileStatisticsCache>,
-    /// Enable cache of file metadata when listing files.
-    /// This setting avoids listing file meta of the same path repeatedly
-    /// in same session, which may be expensive in certain situations (e.g. remote object storage).
+    /// Enable caching of file metadata when listing files.
+    /// Enabling the cache avoids repeat list and metadata fetch operations, which may be expensive
+    /// in certain situations (e.g. remote object storage), for objects under paths that are
+    /// cached.
     /// Note that if this option is enabled, DataFusion will not see any updates to the underlying
-    /// location.  
-    /// Default is disable.
-    pub list_files_cache: Option<ListFilesCache>,
+    /// storage for at least `list_files_cache_ttl` duration.
+    /// Default is disabled.
+    pub list_files_cache: Option<Arc<dyn ListFilesCache>>,
+    /// Limit the number of objects to keep in the `list_files_cache`. Default: ~125k objects
+    pub list_files_cache_limit: usize,

Review Comment:
   Yes, this API could be easier to use. I think that is best left as future
work, where we can fix up both the metadata cache and the list cache at the
same time, since this PR has already grown to a fairly sizeable amount of code.
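
   For readers following the doc comments in the hunk above, here is a minimal,
self-contained sketch of the two behaviours they describe: listings go stale
after a TTL, and the cache is bounded by a number of objects. Everything in it
(`FileMeta`, `SketchListFilesCache`, `get`, `put`, and passing `now`
explicitly) is hypothetical and is not the `ListFilesCache` trait or any other
DataFusion API. As a simplifying assumption, it refuses to cache a listing that
would exceed the limit, rather than evicting older entries.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for per-object metadata (not DataFusion's type).
#[derive(Clone, Debug, PartialEq)]
pub struct FileMeta {
    pub path: String,
    pub size: u64,
}

/// One cached listing plus the time it was stored.
struct Entry {
    files: Vec<FileMeta>,
    inserted_at: Instant,
}

/// Illustrative cache keyed by path prefix, with a TTL and an object limit.
pub struct SketchListFilesCache {
    ttl: Duration,
    limit: usize,
    cached_objects: usize,
    entries: HashMap<String, Entry>,
}

impl SketchListFilesCache {
    pub fn new(ttl: Duration, limit: usize) -> Self {
        Self { ttl, limit, cached_objects: 0, entries: HashMap::new() }
    }

    /// Returns the listing for `prefix` if it is younger than the TTL.
    /// An expired listing is dropped, so the caller must list again.
    pub fn get(&mut self, prefix: &str, now: Instant) -> Option<Vec<FileMeta>> {
        let expired = match self.entries.get(prefix) {
            Some(entry) => now.duration_since(entry.inserted_at) >= self.ttl,
            None => return None,
        };
        if expired {
            if let Some(old) = self.entries.remove(prefix) {
                self.cached_objects -= old.files.len();
            }
            return None;
        }
        self.entries.get(prefix).map(|entry| entry.files.clone())
    }

    /// Stores a listing for `prefix`, replacing any previous one.
    /// Returns false and caches nothing if the object limit would be exceeded
    /// (assumption of this sketch: no eviction policy).
    pub fn put(&mut self, prefix: &str, files: Vec<FileMeta>, now: Instant) -> bool {
        if let Some(old) = self.entries.remove(prefix) {
            self.cached_objects -= old.files.len();
        }
        if self.cached_objects + files.len() > self.limit {
            return false;
        }
        self.cached_objects += files.len();
        self.entries
            .insert(prefix.to_string(), Entry { files, inserted_at: now });
        true
    }
}
```

   Taking `now` as a parameter keeps the TTL behaviour deterministic in tests.
Until the TTL elapses, `get` keeps returning the stored listing even if the
underlying storage has changed, which is the staleness the doc comment warns
about.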



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

