alamb opened a new issue, #19056: URL: https://github.com/apache/datafusion/issues/19056
### Is your feature request related to a problem or challenge?

- part of https://github.com/apache/datafusion/issues/17214
- follow on to https://github.com/apache/datafusion/pull/18855

@BlakeOrth added a cache to avoid re-listing all files, which is great, and it includes a max size and a TTL (time to live) for the entries. The default TTL is infinite (to ensure stability).

However, this is likely not what all users want, so it would be useful to be able to change these parameters, similarly to how the other runtime options can be configured.

### Describe the solution you'd like

I think what we should do (as a follow-on PR) is to add runtime configuration settings for the max cache size and its TTL in https://datafusion.apache.org/user-guide/configs.html#runtime-configuration-settings

This would mean supporting

```sql
-- set list files cache limit to 5MB
SET datafusion.runtime.list_files_cache_limit = '5M';
-- set time to live for each entry to 1 minute 30 seconds
SET datafusion.runtime.list_files_cache_ttl = '1m30s'; -- would it be better like `1:30`?
```

### Describe alternatives you've considered

I suggest adding two new runtime configuration options, following the model of `metadata_cache_limit`:

1. `list_files_cache_limit` -- size of cache
2. `list_files_cache_ttl` -- TTL duration of entries

That would mean roughly adding support here (and elsewhere in that file): https://github.com/apache/datafusion/blob/838e1dea832e3cd8585498ba12216e1ad9f584a4/datafusion/core/src/execution/context/mod.rs#L1160-L1163

And add tests like https://github.com/apache/datafusion/blob/c8d26ba012471e6aece9430642d6a8a923bc344c/datafusion/sqllogictest/test_files/set_variable.slt#L314-L316

And then add a note to the upgrade guide.

### Additional context

_No response_
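To make the two proposed value formats concrete, here is a minimal Rust sketch of how strings such as `'5M'` and `'1m30s'` might be turned into a byte count and a `Duration`. The helper names `parse_size_limit` and `parse_ttl` are hypothetical and are not taken from the DataFusion code base. The accepted suffixes (`K`/`M`/`G` as powers of 1024, and `h`/`m`/`s`) are assumptions for illustration, and the `1:30` alternative is not handled.

```rust
use std::time::Duration;

/// Hypothetical helper: parse a size such as "5M" into a number of bytes.
/// Assumes K, M, and G suffixes mean powers of 1024; no suffix means bytes.
fn parse_size_limit(s: &str) -> Result<usize, String> {
    let s = s.trim();
    let (digits, multiplier) = match s.chars().last() {
        Some('K') => (&s[..s.len() - 1], 1024usize),
        Some('M') => (&s[..s.len() - 1], 1024 * 1024),
        Some('G') => (&s[..s.len() - 1], 1024 * 1024 * 1024),
        _ => (s, 1),
    };
    let n: usize = digits
        .parse()
        .map_err(|_| format!("invalid size: '{s}'"))?;
    n.checked_mul(multiplier)
        .ok_or_else(|| format!("size overflows: '{s}'"))
}

/// Hypothetical helper: parse a TTL such as "1m30s" into a Duration.
/// Assumes each number is followed by one of the units h, m, or s.
fn parse_ttl(s: &str) -> Result<Duration, String> {
    let mut total_secs: u64 = 0;
    let mut digits = String::new();
    let mut saw_unit = false;
    for c in s.trim().chars() {
        if c.is_ascii_digit() {
            digits.push(c);
            continue;
        }
        let unit_secs = match c {
            'h' => 3600,
            'm' => 60,
            's' => 1,
            _ => return Err(format!("invalid unit '{c}' in ttl: '{s}'")),
        };
        let n: u64 = digits
            .parse()
            .map_err(|_| format!("missing number before '{c}' in ttl: '{s}'"))?;
        // Accumulate this component, rejecting values that overflow u64.
        total_secs = n
            .checked_mul(unit_secs)
            .and_then(|v| total_secs.checked_add(v))
            .ok_or_else(|| format!("ttl overflows: '{s}'"))?;
        digits.clear();
        saw_unit = true;
    }
    // Reject empty input and trailing digits that have no unit.
    if !digits.is_empty() || !saw_unit {
        return Err(format!("ttl must end with a unit (h, m, or s): '{s}'"));
    }
    Ok(Duration::from_secs(total_secs))
}
```

Under these assumptions, `parse_size_limit("5M")` yields 5 * 1024 * 1024 bytes and `parse_ttl("1m30s")` yields a 90-second duration, while a bare `"90"` is rejected because it carries no unit. Requiring an explicit unit is one possible design choice; it avoids ambiguity about whether a plain number means seconds or minutes.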
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
