alamb commented on code in PR #17133: URL: https://github.com/apache/datafusion/pull/17133#discussion_r2279464405
########## datafusion/execution/src/cache/cache_unit.rs: ########## @@ -271,21 +281,36 @@ impl DefaultFilesMetadataCacheState { } } +/// Default implementation of [`FileMetadataCache`] +/// /// Collected file embedded metadata cache. -/// The metadata for some file is invalided when the file size or last modification time have been -/// changed. -/// The `memory_limit` passed in the constructor controls the maximum size of the cache, which uses -/// a Least Recently Used eviction algorithm. -/// Users should use the `get` and `put` methods. The `get_with_extra` and `put_with_extra` methods -/// simply call `get` and `put`, respectively. +/// +/// The metadata for each file is invalidated when the file size or last +/// modification time have been changed. +/// +/// # Internal details +/// +/// The `memory_limit` controls the maximum size of the cache, which uses a +/// Least Recently Used eviction algorithm. When adding a new entry, if the total +/// size of the cached entries exceeds `memory_limit`, the least recently used entries +/// are evicted until the total size is lower than `memory_limit`. +/// +/// # `Extra` Handling +/// +/// Users should use the [`Self::get`] and [`Self::put`] methods. The +/// [`Self::get_with_extra`] and [`Self::put_with_extra`] methods simply call +/// `get` and `put`, respectively. pub struct DefaultFilesMetadataCache { // the state is wrapped in a Mutex to ensure the operations are atomic state: Mutex<DefaultFilesMetadataCacheState>, } impl DefaultFilesMetadataCache { - /// The `memory_limit` parameter controls the maximum size of the cache, in bytes, using a Least - /// Recently Used eviction algorithm. + /// Create a new instance of [`DefaultFilesMetadataCache`]. + /// + /// # Arguments Review Comment: Yeah, I agree the docs are inconsistent in this regard. I personally think that a whole section is only really useful when the arguments are non obvious or there is some nuance to them (such as this case). However, that is a subjective judgement call for sure I think it would not add much clarity if we, for example, added an `# Arguments` section for all functions, though I do think this is what is done in some other projects -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org