petermarshallio commented on a change in pull request #11584: URL: https://github.com/apache/druid/pull/11584#discussion_r722174661
##########
File path: docs/querying/caching.md
##########
@@ -32,30 +32,50 @@ If you're unfamiliar with Druid architecture, review the following topics before
 For instructions to configure query caching see [Using query caching](./using-caching.md).
+Cache monitoring, including the hit rate and number of evictions, is available in [Druid metrics](../operations/metrics.html#cache).
+
+Query-level caching is in addition to [data-level caching](../design/historical.md) on Historicals.
+
 ## Cache types
-Druid supports the following types of caches:
+Druid supports two types of query caching:
-- **Per-segment** caching which stores _partial results_ of a query for a specific segment. Per-segment caching is enabled on Historicals by default.
-- **Whole-query** caching which stores all results for a query.
+- [Per-segment caching](#per-segment-caching) stores _partial_ query results for a specific segment (enabled by default).
+- [Whole-query caching](#whole-query-caching) stores _final_ query results.
-To avoid returning stale results, Druid invalidates the cache the moment any underlying data changes for both types of cache.
+> **Druid invalidates _any_ cache the moment any underlying data changes**
+>
+> This ensures that Druid does not return stale results, especially important for `table` datasources that have highly-variable underlying data segments, including real-time data segments.
-Druid can store cache data on the local JVM heap or in an external distributed key/value store. The default is a local cache based upon [Caffeine](https://github.com/ben-manes/caffeine). Maximum cache storage defaults to the minimum value of 1 GiB or the ten percent of the maximum runtime memory for the JVM with no cache expiration. See [Cache configuration](../configuration/index.md#cache-configuration) for information on how to configure cache storage.
+> **Druid can store cache data on the local JVM heap or in an external distributed key/value store (e.g. memcached)**
+>
+> The default is a local cache based upon [Caffeine](https://github.com/ben-manes/caffeine). The default maximum cache storage size is the minimum of 1 GiB / ten percent of maximum runtime memory for the JVM, with no cache expiration. See [Cache configuration](../configuration/index.md#cache-configuration) for information on how to configure cache storage. When using caffeine, the cache is inside the JVM heap and is directly measurable. Heap usage will grow up to the maximum configured size, and then the least recently used segment results will be evicted and replaced with newer results.

Review comment:
I'm not an expert here, and I suspect we may need an engineer to look at this to be 100% sure, but I believe "flush" is associated with writing what is in the cache out to the thing it is caching, whereas "evict" is about deleting objects from the cache.

> An eviction policy decides which objects should be deleted at any given time. This policy directly affects the cache's hit rate — a crucial characteristic of caching libraries.

https://www.baeldung.com/java-caching-caffeine

I know it's only words (!!!) but I think it would be good to get this point right to avoid confusion.
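
To make the flush/evict distinction concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the class `WriteBackLruCache`, its methods, and the `segment_*` keys are hypothetical, and it uses a plain least-recently-used policy as the doc text describes. It is not Caffeine's or Druid's actual implementation.

```python
from collections import OrderedDict


class WriteBackLruCache:
    """Toy cache (hypothetical names) contrasting evict() with flush()."""

    def __init__(self, max_entries, backing_store):
        self.max_entries = max_entries
        self.backing_store = backing_store  # a dict standing in for the cached system
        self.entries = OrderedDict()        # ordered from least to most recently used
        self.dirty = set()                  # keys not yet written to the backing store
        self.evictions = 0

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)       # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key)
        self.dirty.add(key)
        while len(self.entries) > self.max_entries:
            self.evict()

    def evict(self):
        """Eviction: delete the least recently used entry from the cache."""
        key, _ = self.entries.popitem(last=False)
        # This toy simply drops the entry, which suits a cache of derived results.
        self.dirty.discard(key)
        self.evictions += 1
        return key

    def flush(self):
        """Flush: write pending entries to the backing store; nothing is deleted."""
        for key in self.dirty:
            self.backing_store[key] = self.entries[key]
        self.dirty.clear()


store = {}
cache = WriteBackLruCache(max_entries=2, backing_store=store)
cache.put("segment_a", 1)
cache.put("segment_b", 2)
cache.flush()              # both entries are copied to store and stay cached
cache.get("segment_a")     # touching segment_a makes segment_b the LRU entry
cache.put("segment_c", 3)  # cache is full, so segment_b is evicted
```

After the flush the cache still holds both entries; only the later `put` removes one, and it is the eviction counter that increases. That matches the "number of evictions" wording in the metrics sentence, so "evicted" looks like the right word for the Caffeine paragraph.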
