[GitHub] [druid] petermarshallio commented on a change in pull request #11584: Docs - query caching

GitBox Tue, 05 Oct 2021 05:09:33 -0700


petermarshallio commented on a change in pull request #11584:
URL: https://github.com/apache/druid/pull/11584#discussion_r722174661




##########
File path: docs/querying/caching.md
##########
@@ -32,30 +32,50 @@ If you're unfamiliar with Druid architecture, review the 
following topics before
 
 For instructions to configure query caching see [Using query 
caching](./using-caching.md).
 
+Cache monitoring, including the hit rate and number of evictions, is available 
in [Druid metrics](../operations/metrics.html#cache).
+
+Query-level caching is in addition to [data-level 
caching](../design/historical.md) on Historicals.
+
 ## Cache types
 
-Druid supports the following types of caches:
+Druid supports two types of query caching:
 
-- **Per-segment** caching which stores _partial results_ of a query for a 
specific segment. Per-segment caching is enabled on Historicals by default.
-- **Whole-query** caching which stores all results for a query.
+- [Per-segment caching](#per-segment-caching) stores _partial_ query results 
for a specific segment (enabled by default).
+- [Whole-query caching](#whole-query-caching) stores _final_ query results.
 
-To avoid returning stale results, Druid invalidates the cache the moment any 
underlying data changes for both types of cache.
+> **Druid invalidates _any_ cache the moment any underlying data changes**
+>
+> This ensures that Druid does not return stale results, especially important 
for `table` datasources that have highly-variable underlying data segments, 
including real-time data segments.
 
-Druid can store cache data on the local JVM heap or in an external distributed 
key/value store. The default is a local cache based upon 
[Caffeine](https://github.com/ben-manes/caffeine). Maximum cache storage 
defaults to the minimum value of 1 GiB or the ten percent of the maximum 
runtime memory for the JVM with no cache expiration. See [Cache 
configuration](../configuration/index.md#cache-configuration) for information 
on how to configure cache storage.
+> **Druid can store cache data on the local JVM heap or in an external 
distributed key/value store (e.g. memcached)**
+>
+> The default is a local cache based upon 
[Caffeine](https://github.com/ben-manes/caffeine). The default maximum cache 
storage size is the minimum of 1 GiB / ten percent of maximum runtime memory 
for the JVM, with no cache expiration. See [Cache 
configuration](../configuration/index.md#cache-configuration) for information 
on how to configure cache storage.  When using caffeine, the cache is inside 
the JVM heap and is directly measurable.  Heap usage will grow up to the 
maximum configured size, and then the least recently used segment results will 
be evicted and replaced with newer results.

Review comment:
       So I'm not an expert here, and I suspect we may need to get an engineer 
to look at this to be 100% sure, but I believe a flush is more associated with 
writing what's in the cache out to the thing that it's caching, whereas an 
eviction is about deleting objects from the cache.
   > An eviction policy decides which objects should be deleted at any given 
time. This policy directly affects the cache's hit rate — a crucial 
characteristic of caching libraries.
   https://www.baeldung.com/java-caching-caffeine
   
   I know it's only words (!!!) but I think it might be good to be right on 
this point to avoid confusion.
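
To make the distinction concrete, here is a rough sketch of how I understand the two terms. It is a toy, size-bounded cache in Python, not Druid's or Caffeine's actual API: the `LruCache` class, its `flush_to` method, and the `seg-*` keys are all made up for illustration, and plain least-recently-used ordering is an assumption taken from the wording in the diff rather than a statement about how Caffeine really picks entries.

```python
from collections import OrderedDict


class LruCache:
    """Toy size-bounded cache (hypothetical; not Druid's or Caffeine's API)."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.entries = OrderedDict()  # oldest entry first, newest last
        self.evictions = 0

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key)
        while len(self.entries) > self.max_entries:
            # Eviction: the least recently used entry is deleted outright.
            self.entries.popitem(last=False)
            self.evictions += 1

    def flush_to(self, backing_store):
        # Flush: cached contents are written out to the thing being cached.
        # Nothing is removed from the cache here.
        backing_store.update(self.entries)


cache = LruCache(max_entries=2)
cache.put("seg-a", "result-a")
cache.put("seg-b", "result-b")
cache.get("seg-a")              # seg-a is now the most recently used
cache.put("seg-c", "result-c")  # over capacity, so seg-b is evicted

store = {}
cache.flush_to(store)           # copies the two surviving entries
```

In this sketch the eviction counter goes up only when an entry is dropped to make room, which is the behaviour the paragraph in the diff seems to be describing, while the flush leaves the cache contents untouched.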




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


