peferron commented on issue #7236: Further improve caching documentation.
URL: https://github.com/apache/incubator-druid/pull/7236#issuecomment-472143133
 
 
   @gianm I'm glad to see that the original PR led to a comprehensive rewrite 
of the caching doc :)
   
   What I'm curious about is the performance of broker merging in a fully 
cached scenario. Imagine that all segment results are available in memcached. 
At what point does a single broker, fetching all results from the cache in 
bulk and merging them itself, start falling behind farming the work out to 
historicals? Historical caching has its own issues, such as the inability to 
fetch results in bulk and a lower local cache hit ratio when replication > 1, 
so I have a feeling that this threshold could actually be quite high when 
merging is cheap (simple sums, no HLLs, etc). It's probably really hard to 
write a good rule of thumb for this in the doc though, since it depends on many 
factors such as # of historicals, # of segments, result size, result merging 
cost, etc.
   
   What could be useful, though, is a curated list of holistic caching setups 
for the entire cluster. It's easy to get lost among all the individual settings 
for using/populating local/remote/hybrid-L1/hybrid-L2 segment-level caches in 
the brokers, historicals, and MiddleManagers. But probably only a handful of 
distinct combinations of these settings make sense.
   
   For example, a simple setup that would be a good starting point for most 
users would be:
   
   - Broker: `useResultLevelCache = true`, `populateResultLevelCache = true`
   - Historical: `useCache = true`, `populateCache = true`
   - Everything else: left at its default.
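   
   As a rough sketch, the setup above might look like this in the runtime properties of each service. This assumes the short setting names map to the `druid.broker.cache.*` and `druid.historical.cache.*` properties, and that each service has its own `runtime.properties` file. Both the file layout and the exact property names should be checked against the configuration reference for the Druid version in use.

```properties
# Broker runtime.properties (assumed location): result-level cache only
druid.broker.cache.useResultLevelCache=true
druid.broker.cache.populateResultLevelCache=true

# Historical runtime.properties (assumed location): segment-level cache
druid.historical.cache.useCache=true
druid.historical.cache.populateCache=true

# Everything else (cache type, sizes, etc.) is left at its default.
```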
   
   A handful of other setups may make sense in more specific scenarios. The 
discussion in https://github.com/apache/incubator-druid/issues/4947 contains 
quite elaborate schemes.
   
   I'm not familiar enough with caching perf at the moment to write such a list 
of recommended setups, but I intend to benchmark a few of these setups later so 
perhaps I'll give it a shot after that.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
