Another quick question: the trend lines drawn on the graphs seem to assume an exponential scaling pattern. In practice I would expect it to be sigmoid -- while the dataset is smaller than the cache capacity, changing the dataset size should have little to no effect on latency (since you'd get a 100% hit rate). Once the dataset grows larger than the cache capacity, you'd expect the hit rate to average out to (size of cache / size of data). The average latency, then, should be roughly the cache miss latency multiplied by the cache miss ratio. That is to say, as the dataset gets larger, the latency should level out toward a flat line rather than continue to grow the way your trend lines show.
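To make the shape I'm expecting concrete, here's a rough back-of-the-envelope model in Python (the 10 GB cache and the hit/miss latency numbers are made up for illustration, not taken from Nick's benchmarks):

def expected_latency(data_size_gb, cache_size_gb=10.0,
                     hit_latency_us=50.0, miss_latency_us=2000.0):
    # Assume reads are spread uniformly over the dataset, so the hit rate
    # is capped at cache_size / data_size once the data outgrows the cache.
    hit_ratio = min(1.0, cache_size_gb / data_size_gb)
    return hit_ratio * hit_latency_us + (1.0 - hit_ratio) * miss_latency_us

for size_gb in (1, 5, 10, 20, 50, 100, 500):
    print("%5d GB -> %7.1f us" % (size_gb, expected_latency(size_gb)))

It stays flat at the hit latency until the dataset passes the cache size, then climbs and levels off just under the miss latency -- sigmoid-ish, not exponential.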
-Todd

On Fri, Apr 4, 2014 at 9:40 PM, Stack <[email protected]> wrote:
> Pardon, my questions are around Nick's blog on blockcache in case folks are
> confused: http://www.n10k.com/blog/blockcache-101/
> St.Ack
>
>
> On Fri, Apr 4, 2014 at 9:22 PM, Stack <[email protected]> wrote:
>
> > Nick:
> >
> > + You measure 99th percentile. Did you take measure of average/mean
> > response times doing your blockcache comparison? (Our LarsHofhansl had it
> > that on average reads out of bucket cache were a good bit slower). Or
> > is this a TODO?
> > + We should just remove slabcache because bucket cache is consistently
> > better and why have two means of doing same thing? Or, do you need more
> > proof bucketcache subsumes slabcache?
> >
> > Thanks boss,
> > St.Ack
> >

--
Todd Lipcon
Software Engineer, Cloudera
