On Tue, May 21, 2013 at 6:08 PM, lars hofhansl <[email protected]> wrote:
> I just did a similar test using PE on a test cluster (16 DNs/RSs, 158
> mappers).
> I set it up such that the data does not fit into the aggregate block cache,
> but does fit into the aggregate OS buffer cache; in my case that turned out
> to be 100m 1k rows.
> Now I ran the SequentialRead and RandomRead tests.
>
> In both cases I see no disk activity (since the data fits into the OS cache).
> The SequentialRead run finishes in about 7mins, whereas the RandomRead run
> takes over 34mins.
> This is with CDH4.2.1 and HBase 0.94.7 compiled against it and with SCR
> enabled.
>
> The only difference is that in the SequentialRead case it is likely that the
> next Get can still use the previously cached block, whereas in the RandomRead
> case almost every Get needs to fetch a block from the OS cache (as verified by
> the cache miss rate, which is roughly the same as the request count per
> RegionServer). Except for enabling SCR all other settings are close to the
> defaults.
>
> I see 2000-4000 req/s/regionserver and the same number of cache misses per
> second and RegionServer in the RandomRead, meaning each RegionServer brought
> in about 125-200 MB/s from the OS cache, which seems a tad low.

That's a lot of variance. In my test the latencies I reported there were stable
around those numbers. So do we have a different way of measuring?

> So this would imply that reading from the OS cache is almost 5x slower than
> reading from the block cache. It would be interesting to explore the
> discrepancy.
>
> -- Lars
>
> ________________________________
> From: Jean-Daniel Cryans <[email protected]>
> To: "[email protected]" <[email protected]>
> Sent: Wednesday, April 24, 2013 6:01 PM
> Subject: Unscientific comparison of fully-cached zipfian reading
>
> Hey guys,
>
> I did a little benchmarking to see what kind of numbers we get from the
> block cache and the OS cache. Please see:
>
> https://docs.google.com/spreadsheet/pub?key=0Ao87IrzZJSaydENaem5USWg4TlRKcHl0dEtTS2NBOUE&output=html
>
> Hopefully it gives you some ballpark numbers for further discussion.
>
> J-D
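
As a rough check of the figures quoted in this thread, the arithmetic can be sketched as below. It assumes every cache miss loads one full block and that the block size is 64 KiB, the HBase default; the thread does not say which block size was actually used, and the function names are made up for illustration.

```python
# Back-of-envelope check of the numbers quoted in the thread.
# ASSUMPTION: 64 KiB per block (the HBase default HFile block size);
# the thread does not state the block size used in the test.
BLOCK_SIZE_BYTES = 64 * 1024


def cache_fill_rate_mib_per_s(misses_per_s, block_size_bytes=BLOCK_SIZE_BYTES):
    """MiB/s read from the OS cache if every miss loads one full block."""
    return misses_per_s * block_size_bytes / (1024 * 1024)


def slowdown(random_minutes, sequential_minutes):
    """Ratio of RandomRead to SequentialRead wall-clock time."""
    return random_minutes / sequential_minutes


if __name__ == "__main__":
    # Request rates per RegionServer as quoted: 2000-4000 req/s.
    for misses in (2000, 4000):
        rate = cache_fill_rate_mib_per_s(misses)
        print(f"{misses} misses/s -> {rate:.0f} MiB/s")
    # Run times as quoted: about 7 min sequential, over 34 min random.
    print(f"RandomRead / SequentialRead = {slowdown(34, 7):.1f}x")
```

Under these assumptions 2000 misses/s works out to 125 MiB/s, which matches the lower figure quoted, while 4000 misses/s would give 250 MiB/s rather than 200, so the quoted upper bound may rest on different inputs. The 34 min versus 7 min run times give a ratio of about 4.9, consistent with "almost 5x".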
