I've been reading a little of HBase: The Definitive Guide and HBase in
Action. I came across this question from Cloudera, and I'm still not sure
of the answer after looking at some benchmarks and the HBase
documentation. Could someone explain it to me? My thinking is that when
you do a large scan you should disable the block cache, because blocks
are going to be swapped in and out of the cache constantly, so you get
nothing from it. I'd guess you are actually penalized, since you spend
memory, GC time, and CPU on caching blocks that are never reused.

*You want to do a full table scan on your data. You decide to disable
block caching to see if this improves scan performance. Will disabling
block caching improve scan performance?*

A. No. Disabling block caching does not improve scan performance.
B. Yes. When you disable block caching, you free up that memory for other
   operations. With a full table scan, you cannot take advantage of block
   caching anyway because your entire table won't fit into cache.
C. No. If you disable block caching, HBase must read each block index from
   disk for each scan, thereby decreasing scan performance.
D. Yes. When you disable block caching, you free up memory for MemStore,
   which improves scan performance.
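
To make my reasoning concrete, here is a toy sketch in Java. It is not HBase code: it models the block cache as a plain LRU over block ids (the real HBase block cache is more elaborate), and the class and method names are made up for illustration. It counts cache hits for repeated sequential scans, and checks whether previously cached "hot" blocks survive a scan. In the real client, as far as I know, the per-scan switch is Scan.setCacheBlocks(false).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: a plain LRU over block ids, NOT HBase's actual block cache.
class ScanCacheSketch {

    // LinkedHashMap in access-order mode evicts the least recently used entry.
    static final class LruBlockSet extends LinkedHashMap<Integer, Boolean> {
        private final int capacity;

        LruBlockSet(int capacity) {
            super(16, 0.75f, true); // true = order entries by last access
            this.capacity = capacity;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<Integer, Boolean> eldest) {
            return size() > capacity;
        }
    }

    // Reads blocks 0..tableBlocks-1 in order, `passes` times, caching every
    // block that is read. Returns the number of cache hits.
    static int scanHits(int cacheBlocks, int tableBlocks, int passes) {
        LruBlockSet cache = new LruBlockSet(cacheBlocks);
        int hits = 0;
        for (int p = 0; p < passes; p++) {
            for (int b = 0; b < tableBlocks; b++) {
                if (cache.get(b) != null) {
                    hits++;
                } else {
                    cache.put(b, Boolean.TRUE);
                }
            }
        }
        return hits;
    }

    // Pre-loads `hotBlocks` blocks (standing in for a random-read workload),
    // runs one full scan, and returns how many hot blocks are still cached.
    static int hotBlocksSurviving(int cacheBlocks, int hotBlocks,
                                  int tableBlocks, boolean cacheScanBlocks) {
        LruBlockSet cache = new LruBlockSet(cacheBlocks);
        // Hot blocks use negative ids so they never collide with scan block ids.
        for (int h = 1; h <= hotBlocks; h++) {
            cache.put(-h, Boolean.TRUE);
        }
        for (int b = 0; b < tableBlocks; b++) {
            if (cacheScanBlocks) {
                cache.put(b, Boolean.TRUE);
            }
        }
        int surviving = 0;
        for (int h = 1; h <= hotBlocks; h++) {
            if (cache.containsKey(-h)) {
                surviving++;
            }
        }
        return surviving;
    }

    public static void main(String[] args) {
        // Table 10x larger than the cache: no pass ever finds a block cached.
        System.out.println(scanHits(100, 1000, 3));                 // 0
        // Table fits in the cache: passes 2 and 3 hit on every block.
        System.out.println(scanHits(1000, 100, 3));                 // 200
        // Caching scan blocks pushes out every hot block...
        System.out.println(hotBlocksSurviving(100, 50, 1000, true));  // 0
        // ...while a scan that skips the cache leaves them alone.
        System.out.println(hotBlocksSurviving(100, 50, 1000, false)); // 50
    }
}
```

In this model the scan itself gains nothing from the cache once the table is larger than it, and the only visible effect of caching scan blocks is that other readers lose their cached data. Whether that matches what HBase really does is exactly what I'm asking.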