On Fri, Oct 29, 2010 at 6:41 AM, Sean Bigdatafun
<[email protected]> wrote:
> I have the same doubt here. Let's say I have a totally random read pattern
> (uniformly distributed).
>
> Now let's assume my total data size stored in HBase is 100TB on 10
> machines (not a big deal considering nowadays' disks), and the total size of
> my RS' memory is 10 * 6GB = 60GB. That translates into a 60 / (100 * 1000) = 0.06%
> cache hit probability. Under a random read pattern, each read is bound to
> experience the "open -> read index -> ... -> read data block" sequence, which
> would be expensive.
>
> Any comment?
>
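The back-of-the-envelope math above can be sketched as follows; this is just a restatement of the poster's arithmetic (uniform reads, so expected hit rate = cache size / data size), with all sizes in decimal GB as in the original:

```python
# Expected block-cache hit rate under uniformly random reads:
# every block is equally likely, so P(hit) = cache_size / data_size.
total_data_gb = 100 * 1000   # 100 TB of data, decimal units as in the email
cache_gb = 10 * 6            # 10 region servers * 6 GB of cache each

hit_rate = cache_gb / total_data_gb
print(f"expected cache hit rate: {hit_rate:.2%}")  # 0.06%
```

With a hit rate that low, nearly every read pays the full disk-seek cost, which is the concern being raised.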

If totally random, as per Alvin's suggestion, yes, just turn off block
caching since it is doing you no good.
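For reference, block caching can be disabled per column family from the HBase shell; the table name 't1' and family 'f1' here are placeholders:

```shell
# Disable the block cache for one column family (hypothetical table/family names)
hbase> disable 't1'
hbase> alter 't1', {NAME => 'f1', BLOCKCACHE => 'false'}
hbase> enable 't1'
```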

But totally random is unusual in practice, no?

St.Ack
