On Fri, Oct 29, 2010 at 6:41 AM, Sean Bigdatafun <[email protected]> wrote:
> I have the same doubt here. Let's say I have a totally random read pattern
> (uniformly distributed).
>
> Now let's assume my total data stored in HBase is 100TB on 10 machines
> (not a big deal considering today's disks), and the total size of my RS'
> memory is 10 * 6GB = 60GB. That translates into a 60 / (100 * 1024) ≈ 0.06%
> cache hit probability. Under a random read pattern, each read is bound to
> experience the "open -> read index -> ... -> read datablock" sequence, which
> would be expensive.
>
> Any comment?
If the reads are totally random then, as per Alvin's suggestion, yes, just turn off block caching since it is doing you no good. But a totally random access pattern is unusual in practice, no?
St.Ack
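For what it's worth, the back-of-envelope estimate in the quoted mail can be written out as a quick sketch (figures taken from the thread; assumes uniformly random access, so expected hit rate is simply aggregate cache size over total data size):

```python
# Estimate block-cache hit rate under uniformly random reads.
# Under a uniform access pattern, P(hit) ~ total cache size / total data size.
# Numbers below are the ones quoted in the thread; this is a sketch, not a benchmark.

TOTAL_DATA_TB = 100      # total data stored in HBase
NUM_REGIONSERVERS = 10   # machines
CACHE_PER_RS_GB = 6      # block cache memory per regionserver

total_cache_gb = NUM_REGIONSERVERS * CACHE_PER_RS_GB  # 60 GB aggregate cache
total_data_gb = TOTAL_DATA_TB * 1024                  # 102,400 GB of data

hit_rate = total_cache_gb / total_data_gb
print(f"expected cache hit rate: {hit_rate:.4%}")  # ~0.0586%, i.e. roughly 0.06%
```

At that hit rate nearly every read misses the cache, which is why disabling the block cache (and saving the churn of caching blocks that will never be re-read) is the sensible move for this workload.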
