Thanks for your response, Jonathan.  We'll be doing mostly single-row random 
lookups.  In this scenario, would it be best to make the block size large 
enough to hold a single row?  How significant is the performance hit if HBase 
has to read multiple blocks to serve a single row?
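
To make the question concrete, here is a rough back-of-the-envelope sketch of 
how the number of blocks (and so block index entries) might change with block 
size for our data.  It assumes a block is closed once it reaches the configured 
size and that a single cell is never split across blocks; that is my 
assumption about the file format, not something I have confirmed.  The function 
name and the sample sizes are made up for illustration.

```python
def count_blocks(cell_sizes, block_size):
    """Count data blocks for a sequence of cell sizes (in bytes).

    Assumption (unconfirmed): a block is closed as soon as its accumulated
    size reaches block_size, and a cell is never split across blocks.
    """
    blocks = 0
    current = 0
    for size in cell_sizes:
        current += size
        if current >= block_size:
            # Block is full (or overfull): close it and start a new one.
            blocks += 1
            current = 0
    if current > 0:
        # Count the final, partially filled block.
        blocks += 1
    return blocks


KB = 1024
cells = [300 * KB] * 8  # eight values at our average size of 300k

for block_size in (64 * KB, 1024 * KB):
    n = count_blocks(cells, block_size)
    print(block_size // KB, "KB block size:", n, "blocks / index entries")
```

Under those assumptions, a block size smaller than the average value gives one 
block per row, while a larger block size packs several rows into each block, 
which shrinks the index but means a single-row read pulls in neighbouring rows.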


On May 18, 2010, at 3:12 PM, Jonathan Gray wrote:

> It would depend on your read patterns.
> 
> Is everything going to be single row gets, or will you also scan?
> 
> Single row lookups will be faster with smaller block sizes, at the expense of 
> a larger index size (and potentially slower scans as you have to deal with 
> more block fetches).
> 
>> -----Original Message-----
>> From: Jason Strutz [mailto:[email protected]]
>> Sent: Tuesday, May 18, 2010 9:33 AM
>> To: [email protected]
>> Subject: Optimal block size for large columns
>> 
>> I am working with a small cluster, trying to nail down appropriate
>> settings for block size.  We will have a single table with a single
>> column of data averaging 300k in size, sometimes upwards of 2mb, never
>> more than 10mb.
>> 
>> Is there any rule-of-thumb or other sage advice for block sizes for
>> large columns?
>> 
>> Thanks!
