I just attempted to change the block size and it doesn't seem to be taking
effect. I am doing the following in the shell:
> alter 'tablename', {METHOD=>'table_att', BLOCKSIZE=>1048576}
It returns without any errors, but when I 'describe' the table, the setting is
unchanged. Is there another place for me to make such changes to the table?
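Should I be setting it on the column family instead, e.g. something like the
following (where 'cf' stands in for one of my actual families; I'm guessing at
the syntax)?
> alter 'tablename', {NAME=>'cf', BLOCKSIZE=>1048576}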
Thanks!
On Jul 27, 2010, at 10:39 AM, Jean-Daniel Cryans wrote:
> I would try using the smallest keys you can (row key, family name,
> qualifier) as they are all stored with each value and you want to
> retrieve millions of them very quickly. Also do the usual things, like
> compressing the families with LZO and reading as few families as possible
> at the same time (e.g. if your table has 3 families, you shouldn't read
> more than 1 at a time).
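>
> For example, something along these lines at table creation time (the family
> name 'd' is just an illustration, and LZO has to be installed on the cluster
> separately):
>
>   create 'tablename', {NAME => 'd', COMPRESSION => 'LZO'}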
>
> J-D
>
> On Tue, Jul 27, 2010 at 10:30 AM, Andrew Nguyen
> <[email protected]> wrote:
>> Perfect, thanks. I will run some experiments and keep you posted.
>>
>> Aside from just getting elapsed time on scans of various sizes, are there
>> any other tips on what sorts of measurements to perform? Also, since I'm
>> doing the experiments with various block sizes anyway, any requests for
>> other types of benchmarks?
>>
>> Thanks,
>> Andrew
>>
>> On Jul 27, 2010, at 10:13 AM, Jean-Daniel Cryans wrote:
>>
>>>> Thanks for the heads up. Do you know what happens if I set this value
>>>> larger than 5MB? We will always be scanning the data, and always in large
>>>> blocks. I have yet to calculate the typical size of a single scan, but I
>>>> imagine that it will usually be larger than 1MB.
>>>
>>> I never tried that, so it's hard to tell, but I'm always eager to hear
>>> about others' experiences :)
>>>
>>>>
>>>> Also, is there any way to change the block size with data already in
>>>> HBase? Our current import process is very slow (due to the preprocessing
>>>> of the data) and we don't have the resources to store the preprocessed data.
>>>
>>> After altering the table, issue a major compaction on it and
>>> everything will be rewritten with the new block size.
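>>>
>>> From the shell that's something like:
>>>
>>>   major_compact 'tablename'
>>>
>>> It's asynchronous, so give it some time to finish before measuring.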
>>>
>>> J-D
>>
>>