You can change the HFile size through the hbase.hregion.max.filesize parameter, which sets the maximum size a store file may reach before its region is split.
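
As for why fsck reports 1459 blocks instead of ~250: an HDFS block never spans two files, so each file contributes ceil(file size / block size) blocks on its own. Here is a rough sketch of that arithmetic in Python. The function name is made up for illustration, and the evenly sized files are an assumption, since fsck only reports the average.

```python
# Assumption: dfs.block.size of 1 GB means 1024**3 bytes.
BLOCK_SIZE = 1024 ** 3
TOTAL_SIZE = 265727504820   # "Total size" from the fsck output
TOTAL_FILES = 1459          # "Total files" from the fsck output


def hdfs_block_count(file_sizes, block_size):
    """Count blocks when every file is split independently.

    Each file needs ceil(size / block_size) blocks; an empty file needs none.
    """
    return sum(-(-size // block_size) for size in file_sizes)


# What you would get if all the data sat in one huge file.
single_file_blocks = hdfs_block_count([TOTAL_SIZE], BLOCK_SIZE)

# Assumed layout: 1459 files of roughly equal size (about 182 MB each).
even_sizes = [TOTAL_SIZE // TOTAL_FILES] * TOTAL_FILES
per_file_blocks = hdfs_block_count(even_sizes, BLOCK_SIZE)

print(single_file_blocks, per_file_blocks)
```

With these numbers the single-file case gives 248 blocks, while 1459 files that are each smaller than one block give 1459 blocks, which matches the fsck report. So the block count here is driven by the number of HFiles, not by dfs.block.size.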

On May 13, 2013, at 2:45 AM, Praveen Bysani <[email protected]> wrote:

> Hi,
> 
> I wanted to minimize the number of MapReduce tasks generated while
> processing a job, hence I configured it to a larger value.
> 
> I don't think I have configured the HFile size in the cluster. I use Cloudera
> Manager to manage my cluster, and the only configuration I can relate
> to is hfile.block.cache.size,
> which is set to 0.25. How do I change the HFile size?
> 
> On 13 May 2013 15:03, Amandeep Khurana <[email protected]> wrote:
> 
>> On Sun, May 12, 2013 at 11:40 PM, Praveen Bysani <[email protected]> wrote:
>> 
>>> Hi,
>>> 
>>> I have the dfs.block.size value set to 1 GB in my cluster configuration.
>> 
>> 
>> Just out of curiosity - why do you have it set at 1GB?
>> 
>> 
>>> I have around 250 GB of data stored in HBase on this cluster. But when I
>>> check the number of blocks, it doesn't correspond to the block size value
>>> I set. From what I understand, I should only have ~250 blocks. But instead,
>>> when I ran fsck on /hbase/<table-name>, I got the following:
>>> 
>>> Status: HEALTHY
>>> Total size:    265727504820 B
>>> Total dirs:    1682
>>> Total files:   1459
>>> Total blocks (validated):      1459 (avg. block size 182129886 B)
>>> Minimally replicated blocks:   1459 (100.0 %)
>>> Over-replicated blocks:        0 (0.0 %)
>>> Under-replicated blocks:       0 (0.0 %)
>>> Mis-replicated blocks:         0 (0.0 %)
>>> Default replication factor:    3
>>> Average block replication:     3.0
>>> Corrupt blocks:                0
>>> Missing replicas:              0 (0.0 %)
>>> Number of data-nodes:          5
>>> Number of racks:               1
>>> 
>>> Are there any other configuration parameters that need to be set?
>> 
>> 
>> What is your HFile size set to? The HFiles that get persisted would be
>> bound by that number. Thereafter each HFile would be split into blocks, the
>> size of which you configure using the dfs.block.size configuration
>> parameter.
>> 
>> 
>>> 
>>> --
>>> Regards,
>>> Praveen Bysani
>>> http://www.praveenbysani.com
> 
> 
> 
> -- 
> Regards,
> Praveen Bysani
> http://www.praveenbysani.com
