On Wed, May 11, 2011 at 11:59 PM, Mayuresh
<[email protected]> wrote:
> Hi,
> I have a question on how the splits work on hbase.
>

Once any file in the region is >= the configured
hbase.hregion.max.filesize, the region is split.
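
As a rough sketch of that rule (a toy model in Python, not HBase code; the
function name needs_split is made up for illustration), the check is per
file rather than on the region's total size:

```python
# Toy model of the split trigger described above. Not HBase code.
# MAX_FILESIZE mirrors the hbase.hregion.max.filesize value in the quoted config.
MAX_FILESIZE = 1048576  # bytes (1 MiB)


def needs_split(store_file_sizes, max_filesize=MAX_FILESIZE):
    """Return True if any single store file has reached the configured limit."""
    return any(size >= max_filesize for size in store_file_sizes)


# Two files that together exceed the limit, but neither does on its own.
print(needs_split([600_000, 600_000]))
# One file at or above the limit is enough to trigger a split.
print(needs_split([300_000, 1_200_000]))
```

In this model the first call reports no split and the second reports a split,
because only the size of an individual file is compared against the limit.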

On split, the daughter regions take out 'references': one daughter
references the top half of the parent's files and the other daughter
the bottom half.  The daughters run this way until they compact.  At
that time they rewrite the data from the parent into new files that live
under the daughters.

Are you looking at the fs after the above compactions/rewrites have
run? (Your files seem to be about half the configured size).
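
To illustrate the arithmetic, here is a small Python sketch. It is a toy
model, not HBase code: it assumes the split divides a parent file exactly
in half by size, which need not hold in practice, and the function names
are made up for illustration.

```python
# Toy model of the daughter file sizes after a split and compaction.
# Assumption: the split point divides the parent's file evenly by size.


def daughter_sizes(parent_file_size):
    """Return (bottom, top) sizes in bytes for an evenly split parent file."""
    bottom = parent_file_size // 2
    top = parent_file_size - bottom
    return bottom, top


def implied_parent_kb(daughter_kb):
    """Estimate the parent file size (KB) from one daughter's file size (KB)."""
    return 2 * daughter_kb


# A parent file that just reached the 1 MiB limit from the quoted config.
print(daughter_sizes(1048576))

# File sizes reported in the quoted message, in KB.
for observed_kb in (576.7, 640.89):
    parent_kb = implied_parent_kb(observed_kb)
    print(f"{observed_kb} KB daughter -> roughly {parent_kb:.1f} KB parent")
```

Under that even-split assumption, the two files you listed would come from
parent files a little over the 1 MB limit, which fits the picture above.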

St.Ack


> I have one master which also acts as a region server, along with other 3
> region servers.
>
> I have set the following parameters on all the region servers
>
>  <property>
>    <name>hbase.hregion.max.filesize</name>
>    <value>1048576</value>
>    <description>
>    Maximum HStoreFile size. If any one of a column families' HStoreFiles has
>    grown to exceed this value, the hosting HRegion is split in two.
>    Default: 256M.
>    </description>
>  </property>
>  <property>
>    <name>hbase.hregion.memstore.flush.size</name>
>    <value>6291456</value>
>    <description>
>    Memstore will be flushed to disk if size of the memstore
>    exceeds this number of bytes.  Value is checked by a thread that runs
>    every hbase.server.thread.wakefrequency.
>    </description>
>  </property>
>
> I have 2 tables into which I am loading data, and I am expecting files of
> about 1 MB to be created. However, when I check the sizes on DFS, the files
> it creates are around 500 KB:
> /hbase/cpu_util_30secs/1693967354/data/2196215602953537657 - 576.7 KB
> /hbase/cpu_util_30secs/366365858/data/4815597063578386524 - 640.89 KB
>
>
> All the table regions that it has created are of this size. However I was
> expecting them to be 1M.
>
> Is there some other parameter that I must be tweaking?
>
> Regards,
> Mayuresh
>
