Hello,

While loading data into our cluster, I noticed some strange behavior in some 
regions that I don't understand. 

Scenario:
We convert data from a MySQL database to HBase. Each row is inserted with a Put 
into the target HBase table. The row key is a timestamp. I know the problem 
with timestamp keys, but for our requirements it works quite well. The problem 
now is that some regions keep growing and growing.
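
To make the timestamp-key concern concrete, here is a minimal sketch in plain 
Python (no HBase client involved). It assumes the row key is the timestamp 
encoded as an 8-byte big-endian long; the region start keys and the helper 
names row_key and region_index are hypothetical, chosen only for illustration. 
It shows that every newer timestamp sorts after all existing region boundaries, 
so writes always land in the last region.

```python
import bisect
import struct


def row_key(ts_millis):
    """Encode a non-negative millisecond timestamp as 8 big-endian bytes.

    Assumption: the key is a binary long; string-encoded timestamps of equal
    length would sort the same way.
    """
    return struct.pack(">q", ts_millis)


def region_index(start_keys, key):
    """Return the index of the region whose [start, next start) range holds key."""
    return bisect.bisect_right(start_keys, key) - 1


# Hypothetical table with four regions; the first start key is empty.
start_keys = [b"", row_key(1_000), row_key(2_000), row_key(3_000)]

# Incoming rows carry timestamps later than every existing boundary.
incoming = [row_key(ts) for ts in (3_500, 3_501, 9_999)]
targets = [region_index(start_keys, k) for k in incoming]
print(targets)  # every new row maps to the last region (index 3)
```

This only illustrates why the writes concentrate in one region; it does not 
explain why that region is never split.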

Take, for example, the table in the picture [1]. At first, all data was 
distributed across regions and nodes. Now the data is written into only one 
region, which keeps growing, and I see no splitting at all. The big region is 
currently nearly 60 GB.

The HBase version is 0.94.11. I cannot understand why the splitting is not 
happening. In hbase-site.xml I set hbase.hregion.max.filesize to 2 GB, and 
HBase accepted this value.

<property>
        <!--Loaded from hbase-site.xml-->
        <name>hbase.hregion.max.filesize</name>
        <value>2147483648</value>
</property>

First mystery: Hannibal shows me that the split size is 10 GB (see screenshot).
Second mystery: HBase does not split some regions at either 2 GB or 10 GB.

Any ideas? Could the timestamp row key be causing this problem?

Thanks,

        Timo

[1] https://www.dropbox.com/s/lm286xkcpglnj1t/big_region.png
