Hello, while loading data into our cluster I noticed some strange behavior in a few regions that I don't understand.
Scenario: We are converting data from a MySQL database to HBase. The data is inserted with a Put into the specific HBase table. The row key is a timestamp. I know the problem with timestamp keys, but for our requirements it works quite well.

The problem is that some regions keep growing and growing, for example in the table shown in the picture [1]. At first, all data was distributed over the regions and nodes. Now the data is written into only one region, which keeps growing, and I can see no splitting at all. Currently the size of the big region is nearly 60 GB.

The HBase version is 0.94.11. I cannot understand why the splitting is not happening. In hbase-site.xml I limit hbase.hregion.max.filesize to 2 GB, and HBase accepted this value:

  <property>
    <!--Loaded from hbase-site.xml-->
    <name>hbase.hregion.max.filesize</name>
    <value>2147483648</value>
  </property>

First mystery: Hannibal shows me that the split size is 10 GB (see screenshot).
Second mystery: HBase is not splitting some regions, neither at 2 GB nor at 10 GB.

Any ideas? Could the timestamp row key be causing this problem?

Thanks,
Timo

[1] https://www.dropbox.com/s/lm286xkcpglnj1t/big_region.png