Hey all,
We recently completed a bulk load of data into a number of tables. Once
that finished, we restarted the cluster and migrated from EMR 5.19 to
5.20, which moved our HBase version from 1.4.7 to 1.4.8. Everything
seemed stable at the time, and there was no activity across the cluster
for a few days. Yesterday we activated our live ingest pipeline, which
is now pushing more data into those bulk-loaded tables.
What we're seeing now, after about 24hrs of live ingest (~450 requests/s
across 80 regionservers), is that HBase is splitting regions like crazy.
In the last 24hrs we've more than doubled our region count, going from
24k to 51k regions.
Looking at the tables, the regions being split are relatively small. One
example table had 136 regions yesterday at roughly 8-10GB per region. It
now has 1446 regions of 1-2GB each, yet the table has only grown by
~700GB.
The current configuration we're using has the following values set,
which I was under the impression would prevent exactly this situation.
<property>
  <name>hbase.hregion.max.filesize</name>
  <value>21474836480</value>
</property>
<property>
  <name>hbase.regionserver.regionSplitLimit</name>
  <value>256</value>
</property>
<property>
  <name>hbase.regionserver.region.split.policy</name>
  <value>org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy</value>
</property>
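(Side note: I still need to confirm that none of these tables carries a
table-level SPLIT_POLICY or MAX_FILESIZE override in its descriptor,
since a table-level setting would take precedence over hbase-site.xml. A
quick describe in the shell should show that; table name below is just
an example:)

hbase(main):001:0> describe 'example_table'
# checking the table attributes for SPLIT_POLICY / MAX_FILESIZE overrides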
For now I've disabled splits through the shell. Any insight into what
may be causing this would be really appreciated. Also, if anyone is
aware of a log4j config that would help surface what's driving these
splits, that would be very useful; the snippet below is roughly the kind
of thing I had in mind.
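(These logger names are just my guess at the relevant split-path classes
in 1.4, added to log4j.properties on the regionservers; I haven't
verified they're the right ones.)

# guessed regionserver log4j.properties overrides for split debugging
log4j.logger.org.apache.hadoop.hbase.regionserver.CompactSplitThread=DEBUG
log4j.logger.org.apache.hadoop.hbase.regionserver.SplitRequest=DEBUG
log4j.logger.org.apache.hadoop.hbase.regionserver.RegionSplitPolicy=DEBUG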
Thanks,
Austin