We have an HBase cluster (version 0.89.20100926) that had been serving peacefully, with acceptable throughput and latencies, for about a month.
This morning we wanted to set the TTL to a value smaller than the default, and Mr. Murphy struck.

(A) We disabled the table and altered it with the desired TTL value (using the shell). Altering one property (in this case TTL) reset all other table properties to their default values. For example, versions was reset to 3, compression was reset to NONE, etc. [I think this is a known issue with an open Jira.]

(B) We wanted to go back to the previous table properties. After multiple retries we were still not able to disable the table (even restarting the cluster didn't help). Most likely, if clients are hitting an HBase table hard (in this case ~10k qps), it takes forever to disable it. So we stopped all clients, and were then able to disable the table and alter its properties to the desired values.

(C) Because compression was reset to NONE and versions was reset to 3 for a good 10-12 hours, the total number of regions tripled and the load (regions per region server) increased from 100 to 300. After the first major compaction, compression, versions = 1, and the new lower TTL took effect. We are back to the original HDFS footprint, but now have a bunch of small regions.

What will trigger merging of these regions? The merge tool does not seem to work, and even if it does, it can only merge two regions at a time. Any suggestions on how we can reduce the number of regions and bring the load back to where it was before?

Thanks,
--Abhi
