Personally, I think we should allow split when the store file count is greater than the configured limit. After the split is done, we would have twice the compaction concurrency, so we could reduce the store file count faster.
I cannot recall the reason why we disable split under this scenario; maybe it is because we would need to link a lot of store files, which makes the split slow?

Xiaolin Ha <[email protected]> wrote on Thu, Oct 14, 2021 at 4:40 PM:
> Hello everyone,
>
> While resolving a compaction and split issue for a huge region, I found
> some problems.
> I think split should be allowed to trigger when the store file count is
> greater than the configured blocking file count. Currently there is no
> guarantee that the number of store files will ever drop below the
> blocking file count, because the speed of adding files can exceed the
> speed of removing them, e.g. due to bulk loads or large files being
> skipped by compaction.
>
> The relevant issue is https://issues.apache.org/jira/browse/HBASE-26242,
> and I have linked the description document:
> https://docs.google.com/document/d/1HuMEKTuPhSG5lQUvyoBSG3rv3ORB3pWq9IhFkhuZ4Hw/edit#heading=h.ki0y72tu2c5
>
> The PR is https://github.com/apache/hbase/pull/3652
>
> What do you think?
> Looking forward to your quick response on this.
>
> Thanks,
> Xiaolin Ha
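To make the behavior change under discussion concrete, here is a minimal, hypothetical sketch of the split-eligibility decision. This is not the actual HBase split-policy API; the class, method, and field names are invented for illustration (the real blocking limit corresponds to hbase.hstore.blockingStoreFiles).

```java
// Hypothetical sketch of the split-eligibility decision being discussed.
// Names here are invented for illustration, not the real HBase API.
public class SplitPolicySketch {
    private final int blockingStoreFiles; // cf. hbase.hstore.blockingStoreFiles
    private final long maxRegionSizeBytes; // size threshold that triggers a split

    public SplitPolicySketch(int blockingStoreFiles, long maxRegionSizeBytes) {
        this.blockingStoreFiles = blockingStoreFiles;
        this.maxRegionSizeBytes = maxRegionSizeBytes;
    }

    // Current behavior (simplified): a region whose store file count
    // exceeds the blocking limit is not allowed to split, even when it
    // is over the size threshold.
    public boolean shouldSplitCurrent(long regionSizeBytes, int storeFileCount) {
        return regionSizeBytes > maxRegionSizeBytes
                && storeFileCount <= blockingStoreFiles;
    }

    // Proposed behavior: allow the split regardless of the store file
    // count, so the two daughter regions can compact in parallel and
    // shed store files faster than the single blocked region could.
    public boolean shouldSplitProposed(long regionSizeBytes, int storeFileCount) {
        return regionSizeBytes > maxRegionSizeBytes;
    }
}
```

With the current check, a huge region that is already over the blocking file count can neither compact fast enough nor split, which is exactly the stuck state described above; the proposed check removes the file-count condition from the split decision.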
