[
https://issues.apache.org/jira/browse/HBASE-26242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xiaolin Ha updated HBASE-26242:
-------------------------------
Summary: Allow split when store file count larger than the configured
blocking file count (was: Support split before compact when store file count
larger than the configured blocking file count)
> Allow split when store file count larger than the configured blocking file
> count
> --------------------------------------------------------------------------------
>
> Key: HBASE-26242
> URL: https://issues.apache.org/jira/browse/HBASE-26242
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 3.0.0-alpha-1, 1.4.0, 2.0.0
> Reporter: Xiaolin Ha
> Assignee: Xiaolin Ha
> Priority: Major
>
> requestSplit(), which is called by the MemstoreFlusher and CompactionRunner,
> checks the region's compaction priority before splitting. If the compaction
> priority is less than PRIORITY_USER, the region is not split.
> {code:java}
> public synchronized boolean requestSplit(final Region r) {
>   // don't split regions that are blocking
>   HRegion hr = (HRegion)r;
>   try {
>     if (shouldSplitRegion() && hr.getCompactPriority() >= PRIORITY_USER) {
>       byte[] midKey = hr.checkSplit().orElse(null);
>       if (midKey != null) {
>         requestSplit(r, midKey);
>         return true;
>       }
>     }
> ....{code}
> But the region's compaction priority is the minimum over all of its stores.
> When the number of storefiles in a store exceeds the configured
> `hbase.hstore.blockingStoreFiles`, that store's priority becomes negative,
> whereas requestSplit() compares the priority against 1 (PRIORITY_USER).
> {code:java}
> public int getStoreCompactionPriority() {
>   int priority = blockingFileCount - storefiles.size();
>   return (priority == HStore.PRIORITY_USER) ? priority + 1 : priority;
> }
> {code}
> As a result, when a region should split but compaction reduces the number of
> files more slowly than new files are generated (e.g. by compacting L0 files
> into stripes, bulk loads, or memstore flushes), the region will never split.
> Yet a split would spread the compaction pressure: 1 parent compaction + 2
> children compactions can be reduced to 2 children compactions.
> The problem is most obvious with StripeStoreEngine. Although memstore
> flushes are held back once the store file count reaches the blocking count,
> each L0 compaction may still generate as many new files as there are
> stripes, one for each stripe. In this scenario, since compaction always
> takes priority over split, the stripe count keeps growing, each compaction
> generates more and more new files, and in the end the region never splits.
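
The interaction between the two quoted snippets can be reproduced with a small stand-alone sketch. It is not HBase code: the class `SplitGateSketch` and its method names are hypothetical, the blocking count of 16 is only an example value, and the `shouldSplitRegion()` check is left out. It copies the arithmetic of `getStoreCompactionPriority()` and the `>= PRIORITY_USER` comparison from `requestSplit()` to show that one store over the blocking count is enough to block the split of the whole region.

```java
import java.util.List;

// Hypothetical stand-alone sketch of the behaviour described in this issue; not HBase code.
final class SplitGateSketch {
  // The issue states that PRIORITY_USER is 1.
  static final int PRIORITY_USER = 1;

  // Same arithmetic as the quoted getStoreCompactionPriority().
  static int storeCompactionPriority(int blockingFileCount, int storeFileCount) {
    int priority = blockingFileCount - storeFileCount;
    return (priority == PRIORITY_USER) ? priority + 1 : priority;
  }

  // The region's priority is the minimum over its stores, as the description says.
  // An empty list yields Integer.MAX_VALUE in this sketch.
  static int regionCompactPriority(int blockingFileCount, List<Integer> storeFileCounts) {
    int min = Integer.MAX_VALUE;
    for (int count : storeFileCounts) {
      min = Math.min(min, storeCompactionPriority(blockingFileCount, count));
    }
    return min;
  }

  // The comparison from the quoted requestSplit(); shouldSplitRegion() is ignored here.
  static boolean passesSplitGate(int regionPriority) {
    return regionPriority >= PRIORITY_USER;
  }

  public static void main(String[] args) {
    int blocking = 16; // example value for hbase.hstore.blockingStoreFiles (assumption)

    // One store has 20 files, 4 over the blocking count: region priority is -4.
    int blocked = regionCompactPriority(blocking, List.of(5, 20));
    System.out.println(blocked + " -> split allowed: " + passesSplitGate(blocked));

    // All stores are below the blocking count: region priority is 6.
    int healthy = regionCompactPriority(blocking, List.of(5, 10));
    System.out.println(healthy + " -> split allowed: " + passesSplitGate(healthy));
  }
}
```

With these inputs the first region gets priority -4 and fails the comparison, while the second gets priority 6 and passes. A store holding exactly the blocking count of files yields priority 0, which also fails the comparison.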
>
>