[ 
https://issues.apache.org/jira/browse/HBASE-26242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaolin Ha reassigned HBASE-26242:
----------------------------------

    Assignee: Xiaolin Ha

> Region never split when store file count larger than the configed blocking 
> file count
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-26242
>                 URL: https://issues.apache.org/jira/browse/HBASE-26242
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 3.0.0-alpha-1, 1.4.0, 2.0.0
>            Reporter: Xiaolin Ha
>            Assignee: Xiaolin Ha
>            Priority: Major
>
> The requestSplit() function (called by the MemstoreFlusher and 
> CompactionRunner) checks the compaction priority of the region. If the 
> compact priority is < PRIORITY_USER, the region is not split.
> {code:java}
> public synchronized boolean requestSplit(final Region r) {
>   // don't split regions that are blocking
>   HRegion hr = (HRegion)r;
>   try {
>     if (shouldSplitRegion() && hr.getCompactPriority() >= PRIORITY_USER) {
>       byte[] midKey = hr.checkSplit().orElse(null);
>       if (midKey != null) {
>         requestSplit(r, midKey);
>         return true;
>       }
>     }
> ....{code}
> But the region's compact priority is the minimum over all of its stores. 
> When the number of storefiles in a store is larger than the configured 
> `hbase.hstore.blockingStoreFiles`, the priority is a negative number, 
> while the priority compared against in requestSplit() is 1 (PRIORITY_USER).
> {code:java}
> public int getStoreCompactionPriority() {
>   int priority = blockingFileCount - storefiles.size();
>   return (priority == HStore.PRIORITY_USER) ? priority + 1 : priority;
> }
> {code}
> As a result, when a region should split but compaction reduces the number 
> of files more slowly than new files are generated (e.g. by compacting L0 
> files to stripes, bulk load, or memstore flush), the region will never 
> split, even though a split can divide the compaction pressure (1 parent 
> compaction + 2 children compactions can be reduced to 2 children 
> compactions).
> The problem is obvious in StripeStoreEngine: although memstore flushing is 
> pending once the store file count reaches the blocking count, each L0 
> compaction may add a new file to each stripe, i.e. up to the stripe count 
> of new files. In this scenario, since the store always prioritizes 
> compaction over split, the stripe count grows larger and larger, 
> compactions generate more and more new files, and the region never splits 
> in the end...
>  
>  
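
For illustration, the interaction described above can be reproduced outside
HBase with a small standalone sketch. The class SplitGateSketch and its
method names are hypothetical and are not HBase APIs; only the arithmetic is
taken from the two snippets quoted in the issue, and the blocking count of 16
is just an example value.

```java
import java.util.List;

// Hypothetical standalone model of the priority check; not HBase code.
public class SplitGateSketch {
    // The issue text states that PRIORITY_USER is 1.
    static final int PRIORITY_USER = 1;

    // Same arithmetic as the quoted getStoreCompactionPriority().
    static int storeCompactionPriority(int blockingFileCount, int storeFileCount) {
        int priority = blockingFileCount - storeFileCount;
        return (priority == PRIORITY_USER) ? priority + 1 : priority;
    }

    // Per the issue text, the region priority is the minimum over its stores.
    static int regionCompactPriority(int blockingFileCount, List<Integer> storeFileCounts) {
        int min = Integer.MAX_VALUE;
        for (int count : storeFileCounts) {
            min = Math.min(min, storeCompactionPriority(blockingFileCount, count));
        }
        return min;
    }

    // The priority condition from the quoted requestSplit().
    static boolean passesSplitGate(int regionPriority) {
        return regionPriority >= PRIORITY_USER;
    }

    public static void main(String[] args) {
        int blocking = 16; // example value for hbase.hstore.blockingStoreFiles

        // One store is over the blocking count, so the region priority is negative.
        int blocked = regionCompactPriority(blocking, List.of(3, 20));
        System.out.println(blocked + " -> split allowed: " + passesSplitGate(blocked));
        // prints "-4 -> split allowed: false"

        // All stores are well below the blocking count.
        int healthy = regionCompactPriority(blocking, List.of(3, 5));
        System.out.println(healthy + " -> split allowed: " + passesSplitGate(healthy));
        // prints "11 -> split allowed: true"
    }
}
```

Under these assumptions a single store at or above the blocking file count is
enough to keep the whole region below PRIORITY_USER, so the split check is
skipped for as long as that store stays at or above the limit.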



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
