Andrew Kyle Purtell created HBASE-24445:
-------------------------------------------

             Summary: Improve default thread pool size for opening store files
                 Key: HBASE-24445
                 URL: https://issues.apache.org/jira/browse/HBASE-24445
             Project: HBase
          Issue Type: Improvement
            Reporter: Andrew Kyle Purtell


For each store open we create a CompletionService and also create a thread pool 
for opening and closing store files. See HStore#openStoreFiles and 
HRegion#getStoreFileOpenAndCloseThreadPool. By default this pool has only one 
thread. It can be increased with "hbase.hstore.open.and.close.threads.max" but 
this config value is then divided by the number of stores in the region.
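
The sizing rule described above can be sketched as follows. This is an 
illustration only, not the HBase source: the class and method names are 
hypothetical, and the floor of one thread is an assumption based on the stated 
default of a single thread.

```java
// Hypothetical helper illustrating the sizing rule described in this issue.
// It is not copied from HRegion#getStoreFileOpenAndCloseThreadPool.
final class StoreFilePoolSizing {
    private StoreFilePoolSizing() {}

    /** Threads available to one store's file open/close pool. */
    static int threadsPerStore(int configuredMax, int numStores) {
        int stores = Math.max(1, numStores);         // guard against zero stores
        // Integer division; assumed to be floored at one thread per store.
        return Math.max(1, configuredMax / stores);
    }
}
```

Under these assumptions, setting "hbase.hstore.open.and.close.threads.max" to 8 
on a region with three stores gives each store 8 / 3 = 2 threads, and any value 
smaller than the store count still leaves each store with one thread.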

"hbase.hstore.open.and.close.threads.max" is also used to size other thread 
pools for opening and closing the stores themselves, so it's an unfortunate 
overloading.

We should have a configuration parameter that directly and simply tunes the 
thread pool size for opening store files. Introduce a new configuration 
parameter: "hbase.hstore.hfile.open.threads.max" which will define the upper 
bound for a thread pool shared by the entire store for opening hfiles. The 
default should be 1 to preserve default behavior.
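
A minimal sketch of how the proposed parameter might be read. A plain Map 
stands in for the Hadoop Configuration object so the example is 
self-contained; the class name, the method name, and the clamping of 
non-positive values to 1 are assumptions, not part of the proposal.

```java
import java.util.Map;

// Hypothetical reader for the parameter proposed in this issue.
final class HFileOpenThreads {
    // Key name as proposed in this issue.
    static final String KEY = "hbase.hstore.hfile.open.threads.max";
    // Default of 1 preserves the current single-threaded behavior.
    static final int DEFAULT = 1;

    private HFileOpenThreads() {}

    /** Upper bound for the per-store pool that opens hfiles. */
    static int maxThreads(Map<String, String> conf) {
        String raw = conf.get(KEY);
        if (raw == null) {
            return DEFAULT;
        }
        // Assumption: values below 1 are clamped rather than rejected.
        return Math.max(1, Integer.parseInt(raw.trim()));
    }
}
```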

Once this is done, we could increase this to 2, 4, 8, or more for increased 
parallelism when opening store files without impact on other activities. The 
time required to open all storefiles often dominates the total time for 
bringing a region online. The thread pool will be shut down and eligible for 
garbage collection once all files are loaded and the store is online.
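
The lifecycle described above (a bounded pool plus a CompletionService, shut 
down once every file is loaded) might look roughly like this. It uses only 
java.util.concurrent; the class and method names are hypothetical, and each 
Callable stands in for whatever work opening one store file involves.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch; not the HStore#openStoreFiles implementation.
final class ParallelStoreFileOpener {
    private ParallelStoreFileOpener() {}

    /** Runs every open task on a pool of at most maxThreads threads. */
    static <T> List<T> openAll(List<Callable<T>> openTasks, int maxThreads)
            throws InterruptedException, ExecutionException {
        List<T> opened = new ArrayList<>(openTasks.size());
        if (openTasks.isEmpty()) {
            return opened;
        }
        // Never start more threads than there are files to open.
        int threads = Math.max(1, Math.min(maxThreads, openTasks.size()));
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            CompletionService<T> completion = new ExecutorCompletionService<>(pool);
            for (Callable<T> task : openTasks) {
                completion.submit(task);
            }
            // Results arrive in completion order, not submission order.
            for (int i = 0; i < openTasks.size(); i++) {
                opened.add(completion.take().get());
            }
            return opened;
        } finally {
            // Release the threads once loading finishes or fails; the pool
            // can then be garbage collected.
            pool.shutdownNow();
        }
    }
}
```

With maxThreads set to 1 this degenerates to sequential opening, which matches 
the default behavior the issue wants to preserve.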

The number of file-open threads should scale with the number of stores, so 
allocating the pool at the store level continues to make sense.

Longer term, we might try recursively decomposing the region open task with a 
fork-join pool so that the opening of store files can be dynamically 
parallelized. This would probably be superior, but that is conjecture pending 
a real attempt with metrics.
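
To make the fork-join idea concrete, here is one possible shape for the 
recursive decomposition. It is purely illustrative: the task class, the 
String paths, and the opener function are all hypothetical, and nothing here 
has been measured.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.RecursiveTask;
import java.util.function.Function;

// Hypothetical sketch: split a list of file paths in half until each task
// holds a single file, then open that file with the supplied function.
final class OpenFilesTask<T> extends RecursiveTask<List<T>> {
    private final List<String> paths;
    private final Function<String, T> opener;

    OpenFilesTask(List<String> paths, Function<String, T> opener) {
        this.paths = paths;
        this.opener = opener;
    }

    @Override
    protected List<T> compute() {
        if (paths.size() <= 1) {
            // Base case: open the single file (or nothing) directly.
            List<T> out = new ArrayList<>();
            for (String path : paths) {
                out.add(opener.apply(path));
            }
            return out;
        }
        int mid = paths.size() / 2;
        OpenFilesTask<T> left = new OpenFilesTask<>(paths.subList(0, mid), opener);
        OpenFilesTask<T> right = new OpenFilesTask<>(paths.subList(mid, paths.size()), opener);
        left.fork();                              // run the left half asynchronously
        List<T> rightResult = right.compute();    // run the right half in this thread
        List<T> merged = new ArrayList<>(left.join());
        merged.addAll(rightResult);               // keep the original path order
        return merged;
    }
}
```

A caller would submit it with something like 
new ForkJoinPool(parallelism).invoke(new OpenFilesTask<>(paths, opener)). 
Opening files is blocking I/O, which fork-join pools do not handle as 
naturally as CPU-bound work, so whether this beats a fixed pool is exactly the 
open question noted above.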



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
