[ 
https://issues.apache.org/jira/browse/HBASE-24436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17119749#comment-17119749
 ] 

Junhong Xu commented on HBASE-24436:
------------------------------------

{quote}so they will at least be proportional to the number of stores (as 
opposed to number of regions, which is not the same and may be substantially 
lower, imagine a table with 10 families... you are exchanging 10 fixed size 
pools for 1 fixed size pool with this proposal, seems the wrong direction)
{quote}
Ping [~apurtell]: I still don't follow the point quoted above. The current code is:
{quote}protected ThreadPoolExecutor newStoreFileOpenAndCloseThreadPool(
    final String threadNamePrefix) {
  int numStores = Math.max(1, this.htableDescriptor.getColumnFamilyCount());
  int maxThreads = Math.max(numStores,
      conf.getInt(HConstants.HSTORE_OPEN_AND_CLOSE_THREADS_MAX,
          HConstants.DEFAULT_HSTORE_OPEN_AND_CLOSE_THREADS_MAX));
  return getOpenAndCloseThreadPool(maxThreads, threadNamePrefix);
}
{quote}
So the store file open/close thread pool size is proportional to the number of 
stores whenever the store count exceeds the configured thread pool size. It is 
also somewhat costly to create a runnable to open just one or a few store files 
and then destroy it. By the 80/20 rule, we gain less from parallelism that way, 
because a few store files account for most of the time. Furthermore, we want to 
support many more regions in the future, and creating too many threads is a 
challenge for that too. I think this proposal is simply a tradeoff between 
speed and resource consumption. Looking forward to your opinion. Thanks.
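
To make the sizing rule in the quoted method concrete, here is a minimal 
sketch of the same arithmetic in isolation. The class name PoolSizeSketch, the 
method maxThreads, and the sample values are assumptions for illustration only; 
they are not HBase code and the configured value shown is not the real default.

```java
public class PoolSizeSketch {
    // Hypothetical stand-in for the sizing logic in the quoted method:
    // at least one store is assumed, and the pool never has fewer
    // threads than there are stores.
    static int maxThreads(int columnFamilyCount, int configuredMax) {
        int numStores = Math.max(1, columnFamilyCount);
        return Math.max(numStores, configuredMax);
    }

    public static void main(String[] args) {
        // Fewer stores than the configured limit: the configured limit wins.
        System.out.println(maxThreads(3, 8));   // prints 8
        // More stores than the configured limit: the store count wins.
        System.out.println(maxThreads(10, 8));  // prints 10
    }
}
```

With 10 families and an assumed configured limit of 8, the pool grows to 10 
threads, which is the "proportional to the number of stores" case described 
above.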

> The store file open and close thread pool should be shared at the region level
> ------------------------------------------------------------------------------
>
>                 Key: HBASE-24436
>                 URL: https://issues.apache.org/jira/browse/HBASE-24436
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Junhong Xu
>            Assignee: Junhong Xu
>            Priority: Minor
>
> For now, we generally divide threads evenly among column families, but in 
> some cases a few column families have many more store files than others 
> (maybe that's life, right?). In that case, some Stores finish quickly while 
> others are still struggling. We should share the thread pool at the region 
> level to handle this data skew.
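
The skew argument in the description can be illustrated with a rough 
back-of-the-envelope model. This is a sketch under strong assumptions, not a 
measurement: every store file is assumed to take one unit of time to open, 
threads are assumed to be split evenly among families in the per-family case, 
and the names SkewSketch, perFamilyRounds and sharedRounds are hypothetical.

```java
public class SkewSketch {
    // Rounds needed when each family gets an even share of the threads.
    // Assumes filesPerFamily is non-empty and every file costs one round.
    static int perFamilyRounds(int[] filesPerFamily, int totalThreads) {
        int threadsPerFamily = Math.max(1, totalThreads / filesPerFamily.length);
        int worst = 0;
        for (int files : filesPerFamily) {
            // Ceiling division: rounds this family needs with its own threads.
            int rounds = (files + threadsPerFamily - 1) / threadsPerFamily;
            worst = Math.max(worst, rounds);
        }
        return worst;
    }

    // Rounds needed when all families share one pool of totalThreads.
    static int sharedRounds(int[] filesPerFamily, int totalThreads) {
        int total = 0;
        for (int files : filesPerFamily) {
            total += files;
        }
        return (total + totalThreads - 1) / totalThreads;
    }

    public static void main(String[] args) {
        int[] skewed = {10, 1, 1}; // one family holds most of the store files
        System.out.println(perFamilyRounds(skewed, 6)); // prints 5
        System.out.println(sharedRounds(skewed, 6));    // prints 2
    }
}
```

In this model the skewed family is limited to its 2 threads and needs 5 rounds 
while the other threads sit idle, whereas a shared pool of 6 threads finishes 
all 12 files in 2 rounds. Real open times vary per file, so the actual gain 
would differ.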



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
