[ https://issues.apache.org/jira/browse/HBASE-24436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17119749#comment-17119749 ]
Junhong Xu commented on HBASE-24436:
------------------------------------

{quote}so they will at least be proportional to the number of stores (as opposed to number of regions, which is not the same and may be substantially lower, imagine a table with 10 families... you are exchanging 10 fixed size pools for 1 fixed size pool with this proposal, seems the wrong direction)
{quote}
Ping [~apurtell]

I still don't understand the point quoted above. Actually, we have
{quote}protected ThreadPoolExecutor newStoreFileOpenAndCloseThreadPool(
    final String threadNamePrefix) {
  int numStores = Math.max(1, this.htableDescriptor.getColumnFamilyCount());
  int maxThreads = Math.max(numStores,
      conf.getInt(HConstants.HSTORE_OPEN_AND_CLOSE_THREADS_MAX,
          HConstants.DEFAULT_HSTORE_OPEN_AND_CLOSE_THREADS_MAX));
  return getOpenAndCloseThreadPool(maxThreads, threadNamePrefix);
}
{quote}
So the store file open/close thread pool size is proportional to the number of stores whenever the number of stores is larger than the configured fixed thread pool size. Also, it is a bit costly to create a runnable for opening just one or a few store files and then destroy it. By the 80/20 rule, we gain little from that kind of parallelism, because a few store files account for most of the time. Furthermore, we want to support many more regions in the future, and creating too many threads is a challenge for that too. I think this proposal is just a tradeoff between speed and resource consumption.

Looking forward to your opinion. Thanks.
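To make the sizing argument concrete, here is a minimal, self-contained sketch. It is not HBase code: the class name SharedOpenPoolSketch, both method names, and the family and file names are hypothetical, and the "open" step is a stand-in that only returns the file name. It assumes the sizing rule is exactly the Math.max expression quoted above, and it shows one region-level pool draining the files of a skewed set of families.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SharedOpenPoolSketch {

  // Mirrors the quoted sizing rule: at least one thread per store,
  // or the configured maximum if that is larger.
  public static int maxThreads(int numStores, int configuredMax) {
    return Math.max(Math.max(1, numStores), configuredMax);
  }

  // Submits every store file of every family to ONE shared pool, so idle
  // threads pick up work from whichever family still has files left.
  // Returns the number of files "opened".
  public static int openAll(Map<String, List<String>> filesByFamily, int threads) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      CompletionService<String> cs = new ExecutorCompletionService<>(pool);
      int submitted = 0;
      for (Map.Entry<String, List<String>> e : filesByFamily.entrySet()) {
        for (String file : e.getValue()) {
          final String name = e.getKey() + "/" + file;
          cs.submit(() -> name); // stand-in for actually opening the store file
          submitted++;
        }
      }
      int opened = 0;
      for (int i = 0; i < submitted; i++) {
        cs.take().get(); // wait for each task to complete
        opened++;
      }
      return opened;
    } catch (InterruptedException ex) {
      Thread.currentThread().interrupt();
      throw new RuntimeException(ex);
    } catch (ExecutionException ex) {
      throw new RuntimeException(ex);
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    // Skewed layout: one family holds most of the store files.
    Map<String, List<String>> files = Map.of(
        "cf1", List.of("a", "b", "c", "d", "e", "f"),
        "cf2", List.of("g"),
        "cf3", List.of("h"));
    int threads = maxThreads(files.size(), 2);
    System.out.println("threads=" + threads + " opened=" + openAll(files, threads));
  }
}
```

With per-family pools, the threads assigned to cf2 and cf3 would go idle after one file each; with the shared pool, they keep taking files from cf1 until everything is open.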
> The store file open and close thread pool should be shared at the region level
> ------------------------------------------------------------------------------
>
>                 Key: HBASE-24436
>                 URL: https://issues.apache.org/jira/browse/HBASE-24436
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Junhong Xu
>            Assignee: Junhong Xu
>            Priority: Minor
>
> Currently, we generally allocate threads evenly per column family, but in
> some cases some column families have many more store files than others
> (maybe that's life, right?). In that case, some Stores finish quickly while
> others are still struggling. We should share the thread pool at the region
> level to handle data skew.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)