[ https://issues.apache.org/jira/browse/HBASE-22810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Purtell reopened HBASE-22810: ------------------------------------ Bisect fingers this as breaking TestFlushSnapshotFromClient on branch-1 {noformat} eb6b617d92998b887c8d811769c6a9d80b653432 is the first bad commit commit eb6b617d92998b887c8d811769c6a9d80b653432 Author: openinx <open...@gmail.com> Date: Thu Aug 15 10:57:42 2019 +0800 HBASE-22810 Initialize an separate ThreadPoolExecutor for taking/restoring snapshot (#486) .../apache/hadoop/hbase/executor/EventType.java | 4 +-- .../apache/hadoop/hbase/executor/ExecutorType.java | 1 + .../java/org/apache/hadoop/hbase/HConstants.java | 27 ++++++++++++++++ .../org/apache/hadoop/hbase/master/HMaster.java | 29 +++++++++-------- .../hadoop/hbase/executor/TestExecutorService.java | 36 ++++++++++++++++++++++ 5 files changed, 83 insertions(+), 14 deletions(-) {noformat} > Initialize an separate ThreadPoolExecutor for taking/restoring snapshot > ------------------------------------------------------------------------ > > Key: HBASE-22810 > URL: https://issues.apache.org/jira/browse/HBASE-22810 > Project: HBase > Issue Type: Improvement > Reporter: Zheng Hu > Assignee: Zheng Hu > Priority: Major > Fix For: 3.0.0, 1.5.0, 2.3.0, 2.0.6, 2.2.1, 2.1.6, 1.3.6, 1.4.11 > > > In EventType class, we have the following definition, means taking snapshot > & restoring snapshot are use the MASTER_TABLE_OPERATIONS Executor now. > {code} > /** > * Messages originating from Client to Master.<br> > * C_M_SNAPSHOT_TABLE<br> > * Client asking Master to snapshot an offline table. > */ > C_M_SNAPSHOT_TABLE (48, ExecutorType.MASTER_TABLE_OPERATIONS), > /** > * Messages originating from Client to Master.<br> > * C_M_RESTORE_SNAPSHOT<br> > * Client asking Master to restore a snapshot. > */ > C_M_RESTORE_SNAPSHOT (49, ExecutorType.MASTER_TABLE_OPERATIONS), > {code} > But when I checked the MASTER_TABLE_OPERATIONS thread pool initialization, I > see : > {code} > private void startServiceThreads() throws IOException{ > // ... some other code initializing .... > // We depend on there being only one instance of this executor running > // at a time. To do concurrency, would need fencing of enable/disable of > // tables. > // Any time changing this maxThreads to > 1, pls see the comment at > // AccessController#postCompletedCreateTableAction > > this.executorService.startExecutorService(ExecutorType.MASTER_TABLE_OPERATIONS, > 1); > startProcedureExecutor(); > {code} > That's to say, for CPs enable or disable table sequencely, we will create > a ThreadPoolExecutor with threadPoolSize=1. Then we actually cann't > accomplish the snapshoting concurrence even if they are total difference > tables, says if there are two table snapshoting request, and the Table A cost > 5min for snapshoting, then the Table B need to wait 5min and once Table A > finish its snapshot , then Table B will start the snapshot. > While we've setting the snapshot timeout, so it will be easy to timeout for > table B snapshoting . Actually, we can create a separate thead pool for > snapshot operations only. -- This message was sent by Atlassian JIRA (v7.6.14#76016)