[ https://issues.apache.org/jira/browse/HBASE-28881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated HBASE-28881:
-----------------------------------
    Labels: pull-request-available  (was: )

> Setting `hbase.master.procedure.threads` to negative value doesn't break HMaster but clients cannot connect
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-28881
>                 URL: https://issues.apache.org/jira/browse/HBASE-28881
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>   Affects Versions: 2.4.2, 2.6.0, 3.0.0-beta-1
>            Reporter: Ariadne_team
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 4.0.0-alpha-1
>
>         Attachments: HBASE-28881-000.patch, HBASE-28881-001.patch
>
>
> ============================
> Problem
> -------------------------------------------------
> When 'hbase.master.procedure.threads' is set to a negative value, as follows:
> <property>
>   <name>hbase.master.procedure.threads</name>
>   <value>-1</value>
> </property>
> HMaster starts normally, but the HBase client cannot connect to the server.
> There are also no related error messages in the HMaster logs, which makes it
> difficult for users to diagnose the root cause.
>
> The root cause appears to be in the following code. After
> 'hbase.master.procedure.threads' is parsed and loaded in
> createProcedureExecutor(), it is passed to init():
> {code:java}
> private void createProcedureExecutor() throws IOException {
>   final int numThreads = conf.getInt(MasterProcedureConstants.MASTER_PROCEDURE_THREADS, Math.max(
>     (cpus > 0 ? cpus / 4 : 0), MasterProcedureConstants.DEFAULT_MIN_MASTER_PROCEDURE_THREADS));
>   ...
>   procedureExecutor.init(numThreads, abortOnCorruption);
> } {code}
> In the {{init}} function, the parameter {{numThreads}} is used to initialize
> a series of worker threads in a loop. Because the configured value is -1, the
> loop condition is false on the first check, the loop body never runs, and no
> worker threads are created. As a result, the client is unable to connect.
> {code:java}
> for (int i = 0; i < corePoolSize; ++i) {
>   workerThreads.add(new WorkerThread(threadGroup));
> } {code}
> When this failure occurs, no error log in the HMaster points to this
> configuration parameter. We recommend adding a validation check and a
> corresponding log message for this parameter to help users diagnose the
> issue.
>
> ============================
> Solution (the attached patch)
> -------------------------------------------------
> Since {{numThreads}} is declared as final in the {{createProcedureExecutor}}
> method and cannot be modified, the patch adds logging within that method to
> record the configured value when it is not positive.
> {code:java}
> @@ -1743,6 +1743,9 @@ public class HMaster extends HBaseServerBase<MasterRpcServices> implements Maste
>      int cpus = Runtime.getRuntime().availableProcessors();
>      final int numThreads = conf.getInt(MasterProcedureConstants.MASTER_PROCEDURE_THREADS, Math.max(
>        (cpus > 0 ? cpus / 4 : 0), MasterProcedureConstants.DEFAULT_MIN_MASTER_PROCEDURE_THREADS));
> +    if (numThreads <= 0) {
> +      LOG.warn(MasterProcedureConstants.MASTER_PROCEDURE_THREADS + " is set to {}.", numThreads);
> +    }
>      final boolean abortOnCorruption =
>        conf.getBoolean(MasterProcedureConstants.EXECUTOR_ABORT_ON_CORRUPTION,
>          MasterProcedureConstants.DEFAULT_EXECUTOR_ABORT_ON_CORRUPTION);
> {code}
>
> These are the situations I encountered and a possible mitigation. If
> anything else should be added, please let me know. Thank you.
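
For illustration only, the following standalone sketch reproduces the empty-loop behaviour described in the report and shows one possible stricter alternative to the warning-only patch: falling back to the computed default when the configured value is not positive. It does not use any HBase classes. The class name `ProcedureThreadCheck`, its methods, and the constant `ASSUMED_MIN_THREADS` (a placeholder for `MasterProcedureConstants.DEFAULT_MIN_MASTER_PROCEDURE_THREADS`, whose real value is not given in the report) are all hypothetical, and the fallback behaviour is an assumption rather than what the attached patch does.

```java
import java.util.ArrayList;
import java.util.List;

public class ProcedureThreadCheck {
    // Hypothetical placeholder for DEFAULT_MIN_MASTER_PROCEDURE_THREADS;
    // the real value is not stated in the report.
    static final int ASSUMED_MIN_THREADS = 16;

    // Same shape as the default expression quoted in the report.
    static int defaultThreads(int cpus) {
        return Math.max(cpus > 0 ? cpus / 4 : 0, ASSUMED_MIN_THREADS);
    }

    // Assumed alternative to the patch: warn AND fall back to the default
    // when the configured value is zero or negative.
    static int resolveThreads(int configured, int cpus) {
        if (configured > 0) {
            return configured;
        }
        int fallback = defaultThreads(cpus);
        System.err.printf(
            "hbase.master.procedure.threads is set to %d; using %d instead.%n",
            configured, fallback);
        return fallback;
    }

    // Mirrors the worker-creation loop: with a non-positive size the body
    // never executes, so the returned count is 0.
    static int countWorkers(int corePoolSize) {
        List<String> workers = new ArrayList<>();
        for (int i = 0; i < corePoolSize; ++i) {
            workers.add("worker-" + i);
        }
        return workers.size();
    }

    public static void main(String[] args) {
        // Unvalidated value from the report: no workers are created.
        System.out.println("workers with -1: " + countWorkers(-1));
        // Validated value: the loop runs with the fallback size.
        System.out.println("workers after validation: "
            + countWorkers(resolveThreads(-1, 8)));
    }
}
```

Whether to warn only, fall back, or fail fast at startup is a design choice for the maintainers; the sketch only shows that a check before the loop prevents an empty worker pool.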