Chaopeng Luo created HBASE-28866:
------------------------------------

             Summary: Check and add logs for the configuration settings of log 
cleaner to prevent runtime errors
                 Key: HBASE-28866
                 URL: https://issues.apache.org/jira/browse/HBASE-28866
             Project: HBase
          Issue Type: Bug
          Components: master
    Affects Versions: 3.0.0-beta-1, 2.4.2
            Reporter: Chaopeng Luo
             Fix For: 3.0.0-beta-1
         Attachments: LogCleaner.patch

============================
Problem
-------------------------------------------------
HBase Master cannot be initialized with the following setting:
  <property>
    <name>hbase.oldwals.cleaner.thread.size</name>
    <value>-1</value>
    <description>Default is 2</description>
  </property>
 
After running start-hbase.sh, the Master fails to start with the following 
exception:


{code:java}
ERROR [master/localhost:16000:becomeActiveMaster] master.HMaster: Failed to become active master
java.lang.IllegalArgumentException: Illegal Capacity: -1
    at java.util.ArrayList.<init>(ArrayList.java:157)
    at org.apache.hadoop.hbase.master.cleaner.LogCleaner.createOldWalsCleaner(LogCleaner.java:149)
    at org.apache.hadoop.hbase.master.cleaner.LogCleaner.<init>(LogCleaner.java:80)
    at org.apache.hadoop.hbase.master.HMaster.startServiceThreads(HMaster.java:1329)
    at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:917)
    at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2081)
    at org.apache.hadoop.hbase.master.HMaster.lambda$0(HMaster.java:505)
    at java.lang.Thread.run(Thread.java:750){code}

This error log confused and misled us, because an 'Illegal Capacity' error 
from ArrayList looks like an internal code issue rather than a configuration 
problem.
 
After reading the source code, we found that 
"hbase.oldwals.cleaner.thread.size" is parsed and passed to the 
createOldWalsCleaner() function without any validation:
{code:java}
int size = conf.getInt(OLD_WALS_CLEANER_THREAD_SIZE, DEFAULT_OLD_WALS_CLEANER_THREAD_SIZE);
this.oldWALsCleaner = createOldWalsCleaner(size); {code}
The value of "hbase.oldwals.cleaner.thread.size" is used as the 
initialCapacity of an ArrayList. If the configured value is negative, an 
IllegalArgumentException is thrown:
{code:java}
private List<Thread> createOldWalsCleaner(int size) {
    ...
    List<Thread> oldWALsCleaner = new ArrayList<>(size);
    ...
} {code}
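The failure can be reproduced outside HBase. The following is a minimal standalone sketch, not HBase code: the class name NegativeCapacityDemo and the helper tryCreate are hypothetical names used only for illustration. It shows that a negative initial capacity is rejected by the ArrayList constructor, while zero is accepted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of HBase.
class NegativeCapacityDemo {

  // Tries to build a list with the given initial capacity.
  // Returns null on success, or the exception message on failure.
  static String tryCreate(int capacity) {
    try {
      List<Thread> threads = new ArrayList<>(capacity);
      return null;
    } catch (IllegalArgumentException e) {
      return e.getMessage();
    }
  }

  public static void main(String[] args) {
    System.out.println("capacity 2:  " + tryCreate(2));
    System.out.println("capacity 0:  " + tryCreate(0));
    System.out.println("capacity -1: " + tryCreate(-1));
  }
}
```

Since ArrayList itself accepts a capacity of zero, a value of 0 would presumably not trigger this exception, but it would also leave the cleaner with no threads to create.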
============================ 
Solution (the attached patch) 
-------------------------------------------------
The attached patch validates this value in the {{LogCleaner}} constructor: if 
it is not positive, a warning is logged and the default value is used instead. 
This will help users better diagnose the issue. The detailed patch is shown 
below.
{code:java}
@@ -78,6 +78,11 @@ public class LogCleaner extends CleanerChore<BaseLogCleanerDelegate>
       pool, params, null);
     this.pendingDelete = new LinkedBlockingQueue<>();
     int size = conf.getInt(OLD_WALS_CLEANER_THREAD_SIZE, DEFAULT_OLD_WALS_CLEANER_THREAD_SIZE);
+    if (size <= 0) {
+      LOG.warn("The size of old WALs cleaner thread is {}, which is invalid, "
+          + "the default value will be used.", size);
+      size = DEFAULT_OLD_WALS_CLEANER_THREAD_SIZE;
+    }
     this.oldWALsCleaner = createOldWalsCleaner(size);
     this.cleanerThreadTimeoutMsec = conf.getLong(OLD_WALS_CLEANER_THREAD_TIMEOUT_MSEC,
       DEFAULT_OLD_WALS_CLEANER_THREAD_TIMEOUT_MSEC);{code}
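The fallback logic of the patch can also be expressed as a small standalone helper. This is only a sketch under assumptions: the class CleanerThreadSizeCheck, the method sanitize, and the constant DEFAULT_SIZE (set to 2 to match the "Default is 2" description above) are hypothetical and do not exist in HBase, and System.err stands in for the real logger.

```java
// Hypothetical helper; mirrors the check in the patch without HBase dependencies.
class CleanerThreadSizeCheck {

  // Assumed default, taken from the "Default is 2" description in the config.
  static final int DEFAULT_SIZE = 2;

  // Returns the configured size if it is positive; otherwise warns and
  // returns the default so that startup can continue.
  static int sanitize(int configured, int defaultSize) {
    if (configured <= 0) {
      System.err.printf(
          "The size of old WALs cleaner thread is %d, which is invalid, "
              + "the default value will be used.%n", configured);
      return defaultSize;
    }
    return configured;
  }

  public static void main(String[] args) {
    System.out.println(sanitize(-1, DEFAULT_SIZE));
    System.out.println(sanitize(0, DEFAULT_SIZE));
    System.out.println(sanitize(4, DEFAULT_SIZE));
  }
}
```

Falling back to the default with a warning keeps the Master startable, and the warning names the misconfigured value. Failing fast with a descriptive exception would be an alternative design, but it is not what the attached patch does.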
Thanks!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
