[ https://issues.apache.org/jira/browse/HBASE-13988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612656#comment-14612656 ]
Enis Soztutar commented on HBASE-13988:
---------------------------------------

The patch aborts the RS with the throwable, which will get logged as well, no?
{code}
    uncaughtExceptionHandler = new UncaughtExceptionHandler() {
      @Override
      public void uncaughtException(Thread t, Throwable e) {
        abort("Uncaught exception in service thread " + t.getName(), e);
      }
    };
...
  public void abort(String reason, Throwable cause) {
    String msg = "ABORTING region server " + this + ": " + reason;
    if (cause != null) {
      LOG.fatal(msg, cause);
    } else {
      LOG.fatal(msg);
    }
{code}

We were already aborting the RS in case the leases thread dies, so it does not change the semantics. +1.

> Add exception handler for lease thread
> --------------------------------------
>
>                 Key: HBASE-13988
>                 URL: https://issues.apache.org/jira/browse/HBASE-13988
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.0.0
>            Reporter: Liu Shaohui
>            Assignee: Liu Shaohui
>            Priority: Minor
>             Fix For: 2.0.0, 1.0.2, 1.1.2, 0.98.15
>
>         Attachments: HBASE-13988-v001.diff
>
>
> In a prod cluster, a region server exited because some important threads were no longer alive. After excluding other threads from the log, we suspected the lease thread was the root cause.
> So we need to add an exception handler to the lease thread to debug why it exits in the future.
>
> {quote}
> 2015-06-29,12:46:09,222 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: One or more threads are no longer alive -- stop
> 2015-06-29,12:46:09,223 INFO org.apache.hadoop.ipc.HBaseServer: Stopping server on 21600
> ...
> 2015-06-29,12:46:09,330 INFO org.apache.hadoop.hbase.regionserver.LogRoller: LogRoller exiting.
> 2015-06-29,12:46:09,330 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Thread-37 exiting
> 2015-06-29,12:46:09,330 INFO org.apache.hadoop.hbase.regionserver.HRegionServer$CompactionChecker: regionserver21600.compactionChecker exiting
> 2015-06-29,12:46:12,403 INFO org.apache.hadoop.hbase.regionserver.HRegionServer$PeriodicMemstoreFlusher: regionserver21600.periodicFlusher exiting
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
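
For readers who want to try the pattern discussed in the comment outside of HBase, here is a minimal standalone sketch. It is not the HBase patch: the `Abortable` interface, `abortingHandler`, and `runFailingThread` are hypothetical names made up for this illustration, and the "abort" here only records a message instead of stopping a server. It only assumes the standard `Thread.UncaughtExceptionHandler` API from the JDK.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class ServiceThreadAbortSketch {

    /** Hypothetical stand-in for a server's abort(reason, cause) method. */
    interface Abortable {
        void abort(String reason, Throwable cause);
    }

    /** Builds a handler that forwards any uncaught throwable to abort(). */
    static Thread.UncaughtExceptionHandler abortingHandler(Abortable server) {
        return (thread, e) ->
            server.abort("Uncaught exception in service thread " + thread.getName(), e);
    }

    /**
     * Starts a thread that dies with an exception and returns the abort
     * message recorded by the handler (or null if the wait times out).
     */
    static String runFailingThread(String threadName) throws InterruptedException {
        AtomicReference<String> recorded = new AtomicReference<>();
        CountDownLatch aborted = new CountDownLatch(1);

        // Illustration only: record the reason and cause instead of shutting down.
        Abortable server = (reason, cause) -> {
            recorded.set("ABORTING: " + reason + " (" + cause.getMessage() + ")");
            aborted.countDown();
        };

        Thread t = new Thread(() -> {
            throw new IllegalStateException("lease check failed");
        }, threadName);
        // Without a handler, the thread would die with only a default stack trace on stderr.
        t.setUncaughtExceptionHandler(abortingHandler(server));
        t.start();

        aborted.await(5, TimeUnit.SECONDS);
        return recorded.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runFailingThread("leases-demo"));
    }
}
```

The point of the sketch is that the throwable reaches the abort path together with the thread name, so the reason a service thread died ends up in the same place as the abort message.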