[ 
https://issues.apache.org/jira/browse/HBASE-24052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17067879#comment-17067879
 ] 

Bharath Vissapragada commented on HBASE-24052:
----------------------------------------------

Ya, the second waitFor() is the mysterious one. FWIW, I found another race in 
teardown: a race between shutdown and start on {{HMaster#cleanerPool}}.

{noformat}
java.lang.NullPointerException: Chore's pool can not be null
        at org.apache.hbase.thirdparty.com.google.common.base.Preconditions.checkNotNull(Preconditions.java:897)
        at org.apache.hadoop.hbase.master.cleaner.CleanerChore.<init>(CleanerChore.java:99)
        at org.apache.hadoop.hbase.master.cleaner.CleanerChore.<init>(CleanerChore.java:81)
        at org.apache.hadoop.hbase.master.cleaner.LogCleaner.<init>(LogCleaner.java:78)
        at org.apache.hadoop.hbase.master.HMaster.startServiceThreads(HMaster.java:1451)
        at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1058)
        at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2231)
        at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:621)
        at java.lang.Thread.run(Thread.java:748)
{noformat}
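
For anyone following along, here is a minimal sketch of how that interleaving 
could produce the NPE above. It assumes the teardown path clears the pool 
reference while the startup thread is still about to build a chore around it. 
{{CleanerPoolRaceSketch}}, {{shutdown()}}, {{startChore()}} and 
{{replayRace()}} are hypothetical stand-ins, not the real HMaster code, and 
the sketch replays the losing order on a single thread rather than racing two.

```java
import java.util.Objects;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for the master; not the real HMaster class.
class CleanerPoolRaceSketch {
  // Stand-in for HMaster#cleanerPool (the real field has a different type).
  private volatile ExecutorService cleanerPool = Executors.newSingleThreadExecutor();

  // Assumed teardown behaviour: stop the pool and clear the reference.
  void shutdown() {
    ExecutorService pool = cleanerPool;
    cleanerPool = null;
    if (pool != null) {
      pool.shutdownNow();
    }
  }

  // Stand-in for the startup path that hands the pool to a chore constructor,
  // which rejects a null pool the way the precondition in the trace does.
  String startChore() {
    Objects.requireNonNull(cleanerPool, "Chore's pool can not be null");
    return "started";
  }

  // Replays the losing interleaving deterministically: shutdown wins, then
  // startup reads the cleared field and fails.
  static String replayRace() {
    CleanerPoolRaceSketch master = new CleanerPoolRaceSketch();
    master.shutdown();
    try {
      return master.startChore();
    } catch (NullPointerException e) {
      return e.getMessage();
    }
  }

  public static void main(String[] args) {
    System.out.println(replayRace());
  }
}
```

In the real test the two steps run on different threads, so the failure only 
shows up when shutdown happens to land first.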

Also, I was planning to do this: add more logging to see why the RPCs 
failed. Probably good to squash all of these into a single debug-logging 
commit? I can spin up a separate one on how to handle InterruptedExceptions 
in the ZK watchers.

{noformat}
diff --git a/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterShutdown.java b/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterShutdown.java
index a5e596f79d..4d586a96c5 100644
--- a/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterShutdown.java
+++ b/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterShutdown.java
@@ -183,10 +183,10 @@ public class TestMasterShutdown {
               LOG.debug("Shutdown RPC sent.");
               return true;
             } catch (CompletionException e) {
-              LOG.debug("Failure sending shutdown RPC.");
+              LOG.debug("Failure sending shutdown RPC.", e);
             }
           } catch (IOException|CompletionException e) {
-            LOG.debug("Failed to establish connection.");
+            LOG.debug("Failed to establish connection.", e);
           } catch (Throwable e) {
             LOG.info("Something unexpected happened.", e);
           }
{noformat}
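
To illustrate why passing the exception matters, here is a small 
self-contained sketch. It uses java.util.logging instead of the logger the 
test actually uses, purely so it runs without extra dependencies. 
{{LogWithCauseSketch}}, {{CapturingHandler}} and {{logFailure()}} are 
hypothetical names for illustration only.

```java
import java.io.IOException;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical illustration; uses java.util.logging, not the test's logger.
class LogWithCauseSketch {
  // Keeps the last record so the caller can inspect what was logged.
  static final class CapturingHandler extends Handler {
    LogRecord last;
    @Override public void publish(LogRecord record) { last = record; }
    @Override public void flush() {}
    @Override public void close() {}
  }

  // Logs the same message with or without the exception attached and
  // returns the record that reached the handler.
  static LogRecord logFailure(boolean withCause) {
    Logger log = Logger.getAnonymousLogger();
    log.setUseParentHandlers(false);
    log.setLevel(Level.ALL);
    CapturingHandler handler = new CapturingHandler();
    log.addHandler(handler);
    Exception e = new IOException("connection refused");
    if (withCause) {
      log.log(Level.FINE, "Failed to establish connection.", e);
    } else {
      log.log(Level.FINE, "Failed to establish connection.");
    }
    return handler.last;
  }

  public static void main(String[] args) {
    System.out.println("without cause: " + logFailure(false).getThrown());
    System.out.println("with cause:    " + logFailure(true).getThrown());
  }
}
```

Without the exception argument the record carries only the message, so the 
reason the RPC failed never reaches the log.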

> Add debug to TestMasterShutdown
> -------------------------------
>
>                 Key: HBASE-24052
>                 URL: https://issues.apache.org/jira/browse/HBASE-24052
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Michael Stack
>            Priority: Trivial
>         Attachments: 0001-HBASE-24052-Add-debug-to-TestMasterShutdown.patch
>
>
> Temporarily add debug to TestMasterShutdown overnight to learn more about a 
> test failure not reproducible locally.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
