[GitHub] [hbase] bharathv commented on a change in pull request #1141: HBASE-23808 [Flakey Test] TestMasterShutdown#testMasterShutdownBefore…

2020-02-11 Thread GitBox
bharathv commented on a change in pull request #1141: HBASE-23808 [Flakey Test] 
TestMasterShutdown#testMasterShutdownBefore…
URL: https://github.com/apache/hbase/pull/1141#discussion_r378018435
 
 

 ##
 File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
 ##
 @@ -2804,9 +2804,15 @@ public MemoryBoundedLogMessageBuffer 
getRegionServerFatalLogBuffer() {
* Master runs a coordinated stop of all RegionServers and then itself.
*/
   public void shutdown() throws IOException {
+if (!isInitialized()) {
+  LOG.info("Shutdown requested but we're not the active master. Proceeding 
as a stop.");
 
 Review comment:
   Well, I'm not entirely sure if this is the intended way to stop stand-by 
masters. Reading the code.. it looks like the active master coordinates a 
proper cluster shutdown. In that process, it removes the /hbase/running znode 
that the standb-bys keep a watch on. See the following code in 
ActiveMasterManager..
   
   ```
   @Override
 public void nodeDeleted(String path) {
   
   // We need to keep track of the cluster's shutdown status while
   // we wait on the current master. We consider that, if the cluster
   // was already in a "shutdown" state when we started, that this master
   // is part of a new cluster that was started shortly after the old 
cluster
   // shut down, so that state is now irrelevant. This means that the 
shutdown
   // state must be set while we wait on the active master in order
   // to shutdown this master. See HBASE-8519.
   if(path.equals(watcher.getZNodePaths().clusterStateZNode) && 
!master.isStopped()) {
 clusterShutDown.set(true);
   }
   
   ```
   Ideally they should shut themselves down if the ZK event notifications 
happen as expected. Is that not the case?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [hbase] bharathv commented on a change in pull request #1141: HBASE-23808 [Flakey Test] TestMasterShutdown#testMasterShutdownBefore…

2020-02-07 Thread GitBox
bharathv commented on a change in pull request #1141: HBASE-23808 [Flakey Test] 
TestMasterShutdown#testMasterShutdownBefore…
URL: https://github.com/apache/hbase/pull/1141#discussion_r376536908
 
 

 ##
 File path: 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterShutdown.java
 ##
 @@ -128,41 +142,50 @@ public void 
testMasterShutdownBeforeStartingAnyRegionServer() throws Exception {
 conf.setInt(ServerManager.WAIT_ON_REGIONSERVERS_MINTOSTART, 1);
 
 // Start the cluster
-final HBaseTestingUtility util = new HBaseTestingUtility(conf);
-util.startMiniDFSCluster(3);
-util.startMiniZKCluster();
-util.createRootDir();
-final LocalHBaseCluster cluster =
-new LocalHBaseCluster(conf, NUM_MASTERS, NUM_RS, HMaster.class,
-MiniHBaseCluster.MiniHBaseClusterRegionServer.class);
-final int MASTER_INDEX = 0;
-final MasterThread master = cluster.getMasters().get(MASTER_INDEX);
-master.start();
-LOG.info("Called master start on " + master.getName());
-Thread shutdownThread = new Thread("Shutdown-Thread") {
-  @Override
-  public void run() {
-LOG.info("Before call to shutdown master");
-try (Connection connection = createConnection(util); Admin admin = 
connection.getAdmin()) {
-  admin.shutdown();
-} catch (Exception e) {
-  LOG.info("Error while calling Admin.shutdown, which is expected: " + 
e.getMessage());
+LocalHBaseCluster cluster = null;
+try {
+  htu = new HBaseTestingUtility(conf);
+  htu.startMiniDFSCluster(3);
+  htu.startMiniZKCluster();
+  htu.createRootDir();
+  cluster = new LocalHBaseCluster(conf, NUM_MASTERS, NUM_RS, HMaster.class,
+MiniHBaseCluster.MiniHBaseClusterRegionServer.class);
+  final int MASTER_INDEX = 0;
+  final MasterThread master = cluster.getMasters().get(MASTER_INDEX);
+  master.start();
+  LOG.info("Called master start on " + master.getName());
+  final LocalHBaseCluster finalCluster = cluster;
+  Thread shutdownThread = new Thread("Shutdown-Thread") {
+@Override
+public void run() {
+  LOG.info("Before call to shutdown master");
+  try (Connection connection = createConnection(htu); Admin admin = 
connection.getAdmin()) {
+admin.shutdown();
 
 Review comment:
   No, not after adding this check 
https://github.com/apache/hbase/blob/d110c08dce482ef3161294ab931d2559d8e57fca/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterShutdown.java#L141
   
   This check basically works around the actual problem without fixing it. 
Ideally you could do the same because that check is anyway applied in the 
master after the branch merge.
   
   Fwiw, I think this race is exposed after committing HBASE-23764, because 
that speeds up the connections.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [hbase] bharathv commented on a change in pull request #1141: HBASE-23808 [Flakey Test] TestMasterShutdown#testMasterShutdownBefore…

2020-02-06 Thread GitBox
bharathv commented on a change in pull request #1141: HBASE-23808 [Flakey Test] 
TestMasterShutdown#testMasterShutdownBefore…
URL: https://github.com/apache/hbase/pull/1141#discussion_r376196608
 
 

 ##
 File path: 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterShutdown.java
 ##
 @@ -66,43 +78,45 @@ public void testMasterShutdown() throws Exception {
 Configuration conf = HBaseConfiguration.create();
 
 // Start the cluster
-HBaseTestingUtility htu = new HBaseTestingUtility(conf);
-StartMiniClusterOption option = StartMiniClusterOption.builder()
-
.numMasters(NUM_MASTERS).numRegionServers(NUM_RS).numDataNodes(NUM_RS).build();
-htu.startMiniCluster(option);
-MiniHBaseCluster cluster = htu.getHBaseCluster();
-
-// get all the master threads
-List masterThreads = cluster.getMasterThreads();
-
-// wait for each to come online
-for (MasterThread mt : masterThreads) {
-  assertTrue(mt.isAlive());
-}
-
-// find the active master
-HMaster active = null;
-for (int i = 0; i < masterThreads.size(); i++) {
-  if (masterThreads.get(i).getMaster().isActiveMaster()) {
-active = masterThreads.get(i).getMaster();
-break;
+try {
+  htu = new HBaseTestingUtility(conf);
+  StartMiniClusterOption option = StartMiniClusterOption.builder()
+.numMasters(NUM_MASTERS)
+.numRegionServers(NUM_RS)
+.numDataNodes(NUM_RS)
+.build();
+  final MiniHBaseCluster cluster = htu.startMiniCluster(option);
+
+  // wait for all master thread to spawn and start their run loop.
+  final long thirtySeconds = TimeUnit.SECONDS.toMillis(30);
+  final long oneSecond = TimeUnit.SECONDS.toMillis(1);
+  assertNotEquals(-1, htu.waitFor(thirtySeconds, oneSecond, () -> {
+final List masterThreads = cluster.getMasterThreads();
+return CollectionUtils.isNotEmpty(masterThreads)
 
 Review comment:
   nit: Can probably be simplified to masterThreads != null && 
masterThreads.size() >=3...
   because the second check automatically means isNotEmpty().


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [hbase] bharathv commented on a change in pull request #1141: HBASE-23808 [Flakey Test] TestMasterShutdown#testMasterShutdownBefore…

2020-02-06 Thread GitBox
bharathv commented on a change in pull request #1141: HBASE-23808 [Flakey Test] 
TestMasterShutdown#testMasterShutdownBefore…
URL: https://github.com/apache/hbase/pull/1141#discussion_r376199298
 
 

 ##
 File path: 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterShutdown.java
 ##
 @@ -128,41 +142,50 @@ public void 
testMasterShutdownBeforeStartingAnyRegionServer() throws Exception {
 conf.setInt(ServerManager.WAIT_ON_REGIONSERVERS_MINTOSTART, 1);
 
 // Start the cluster
-final HBaseTestingUtility util = new HBaseTestingUtility(conf);
-util.startMiniDFSCluster(3);
-util.startMiniZKCluster();
-util.createRootDir();
-final LocalHBaseCluster cluster =
-new LocalHBaseCluster(conf, NUM_MASTERS, NUM_RS, HMaster.class,
-MiniHBaseCluster.MiniHBaseClusterRegionServer.class);
-final int MASTER_INDEX = 0;
-final MasterThread master = cluster.getMasters().get(MASTER_INDEX);
-master.start();
-LOG.info("Called master start on " + master.getName());
-Thread shutdownThread = new Thread("Shutdown-Thread") {
-  @Override
-  public void run() {
-LOG.info("Before call to shutdown master");
-try (Connection connection = createConnection(util); Admin admin = 
connection.getAdmin()) {
-  admin.shutdown();
-} catch (Exception e) {
-  LOG.info("Error while calling Admin.shutdown, which is expected: " + 
e.getMessage());
+LocalHBaseCluster cluster = null;
+try {
+  htu = new HBaseTestingUtility(conf);
+  htu.startMiniDFSCluster(3);
+  htu.startMiniZKCluster();
+  htu.createRootDir();
+  cluster = new LocalHBaseCluster(conf, NUM_MASTERS, NUM_RS, HMaster.class,
+MiniHBaseCluster.MiniHBaseClusterRegionServer.class);
+  final int MASTER_INDEX = 0;
+  final MasterThread master = cluster.getMasters().get(MASTER_INDEX);
+  master.start();
+  LOG.info("Called master start on " + master.getName());
+  final LocalHBaseCluster finalCluster = cluster;
+  Thread shutdownThread = new Thread("Shutdown-Thread") {
+@Override
+public void run() {
+  LOG.info("Before call to shutdown master");
+  try (Connection connection = createConnection(htu); Admin admin = 
connection.getAdmin()) {
+admin.shutdown();
+  } catch (Exception e) {
+LOG.info("Error while calling Admin.shutdown, which is expected: " 
+ e.getMessage());
+  }
+  LOG.info("After call to shutdown master");
+  finalCluster.waitOnMaster(MASTER_INDEX);
 }
-LOG.info("After call to shutdown master");
-cluster.waitOnMaster(MASTER_INDEX);
+  };
+  shutdownThread.start();
+  LOG.info("Called master join on " + master.getName());
+  master.join();
+  shutdownThread.join();
+
+  List masterThreads = cluster.getMasters();
+  // make sure all the masters properly shutdown
+  assertEquals(0, masterThreads.size());
+} finally {
+  if (cluster != null) {
+cluster.shutdown();
+  }
+  if (htu != null) {
+htu.shutdownMiniZKCluster();
 
 Review comment:
   nit: You could just use shutdownMiniCluster(). It appears null safe on 
underlying minicluster


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [hbase] bharathv commented on a change in pull request #1141: HBASE-23808 [Flakey Test] TestMasterShutdown#testMasterShutdownBefore…

2020-02-06 Thread GitBox
bharathv commented on a change in pull request #1141: HBASE-23808 [Flakey Test] 
TestMasterShutdown#testMasterShutdownBefore…
URL: https://github.com/apache/hbase/pull/1141#discussion_r376199438
 
 

 ##
 File path: 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterShutdown.java
 ##
 @@ -128,41 +142,50 @@ public void 
testMasterShutdownBeforeStartingAnyRegionServer() throws Exception {
 conf.setInt(ServerManager.WAIT_ON_REGIONSERVERS_MINTOSTART, 1);
 
 // Start the cluster
-final HBaseTestingUtility util = new HBaseTestingUtility(conf);
-util.startMiniDFSCluster(3);
-util.startMiniZKCluster();
-util.createRootDir();
-final LocalHBaseCluster cluster =
-new LocalHBaseCluster(conf, NUM_MASTERS, NUM_RS, HMaster.class,
-MiniHBaseCluster.MiniHBaseClusterRegionServer.class);
-final int MASTER_INDEX = 0;
-final MasterThread master = cluster.getMasters().get(MASTER_INDEX);
-master.start();
-LOG.info("Called master start on " + master.getName());
-Thread shutdownThread = new Thread("Shutdown-Thread") {
-  @Override
-  public void run() {
-LOG.info("Before call to shutdown master");
-try (Connection connection = createConnection(util); Admin admin = 
connection.getAdmin()) {
-  admin.shutdown();
-} catch (Exception e) {
-  LOG.info("Error while calling Admin.shutdown, which is expected: " + 
e.getMessage());
+LocalHBaseCluster cluster = null;
+try {
+  htu = new HBaseTestingUtility(conf);
+  htu.startMiniDFSCluster(3);
+  htu.startMiniZKCluster();
+  htu.createRootDir();
+  cluster = new LocalHBaseCluster(conf, NUM_MASTERS, NUM_RS, HMaster.class,
+MiniHBaseCluster.MiniHBaseClusterRegionServer.class);
+  final int MASTER_INDEX = 0;
+  final MasterThread master = cluster.getMasters().get(MASTER_INDEX);
+  master.start();
+  LOG.info("Called master start on " + master.getName());
+  final LocalHBaseCluster finalCluster = cluster;
+  Thread shutdownThread = new Thread("Shutdown-Thread") {
+@Override
+public void run() {
+  LOG.info("Before call to shutdown master");
+  try (Connection connection = createConnection(htu); Admin admin = 
connection.getAdmin()) {
+admin.shutdown();
 
 Review comment:
   This is the place that the master registry exposed a race ..(shutdown goes 
missing..). Rebase will not be clean now :'(


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services