ndimiduk commented on a change in pull request #1141: HBASE-23808 [Flakey Test] TestMasterShutdown#testMasterShutdownBefore…
URL: https://github.com/apache/hbase/pull/1141#discussion_r378510163
##########
File path:
hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
##########
@@ -2804,9 +2804,15 @@ public MemoryBoundedLogMessageBuffer getRegionServerFatalLogBuffer() {
* Master runs a coordinated stop of all RegionServers and then itself.
*/
public void shutdown() throws IOException {
+ if (!isInitialized()) {
+ LOG.info("Shutdown requested but we're not the active master. Proceeding as a stop.");
Review comment:
@bharathv has the gist of it. At the point of this race condition, when all
four of these fields are still `null` at the time the RPC is received, the
master will simply do nothing. However, any master (active or backup) can
currently receive the RPC, and if its `clusterStatusTracker` is non-null, it
will delete this ZK node. From there, in the case of a backup master, the
`ActiveMasterManager` will notice and `stop` itself.

Related: it looks like there's an early-out in `ServerManager#shutdown` that
can result in a master `stop`ping without properly shutting down its procedure
store.
```java
if (onlineServers.isEmpty()) {
  // we do not synchronize here so this may cause a double stop, but not a big deal
  master.stop("OnlineServer=0 right after cluster shutdown set");
}
```
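
As an aside on the "double stop" that the code comment above tolerates: the following is a minimal sketch, not HBase code, of one way an unsynchronized `stop` could be made idempotent. `StopGuardSketch`, `ToyMaster`, and `cleanupRuns` are hypothetical names invented for illustration. The sketch only covers the double-stop case and does not address the procedure store concern.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Toy model, not HBase code: one way to make a stop() call idempotent. */
public class StopGuardSketch {

  /** Hypothetical stand-in for a master; all names here are illustrative. */
  static final class ToyMaster {
    private final AtomicBoolean stopped = new AtomicBoolean(false);
    private final AtomicInteger cleanupRuns = new AtomicInteger();

    /** Returns true only for the caller that actually performed the stop. */
    boolean stop(String why) {
      // compareAndSet lets exactly one caller win without taking a lock.
      // The reason string is ignored in this sketch.
      if (!stopped.compareAndSet(false, true)) {
        return false; // a second (double) stop becomes a no-op
      }
      cleanupRuns.incrementAndGet(); // placeholder for real cleanup work
      return true;
    }

    int cleanupRuns() {
      return cleanupRuns.get();
    }
  }

  public static void main(String[] args) {
    ToyMaster master = new ToyMaster();
    boolean first = master.stop("OnlineServer=0 right after cluster shutdown set");
    boolean second = master.stop("duplicate request");
    // prints "true false 1"
    System.out.println(first + " " + second + " " + master.cleanupRuns());
  }
}
```

Whether a guard like this is worth adding depends on how expensive or unsafe a repeated `stop` is in practice; the existing comment suggests it is currently considered harmless.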