[ 
https://issues.apache.org/jira/browse/HBASE-18609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171307#comment-16171307
 ] 

Chia-Ping Tsai commented on HBASE-18609:
----------------------------------------

{code}
-      balancer.setClusterStatus(master.getClusterStatus());
+      
balancer.setClusterStatus(master.getClusterStatus(EnumSet.of(Option.LIVE_SERVERS)));
{code}
This is a dangerous change because we are not sure whether the balancer only 
need the information of live servers. 

{code}
-    ClusterStatus current = getClusterStatus();
+    ClusterStatus current = getClusterStatus(EnumSet.of(Option.MASTER, 
Option.LIVE_SERVERS));
{code}
The change may introduce the bug because we need the information of 
{{BACK_MASTER}}. See the following code.
{code}
  protected boolean restoreMasters(ClusterStatus initial, ClusterStatus 
current) {
    List<IOException> deferred = new ArrayList<>();
    //check whether current master has changed
    final ServerName initMaster = initial.getMaster();
    if (!ServerName.isSameAddress(initMaster, current.getMaster())) {
      LOG.info("Restoring cluster - Initial active master : "
              + initMaster.getHostAndPort()
              + " has changed to : "
              + current.getMaster().getHostAndPort());
      // If initial master is stopped, start it, before restoring the state.
      // It will come up as a backup master, if there is already an active 
master.
      try {
        if (!clusterManager.isRunning(ServiceType.HBASE_MASTER,
                initMaster.getHostname(), initMaster.getPort())) {
          LOG.info("Restoring cluster - starting initial active master at:"
                  + initMaster.getHostAndPort());
          startMaster(initMaster.getHostname(), initMaster.getPort());
        }

        // master has changed, we would like to undo this.
        // 1. Kill the current backups
        // 2. Stop current master
        // 3. Start backup masters
        for (ServerName currentBackup : current.getBackupMasters()) {
          if (!ServerName.isSameAddress(currentBackup, initMaster)) {
            LOG.info("Restoring cluster - stopping backup master: " + 
currentBackup);
            stopMaster(currentBackup);
          }
        }
 ...
}
{code}
As i see it, it will be better that we just do the changes like:
{code}
   default CompletableFuture<ServerName> getMaster() {
-    return getClusterStatus().thenApply(ClusterStatus::getMaster);
+    return 
getClusterStatus(EnumSet.of(Option.MASTER)).thenApply(ClusterStatus::getMaster);
   }
{code}
As [~mdrob] mentioned previously, getClusterStatus(Set<Option>) may cause 
failure silently. A possible improvement is that we can introduce the methods, 
such as {{listLiveServers}}, {{getMaster}}, to get each of {{Option}} because 
it meets most of use cases in our code base. Also, that will be more readable 
and safe. BTW, HBASE-18131 has added the {{listDeadServers}}.

> Apply ClusterStatus#getClusterStatus(EnumSet<Option>) in code base
> ------------------------------------------------------------------
>
>                 Key: HBASE-18609
>                 URL: https://issues.apache.org/jira/browse/HBASE-18609
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Chia-Ping Tsai
>            Assignee: Reid Chan
>             Fix For: 2.0.0
>
>         Attachments: HBASE-18609.master.001.patch, 
> HBASE-18609.master.002.patch, HBASE-18609.master.003.patch, 
> HBASE-18609.master.004.patch
>
>
> HBASE-15511 enable us to get the cluster status by scope, and after 
> refactoring in HBASE-18621. We should apply it in code base so as to prevent 
> the useless information.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to