[
https://issues.apache.org/jira/browse/HBASE-18609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171307#comment-16171307
]
Chia-Ping Tsai commented on HBASE-18609:
----------------------------------------
{code}
- balancer.setClusterStatus(master.getClusterStatus());
+
balancer.setClusterStatus(master.getClusterStatus(EnumSet.of(Option.LIVE_SERVERS)));
{code}
This is a dangerous change because we are not sure whether the balancer only
need the information of live servers.
{code}
- ClusterStatus current = getClusterStatus();
+ ClusterStatus current = getClusterStatus(EnumSet.of(Option.MASTER,
Option.LIVE_SERVERS));
{code}
The change may introduce the bug because we need the information of
{{BACK_MASTER}}. See the following code.
{code}
protected boolean restoreMasters(ClusterStatus initial, ClusterStatus
current) {
List<IOException> deferred = new ArrayList<>();
//check whether current master has changed
final ServerName initMaster = initial.getMaster();
if (!ServerName.isSameAddress(initMaster, current.getMaster())) {
LOG.info("Restoring cluster - Initial active master : "
+ initMaster.getHostAndPort()
+ " has changed to : "
+ current.getMaster().getHostAndPort());
// If initial master is stopped, start it, before restoring the state.
// It will come up as a backup master, if there is already an active
master.
try {
if (!clusterManager.isRunning(ServiceType.HBASE_MASTER,
initMaster.getHostname(), initMaster.getPort())) {
LOG.info("Restoring cluster - starting initial active master at:"
+ initMaster.getHostAndPort());
startMaster(initMaster.getHostname(), initMaster.getPort());
}
// master has changed, we would like to undo this.
// 1. Kill the current backups
// 2. Stop current master
// 3. Start backup masters
for (ServerName currentBackup : current.getBackupMasters()) {
if (!ServerName.isSameAddress(currentBackup, initMaster)) {
LOG.info("Restoring cluster - stopping backup master: " +
currentBackup);
stopMaster(currentBackup);
}
}
...
}
{code}
As i see it, it will be better that we just do the changes like:
{code}
default CompletableFuture<ServerName> getMaster() {
- return getClusterStatus().thenApply(ClusterStatus::getMaster);
+ return
getClusterStatus(EnumSet.of(Option.MASTER)).thenApply(ClusterStatus::getMaster);
}
{code}
As [~mdrob] mentioned previously, getClusterStatus(Set<Option>) may cause
failure silently. A possible improvement is that we can introduce the methods,
such as {{listLiveServers}}, {{getMaster}}, to get each of {{Option}} because
it meets most of use cases in our code base. Also, that will be more readable
and safe. BTW, HBASE-18131 has added the {{listDeadServers}}.
> Apply ClusterStatus#getClusterStatus(EnumSet<Option>) in code base
> ------------------------------------------------------------------
>
> Key: HBASE-18609
> URL: https://issues.apache.org/jira/browse/HBASE-18609
> Project: HBase
> Issue Type: Improvement
> Reporter: Chia-Ping Tsai
> Assignee: Reid Chan
> Fix For: 2.0.0
>
> Attachments: HBASE-18609.master.001.patch,
> HBASE-18609.master.002.patch, HBASE-18609.master.003.patch,
> HBASE-18609.master.004.patch
>
>
> HBASE-15511 enable us to get the cluster status by scope, and after
> refactoring in HBASE-18621. We should apply it in code base so as to prevent
> the useless information.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)