Jack Yang created HBASE-27700:
---------------------------------
Summary: rolling-restart.sh stop all masters at the same time
Key: HBASE-27700
URL: https://issues.apache.org/jira/browse/HBASE-27700
Project: HBase
Issue Type: Improvement
Reporter: Jack Yang
The rolling-restart.sh script in $HBASE_HOME/bin stops all master services
(including the backup ones) at the same time, and then restarts them at the
same time:
{code:bash}
# The content of rolling-restart.sh
...
# stop all masters before re-start to avoid races for master znode
"$bin"/hbase-daemon.sh --config "${HBASE_CONF_DIR}" stop master
"$bin"/hbase-daemons.sh --config "${HBASE_CONF_DIR}" \
--hosts "${HBASE_BACKUP_MASTERS}" stop master-backup
# make sure the master znode has been deleted before continuing
zmaster=`$bin/hbase org.apache.hadoop.hbase.util.HBaseConfTool zookeeper.znode.master`
...
# all masters are down, now restart
"$bin"/hbase-daemon.sh --config "${HBASE_CONF_DIR}" ${START_CMD_DIST_MODE} master
"$bin"/hbase-daemons.sh --config "${HBASE_CONF_DIR}" \
--hosts "${HBASE_BACKUP_MASTERS}" ${START_CMD_DIST_MODE} master-backup
{code}
As a result, no HMaster is available between the stop and the restart. We can
restart the masters in a more graceful way, like this:
* Stop and restart the backup masters one by one
* Stop the active master, so that one of the backup masters becomes active
* Start the original active master, which now comes up as a backup
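The ordering above can be sketched as a dry run. This is only an illustration of the proposed sequence, not the patch itself: plan_graceful_restart is a hypothetical helper that prints the steps and does not touch a cluster, and the explicit wait for failover is an assumption that the steps above do not state.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the proposed restart order only.
# plan_graceful_restart is a hypothetical helper, not part of HBase.
# Usage: plan_graceful_restart ACTIVE_HOST BACKUP_HOST...
plan_graceful_restart() {
  local active="$1"
  shift
  local backup
  # Step 1: restart the backup masters one at a time; the active master stays up.
  for backup in "$@"; do
    echo "restart master-backup on ${backup}"
  done
  # Step 2: stop the active master so that a backup can take over.
  echo "stop master on ${active}"
  # Assumption (not stated in the issue): wait for the failover to finish.
  echo "wait for a backup master to become active"
  # Step 3: start the former active master; it rejoins as a backup.
  echo "start master on ${active}"
}
```

For example, plan_graceful_restart master1 backup1 backup2 lists one restart line per backup host, followed by the stop, wait, and start lines for master1. A real script would replace each echo with the corresponding hbase-daemon.sh or hbase-daemons.sh call.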
I will upload a patch soon.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)