Jack Yang created HBASE-27700:
---------------------------------

             Summary: rolling-restart.sh stop all masters at the same time
                 Key: HBASE-27700
                 URL: https://issues.apache.org/jira/browse/HBASE-27700
             Project: HBase
          Issue Type: Improvement
            Reporter: Jack Yang


The rolling-restart.sh script in $HBASE_HOME/bin stops all master services 
(including the backup ones) at the same time, and then restarts them at the 
same time:
{code:java}
# The content of rolling-restart.sh
...
# stop all masters before re-start to avoid races for master znode
"$bin"/hbase-daemon.sh --config "${HBASE_CONF_DIR}" stop master
"$bin"/hbase-daemons.sh --config "${HBASE_CONF_DIR}" \
--hosts "${HBASE_BACKUP_MASTERS}" stop master-backup

# make sure the master znode has been deleted before continuing
zmaster=`$bin/hbase org.apache.hadoop.hbase.util.HBaseConfTool zookeeper.znode.master`
...

# all masters are down, now restart
"$bin"/hbase-daemon.sh --config "${HBASE_CONF_DIR}" ${START_CMD_DIST_MODE} master
"$bin"/hbase-daemons.sh --config "${HBASE_CONF_DIR}" \
--hosts "${HBASE_BACKUP_MASTERS}" ${START_CMD_DIST_MODE} master-backup
{code}
As a result, no HMaster is available between the stop and the restart. We can 
restart the masters more gracefully, like this:
 * Stop the backup masters and restart them one by one
 * Stop the active master, so that one of the backup masters becomes active
 * Start the original active master, which now comes up as a backup
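
The steps above could be sketched roughly as follows. This is only an 
illustration of the ordering, not the patch for this issue: the run and 
restart_one_backup helpers, the DRY_RUN switch, the default paths and the 
one-line hosts file passed via --hosts are all assumptions made for the 
example. It assumes HBASE_BACKUP_MASTERS points to a file with one host per 
line, and by default it only prints the commands instead of running them.

```shell
#!/usr/bin/env bash
# Sketch of the proposed ordering only; NOT the actual HBASE-27700 patch.
# All defaults below are assumptions made for this example.
bin="${bin:-${HBASE_HOME:-/opt/hbase}/bin}"
HBASE_CONF_DIR="${HBASE_CONF_DIR:-${HBASE_HOME:-/opt/hbase}/conf}"
HBASE_BACKUP_MASTERS="${HBASE_BACKUP_MASTERS:-${HBASE_CONF_DIR}/backup-masters}"
START_CMD_DIST_MODE="${START_CMD_DIST_MODE:-start}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only (hypothetical switch)

# Hypothetical helper: print the command in dry-run mode, else execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical helper: restart the backup master on a single host by
# handing hbase-daemons.sh a temporary hosts file with just that host.
restart_one_backup() {
  local host="$1" hosts_file
  hosts_file="$(mktemp)"
  printf '%s\n' "$host" > "$hosts_file"
  run "$bin"/hbase-daemons.sh --config "${HBASE_CONF_DIR}" \
    --hosts "$hosts_file" stop master-backup
  run "$bin"/hbase-daemons.sh --config "${HBASE_CONF_DIR}" \
    --hosts "$hosts_file" ${START_CMD_DIST_MODE} master-backup
  rm -f "$hosts_file"
}

graceful_master_restart() {
  local host
  # 1. Restart the backup masters one by one; the active master keeps serving.
  #    The hosts file is read on fd 3 so that child commands cannot consume it.
  if [ -f "${HBASE_BACKUP_MASTERS}" ]; then
    while IFS= read -r host <&3; do
      case "$host" in ''|\#*) continue ;; esac   # skip blanks and comments
      restart_one_backup "$host"
    done 3< "${HBASE_BACKUP_MASTERS}"
  fi
  # 2. Stop the active master; one of the backup masters should take over.
  run "$bin"/hbase-daemon.sh --config "${HBASE_CONF_DIR}" stop master
  # 3. Start the original active master again; it now joins as a backup.
  run "$bin"/hbase-daemon.sh --config "${HBASE_CONF_DIR}" ${START_CMD_DIST_MODE} master
}

graceful_master_restart
```

A real script would probably also have to wait until a restarted backup 
master is up again before moving on to the next host or stopping the active 
master; the sketch leaves that out.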

I will upload a patch soon.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
