[
https://issues.apache.org/jira/browse/AMBARI-12488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alejandro Fernandez updated AMBARI-12488:
-----------------------------------------
Fix Version/s: (was: trunk)
> RU - Use haadmin failover command instead of killing ZKFC during
> upgrade/downgrade
> ----------------------------------------------------------------------------------
>
> Key: AMBARI-12488
> URL: https://issues.apache.org/jira/browse/AMBARI-12488
> Project: Ambari
> Issue Type: Story
> Components: ambari-server
> Affects Versions: 2.0.0
> Reporter: Alejandro Fernandez
> Assignee: Alejandro Fernandez
> Labels: rolling_upgrade
> Fix For: 2.1.1
>
> Attachments: AMBARI-12488.patch, AMBARI-12488.v2.patch
>
>
> Currently RU orchestration during upgrade/downgrade kills ZKFC on the active
> NameNode to initiate a failover to standby. We should instead use the
> failover command.
> E.g.,
> {code}
> su hdfs -c 'hdfs haadmin -failover nn1 nn2'
> {code}
> Where nn1 is the current NameNode, if it is the active one, and nn2 is the
> remaining NameNode.
> This is safer than killing ZKFC on the active NameNode because this command
> first tries to gracefully transition a NameNode to the Standby state. If this
> fails, the fencing methods (as configured by dfs.ha.fencing.methods) will be
> attempted until one succeeds. After this process the second NameNode will be
> transitioned to the Active state.
> It also reduces the long waits between the ZKFC kill, failover kicking in
> after a timeout, and the NN then becoming active.
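
The "if it is the active one" condition above could be checked before issuing the failover. A minimal sketch, assuming the NameNode IDs are nn1 and nn2 as in the example, that `hdfs` is on the PATH, and that the script runs as the hdfs user; the helper name `failover_if_active` is hypothetical and not part of the patch:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the AMBARI-12488 patch): fail over away from
# the first NameNode ID to the second, but only if the first is active.
failover_if_active() {
  local current="$1" other="$2" state

  # 'hdfs haadmin -getServiceState <id>' prints "active" or "standby".
  state="$(hdfs haadmin -getServiceState "$current")" || return 1

  if [ "$state" = "active" ]; then
    # Graceful transition first; fencing methods are tried only if it fails.
    hdfs haadmin -failover "$current" "$other"
  else
    echo "$current is $state; no failover needed"
  fi
}
```

For example, `failover_if_active nn1 nn2` would issue the failover only when nn1 reports itself as active, and otherwise just prints its state.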
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)