Alejandro Fernandez created AMBARI-12488:
--------------------------------------------

             Summary: RU - Use haadmin failover command instead of killing ZKFC 
during upgrade/downgrade
                 Key: AMBARI-12488
                 URL: https://issues.apache.org/jira/browse/AMBARI-12488
             Project: Ambari
          Issue Type: Story
          Components: ambari-server
    Affects Versions: 2.0.0
            Reporter: Alejandro Fernandez
            Assignee: Alejandro Fernandez
             Fix For: 2.1.2


Currently RU orchestration during upgrade/downgrade kills ZKFC on the active 
NameNode to initiate a failover to standby. We should instead use the failover 
command.
E.g.,

{code}
su hdfs -c 'hdfs haadmin -failover nn1 nn2'
{code}
Where nn1 is the current NameNode if it is the active one, and nn2 is the 
remaining NameNode.
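
A minimal sketch of how the command above might be built only when the current NameNode is actually the active one. The helper name failover_cmd and the IDs nn1/nn2 are assumptions for illustration, not part of Ambari; the state argument is expected to be the output of hdfs haadmin -getServiceState.

```shell
# Hypothetical helper: print the haadmin failover command to run,
# or print nothing if the current NameNode is not the active one.
#   $1 = current NameNode ID
#   $2 = its HA state ("active" or "standby")
#   $3 = the remaining NameNode ID
failover_cmd() {
    current="$1"; state="$2"; other="$3"
    if [ "$state" = "active" ]; then
        echo "hdfs haadmin -failover $current $other"
    fi
}

# Usage (assumes IDs nn1/nn2 and that this runs as the hdfs user):
#   state=$(hdfs haadmin -getServiceState nn1)
#   cmd=$(failover_cmd nn1 "$state" nn2)
#   [ -n "$cmd" ] && $cmd
```

If the current NameNode is already standby, the helper prints nothing and no failover is attempted.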

This is safer than killing ZKFC on the active NameNode because this command 
first tries to gracefully transition the first NameNode to the Standby state. 
If this fails, the fencing methods (as configured by dfs.ha.fencing.methods) 
will be attempted in order until one succeeds. Only after this process will 
the second NameNode be transitioned to the Active state.
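
As a sketch of how orchestration might confirm that the second NameNode ended up active after the failover, the helper below polls hdfs haadmin -getServiceState. The name wait_for_active, the retry count, and the delay are assumptions for illustration only.

```shell
# Hypothetical helper: poll until the given NameNode reports "active".
#   $1 = NameNode ID, $2 = max attempts, $3 = seconds between attempts
# Returns 0 once the NameNode is active, 1 if attempts run out.
wait_for_active() {
    id="$1"; tries="$2"; delay="$3"
    i=0
    while [ "$i" -lt "$tries" ]; do
        if [ "$(hdfs haadmin -getServiceState "$id" 2>/dev/null)" = "active" ]; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Usage (assumes ID nn2): wait_for_active nn2 10 3 || echo "nn2 is not active"
```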

It also reduces the long waits between the ZKFC kill, failover kicking in 
after a timeout, and the NN then becoming active.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
