-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30603/
-----------------------------------------------------------
Review request for Ambari, Dmitro Lisnichenko, Jonathan Hurley, Nate Cole, and
Yurii Shylov.
Bugs: AMBARI-9467
https://issues.apache.org/jira/browse/AMBARI-9467
Repository: ambari
Description
-------
UpgradeHelper sometimes appears to call the active NameNode first, but that
host ends up being the standby NameNode by the time it gets called; investigate
why.
We will follow the order in the runbook: upgrade the standby NameNode first,
then the active NameNode, which then causes a flip.
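As a rough illustration of the runbook order (not the actual UpgradeHelper
code), the sketch below sorts hosts so that standby NameNodes come before the
active one. The function name, the host names, and the state strings are
hypothetical.

```python
def order_namenodes_for_upgrade(ha_states):
    """Return host names with standby NameNodes first, then active ones.

    ha_states maps a (hypothetical) host name to "active" or "standby".
    """
    priority = {"standby": 0, "active": 1}
    # Hosts in any other state sort last; sorted() is stable, so hosts with
    # the same state keep their input order.
    return sorted(ha_states, key=lambda host: priority.get(ha_states[host], 2))


# Hypothetical two-node HA pair.
upgrade_order = order_namenodes_for_upgrade(
    {"nn1.example.com": "active", "nn2.example.com": "standby"}
)
```

The order is only as fresh as the states passed in, so it can be stale by the
time each step actually runs.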
In rare cases, if a NameNode fails for any reason, ZKFC initiates a failover,
which explains why the order may already be flipped by the time the NameNode
prepare step runs. However, the namenode_upgrade.py script works in both cases
(active first or standby first). This accounts for the rare behavior.
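The flip described above can be pictured as a mismatch between the HA states
recorded when the order was calculated and the states seen when the prepare
step runs. The sketch below only illustrates that comparison; the function and
the sample hosts are hypothetical, and in practice the current state would
have to be queried at run time (for example with
`hdfs haadmin -getServiceState`).

```python
def find_flipped_hosts(planned_states, current_states):
    """Return hosts whose HA state differs from the state seen at planning time."""
    return sorted(
        host
        for host, planned in planned_states.items()
        if current_states.get(host) != planned
    )


# Hypothetical example: ZKFC failed over between planning and execution.
planned = {"nn1.example.com": "active", "nn2.example.com": "standby"}
current = {"nn1.example.com": "standby", "nn2.example.com": "active"}
flipped = find_flipped_hosts(planned, current)
```

An empty result means the calculated order still matches the cluster; a
non-empty one means a failover happened in between.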
Another Jira covers running the namenode_upgrade script as part of the
Pre-Cluster group to make the backup, which should reduce the likelihood of a
flip happening after the order has been calculated.
Diffs
-----
ambari-server/src/main/java/org/apache/ambari/server/serveraction/upgrades/FinalizeUpgradeAction.java
fceb44d
ambari-server/src/main/java/org/apache/ambari/server/state/UpgradeHelper.java
0c6f68a
ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClusterImpl.java
db17109
ambari-server/src/main/resources/common-services/ZOOKEEPER/3.4.5.2.0/package/scripts/params.py
2484463
ambari-server/src/main/resources/common-services/ZOOKEEPER/3.4.5.2.0/package/scripts/service_check.py
338de32
ambari-server/src/main/resources/common-services/ZOOKEEPER/3.4.5.2.0/package/scripts/zookeeper_server.py
a7ca335
Diff: https://reviews.apache.org/r/30603/diff/
Testing
-------
Verified a Rolling Upgrade on a 3-node cluster with HDFS, ZK, and NameNode HA.
The flip happens rarely, but Ambari must be robust enough to handle it.
Unit tests are in progress.
Thanks,
Alejandro Fernandez