-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57981/
-----------------------------------------------------------
Review request for Ambari, Dmytro Grinenko, Dmitro Lisnichenko, Jonathan
Hurley, and Nate Cole.
Bugs: AMBARI-20593
https://issues.apache.org/jira/browse/AMBARI-20593
Repository: ambari
Description
-------
STR:
1) Install Ambari 2.5.0.1. In the ambari.properties file, set:
stack.upgrade.auto.retry.timeout.mins=6
stack.upgrade.auto.retry.check.interval.secs=30
2) Install HDP with any set of services
3) Add NameNode HA
4) Register and install new HDP stack version
5) Start RU
6) Corrupt one step from the Core Masters group (e.g., stop ambari-agent on a node
while the command is running)
Ambari will restart the "Restarting NN Batch 1" step
7) Fix the corrupted step (e.g., start ambari-agent again)
8) Corrupt another step before its command is scheduled (e.g., stop
ambari-agent on a node)
9) Fix the corrupted step (e.g., start ambari-agent again)
The expectation is that Ambari Server schedules the command on the 2nd
node. However, the command never received an original_start_time or a
start_time, so the RetryUpgradeActionService had no timestamps to compare
against and could not retry it.
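The failure mode can be illustrated with a minimal, standalone Java sketch. It is not the code from this patch: the class name, method names, and the UNSET sentinel are hypothetical, and the fallback shown is only one possible remedy, assumed here for illustration. The 6-minute value mirrors stack.upgrade.auto.retry.timeout.mins from the steps above.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; none of these names come from the Ambari source.
final class RetryWindowSketch {

    // Assumed sentinel meaning "this timestamp was never set".
    static final long UNSET = -1L;

    // Strict check: a command is retryable only while it is inside the
    // timeout window measured from its original start time. If that
    // timestamp was never set, there is nothing to compare against.
    public static boolean canRetryStrict(long originalStartMs, long nowMs, long timeoutMins) {
        if (originalStartMs <= 0) {
            return false; // no timestamp, so the command is skipped
        }
        return nowMs - originalStartMs < TimeUnit.MINUTES.toMillis(timeoutMins);
    }

    // One possible remedy (an assumption, not necessarily what the patch does):
    // fall back to another known timestamp when the original start time is unset.
    public static boolean canRetryWithFallback(long originalStartMs, long fallbackMs,
                                               long nowMs, long timeoutMins) {
        long reference = originalStartMs > 0 ? originalStartMs : fallbackMs;
        return canRetryStrict(reference, nowMs, timeoutMins);
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long oneMinuteAgo = now - TimeUnit.MINUTES.toMillis(1);

        // Command that was never scheduled: no start timestamps at all.
        System.out.println("strict, unset start:   " + canRetryStrict(UNSET, now, 6));
        // Same command, using a fallback timestamp from one minute ago.
        System.out.println("fallback, unset start: "
                + canRetryWithFallback(UNSET, oneMinuteAgo, now, 6));
    }
}
```

With the strict check, a command that failed before being scheduled is never picked up, which matches the behavior described above for the 2nd node.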
Diffs
-----
ambari-server/src/main/java/org/apache/ambari/server/actionmanager/HostRoleCommand.java
85c8e9f
ambari-server/src/main/java/org/apache/ambari/server/state/services/RetryUpgradeActionService.java
a92aa04
ambari-server/src/test/java/org/apache/ambari/server/state/services/RetryUpgradeActionServiceTest.java
e699e49
Diff: https://reviews.apache.org/r/57981/diff/1/
Testing
-------
Verified on live cluster.
Waiting for unit test results.
Thanks,
Alejandro Fernandez