-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35640/
-----------------------------------------------------------
Review request for Ambari, Dmytro Sen and Nate Cole.
Bugs: AMBARI-12012
https://issues.apache.org/jira/browse/AMBARI-12012
Repository: ambari
Description
-------
STR:
1. User registers repo version 2.3.0.0 (notice that a build number was not
provided), and clicks the Install button
2. On all of the hosts, the yum commands time out (or do a partial install),
so "hdp-select versions" reports that two versions exist (2.2.0.0-2041 and
2.3.0.0-2800). Because the install did not succeed, the command does not
return the actual_version installed (which was 2.3.0.0-2800). Note: I
reproduced this by decreasing the timeouts in the ambari.properties file to
5 minutes and adding a sleep in install_packages.py after the first package
was installed.
3. The Ambari server code then changes the state of the 2.3.0.0 version it
knows about to INSTALL_FAILED so that the user can retry, but does not update
the repo version with the actual build version that includes the build number.
4. User retries and this time it succeeds. However, the delta of the
"hdp-select versions" output is now empty, so no "actual_version" is returned.
This is really bad because Ambari needs the build number whenever it calls
"hdp-select set <comp> <version>".
5. The Ambari server code then changes the state to INSTALLED.
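
To illustrate step 4, here is a minimal Python sketch of a delta-based lookup.
It is not the actual install_packages.py code: the function name
newly_installed_versions is hypothetical, and the version lists stand in for
the "hdp-select versions" output described above.

```python
def newly_installed_versions(before, after):
    """Return the versions present after the install but not before it."""
    return sorted(set(after) - set(before))


# First attempt: the partial install adds 2.3.0.0-2800, but the command times
# out, so this delta never reaches the server.
before_first = ["2.2.0.0-2041"]
after_first = ["2.2.0.0-2041", "2.3.0.0-2800"]
first_delta = newly_installed_versions(before_first, after_first)

# Retry: both versions are already present before the install starts, so the
# delta is empty and there is nothing to report as actual_version.
retry_delta = newly_installed_versions(after_first, after_first)
```

The retry succeeds, yet the delta carries no build number, which is why the
repo version stays at 2.3.0.0.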
The fix is for install_packages.py to always return the actual_version (even in
the case of a failure) so that Ambari server can correct the database entry
(even if the command fails or times out). The correction only happens the first
time; subsequent attempts to retry installation will use the right value, so an
exact match will be found in the database.
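
A rough sketch of the idea, assuming the script can list the installed versions
at any time. The names run_install, find_actual_version, install_fn, and
list_versions_fn, and the keys of the returned dictionary, are hypothetical and
are not taken from the patch.

```python
def find_actual_version(requested_version, installed_versions):
    """Return the one installed build matching the requested version, if any."""
    matches = [
        v for v in installed_versions
        if v == requested_version or v.startswith(requested_version + "-")
    ]
    # An ambiguous or empty match is reported as None rather than guessed.
    return matches[0] if len(matches) == 1 else None


def run_install(requested_version, install_fn, list_versions_fn):
    """Run the install and report actual_version whether or not it succeeds."""
    output = {"package_installation_result": "FAIL", "actual_version": None}
    try:
        install_fn()
        output["package_installation_result"] = "SUCCESS"
    except Exception as err:
        output["error"] = str(err)
    finally:
        # Always look up the build number, even after a failure or timeout.
        output["actual_version"] = find_actual_version(
            requested_version, list_versions_fn()
        )
    return output
```

With this shape, a failed first attempt still reports 2.3.0.0-2800, so the
server can correct the repo version before the user retries.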
Diffs
-----
ambari-server/src/main/java/org/apache/ambari/server/bootstrap/DistributeRepositoriesStructuredOutput.java
f1d6aad
ambari-server/src/main/java/org/apache/ambari/server/events/listeners/upgrade/DistributeRepositoriesActionListener.java
5600ef1
ambari-server/src/main/resources/custom_actions/scripts/install_packages.py
f8b2308
Diff: https://reviews.apache.org/r/35640/diff/
Testing
-------
Reproduced the issue on a live cluster and verified that the patch worked even
when the agents reported that the packages failed to be installed.
Unit tests are in progress.
Thanks,
Alejandro Fernandez