-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38492/
-----------------------------------------------------------
(Updated Sept. 18, 2015, 12:17 p.m.)
Review request for Ambari, Alejandro Fernandez and Nate Cole.
Bugs: AMBARI-13145
https://issues.apache.org/jira/browse/AMBARI-13145
Repository: ambari
Description
-------
Aborting a failed task during an upgrade causes the entire upgrade request to
become ABORTED. The ActionScheduler has logic which will abort an entire
request if the command's success factor was not met. This logic also needs to
take into account skippable stages, which are marked as COMPLETED even when
they contain failed tasks.
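
The intended rule can be illustrated with a minimal sketch. This is not the
code in the diff: the class SkippableStageSketch, the Stage and TaskStatus
types, and the methods successFactorMet and shouldAbortRequest are all
hypothetical names chosen for illustration, and the real ActionScheduler
works with Ambari's own stage and status classes.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: every name here is hypothetical, not Ambari's API.
class SkippableStageSketch {

    // Simplified task outcomes (assumption: a reduced set of statuses).
    enum TaskStatus { COMPLETED, FAILED, TIMEDOUT }

    // Simplified stand-in for a stage and its tasks.
    static final class Stage {
        final boolean skippable;
        final double successFactor; // required fraction of successful tasks
        final List<TaskStatus> tasks;

        Stage(boolean skippable, double successFactor, List<TaskStatus> tasks) {
            this.skippable = skippable;
            this.successFactor = successFactor;
            this.tasks = tasks;
        }
    }

    // True when enough tasks completed to satisfy the success factor.
    static boolean successFactorMet(Stage stage) {
        if (stage.tasks.isEmpty()) {
            return true;
        }
        long ok = stage.tasks.stream()
                .filter(t -> t == TaskStatus.COMPLETED)
                .count();
        return (double) ok / stage.tasks.size() >= stage.successFactor;
    }

    // A skippable stage never aborts the request, even with failed tasks;
    // any other stage aborts it when the success factor is not met.
    static boolean shouldAbortRequest(Stage stage) {
        if (stage.skippable) {
            return false;
        }
        return !successFactorMet(stage);
    }

    public static void main(String[] args) {
        Stage skippable = new Stage(true, 1.0,
                Arrays.asList(TaskStatus.FAILED, TaskStatus.TIMEDOUT));
        Stage strict = new Stage(false, 1.0,
                Arrays.asList(TaskStatus.COMPLETED, TaskStatus.FAILED));
        System.out.println(shouldAbortRequest(skippable)); // false
        System.out.println(shouldAbortRequest(strict));    // true
    }
}
```

The skippable check comes first so that a stage with failed tasks can still
be treated as COMPLETED without the success factor check aborting the
whole request.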
Diffs
-----
ambari-server/src/main/java/org/apache/ambari/server/actionmanager/ActionScheduler.java
7d93638
ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java
31356bb
Diff: https://reviews.apache.org/r/38492/diff/
Testing (updated)
-------
Instrumented my environment so that every single stage failed in one of the
following two ways:
- A timeout of the task by Ambari Server (placing it into HOLDING_TIMEDOUT and
then eventually TIMEDOUT).
- A timeout of the python executor (placing it into HOLDING_FAILED and then
eventually FAILED).
mvn clean test
Tests run: 3186, Failures: 0, Errors: 0, Skipped: 25
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 23:04 min
[INFO] Finished at: 2015-09-18T11:06:17-04:00
[INFO] Final Memory: 50M/1347M
[INFO] ------------------------------------------------------------------------
Thanks,
Jonathan Hurley