-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29298/
-----------------------------------------------------------

Review request for Ambari, Alejandro Fernandez and Nate Cole.


Bugs: AMBARI-8852
    https://issues.apache.org/jira/browse/AMBARI-8852


Repository: ambari


Description
-------

During a rolling upgrade (RU), a failure occurred in the "Client Components" 
group, on the "Service Check HBASE, MAPREDUCE2, HDFS, YARN" item.
The UI presented me with a Retry button. However, the server rejected the 
resulting request:

PUT /api/v1/clusters/ysru2/upgrades/5/upgrade_groups/4/upgrade_items/30
{"UpgradeItem":{"status":"PENDING"}}

{
  "status" : 400,
  "message" : "java.lang.IllegalArgumentException: Can not transition a stage from FAILED to PENDING"
}

I believe this is the currently expected behavior, since the failure is not 
marked to hold.
However, on any service check failure, the user should be able to retry (or 
maybe on any failure, since actions should be idempotent).

----

Allow Retry - mark a stage (upgrade item) to allow any failed task to be 
retried. This means that if a failure occurs during the execution of a task, 
the stage and task will transition to HOLDING_FAILED. Once in the 
HOLDING_FAILED state, the stage can be pushed to PENDING (retry) or FAILED. 
Transitioning the stage to FAILED will cause the remaining tasks in that stage 
to be ABORTED. It never makes sense to allow the remaining tasks of a stage to 
continue executing after the stage has been accepted as FAILED. However, the 
remaining stages of the upgrade request may be allowed to execute...
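
As a rough illustration of the Allow Retry transitions described above, here is 
a minimal sketch. The names SketchStatus and RetrySketch are hypothetical and 
are not the actual Ambari classes, and treating only PENDING tasks as the 
"remaining" ones is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical status set for illustration; not the Ambari enum.
enum SketchStatus { PENDING, IN_PROGRESS, COMPLETED, HOLDING_FAILED, FAILED, ABORTED }

final class RetrySketch {
    private RetrySketch() {}

    /** Status a stage (and its task) moves to when a task fails. */
    static SketchStatus onTaskFailure(boolean allowRetry) {
        return allowRetry ? SketchStatus.HOLDING_FAILED : SketchStatus.FAILED;
    }

    /**
     * Whether a user-requested stage transition is accepted.
     * This sketch only models transitions out of HOLDING_FAILED, so
     * FAILED -> PENDING is rejected, as in the 400 response above.
     */
    static boolean canTransition(SketchStatus from, SketchStatus to) {
        if (from == SketchStatus.HOLDING_FAILED) {
            return to == SketchStatus.PENDING || to == SketchStatus.FAILED;
        }
        return false;
    }

    /**
     * Task statuses after the held stage is accepted as FAILED:
     * the held task becomes FAILED and tasks that have not started
     * (assumed here to be the PENDING ones) become ABORTED.
     */
    static List<SketchStatus> acceptFailure(List<SketchStatus> tasks) {
        List<SketchStatus> result = new ArrayList<>();
        for (SketchStatus s : tasks) {
            switch (s) {
                case HOLDING_FAILED:
                    result.add(SketchStatus.FAILED);
                    break;
                case PENDING:
                    result.add(SketchStatus.ABORTED);
                    break;
                default:
                    result.add(s);
            }
        }
        return result;
    }
}
```

With allowRetry set, a failed task leaves the stage in HOLDING_FAILED, from 
which either PENDING (retry) or FAILED is accepted.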

Skippable - mark a stage to allow it to be skipped in the event of a failure so 
that the remaining stages may still execute. This means that when a stage state 
is set to FAILED, it will not trigger the remaining stages of the request to 
abort.

By separating the concepts of retry and skippable, we can be more flexible in 
how we define the behavior of the upgrade. For example, the core masters 
upgrade item should be marked as allow_retry = true and skippable = false. If a 
failure occurs during this stage, you should be able to retry. If the failure 
cannot be resolved, then the entire upgrade request should be aborted.
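
To make the combination of the two flags concrete, here is a minimal sketch. 
UpgradeItemFlagsSketch and RequestOutcomeSketch are hypothetical names chosen 
for this example, not classes from the patch; only the allow_retry and 
skippable flags themselves come from the description above.

```java
// Hypothetical outcome of accepting a stage as FAILED; not an Ambari type.
enum RequestOutcomeSketch { CONTINUE_REMAINING_STAGES, ABORT_REMAINING_STAGES }

// Hypothetical holder for the two per-item flags described above.
final class UpgradeItemFlagsSketch {
    private final boolean allowRetry;
    private final boolean skippable;

    UpgradeItemFlagsSketch(boolean allowRetry, boolean skippable) {
        this.allowRetry = allowRetry;
        this.skippable = skippable;
    }

    /** True if a failed task holds the stage so the user can retry it. */
    boolean retryOffered() {
        return allowRetry;
    }

    /** What happens to the rest of the request once the stage is FAILED. */
    RequestOutcomeSketch onStageFailed() {
        return skippable
                ? RequestOutcomeSketch.CONTINUE_REMAINING_STAGES
                : RequestOutcomeSketch.ABORT_REMAINING_STAGES;
    }
}
```

For the core masters example, new UpgradeItemFlagsSketch(true, false) offers a 
retry, and if the stage is eventually accepted as FAILED the remaining stages 
of the request are aborted.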


Diffs
-----

  
  ambari-server/src/main/java/org/apache/ambari/server/actionmanager/ActionScheduler.java ccecad9
  ambari-server/src/main/java/org/apache/ambari/server/actionmanager/HostRoleCommand.java f71e2d5
  ambari-server/src/main/java/org/apache/ambari/server/actionmanager/Stage.java 4922fa5
  ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariActionExecutionHelper.java 17d5782
  ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariCustomCommandExecutionHelper.java c8ae61d
  ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariManagementControllerImpl.java 19ee6d9
  ambari-server/src/main/java/org/apache/ambari/server/controller/KerberosHelper.java fb19bd5
  ambari-server/src/main/java/org/apache/ambari/server/controller/internal/ClusterStackVersionResourceProvider.java 9329ea9
  ambari-server/src/main/java/org/apache/ambari/server/controller/internal/HostStackVersionResourceProvider.java 3b1b462
  ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeResourceProvider.java fa39c97
  ambari-server/src/main/java/org/apache/ambari/server/utils/StageUtils.java e6e51a1
  ambari-server/src/test/java/org/apache/ambari/server/actionmanager/ExecutionCommandWrapperTest.java 948f137
  ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionDBAccessorImpl.java a756275
  ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionManager.java 01a40f4
  ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java 8ce4ff2
  ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestStage.java bde19a1
  ambari-server/src/test/java/org/apache/ambari/server/agent/TestHeartbeatHandler.java a6df0db
  ambari-server/src/test/java/org/apache/ambari/server/controller/AmbariManagementControllerTest.java 72a22e6
  ambari-server/src/test/java/org/apache/ambari/server/serveraction/ServerActionExecutorTest.java 4bd0d18
  ambari-server/src/test/java/org/apache/ambari/server/stageplanner/TestStagePlanner.java dd2a519

Diff: https://reviews.apache.org/r/29298/diff/


Testing
-------

Results :

Tests run: 2447, Failures: 0, Errors: 0, Skipped: 13

[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 27:49 min
[INFO] Finished at: 2014-12-21T23:52:28-05:00
[INFO] Final Memory: 42M/496M
[INFO] ------------------------------------------------------------------------


Thanks,

Tom Beerbower
