-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48096/#review135934
-----------------------------------------------------------




ambari-agent/src/main/python/ambari_agent/ActionQueue.py (lines 433 - 434)
<https://reviews.apache.org/r/48096/#comment200958>

    You're introducing a new status value, `INSTALL_FAILED`, here, but it is 
only used when updating the recovery manager. Couldn't you just pass the 
`INSTALL_FAILED` literal directly to `update_current_status`?
    
    I don't see `status` being handled any differently once this new value 
is assigned.
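
    A rough sketch of what I mean. `StubRecoveryManager`, 
`report_install_failure`, and the `(component, state)` signature of 
`update_current_status` are assumptions made for illustration, not the actual 
agent code:

```python
INSTALL_FAILED = "INSTALL_FAILED"  # the literal discussed in this comment


class StubRecoveryManager(object):
    """Hypothetical stand-in; the real RecoveryManager is not reproduced here."""

    def __init__(self):
        self.statuses = {}

    def update_current_status(self, component, state):
        # Assumed signature: (component name, state string).
        self.statuses[component] = state


def report_install_failure(recovery_manager, component):
    # Pass the literal straight through instead of first assigning it to a
    # local `status` variable that nothing else reads.
    recovery_manager.update_current_status(component, INSTALL_FAILED)


manager = StubRecoveryManager()
report_install_failure(manager, "DATANODE")
print(manager.statuses)
```

    If nothing else branches on the new value, this keeps the set of values 
that `status` can take unchanged.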



ambari-agent/src/main/python/ambari_agent/ActionQueue.py (lines 438 - 439)
<https://reviews.apache.org/r/48096/#comment200956>

    These aren't needed.



ambari-agent/src/main/python/ambari_agent/ActionQueue.py (line 527)
<https://reviews.apache.org/r/48096/#comment200957>

    Not needed.



ambari-agent/src/main/python/ambari_agent/RecoveryManager.py (lines 323 - 334)
<https://reviews.apache.org/r/48096/#comment200962>

    This logic is accumulating a lot of if/else branches. Perhaps a state 
machine is in order here?
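
    As a minimal sketch of the table-driven style I have in mind: the state 
names, command names, and `next_recovery_command` below are hypothetical and 
are not the ones RecoveryManager currently uses.

```python
# Hypothetical transition table: (current state, desired state) -> command.
# Every name here is illustrative, not taken from RecoveryManager.
TRANSITIONS = {
    ("INIT", "INSTALLED"): "INSTALL",
    ("INIT", "STARTED"): "INSTALL",
    ("INSTALL_FAILED", "INSTALLED"): "INSTALL",
    ("INSTALL_FAILED", "STARTED"): "INSTALL",
    ("INSTALLED", "STARTED"): "START",
    ("STARTED", "INSTALLED"): "STOP",
}


def next_recovery_command(current, desired):
    """Return the command that moves `current` toward `desired`, or None."""
    return TRANSITIONS.get((current, desired))


print(next_recovery_command("INSTALL_FAILED", "STARTED"))
print(next_recovery_command("STARTED", "STARTED"))
```

    Supporting a new state then means adding rows to the table rather than 
another branch, and the whole table can be covered by a single parametrized 
test.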



ambari-agent/src/main/python/ambari_agent/RecoveryManager.py (lines 333 - 334)
<https://reviews.apache.org/r/48096/#comment200959>

    Not needed.



ambari-agent/src/main/python/ambari_agent/RecoveryManager.py (lines 589 - 590)
<https://reviews.apache.org/r/48096/#comment200961>

    Not needed.


- Jonathan Hurley


On May 31, 2016, 6:23 p.m., Nahappan Somasundaram wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48096/
> -----------------------------------------------------------
> 
> (Updated May 31, 2016, 6:23 p.m.)
> 
> 
> Review request for Ambari, Ajit Kumar, Jonathan Hurley, Sumit Mohanty, and 
> Sid Wagle.
> 
> 
> Bugs: AMBARI-16935
>     https://issues.apache.org/jira/browse/AMBARI-16935
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> AMBARI-16935: Retry and recover from component install failures
> 
> ** Issue **
> 
> There are multiple instances where components end up in the INSTALL_FAILED 
> state during cluster setup. Ambari does not retry or recover from the 
> INSTALL_FAILED state.
> 
> Ambari should retry and recover from installation failures.
> 
> 
> Diffs
> -----
> 
>   ambari-agent/src/main/python/ambari_agent/ActionQueue.py 
> 4a843d840dafd96023ead8b929fef33efcb9fa41 
>   ambari-agent/src/main/python/ambari_agent/RecoveryManager.py 
> 87d9483c634026897629396bb48ec0cbabfcfae6 
>   ambari-agent/src/test/python/ambari_agent/TestRecoveryManager.py 
> ed0fd2fd3cfd37f535fa14f52835ddefd376038b 
> 
> Diff: https://reviews.apache.org/r/48096/diff/
> 
> 
> Testing
> -------
> 
> ** 1. mvn clean install **
> 
> [INFO] ------------------------------------------------------------------------
> [INFO] Reactor Summary:
> [INFO]
> [INFO] Ambari Main ....................................... SUCCESS [8.346s]
> [INFO] Apache Ambari Project POM ......................... SUCCESS [0.036s]
> [INFO] Ambari Web ........................................ SUCCESS [24.196s]
> [INFO] Ambari Views ...................................... SUCCESS [1.370s]
> [INFO] Ambari Admin View ................................. SUCCESS [7.555s]
> [INFO] ambari-metrics .................................... SUCCESS [0.388s]
> [INFO] Ambari Metrics Common ............................. SUCCESS [14.289s]
> [INFO] Ambari Metrics Hadoop Sink ........................ SUCCESS [1.879s]
> [INFO] Ambari Metrics Flume Sink ......................... SUCCESS [0.951s]
> [INFO] Ambari Metrics Kafka Sink ......................... SUCCESS [1.085s]
> [INFO] Ambari Metrics Storm Sink ......................... SUCCESS [2.354s]
> [INFO] Ambari Metrics Collector .......................... SUCCESS [6.883s]
> [INFO] Ambari Metrics Monitor ............................ SUCCESS [2.126s]
> [INFO] Ambari Metrics Grafana ............................ SUCCESS [0.886s]
> [INFO] Ambari Metrics Assembly ........................... SUCCESS [1:15.977s]
> [INFO] Ambari Server ..................................... SUCCESS [3:06.681s]
> [INFO] Ambari Functional Tests ........................... SUCCESS [1.430s]
> [INFO] Ambari Agent ...................................... SUCCESS [30.176s]
> [INFO] Ambari Client ..................................... SUCCESS [0.052s]
> [INFO] Ambari Python Client .............................. SUCCESS [1.129s]
> [INFO] Ambari Groovy Client .............................. SUCCESS [2.394s]
> [INFO] Ambari Shell ...................................... SUCCESS [0.078s]
> [INFO] Ambari Python Shell ............................... SUCCESS [0.858s]
> [INFO] Ambari Groovy Shell ............................... SUCCESS [4.609s]
> [INFO] ambari-logsearch .................................. SUCCESS [0.264s]
> [INFO] Ambari Logsearch Appender ......................... SUCCESS [0.231s]
> [INFO] Ambari Logsearch Solr Client ...................... SUCCESS [4.324s]
> [INFO] Ambari Logsearch Portal ........................... SUCCESS [6.150s]
> [INFO] Ambari Logsearch Log Feeder ....................... SUCCESS [2.309s]
> [INFO] Ambari Logsearch Assembly ......................... SUCCESS [0.101s]
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 6:29.831s
> [INFO] Finished at: Tue May 31 15:21:18 PDT 2016
> [INFO] Final Memory: 294M/1039M
> [INFO] ------------------------------------------------------------------------
> 
> ** 2. mvn test -DskipSurefireTests **
> 
> ----------------------------------------------------------------------
> Ran 261 tests in 6.695s
> 
> OK
> ----------------------------------------------------------------------
> Total run:1052
> Total errors:0
> Total failures:0
> OK
> INFO: AMBARI_SERVER_LIB is not set, using default /usr/lib/ambari-server
> INFO: Return code from stack upgrade command, retcode = 0
> StackAdvisor implementation for stack HDP1, version 2.0.6 was not found
> Returning DefaultStackAdvisor implementation
> StackAdvisor implementation for stack XYZ, version 1.0.0 was loaded
> StackAdvisor implementation for stack XYZ, version 1.0.1 was loaded
> Returning XYZ101StackAdvisor implementation
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 55.370s
> [INFO] Finished at: Tue May 31 15:09:42 PDT 2016
> [INFO] Final Memory: 57M/1010M
> [INFO] ------------------------------------------------------------------------
> 
> ** 3. Manual tests **
> Deployed a single-node cluster VM and copied ActionQueue.py and 
> RecoveryManager.py from the build over to the VM. Added code to 
> ActionQueue.py that makes install commands fail at random. Verified that a 
> re-install was attempted whenever an install failed.
> 
> 
> Thanks,
> 
> Nahappan Somasundaram
> 
>
