> On March 21, 2016, 5:38 p.m., Alejandro Fernandez wrote:
> > ambari-agent/src/main/python/ambari_agent/Heartbeat.py, line 82
> > <https://reviews.apache.org/r/44986/diff/3/?file=1308889#file1308889line82>
> >
> >     What does repr of these 2 objects look like?

The only addition here was str(recovery_timestamp). Anyways, here's how the log 
looks like (before the change):

INFO 2016-03-22 00:01:24,987 Heartbeat.py:78 - Building Heartbeat: {responseId 
= 6, timestamp = 1458604884987, commandsInProgress = True, componentsMapped = 
False}

More documentation https://docs.python.org/2/library/repr.html


> On March 21, 2016, 5:38 p.m., Alejandro Fernandez wrote:
> > ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java,
> >  line 286
> > <https://reviews.apache.org/r/44986/diff/3/?file=1308892#file1308892line286>
> >
> >     It's possible for this to be called if clusterName is still null.

Yes, there is a possibility. Before creating a cluster, if we register the 
agent using the API and start it, it will attempt to register with the server. 
At this time the cluster name will be null. And subsequent in subsequent 
heartbeats, isConfigStale() is called. Though, hostname will not be null.


> On March 21, 2016, 5:38 p.m., Alejandro Fernandez wrote:
> > ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfigHelper.java,
> >  line 90
> > <https://reviews.apache.org/r/44986/diff/3/?file=1308895#file1308895line90>
> >
> >     Does this really have to be a concurrent hashmap?

With several hosts, we will endup with several heartbeats, which can occur at 
the same time. At the minimum, we need a read lock in isConfigStale().


> On March 21, 2016, 5:38 p.m., Alejandro Fernandez wrote:
> > ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfigHelper.java,
> >  line 133
> > <https://reviews.apache.org/r/44986/diff/3/?file=1308895#file1308895line133>
> >
> >     Why should empty cluster or host return true instead of raising an 
> > exception.

There is a potential for cluster name to be empty, as explained in a previous 
comment. However, hostname cannot be null.


> On March 21, 2016, 5:38 p.m., Alejandro Fernandez wrote:
> > ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentRecoveryChangedEvent.java,
> >  line 23
> > <https://reviews.apache.org/r/44986/diff/3/?file=1308899#file1308899line23>
> >
> >     Please add a description of what "Recovery Enabled" means

Updated the comment to say auto start instead of recovery.


- Nahappan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44986/#review124692
-----------------------------------------------------------


On March 21, 2016, 12:24 p.m., Nahappan Somasundaram wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/44986/
> -----------------------------------------------------------
> 
> (Updated March 21, 2016, 12:24 p.m.)
> 
> 
> Review request for Ambari, Jonathan Hurley, Nate Cole, Sumit Mohanty, and Sid 
> Wagle.
> 
> 
> Bugs: AMBARI-15474
>     https://issues.apache.org/jira/browse/AMBARI-15474
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> AMBARI-15474: Listen for changes to auto-start configuration and send them to 
> the agent during heartbeats.
> 
> In HeartbeatHandler::handleHeartBeat(), send the recovery configuration to 
> the agent. Instead of pulling the recovery configuration from the application 
> cache, implement change events and get notified whenever there are changes to 
> the recovery configuration.
> 
> **Changes**:
> 1. Instead of sending the recovery configuration to the agent during every 
> heartbeat (10 seconds interval), send the configuration only when changes are 
> detected.
> 2. Used AmbariEventPublisher to publish the changes. RecoveryConfigHelper is 
> the subscriber which listens to changes to the configuration.
> 3. RecoveryConfigHelper maintains a map of Cluster-->Host-->Timestamp which 
> is the last updated timestamp of the recovery configuration. This timestamp 
> is used as a token and is sent to the agent with the configuration. The agent 
> provides this timestamp during each heartbeat. If the timestamp from the 
> heartbeat is the same as the one on the server side, no configuration is sent 
> to the agent.
> 4. Whenever changes are detected, the timestamp is invalidated. During the 
> next heartbeat, the recovery configuration is read from the DB and sent to 
> the agent along with the latest timestamp.
> 5. Added new unit tests to test each configuration change event.
> 
> 
> Diffs
> -----
> 
>   ambari-agent/src/main/python/ambari_agent/Controller.py 
> c1c16ac97cca0d2677da2e26eee7b4949a4bf15c 
>   ambari-agent/src/main/python/ambari_agent/Heartbeat.py 
> b28644c834a2387d2fb0ad17d104224e830a0245 
>   ambari-agent/src/main/python/ambari_agent/RecoveryManager.py 
> ed537ca6928ce376c5f3e906f7a47c3f1919e309 
>   ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeat.java 
> 1430aa20cf68db071be85c4ad2ebb9acdaccecc0 
>   
> ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java
>  3a80803724bcb6ca9e4ec875aa17c9ac1c66fbe8 
>   
> ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatResponse.java
>  617b04b5b8768ba28a3c8ecb6e5e00601f153396 
>   
> ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfig.java
>  c2c18462c586292caaf40bdb6a25a8fe9a39d76c 
>   
> ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfigHelper.java
>  92a622117c71ddfc3bd2562c04ee525491bd6ffd 
>   
> ambari-server/src/main/java/org/apache/ambari/server/events/AmbariEvent.java 
> 78731854fe80c3d6087a73ebf00b0ffe9e287354 
>   
> ambari-server/src/main/java/org/apache/ambari/server/events/ClusterConfigChangedEvent.java
>  PRE-CREATION 
>   
> ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentInstalledEvent.java
>  2ce01236b3bb1ab4934fe456e5669fa474987426 
>   
> ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentRecoveryChangedEvent.java
>  PRE-CREATION 
>   
> ambari-server/src/main/java/org/apache/ambari/server/events/ServiceComponentUninstalledEvent.java
>  19712cdb18592816f82898cd7aa08a3600c7e1f4 
>   ambari-server/src/main/java/org/apache/ambari/server/state/ConfigImpl.java 
> 2cc3d0013140c9c7d3453d7c5e05c3aa02bc5794 
>   
> ambari-server/src/main/java/org/apache/ambari/server/state/ServiceComponentImpl.java
>  4be1c21bf41416681be836716ce38305cc3f287e 
>   
> ambari-server/src/main/java/org/apache/ambari/server/state/svccomphost/ServiceComponentHostImpl.java
>  3a0075e3a3c90088ca6bd2d9eca48ef50db9fdb3 
>   
> ambari-server/src/test/java/org/apache/ambari/server/agent/TestHeartbeatHandler.java
>  b1f94e38f97550123dd366984438b0bbf9c2826c 
>   
> ambari-server/src/test/java/org/apache/ambari/server/configuration/RecoveryConfigHelperTest.java
>  c86b95a7cd10b0720f0a3b9676f65a8bd68dd6c9 
>   
> ambari-server/src/test/java/org/apache/ambari/server/events/listeners/upgrade/HostVersionOutOfSyncListenerTest.java
>  1bf9d8336533bc677384314f998f10aa2f10504e 
> 
> Diff: https://reviews.apache.org/r/44986/diff/
> 
> 
> Testing
> -------
> 
> **1. mvn clean install **
> 
> 
> [INFO] 
> ------------------------------------------------------------------------
> [INFO] Reactor Summary:
> [INFO]
> [INFO] Ambari Main ....................................... SUCCESS [5.864s]
> [INFO] Apache Ambari Project POM ......................... SUCCESS [0.037s]
> [INFO] Ambari Web ........................................ SUCCESS [23.683s]
> [INFO] Ambari Views ...................................... SUCCESS [1.069s]
> [INFO] Ambari Admin View ................................. SUCCESS [5.949s]
> [INFO] ambari-metrics .................................... SUCCESS [0.534s]
> [INFO] Ambari Metrics Common ............................. SUCCESS [0.431s]
> [INFO] Ambari Metrics Hadoop Sink ........................ SUCCESS [1.060s]
> [INFO] Ambari Metrics Flume Sink ......................... SUCCESS [0.534s]
> [INFO] Ambari Metrics Kafka Sink ......................... SUCCESS [0.584s]
> [INFO] Ambari Metrics Storm Sink ......................... SUCCESS [1.472s]
> [INFO] Ambari Metrics Collector .......................... SUCCESS [5.981s]
> [INFO] Ambari Metrics Monitor ............................ SUCCESS [1.696s]
> [INFO] Ambari Metrics Grafana ............................ SUCCESS [0.819s]
> [INFO] Ambari Metrics Assembly ........................... SUCCESS [1:13.281s]
> [INFO] Ambari Server ..................................... SUCCESS [2:11.632s]
> [INFO] Ambari Functional Tests ........................... SUCCESS [0.907s]
> [INFO] Ambari Agent ...................................... SUCCESS [21.659s]
> [INFO] Ambari Client ..................................... SUCCESS [0.043s]
> [INFO] Ambari Python Client .............................. SUCCESS [0.829s]
> [INFO] Ambari Groovy Client .............................. SUCCESS [2.078s]
> [INFO] Ambari Shell ...................................... SUCCESS [0.039s]
> [INFO] Ambari Python Shell ............................... SUCCESS [0.554s]
> [INFO] Ambari Groovy Shell ............................... SUCCESS [0.811s]
> [INFO] 
> ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] 
> ------------------------------------------------------------------------
> [INFO] Total time: 4:42.176s
> [INFO] Finished at: Thu Mar 17 15:52:41 PDT 2016
> [INFO] Final Memory: 261M/1122M
> [INFO] 
> ------------------------------------------------------------------------
> 
> **2. mvn test -DskipSurefireTests **
> 
> Ran 253 tests in 6.633s
> 
> OK
> ----------------------------------------------------------------------
> Total run:926
> Total errors:0
> Total failures:0
> OK
> INFO: AMBARI_SERVER_LIB is not set, using default /usr/lib/ambari-server
> INFO: Return code from stack upgrade command, retcode = 0
> StackAdvisor implementation for stack HDP1, version 2.0.6 was not found
> Returning DefaultStackAdvisor implementation
> StackAdvisor implementation for stack XYZ, version 1.0.0 was loaded
> StackAdvisor implementation for stack XYZ, version 1.0.1 was loaded
> Returning XYZ101StackAdvisor implementation
> [INFO] 
> ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] 
> ------------------------------------------------------------------------
> [INFO] Total time: 52.037s
> [INFO] Finished at: Thu Mar 17 15:11:40 PDT 2016
> [INFO] Final Memory: 57M/1008M
> [INFO] 
> ------------------------------------------------------------------------
> 
> **3. mvn test 
> -Dtest=TestHeartbeatHandler,RecoveryConfigHelperTest,HostVersionOutOfSyncListenerTest
>  -DskipPythonTests **
> 
> -------------------------------------------------------
>  T E S T S
> -------------------------------------------------------
> Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m 
> -Djava.awt.headless=true
> Running org.apache.ambari.server.agent.TestHeartbeatHandler
> Tests run: 24, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 55.137 sec - 
> in org.apache.ambari.server.agent.TestHeartbeatHandler
> Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m 
> -Djava.awt.headless=true
> Running org.apache.ambari.server.configuration.RecoveryConfigHelperTest
> Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 22.704 sec - 
> in org.apache.ambari.server.configuration.RecoveryConfigHelperTest
> Picked up _JAVA_OPTIONS: -Xmx2048m -XX:MaxPermSize=512m 
> -Djava.awt.headless=true
> Running 
> org.apache.ambari.server.events.listeners.upgrade.HostVersionOutOfSyncListenerTest
> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 19.333 sec - 
> in 
> org.apache.ambari.server.events.listeners.upgrade.HostVersionOutOfSyncListenerTest
> 
> Results :
> 
> Tests run: 36, Failures: 0, Errors: 0, Skipped: 0
> 
> [INFO] 
> ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] 
> ------------------------------------------------------------------------
> [INFO] Total time: 1:59.372s
> [INFO] Finished at: Thu Mar 17 15:55:50 PDT 2016
> [INFO] Final Memory: 55M/1018M
> [INFO] 
> ------------------------------------------------------------------------
> 
> **4. Manual tests **
> 
> *Setup*
> *  Deployed a VM with the latest **trunk** bits and replaced 
> **ambari-server.jar** from the development build.
> *  Replaced **agent python** files from development.
> 
> *Tests*
> curl -u admin:admin -H "X-Requested-By: ambari" -X PUT 
> http://localhost:8080/api/v1/clusters/c1/components/METRICS_COLLECTOR -d 
> '{"ServiceComponentInfo":{"recovery_enabled":"false"}}'
> 
> In the Ambari Web UI, METRICS_COLLECTOR was set to Maintenance Mode.
> 
> 
> *Verification using the debugger*
> Before these tests were run, it was verified that the recovery configuration 
> was sent only during registration and not during the heartbeats since nothing 
> was modified.
> 
> For both the tests above, it was verified that events were published for the 
> above changes and that RecoveryConfigHelper captured those events and 
> invalidated the timestamp. During the next heartbeat, the new recovery 
> configuration information was pulled from the DB and send to the agent. It 
> was also verified that for subsequent heartbeats, the recovery configuration 
> was not sent.
> 
> 
> Thanks,
> 
> Nahappan Somasundaram
> 
>

Reply via email to