-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59684/
-----------------------------------------------------------

Review request for Ambari, Attila Doroszlai and Myroslav Papirkovskyy.


Bugs: AMBARI-21142
    https://issues.apache.org/jira/browse/AMBARI-21142


Repository: ambari


Description
-------

Currently, if Ambari server-agent communication gets out of sync, only limited 
information is logged. This makes it difficult to troubleshoot the root cause 
and to understand why the application cannot recover from it.


Diffs
-----

  ambari-agent/src/main/python/ambari_agent/Controller.py 83f1da8 
  ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java 6b93462 


Diff: https://reviews.apache.org/r/59684/diff/1/


Testing
-------

Manual testing: the connection to the server was lost while the agent was 
processing a heartbeat response:

ambari-agent log:
ERROR 2017-05-31 09:45:52,778 Controller.py:469 - Connection to ambari-server.node.dc1.consul was lost (details=Simulating connection loss on heartbeat response !!!)
...
INFO 2017-05-31 09:45:53,921 Controller.py:438 - Reconnected to https://ambari-server.node.dc1.consul:8441/agent/v1/heartbeat/ambari-agent-1.node.dc1.consul

ambari-server log:
31 May 2017 09:45:53,918  WARN [qtp-ambari-agent-37] HeartBeatHandler:212 - Old responseId=103 received form host ambari-agent-1.node.dc1.consul - response was lost - returning cached response with responseId=104
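
For readers unfamiliar with the responseId handshake, the behaviour visible in 
the logs above can be sketched roughly as follows. This is only an 
illustrative Python model, not the code in HeartBeatHandler.java or 
Controller.py: the class HeartbeatTracker, its methods, and the shape of the 
response dictionaries are all assumptions made for the example.

```python
import logging

logger = logging.getLogger("heartbeat_sketch")


class HeartbeatTracker:
    """Hypothetical server-side model of responseId tracking.

    Not Ambari's implementation; names and data shapes are assumptions.
    """

    def __init__(self):
        # host name -> (id of the last response sent, cached copy of it)
        self._last = {}

    def register(self, host):
        # Assumption: counting starts at 0 after registration.
        self._last[host] = (0, None)

    def handle(self, host, agent_response_id):
        if host not in self._last:
            logger.error("Heartbeat from unregistered host %s", host)
            return {"action": "re-register"}

        last_id, cached = self._last[host]

        if agent_response_id == last_id:
            # In sync: the agent saw our last response, so issue the next one.
            response = {"responseId": last_id + 1}
            self._last[host] = (last_id + 1, response)
            return response

        if agent_response_id == last_id - 1 and cached is not None:
            # The agent never received our last response: resend the cached one.
            logger.warning(
                "Old responseId=%s received from host %s - response was lost"
                " - returning cached response with responseId=%s",
                agent_response_id, host, last_id,
            )
            return cached

        # Anything else is treated as out of sync; log both ids for diagnosis.
        logger.error(
            "responseId=%s from host %s is out of sync, expected %s",
            agent_response_id, host, last_id,
        )
        return {"action": "re-register"}
```

In this model, a repeated heartbeat carrying the previous id gets the cached 
response back (the case shown in the server log), while any other mismatch is 
logged with both ids and answered with a hypothetical re-register request.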


Unit tests:

Agent:
Ran 452 tests in 30.156s

OK

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] Ambari Main ........................................ SUCCESS [  0.529 s]
[INFO] Apache Ambari Project POM .......................... SUCCESS [  0.004 s]
[INFO] utility ............................................ SUCCESS [  2.002 s]
[INFO] Ambari Agent ....................................... SUCCESS [ 48.909 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS


Server:
Total run:1160
Total errors:0
Total failures:0
OK

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] Ambari Main ........................................ SUCCESS [  0.528 s]
[INFO] Apache Ambari Project POM .......................... SUCCESS [  0.003 s]
[INFO] Ambari Views ....................................... SUCCESS [  2.305 s]
[INFO] utility ............................................ SUCCESS [  1.066 s]
[INFO] ambari-metrics ..................................... SUCCESS [  0.298 s]
[INFO] Ambari Metrics Common .............................. SUCCESS [  4.904 s]
[INFO] Ambari Server ...................................... SUCCESS [48:33 min]
[INFO] ------------------------------------------------------------------------


Thanks,

Sebastian Toader
