rhtyd commented on issue #3505: Agent LB for CloudStack failed
URL: https://github.com/apache/cloudstack/issues/3505#issuecomment-515909527
 
 
   I looked at the original log again:
   ```
   2019-07-15 23:22:39,340 DEBUG [cloud.agent.Agent] (UgentTask-5:null) (logid:) Sending ping: Seq 1-19: { Cmd , MgmtId: -1, via: 1, Ver : v1, Flags: 11, [{"com.cloud.agent.api.PingRoutingWithNwGroupsCommand":{"newGroupStates":{},"_hostVmStateReport":{},"_gatewayAccessible":true,"_vnetAccessible":true,"hostType":"Routing","hostId":1,"wait":0}}] }
   2019-07-15 23:23:09,960 DEBUG [utils.nio.NioConnection] (Agent-NioConnectionHandler-1:null) (logid:) Location 1: Socket Socket[addr=/172.17.1.142,port=8250,localport= 34854] closed on read. Probably -1 returned: No route to host
   ```
   
   This happens only initially. Once the exception is hit, usually while trying 
to send a request (ping) to the management server, the agent calls cancel() on 
the StartupTask, which kicks in the reconnection task the next time it runs. 
This means that, theoretically, the watch task should fail after about 2x the 
ping interval, and 2x the wait time is then about 180s (default/hard-coded), or 
3 mins. After every mgmt server connection failure, the time it takes to check 
whether the mgmt server is up doubles. If you force the failure once, it will 
run for 6 mins, and the overall wait then doubles to 12 mins. This checks out 
with your observation of ~15 mins of reconnection.
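
   As a rough illustration of that doubling (a sketch only, not the actual 
Agent.java logic: the function name is hypothetical, and it assumes the wait 
starts at the hard-coded 180s and simply doubles on each consecutive failure):

```python
# Hypothetical sketch of the doubling described above; this is NOT CloudStack code.
# Assumption: the wait starts at the hard-coded ~180s and doubles per failure.
BASE_WAIT_SECONDS = 180


def wait_after_failures(failures: int, base: int = BASE_WAIT_SECONDS) -> int:
    """Return the wait in seconds after `failures` consecutive connection failures."""
    if failures < 0:
        raise ValueError("failures must be non-negative")
    return base * (2 ** failures)


if __name__ == "__main__":
    # Print the first few waits in seconds and whole minutes.
    for n in range(3):
        seconds = wait_after_failures(n)
        print(f"after {n} failure(s): {seconds}s ({seconds // 60} min)")
```

   Under those assumptions the waits come out as 3, 6 and 12 mins, which is the 
progression described above.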
   
   Here's what you can try: lower the `ping.interval` configured in your 
environment; this will ensure that failures are detected quickly. On the 
downside, the wait time for the timer is hard-coded to 3 mins, which could be 
made tunable in a future minor version: 
https://github.com/apache/cloudstack/blob/master/agent/src/main/java/com/cloud/agent/Agent.java#L141
