rhtyd commented on issue #3505: Agent LB for CloudStack failed
URL: https://github.com/apache/cloudstack/issues/3505#issuecomment-515909527

I looked at the original log again:

```
2019-07-15 23:22:39,340 DEBUG [cloud.agent.Agent] (UgentTask-5:null) (logid:) Sending ping: Seq 1-19: { Cmd , MgmtId: -1, via: 1, Ver : v1, Flags: 11, [{"com.cloud.agent.api.PingRoutingWithNwGroupsCommand":{"newGroupStates":{},"_hostVmStateReport":{},"_gatewayAccessible":true,"_vnetAccessible":true,"hostType":"Routing","hostId":1,"wait":0}}] }
2019-07-15 23:23:09,960 DEBUG [utils.nio.NioConnection] (Agent-NioConnectionHandler-1:null) (logid:) Location 1: Socket Socket[addr=/172.17.1.142,port=8250,localport=34854] closed on read. Probably -1 returned: No route to host
```

This happens only initially. After the exception is hit, usually while trying to send a request (ping) to the management server, the agent calls cancel() on the StartupTask, which kicks in the reconnection task the next time it runs. This means that, theoretically, the watch task fails after 2x the ping interval, and then 2x the wait time is about 180s (default/hard-coded), or 3 mins. After every management server connection failure, the time it takes to check whether the management server is up doubles. If you force the failure once to reproduce this, it will run for 6 mins, and the overall wait then doubles to 12 mins. This checks out with your observation of ~15 mins for reconnection.

Here's what you can try: lower the `ping.interval` configured in your environment; this will ensure that failures are detected quickly. On the downside, the wait time for the timer is hard-coded to 3 mins, which can be made tunable in a future minor version:
https://github.com/apache/cloudstack/blob/master/agent/src/main/java/com/cloud/agent/Agent.java#L141
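
To make the doubling arithmetic above concrete, here is a minimal sketch in Python. It is only a simplified model of the behaviour described in this comment, not the actual logic in Agent.java: the names `BASE_WAIT_SECONDS` and `wait_after_failures` are hypothetical, and it assumes the wait starts at the hard-coded 180s and doubles after each consecutive connection failure.

```python
# Simplified model of the doubling wait described above.
# Assumption: the wait starts at 180 s and doubles per consecutive failure.
# These names are illustrative only and do not exist in CloudStack.
BASE_WAIT_SECONDS = 180


def wait_after_failures(failures: int, base: int = BASE_WAIT_SECONDS) -> int:
    """Return the wait in seconds after `failures` consecutive failures."""
    if failures < 0:
        raise ValueError("failures must be non-negative")
    return base * 2 ** failures


if __name__ == "__main__":
    for n in range(3):
        print(f"failures={n}: wait {wait_after_failures(n) // 60} min")
```

Under these assumptions the wait goes 3, 6, then 12 mins, which is in the same range as the reported ~15 mins.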
