rhtyd edited a comment on issue #3505: Agent LB for CloudStack failed
URL: https://github.com/apache/cloudstack/issues/3505#issuecomment-515899515
 
 
   From email thread on dev@:
   ```
   2019-07-18 15:26:23,420 INFO  [utils.nio.NioClient] (Agent-Handler-2:null) (logid:) Connecting to 172.17.1.142:8250
   2019-07-18 15:26:26,427 ERROR [utils.nio.NioConnection] (Agent-Handler-2:null) (logid:) Unable to initialize the threads.
   java.net.NoRouteToHostException: No route to host
         at sun.nio.ch.Net.connect0(Native Method)
         at sun.nio.ch.Net.connect(Net.java:454)
         at sun.nio.ch.Net.connect(Net.java:446)
         at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:648)
         at com.cloud.utils.nio.NioClient.init(NioClient.java:56)
         at com.cloud.utils.nio.NioConnection.start(NioConnection.java:95)
         at com.cloud.agent.Agent.reconnect(Agent.java:517)
         at com.cloud.agent.Agent$ServerHandler.doTask(Agent.java:1091)
         at com.cloud.utils.nio.Task.call(Task.java:83)
         at com.cloud.utils.nio.Task.call(Task.java:29)
         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
         at java.lang.Thread.run(Thread.java:748)
   2019-07-18 15:26:26,432 INFO  [utils.exception.CSExceptionErrorCode] (Agent-Handler-2:null) (logid:) Could not find exception: com.cloud.utils.exception.NioConnectionException in error code list for exceptions
   2019-07-18 15:26:26,432 WARN  [cloud.agent.Agent] (Agent-Handler-2:null) (logid:) NIO Connection Exception com.cloud.utils.exception.NioConnectionException: No route to host
   2019-07-18 15:26:26,432 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) (logid:) Attempted to connect to the server, but received an unexpected exception, trying again...
   2019-07-18 15:26:26,432 INFO  [utils.nio.NioClient] (Agent-Handler-2:null) (logid:) NioClient connection closed
   2019-07-18 15:26:31,433 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) (logid:) Reconnecting to host:172.17.1.141
   ```
   
   The exception is thrown after attempting to connect for 3 minutes, which is
a reasonable timeout. Once the agent decides to reconnect, it sleeps for up to
3 minutes, depending on the configured backoff/sleep algorithm:
https://github.com/apache/cloudstack/blob/master/agent/src/main/java/com/cloud/agent/Agent.java#L528
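
To illustrate the kind of capped sleep described above, here is a minimal sketch of a doubling backoff. It is not the code at the linked line: the class `ReconnectBackoff`, its methods, and the 5-second starting delay are hypothetical, chosen only for illustration, and the 3-minute cap simply mirrors the "up to 3 minutes" figure mentioned above.

```java
/**
 * Hypothetical sketch of a capped, doubling reconnect backoff.
 * NOT CloudStack's actual implementation; all names are assumptions.
 */
final class ReconnectBackoff {
    private final long initialMillis;
    private final long maxMillis;
    private long currentMillis;

    ReconnectBackoff(long initialMillis, long maxMillis) {
        if (initialMillis <= 0 || maxMillis < initialMillis) {
            throw new IllegalArgumentException("need 0 < initialMillis <= maxMillis");
        }
        this.initialMillis = initialMillis;
        this.maxMillis = maxMillis;
        this.currentMillis = initialMillis;
    }

    /** Returns the delay to sleep before the next attempt, then doubles it up to the cap. */
    long nextDelayMillis() {
        long delay = currentMillis;
        // Compare against maxMillis / 2 so the doubling can never overflow or exceed the cap.
        currentMillis = (currentMillis > maxMillis / 2) ? maxMillis : currentMillis * 2;
        return delay;
    }

    /** Call after a successful connection so the next outage starts from the short delay. */
    void reset() {
        currentMillis = initialMillis;
    }

    public static void main(String[] args) {
        // Assumed values: start at 5 seconds, never sleep longer than 3 minutes.
        ReconnectBackoff backoff = new ReconnectBackoff(5_000L, 180_000L);
        for (int attempt = 1; attempt <= 8; attempt++) {
            System.out.println("attempt " + attempt + ": sleep " + backoff.nextDelayMillis() + " ms");
        }
    }
}
```

With these assumed values the successive delays are 5s, 10s, 20s, 40s, 80s and 160s, and every later delay stays at the 180s cap until `reset()` is called.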
   
   From the logs, it seems the KVM host was disconnected from the management
server host for only `6mins`, not 15 minutes. I think it is perfectly
reasonable to wait a few minutes before the KVM agent decides to switch.
Instantaneous switching between management servers, without proper socket and
sleep timeouts, can cause a large number of ownership switches and extra
management traffic.
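
The argument for waiting before a switch can be sketched as a hold-down rule: the agent keeps retrying the same management server and only moves to the next one once failures have persisted for a minimum period. This is an illustration only, not how the CloudStack agent is written; `ManagementServerSelector`, its methods, and the 3-minute hold-down are hypothetical, and the two addresses are the ones that appear in the log above.

```java
/**
 * Hypothetical sketch of a hold-down rule for switching management servers.
 * NOT CloudStack's actual implementation; all names are assumptions.
 */
final class ManagementServerSelector {
    private final String[] hosts;
    private final long holdDownMillis;
    private int index = 0;
    private boolean failing = false;
    private long failingSinceMillis = 0L;

    ManagementServerSelector(long holdDownMillis, String... hosts) {
        if (hosts.length == 0 || holdDownMillis < 0) {
            throw new IllegalArgumentException("need at least one host and a non-negative hold-down");
        }
        this.holdDownMillis = holdDownMillis;
        this.hosts = hosts.clone();
    }

    String current() {
        return hosts[index];
    }

    /** A successful connection clears the failure window for the current host. */
    void onConnected() {
        failing = false;
    }

    /**
     * Records a failed attempt at the given time and returns the host to try next.
     * The host only changes once failures have lasted at least holdDownMillis.
     */
    String onFailure(long nowMillis) {
        if (!failing) {
            failing = true;
            failingSinceMillis = nowMillis;
        }
        if (nowMillis - failingSinceMillis >= holdDownMillis) {
            index = (index + 1) % hosts.length; // rotate to the next server
            failing = false;                    // new host starts with a fresh window
        }
        return current();
    }

    public static void main(String[] args) {
        ManagementServerSelector selector =
                new ManagementServerSelector(180_000L, "172.17.1.142", "172.17.1.141");
        System.out.println(selector.onFailure(0L));        // still the first host
        System.out.println(selector.onFailure(60_000L));   // still the first host
        System.out.println(selector.onFailure(180_000L));  // hold-down elapsed, switches
    }
}
```

Passing the timestamp in as a parameter keeps the rule easy to test. With a hold-down of zero the selector switches on every failure, which is the instantaneous behaviour the comment argues against.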

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

Reply via email to