rhtyd commented on issue #3505: Agent LB for CloudStack failed
URL: https://github.com/apache/cloudstack/issues/3505#issuecomment-515899515
 
 
   From email thread on dev@:
   ```
   2019-07-18 15:26:23,420 INFO  [utils.nio.NioClient] (Agent-Handler-2:null) (logid:) Connecting to 172.17.1.142:8250
   2019-07-18 15:26:26,427 ERROR [utils.nio.NioConnection] (Agent-Handler-2:null) (logid:) Unable to initialize the threads.
   java.net.NoRouteToHostException: No route to host
         at sun.nio.ch.Net.connect0(Native Method)
         at sun.nio.ch.Net.connect(Net.java:454)
         at sun.nio.ch.Net.connect(Net.java:446)
         at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:648)
         at com.cloud.utils.nio.NioClient.init(NioClient.java:56)
         at com.cloud.utils.nio.NioConnection.start(NioConnection.java:95)
         at com.cloud.agent.Agent.reconnect(Agent.java:517)
         at com.cloud.agent.Agent$ServerHandler.doTask(Agent.java:1091)
         at com.cloud.utils.nio.Task.call(Task.java:83)
         at com.cloud.utils.nio.Task.call(Task.java:29)
         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
         at java.lang.Thread.run(Thread.java:748)
   2019-07-18 15:26:26,432 INFO  [utils.exception.CSExceptionErrorCode] (Agent-Handler-2:null) (logid:) Could not find exception: com.cloud.utils.exception.NioConnectionException in error code list for exceptions
   2019-07-18 15:26:26,432 WARN  [cloud.agent.Agent] (Agent-Handler-2:null) (logid:) NIO Connection Exception com.cloud.utils.exception.NioConnectionException: No route to host
   2019-07-18 15:26:26,432 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) (logid:) Attempted to connect to the server, but received an unexpected exception, trying again...
   2019-07-18 15:26:26,432 INFO  [utils.nio.NioClient] (Agent-Handler-2:null) (logid:) NioClient connection closed
   2019-07-18 15:26:31,433 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) (logid:) Reconnecting to host:172.17.1.141
   ```
   
   The exception is thrown after attempting to connect for 3 minutes, which is a 
reasonable timeout. Next, once the agent decides to reconnect, it sleeps for up 
to 3 minutes, depending on the configured backoff/sleep algorithm: 
https://github.com/apache/cloudstack/blob/master/agent/src/main/java/com/cloud/agent/Agent.java#L528
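
To make the sleep-between-retries behaviour concrete, here is a minimal sketch of a reconnect loop with a capped backoff. It is not the code in `Agent.java`: the class `ReconnectSketch`, its methods, and the doubling policy are assumptions made up for illustration. The 5-second base mirrors the gap in the log between "NioClient connection closed" and "Reconnecting to host", and the 3-minute cap mirrors the upper bound mentioned above.

```java
import java.util.function.BooleanSupplier;

// Hypothetical illustration only; names and policy are not taken from CloudStack.
class ReconnectSketch {

    /** Delay before retry number `attempt` (1-based): doubles each time, capped at maxMillis. */
    static long backoffMillis(int attempt, long baseMillis, long maxMillis) {
        long delay = baseMillis;
        for (int i = 1; i < attempt && delay < maxMillis; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxMillis);
    }

    /**
     * Calls `connect` until it succeeds or maxAttempts is used up.
     * Returns the number of attempts made, or -1 if every attempt failed
     * or the thread was interrupted while sleeping.
     */
    static int reconnect(BooleanSupplier connect, int maxAttempts, long baseMillis, long maxMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (connect.getAsBoolean()) {
                return attempt;
            }
            if (attempt < maxAttempts) {
                try {
                    // Sleep before the next attempt so the server is not hammered.
                    Thread.sleep(backoffMillis(attempt, baseMillis, maxMillis));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return -1;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Assumed values: 5 s base delay, 3 min cap.
        for (int attempt = 1; attempt <= 7; attempt++) {
            System.out.println("retry " + attempt + " waits "
                    + backoffMillis(attempt, 5_000, 180_000) + " ms");
        }
    }
}
```

With a constant-time policy instead of doubling, `backoffMillis` would simply return the base delay on every call; the loop itself would stay the same.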
   
   From the logs, it seems the KVM host was disconnected from the management 
server host for only 6 minutes. I think it's perfectly reasonable to wait a few 
minutes before the KVM agent decides to switch. Instantaneous switching between 
management servers without proper socket and sleep timeouts can cause a large 
number of ownership switches and heavy management traffic.
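
The trade-off described here, waiting a while before switching management servers, can be sketched as a small grace-period check. Everything in this block is hypothetical: `FailoverSketch`, its methods, and the 3-minute grace period are illustrative assumptions, not how the agent actually picks its next host. The two addresses are the ones that appear in the log.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;

// Hypothetical illustration only; not CloudStack's host-selection logic.
class FailoverSketch {
    private final List<String> hosts;
    private final Duration gracePeriod;
    private int current = 0;
    private Instant failingSince = null;

    FailoverSketch(List<String> hosts, Duration gracePeriod) {
        this.hosts = hosts;
        this.gracePeriod = gracePeriod;
    }

    /** Records a failed connection attempt at `now` and returns the host to try next. */
    String onConnectFailure(Instant now) {
        if (failingSince == null) {
            failingSince = now; // first failure against the current host
        }
        // Switch only once the current host has been unreachable for the whole grace period.
        if (Duration.between(failingSince, now).compareTo(gracePeriod) >= 0) {
            current = (current + 1) % hosts.size();
            failingSince = now; // the new host gets its own grace period
        }
        return hosts.get(current);
    }

    /** Clears the failure window after a successful connection. */
    void onConnected() {
        failingSince = null;
    }

    public static void main(String[] args) {
        FailoverSketch f = new FailoverSketch(
                List.of("172.17.1.141", "172.17.1.142"), Duration.ofMinutes(3));
        Instant t0 = Instant.parse("2019-07-18T15:26:26Z");
        System.out.println(f.onConnectFailure(t0));
        System.out.println(f.onConnectFailure(t0.plusSeconds(60)));
        System.out.println(f.onConnectFailure(t0.plusSeconds(180)));
    }
}
```

A shorter grace period makes failover faster but increases the chance of flapping between servers during a brief network blip, which is the ownership-switch churn mentioned above.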

