We have a Xenserver 6.2 based pool of three hosts running under
CloudStack Acton release (code base is about two weeks old). We left
last night and everything was fine, and I have about 2 VMs running on
each host, not doing anything. This morning, I came in, and three VMs
have stopped, and I logged into XenCenter to see what the pool looked
like, and the Pol master hd changed from host HV3 to HV2, and HV1 was
offline. I logged in to HV1's console, and looked at the
/var/log/messages, and it was complaining about the pool master address
being wrong. I went into CloudStack UI and deleted and re-added the
host, and it failed immediately, and I got this in the log when I did:
2012-08-10 09:56:39,566 DEBUG [cloud.api.ApiServlet]
(catalina-exec-24:null) Invalid paramemter in URL found. param: hosttags=
2012-08-10 09:56:39,573 INFO [cloud.resource.ResourceManagerImpl]
(catalina-exec-24:null) Trying to add a new host at http://172.16.5.3 in
data center 2
2012-08-10 09:56:39,629 DEBUG [xen.resource.XenServerConnectionPool]
(catalina-exec-24:null) Slave logon to 172.16.5.3
2012-08-10 09:56:39,632 DEBUG [xen.resource.XenServerConnectionPool]
(catalina-exec-24:null) Failed to slave local login to 172.16.5.3 due to
The master says the host is not known to it. Perhaps the Host was
deleted from the master's database? Perhaps the slave is pointing to the
wrong master?
2012-08-10 09:56:39,638 DEBUG [xen.discoverer.XcpServerDiscoverer]
(catalina-exec-24:null) other exceptions: java.lang.RuntimeException:
can not get master ip
java.lang.RuntimeException: can not get master ip
at
com.cloud.hypervisor.xen.resource.XenServerConnectionPool.getMasterIp(XenServerConnectionPool.java:343)
at
com.cloud.hypervisor.xen.discoverer.XcpServerDiscoverer.find(XcpServerDiscoverer.java:179)
at
com.cloud.resource.ResourceManagerImpl.discoverHostsFull(ResourceManagerImpl.java:644)
at
com.cloud.resource.ResourceManagerImpl.discoverHosts(ResourceManagerImpl.java:514)
at com.cloud.api.commands.AddHostCmd.execute(AddHostCmd.java:136)
at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:132)
at com.cloud.api.ApiServer.queueCommand(ApiServer.java:509)
at com.cloud.api.ApiServer.handleRequest(ApiServer.java:416)
at com.cloud.api.ApiServlet.processRequest(ApiServlet.java:300)
at com.cloud.api.ApiServlet.doGet(ApiServlet.java:59)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at
org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:555)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
at
org.apache.coyote.http11.Http11NioProcessor.process(Http11NioProcessor.java:889)
at
org.apache.coyote.http11.Http11NioProtocol$Http11ConnectionHandler.process(Http11NioProtocol.java:721)
at
org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:2268)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:679)
2012-08-10 09:56:39,638 WARN [cloud.resource.ResourceManagerImpl]
(catalina-exec-24:null) Unable to find the server resources at
http://172.16.5.3
2012-08-10 09:56:39,642 WARN [api.commands.AddHostCmd]
(catalina-exec-24:null) Exception:
com.cloud.exception.DiscoveryException: Unable to add the host
at
com.cloud.resource.ResourceManagerImpl.discoverHostsFull(ResourceManagerImpl.java:694)
at
com.cloud.resource.ResourceManagerImpl.discoverHosts(ResourceManagerImpl.java:514)
at com.cloud.api.commands.AddHostCmd.execute(AddHostCmd.java:136)
at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:132)
at com.cloud.api.ApiServer.queueCommand(ApiServer.java:509)
at com.cloud.api.ApiServer.handleRequest(ApiServer.java:416)
at com.cloud.api.ApiServlet.processRequest(ApiServlet.java:300)
at com.cloud.api.ApiServlet.doGet(ApiServlet.java:59)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at
org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:555)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
at
org.apache.coyote.http11.Http11NioProcessor.process(Http11NioProcessor.java:889)
at
org.apache.coyote.http11.Http11NioProtocol$Http11ConnectionHandler.process(Http11NioProtocol.java:721)
at
org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:2268)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:679)
2012-08-10 09:56:39,642 WARN [cloud.api.ApiDispatcher]
(catalina-exec-24:null) class com.cloud.api.ServerApiException : Unable
to add the host
2012-08-10 09:56:39,723 DEBUG [agent.manager.DirectAgentAttache]
(DirectAgent-305:null) Ping from 17
2012-08-10 09:56:43,822 DEBUG
[storage.secondary.SecondaryStorageManagerImpl] (secstorage-1:null) Zone
2 is ready to launch secondary storage VM
2012-08-10 09:56:43,916 DEBUG
[cloud.consoleproxy.ConsoleProxyManagerImpl] (consoleproxy-1:null) Zone
2 is ready to launch console proxy
2012-08-10 09:56:44,102 DEBUG
[network.router.VirtualNetworkApplianceManagerImpl]
(RouterStatusMonitor-1:null) Found 2 routers.
2012-08-10 09:56:44,614 DEBUG [agent.manager.AgentManagerImpl]
(AgentManager-Handler-12:null) Ping from 22
2012-08-10 09:56:48,864 DEBUG [agent.manager.AgentManagerImpl]
(AgentManager-Handler-10:null) Ping from 18
2012-08-10 09:56:49,511 DEBUG [cloud.server.StatsCollector]
(StatsCollector-1:null) VmStatsCollector is running...
2012-08-10 09:56:49,525 DEBUG [agent.manager.DirectAgentAttache]
(DirectAgent-305:null) Seq 16-92408948: Executing request
2012-08-10 09:56:49,763 DEBUG [xen.resource.CitrixResourceBase]
(DirectAgent-305:null) Vm cpu utilization 0.01
2012-08-10 09:56:49,763 DEBUG [agent.manager.DirectAgentAttache]
(DirectAgent-305:null) Seq 16-92408948: Response Received:
2012-08-10 09:56:49,763 DEBUG [cloud.vm.VirtualMachineManagerImpl]
(DirectAgent-305:null) Cleanup succeeded. Details null
2012-08-10 09:56:49,763 DEBUG [agent.transport.Request]
(StatsCollector-1:null) Seq 16-92408948: Received: { Ans: , MgmtId:
130577622632, via: 16, Ver: v1, Flags: 10, { GetVmStatsAnswer } }
2012-08-10 09:56:49,763 DEBUG [cloud.vm.VirtualMachineManagerImpl]
(StatsCollector-1:null) Cleanup succeeded. Details null
2012-08-10 09:56:54,411 DEBUG [agent.manager.DirectAgentAttache]
(DirectAgent-497:null) Ping from 17
2012-08-10 09:56:54,550 DEBUG [agent.manager.DirectAgentAttache]
(DirectAgent-338:null) Ping from 16
2012-08-10 09:56:59,614 DEBUG [agent.manager.AgentManagerImpl]
(AgentManager-Handler-8:null) Ping from 22
2012-08-10 09:57:03,864 DEBUG [agent.manager.AgentManagerImpl]
(AgentManager-Handler-9:null) Ping from 18
2012-08-10 09:57:09,551 DEBUG [agent.manager.DirectAgentAttache]
(DirectAgent-71:null) Ping from 16
2012-08-10 09:57:09,669 DEBUG [agent.manager.DirectAgentAttache]
(DirectAgent-338:null) Ping from 17
2012-08-10 09:57:13,821 DEBUG
[storage.secondary.SecondaryStorageManagerImpl] (secstorage-1:null) Zone
2 is ready to launch secondary storage VM
2012-08-10 09:57:13,918 DEBUG
[cloud.consoleproxy.ConsoleProxyManagerImpl] (consoleproxy-1:null) Zone
2 is ready to launch console proxy
2012-08-10 09:57:14,102 DEBUG
[network.router.VirtualNetworkApplianceManagerImpl]
(RouterStatusMonitor-1:null) Found 2 routers.
2012-08-10 09:57:14,614 DEBUG [agent.manager.AgentManagerImpl]
(AgentManager-Handler-11:null) Ping from 22
2012-08-10 09:57:15,645 DEBUG [cloud.server.StatsCollector]
(StatsCollector-3:null) HostStatsCollector is running...
2012-08-10 09:57:15,656 DEBUG [agent.manager.DirectAgentAttache]
(DirectAgent-71:null) Seq 16-92408949: Executing request
2012-08-10 09:57:15,878 DEBUG [agent.manager.DirectAgentAttache]
(DirectAgent-71:null) Seq 16-92408949: Response Received:
2012-08-10 09:57:15,878 DEBUG [cloud.vm.VirtualMachineManagerImpl]
(DirectAgent-71:null) Cleanup succeeded. Details null
2012-08-10 09:57:15,878 DEBUG [agent.transport.Request]
(StatsCollector-3:null) Seq 16-92408949: Received: { Ans: , MgmtId:
130577622632, via: 16, Ver: v1, Flags: 10, { GetHostStatsAnswer } }
2012-08-10 09:57:15,879 DEBUG [cloud.vm.VirtualMachineManagerImpl]
(StatsCollector-3:null) Cleanup succeeded. Details null
2012-08-10 09:57:15,884 DEBUG [agent.manager.DirectAgentAttache]
(DirectAgent-338:null) Seq 17-665190891: Executing request
2012-08-10 09:57:16,312 DEBUG [agent.manager.DirectAgentAttache]
(DirectAgent-338:null) Seq 17-665190891: Response Received:
2012-08-10 09:57:16,312 DEBUG [cloud.vm.VirtualMachineManagerImpl]
(DirectAgent-338:null) Cleanup succeeded. Details null
2012-08-10 09:57:16,312 DEBUG [agent.transport.Request]
(StatsCollector-3:null) Seq 17-665190891: Received: { Ans: , MgmtId:
130577622632, via: 17, Ver: v1, Flags: 10, { GetHostStatsAnswer } }
2012-08-10 09:57:16,313 DEBUG [cloud.vm.VirtualMachineManagerImpl]
(StatsCollector-3:null) Cleanup succeeded. Details null
2012-08-10 09:57:18,864 DEBUG [agent.manager.AgentManagerImpl]
(AgentManager-Handler-15:null) Ping from 18
2012-08-10 09:57:24,407 DEBUG [agent.manager.DirectAgentAttache]
(DirectAgent-71:null) Ping from 17
2012-08-10 09:57:24,566 DEBUG [agent.manager.DirectAgentAttache]
(DirectAgent-338:null) Ping from 16
2012-08-10 09:57:29,615 DEBUG [agent.manager.AgentManagerImpl]
(AgentManager-Handler-1:null) Ping from 22
2012-08-10 09:57:30,047 DEBUG [agent.manager.DirectAgentAttache]
(DirectAgent-294:null) Seq 16-92405762: Executing request
2012-08-10 09:57:30,308 DEBUG [agent.manager.DirectAgentAttache]
(DirectAgent-294:null) Seq 16-92405762: Response Received:
2012-08-10 09:57:30,308 DEBUG [agent.transport.Request]
(DirectAgent-294:null) Seq 16-92405762: Processing: { Ans: , MgmtId:
130577622632, via: 16, Ver: v1, Flags: 10,
[{"ClusterSyncAnswer":{"_clusterId":1,"_newStates":{},"_isExecuted":false,"result":true,"wait":0}}]
}
2012-08-10 09:57:31,060 DEBUG [agent.manager.DirectAgentAttache]
(DirectAgent-357:null) Seq 17-665190402: Executing request
2012-08-10 09:57:31,250 DEBUG [agent.manager.DirectAgentAttache]
(DirectAgent-357:null) Seq 17-665190402: Response Received:
2012-08-10 09:57:31,250 DEBUG [agent.transport.Request]
(DirectAgent-357:null) Seq 17-665190402: Processing: { Ans: , MgmtId:
130577622632, via: 17, Ver: v1, Flags: 10,
[{"Answer":{"result":true,"wait":0}}] }
This is a very serious error, and I don't know how to fix it. Can
anyone suggest what might be the problem and hos I might fix it?
--
Regards,
Nik
Nik Martin
VP Business Development
Nfina Technologies, Inc.
+1.251.243.0043 x1003
Relentless Reliability