Hi Nik What's the network configuration in XenServer host? Is it bridge or openvswitch?
You can get the info by Cat /etc/xensource/network.conf Anthony > -----Original Message----- > From: Nik Martin [mailto:nik.mar...@nfinausa.com] > Sent: Friday, August 10, 2012 8:36 AM > To: cloudstack-users@incubator.apache.org > Subject: Re: Xen Host failure in pool > > On 08/10/2012 10:32 AM, Mice Xia wrote: > > > > I remember when network partition happens, pool slave may enter > emergency mode and show offline as it could not reach its master for a > long time. > > Could you check hv1's console (graphical console, not ssh console), > and check if its nics are shown correctly? > > > > Regards > > Mice > > > No, when I went into the xsconsole and tried to review all the settings, > it was not showing the management interfaces properly. > > -- > Regards, > > Nik > > > -----Original Message----- > > From: Nik Martin [mailto:nik.mar...@nfinausa.com] > > Sent: 2012-8-10 (ζζδΊ) 23:04 > > To: cloudstack-users@incubator.apache.org > > Subject: Xen Host failure in pool > > > > We have a Xenserver 6.2 based pool of three hosts running under > > CloudStack Acton release (code base is about two weeks old). We left > > last night and everything was fine, and I have about 2 VMs running on > > each host, not doing anything. This morning, I came in, and three VMs > > have stopped, and I logged into XenCenter to see what the pool looked > > like, and the Pol master hd changed from host HV3 to HV2, and HV1 was > > offline. I logged in to HV1's console, and looked at the > > /var/log/messages, and it was complaining about the pool master > address > > being wrong. I went into CloudStack UI and deleted and re-added the > > host, and it failed immediately, and I got this in the log when I did: > > > > > > 2012-08-10 09:56:39,566 DEBUG [cloud.api.ApiServlet] > > (catalina-exec-24:null) Invalid paramemter in URL found. param: > hosttags= > > 2012-08-10 09:56:39,573 INFO [cloud.resource.ResourceManagerImpl] > > (catalina-exec-24:null) Trying to add a new host at http://172.16.5.3 > in > > data center 2 > > 2012-08-10 09:56:39,629 DEBUG [xen.resource.XenServerConnectionPool] > > (catalina-exec-24:null) Slave logon to 172.16.5.3 > > 2012-08-10 09:56:39,632 DEBUG [xen.resource.XenServerConnectionPool] > > (catalina-exec-24:null) Failed to slave local login to 172.16.5.3 due > to > > The master says the host is not known to it. Perhaps the Host was > > deleted from the master's database? Perhaps the slave is pointing to > the > > wrong master? > > 2012-08-10 09:56:39,638 DEBUG [xen.discoverer.XcpServerDiscoverer] > > (catalina-exec-24:null) other exceptions: java.lang.RuntimeException: > > can not get master ip > > java.lang.RuntimeException: can not get master ip > > at > > > com.cloud.hypervisor.xen.resource.XenServerConnectionPool.getMasterIp(X > enServerConnectionPool.java:343) > > at > > > com.cloud.hypervisor.xen.discoverer.XcpServerDiscoverer.find(XcpServerD > iscoverer.java:179) > > at > > > com.cloud.resource.ResourceManagerImpl.discoverHostsFull(ResourceManage > rImpl.java:644) > > at > > > com.cloud.resource.ResourceManagerImpl.discoverHosts(ResourceManagerImp > l.java:514) > > at com.cloud.api.commands.AddHostCmd.execute(AddHostCmd.java:136) > > at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:132) > > at com.cloud.api.ApiServer.queueCommand(ApiServer.java:509) > > at com.cloud.api.ApiServer.handleRequest(ApiServer.java:416) > > at com.cloud.api.ApiServlet.processRequest(ApiServlet.java:300) > > at com.cloud.api.ApiServlet.doGet(ApiServlet.java:59) > > at javax.servlet.http.HttpServlet.service(HttpServlet.java:617) > > at javax.servlet.http.HttpServlet.service(HttpServlet.java:717) > > at > > > org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(Applic > ationFilterChain.java:290) > > at > > > org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFil > terChain.java:206) > > at > > > org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperVal > ve.java:233) > > at > > > org.apache.catalina.core.StandardContextValve.invoke(StandardContextVal > ve.java:191) > > at > > > org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.jav > a:127) > > at > > > org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.jav > a:102) > > at > > > org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:55 > 5) > > at > > > org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve > .java:109) > > at > > > org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java: > 298) > > at > > > org.apache.coyote.http11.Http11NioProcessor.process(Http11NioProcessor. > java:889) > > at > > > org.apache.coyote.http11.Http11NioProtocol$Http11ConnectionHandler.proc > ess(Http11NioProtocol.java:721) > > at > > > org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint. > java:2268) > > at > > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.ja > va:1110) > > at > > > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.j > ava:603) > > at java.lang.Thread.run(Thread.java:679) > > 2012-08-10 09:56:39,638 WARN [cloud.resource.ResourceManagerImpl] > > (catalina-exec-24:null) Unable to find the server resources at > > http://172.16.5.3 > > 2012-08-10 09:56:39,642 WARN [api.commands.AddHostCmd] > > (catalina-exec-24:null) Exception: > > com.cloud.exception.DiscoveryException: Unable to add the host > > at > > > com.cloud.resource.ResourceManagerImpl.discoverHostsFull(ResourceManage > rImpl.java:694) > > at > > > com.cloud.resource.ResourceManagerImpl.discoverHosts(ResourceManagerImp > l.java:514) > > at com.cloud.api.commands.AddHostCmd.execute(AddHostCmd.java:136) > > at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:132) > > at com.cloud.api.ApiServer.queueCommand(ApiServer.java:509) > > at com.cloud.api.ApiServer.handleRequest(ApiServer.java:416) > > at com.cloud.api.ApiServlet.processRequest(ApiServlet.java:300) > > at com.cloud.api.ApiServlet.doGet(ApiServlet.java:59) > > at javax.servlet.http.HttpServlet.service(HttpServlet.java:617) > > at javax.servlet.http.HttpServlet.service(HttpServlet.java:717) > > at > > > org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(Applic > ationFilterChain.java:290) > > at > > > org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFil > terChain.java:206) > > at > > > org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperVal > ve.java:233) > > at > > > org.apache.catalina.core.StandardContextValve.invoke(StandardContextVal > ve.java:191) > > at > > > org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.jav > a:127) > > at > > > org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.jav > a:102) > > at > > > org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:55 > 5) > > at > > > org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve > .java:109) > > at > > > org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java: > 298) > > at > > > org.apache.coyote.http11.Http11NioProcessor.process(Http11NioProcessor. > java:889) > > at > > > org.apache.coyote.http11.Http11NioProtocol$Http11ConnectionHandler.proc > ess(Http11NioProtocol.java:721) > > at > > > org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint. > java:2268) > > at > > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.ja > va:1110) > > at > > > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.j > ava:603) > > at java.lang.Thread.run(Thread.java:679) > > 2012-08-10 09:56:39,642 WARN [cloud.api.ApiDispatcher] > > (catalina-exec-24:null) class com.cloud.api.ServerApiException : > Unable > > to add the host > > 2012-08-10 09:56:39,723 DEBUG [agent.manager.DirectAgentAttache] > > (DirectAgent-305:null) Ping from 17 > > 2012-08-10 09:56:43,822 DEBUG > > [storage.secondary.SecondaryStorageManagerImpl] (secstorage-1:null) > Zone > > 2 is ready to launch secondary storage VM > > 2012-08-10 09:56:43,916 DEBUG > > [cloud.consoleproxy.ConsoleProxyManagerImpl] (consoleproxy-1:null) > Zone > > 2 is ready to launch console proxy > > 2012-08-10 09:56:44,102 DEBUG > > [network.router.VirtualNetworkApplianceManagerImpl] > > (RouterStatusMonitor-1:null) Found 2 routers. > > 2012-08-10 09:56:44,614 DEBUG [agent.manager.AgentManagerImpl] > > (AgentManager-Handler-12:null) Ping from 22 > > 2012-08-10 09:56:48,864 DEBUG [agent.manager.AgentManagerImpl] > > (AgentManager-Handler-10:null) Ping from 18 > > 2012-08-10 09:56:49,511 DEBUG [cloud.server.StatsCollector] > > (StatsCollector-1:null) VmStatsCollector is running... > > 2012-08-10 09:56:49,525 DEBUG [agent.manager.DirectAgentAttache] > > (DirectAgent-305:null) Seq 16-92408948: Executing request > > 2012-08-10 09:56:49,763 DEBUG [xen.resource.CitrixResourceBase] > > (DirectAgent-305:null) Vm cpu utilization 0.01 > > 2012-08-10 09:56:49,763 DEBUG [agent.manager.DirectAgentAttache] > > (DirectAgent-305:null) Seq 16-92408948: Response Received: > > 2012-08-10 09:56:49,763 DEBUG [cloud.vm.VirtualMachineManagerImpl] > > (DirectAgent-305:null) Cleanup succeeded. Details null > > 2012-08-10 09:56:49,763 DEBUG [agent.transport.Request] > > (StatsCollector-1:null) Seq 16-92408948: Received: { Ans: , MgmtId: > > 130577622632, via: 16, Ver: v1, Flags: 10, { GetVmStatsAnswer } } > > 2012-08-10 09:56:49,763 DEBUG [cloud.vm.VirtualMachineManagerImpl] > > (StatsCollector-1:null) Cleanup succeeded. Details null > > 2012-08-10 09:56:54,411 DEBUG [agent.manager.DirectAgentAttache] > > (DirectAgent-497:null) Ping from 17 > > 2012-08-10 09:56:54,550 DEBUG [agent.manager.DirectAgentAttache] > > (DirectAgent-338:null) Ping from 16 > > 2012-08-10 09:56:59,614 DEBUG [agent.manager.AgentManagerImpl] > > (AgentManager-Handler-8:null) Ping from 22 > > 2012-08-10 09:57:03,864 DEBUG [agent.manager.AgentManagerImpl] > > (AgentManager-Handler-9:null) Ping from 18 > > 2012-08-10 09:57:09,551 DEBUG [agent.manager.DirectAgentAttache] > > (DirectAgent-71:null) Ping from 16 > > 2012-08-10 09:57:09,669 DEBUG [agent.manager.DirectAgentAttache] > > (DirectAgent-338:null) Ping from 17 > > 2012-08-10 09:57:13,821 DEBUG > > [storage.secondary.SecondaryStorageManagerImpl] (secstorage-1:null) > Zone > > 2 is ready to launch secondary storage VM > > 2012-08-10 09:57:13,918 DEBUG > > [cloud.consoleproxy.ConsoleProxyManagerImpl] (consoleproxy-1:null) > Zone > > 2 is ready to launch console proxy > > 2012-08-10 09:57:14,102 DEBUG > > [network.router.VirtualNetworkApplianceManagerImpl] > > (RouterStatusMonitor-1:null) Found 2 routers. > > 2012-08-10 09:57:14,614 DEBUG [agent.manager.AgentManagerImpl] > > (AgentManager-Handler-11:null) Ping from 22 > > 2012-08-10 09:57:15,645 DEBUG [cloud.server.StatsCollector] > > (StatsCollector-3:null) HostStatsCollector is running... > > 2012-08-10 09:57:15,656 DEBUG [agent.manager.DirectAgentAttache] > > (DirectAgent-71:null) Seq 16-92408949: Executing request > > 2012-08-10 09:57:15,878 DEBUG [agent.manager.DirectAgentAttache] > > (DirectAgent-71:null) Seq 16-92408949: Response Received: > > 2012-08-10 09:57:15,878 DEBUG [cloud.vm.VirtualMachineManagerImpl] > > (DirectAgent-71:null) Cleanup succeeded. Details null > > 2012-08-10 09:57:15,878 DEBUG [agent.transport.Request] > > (StatsCollector-3:null) Seq 16-92408949: Received: { Ans: , MgmtId: > > 130577622632, via: 16, Ver: v1, Flags: 10, { GetHostStatsAnswer } } > > 2012-08-10 09:57:15,879 DEBUG [cloud.vm.VirtualMachineManagerImpl] > > (StatsCollector-3:null) Cleanup succeeded. Details null > > 2012-08-10 09:57:15,884 DEBUG [agent.manager.DirectAgentAttache] > > (DirectAgent-338:null) Seq 17-665190891: Executing request > > 2012-08-10 09:57:16,312 DEBUG [agent.manager.DirectAgentAttache] > > (DirectAgent-338:null) Seq 17-665190891: Response Received: > > 2012-08-10 09:57:16,312 DEBUG [cloud.vm.VirtualMachineManagerImpl] > > (DirectAgent-338:null) Cleanup succeeded. Details null > > 2012-08-10 09:57:16,312 DEBUG [agent.transport.Request] > > (StatsCollector-3:null) Seq 17-665190891: Received: { Ans: , MgmtId: > > 130577622632, via: 17, Ver: v1, Flags: 10, { GetHostStatsAnswer } } > > 2012-08-10 09:57:16,313 DEBUG [cloud.vm.VirtualMachineManagerImpl] > > (StatsCollector-3:null) Cleanup succeeded. Details null > > 2012-08-10 09:57:18,864 DEBUG [agent.manager.AgentManagerImpl] > > (AgentManager-Handler-15:null) Ping from 18 > > 2012-08-10 09:57:24,407 DEBUG [agent.manager.DirectAgentAttache] > > (DirectAgent-71:null) Ping from 17 > > 2012-08-10 09:57:24,566 DEBUG [agent.manager.DirectAgentAttache] > > (DirectAgent-338:null) Ping from 16 > > 2012-08-10 09:57:29,615 DEBUG [agent.manager.AgentManagerImpl] > > (AgentManager-Handler-1:null) Ping from 22 > > 2012-08-10 09:57:30,047 DEBUG [agent.manager.DirectAgentAttache] > > (DirectAgent-294:null) Seq 16-92405762: Executing request > > 2012-08-10 09:57:30,308 DEBUG [agent.manager.DirectAgentAttache] > > (DirectAgent-294:null) Seq 16-92405762: Response Received: > > 2012-08-10 09:57:30,308 DEBUG [agent.transport.Request] > > (DirectAgent-294:null) Seq 16-92405762: Processing: { Ans: , MgmtId: > > 130577622632, via: 16, Ver: v1, Flags: 10, > > > [{"ClusterSyncAnswer":{"_clusterId":1,"_newStates":{},"_isExecuted":fal > se,"result":true,"wait":0}}] > > } > > 2012-08-10 09:57:31,060 DEBUG [agent.manager.DirectAgentAttache] > > (DirectAgent-357:null) Seq 17-665190402: Executing request > > 2012-08-10 09:57:31,250 DEBUG [agent.manager.DirectAgentAttache] > > (DirectAgent-357:null) Seq 17-665190402: Response Received: > > 2012-08-10 09:57:31,250 DEBUG [agent.transport.Request] > > (DirectAgent-357:null) Seq 17-665190402: Processing: { Ans: , MgmtId: > > 130577622632, via: 17, Ver: v1, Flags: 10, > > [{"Answer":{"result":true,"wait":0}}] } > > > > This is a very serious error, and I don't know how to fix it. Can > > anyone suggest what might be the problem and hos I might fix it? > > > > > >