I've restarted all the services and restarted the servers too. The SSVM and CP start with no trouble. Every time I try to start or create an instance, I see repeated messages like these:
/var/log/cloudstack/agent/cloudstack-agent.out:

2014-10-10 16:27:21,841{GMT} WARN [kvm.resource.LibvirtComputingResource] (Script-8:) Interrupting script.
2014-10-10 16:27:21,841{GMT} WARN [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-4:) Timed out: /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.pl -n r-19-VM -p %template=domP%name=r-19-VM%eth0ip=192.168.102.89%eth0mask=255.255.255.0%gateway=192.168.102.1%domain=lax.ratespecial.com%cidrsize=24%dhcprange=192.168.102.1%eth1ip=169.254.2.193%eth1mask=255.255.0.0%type=dhcpsrvr%disable_rp_filter=true%dns1=192.168.100.2%dns2=192.168.100.3 . Output is:

/var/log/cloudstack/agent/security_group.log:

2014-10-10 16:27:33,259 - Failed to get rule logs, better luck next time!

On Fri, Oct 10, 2014 at 3:04 PM, Ian Young <iyo...@ratespecial.com> wrote:

> I tried to restart the network with the "clean up" option, via the web console. After several minutes, it failed to restart the network. The SSVM and CP are still running but the VR no longer exists. Why would these be able to start but not the virtual router?
>
> On Fri, Oct 10, 2014 at 2:48 PM, Ian Young <iyo...@ratespecial.com> wrote:
>
>> I restarted the libvirtd service and the management service is now fully started (there are services listening on ports 8250 and 9090). The SSVM health check script now reports no problems.
>>
>> However, I tried starting an instance and both the instance and the virtual router are in a "starting" state but have been so for almost 10 minutes. In the catalina.out log I see:
>>
>> INFO [c.c.v.VirtualMachineManagerImpl] (AgentManager-Handler-10:null) There is pending job or HA tasks working on the VM. vm id: 4, postpone power-change report by resetting power-change counters
>> INFO [c.c.v.VirtualMachineManagerImpl] (AgentManager-Handler-10:null) There is pending job or HA tasks working on the VM. vm id: 13, postpone power-change report by resetting power-change counters
>>
>> I'm also seeing this in the agent.log:
>>
>> 2014-10-10 14:43:26,833 WARN [kvm.resource.LibvirtComputingResource] (Script-6:null) Interrupting script.
>> 2014-10-10 14:43:26,833 WARN [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-2:null) Timed out: /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.pl -n r-4-VM -p %template=domP%name=r-4-VM%eth0ip=192.168.102.110%eth0mask=255.255.255.0%gateway=192.168.102.1%domain=lax.ratespecial.com%cidrsize=24%dhcprange=192.168.102.1%eth1ip=169.254.2.181%eth1mask=255.255.0.0%type=dhcpsrvr%disable_rp_filter=true%dns1=192.168.100.2%dns2=192.168.100.3 . Output is:
>>
>> And in the security_group.log:
>>
>> 2014-10-10 14:42:41,926 - Failed to get rule logs, better luck next time!
>> 2014-10-10 14:43:41,926 - Failed to get rule logs, better luck next time!
>>
>> What does this mean?
>>
>> On Fri, Oct 10, 2014 at 2:11 PM, Ian Young <iyo...@ratespecial.com> wrote:
>>
>>> This morning I was unable to start new instances. I discovered that I could SSH into the SSVM and the console proxy but not the virtual router. Something strange was happening so I thought it might be a good time to gracefully stop all the instances and reboot the hypervisor to see if the VR would start working again. I also rebooted the management server (a separate machine) to have a clean slate. Now that they've both been rebooted, the following symptoms exist:
>>>
>>> * On the management server, there are no services listening on 9090 or 8250.
>>> * When I run the SSVM health check script, it says NFS is not currently mounted.
>>> * The management server log is reporting that Zone 1 is not ready to launch SSVM/CP yet, even though both of those are running.
>>>
>>> The NFS server is running just fine. I can mount it in the management server with no problems. I've restarted cloudstack-management and cloudstack-agent but the problems persist. The "not ready to launch SSVM/CP yet" messages sound like the management server and the hypervisor are not communicating or some information about the system state is out of sync. How can I confirm this?
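For the first symptom in the oldest message (nothing listening on 8250 or 9090), a quick TCP probe run from the hypervisor can at least show whether the management server's ports are reachable from the agent's side. This is only a minimal sketch: the host name "localhost" is a placeholder to be replaced with the management server's address, and the function names are made up for illustration. A successful connection shows that something is accepting connections on the port, not that the agent has actually registered with the management server.

```python
import socket

# Ports mentioned in the thread. The host passed to check_ports() below is a
# placeholder and does not come from the thread.
MANAGEMENT_PORTS = (8250, 9090)


def port_is_listening(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False


def check_ports(host, ports=MANAGEMENT_PORTS, timeout=3.0):
    """Map each port to whether something accepted a TCP connection on it."""
    return {port: port_is_listening(host, port, timeout) for port in ports}


if __name__ == "__main__":
    # Replace "localhost" with the management server's address when running
    # this from the hypervisor.
    for port, is_open in check_ports("localhost").items():
        state = "listening" if is_open else "not listening"
        print(f"port {port}: {state}")
```

If both ports report "not listening" from the hypervisor but services are listening when checked on the management server itself, a firewall or routing problem between the two machines would be one thing to rule out.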