I didn't find anything like that. Everything's been running OK over the weekend so I will leave it as is.
On Mon, Oct 13, 2014 at 2:18 AM, Daan Hoogland <daan.hoogl...@gmail.com> wrote:
> Good going Ian, sorry you didn't get any assistance on the way. Did you
> find a setting that should have a different default? Like the router
> service offering memory :P or doesn't that make any sense?
>
> On Sat, Oct 11, 2014 at 5:11 AM, Ian Young <iyo...@ratespecial.com> wrote:
>
> > Aha! I restarted cloudstack-agent, which caused the virtual router to
> > change to a "stopped" status in the management console. However, the
> > console viewer icon was still visible, so I clicked it. The router had
> > run out of memory and caused a kernel panic. I created a new system
> > service offering with 500 MB of memory, changed the router's service
> > offering, and started it. It booted with no problem. The default
> > memory size of 128 MB is not enough. This is the system VM template I
> > was using:
> >
> > http://cloudstack.apt-get.eu/systemvm/4.4/systemvm64template-4.4.0-6-kvm.qcow2.bz2
> >
> > On Fri, Oct 10, 2014 at 7:28 PM, Ian Young <iyo...@ratespecial.com> wrote:
> >
> > > I dropped all the cloud* databases, deleted everything in primary and
> > > secondary storage, and reinstalled the management server, following the
> > > guide I wrote for myself the last time I built a stable CloudStack
> > > system. Then I imported one of my backed up instances as a template and
> > > tried to create a new VM. Same problem as before. How is this possible?
> > >
> > > 2014-10-10 19:17:44,075 WARN [kvm.resource.LibvirtComputingResource]
> > > (agentRequest-Handler-3:null) Timed out:
> > > /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.pl
> > > -n r-4-VM -p
> > > %template=domP%name=r-4-VM%eth0ip=192.168.102.222%eth0mask=255.255.255.0%gateway=192.168.102.1%domain=lax.ratespecial.com%cidrsize=24%dhcprange=192.168.102.1%eth1ip=169.254.0.33%eth1mask=255.255.0.0%type=dhcpsrvr%disable_rp_filter=true%dns1=192.168.100.2%dns2=192.168.100.3
> > > . Output is:
> > > 2014-10-10 19:18:05,078 WARN [kvm.resource.LibvirtComputingResource]
> > > (Script-3:null) Interrupting script.
> > >
> > > On Fri, Oct 10, 2014 at 4:33 PM, Ian Young <iyo...@ratespecial.com> wrote:
> > >
> > >> I've restarted all the services and restarted the servers too. The
> > >> SSVM and CP start with no trouble. Every time I try to start or create
> > >> an instance, I see repeated messages like these:
> > >>
> > >> /var/log/cloudstack/agent/cloudstack-agent.out:
> > >> 2014-10-10 16:27:21,841{GMT} WARN
> > >> [kvm.resource.LibvirtComputingResource] (Script-8:) Interrupting script.
> > >> 2014-10-10 16:27:21,841{GMT} WARN
> > >> [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-4:) Timed
> > >> out: /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.pl
> > >> -n r-19-VM -p
> > >> %template=domP%name=r-19-VM%eth0ip=192.168.102.89%eth0mask=255.255.255.0%gateway=192.168.102.1%domain=lax.ratespecial.com%cidrsize=24%dhcprange=192.168.102.1%eth1ip=169.254.2.193%eth1mask=255.255.0.0%type=dhcpsrvr%disable_rp_filter=true%dns1=192.168.100.2%dns2=192.168.100.3
> > >> . Output is:
> > >>
> > >> /var/log/cloudstack/agent/security_group.log:
> > >> 2014-10-10 16:27:33,259 - Failed to get rule logs, better luck next time!
> > >>
> > >> On Fri, Oct 10, 2014 at 3:04 PM, Ian Young <iyo...@ratespecial.com>
> > >> wrote:
> > >>
> > >>> I tried to restart the network with the "clean up" option, via the
> > >>> web console. After several minutes, it failed to restart the network.
> > >>> The SSVM and CP are still running but the VR no longer exists. Why
> > >>> would these be able to start but not the virtual router?
> > >>>
> > >>> On Fri, Oct 10, 2014 at 2:48 PM, Ian Young <iyo...@ratespecial.com>
> > >>> wrote:
> > >>>
> > >>>> I restarted the libvirtd service and the management service is now
> > >>>> fully started (there are services listening on ports 8250 and 9090).
> > >>>> The SSVM health check script now reports no problems.
> > >>>>
> > >>>> However, I tried starting an instance and both the instance and the
> > >>>> virtual router are in a "starting" state but have been so for almost
> > >>>> 10 minutes. In the catalina.out log I see:
> > >>>> INFO [c.c.v.VirtualMachineManagerImpl] (AgentManager-Handler-10:null)
> > >>>> There is pending job or HA tasks working on the VM. vm id: 4, postpone
> > >>>> power-change report by resetting power-change counters
> > >>>> INFO [c.c.v.VirtualMachineManagerImpl] (AgentManager-Handler-10:null)
> > >>>> There is pending job or HA tasks working on the VM. vm id: 13, postpone
> > >>>> power-change report by resetting power-change counters
> > >>>>
> > >>>> I'm also seeing this in the agent.log:
> > >>>> 2014-10-10 14:43:26,833 WARN [kvm.resource.LibvirtComputingResource]
> > >>>> (Script-6:null) Interrupting script.
> > >>>> 2014-10-10 14:43:26,833 WARN [kvm.resource.LibvirtComputingResource]
> > >>>> (agentRequest-Handler-2:null) Timed out:
> > >>>> /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.pl
> > >>>> -n r-4-VM -p
> > >>>> %template=domP%name=r-4-VM%eth0ip=192.168.102.110%eth0mask=255.255.255.0%gateway=192.168.102.1%domain=lax.ratespecial.com%cidrsize=24%dhcprange=192.168.102.1%eth1ip=169.254.2.181%eth1mask=255.255.0.0%type=dhcpsrvr%disable_rp_filter=true%dns1=192.168.100.2%dns2=192.168.100.3
> > >>>> . Output is:
> > >>>>
> > >>>> And in the security_group.log:
> > >>>> 2014-10-10 14:42:41,926 - Failed to get rule logs, better luck next
> > >>>> time!
> > >>>> 2014-10-10 14:43:41,926 - Failed to get rule logs, better luck next
> > >>>> time!
> > >>>>
> > >>>> What does this mean?
> > >>>>
> > >>>> On Fri, Oct 10, 2014 at 2:11 PM, Ian Young <iyo...@ratespecial.com>
> > >>>> wrote:
> > >>>>
> > >>>>> This morning I was unable to start new instances. I discovered that
> > >>>>> I could SSH into the SSVM and the console proxy but not the virtual
> > >>>>> router. Something strange was happening so I thought it might be a
> > >>>>> good time to gracefully stop all the instances and reboot the
> > >>>>> hypervisor to see if the VR would start working again. I also
> > >>>>> rebooted the management server (a separate machine) to have a clean
> > >>>>> slate. Now that they've both been rebooted, the following symptoms
> > >>>>> exist:
> > >>>>>
> > >>>>> * On the management server, there are no services listening on 9090
> > >>>>> or 8250.
> > >>>>> * When I run the SSVM health check script, it says NFS is not
> > >>>>> currently mounted.
> > >>>>> * The management server log is reporting that Zone 1 is not ready to
> > >>>>> launch SSVM/CP yet, even though both of those are running.
> > >>>>>
> > >>>>> The NFS server is running just fine. I can mount it in the
> > >>>>> management server with no problems. I've restarted
> > >>>>> cloudstack-management and cloudstack-agent but the problems persist.
> > >>>>> The "not ready to launch SSVM/CP yet" messages sound like the
> > >>>>> management server and the hypervisor are not communicating or some
> > >>>>> information about the system state is out of sync. How can I
> > >>>>> confirm this?
>
> --
> Daan
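
For anyone reading the "Timed out: ... patchviasocket.pl" lines quoted above: the long `-p` argument looks like a run of `%key=value` pairs. Below is a minimal Python sketch that splits such a string into a dict so the router name, type and IPs are easier to compare between log entries. The `%key=value` layout is inferred only from the log lines in this thread, not from CloudStack documentation, and `parse_boot_args` is a hypothetical helper name, not part of CloudStack.

```python
def parse_boot_args(arg_string):
    """Split a '%key=value%key=value' string into a dict.

    Assumption: the format is inferred from the agent log lines in this
    thread; this is not an official CloudStack parser.
    """
    pairs = {}
    for field in arg_string.split("%"):
        field = field.strip()
        if not field:
            # The string starts with '%', so the first piece is empty.
            continue
        key, _, value = field.partition("=")
        pairs[key] = value
    return pairs


# The argument string from the 19:17:44 log entry, with line wraps removed.
args = parse_boot_args(
    "%template=domP%name=r-4-VM%eth0ip=192.168.102.222%eth0mask=255.255.255.0"
    "%gateway=192.168.102.1%domain=lax.ratespecial.com%cidrsize=24"
    "%dhcprange=192.168.102.1%eth1ip=169.254.0.33%eth1mask=255.255.0.0"
    "%type=dhcpsrvr%disable_rp_filter=true%dns1=192.168.100.2%dns2=192.168.100.3"
)
print(args["name"], args["type"], args["eth1ip"])
```

This only makes the log entry easier to read. It does not explain the timeout itself, which in this thread turned out to be the router running out of memory at the default 128 MB.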