instances using virtual router for DNS instead of DNS servers
When I set up CloudStack, I chose my physical DNS servers as both the internal and external DNS servers. They perform recursion, so they are suitable for queries about hosts on the LAN as well as on the rest of the internet. However, the /etc/resolv.conf file in my instances lists the virtual router first, followed by the physical servers I chose during setup. The virtual router does not successfully return answers about internal hosts, causing the instances to be unable to reach each other. I'm aware of the use.external.dns option but the last time I set that to true and restarted the virtual router, it failed to start up again. Why is DHCP assigning the virtual router as the first name server instead of using the ones I selected during setup?
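A quick way to confirm which resolver an instance will try first is to read the nameserver order out of /etc/resolv.conf. The following is only a minimal sketch: the function name is made up for illustration, and the first address in the sample is a stand-in for the virtual router (its real address in a given zone may differ).

```python
def nameservers(resolv_conf_text):
    """Return nameserver addresses in the order they are listed."""
    servers = []
    for line in resolv_conf_text.splitlines():
        # Drop anything after a '#' or ';' comment marker, then trim whitespace.
        line = line.split("#", 1)[0].split(";", 1)[0].strip()
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


# Illustrative contents only: the first entry stands in for the virtual router,
# the other two are the physical DNS servers that appear in the logs.
sample = """\
nameserver 192.168.102.110
nameserver 192.168.100.2
nameserver 192.168.100.3
"""

if __name__ == "__main__":
    with open("/etc/resolv.conf") as handle:
        print(nameservers(handle.read()))
```

If the first entry printed is the router rather than one of the physical servers, the instance will query the router first, which matches the behaviour described above.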
Re: services not running after reboot
I restarted the libvirtd service and the management service is now fully started (there are services listening on ports 8250 and 9090). The SSVM health check script now reports no problems. However, I tried starting an instance and both the instance and the virtual router are in a starting state but have been so for almost 10 minutes. In the catalina.out log I see:

INFO [c.c.v.VirtualMachineManagerImpl] (AgentManager-Handler-10:null) There is pending job or HA tasks working on the VM. vm id: 4, postpone power-change report by resetting power-change counters
INFO [c.c.v.VirtualMachineManagerImpl] (AgentManager-Handler-10:null) There is pending job or HA tasks working on the VM. vm id: 13, postpone power-change report by resetting power-change counters

I'm also seeing this in the agent.log:

2014-10-10 14:43:26,833 WARN [kvm.resource.LibvirtComputingResource] (Script-6:null) Interrupting script.
2014-10-10 14:43:26,833 WARN [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-2:null) Timed out: /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.pl -n r-4-VM -p %template=domP%name=r-4-VM%eth0ip=192.168.102.110%eth0mask=255.255.255.0%gateway=192.168.102.1%domain= lax.ratespecial.com%cidrsize=24%dhcprange=192.168.102.1%eth1ip=169.254.2.181%eth1mask=255.255.0.0%type=dhcpsrvr%disable_rp_filter=true%dns1=192.168.100.2%dns2=192.168.100.3 . Output is:

And in the security_group.log:

2014-10-10 14:42:41,926 - Failed to get rule logs, better luck next time!
2014-10-10 14:43:41,926 - Failed to get rule logs, better luck next time!

What does this mean?

On Fri, Oct 10, 2014 at 2:11 PM, Ian Young iyo...@ratespecial.com wrote:

This morning I was unable to start new instances. I discovered that I could SSH into the SSVM and the console proxy but not the virtual router. Something strange was happening, so I thought it might be a good time to gracefully stop all the instances and reboot the hypervisor to see if the VR would start working again. I also rebooted the management server (a separate machine) to have a clean slate. Now that they've both been rebooted, the following symptoms exist:

* On the management server, there are no services listening on 9090 or 8250.
* When I run the SSVM health check script, it says NFS is not currently mounted.
* The management server log is reporting that Zone 1 is not ready to launch SSVM/CP yet, even though both of those are running.

The NFS server is running just fine. I can mount it on the management server with no problems. I've restarted cloudstack-management and cloudstack-agent but the problems persist. The "not ready to launch SSVM/CP yet" messages sound like the management server and the hypervisor are not communicating, or some information about the system state is out of sync. How can I confirm this?
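One of the symptoms above (nothing listening on 8250 or 9090) can be checked from a script rather than by eye. This is a minimal sketch, assuming it is run on the management server itself; the function name is made up for illustration, and a successful connection only shows that something accepts TCP connections on the port, not that the agent and the management server agree about system state.

```python
import socket


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 8250 and 9090 are the ports named in the message; "localhost" is an
    # assumption that this runs on the management server.
    for port in (8250, 9090):
        state = "listening" if is_listening("localhost", port) else "not listening"
        print(port, state)
```

Running the same check from the hypervisor against the management server's address would additionally show whether the agent can reach port 8250 over the network.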
Re: services not running after reboot
I tried to restart the network with the clean up option, via the web console. After several minutes, it failed to restart the network. The SSVM and CP are still running but the VR no longer exists. Why would these be able to start but not the virtual router?
Re: services not running after reboot
I've restarted all the services and restarted the servers too. The SSVM and CP start with no trouble. Every time I try to start or create an instance, I see repeated messages like these:

/var/log/cloudstack/agent/cloudstack-agent.out:

2014-10-10 16:27:21,841{GMT} WARN [kvm.resource.LibvirtComputingResource] (Script-8:) Interrupting script.
2014-10-10 16:27:21,841{GMT} WARN [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-4:) Timed out: /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.pl -n r-19-VM -p %template=domP%name=r-19-VM%eth0ip=192.168.102.89%eth0mask=255.255.255.0%gateway=192.168.102.1%domain= lax.ratespecial.com%cidrsize=24%dhcprange=192.168.102.1%eth1ip=169.254.2.193%eth1mask=255.255.0.0%type=dhcpsrvr%disable_rp_filter=true%dns1=192.168.100.2%dns2=192.168.100.3 . Output is:

/var/log/cloudstack/agent/security_group.log:

2014-10-10 16:27:33,259 - Failed to get rule logs, better luck next time!
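The timed-out patchviasocket.pl call in the agent log carries the router's boot parameters as a single %key=value string, which is hard to read in that form. As a small sketch (the function name is made up; the string is copied from the log above), it can be split into a dictionary to see, for example, which DNS servers are being handed to the router:

```python
def parse_boot_args(arg_string):
    """Split a '%key=value%key=value' string into a dict."""
    params = {}
    for field in arg_string.split("%"):
        if not field.strip():
            continue  # skip the empty piece before the leading '%'
        key, _, value = field.partition("=")
        # strip() tolerates stray spaces such as the one after "domain=" in the log
        params[key.strip()] = value.strip()
    return params


# Copied from the agent log entry for r-19-VM.
logged = (
    "%template=domP%name=r-19-VM%eth0ip=192.168.102.89%eth0mask=255.255.255.0"
    "%gateway=192.168.102.1%domain= lax.ratespecial.com%cidrsize=24"
    "%dhcprange=192.168.102.1%eth1ip=169.254.2.193%eth1mask=255.255.0.0"
    "%type=dhcpsrvr%disable_rp_filter=true%dns1=192.168.100.2%dns2=192.168.100.3"
)

if __name__ == "__main__":
    for key, value in parse_boot_args(logged).items():
        print(f"{key:20} {value}")
```

Here dns1 and dns2 are the physical DNS servers, so the values chosen during zone setup do reach the router's boot arguments; the timeout is in delivering them to the VM, not in the values themselves.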
Re: services not running after reboot
Aha! I restarted cloudstack-agent, which caused the virtual router to change to a stopped status in the management console. However, the console viewer icon was still visible, so I clicked it. The router had run out of memory and caused a kernel panic. I created a new system service offering with 500 MB of memory, changed the router's service offering, and started it. It booted with no problem. The default memory size of 128 MB is not enough. This is the system VM template I was using:

http://cloudstack.apt-get.eu/systemvm/4.4/systemvm64template-4.4.0-6-kvm.qcow2.bz2

On Fri, Oct 10, 2014 at 7:28 PM, Ian Young iyo...@ratespecial.com wrote:

I dropped all the cloud* databases, deleted everything in primary and secondary storage, and reinstalled the management server, following the guide I wrote for myself the last time I built a stable CloudStack system. Then I imported one of my backed up instances as a template and tried to create a new VM. Same problem as before. How is this possible?

2014-10-10 19:17:44,075 WARN [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-3:null) Timed out: /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.pl -n r-4-VM -p %template=domP%name=r-4-VM%eth0ip=192.168.102.222%eth0mask=255.255.255.0%gateway=192.168.102.1%domain= lax.ratespecial.com%cidrsize=24%dhcprange=192.168.102.1%eth1ip=169.254.0.33%eth1mask=255.255.0.0%type=dhcpsrvr%disable_rp_filter=true%dns1=192.168.100.2%dns2=192.168.100.3 . Output is:
2014-10-10 19:18:05,078 WARN [kvm.resource.LibvirtComputingResource] (Script-3:null) Interrupting script.
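The fix that resolved this thread (a system service offering with more memory, then switching the router to it) was done through the web console, but the same steps could in principle be expressed as API calls. The sketch below only builds the query strings. The command and parameter names reflect my understanding of the CloudStack API (createServiceOffering, changeServiceForRouter) and should be checked against the API reference for your version; every value except the 500 MB figure is a placeholder, and a real request would still need to be signed or sent through an authenticated client.

```python
from urllib.parse import urlencode

# Assumed parameter names; only memory=500 comes from the thread.
create_offering = {
    "command": "createServiceOffering",
    "name": "router-500mb",                      # hypothetical name
    "displaytext": "Virtual router, 500 MB",     # hypothetical description
    "cpunumber": 1,                              # placeholder
    "cpuspeed": 500,                             # placeholder, MHz
    "memory": 500,                               # MB, as described in the thread
    "issystem": "true",
    "systemvmtype": "domainrouter",
}


def change_router_offering(router_id, offering_id):
    """Parameters for moving an existing router to another offering."""
    return {
        "command": "changeServiceForRouter",
        "id": router_id,
        "serviceofferingid": offering_id,
    }


def to_query(params):
    """URL-encode parameters in sorted key order (unsigned)."""
    return urlencode(sorted(params.items()))


if __name__ == "__main__":
    print(to_query(create_offering))
    # Both IDs below are placeholders for the UUIDs reported by your installation.
    print(to_query(change_router_offering("ROUTER-UUID", "OFFERING-UUID")))
```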
unable to start virtual router
I wanted to bypass the virtual router as the first DNS server so that my instances would use our existing physical DNS servers. I followed the instructions here:

http://support.citrix.com/article/CTX138970

I set use.external.dns to true and then restarted the virtual router. The VR remained in a starting state indefinitely. Eventually it timed out. Now I can't start the VR. The management log says:

2014-10-09 14:00:13,275 ERROR [c.c.v.VirtualMachineManagerImpl] (Work-Job-Executor-1:ctx-74cbcbcf job-94/job-95 ctx-ead24324) Failed to start instance VM[DomainRouter|r-4-VM]
2014-10-09 14:00:13,636 ERROR [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-1:ctx-74cbcbcf job-94/job-95 ctx-ead24324) Invocation exception, caused by: com.cloud.exception.AgentUnavailableException: Resource [Host:1] is unreachable: Host 1: Unable to start instance due to Unable to start VM[DomainRouter|r-4-VM] due to error in finalizeStart, not retrying
2014-10-09 14:00:13,636 ERROR [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-1:ctx-74cbcbcf job-94/job-95) Unable to complete AsyncJobVO {id:95, userId: 2, accountId: 2, instanceType: null, instanceId: null, cmd: com.cloud.vm.VmWorkStart, cmdInfo: rO0ABXNyABhjb20uY2xvdWQudm0uVm1Xb3JrU3RhcnR9cMGsvxz73gIAC0oABGRjSWRMAAZhdm9pZHN0ADBMY29tL2Nsb3VkL2RlcGxveS9EZXBsb3ltZW50UGxhbm5lciRFeGNsdWRlTGlzdDtMAAljbHVzdGVySWR0ABBMamF2YS9sYW5nL0xvbmc7TAAGaG9zdElkcQB-AAJMAAtqb3VybmFsTmFtZXQAEkxqYXZhL2xhbmcvU3RyaW5nO0wAEXBoeXNpY2FsTmV0d29ya0lkcQB-AAJMAAdwbGFubmVycQB-AANMAAVwb2RJZHEAfgACTAAGcG9vbElkcQB-AAJMAAlyYXdQYXJhbXN0AA9MamF2YS91dGlsL01hcDtMAA1yZXNlcnZhdGlvbklkcQB-AAN4cgATY29tLmNsb3VkLnZtLlZtV29ya5-ZtlbwJWdrAgAESgAJYWNjb3VudElkSgAGdXNlcklkSgAEdm1JZEwAC2hhbmRsZXJOYW1lcQB-AAN4cAACAAIABHQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAHBwcHBwcHBwc3IAEWphdmEudXRpbC5IYXNoTWFwBQfawcMWYNEDAAJGAApsb2FkRmFjdG9ySQAJdGhyZXNob2xkeHA_QAAADHcIEAF0AA5SZXN0YXJ0TmV0d29ya3QAP3JPMEFCWE55QUJGcVlYWmhMbXhoYm1jdVFtOXZiR1ZoYnMwZ2NvRFZuUHJ1QWdBQldnQUZkbUZzZFdWNGNBRXhw, cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, result: null, initMsid: 132241037805012, completeMsid: null, lastUpdated: null, lastPolled: null, created: Thu Oct 09 13:39:51 PDT 2014}, job origin:94
2014-10-09 14:00:13,763 ERROR [c.c.a.ApiAsyncJobDispatcher] (API-Job-Executor-1:ctx-e206f887 job-94) Unexpected exception while executing org.apache.cloudstack.api.command.admin.router.StartRouterCmd
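The long cmdInfo value in that log entry is not an error code. It appears to be a URL-safe base64 encoding of a Java-serialized object, and its first bytes are enough to identify the class. A minimal sketch follows, with a made-up function name, assuming the standard Java serialization stream layout (magic 0xACED, then an object and class descriptor); it decodes only the first 44 characters of the value:

```python
import base64
import struct


def serialized_class_name(cmd_info_prefix):
    """Return the class name at the start of a base64-encoded Java object stream."""
    raw = base64.urlsafe_b64decode(cmd_info_prefix)
    magic, _version = struct.unpack(">HH", raw[:4])
    if magic != 0xACED:
        raise ValueError("not a Java serialization stream")
    # Expect TC_OBJECT (0x73) then TC_CLASSDESC (0x72), then a 2-byte name length.
    if raw[4:6] != b"\x73\x72":
        raise ValueError("unexpected stream layout")
    (name_len,) = struct.unpack(">H", raw[6:8])
    return raw[8:8 + name_len].decode("utf-8")


# First 44 characters of the cmdInfo value from the management log.
prefix = "rO0ABXNyABhjb20uY2xvdWQudm0uVm1Xb3JrU3RhcnR9"

if __name__ == "__main__":
    print(serialized_class_name(prefix))  # com.cloud.vm.VmWorkStart
```

The result matches the cmd field in the same log line, so the blob is just the serialized work item for the failed start job and carries no extra diagnostic message.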
Re: unable to start virtual router
I restarted the agent about a half dozen times and the router magically started itself. What's the best way to make my instances use our internal DNS servers?
system capacity not updating
I set some new overprovisioning values, stopped all running instances, and restarted the management service. When I logged back in, the system capacity had not changed. All instances are stopped, yet the dashboard still reports the same resource usage as before I shut them down. How do I refresh this information?
upgraded to 4.4.1, management console is broken
I tried upgrading from 4.4.0 to 4.4.1 but now I get a 404 error when I try to access the management console. The localhost.2014-10-03.log shows this error:

Caused by: java.io.IOException: Resource [jar:file:/usr/share/cloudstack-management/webapps/client/WEB-INF/lib/cloud-api-4.4.1.jar!/META-INF/cloudstack/api-config/module.properties] and [jar:file:/usr/share/cloudstack-management/webapps/client/WEB-INF/lib/cloud-api-4.4.0.jar!/META-INF/cloudstack/api-config/module.properties] do not appear to be the same resource, please ensure the name property is correct or that the module is not defined twice

Where is this value defined? The full log can be viewed here: pastebin.com/nrdEsxZK

I also noticed the error "4.4.0 KVM SystemVm template not found. Cannot upgrade system Vms". The URL for the system VM template in the 4.4.1 upgrade instructions is the same as the one I used when I installed 4.4.0 initially. Is there really a need to install the same template again?
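The IOException names two versions of the same jar (cloud-api-4.4.0.jar and cloud-api-4.4.1.jar) in one lib directory, which suggests that files from the old release were left behind. A small sketch for listing every artifact present in more than one version is shown below. The function name is made up, and the filename pattern (artifact, hyphen, version starting with a digit, .jar) is an assumption that will not fit every jar.

```python
import os
import re
from collections import defaultdict

# Assumes names of the form <artifact>-<version>.jar, version starting with a digit.
JAR_NAME = re.compile(r"^(?P<artifact>.+?)-(?P<version>\d[\w.\-]*)\.jar$")


def duplicated_artifacts(filenames):
    """Map each artifact found in more than one version to its sorted versions."""
    versions = defaultdict(set)
    for name in filenames:
        match = JAR_NAME.match(name)
        if match:
            versions[match.group("artifact")].add(match.group("version"))
    return {art: sorted(vers) for art, vers in versions.items() if len(vers) > 1}


if __name__ == "__main__":
    # Directory taken from the error message above.
    lib_dir = "/usr/share/cloudstack-management/webapps/client/WEB-INF/lib"
    if os.path.isdir(lib_dir):
        for artifact, found in sorted(duplicated_artifacts(os.listdir(lib_dir)).items()):
            print(artifact, found)
```

Any artifact it reports is a candidate for the "module defined twice" complaint; which copy is safe to remove depends on how the packages were installed, so that decision is left to the reader.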
Re: upgraded to 4.4.1, management console is broken
It looks like this is the root of the problem:

2014-10-03 13:44:47,131 DEBUG [c.c.u.d.Upgrade440to441] (main:null) Updating System Vm template IDs
2014-10-03 13:44:47,136 DEBUG [c.c.u.d.Upgrade440to441] (main:null) Updating LXC System Vms
2014-10-03 13:44:47,137 WARN [c.c.u.d.Upgrade440to441] (main:null) 4.4.0 LXC SystemVm template not found. LXC hypervisor is not used, so not failing upgrade
2014-10-03 13:44:47,138 DEBUG [c.c.u.d.Upgrade440to441] (main:null) Updating KVM System Vms
2014-10-03 13:44:47,141 ERROR [c.c.u.DatabaseUpgradeChecker] (main:null) Unable to upgrade the database com.cloud.utils.exception.CloudRuntimeException: 4.4.0 KVM SystemVm template not found. Cannot upgrade system Vms

Any ideas about how I can fix this? I had a 4.4.0 KVM SystemVm template prior to the upgrade.
Re: basic zone setup
Do you know which MySQL tables need to be updated to reference the new template? I'm worried that if I miss one the system will break unexpectedly the next time I launch a system VM. It might be worthwhile for me to simply reinstall the entire thing to be certain everything's set up correctly.

On Thu, Jul 31, 2014 at 12:31 AM, Erik Weber terbol...@gmail.com wrote:

Yes, if you don't want to reinstall/re-seed the system vm template, you should also download the new ones and do the mysql queries so that it is used for any future system vm deployments.

Erik

On Thu, Jul 31, 2014 at 3:17 AM, Ian Young iyo...@ratespecial.com wrote:

Yes, that makes the ssvm-check pass all the tests. Thanks. Should I repeat that upgrade with the console proxy?

On Wed, Jul 30, 2014 at 6:09 PM, Carlos Reategui car...@reategui.com wrote:

There have been some messages going around about the template needing a fix for this. Per this link (https://gist.github.com/terbolous/102ae8edd1cda192561c) from one of the messages you can try the following on the ssvm itself:

    apt-get update
    apt-get -y install openjdk-7-jre-headless openjdk-7-jre-lib
    apt-get -y remove openjdk-6-jre-headless

then you may also need to:

    service cloud stop
    sleep 3
    service cloud start

try the ssvm-check again after that. My understanding is that this is not a permanent fix, but should get you going for now.
basic zone setup
I've reinstalled CloudStack 4.4 again, configuring the network as follows:

management server:
p4p1 http://pastebin.com/skMXxVtk

hypervisor/storage server:
eth0 http://pastebin.com/LxUxFdpe
eth1 http://pastebin.com/K5si1L4d
cloudbr0 http://pastebin.com/heZGQfVs (management/storage traffic)
cloudbr1 http://pastebin.com/nprB2Kx7 (guest traffic)

When I logged into the management GUI for the first time, I skipped the wizard and went straight to the dashboard. There, I set up a basic zone as follows: http://imgur.com/a/R1dX0#0

Now that the infrastructure has been launched and the SSVM and console proxy are running, I noticed that the CentOS template is not ready. Neither the management server nor the hypervisor is downloading anything, so it doesn't appear the CentOS template will ever be ready. If I try to register my own templates, I fill out all the fields but the window just disappears when I click OK and no template is added. I don't see any new messages in the management server log at the time this occurs.

I suspect there is a storage problem. However, I can mount the NFS shares onto the management server with no problems. That's how I was able to manually download the system VM template, as the installation guide indicated.

What's wrong with this setup? I don't see any obvious errors in the management log besides these repetitive messages, which seem to contradict the fact that there is an SSVM and console proxy running: http://pastebin.com/yvW5GmSB
Re: basic zone setup
root@s-2-VM:~# /usr/local/cloud/systemvm/ssvm-check.sh
First DNS server is 192.168.100.2
PING 192.168.100.2 (192.168.100.2): 48 data bytes
56 bytes from 192.168.100.2: icmp_seq=0 ttl=64 time=1.227 ms
56 bytes from 192.168.100.2: icmp_seq=1 ttl=64 time=0.882 ms
--- 192.168.100.2 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.882/1.054/1.227/0.173 ms
Good: Can ping DNS server
Good: DNS resolves download.cloud.com
ERROR: NFS is not currently mounted
Try manually mounting from inside the VM
NFS server is 169.254.1.0
PING 169.254.1.0 (169.254.1.0): 48 data bytes
56 bytes from 169.254.1.0: icmp_seq=0 ttl=64 time=0.067 ms
56 bytes from 169.254.1.0: icmp_seq=1 ttl=64 time=0.039 ms
--- 169.254.1.0 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.039/0.053/0.067/0.000 ms
Good: Can ping nfs server
Management server is 192.168.101.3. Checking connectivity.
Good: Can connect to management server port 8250
ERROR: Java process not running. Try restarting the SSVM.

It says the NFS server is 169.254.1.0 which is the SSVM's link local address. How did it decide that? During the zone configuration I specified virthost1.lax.ratespecial.com as the NFS server and that resolves to 192.168.101.4. Also, in what path does it expect the NFS volume to be mounted?

On Wed, Jul 30, 2014 at 4:53 PM, Carlos Reategui car...@reategui.com wrote:

If the template is not ready then your ssvm may be having problems downloading it. Have you followed the info here: https://cwiki.apache.org/confluence/display/CLOUDSTACK/SSVM,+templates,+Secondary+storage+troubleshooting to make sure that the ssvm is actually working properly?
Re: basic zone setup
I found this in the cloud.out log in the SSVM:

Exception in thread main java.lang.UnsupportedClassVersionError: com/cloud/agent/AgentShell : Unsupported major.minor version 51.0
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:634)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:277)
    at java.net.URLClassLoader.access$000(URLClassLoader.java:73)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:212)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
Could not find the main class: com.cloud.agent.AgentShell. Program will exit.

It seems to have to do with a Java version mismatch. I'm using JDK 7 on both the management server and hypervisor but the SSVM is using version 6. Is this the most current system VM template?
http://download.cloud.com/templates/4.3/systemvm64template-2014-01-14-master-kvm.qcow2.bz2
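The "Unsupported major.minor version 51.0" message in the stack trace above is consistent with the mismatch described: class-file major version 51 corresponds to Java 7, while a Java 6 runtime accepts at most version 50. As a rough illustration (not part of CloudStack; the function names are hypothetical), the version can be read from the first eight bytes of any .class file:

```python
import struct

# Class-file major versions for the Java releases relevant to this thread.
MAJOR_TO_JAVA = {49: "Java 5", 50: "Java 6", 51: "Java 7", 52: "Java 8"}


def class_file_version(header):
    """Return (major, minor) parsed from the first 8 bytes of a .class file."""
    if len(header) < 8:
        raise ValueError("need at least 8 bytes")
    # Layout: 4-byte magic, 2-byte minor version, 2-byte major version (big-endian).
    magic, minor, major = struct.unpack(">IHH", header[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file")
    return major, minor


def required_java(header):
    """Name the oldest Java release able to load a class with this header."""
    major, _ = class_file_version(header)
    return MAJOR_TO_JAVA.get(major, "unknown (major %d)" % major)


if __name__ == "__main__":
    # Synthetic header with major version 51 (0x33), as in the error message.
    sample = bytes.fromhex("cafebabe00000033")
    print(required_java(sample))
```

To check a real class, one would pass the first eight bytes read from the file in binary mode; the sample header here is synthetic.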
Re: basic zone setup
Yes, that makes the ssvm-check pass all the tests. Thanks. Should I repeat that upgrade with the console proxy?

On Wed, Jul 30, 2014 at 6:09 PM, Carlos Reategui car...@reategui.com wrote: There have been some messages going around about the template needing a fix for this. Per this link (https://gist.github.com/terbolous/102ae8edd1cda192561c) from one of the messages, you can try the following on the ssvm itself:

apt-get update
apt-get -y install openjdk-7-jre-headless openjdk-7-jre-lib
apt-get -y remove openjdk-6-jre-headless

Then you may also need to:

service cloud stop
sleep 3
service cloud start

Try the ssvm-check again after that. My understanding is that this is not a permanent fix, but should get you going for now.
Re: dual NIC VLAN configuration
Is private traffic the same thing as management/storage traffic?

On Fri, Jul 25, 2014 at 11:17 PM, Geoff Higginbottom geoff.higginbot...@shapeblue.com wrote: Hi Ian, As you are deploying a Basic network there will be no public traffic. The private traffic, assuming you allocate an IP range to the POD which is in the same CIDR as the Management Server, would typically be assigned to cloudbr0:

private.network.device=cloudbr0

Guest traffic would then be assigned to cloudbr1:

guest.network.device=cloudbr1

Regards, Geoff Higginbottom, CTO / Cloud Architect
D: +44 20 3603 0542 | S: +44 20 3603 0540 | M: +447968161581
geoff.higginbot...@shapeblue.com | www.shapeblue.com | Twitter: @cloudstackguru
ShapeBlue Ltd, 53 Chandos Place, Covent Garden, London, WC2N 4HS

On 25 Jul 2014, at 19:18, Ian Young iyo...@ratespecial.com wrote: So if management/storage traffic is on cloudbr0 and guest VMs are on cloudbr1, would these be the correct settings in agent.properties? guest.network.device=cloudbr1 private.network.device=cloudbr1 public.network.device=cloudbr1

On Fri, Jul 25, 2014 at 10:11 AM, Ian Young iyo...@ratespecial.com wrote: Thank you, Geoff. That was precisely the answer I was looking for. I knew I was doing something wrong. I didn't realize the second adapter could be used without an IP address explicitly assigned to it. Yes, this is a basic zone (just an internal project so we don't need any public IP addresses). I was planning to set up an NFS server on the 192.168.101.0/24 network so this is exactly what I was trying to accomplish. Thanks.
On Fri, Jul 25, 2014 at 1:34 AM, Geoff Higginbottom geoff.higginbot...@shapeblue.com wrote: Ian, It looks like you are trying to set up a basic zone and have a Management Server on IP 192.168.101.3 and a Host on IP 192.168.101.4. The second interface on the host does not need any IP configuration on the Host, as it will not be used by the Host, so remove the 192.168.102.4 mapping. This interface will be used by the Guest VMs running on the Host, which will have their own IP schema. Your Guest IP range will be in the 192.168.102.0/24 CIDR with a gateway of 192.168.102.1. The Management Server will talk to the Host via the 1st interface, and Guest VMs will use the 2nd. You have not mentioned storage, but assuming you are using NFS for Primary and Secondary, put the NFS Server on the 192.168.101.0/24 network, and then all storage traffic will also go over the 1st interface. Regards, Geoff Higginbottom D: +44 20 3603 0542 | S: +44 20 3603 0540 | M: +447968161581 geoff.higginbot...@shapeblue.com

-----Original Message----- From: Daan Hoogland [mailto:daan.hoogl...@gmail.com] Sent: 25 July 2014 08:47 To: users@cloudstack.apache.org Subject: Re: dual NIC VLAN configuration

Ian, I would imagine that guest traffic can't go out to the net this way. Maybe you should swap them. This is only guessing, however. What are you seeing?

On Fri, Jul 25, 2014 at 2:00 AM, Ian Young iyo...@ratespecial.com wrote: Here's the less verbose version: My hypervisor has two NICs and I've set up a label on each. Traffic to and from cloudbr0 works perfectly. Traffic going into cloudbr1 goes out cloudbr0 because that interface has a default gateway. Will this pose a problem when I try to set up separate management and guest networks in CloudStack?
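Geoff's reply above assigns private (management/storage) traffic to cloudbr0 and guest traffic to cloudbr1, which differs from the all-cloudbr1 settings proposed in the quoted question. A small sketch, not a CloudStack tool, that compares an agent.properties snippet against that mapping; the helper names and the `EXPECTED` table are assumptions made for illustration, and a real agent.properties contains many other keys.

```python
# Mapping suggested in the reply for this basic zone (no public traffic).
EXPECTED = {
    "private.network.device": "cloudbr0",
    "guest.network.device": "cloudbr1",
}


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def mismatches(props, expected=EXPECTED):
    """Return {key: (actual, expected)} for every key that differs."""
    return {
        key: (props.get(key), want)
        for key, want in expected.items()
        if props.get(key) != want
    }


if __name__ == "__main__":
    proposed = """
    guest.network.device=cloudbr1
    private.network.device=cloudbr1
    public.network.device=cloudbr1
    """
    print(mismatches(parse_properties(proposed)))
```

Run against the proposed settings, the only key flagged is `private.network.device`, which the reply places on cloudbr0.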
Re: dual NIC VLAN configuration
Thank you, Geoff. That was precisely the answer I was looking for. I knew I was doing something wrong. I didn't realize the second adapter could be used without an IP address explicitly assigned to it. Yes, this is a basic zone (just an internal project so we don't need any public IP addresses). I was planning to set up an NFS server on the 192.168.101.0/24 network so this is exactly what I was trying to accomplish. Thanks.
Re: dual NIC VLAN configuration
So if management/storage traffic is on cloudbr0 and guest VMs are on cloudbr1, would these be the correct settings in agent.properties?

guest.network.device=cloudbr1
private.network.device=cloudbr1
public.network.device=cloudbr1
dual NIC VLAN configuration
I am trying to set up a server with two NICs as a hypervisor. I would like to use the two interfaces to separate management and guest traffic, as recommended by the CloudStack installation guide. This server is connected to a managed switch, which is connected to a hardware firewall, both of which are set up with tagged VLANs. Some of the ports on the switch are designated as VLAN 6 and some are VLAN 7. I've confirmed the VLANs are set up correctly by configuring eth0 and eth1 (one at a time) with the appropriate IP address, netmask, and gateway.

However, the difficulty arises when I try to configure both interfaces simultaneously. The return traffic tends to go out whichever interface is associated with the default gateway, a typical issue when using multiple network interfaces. I've followed numerous guides, which all basically say the same thing: Don't set a default gateway; use iproute2 to control the flow of traffic with route-eth0, rule-eth0, and rt_tables. I've tried setting this up numerous times to no avail, probably because the guides I'm reading don't involve VLANs. Add to that the cloudbr0 and cloudbr1 bridges that CloudStack requires and now I'm really confused as to how to set up the network. I can't be the first person to have set up CloudStack this way; it sounds pretty common. Can someone explain to me the correct way to configure these interfaces?

Here is my network information:

VLAN 6 (management)
192.168.101.0/24
gateway: 192.168.101.1

VLAN 7 (guest)
192.168.102.0/24
gateway: 192.168.102.1

current hypervisor settings:
eth0: 192.168.101.4
eth1: 192.168.102.4

current management server settings (this is a separate machine):
p4p1: 192.168.101.3
Re: dual NIC VLAN configuration
Here's the less verbose version: My hypervisor has two NICs and I've set up a label on each. Traffic to and from cloudbr0 works perfectly. Traffic going into cloudbr1 goes out cloudbr0 because that interface has a default gateway. Will this pose a problem when I try to set up separate management and guest networks in CloudStack?
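The behaviour described in this thread (replies leaving through the interface that owns the default gateway) follows from ordinary destination-based route lookup. A toy longest-prefix-match sketch using the subnets from the original post; it does not read the kernel routing table, the `ROUTES` list assumes the default gateway sits on eth0, and 10.0.0.5 is a made-up client address outside both VLANs.

```python
import ipaddress

# Simplified routing table: two connected subnets plus one default route.
# Assumption: the default gateway (192.168.101.1) is reached via eth0.
ROUTES = [
    (ipaddress.ip_network("192.168.101.0/24"), "eth0"),
    (ipaddress.ip_network("192.168.102.0/24"), "eth1"),
    (ipaddress.ip_network("0.0.0.0/0"), "eth0"),
]


def egress_interface(destination, routes=ROUTES):
    """Pick the outgoing interface by longest-prefix match on the destination."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, dev) for net, dev in routes if addr in net]
    # The default route matches everything, so matches is never empty here.
    _, dev = max(matches, key=lambda item: item[0].prefixlen)
    return dev


if __name__ == "__main__":
    # A reply to a host on the guest subnet stays on eth1.
    print(egress_interface("192.168.102.50"))
    # A reply to a remote client leaves via eth0, even if the request came in on eth1.
    print(egress_interface("10.0.0.5"))
```

Because the lookup keys only on the destination, the arrival interface plays no role; the source-based rules mentioned in the original post (route-eth0, rule-eth0, rt_tables) exist to change exactly that. Per Geoff's reply elsewhere in this thread, leaving the guest interface without a host IP sidesteps the issue for a basic zone.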
Re: console proxy times out
The SSVM is stopped. If I try to start it, it complains about insufficient capacity. CPU? RAM? I have plenty of both available.

2014-05-23 10:36:51,196 DEBUG [c.c.d.FirstFitPlanner] (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) Listing clusters in order of aggregate capacity, that have (atleast one host with) enough CPU and RAM capacity under this Pod: 1
2014-05-23 10:36:51,198 DEBUG [c.c.d.FirstFitPlanner] (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) Removing from the clusterId list these clusters from avoid set: [1]
2014-05-23 10:36:51,198 DEBUG [c.c.d.FirstFitPlanner] (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) No clusters found after removing disabled clusters and clusters in avoid list, returning.
2014-05-23 10:36:51,201 DEBUG [c.c.c.CapacityManagerImpl] (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) VM state transitted from :Starting to Stopped with event: OperationFailedvm's original host id: 1 new host id: null host id before state transition: null
2014-05-23 10:36:51,201 WARN [c.c.s.s.SecondaryStorageManagerImpl] (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) Exception while trying to start secondary storage vm
com.cloud.exception.InsufficientServerCapacityException: Unable to create a deployment for VM[SecondaryStorageVm|s-1-VM]Scope=interface com.cloud.dc.DataCenter; id=1

On Fri, May 23, 2014 at 10:35 AM, Ian Young iyo...@ratespecial.com wrote: I rebooted it and now it's in an even more broken state. It's repeatedly trying to stop the console proxy but can't because its state is Starting. Here is an excerpt from the management log: http://pastebin.com/FiaDzKXb The agent log keeps repeating these messages: http://pastebin.com/yDidSbrz What's wrong with it?

On Thu, May 22, 2014 at 12:55 PM, Ian Young iyo...@ratespecial.com wrote: I wonder if something is wrong with the NFS mount.
I see this error periodically in /var/log/messages even though I have set the Domain in /etc/idmapd.conf to the host's FQDN:

May 20 19:30:22 virthost1 rpc.idmapd[1790]: nss_getpwnam: name '0' does not map into domain 'redacted.com'
May 20 19:36:20 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does not map into domain 'redacted.com'
May 20 19:44:35 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does not map into domain 'redacted.com'
May 21 10:21:25 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does not map into domain 'redacted.com'
May 21 12:46:40 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does not map into domain 'redacted.com'
May 21 13:52:42 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does not map into domain 'redacted.com'
May 21 13:55:20 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' does not map into domain 'redacted.com'
May 21 20:31:51 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' does not map into domain 'redacted.com'
May 22 10:14:18 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does not map into domain 'redacted.com'
May 22 10:18:40 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' does not map into domain 'redacted.com'
May 22 10:19:23 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does not map into domain 'redacted.com'
May 22 10:25:16 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' does not map into domain 'redacted.com'

name '107' just started appearing in the log yesterday, which looks unusual. Up until then, the error was always name '0'.

On Thu, May 22, 2014 at 11:15 AM, Andrija Panic andrija.pa...@gmail.com wrote: I have observed this kind of problem (process blocked for more than xx sec...) when I had issues with storage access - check your disks, smartctl etc...
best Sent from Google Nexus 4

On May 22, 2014 7:49 PM, Ian Young iyo...@ratespecial.com wrote: And this is in /var/log/messages right before that event:

May 22 10:16:07 virthost1 kernel: INFO: task qemu-kvm:2971 blocked for more than 120 seconds.
May 22 10:16:07 virthost1 kernel: Not tainted 2.6.32-431.11.2.el6.x86_64 #1
May 22 10:16:07 virthost1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 22 10:16:07 virthost1 kernel: qemu-kvm D 0002 0 2971 1 0x0080
May 22 10:16:07 virthost1 kernel: 8810724e9be8 0082 88106b6529d8
May 22 10:16:07 virthost1 kernel: 880871b3e8d8 880871b3e8f0 8100bb8e 8810724e9be8
May 22 10:16:07 virthost1 kernel: 881073525058 8810724e9fd8 fbc8 881073525058
May 22 10:16:07 virthost1 kernel: Call Trace:
May 22 10:16:07 virthost1 kernel: [8100bb8e] ? apic_timer_interrupt+0xe/0x20
May 22 10:16:07 virthost1 kernel: [810555ef] ? mutex_spin_on_owner+0x9f/0xc0
May 22 10:16:07 virthost1 kernel: [8152969e] __mutex_lock_slowpath+0x13e/0x180
May 22 10:16:07 virthost1 kernel: [8152953b] mutex_lock+0x2b/0x50
May 22 10:16:07 virthost1 kernel: [a021c2cf] memory_access_ok+0x7f/0xc0 [vhost_net
Re: console proxy times out
Also, is this normal? Every time the server is rebooted, it adds another record to the mshost table but the removed field is always NULL.

http://pastebin.com/q5zDCu4b

On Fri, May 23, 2014 at 10:39 AM, Ian Young iyo...@ratespecial.com wrote:

The SSVM is stopped. If I try to start it, it complains about insufficient capacity. CPU? RAM? I have plenty of both available.

2014-05-23 10:36:51,196 DEBUG [c.c.d.FirstFitPlanner] (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) Listing clusters in order of aggregate capacity, that have (atleast one host with) enough CPU and RAM capacity under this Pod: 1
2014-05-23 10:36:51,198 DEBUG [c.c.d.FirstFitPlanner] (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) Removing from the clusterId list these clusters from avoid set: [1]
2014-05-23 10:36:51,198 DEBUG [c.c.d.FirstFitPlanner] (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) No clusters found after removing disabled clusters and clusters in avoid list, returning.
2014-05-23 10:36:51,201 DEBUG [c.c.c.CapacityManagerImpl] (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) VM state transitted from :Starting to Stopped with event: OperationFailedvm's original host id: 1 new host id: null host id before state transition: null
2014-05-23 10:36:51,201 WARN [c.c.s.s.SecondaryStorageManagerImpl] (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) Exception while trying to start secondary storage vm
com.cloud.exception.InsufficientServerCapacityException: Unable to create a deployment for VM[SecondaryStorageVm|s-1-VM]Scope=interface com.cloud.dc.DataCenter; id=1

On Fri, May 23, 2014 at 10:35 AM, Ian Young iyo...@ratespecial.com wrote:

I rebooted it and now it's in an even more broken state. It's repeatedly trying to stop the console proxy but can't because its state is Starting. Here is an excerpt from the management log:

http://pastebin.com/FiaDzKXb

The agent log keeps repeating these messages:

http://pastebin.com/yDidSbrz

What's wrong with it?
Re: console proxy times out
I destroyed the SSVM and then tried hacking the database to make CloudStack realize that the console proxy is in fact stopped.

mysql> update vm_instance set state='Stopped' where name='v-2-VM';
mysql> update host set status='Up' where name='v-2-VM';

Now they're both running and I can see the console. There's got to be a better way to use this system without having to reboot or hack the database daily.
Re: console proxy times out
I'm still getting a lot of these, though.

2014-05-23 10:52:27,908 INFO [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-3:ctx-838a6dc4 work-873) HA on VM[ConsoleProxy|v-2-VM]
2014-05-23 10:52:27,908 INFO [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-3:ctx-838a6dc4 work-873) VM VM[ConsoleProxy|v-2-VM] has been changed. Current State = Running Previous State = Starting last updated = 571 previous updated = 568
2014-05-23 10:52:27,908 INFO [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-3:ctx-838a6dc4 work-873) Completed HAWork[873-HA-2-Starting-Investigating]
local storage for system VMs
My CloudStack 4.3 system is a single server (for the time being, at least). Since a system VM malfunction is a show-stopper, I would like to host those on local storage to avoid issues with NFS mounts. I have changed the value system.vm.use.local.storage to true. I don't see an option for use.local.storage, so maybe that's been removed. The system offerings for SSVM, console proxy, and software router are now set to Storage Type = local. Do I need to create new compute offerings, or is that for regular instances? I want to keep normal instances on shared storage. How do I make sure the system VMs are running on local storage? I've restarted them but the qemu process still says -drive file=/mnt/2a7ec307-d797-3287-aa31-7e280afb56cf/d8668fbc-dd3b-4c85-952e-40947eda7b99,if=none,id=drive-virtio-disk0,format=qcow2,cache=none which is a shared volume. Do I need to destroy them and create new ones?
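One way to check where a running system VM's disk actually lives is to pull the file= path out of the -drive argument of its qemu process and compare it with the mount points on the host. The following is only a rough Python sketch: it assumes the qemu command line has already been captured as a single string (for example from ps), and the function name drive_files is made up for illustration.

```python
import shlex


def drive_files(qemu_cmdline):
    """Return the file= path of every -drive argument in a qemu command line.

    Assumption: paths contain no commas (qemu escapes a literal comma as ",,",
    which this sketch does not handle).
    """
    args = shlex.split(qemu_cmdline)
    paths = []
    # Walk over (flag, value) pairs so that each -drive is matched with its option string.
    for flag, value in zip(args, args[1:]):
        if flag != "-drive":
            continue
        # The option string is a comma-separated list of key=value pairs.
        for option in value.split(","):
            key, _, val = option.partition("=")
            if key == "file":
                paths.append(val)
    return paths
```

Called on the -drive argument quoted above, this returns the single path under /mnt/2a7ec307-d797-3287-aa31-7e280afb56cf, which is the shared mount. A path on local storage would be expected to sit under a different directory.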
Re: console proxy times out
The console proxy became unavailable again yesterday afternoon. I could SSH into it via its link-local address and nothing seemed to be wrong inside the VM itself. However, the qemu-kvm process for that VM was at almost 100% CPU. Inside the VM, the CPU usage was minimal and the java process was running and listening on port 443. So there seems to be something wrong with it down at the KVM/QEMU level. It's weird how this keeps happening to the console proxy only and not any of the other VMs. I tried to reboot it from the management UI and after about 15 minutes, it finally did. Now the console proxy is working but I don't know how long it will last before it breaks again.

I found this in libvirtd.log, which corresponds with the time the console proxy rebooted:

2014-05-22 17:17:04.362+: 25195: info : libvirt version: 0.10.2, package: 29.el6_5.7 (CentOS BuildSystem http://bugs.centos.org, 2014-04-07-07:42:04, c6b9.bsys.dev.centos.org)
2014-05-22 17:17:04.362+: 25195: error : qemuMonitorIO:614 : internal error End of file from monitor
Re: console proxy times out
And this is in /var/log/messages right before that event:

May 22 10:16:07 virthost1 kernel: INFO: task qemu-kvm:2971 blocked for more than 120 seconds.
May 22 10:16:07 virthost1 kernel: Not tainted 2.6.32-431.11.2.el6.x86_64 #1
May 22 10:16:07 virthost1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 22 10:16:07 virthost1 kernel: qemu-kvm D 0002 0 2971 1 0x0080
May 22 10:16:07 virthost1 kernel: 8810724e9be8 0082 88106b6529d8
May 22 10:16:07 virthost1 kernel: 880871b3e8d8 880871b3e8f0 8100bb8e 8810724e9be8
May 22 10:16:07 virthost1 kernel: 881073525058 8810724e9fd8 fbc8 881073525058
May 22 10:16:07 virthost1 kernel: Call Trace:
May 22 10:16:07 virthost1 kernel: [8100bb8e] ? apic_timer_interrupt+0xe/0x20
May 22 10:16:07 virthost1 kernel: [810555ef] ? mutex_spin_on_owner+0x9f/0xc0
May 22 10:16:07 virthost1 kernel: [8152969e] __mutex_lock_slowpath+0x13e/0x180
May 22 10:16:07 virthost1 kernel: [8152953b] mutex_lock+0x2b/0x50
May 22 10:16:07 virthost1 kernel: [a021c2cf] memory_access_ok+0x7f/0xc0 [vhost_net]
May 22 10:16:07 virthost1 kernel: [a021d89c] vhost_dev_ioctl+0x2ec/0xa50 [vhost_net]
May 22 10:16:07 virthost1 kernel: [a021c411] ? vhost_work_flush+0xe1/0x120 [vhost_net]
May 22 10:16:07 virthost1 kernel: [8122db91] ? avc_has_perm+0x71/0x90
May 22 10:16:07 virthost1 kernel: [a021f11a] vhost_net_ioctl+0x7a/0x5d0 [vhost_net]
May 22 10:16:07 virthost1 kernel: [8122f914] ? inode_has_perm+0x54/0xa0
May 22 10:16:07 virthost1 kernel: [a01a28b7] ? kvm_vcpu_ioctl+0x1e7/0x580 [kvm]
May 22 10:16:07 virthost1 kernel: [8108b14e] ? send_signal+0x3e/0x90
May 22 10:16:07 virthost1 kernel: [8119dc12] vfs_ioctl+0x22/0xa0
May 22 10:16:07 virthost1 kernel: [8119ddb4] do_vfs_ioctl+0x84/0x580
May 22 10:16:07 virthost1 kernel: [8119e331] sys_ioctl+0x81/0xa0
May 22 10:16:07 virthost1 kernel: [810e1e4e] ? __audit_syscall_exit+0x25e/0x290
May 22 10:16:07 virthost1 kernel: [8100b072] system_call_fastpath+0x16/0x1b
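For anyone who wants to scan /var/log/messages for the hung-task warning quoted in this thread, here is a rough Python sketch. It only assumes the "INFO: task NAME:PID blocked for more than N seconds" wording seen in the log above; the function name hung_tasks is made up for illustration, and task names containing spaces are not handled.

```python
import re

# Matches the kernel hung-task warning as it appears in the quoted log.
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<comm>\S+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def hung_tasks(lines):
    """Return (task name, pid, seconds) for every hung-task warning in lines."""
    found = []
    for line in lines:
        match = HUNG_TASK_RE.search(line)
        if match:
            found.append(
                (match.group("comm"), int(match.group("pid")), int(match.group("secs")))
            )
    return found
```

The pid can then be compared with the running qemu-kvm processes to see which VM the blocked task belongs to.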
Re: console proxy times out
I wonder if something is wrong with the NFS mount. I see this error periodically in /var/log/messages even though I have set the Domain in /etc/idmapd.conf to the host's FQDN:

May 20 19:30:22 virthost1 rpc.idmapd[1790]: nss_getpwnam: name '0' does not map into domain 'redacted.com'
May 20 19:36:20 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does not map into domain 'redacted.com'
May 20 19:44:35 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does not map into domain 'redacted.com'
May 21 10:21:25 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does not map into domain 'redacted.com'
May 21 12:46:40 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does not map into domain 'redacted.com'
May 21 13:52:42 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does not map into domain 'redacted.com'
May 21 13:55:20 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' does not map into domain 'redacted.com'
May 21 20:31:51 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' does not map into domain 'redacted.com'
May 22 10:14:18 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does not map into domain 'redacted.com'
May 22 10:18:40 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' does not map into domain 'redacted.com'
May 22 10:19:23 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does not map into domain 'redacted.com'
May 22 10:25:16 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' does not map into domain 'redacted.com'

name '107' just started appearing in the log yesterday, which looks unusual. Up until then, the error was always name '0'.

On Thu, May 22, 2014 at 11:15 AM, Andrija Panic andrija.pa...@gmail.com wrote:

I have observed this kind of problem (process blocked for more than xx sec...) when I had problems with storage access. Check your disks, smartctl, etc.

Best
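To see how often each name shows up in the rpc.idmapd errors quoted in this thread, the lines can be tallied with a few lines of Python. This is only a sketch: it assumes the "nss_getpwnam: name '...' does not map into domain" wording seen above, and count_unmapped_names is a made-up helper name.

```python
import re
from collections import Counter

# Captures the quoted name from an rpc.idmapd "does not map into domain" error.
IDMAPD_RE = re.compile(r"nss_getpwnam: name '([^']*)' does not map into domain")


def count_unmapped_names(lines):
    """Count how many times each name appears in rpc.idmapd mapping errors."""
    counts = Counter()
    for line in lines:
        match = IDMAPD_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Run over the twelve lines quoted in this thread, it would separate the '0' entries from the newer '107' entries, which makes it easier to see when the second name started appearing.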
Re: console proxy times out
I was able to get it working by following these steps:

1. stop all instances
2. service cloudstack-management stop
3. service cloudstack-agent stop
4. virsh shutdown {domain} (for each of the system VMs)
5. service libvirtd stop
6. umount primary and secondary
7. reboot

The console proxy is working again. I expect it will probably break again in a day or two. I have a feeling it's a result of this libvirtd bug, since I've seen the "cannot acquire state change lock" error several times.

https://bugs.launchpad.net/nova/+bug/1254872

I might try building my own libvirtd 1.0.3 for EL6.

On Tue, May 20, 2014 at 6:21 PM, Ian Young iyo...@ratespecial.com wrote:

So I got the console proxy working via HTTPS (by managing my own realhostip.com DNS) last week and everything was working fine. Today, all of a sudden, the console proxy stopped working again. The browser says, "Connecting to 192-168-100-159.realhostip.com..." and eventually times out. I tried to restart it and it went into a Stopping state that never completed and the Agent State was Disconnected. I could not shut down the VM using virsh or with kill -9 because libvirtd kept saying, "cannot acquire state change lock," so I gracefully shut down the remaining instances and rebooted the entire management server/hypervisor. Start over.

When it came back up, the SSVM and console proxy started but the virtual router was stopped. I was able to manually start it from the UI. The console proxy still times out when I try to access it from a browser. I don't see any errors in the management or agent logs, just this:

2014-05-20 18:04:27,632 DEBUG [c.c.a.t.Request] (catalina-exec-10:null) Seq 1-2130378876: Sending { Cmd , MgmtId: 55157049428734, via: 1( virthost1.redacted.com), Ver: v1, Flags: 100011, [{com.cloud.agent.api.GetVncPortCommand:{id:4,name:r-4-VM,wait:0}}] }
2014-05-20 18:04:27,684 DEBUG [c.c.a.t.Request] (AgentManager-Handler-3:null) Seq 1-2130378876: Processing: { Ans: , MgmtId: 55157049428734, via: 1, Ver: v1, Flags: 10, [{com.cloud.agent.api.GetVncPortAnswer:{address:192.168.100.6,port:5902,result:true,wait:0}}] }
2014-05-20 18:04:27,684 DEBUG [c.c.a.t.Request] (catalina-exec-10:null) Seq 1-2130378876: Received: { Ans: , MgmtId: 55157049428734, via: 1, Ver: v1, Flags: 10, { GetVncPortAnswer } }
2014-05-20 18:04:27,684 DEBUG [c.c.s.ConsoleProxyServlet] (catalina-exec-10:null) Port info 192.168.100.6
2014-05-20 18:04:27,684 INFO [c.c.s.ConsoleProxyServlet] (catalina-exec-10:null) Parse host info returned from executing GetVNCPortCommand. host info: 192.168.100.6
2014-05-20 18:04:27,686 DEBUG [c.c.s.ConsoleProxyServlet] (catalina-exec-10:null) Compose console url: https://192-168-100-159.realhostip.com/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn7VwF6503ziEqCRejlRsVcsyQcUfemTRXlhAOpJUyRugyCuTjmbUIX3EY1cHnFMKwF8FXXZr_PgwyXGPEoOHhkdRgsyRiczbk_Unuh4KmRngATr0FPCLtqhwIMpnbLSYwpnFDz65k9lEJmK6IlXYKVpWXg2rpVEsQvaNlulrZdhMQ7qUbacn82EG43OY8nmwm1SYB8TrUFH5Btb1RHpJm9A
2014-05-20 18:04:27,686 DEBUG [c.c.s.ConsoleProxyServlet] (catalina-exec-10:null) the console url is :: <html><title>r-4-VM</title><frameset><frame src="https://192-168-100-159.realhostip.com/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn7VwF6503ziEqCRejlRsVcsyQcUfemTRXlhAOpJUyRugyCuTjmbUIX3EY1cHnFMKwF8FXXZr_PgwyXGPEoOHhkdRgsyRiczbk_Unuh4KmRngATr0FPCLtqhwIMpnbLSYwpnFDz65k9lEJmK6IlXYKVpWXg2rpVEsQvaNlulrZdhMQ7qUbacn82EG43OY8nmwm1SYB8TrUFH5Btb1RHpJm9A"></frame></frameset></html>
2014-05-20 18:04:29,216 DEBUG [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-4:null) SeqA 2-545: Processing Seq 2-545: { Cmd , MgmtId: -1, via: 2, Ver: v1, Flags: 11, [{com.cloud.agent.api.ConsoleProxyLoadReportCommand:{_proxyVmId:2,_loadInfo:{\n \connections\: []\n},wait:0}}] }

If I try to restart the system VMs with cloudstack-sysvmadm, it says:

Stopping and starting 1 secondary storage vm(s)...
curl: (7) couldn't connect to host
ERROR: Failed to stop secondary storage vm with id 1
Done stopping and starting secondary storage vm(s)
Stopping and starting 1 console proxy vm(s)...
curl: (7) couldn't connect to host
ERROR: Failed to stop console proxy vm with id 2
Done stopping and starting console proxy vm(s) .
Stopping and starting 1 running routing vm(s)...
curl: (7) couldn't connect to host
2
Done restarting router(s).

I notice there are now four entries for the same management server in the mshost table, and they all are in an Up state and the removed field is NULL. What's wrong with this system?
Re: console proxy times out
I built and installed a libvirt 1.0.4 package from the Fedora src rpm. It installed fine inside a test VM, but installing it on the real hypervisor was a bad idea and I doubt I'll be pursuing it further. All VMs promptly stopped and this appeared in libvirtd.log:

2014-05-21 20:36:19.260+: 23567: info : libvirt version: 1.0.4, package: 1.el6 (Unknown, 2014-05-21-11:36:09, redacted.com)
2014-05-21 20:36:19.260+: 23567: warning : virDriverLoadModule:72 : Module /usr/lib64/libvirt/connection-driver/libvirt_driver_network.so not accessible
2014-05-21 20:36:19.260+: 23567: warning : virDriverLoadModule:72 : Module /usr/lib64/libvirt/connection-driver/libvirt_driver_storage.so not accessible
2014-05-21 20:36:19.260+: 23567: warning : virDriverLoadModule:72 : Module /usr/lib64/libvirt/connection-driver/libvirt_driver_nodedev.so not accessible
2014-05-21 20:36:19.260+: 23567: warning : virDriverLoadModule:72 : Module /usr/lib64/libvirt/connection-driver/libvirt_driver_secret.so not accessible
2014-05-21 20:36:19.260+: 23567: warning : virDriverLoadModule:72 : Module /usr/lib64/libvirt/connection-driver/libvirt_driver_nwfilter.so not accessible
2014-05-21 20:36:19.260+: 23567: warning : virDriverLoadModule:72 : Module /usr/lib64/libvirt/connection-driver/libvirt_driver_interface.so not accessible
2014-05-21 20:36:19.260+: 23567: warning : virDriverLoadModule:72 : Module /usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so not accessible
2014-05-21 20:36:19.260+: 23567: warning : virDriverLoadModule:72 : Module /usr/lib64/libvirt/connection-driver/libvirt_driver_lxc.so not accessible
2014-05-21 20:36:49.471+: 23570: error : do_open:1220 : no connection driver available for qemu:///system
2014-05-21 20:36:49.472+: 23567: error : virNetSocketReadWire:1370 : End of file while reading data: Input/output error
2014-05-21 20:36:49.473+: 23571: error : do_open:1220 : no connection driver available for lxc:///
2014-05-21 20:36:49.474+: 23567: error : virNetSocketReadWire:1370 : End of file while reading data: Input/output error
2014-05-21 20:36:49.475+: 23568: error : do_open:1220 : no connection driver available for qemu:///system
2014-05-21 20:36:49.476+: 23567: error : virNetSocketReadWire:1370 : End of file while reading data: Input/output error
2014-05-21 20:36:49.678+: 23575: error : do_open:1220 : no connection driver available for qemu:///system
2014-05-21 20:36:49.678+: 23567: error : virNetSocketReadWire:1370 : End of file while reading data: Input/output error
2014-05-21 20:36:49.681+: 23572: error : do_open:1220 : no connection driver available for qemu:///system
2014-05-21 20:36:49.682+: 23567: error : virNetSocketReadWire:1370 : End of file while reading data: Input/output error
console proxy times out
So I got the console proxy working via HTTPS (by managing my own realhostip.com DNS) last week and everything was working fine. Today, all of a sudden, the console proxy stopped working again. The browser says "Connecting to 192-168-100-159.realhostip.com..." and eventually times out. I tried to restart it and it went into a Stopping state that never completed and the Agent State was Disconnected. I could not shut down the VM using virsh or with kill -9 because libvirtd kept saying "cannot acquire state change lock", so I gracefully shut down the remaining instances and rebooted the entire management server/hypervisor. Start over.

When it came back up, the SSVM and console proxy started but the virtual router was stopped. I was able to manually start it from the UI. The console proxy still times out when I try to access it from a browser. I don't see any errors in the management or agent logs, just this:

2014-05-20 18:04:27,632 DEBUG [c.c.a.t.Request] (catalina-exec-10:null) Seq 1-2130378876: Sending { Cmd , MgmtId: 55157049428734, via: 1(virthost1.redacted.com), Ver: v1, Flags: 100011, [{com.cloud.agent.api.GetVncPortCommand:{id:4,name:r-4-VM,wait:0}}] }
2014-05-20 18:04:27,684 DEBUG [c.c.a.t.Request] (AgentManager-Handler-3:null) Seq 1-2130378876: Processing: { Ans: , MgmtId: 55157049428734, via: 1, Ver: v1, Flags: 10, [{com.cloud.agent.api.GetVncPortAnswer:{address:192.168.100.6,port:5902,result:true,wait:0}}] }
2014-05-20 18:04:27,684 DEBUG [c.c.a.t.Request] (catalina-exec-10:null) Seq 1-2130378876: Received: { Ans: , MgmtId: 55157049428734, via: 1, Ver: v1, Flags: 10, { GetVncPortAnswer } }
2014-05-20 18:04:27,684 DEBUG [c.c.s.ConsoleProxyServlet] (catalina-exec-10:null) Port info 192.168.100.6
2014-05-20 18:04:27,684 INFO [c.c.s.ConsoleProxyServlet] (catalina-exec-10:null) Parse host info returned from executing GetVNCPortCommand. host info: 192.168.100.6
2014-05-20 18:04:27,686 DEBUG [c.c.s.ConsoleProxyServlet] (catalina-exec-10:null) Compose console url: https://192-168-100-159.realhostip.com/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn7VwF6503ziEqCRejlRsVcsyQcUfemTRXlhAOpJUyRugyCuTjmbUIX3EY1cHnFMKwF8FXXZr_PgwyXGPEoOHhkdRgsyRiczbk_Unuh4KmRngATr0FPCLtqhwIMpnbLSYwpnFDz65k9lEJmK6IlXYKVpWXg2rpVEsQvaNlulrZdhMQ7qUbacn82EG43OY8nmwm1SYB8TrUFH5Btb1RHpJm9A
2014-05-20 18:04:27,686 DEBUG [c.c.s.ConsoleProxyServlet] (catalina-exec-10:null) the console url is :: <html><title>r-4-VM</title><frameset><frame src="https://192-168-100-159.realhostip.com/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn7VwF6503ziEqCRejlRsVcsyQcUfemTRXlhAOpJUyRugyCuTjmbUIX3EY1cHnFMKwF8FXXZr_PgwyXGPEoOHhkdRgsyRiczbk_Unuh4KmRngATr0FPCLtqhwIMpnbLSYwpnFDz65k9lEJmK6IlXYKVpWXg2rpVEsQvaNlulrZdhMQ7qUbacn82EG43OY8nmwm1SYB8TrUFH5Btb1RHpJm9A"></frame></frameset></html>
2014-05-20 18:04:29,216 DEBUG [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-4:null) SeqA 2-545: Processing Seq 2-545: { Cmd , MgmtId: -1, via: 2, Ver: v1, Flags: 11, [{com.cloud.agent.api.ConsoleProxyLoadReportCommand:{_proxyVmId:2,_loadInfo:{\n \connections\: []\n},wait:0}}] }

If I try to restart the system VMs with cloudstack-sysvmadm, it says:

Stopping and starting 1 secondary storage vm(s)...
curl: (7) couldn't connect to host
ERROR: Failed to stop secondary storage vm with id 1
Done stopping and starting secondary storage vm(s)
Stopping and starting 1 console proxy vm(s)...
curl: (7) couldn't connect to host
ERROR: Failed to stop console proxy vm with id 2
Done stopping and starting console proxy vm(s) .
Stopping and starting 1 running routing vm(s)...
curl: (7) couldn't connect to host
2
Done restarting router(s).

I notice there are now four entries for the same management server in the mshost table, and they all are in an Up state and the removed field is NULL. What's wrong with this system?
can't ping guest network
My VMs can reach the rest of our internal network and even the internet, but nothing except the management/hypervisor can reach the VMs. I monitored eth0 on one of the VMs while I tried to SSH to it from another workstation and it displayed this:

17:05:29.031584 ARP, Request who-has monitor.cs1cloud.internal tell 192.168.100.166, length 46

I have the network bridge set up correctly and I've tried disabling iptables and SELinux just to rule those out. There must be something simple I overlooked. Why does outbound traffic work but inbound traffic doesn't?
Re: can't ping guest network
I forgot to add ingress rules to the security group. It works now.
Re: replacement for realhostip
The problem appears to be with the console proxy itself. Here are the ports that are listening on the public interface, according to an nmap TCP scan:

PORT    STATE  SERVICE
80/tcp  open   http
443/tcp closed https

When I logged into the console proxy through the link-local address, I checked for processes on port 443 and there are none, so obviously an HTTPS connection can't be made. There is a Java process listening on port 80 but nothing on 443. Is there something in the global settings that will enable HTTPS, or is this a bug?

root@v-2-VM:~# netstat -lnp | grep java
tcp        0      0 0.0.0.0:8001            0.0.0.0:*               LISTEN      3491/java
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN      3491/java
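The port check described in this message (80 open, 443 closed on the console proxy's public interface) can also be repeated without nmap. The following is only a minimal sketch using Python's standard socket module; the function names is_port_open and scan_ports are made up for illustration and are not part of CloudStack.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def scan_ports(host: str, ports, timeout: float = 3.0) -> dict:
    """Map each port to True (accepting connections) or False."""
    return {port: is_port_open(host, port, timeout) for port in ports}
```

Calling scan_ports("192.168.100.159", [80, 443]) from a workstation on the same network would be the equivalent of the nmap scan above. For a proxy in the state described, one would expect 80 to report open and 443 closed, which points at the proxy not listening for HTTPS rather than at the certificate.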
Re: replacement for realhostip
Ok, so the console proxy needed to be restarted in order for the consoleproxy.url.domain setting to take effect. However, I still can't see the console. In Chrome, it just shows a frowning face with no error message (not very useful). In Firefox, at least it tells me the certificate is not trusted because it is self-signed but it doesn't give me the option to accept it. It's not an unreasonable expectation to be able to use self-signed SSL certificates for an internal site. Is there a setting in CloudStack that allows them to be trusted?
Re: replacement for realhostip
Version: 4.3
consoleproxy.url.domain = realhostip.com

It's working now. I'm just responding to clarify those questions.

On Thu, May 15, 2014 at 10:43 AM, Amogh Vasekar amogh.vase...@citrix.com wrote:

Hi, Which version of CloudStack are you on? Also, what does the config consoleproxy.url.domain refer to? Thanks, Amogh

On 5/14/14 5:41 PM, Ian Young iyo...@ratespecial.com wrote:

I decided to create my own internal realhostip.com. My DNS servers use PowerDNS, not BIND, so the $GENERATE directive was not an option and I didn't want to have to populate my DNS servers' databases with a record for every possible IP address. Fortunately, I found the following Lua script: https://github.com/terbolous/powerdns-cloudstack-proxy-dns

I can confirm the Lua script works as expected and my CloudStack server can be tricked into believing my internal DNS servers are the authority for realhostip.com:

[root@virthost1 ]# dig +short 1-2-3-4.realhostip.com
1.2.3.4

I followed this guide and updated the console proxy/SSVM SSL certificate with my own *.realhostip.com certificate: http://docs.cloudstack.apache.org/projects/cloudstack-administration/en/latest/systemvm.html#changing-the-console-proxy-ssl-certificate-and-domain

The console proxy restarted but it's still blank when I try to view the console. Does the domain have to be something other than realhostip.com?
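The naming scheme behind the dig test above (1-2-3-4.realhostip.com resolving to 1.2.3.4) is simple enough to sketch. The following Python is only an illustration of that mapping, not the code of the linked Lua script, which I have not reproduced here; the function name realhostip_to_ip is hypothetical.

```python
import ipaddress
from typing import Optional


def realhostip_to_ip(hostname: str, domain: str = "realhostip.com") -> Optional[str]:
    """Map a dash-encoded name such as 1-2-3-4.realhostip.com to 1.2.3.4.

    Returns None when the name is outside the domain or the first label
    does not encode a valid IPv4 address.
    """
    name = hostname.rstrip(".").lower()
    suffix = "." + domain.lower()
    if not name.endswith(suffix):
        return None
    label = name[: -len(suffix)]
    if "." in label:
        # Only a single label directly under the domain is accepted.
        return None
    try:
        return str(ipaddress.IPv4Address(label.replace("-", ".")))
    except ipaddress.AddressValueError:
        return None
```

A resolver hook built on this idea answers any A query under the domain from the name alone, which is why no per-address records need to be stored in the DNS database.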
Re: cloudstack 4.3 installation on CentOS
Also, which version of NFS are you using? For NFSv4 you need to export a pseudo file system designated by fsid=0, which this guide doesn't mention. See section 18.7.1.1, "Using exportfs with NFSv4", in the following article for more information: http://www.centos.org/docs/5/html/Deployment_Guide-en-US/s1-nfs-server-config-exports.html
Re: replacement for realhostip
I just realized I had to set the consoleproxy.url.domain field to realhostip.com, but now when I try to view the console, the browser says "The server refused the connection." Does that indicate a problem with the SSL certificate?

management-server.log:

2014-05-15 14:43:55,506 DEBUG [c.c.a.t.Request] (catalina-exec-15:null) Seq 1-90898443: Sending { Cmd , MgmtId: 161342909744, via: 1(virthost1.lax.ratespecial.com), Ver: v1, Flags: 100011, [{com.cloud.agent.api.GetVncPortCommand:{id:2,name:v-2-VM,wait:0}}] }
2014-05-15 14:43:55,563 DEBUG [c.c.a.t.Request] (AgentManager-Handler-5:null) Seq 1-90898443: Processing: { Ans: , MgmtId: 161342909744, via: 1, Ver: v1, Flags: 10, [{com.cloud.agent.api.GetVncPortAnswer:{address:192.168.100.6,port:5901,result:true,wait:0}}] }
2014-05-15 14:43:55,563 DEBUG [c.c.a.t.Request] (catalina-exec-15:null) Seq 1-90898443: Received: { Ans: , MgmtId: 161342909744, via: 1, Ver: v1, Flags: 10, { GetVncPortAnswer } }
2014-05-15 14:43:55,563 DEBUG [c.c.s.ConsoleProxyServlet] (catalina-exec-15:null) Port info 192.168.100.6
2014-05-15 14:43:55,563 INFO [c.c.s.ConsoleProxyServlet] (catalina-exec-15:null) Parse host info returned from executing GetVNCPortCommand. host info: 192.168.100.6
2014-05-15 14:43:55,570 DEBUG [c.c.s.ConsoleProxyServlet] (catalina-exec-15:null) Compose console url: https://192-168-100-159.realhostip.com/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn7K0vuegv6oMAAq_vDY4Vr_f7jwoVQDkxAE1vmK9oRhy9pvBVlmAdCer6hlVjXQlwL9oJEQO4thhSDg2qeNji02xuxlSmDilVKnd9U9xiHqIV-PgktrKq3J2GT1EpcpTvhsew5COQ1h3j8M9IM8KLZpYA0dDp7TejMmfgSiQI8ifZSh_nNLyyqBzYvl1XWxSaDIrnj7UsP3JKUq74kdY5Pg
2014-05-15 14:43:55,570 DEBUG [c.c.s.ConsoleProxyServlet] (catalina-exec-15:null) the console url is :: <html><title>v-2-VM</title><frameset><frame src="https://192-168-100-159.realhostip.com/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn7K0vuegv6oMAAq_vDY4Vr_f7jwoVQDkxAE1vmK9oRhy9pvBVlmAdCer6hlVjXQlwL9oJEQO4thhSDg2qeNji02xuxlSmDilVKnd9U9xiHqIV-PgktrKq3J2GT1EpcpTvhsew5COQ1h3j8M9IM8KLZpYA0dDp7TejMmfgSiQI8ifZSh_nNLyyqBzYvl1XWxSaDIrnj7UsP3JKUq74kdY5Pg"></frame></frameset></html>

ssl_access_log:

192.168.100.166 - - [15/May/2014:14:44:55 -0700] "GET /client/console?cmd=access&vm=086b5822-de00-4764-8b05-d8e00657ee54 HTTP/1.1" 200 405

On Wed, May 14, 2014 at 5:56 PM, Ian Young iyo...@ratespecial.com wrote:

Looks like it's still using HTTP, not HTTPS:

2014-05-14 17:52:35,812 DEBUG [c.c.a.t.Request] (catalina-exec-20:null) Seq 1-800529939: Sending { Cmd , MgmtId: 161342909744, via: 1(virthost1.lax.ratespecial.com), Ver: v1, Flags: 100011, [{com.cloud.agent.api.GetVncPortCommand:{id:6,name:i-5-6-VM,wait:0}}] }
2014-05-14 17:52:35,861 DEBUG [c.c.a.t.Request] (AgentManager-Handler-1:null) Seq 1-800529939: Processing: { Ans: , MgmtId: 161342909744, via: 1, Ver: v1, Flags: 10, [{com.cloud.agent.api.GetVncPortAnswer:{address:192.168.100.6,port:5903,result:true,wait:0}}] }
2014-05-14 17:52:35,861 DEBUG [c.c.a.t.Request] (catalina-exec-20:null) Seq 1-800529939: Received: { Ans: , MgmtId: 161342909744, via: 1, Ver: v1, Flags: 10, { GetVncPortAnswer } }
2014-05-14 17:52:35,861 DEBUG [c.c.s.ConsoleProxyServlet] (catalina-exec-20:null) Port info 192.168.100.6
2014-05-14 17:52:35,861 INFO [c.c.s.ConsoleProxyServlet] (catalina-exec-20:null) Parse host info returned from executing GetVNCPortCommand. host info: 192.168.100.6
2014-05-14 17:52:35,865 DEBUG [c.c.s.ConsoleProxyServlet] (catalina-exec-20:null) Compose console url: http://192.168.100.159/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn_GSECIK5nC2lBX8cMHvt1_GrmwDVK1PEEAwyueLlgNRgodobz8Lsyv2jEc-mUvMH340AYGt0FyZOuXIA6dunN3yx-bP-vp4rao5Up61eJwOvqFr3PhggNpbq5Up59ObOdYMe2GsBP_3FrL8ZQfBhNBSmViHQ0fKJSyUHDoC9tKlfs2Bb0rPOBxsZeTPfe-hDuaVT-pZxjQXCKM93sujnWw
2014-05-14 17:52:35,865 DEBUG [c.c.s.ConsoleProxyServlet] (catalina-exec-20:null) the console url is :: <html><title>phonesynergy</title><frameset><frame src="http://192.168.100.159/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn_GSECIK5nC2lBX8cMHvt1_GrmwDVK1PEEAwyueLlgNRgodobz8Lsyv2jEc-mUvMH340AYGt0FyZOuXIA6dunN3yx-bP-vp4rao5Up61eJwOvqFr3PhggNpbq5Up59ObOdYMe2GsBP_3FrL8ZQfBhNBSmViHQ0fKJSyUHDoC9tKlfs2Bb0rPOBxsZeTPfe-hDuaVT-pZxjQXCKM93sujnWw"></frame></frameset></html>
Re: Quick Installation Guide for CentOS 6.5
Two things to check:

1. That quick start guide used to have the wrong URL for the system VM template, but it looks like it has been corrected since then. Check your command line history and see if the template you downloaded was the same as the one in this section: http://cloudstack-installation.readthedocs.org/en/latest/qig.html#system-template-setup

2. For NFSv4 you need to export a pseudo file system designated by fsid=0, which this guide doesn't mention. See section 18.7.1.1, "Using exportfs with NFSv4", in the following article for more information: http://www.centos.org/docs/5/html/Deployment_Guide-en-US/s1-nfs-server-config-exports.html

On Tue, May 6, 2014 at 5:33 AM, dimas smid...@gmail.com wrote:

Samuel Winchenbach swinchen@... writes:

Hi all, I am following the quick installation guide exactly (I even set up a gateway w/ IP 172.16.10.1) but I cannot get the setup to complete. The System VMs remain stuck on Starting. Any help would be greatly appreciated! Now I logged into the management console and used the exact settings as listed in the Quick Install Guide. It seems to hang forever (2 hours) on "Creating system VMs (this may take a while)".

Hi, I encountered the same problem as you. Did you manage to find a solution? Looking forward to your answer.
Re: cloudstack 4.3 installation on CentOS
That quick start guide used to have the wrong URL for the system VM template but it looks like it has been corrected since then. Check your command line history and see if the template you downloaded was the same as the one in this section: http://cloudstack-installation.readthedocs.org/en/latest/qig.html#system-template-setup On Wed, May 7, 2014 at 10:26 AM, dimas yoga pratama smid...@gmail.com wrote: Yes, I followed the wizard. This is my configuration with the basic installation. Management server IP: 10.151.32.51 Zone Configuration: Name: Zone1 Public DNS1: 202.46.129.2 (College DNS) Public DNS2: - Internal DNS1: 10.151.32.6 (My Lab DNS) Internal DNS2: - Pod Configuration: Name: Pod1 Gateway: 10.151.32.1 Netmask: 255.255.255.0 Start/end reserved system IPs: 10.151.32.60 - 10.151.32.80 Guest Gateway: 10.151.32.1 Guest Netmask: 255.255.255.0 Guest start/end IP: 10.151.32.90 - 10.151.32.200 Cluster Configuration: Name: Cluster1 Hypervisor: KVM Host Configuration: Hostname: 10.151.32.51 (because I am building CloudStack on a single machine) Username: root Password: password Primary storage: Name: Primary1 Server: 10.151.32.51 Path: /primary Secondary storage: NFS server: 10.151.32.51 Path: /secondary Is there anything wrong with my configuration? I want to test CloudStack in my lab environment. Thanks. On Wed, May 7, 2014 at 6:31 AM, Pierre-Luc Dion pd...@cloudops.com wrote: Hi, Did you follow the zone creation wizard from the UI? Could it be that there are not enough management IPs in the pod? On Tuesday, May 6, 2014, dimas yoga pratama smid...@gmail.com wrote: Hi all, I'm trying to install CloudStack 4.3 on CentOS 6.5 on a single machine in a proxy environment. I followed this guide: http://cloudstack-installation.readthedocs.org/en/latest/qig.html. I managed to log in to the CloudStack dashboard and succeeded in adding a host. 
When it comes to the Creating system VMs (this may take a while) step, it takes a very long time, and when I refresh the browser it suddenly redirects to the dashboard. I checked the Infrastructure tab and found 2 system VMs already created; the VM state showed Starting but the agent state showed nothing. Also, in the Dashboard tab I found this notification: Management Server: Management network CIDR is not configured original.type 14 Why is that happening? Is anything wrong with my installation? Looking forward to your answer. -- Pierre-Luc Dion Architecte de Solution Cloud | Cloud Solutions Architect 855-OK-CLOUD (855-652-5683) x1101 - - - CloudOps 420 rue Guy Montréal QC H3J 1S6 www.cloudops.com @CloudOps_
Re: new installation--ssvm won't start
I noticed that in Home Infrastructure Zones Zone1, Resources tab, the Secondary Storage says Allocated 0.00 KB / 0.00 KB. However, the secondary storage NFS mount is listed in Home Infrastructure Secondary Storage and the URL is correct. Does this mean the secondary storage is unreachable? On Wed, May 7, 2014 at 10:26 AM, Ian Young iyo...@ratespecial.com wrote: I reinstalled my single server CloudStack system yesterday, following the quick start guide precisely. The only difference was that I used /var/primary and /var/secondary instead of /primary and /secondary, because the /var partition on this machine is very large. The UI installer reached the point where it says Creating system VMs (this may take a while) but never finished. I left it overnight and it still hadn't completed. This is typically the step that fails, most of the times I've installed CloudStack, so I imagine I must be making the same fundamental mistake each time, and I'd like to know what that is. I checked management.log and it's in a loop where it creates a secondary storage VM, fails to start it, destroys it, and tries again. It says Host 1 is unreachable but I'm using the correct password, SELinux is permissive, and all the iptables rules are in place. In what way is it trying to connect to Host 1? SSH? NFS? Here's a log excerpt of messages related to the SSVM: http://pastebin.com/X11A51bh NFS appears to be functional, since CloudStack automatically mounted the primary storage. FilesystemSize Used Avail Use% Mounted on /dev/sda3 20G 1.8G 17G 10% / tmpfs 32G 0 32G 0% /dev/shm /dev/sda1 194M 42M 143M 23% /boot /dev/sda4 1.8T 1.9G 1.7T 1% /var 192.168.100.6:/var/primary 1.8T 1.9G 1.7T 1% /mnt/0594caa2-ceb4-36c6-9b13-0ff149a130af How can I identify whatever it is that's preventing the SSVM from starting? Here is another log excerpt, without any filtering: http://pastebin.com/XsPGJQik
Re: new installation--ssvm won't start
I know this has something to do with idmapd and NFS. This error keeps appearing in /var/log/messages: May 8 10:29:54 virthost1 rpc.idmapd[11044]: nss_getpwnam: name '0' does not map into domain 'redacted.com' On Thu, May 8, 2014 at 5:20 PM, Ian Young iyo...@ratespecial.com wrote: I wiped the server clean and started over again today. In the process, I realized that, the previous time, I forgot to uncomment the Domain line in /etc/idmapd.conf. However, even though I included the step this time, the GUI installer still seems to hang on the final Creating system VMs step. I see two VMs running when I run virsh list (the secondary storage VM keeps getting regenerated). In the primary storage, it looks like there is one complete 693 MB image but the other two are only 11 and 12 MB, although they are gradually growing. What's happening here? [root@virthost1 ~]# ls -hl /var/primary/ total 715M -rwxr--r--. 1 nobody nobody 11M May 8 09:55 54de167f-ad9c-453b-91c7-fdd644922932 -rwxr--r--. 1 nobody nobody 12M May 8 09:55 91069b66-b1b3-41aa-8995-874fd4353473 -rwxr--r--. 1 nobody nobody 693M May 8 09:16 c2e6efba-d6c7-11e3-9e76-002590c96d30 The management server log keeps reporting that There is no secondary storage VM for secondary storage host nfs://192.168.100.6/var/secondary. Here is a larger section of logs: http://pastebin.com/NFf5cBx3 On Wed, May 7, 2014 at 10:49 AM, Ian Young iyo...@ratespecial.com wrote: I noticed that in Home Infrastructure Zones Zone1, Resources tab, the Secondary Storage says Allocated 0.00 KB / 0.00 KB. However, the secondary storage NFS mount is listed in Home Infrastructure Secondary Storage and the URL is correct. Does this mean the secondary storage is unreachable? On Wed, May 7, 2014 at 10:26 AM, Ian Young iyo...@ratespecial.comwrote: I reinstalled my single server CloudStack system yesterday, following the quick start guide precisely. 
The only difference was that I used /var/primary and /var/secondary instead of /primary and /secondary, because the /var partition on this machine is very large. The UI installer reached the point where it says Creating system VMs (this may take a while) but never finished. I left it overnight and it still hadn't completed. This is typically the step that fails, most of the times I've installed CloudStack, so I imagine I must be making the same fundamental mistake each time, and I'd like to know what that is. I checked management.log and it's in a loop where it creates a secondary storage VM, fails to start it, destroys it, and tries again. It says Host 1 is unreachable but I'm using the correct password, SELinux is permissive, and all the iptables rules are in place. In what way is it trying to connect to Host 1? SSH? NFS? Here's a log excerpt of messages related to the SSVM: http://pastebin.com/X11A51bh NFS appears to be functional, since CloudStack automatically mounted the primary storage. FilesystemSize Used Avail Use% Mounted on /dev/sda3 20G 1.8G 17G 10% / tmpfs 32G 0 32G 0% /dev/shm /dev/sda1 194M 42M 143M 23% /boot /dev/sda4 1.8T 1.9G 1.7T 1% /var 192.168.100.6:/var/primary 1.8T 1.9G 1.7T 1% /mnt/0594caa2-ceb4-36c6-9b13-0ff149a130af How can I identify whatever it is that's preventing the SSVM from starting? Here is another log excerpt, without any filtering: http://pastebin.com/XsPGJQik
replacement for realhostip
I decided to create my own internal realhostip.com. My DNS servers use PowerDNS, not BIND, so the $GENERATE directive was not an option and I didn't want to have to populate my DNS servers' databases with a record for every possible IP address. Fortunately, I found the following Lua script: https://github.com/terbolous/powerdns-cloudstack-proxy-dns I can confirm the Lua script works as expected and my CloudStack server can be tricked into believing my internal DNS servers are the authority for realhostip.com: [root@virthost1 ]# dig +short 1-2-3-4.realhostip.com 1.2.3.4 I followed this guide and updated the console proxy/SSVM SSL certificate with my own *.realhostip.com certificate. http://docs.cloudstack.apache.org/projects/cloudstack-administration/en/latest/systemvm.html#changing-the-console-proxy-ssl-certificate-and-domain The console proxy restarted but it's still blank when I try to view the console. Does the domain have to be something other than realhostip.com?
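For reference, the naming convention behind the dig query above (dashes in the first DNS label standing in for the dots of an IPv4 address) can be sketched in shell. This only illustrates the mapping and is not the Lua script itself; the function name realhostip_to_ip is made up for this example.

```shell
# Hypothetical helper (not part of CloudStack or the Lua script):
# print the IPv4 address encoded in a realhostip-style host name.
realhostip_to_ip() {
    # Keep only the first DNS label (e.g. 1-2-3-4), then turn dashes into dots.
    printf '%s\n' "$1" | cut -d. -f1 | tr '-' '.'
}
```

Calling `realhostip_to_ip 1-2-3-4.realhostip.com` prints 1.2.3.4, which matches the answer dig returned.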
Re: replacement for realhostip
Looks like it's still using HTTP, not HTTPS: 2014-05-14 17:52:35,812 DEBUG [c.c.a.t.Request] (catalina-exec-20:null) Seq 1-800529939: Sending { Cmd , MgmtId: 161342909744, via: 1( virthost1.lax.ratespecial.com), Ver: v1, Flags: 100011, [{com.cloud.agent.api.GetVncPortCommand:{id:6,name:i-5-6-VM,wait:0}}] } 2014-05-14 17:52:35,861 DEBUG [c.c.a.t.Request] (AgentManager-Handler-1:null) Seq 1-800529939: Processing: { Ans: , MgmtId: 161342909744, via: 1, Ver: v1, Flags: 10, [{com.cloud.agent.api.GetVncPortAnswer:{address:192.168.100.6,port:5903,result:true,wait:0}}] } 2014-05-14 17:52:35,861 DEBUG [c.c.a.t.Request] (catalina-exec-20:null) Seq 1-800529939: Received: { Ans: , MgmtId: 161342909744, via: 1, Ver: v1, Flags: 10, { GetVncPortAnswer } } 2014-05-14 17:52:35,861 DEBUG [c.c.s.ConsoleProxyServlet] (catalina-exec-20:null) Port info 192.168.100.6 2014-05-14 17:52:35,861 INFO [c.c.s.ConsoleProxyServlet] (catalina-exec-20:null) Parse host info returned from executing GetVNCPortCommand. host info: 192.168.100.6 2014-05-14 17:52:35,865 DEBUG [c.c.s.ConsoleProxyServlet] (catalina-exec-20:null) Compose console url: http://192.168.100.159/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn_GSECIK5nC2lBX8cMHvt1_GrmwDVK1PEEAwyueLlgNRgodobz8Lsyv2jEc-mUvMH340AYGt0FyZOuXIA6dunN3yx-bP-vp4rao5Up61eJwOvqFr3PhggNpbq5Up59ObOdYMe2GsBP_3FrL8ZQfBhNBSmViHQ0fKJSyUHDoC9tKlfs2Bb0rPOBxsZeTPfe-hDuaVT-pZxjQXCKM93sujnWw 2014-05-14 17:52:35,865 DEBUG [c.c.s.ConsoleProxyServlet] (catalina-exec-20:null) the console url is :: htmltitlephonesynergy/titleframesetframe src= http://192.168.100.159/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn_GSECIK5nC2lBX8cMHvt1_GrmwDVK1PEEAwyueLlgNRgodobz8Lsyv2jEc-mUvMH340AYGt0FyZOuXIA6dunN3yx-bP-vp4rao5Up61eJwOvqFr3PhggNpbq5Up59ObOdYMe2GsBP_3FrL8ZQfBhNBSmViHQ0fKJSyUHDoC9tKlfs2Bb0rPOBxsZeTPfe-hDuaVT-pZxjQXCKM93sujnWw /frame/frameset/html On Wed, May 14, 2014 at 5:41 PM, Ian Young iyo...@ratespecial.com wrote: I 
decided to create my own internal realhostip.com. My DNS servers use PowerDNS, not BIND, so the $GENERATE directive was not an option and I didn't want to have to populate my DNS servers' databases with a record for every possible IP address. Fortunately, I found the following Lua script: https://github.com/terbolous/powerdns-cloudstack-proxy-dns I can confirm the Lua script works as expected and my CloudStack server can be tricked into believing my internal DNS servers are the authority for realhostip.com: [root@virthost1 ]# dig +short 1-2-3-4.realhostip.com 1.2.3.4 I followed this guide and updated the console proxy/SSVM SSL certificate with my own *.realhostip.com certificate. http://docs.cloudstack.apache.org/projects/cloudstack-administration/en/latest/systemvm.html#changing-the-console-proxy-ssl-certificate-and-domain The console proxy restarted but it's still blank when I try to view the console. Does the domain have to be something other than realhostip.com?
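To confirm which scheme the management server is composing without reading each log entry by eye, the scheme can be pulled out of the "Compose console url:" lines. This is a minimal sketch that assumes the log keeps the wording seen in the excerpt above; the function name is invented for illustration.

```shell
# Hypothetical helper: read management-server log lines on stdin and print
# the URL scheme (http or https) of every "Compose console url:" entry.
console_url_scheme() {
    sed -n 's/.*Compose console url: \([a-z][a-z]*\):\/\/.*/\1/p'
}
```

Feed it whichever log file contains these entries, for example `console_url_scheme < your-log-file | sort | uniq -c`, to see how many console URLs were built with each scheme.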
Re: new installation--ssvm won't start
I wiped the server clean and started over again today. In the process, I realized that, the previous time, I forgot to uncomment the Domain line in /etc/idmapd.conf. However, even though I included the step this time, the GUI installer still seems to hang on the final Creating system VMs step. I see two VMs running when I run virsh list (the secondary storage VM keeps getting regenerated). In the primary storage, it looks like there is one complete 693 MB image but the other two are only 11 and 12 MB, although they are gradually growing. What's happening here? [root@virthost1 ~]# ls -hl /var/primary/ total 715M -rwxr--r--. 1 nobody nobody 11M May 8 09:55 54de167f-ad9c-453b-91c7-fdd644922932 -rwxr--r--. 1 nobody nobody 12M May 8 09:55 91069b66-b1b3-41aa-8995-874fd4353473 -rwxr--r--. 1 nobody nobody 693M May 8 09:16 c2e6efba-d6c7-11e3-9e76-002590c96d30 The management server log keeps reporting that There is no secondary storage VM for secondary storage host nfs://192.168.100.6/var/secondary. Here is a larger section of logs: http://pastebin.com/NFf5cBx3 On Wed, May 7, 2014 at 10:49 AM, Ian Young iyo...@ratespecial.com wrote: I noticed that in Home Infrastructure Zones Zone1, Resources tab, the Secondary Storage says Allocated 0.00 KB / 0.00 KB. However, the secondary storage NFS mount is listed in Home Infrastructure Secondary Storage and the URL is correct. Does this mean the secondary storage is unreachable? On Wed, May 7, 2014 at 10:26 AM, Ian Young iyo...@ratespecial.com wrote: I reinstalled my single server CloudStack system yesterday, following the quick start guide precisely. The only difference was that I used /var/primary and /var/secondary instead of /primary and /secondary, because the /var partition on this machine is very large. The UI installer reached the point where it says Creating system VMs (this may take a while) but never finished. I left it overnight and it still hadn't completed. 
This is typically the step that fails, most of the times I've installed CloudStack, so I imagine I must be making the same fundamental mistake each time, and I'd like to know what that is. I checked management.log and it's in a loop where it creates a secondary storage VM, fails to start it, destroys it, and tries again. It says Host 1 is unreachable but I'm using the correct password, SELinux is permissive, and all the iptables rules are in place. In what way is it trying to connect to Host 1? SSH? NFS? Here's a log excerpt of messages related to the SSVM: http://pastebin.com/X11A51bh NFS appears to be functional, since CloudStack automatically mounted the primary storage. FilesystemSize Used Avail Use% Mounted on /dev/sda3 20G 1.8G 17G 10% / tmpfs 32G 0 32G 0% /dev/shm /dev/sda1 194M 42M 143M 23% /boot /dev/sda4 1.8T 1.9G 1.7T 1% /var 192.168.100.6:/var/primary 1.8T 1.9G 1.7T 1% /mnt/0594caa2-ceb4-36c6-9b13-0ff149a130af How can I identify whatever it is that's preventing the SSVM from starting? Here is another log excerpt, without any filtering: http://pastebin.com/XsPGJQik
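Since the Domain line in /etc/idmapd.conf was easy to miss, a small check like the following can show whether an uncommented Domain setting is actually present. It is a sketch that assumes the usual "Domain = value" layout of idmapd.conf; the function name is invented for illustration.

```shell
# Hypothetical helper: print the value of the active (uncommented)
# "Domain = ..." line in an idmapd.conf-style file. Prints nothing if
# the line is missing or still commented out.
idmapd_domain() {
    sed -n 's/^[[:space:]]*Domain[[:space:]]*=[[:space:]]*//p' "$1"
}
```

Running `idmapd_domain /etc/idmapd.conf` on both the NFS server and the client shows whether the two sides agree on the domain; empty output means the line is still commented out.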
Re: new installation--ssvm won't start
I'm using 4.3. The Quick Installation Guide for CentOS (which is what I was following) still has the old URL. I forgot to mention that changing the URL was another thing I did differently in order to get it working. http://docs.cloudstack.apache.org/projects/cloudstack-installation/en/latest/qig.html On Tue, May 13, 2014 at 2:01 AM, sebgoa run...@gmail.com wrote: On May 13, 2014, at 10:02 AM, Geoff Higginbottom geoff.higginbot...@shapeblue.com wrote: Just for the record, the latest install doc does have the correct URLs for the System VM Templates http://docs.cloudstack.apache.org/projects/cloudstack-installation/en/master/installation.html Yep, I just checked the master and the 4.3 version and the URLs seem correct: http://docs.cloudstack.apache.org/projects/cloudstack-installation/en/master/installation.html#prepare-the-system-vm-template If they're not, let me know or submit a patch. Regards Geoff Higginbottom D: +44 20 3603 0542 | S: +44 20 3603 0540 | M: +447968161581 geoff.higginbot...@shapeblue.com -Original Message- From: dimas yoga pratama [mailto:smid...@gmail.com] Sent: 12 May 2014 17:44 To: users@cloudstack.apache.org Subject: Re: new installation--ssvm won't start Which version of CloudStack did you install? If you follow the CloudStack 4.3 installation guide, there is a mistake in the system template setup section; you should replace the old URL with: http://download.cloud.com/templates/4.3/systemvm64template-2014-01-14-master-kvm.qcow2.bz2 Hope it works. On Fri, May 9, 2014 at 7:20 AM, Ian Young iyo...@ratespecial.com wrote: I wiped the server clean and started over again today. In the process, I realized that, the previous time, I forgot to uncomment the Domain line in /etc/idmapd.conf. However, even though I included the step this time, the GUI installer still seems to hang on the final Creating system VMs step. I see two VMs running when I run virsh list (the secondary storage VM keeps getting regenerated). 
In the primary storage, it looks like there is one complete 693 MB image but the other two are only 11 and 12 MB, although they are gradually growing. What's happening here? [root@virthost1 ~]# ls -hl /var/primary/ total 715M -rwxr--r--. 1 nobody nobody 11M May 8 09:55 54de167f-ad9c-453b-91c7-fdd644922932 -rwxr--r--. 1 nobody nobody 12M May 8 09:55 91069b66-b1b3-41aa-8995-874fd4353473 -rwxr--r--. 1 nobody nobody 693M May 8 09:16 c2e6efba-d6c7-11e3-9e76-002590c96d30 The management server log keeps reporting that There is no secondary storage VM for secondary storage host nfs://192.168.100.6/var/secondary . Here is a larger section of logs: http://pastebin.com/NFf5cBx3 On Wed, May 7, 2014 at 10:49 AM, Ian Young iyo...@ratespecial.com wrote: I noticed that in Home Infrastructure Zones Zone1, Resources tab, the Secondary Storage says Allocated 0.00 KB / 0.00 KB. However, the secondary storage NFS mount is listed in Home Infrastructure Secondary Storage and the URL is correct. Does this mean the secondary storage is unreachable? On Wed, May 7, 2014 at 10:26 AM, Ian Young iyo...@ratespecial.com wrote: I reinstalled my single server CloudStack system yesterday, following the quick start guide precisely. The only difference was that I used /var/primary and /var/secondary instead of /primary and /secondary, because the /var partition on this machine is very large. The UI installer reached the point where it says Creating system VMs (this may take a while) but never finished. I left it overnight and it still hadn't completed. This is typically the step that fails, most of the times I've installed CloudStack, so I imagine I must be making the same fundamental mistake each time, and I'd like to know what that is. I checked management.log and it's in a loop where it creates a secondary storage VM, fails to start it, destroys it, and tries again. It says Host 1 is unreachable but I'm using the correct password, SELinux is permissive, and all the iptables rules are in place. 
In what way is it trying to connect to Host 1? SSH? NFS? Here's a log excerpt of messages related to the SSVM: http://pastebin.com/X11A51bh NFS appears to be functional, since CloudStack automatically mounted the primary storage. FilesystemSize Used Avail Use% Mounted on /dev/sda3 20G 1.8G 17G 10% / tmpfs 32G 0 32G 0% /dev/shm /dev/sda1 194M 42M 143M 23% /boot /dev/sda4 1.8T 1.9G 1.7T 1% /var 192.168.100.6:/var/primary 1.8T 1.9G 1.7T 1% /mnt/0594caa2-ceb4-36c6-9b13-0ff149a130af How can I identify whatever it is that's preventing the SSVM from starting? Here is another log excerpt, without any filtering: http://pastebin.com/XsPGJQik
Re: new installation--ssvm won't start
Exactly. That's the URL I was referring to. I changed it to the 2014-01-14 template and it worked. On Tue, May 13, 2014 at 9:52 AM, dimas yoga pratama smid...@gmail.com wrote: Oh okay, I should have read that part as well. What I mean is this: http://docs.cloudstack.apache.org/projects/cloudstack-installation/en/master/qig.html#system-template-setup On Tue, May 13, 2014 at 4:01 PM, sebgoa run...@gmail.com wrote: On May 13, 2014, at 10:02 AM, Geoff Higginbottom geoff.higginbot...@shapeblue.com wrote: Just for the record, the latest install doc does have the correct URLs for the System VM Templates http://docs.cloudstack.apache.org/projects/cloudstack-installation/en/master/installation.html Yep, I just checked the master and the 4.3 version and the URLs seem correct: http://docs.cloudstack.apache.org/projects/cloudstack-installation/en/master/installation.html#prepare-the-system-vm-template If they're not, let me know or submit a patch. Regards Geoff Higginbottom D: +44 20 3603 0542 | S: +44 20 3603 0540 | M: +447968161581 geoff.higginbot...@shapeblue.com -Original Message- From: dimas yoga pratama [mailto:smid...@gmail.com] Sent: 12 May 2014 17:44 To: users@cloudstack.apache.org Subject: Re: new installation--ssvm won't start Which version of CloudStack did you install? If you follow the CloudStack 4.3 installation guide, there is a mistake in the system template setup section; you should replace the old URL with: http://download.cloud.com/templates/4.3/systemvm64template-2014-01-14-master-kvm.qcow2.bz2 Hope it works. On Fri, May 9, 2014 at 7:20 AM, Ian Young iyo...@ratespecial.com wrote: I wiped the server clean and started over again today. In the process, I realized that, the previous time, I forgot to uncomment the Domain line in /etc/idmapd.conf. However, even though I included the step this time, the GUI installer still seems to hang on the final Creating system VMs step. I see two VMs running when I run virsh list (the secondary storage VM keeps getting regenerated). 
In the primary storage, it looks like there is one complete 693 MB image but the other two are only 11 and 12 MB, although they are gradually growing. What's happening here? [root@virthost1 ~]# ls -hl /var/primary/ total 715M -rwxr--r--. 1 nobody nobody 11M May 8 09:55 54de167f-ad9c-453b-91c7-fdd644922932 -rwxr--r--. 1 nobody nobody 12M May 8 09:55 91069b66-b1b3-41aa-8995-874fd4353473 -rwxr--r--. 1 nobody nobody 693M May 8 09:16 c2e6efba-d6c7-11e3-9e76-002590c96d30 The management server log keeps reporting that There is no secondary storage VM for secondary storage host nfs:// 192.168.100.6/var/secondary . Here is a larger section of logs: http://pastebin.com/NFf5cBx3 On Wed, May 7, 2014 at 10:49 AM, Ian Young iyo...@ratespecial.com wrote: I noticed that in Home Infrastructure Zones Zone1, Resources tab, the Secondary Storage says Allocated 0.00 KB / 0.00 KB. However, the secondary storage NFS mount is listed in Home Infrastructure Secondary Storage and the URL is correct. Does this mean the secondary storage is unreachable? On Wed, May 7, 2014 at 10:26 AM, Ian Young iyo...@ratespecial.com wrote: I reinstalled my single server CloudStack system yesterday, following the quick start guide precisely. The only difference was that I used /var/primary and /var/secondary instead of /primary and /secondary, because the /var partition on this machine is very large. The UI installer reached the point where it says Creating system VMs (this may take a while) but never finished. I left it overnight and it still hadn't completed. This is typically the step that fails, most of the times I've installed CloudStack, so I imagine I must be making the same fundamental mistake each time, and I'd like to know what that is. I checked management.log and it's in a loop where it creates a secondary storage VM, fails to start it, destroys it, and tries again. It says Host 1 is unreachable but I'm using the correct password, SELinux is permissive, and all the iptables rules are in place. 
In what way is it trying to connect to Host 1? SSH? NFS? Here's a log excerpt of messages related to the SSVM: http://pastebin.com/X11A51bh NFS appears to be functional, since CloudStack automatically mounted the primary storage. FilesystemSize Used Avail Use% Mounted on /dev/sda3 20G 1.8G 17G 10% / tmpfs 32G 0 32G 0% /dev/shm /dev/sda1 194M 42M 143M 23% /boot /dev/sda4 1.8T 1.9G 1.7T 1% /var 192.168.100.6:/var/primary 1.8T 1.9G 1.7T 1% /mnt/0594caa2-ceb4-36c6-9b13
Re: new installation--ssvm won't start
I was able to complete the installation on Friday. Two things I did differently that were not mentioned in the quick start guide were to disable requiretty in /etc/sudoers and to set up NFSv4 correctly (i.e. set up a global root directory with fsid=0). I'm not sure how much impact the sudoers configuration had on my problem but I'm pretty sure the NFS setup was the main issue. On Thu, May 8, 2014 at 5:33 PM, Ian Young iyo...@ratespecial.com wrote: I know this has something to do with idmapd and NFS. This error keeps appearing in /var/log/messages: May 8 10:29:54 virthost1 rpc.idmapd[11044]: nss_getpwnam: name '0' does not map into domain 'redacted.com' On Thu, May 8, 2014 at 5:20 PM, Ian Young iyo...@ratespecial.com wrote: I wiped the server clean and started over again today. In the process, I realized that, the previous time, I forgot to uncomment the Domain line in /etc/idmapd.conf. However, even though I included the step this time, the GUI installer still seems to hang on the final Creating system VMs step. I see two VMs running when I run virsh list (the secondary storage VM keeps getting regenerated). In the primary storage, it looks like there is one complete 693 MB image but the other two are only 11 and 12 MB, although they are gradually growing. What's happening here? [root@virthost1 ~]# ls -hl /var/primary/ total 715M -rwxr--r--. 1 nobody nobody 11M May 8 09:55 54de167f-ad9c-453b-91c7-fdd644922932 -rwxr--r--. 1 nobody nobody 12M May 8 09:55 91069b66-b1b3-41aa-8995-874fd4353473 -rwxr--r--. 1 nobody nobody 693M May 8 09:16 c2e6efba-d6c7-11e3-9e76-002590c96d30 The management server log keeps reporting that There is no secondary storage VM for secondary storage host nfs://192.168.100.6/var/secondary. Here is a larger section of logs: http://pastebin.com/NFf5cBx3 On Wed, May 7, 2014 at 10:49 AM, Ian Young iyo...@ratespecial.comwrote: I noticed that in Home Infrastructure Zones Zone1, Resources tab, the Secondary Storage says Allocated 0.00 KB / 0.00 KB. 
However, the secondary storage NFS mount is listed in Home Infrastructure Secondary Storage and the URL is correct. Does this mean the secondary storage is unreachable? On Wed, May 7, 2014 at 10:26 AM, Ian Young iyo...@ratespecial.comwrote: I reinstalled my single server CloudStack system yesterday, following the quick start guide precisely. The only difference was that I used /var/primary and /var/secondary instead of /primary and /secondary, because the /var partition on this machine is very large. The UI installer reached the point where it says Creating system VMs (this may take a while) but never finished. I left it overnight and it still hadn't completed. This is typically the step that fails, most of the times I've installed CloudStack, so I imagine I must be making the same fundamental mistake each time, and I'd like to know what that is. I checked management.log and it's in a loop where it creates a secondary storage VM, fails to start it, destroys it, and tries again. It says Host 1 is unreachable but I'm using the correct password, SELinux is permissive, and all the iptables rules are in place. In what way is it trying to connect to Host 1? SSH? NFS? Here's a log excerpt of messages related to the SSVM: http://pastebin.com/X11A51bh NFS appears to be functional, since CloudStack automatically mounted the primary storage. FilesystemSize Used Avail Use% Mounted on /dev/sda3 20G 1.8G 17G 10% / tmpfs 32G 0 32G 0% /dev/shm /dev/sda1 194M 42M 143M 23% /boot /dev/sda4 1.8T 1.9G 1.7T 1% /var 192.168.100.6:/var/primary 1.8T 1.9G 1.7T 1% /mnt/0594caa2-ceb4-36c6-9b13-0ff149a130af How can I identify whatever it is that's preventing the SSVM from starting? Here is another log excerpt, without any filtering: http://pastebin.com/XsPGJQik
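For anyone wanting to check the requiretty point mentioned at the top of this message before editing anything, here is a rough sketch. It only looks for an uncommented global Defaults line that enables requiretty and does not evaluate per-user overrides or included files; the function name is made up for this example.

```shell
# Hypothetical helper: succeed (exit 0) if the given sudoers-style file has
# an uncommented "Defaults ... requiretty" line that is not negated with "!".
# Rough check only: per-user overrides and #include files are not evaluated.
requiretty_enabled() {
    grep -E '^[[:space:]]*Defaults' "$1" \
        | grep -v '!requiretty' \
        | grep -q 'requiretty'
}
```

As root, `requiretty_enabled /etc/sudoers && echo "requiretty is on"` gives a quick answer; any actual change is best made through visudo.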
basic networking, single server
I'm reinstalling CloudStack on a single server with lots of RAM, CPU cores, and storage. I also have a single 192.168.100.0/24 private network, which was set up before I was hired and can't be easily reconfigured due to the high number of employee workstations currently connected to it and occupying IP addresses across this range. I see that the CloudStack documentation strongly recommends separate NICs for management traffic and guest traffic. This server does have two NICs, so what would be the ideal way to configure the network? Another switch with a different subnet for the management network? What about the storage network?
Re: basic networking, single server
I forgot to mention this system is entirely for internal purposes. We don't need a public network. On Mon, May 5, 2014 at 3:39 PM, Ian Young iyo...@ratespecial.com wrote: I'm reinstalling CloudStack on a single server with lots of RAM, CPU cores, and storage. I also have a single 192.168.100.0/24 private network, which was set up before I was hired and can't be easily reconfigured due to the high number of employee workstations currently connected to it and occupying IP addresses across this range. I see that the CloudStack documentation strongly recommends separate NICs for management traffic and guest traffic. This server does have two NICs, so what would be the ideal way to configure the network? Another switch with a different subnet for the management network? What about the storage network?
management server IP address
My management server IP address has always been 192.168.100.6. Ever since I upgraded to 4.3, it's been set to 127.0.0.1 (mshost.service_ip in the database). Where is this value being set and how can I change it back to the original IP address?
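One place worth checking, offered as an assumption rather than a confirmed cause: the management server's db.properties file carries a cluster.node.IP entry, and a packaged upgrade may have reset it to 127.0.0.1. The sketch below only reads a key from a properties-style file; the function name is made up, and the path in the usage note is the usual 4.x location, which may differ on your system.

```shell
# Hypothetical helper: print the value of KEY from a Java-style properties
# file (lines of the form KEY=value). Comment lines never match because the
# text before "=" must equal KEY exactly; the last matching line wins.
prop_value() {
    # $1 = properties file, $2 = key
    awk -F= -v k="$2" '
        $1 == k { v = substr($0, length(k) + 2) }
        END     { if (v != "") print v }
    ' "$1"
}
```

For example, `prop_value /etc/cloudstack/management/db.properties cluster.node.IP` would show the address the management server is configured to advertise, if that key is present in your install.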
Re: failed to start virtual router
I think my problem stems from a partially downloaded system VM template. I just noticed systemvm-kvm-4.3 is stuck at 27% downloaded. It must have been interrupted during the upgrade to 4.3. At the moment I've rolled back to 4.2.1 with a somewhat usable management interface, although the system VMs won't start. I suspect there is something in the database that is causing it to try to use the 4.3 template. How can I delete the template and make sure the management server is using the older one?

On Tue, Apr 29, 2014 at 8:23 PM, Ian Young iyo...@ratespecial.com wrote:
Ok, so I've figured out a way to identify volumes in the filesystem. For instance, /var/storage/primary/4d324e1a-e3a6-4da8-9c4d-44ad723482ad is the root volume for an instance I want to back up. Is this in qcow2 format or something else? I'm using KVM.

On Tue, Apr 29, 2014 at 7:38 PM, Ian Young iyo...@ratespecial.com wrote:
Now I can't start cloudstack-agent. The agent.log says:

Unable to start agent: Failed to get private nic name

I know this is because the network bridge is no longer set up correctly. I used to have a cloud0 and a cloudbr0 interface. Now I only have cloudbr0. I haven't changed my network configuration. Somehow it's been changed by CloudStack during the upgrade/downgrade. This is getting worse and worse the more I try to recover my data. Is there any way to back up the instances' volumes via the command line? I can't tell which is which because the filenames are all hashes. I really need to get these instances up and running--there are several months worth of work at stake here.

On Tue, Apr 29, 2014 at 6:13 PM, ma y breeze7...@gmail.com wrote:
I got the same problem, and how to downgrade CS 4.3.0 to 4.2.1 safely?

2014-04-30 8:45 GMT+08:00 Ian Young iyo...@ratespecial.com:
Ok, my Cloudstack installation is now so broken that I think it's probably best to backup all my instances and templates, wipe the databases, and start from scratch. However, I can't take snapshots or download volumes anymore. What's causing these errors?

2014-04-29 17:40:51,264 DEBUG [o.a.c.s.m.AncientDataMotionStrategy] (Job-Executor-11:ctx-0a3ead79 ctx-315eda05) copy object failed: com.cloud.utils.exception.CloudRuntimeException: Failed to send command, due to Agent:1, com.cloud.exception.OperationTimedoutException: Commands 841744457 to Host 1 timed out after 21600
2014-04-29 17:40:51,265 DEBUG [o.a.c.s.m.AncientDataMotionStrategy] (Job-Executor-11:ctx-0a3ead79 ctx-315eda05) copy failed com.cloud.utils.exception.CloudRuntimeException: com.cloud.utils.exception.CloudRuntimeException: Failed to send command, due to Agent:1, com.cloud.exception.OperationTimedoutException: Commands 841744457 to Host 1 timed out after 21600
2014-04-29 17:40:51,269 WARN [o.a.c.s.d.ObjectInDataStoreManagerImpl] (Job-Executor-11:ctx-0a3ead79 ctx-315eda05) Unsupported data object (VOLUME, org.apache.cloudstack.storage.datastore.PrimaryDataStoreImpl@7bbbd901), no need to delete from object in store ref table
2014-04-29 17:40:51,280 ERROR [c.c.a.ApiAsyncJobDispatcher] (Job-Executor-11:ctx-0a3ead79) Unexpected exception while executing org.apache.cloudstack.api.command.user.volume.ExtractVolumeCmd com.cloud.utils.exception.CloudRuntimeException: Failed to copy the volume from the source primary storage pool to secondary storage.
2014-04-29 17:40:51,282 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (Job-Executor-11:ctx-0a3ead79) Complete async job-501, jobStatus: FAILED, resultCode: 530, result: org.apache.cloudstack.api.response.ExceptionResponse/null/{uuidList:[],errorcode:530,errortext:Failed to copy the volume from the source primary storage pool to secondary storage.}

On Tue, Apr 29, 2014 at 4:15 PM, Ian Young iyo...@ratespecial.com wrote:
I downgraded to 4.2.1 again but cloudstack-management won't start because the database is version 4.3. Is it safe to restore the database backup I made prior to this whole process? In the meantime I have destroyed and created system VMs, so I'm not sure it's a good idea.

On Apr 29, 2014 3:09 PM, Ian Young iyo...@ratespecial.com wrote:
@stevenliang: I take it back--you can't set the VM size when you register the template.

On Tue, Apr 29, 2014 at 3:02 PM, motty cruz motty.c...@gmail.com wrote:
yes, you would have to shutdown the router, then click on Change Service Offering restart the VR. To Ian, I suspect you forgot the last step: cloudstack-setup-management that would fix your issue, I think, Thanks,

---
I downgraded to 4.2.1 and then upgraded to 4.3. Now the cloudstack-management service can't start because it can't connect to the database.

2014-04-29 14:51:36,424 ERROR [c.c.u.d.Merovingian2] (main:null) Unable to get a new db connection Caused by: java.sql.SQLException: Access denied for user 'cloud
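On the qcow2 question quoted above: a quick way to tell is to look at the first bytes of the volume file, since QCOW-family images start with the magic bytes "QFI" followed by 0xfb and then a big-endian version number. The sketch below is only an illustration; the function name is made up for this example, and it assumes the volume file is readable from the host (for instance under the /var/storage/primary path mentioned in the thread).

```python
import struct

# First four bytes of every QCOW/QCOW2 image: "QFI" followed by 0xfb.
QCOW_MAGIC = b"QFI\xfb"


def qcow_version(path):
    """Return the QCOW version stored in the file header, or None if not QCOW.

    Hypothetical helper written for this example; not part of CloudStack.
    """
    with open(path, "rb") as image:
        header = image.read(8)
    if len(header) < 8 or header[:4] != QCOW_MAGIC:
        return None
    # Bytes 4-7 hold the format version as a big-endian unsigned 32-bit integer.
    return struct.unpack(">I", header[4:8])[0]
```

A result of 2 or 3 would indicate a qcow2 image, while None suggests a raw file or some other format. Running `qemu-img info` on the same path should report the format as well, if that tool is installed on the hypervisor.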
Re: failed to start virtual router
Yes, I restored the DB from the backup. When I try to start the router it says:

Resource [Host:1] is unreachable: Host 1: Unable to start instance due to Unable to start VM[DomainRouter|r-63-VM] due to error in finalizeStart, not retrying

The management server log says:

2014-04-30 12:20:52,485 ERROR [cloud.async.AsyncJobManagerImpl] (Job-Executor-7:job-520 = [ 9d7c898c-b5d0-4bd0-a711-563a91d7acc9 ]) Unexpected exception while executing org.apache.cloudstack.api.command.admin.router.StartRouterCmd com.cloud.exception.AgentUnavailableException: Resource [Host:1] is unreachable: Host 1: Unable to start instance due to Unable to start VM[DomainRouter|r-63-VM] due to error in finalizeStart, not retrying

On Wed, Apr 30, 2014 at 12:02 PM, stevenliang stevenli...@yesup.com wrote:
I think you had backed up database, when you upgraded. When you downgraded CS, you also need to restore DB.
Re: failed to start virtual router
Yes, I replaced the new files with the rpmsave ones, which allowed the agent to start. However, most of the functions in the management console fail.

On Wed, Apr 30, 2014 at 12:34 PM, stevenliang stevenli...@yesup.com wrote:
Do you have the file db.properties.rpmsave on management server and agent.properties.rpmsave on agents? If so, and the date is correct, you can use it rather than db.properties and agent.properties. And then restart management and agent services.
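Before swapping an .rpmsave file back into place, it can help to see which settings actually differ between it and the file the package installed. The sketch below compares two properties files key by key. It is a simplified illustration: the function names are invented for this example, and the parser only handles plain key=value lines, not the full Java properties syntax (no line continuations or ':' separators).

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and '#' or '!' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!" or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def changed_keys(current_text, saved_text):
    """Map each differing key to (value in current file, value in saved file)."""
    current = parse_properties(current_text)
    saved = parse_properties(saved_text)
    return {
        key: (current.get(key), saved.get(key))
        for key in sorted(set(current) | set(saved))
        if current.get(key) != saved.get(key)
    }
```

One would read db.properties and db.properties.rpmsave (typically under /etc/cloudstack/management, though that location is an assumption here) and pass both texts to `changed_keys`. A key that exists in only one file shows up with None on the other side.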
Re: failed to start virtual router
The address in Infrastructure Hosts (management server) is set to the correct IP address, not 127.0.0.1. Why are the logs referring to 127.0.0.1?
Re: failed to start virtual router
I notice my dashboard says Management server node 127.0.0.1 is up. It used to have an actual address, not localhost. Could this be causing problems and if so, how can I set it back?
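One possible explanation for the 127.0.0.1 entry, offered as an assumption rather than something confirmed in this thread: the management server registers itself under the address in the cluster.node.IP key of db.properties, and a freshly packaged db.properties may carry a loopback value there. The sketch below checks that key; the function names are invented for this example.

```python
import ipaddress


def cluster_node_ip(properties_text):
    """Return the cluster.node.IP value from db.properties text, or None."""
    for raw in properties_text.splitlines():
        line = raw.strip()
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip() == "cluster.node.IP":
            return value.strip()
    return None


def advertises_loopback(properties_text):
    """True when the configured node address is a loopback IP such as 127.0.0.1."""
    value = cluster_node_ip(properties_text)
    if value is None:
        return False
    try:
        return ipaddress.ip_address(value).is_loopback
    except ValueError:
        # Not a literal IP address (for example a hostname); cannot tell from here.
        return False
```

If this returns True for the active db.properties, setting cluster.node.IP to the server's real address and restarting cloudstack-management would be the thing to try, again assuming that key is what drives the dashboard message.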
Re: failed to start virtual router
I've tried upgrading to 4.3 again. After poking around some more in the database, I've discovered that the KVM system VM template was only 27% downloaded. I think this is why the virtual router was unable to start--the template was incomplete. Is there a way to force it to resume downloading?
Re: failed to start virtual router
I read this article about upgrading the system VMs: https://cwiki.apache.org/confluence/display/CLOUDSTACK/CloudStack+4.2+(KVM)+System+Vm+Upgrade However, there's just an empty set in the template_host_ref table. Is this no longer used in 4.3?
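On the empty template_host_ref table: as far as I know, newer schemas track template download state per image store in a table named template_store_ref, so that may be the place to look instead. The sketch below shows the shape of such a query. The table and column names are assumptions that should be checked against the real `cloud` database first, and the in-memory SQLite tables and rows are hypothetical stand-ins so the query can be run anywhere; only the 27% figure and the template name come from the thread.

```python
import sqlite3

# Query meant for the MySQL `cloud` database. Column names are assumptions
# and should be verified (for example with DESCRIBE) before relying on them.
TEMPLATE_STATUS_SQL = """
SELECT t.id, t.name, r.download_state, r.download_pct
FROM vm_template t
JOIN template_store_ref r ON r.template_id = t.id
WHERE t.name LIKE 'systemvm-kvm-%'
ORDER BY t.id
"""


def demo_rows():
    """Run the query against a hypothetical, cut-down copy of the two tables."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE vm_template (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE template_store_ref (
            template_id INTEGER, download_state TEXT, download_pct INTEGER);
        -- Made-up sample rows for illustration only.
        INSERT INTO vm_template VALUES (3, 'SystemVM Template (KVM)');
        INSERT INTO vm_template VALUES (9, 'systemvm-kvm-4.3');
        INSERT INTO template_store_ref VALUES (3, 'DOWNLOADED', 100);
        INSERT INTO template_store_ref VALUES (9, 'DOWNLOAD_IN_PROGRESS', 27);
    """)
    rows = db.execute(TEMPLATE_STATUS_SQL).fetchall()
    db.close()
    return rows
```

Against a real installation, only the SELECT statement would be used, from a MySQL client connected to the `cloud` database. A row stuck below 100 percent would match the symptom described in this thread.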
Re: failed to start virtual router
I destroyed the old virtual router and was able to create a new one by adding a new instance. However, this new router also failed to start, citing the same error. After that, the expungement delay elapsed and the virtual router was expunged, so now I have none.

On Mon, Apr 28, 2014 at 8:52 PM, Ian Young iyo...@ratespecial.com wrote:
I upgraded from 4.2.1 to 4.3.0 tonight, following the instructions here: http://docs.cloudstack.apache.org/projects/cloudstack-release-notes/en/latest/rnotes.html#upgrade-from-4-2-x-to-4-3

At the last step, I tried to restart the system VMs. The virtual router failed to start. Here is the message that was displayed in the web UI:

Resource [Host:1] is unreachable: Host 1: Unable to start instance due to Unable to start VM[DomainRouter|r-4-VM] due to error in finalizeStart, not retrying

I tried running the script to restart the VMs but this time it failed to start the console proxy:

[root@virthost1 ~]$ cloudstack-sysvmadm -d 192.168.100.6 -u cloud -p -a
Stopping and starting 1 secondary storage vm(s)...
Done stopping and starting secondary storage vm(s)
Stopping and starting 1 console proxy vm(s)...
ERROR: Failed to start console proxy vm with id 2
Done stopping and starting console proxy vm(s) .
Stopping and starting 0 running routing vm(s)...

Is there a way to wipe the system VMs out and start over?
Re: failed to start virtual router
Did rolling back to 4.2 fix the problem?

On Tue, Apr 29, 2014 at 1:22 PM, stevenliang stevenli...@yesup.com wrote:
I met your situation before. Finally I rolled back to 4.2
Re: failed to start virtual router
I think you can do that when you register the new templates in step 1 of this guide: http://docs.cloudstack.apache.org/projects/cloudstack-release-notes/en/latest/rnotes.html#upgrade-from-4-2-x-to-4-3 On Tue, Apr 29, 2014 at 2:53 PM, motty cruz motty.c...@gmail.com wrote: For my VR, I created a new System Offering For Software Router: CPU (in MHz) 1.00GHz, Memory (in MB) 1.00GB. These are my current offerings; I'm sure more RAM and CPU means better performance. Thanks, On Tue, Apr 29, 2014 at 2:44 PM, stevenliang stevenli...@yesup.com wrote: Thank you again, motty. I didn't notice this earlier. BTW, how did you give your VR 1GB CPU and 512MB RAM? On 29/04/14 05:33 PM, motty cruz wrote: Stevenliang, I'm not sure if you saw this in the forums earlier: http://mail-archives.apache.org/mod_mbox/cloudstack-users/201404.mbox/%3CCALoOYy6A10bz1zOQQs1VyFb9epqLfhf7mu6hc=c2rfedroy...@mail.gmail.com%3E I don't know if the bug has been fixed yet. I will try the upgrade in the next couple of days on a testing cluster and will report back on whether the bug was fixed. Thanks, On Tue, Apr 29, 2014 at 2:25 PM, stevenliang stevenli...@yesup.com wrote: Thank you, motty. I am also running KVM. Since my failed upgrade, I am still using 4.2.1. I'll try your advice. On 29/04/14 05:19 PM, motty cruz wrote: Stevenliang, I had a similar issue with the VR. I noticed it was because I had left the default system specs on the VR (by default 500MHz on CPU and 128MB on RAM). If you upgrade to at least 1GB on CPU and 512MB of RAM, your VR will survive the upgrade from 4.2.1 to 4.3.1. I am running KVM; when I upgraded from 4.2.1 to 4.3, my VMs were not able to access the outside world, even after I created a new router. Wish you the best, -motty On Tue, Apr 29, 2014 at 2:13 PM, stevenliang stevenli...@yesup.com wrote: Yes, I had two zones (one basic, the other advanced mode). After I upgraded from 4.2.1 to 4.3, the vrouter was lost. So I rolled back to 4.2.1, and the vrouter came back.
Re: failed to start virtual router
I downgraded to 4.2.1 and then upgraded to 4.3. Now the cloudstack-management service can't start because it can't connect to the database. 2014-04-29 14:51:36,424 ERROR [c.c.u.d.Merovingian2] (main:null) Unable to get a new db connection Caused by: java.sql.SQLException: Access denied for user 'cloud'@'localhost' (using password: YES) Where are the credentials stored?
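On the question of where the credentials are stored: on a packaged install the management server normally reads its database settings from /etc/cloudstack/management/db.properties, with keys such as db.cloud.username and db.cloud.password. Below is a minimal sketch, assuming that default path and plain key=value lines, for pulling out the settings of the 'cloud' connection so they can be compared with what MySQL accepts. The function names are my own and not part of CloudStack.

```python
# Sketch only: the path and the helper names are assumptions, not CloudStack APIs.
DB_PROPERTIES = "/etc/cloudstack/management/db.properties"  # assumed default location


def parse_properties(text):
    """Return a dict of key/value pairs from Java-properties-style text."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def cloud_db_settings(props):
    """Pick out the settings used for the 'cloud' database connection."""
    wanted = ("db.cloud.host", "db.cloud.port",
              "db.cloud.username", "db.cloud.password")
    return {key: props[key] for key in wanted if key in props}


def load_cloud_db_settings(path=DB_PROPERTIES):
    """Read the properties file and return only the 'cloud' DB settings."""
    with open(path) as handle:
        return cloud_db_settings(parse_properties(handle.read()))
```

If the password value is wrapped in ENC(...), it is stored encrypted, so it will not match the literal MySQL password when compared by eye.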
Re: failed to start virtual router
@stevenliang: I take it back--you can't set the VM size when you register the template. On Tue, Apr 29, 2014 at 3:02 PM, motty cruz motty.c...@gmail.com wrote: yes, you would have to shutdown the router, then click on Change Service Offering restart the VR. To Ian, I suspect you forgot the last step: cloudstack-setup-management that would fix your issue, I think, Thanks, On Tue, Apr 29, 2014 at 2:57 PM, stevenliang stevenli...@yesup.com wrote: oh, then change service offering for vr?
Re: failed to start virtual router
I downgraded to 4.2.1 again but cloudstack-management won't start because the database is version 4.3. Is it safe to restore the database backup I made prior to this whole process? In the meantime I have destroyed and created system VMs, so I'm not sure it's a good idea.
Re: failed to start virtual router
Ok, my CloudStack installation is now so broken that I think it's probably best to back up all my instances and templates, wipe the databases, and start from scratch. However, I can't take snapshots or download volumes anymore. What's causing these errors?
2014-04-29 17:40:51,264 DEBUG [o.a.c.s.m.AncientDataMotionStrategy] (Job-Executor-11:ctx-0a3ead79 ctx-315eda05) copy object failed: com.cloud.utils.exception.CloudRuntimeException: Failed to send command, due to Agent:1, com.cloud.exception.OperationTimedoutException: Commands 841744457 to Host 1 timed out after 21600
2014-04-29 17:40:51,265 DEBUG [o.a.c.s.m.AncientDataMotionStrategy] (Job-Executor-11:ctx-0a3ead79 ctx-315eda05) copy failed com.cloud.utils.exception.CloudRuntimeException: com.cloud.utils.exception.CloudRuntimeException: Failed to send command, due to Agent:1, com.cloud.exception.OperationTimedoutException: Commands 841744457 to Host 1 timed out after 21600
2014-04-29 17:40:51,269 WARN [o.a.c.s.d.ObjectInDataStoreManagerImpl] (Job-Executor-11:ctx-0a3ead79 ctx-315eda05) Unsupported data object (VOLUME, org.apache.cloudstack.storage.datastore.PrimaryDataStoreImpl@7bbbd901), no need to delete from object in store ref table
2014-04-29 17:40:51,280 ERROR [c.c.a.ApiAsyncJobDispatcher] (Job-Executor-11:ctx-0a3ead79) Unexpected exception while executing org.apache.cloudstack.api.command.user.volume.ExtractVolumeCmd com.cloud.utils.exception.CloudRuntimeException: Failed to copy the volume from the source primary storage pool to secondary storage.
2014-04-29 17:40:51,282 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (Job-Executor-11:ctx-0a3ead79) Complete async job-501, jobStatus: FAILED, resultCode: 530, result: org.apache.cloudstack.api.response.ExceptionResponse/null/{uuidList:[],errorcode:530,errortext:Failed to copy the volume from the source primary storage pool to secondary storage.}
Re: failed to start virtual router
Now I can't start cloudstack-agent. The agent.log says: "Unable to start agent: Failed to get private nic name". I know this is because the network bridge is no longer set up correctly. I used to have a cloud0 and a cloudbr0 interface. Now I only have cloudbr0. I haven't changed my network configuration, so CloudStack must have changed it during the upgrade/downgrade. This is getting worse and worse the more I try to recover my data. Is there any way to back up the instances' volumes via the command line? I can't tell which is which because the filenames are all hashes. I really need to get these instances up and running--there are several months' worth of work at stake here. On Tue, Apr 29, 2014 at 6:13 PM, ma y breeze7...@gmail.com wrote: I got the same problem, and how to downgrade CS 4.3.0 to 4.2.1 safely?
Re: failed to start virtual router
Ok, so I've figured out a way to identify volumes in the filesystem. For instance, /var/storage/primary/4d324e1a-e3a6-4da8-9c4d-44ad723482ad is the root volume for an instance I want to back up. Is this in qcow2 format or something else? I'm using KVM.
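On the qcow2 question: the usual way to check is to run qemu-img info on the file. Where that tool is not at hand, the header can be inspected directly, since qcow images start with the four magic bytes "QFI" followed by 0xfb and then a big-endian version number. A minimal sketch, with a helper name of my own choosing, that reports what a volume file looks like:

```python
import struct

QCOW_MAGIC = b"QFI\xfb"  # first four bytes of a qcow/qcow2 image


def detect_image_format(path):
    """Guess the image format of a volume file from its first 8 bytes."""
    with open(path, "rb") as handle:
        header = handle.read(8)
    if len(header) < 8 or header[:4] != QCOW_MAGIC:
        # No qcow magic: most likely a raw image or some other format
        return "not qcow (possibly raw)"
    # Bytes 4-7 hold the format version as a big-endian unsigned integer
    (version,) = struct.unpack(">I", header[4:8])
    # Versions 2 and 3 are both qcow2; version 1 is the older qcow format
    return "qcow2" if version in (2, 3) else f"qcow version {version}"
```

Calling detect_image_format on the path quoted above would show whether the root volume carries a qcow2 header. This only reads the file, so it is safe to run against a volume that is in use, although copying a volume for backup is best done while the instance is stopped.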
failed to start virtual router
I upgraded from 4.2.1 to 4.3.0 tonight, following the instructions here: http://docs.cloudstack.apache.org/projects/cloudstack-release-notes/en/latest/rnotes.html#upgrade-from-4-2-x-to-4-3 At the last step, I tried to restart the system VMs. The virtual router failed to start. Here is the message that was displayed in the web UI: Resource [Host:1] is unreachable: Host 1: Unable to start instance due to Unable to start VM[DomainRouter|r-4-VM] due to error in finalizeStart, not retrying I tried running the script to restart the VMs but this time it failed to start the console proxy: [root@virthost1 ~]$ cloudstack-sysvmadm -d 192.168.100.6 -u cloud -p -a Stopping and starting 1 secondary storage vm(s)... Done stopping and starting secondary storage vm(s) Stopping and starting 1 console proxy vm(s)... ERROR: Failed to start console proxy vm with id 2 Done stopping and starting console proxy vm(s) . Stopping and starting 0 running routing vm(s)... Is there a way to wipe the system VMs out and start over?
Management node is detected inactive by timestamp but is pingable
Yesterday I tried to start an existing instance but it failed. Since it was basically a brand new installation, I just decided to destroy it and start over. However, it stayed in an expunging state and remains so today. I cannot create new instances now. The management-server.log shows numerous messages like this:
2014-02-20 10:04:04,442 DEBUG [cloud.cluster.ClusterManagerImpl] (Cluster-Heartbeat-1:null) Detected management node left, id:11, nodeIP:192.168.100.6
2014-02-20 10:04:04,442 INFO [cloud.cluster.ClusterManagerImpl] (Cluster-Heartbeat-1:null) Trying to connect to 192.168.100.6
2014-02-20 10:04:04,442 INFO [cloud.cluster.ClusterManagerImpl] (Cluster-Heartbeat-1:null) Management node 11 is detected inactive by timestamp but is pingable
The management server and the hypervisor host are the same machine (budgetary constraints necessitated this setup) so, obviously, it should be able to connect to itself. What is this timestamp it's referring to? Is it simply a matter of updating this so the management server is no longer considered inactive?
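On the timestamp question: judging by the Cluster-Heartbeat thread name in the log, the message most likely comes from a heartbeat check, where each management node periodically records a last-update time and any node whose time is too old is treated as inactive even if its IP still answers. The sketch below only illustrates that kind of staleness check. It is not CloudStack's code, and the function name, the 60-second threshold, and the sample data are all made up for illustration.

```python
from datetime import datetime, timedelta


def stale_nodes(last_update_by_node, now, threshold):
    """Return the ids of nodes whose last heartbeat is older than threshold."""
    return sorted(
        node_id
        for node_id, last_update in last_update_by_node.items()
        if now - last_update > threshold
    )


# Made-up sample data: node 11 stopped updating, node 12 is current.
now = datetime(2014, 2, 20, 10, 4, 4)
heartbeats = {
    11: datetime(2014, 2, 19, 16, 30, 0),
    12: datetime(2014, 2, 20, 10, 4, 2),
}
print(stale_nodes(heartbeats, now, timedelta(seconds=60)))  # prints [11]
```

Under that reading, a node id left over from an earlier run of the same server would keep an old timestamp and be reported as inactive, while the ping succeeds because the IP belongs to the machine that is still running.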
Re: Management node is detected inactive by timestamp but is pingable
I noticed that there are 12 records in the cloud.mshost table, all of which have an Up state. I only have one management server. Should I delete the other 11 records?
On Thu, Feb 20, 2014 at 10:20 AM, Ian Young iyo...@ratespecial.com wrote: Yesterday I tried to start an existing instance but it failed. [...]
Re: Management node is detected inactive by timestamp but is pingable
Restarting cloudstack-agent and cloudstack-management made the inactive management node notice go away, but the instance is still stuck in an expunging state. How can I get rid of it?
On Thu, Feb 20, 2014 at 10:55 AM, Ian Young iyo...@ratespecial.com wrote: I noticed that there are 12 records in the cloud.mshost table, all of which have an Up state. [...]
Re: Change of guest IP address
On 19-Dec-2013, at 3:58 PM, Andrei Mikhailovsky andrei@... wrote:
Do you know if there is an easier way? Like via the API calls or the cloudmonkey command? Or is it currently the only way?
- Original Message -
From: Jayapal Reddy Uradi jayapalreddy.uradi@...
To: users@... users@...
Sent: Thursday, 19 December, 2013 9:25:05 AM
Subject: Re: Change of guest IP address
Hi,
If your VM is in an isolated network, please do the following:
1. Edit the ip4_address column of the nics table for your instance_id, setting it to the new IP.
2. Log in to the router that corresponds to the network and replace the old IP with the new IP in the files below.
   a. /var/lib/misc/dnsmasq.leases
   b. /etc/dhcphosts.txt
3. Restart dnsmasq on the router (service dnsmasq restart).
4. Reboot the VM or restart the network service in the VM so that the VM gets the new IP from DHCP.
Thanks, Jayapal

I put Jayapal's solution into a script for convenience: http://pastebin.com/7yJtjNQX
Just edit the first group of variables according to your needs and run it like this:
set-vm-ip.sh old-address new-address
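For anyone who would rather see the idea than fetch the pastebin, steps 1 and 2 of Jayapal's procedure might look roughly like this in Python. This is a sketch under assumptions, not the contents of set-vm-ip.sh: the function names replace_ip and nics_update_statement are hypothetical, the sample dhcphosts line is invented, and the %s placeholders assume a MySQL-style driver. It only builds the new file text and the SQL statement; copying the files to and from the router, running the query, and restarting dnsmasq (steps 3 and 4) are left out.

```python
import ipaddress
import re


def replace_ip(text, old_ip, new_ip):
    """Return text with old_ip replaced by new_ip, whole addresses only."""
    # Raises ValueError if either argument is not a valid IP address.
    ipaddress.ip_address(old_ip)
    ipaddress.ip_address(new_ip)
    # The lookarounds stop 10.1.1.5 from matching inside 10.1.1.50.
    pattern = re.compile(r"(?<![\d.])" + re.escape(old_ip) + r"(?![\d.])")
    return pattern.sub(lambda _match: new_ip, text)


def nics_update_statement(instance_id, new_ip):
    """Build a parameterized UPDATE for step 1 (nics.ip4_address)."""
    ipaddress.ip_address(new_ip)
    # Caution: this touches every NIC row of the instance. A VM with
    # several NICs would need a narrower WHERE clause.
    sql = "UPDATE nics SET ip4_address = %s WHERE instance_id = %s"
    return sql, (new_ip, int(instance_id))


if __name__ == "__main__":
    # Invented sample in a MAC,IP,hostname,lease style for illustration.
    dhcphosts = "02:00:00:00:00:01,10.1.1.5,vm-a,infinite\n"
    print(replace_ip(dhcphosts, "10.1.1.5", "10.1.1.77"), end="")
    print(nics_update_statement(42, "10.1.1.77"))
```

The same replace_ip call would be applied to the contents of both /var/lib/misc/dnsmasq.leases and /etc/dhcphosts.txt on the router. Matching whole addresses matters here because a plain string replace of 10.1.1.5 would also rewrite the lease for 10.1.1.50.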