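Javier,

Can you enable DEBUG mode in the KVM node agent's log4j configuration (/etc/cloudstack/agent/log4j.xml; going by the agent.properties path in your agent.log, the agent config on your 4.0.2 install may live under /etc/cloud/agent/ instead) and let us know what happens when you add the host?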
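In case it helps, here is a minimal sketch of one way to flip the log level. It assumes the log4j XML sets levels with entries of the form value="INFO"; the enable_debug helper name and the throwaway sample file are made up for illustration, so point the function at your real config file instead. Restart the agent afterwards so the change takes effect.

```shell
#!/bin/sh
# Hypothetical helper: switch INFO to DEBUG in a log4j XML config.
# A backup copy is kept next to the file as <file>.bak.
enable_debug() {
    conf="$1"
    cp "$conf" "$conf.bak"
    # Read from the backup and rewrite the original (avoids relying on sed -i).
    sed 's/value="INFO"/value="DEBUG"/g' "$conf.bak" > "$conf"
}

# Demo on a throwaway sample; the XML below is an assumed layout, not your real file.
sample=$(mktemp /tmp/log4j-sample.XXXXXX)
printf '<category name="com.cloud">\n  <priority value="INFO"/>\n</category>\n' > "$sample"
enable_debug "$sample"
grep 'priority' "$sample"
```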
Which virt-related packages are installed on your KVM agent?

$ rpm -qa | grep qemu
$ rpm -qa | grep virt
$ rpm -qa | grep cloud

Thanks,

--
Prasanna

On Mon, May 20, 2013 at 11:25:00AM +0200, Javier Rodriguez wrote:
> Hi all,
>
> I'm testing CloudStack on two of our servers and I'm having trouble
> starting the agent on the hypervisor node.
>
> Here's a summary of my configuration:
>
> 0 - Network infrastructure:
>
> subnet: 172.16.0.0/16
> gateway: 172.16.0.1
> DNS1: 192.168.1.204
>
>
> 1 - Cloud management server:
> HW: Dell PowerEdge R320
> OS: CentOS 6.4
> IP: 172.16.2.2/16
> hostname: morpheus (morpheus.biometrics.local)
>
> 2 - Hypervisor node:
> HW: Dell PowerEdge R320
> OS: CentOS 6.4
> IP: 172.16.2.3/16
> Hypervisor type: KVM
> hostname: mnode-1 (mnode-1.biometrics.local)
>
>
> Everything worked seamlessly on the cloud manager's side, and I
> didn't see any errors while following the steps for the KVM
> hypervisor node described in the install manual. In the manager
> console, I was able to add the zone, pod and cluster, but when I
> tried to create the host, it spent several minutes "creating" and
> then I got a popup alert with the error: "Unable to add the host".
>
> management-server.log in morpheus :
> =======================
>
> 2013-05-20 11:52:05,403 DEBUG [cloud.api.ApiServlet]
> (catalina-exec-25:null) ===START=== 192.168.1.187 -- GET
> command=addHost&zoneid=ac6c892c-8a9d-494b-8d3c-c612263059a0&podid=a4dad17d-f8dc-4734-a6b2-46954144b954&clusterid=b808e253-3aa2-4372-8520-84d8cda88c27&hypervisor=KVM&clustertype=CloudManaged&hosttags=&username=root&url=http%3A%2F%2Fmnode-1&response=json&sessionkey=%2BcU40ygVQMKpfKAI9il7SspVnBk%3D&_=1369039922245
> 2013-05-20 11:52:05,413 INFO [cloud.resource.ResourceManagerImpl]
> (catalina-exec-25:null) Trying to add a new host at http://mnode-1
> in data center 2
> 2013-05-20 11:52:05,665 DEBUG [utils.ssh.SSHCmdHelper]
> (catalina-exec-25:null) Executing cmd: lsmod|grep kvm
> 2013-05-20 11:52:06,797 DEBUG [utils.ssh.SSHCmdHelper]
> (catalina-exec-25:null) lsmod|grep kvm output:kvm_intel 53484 0
> kvm 316602 1 kvm_intel
>
> 2013-05-20 11:52:07,812 DEBUG [utils.ssh.SSHCmdHelper]
> (catalina-exec-25:null) Executing cmd: cloud-setup-agent -m
> 172.16.2.2 -z 2 -p 2 -c 2 -g 8224e6a1-e640-3bd1-b5d0-dd43c74b8b08 -a
> --pubNic=cloudbr0 --prvNic=cloudbr0 --guestNic=cloudbr0
> 2013-05-20 11:52:31,388 DEBUG
> [cloud.consoleproxy.ConsoleProxyManagerImpl] (consoleproxy-1:null)
> Skip capacity scan due to there is no Primary Storage UPintenance
> mode
> 2013-05-20 11:52:31,805 DEBUG
> [storage.snapshot.SnapshotSchedulerImpl] (SnapshotPollTask:null)
> Snapshot scheduler.poll is being called at 2013-05-20 09:52:31 GMT
> 2013-05-20 11:52:31,806 DEBUG
> [storage.snapshot.SnapshotSchedulerImpl] (SnapshotPollTask:null) Got
> 0 snapshots to be executed at 2013-05-20 09:52:31 GMT
> 2013-05-20 11:52:31,830 DEBUG
> [cloud.network.ExternalLoadBalancerUsageManagerImpl]
> (ExternalNetworkMonitor-1:null) External load balancer devices stats
> collector is running...
> 2013-05-20 11:52:31,868 DEBUG
> [network.router.VirtualNetworkApplianceManagerImpl]
> (RouterMonitor-1:null) Found 0 running routers.
> 2013-05-20 11:52:31,870 DEBUG
> [network.router.VirtualNetworkApplianceManagerImpl]
> (RouterStatusMonitor-1:null) Found 0 routers.
> 2013-05-20 11:52:41,440 DEBUG [utils.ssh.SSHCmdHelper]
> (catalina-exec-25:null) cloud-setup-agent -m 172.16.2.2 -z 2 -p 2
> -c 2 -g 8224e6a1-e640-3bd1-b5d0-dd43c74b8b08 -a --pubNic=cloudbr0
> --prvNic=cloudbr0 --guestNic=cloudbr0 output:CloudStack Agent setup
> is done!
> Configure Cgroup ...
> 2013-05-20 11:52:46,100 DEBUG [cloud.server.StatsCollector]
> (StatsCollector-1:null) HostStatsCollector is running...
> 2013-05-20 11:52:46,100 DEBUG [cloud.server.StatsCollector]
> (StatsCollector-2:null) VmStatsCollector is running...
> 2013-05-20 11:52:46,114 DEBUG [cloud.server.StatsCollector]
> (StatsCollector-3:null) StorageCollector is running...
> 2013-05-20 11:53:01,388 DEBUG
> [cloud.consoleproxy.ConsoleProxyManagerImpl] (consoleproxy-1:null)
> Skip capacity scan due to there is no Primary Storage UPintenance
> mode
> 2013-05-20 11:53:01,793 DEBUG [cloud.alert.AlertManagerImpl]
> (CapacityChecker:null) Running Capacity Checker ...
> 2013-05-20 11:53:01,793 DEBUG [cloud.alert.AlertManagerImpl]
> (CapacityChecker:null) recalculating system capacity
> 2013-05-20 11:53:01,793 DEBUG [cloud.alert.AlertManagerImpl]
> (CapacityChecker:null) Executing cpu/ram capacity update
> 2013-05-20 11:53:01,794 DEBUG [cloud.alert.AlertManagerImpl]
> (CapacityChecker:null) Done executing cpu/ram capacity update
> 2013-05-20 11:53:01,794 DEBUG [cloud.alert.AlertManagerImpl]
> (CapacityChecker:null) Executing storage capacity update
> 2013-05-20 11:53:01,795 DEBUG [cloud.alert.AlertManagerImpl]
> (CapacityChecker:null) Done executing storage capacity update
> 2013-05-20 11:53:01,795 DEBUG [cloud.alert.AlertManagerImpl]
> (CapacityChecker:null) Executing capacity updates for public ip and
> Vlans
> 2013-05-20 11:53:01,803 DEBUG [cloud.alert.AlertManagerImpl]
> (CapacityChecker:null) Done capacity updates for public ip and Vlans
> 2013-05-20 11:53:01,803 DEBUG [cloud.alert.AlertManagerImpl]
> (CapacityChecker:null) Executing capacity updates for private ip
> 2013-05-20 11:53:01,807 DEBUG [cloud.alert.AlertManagerImpl]
> (CapacityChecker:null) Done executing capacity updates for private
> ip
> 2013-05-20 11:53:01,807 DEBUG [cloud.alert.AlertManagerImpl]
> (CapacityChecker:null) Done recalculating system capacity
> 2013-05-20 11:53:01,817 DEBUG [cloud.alert.AlertManagerImpl]
> (CapacityChecker:null) Done running Capacity Checker ...
> 2013-05-20 11:53:01,870 DEBUG
> [network.router.VirtualNetworkApplianceManagerImpl]
> (RouterStatusMonitor-1:null) Found 0 routers.
> 2013-05-20 11:53:31,388 DEBUG
> [cloud.consoleproxy.ConsoleProxyManagerImpl] (consoleproxy-1:null)
> Skip capacity scan due to there is no Primary Storage UPintenance
> mode
> 2013-05-20 11:53:31,870 DEBUG
> [network.router.VirtualNetworkApplianceManagerImpl]
> (RouterStatusMonitor-1:null) Found 0 routers.
> 2013-05-20 11:53:46,102 DEBUG [cloud.server.StatsCollector]
> (StatsCollector-1:null) HostStatsCollector is running...
> 2013-05-20 11:53:46,102 DEBUG [cloud.server.StatsCollector]
> (StatsCollector-2:null) VmStatsCollector is running...
> 2013-05-20 11:53:46,117 DEBUG [cloud.server.StatsCollector]
> (StatsCollector-3:null) StorageCollector is running...
> ( ... )
> 2013-05-20 11:57:31,806 DEBUG
> [storage.snapshot.SnapshotSchedulerImpl] (SnapshotPollTask:null)
> Snapshot scheduler.poll is being called at 2013-05-20 09:57:31 GMT
> 2013-05-20 11:57:31,807 DEBUG
> [storage.snapshot.SnapshotSchedulerImpl] (SnapshotPollTask:null) Got
> 0 snapshots to be executed at 2013-05-20 09:57:31 GMT
> 2013-05-20 11:57:31,830 DEBUG
> [cloud.network.ExternalLoadBalancerUsageManagerImpl]
> (ExternalNetworkMonitor-1:null) External load balancer devices stats
> collector is running...
> 2013-05-20 11:57:31,868 DEBUG
> [network.router.VirtualNetworkApplianceManagerImpl]
> (RouterMonitor-1:null) Found 0 running routers.
> 2013-05-20 11:57:31,870 DEBUG
> [network.router.VirtualNetworkApplianceManagerImpl]
> (RouterStatusMonitor-1:null) Found 0 routers.
> 2013-05-20 11:57:42,458 DEBUG [kvm.discoverer.KvmServerDiscoverer]
> (catalina-exec-25:null) Timeout, to wait for the host connecting to
> mgt svr, assuming it is failed
> 2013-05-20 11:57:42,461 WARN [cloud.resource.ResourceManagerImpl]
> (catalina-exec-25:null) Unable to find the server resources at
> http://mnode-1
> 2013-05-20 11:57:42,464 WARN [api.commands.AddHostCmd]
> (catalina-exec-25:null) Exception:
> com.cloud.exception.DiscoveryException: Unable to add the host
> at com.cloud.resource.ResourceManagerImpl.discoverHostsFull(ResourceManagerImpl.java:737)
> at com.cloud.resource.ResourceManagerImpl.discoverHosts(ResourceManagerImpl.java:544)
> at com.cloud.api.commands.AddHostCmd.execute(AddHostCmd.java:140)
> at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:138)
> at com.cloud.api.ApiServer.queueCommand(ApiServer.java:544)
> at com.cloud.api.ApiServer.handleRequest(ApiServer.java:423)
> at com.cloud.api.ApiServlet.processRequest(ApiServlet.java:312)
> at com.cloud.api.ApiServlet.doGet(ApiServlet.java:64)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
> at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
> at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
> at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
> at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:555)
> at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
> at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
> at org.apache.coyote.http11.Http11NioProcessor.process(Http11NioProcessor.java:889)
> at org.apache.coyote.http11.Http11NioProtocol$Http11ConnectionHandler.process(Http11NioProtocol.java:721)
> at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:2274)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:722)
> 2013-05-20 11:57:42,465 WARN [cloud.api.ApiDispatcher]
> (catalina-exec-25:null) class com.cloud.api.ServerApiException :
> Unable to add the host
> 2013-05-20 11:57:42,466 DEBUG [cloud.api.ApiServlet]
> (catalina-exec-25:null) ===END=== 192.168.1.187 -- GET
> command=addHost&zoneid=ac6c892c-8a9d-494b-8d3c-c612263059a0&podid=a4dad17d-f8dc-4734-a6b2-46954144b954&clusterid=b808e253-3aa2-4372-8520-84d8cda88c27&hypervisor=KVM&clustertype=CloudManaged&hosttags=&username=root&url=http%3A%2F%2Fmnode-1&response=json&sessionkey=%2BcU40ygVQMKpfKAI9il7SspVnBk%3D&_=1369039922245
>
> ===== END ====
>
> agent.log in mnode-1:
> ==============
> 2013-05-20 10:48:55,603 ERROR [cloud.agent.AgentShell] (main:null)
> Unable to start agent: Failed to get public nic name
> 2013-05-20 11:04:28,473 INFO [utils.component.ComponentLocator]
> (main:null) Unable to find components.xml
> 2013-05-20 11:04:28,474 INFO [utils.component.ComponentLocator]
> (main:null) Skipping configuration using components.xml
> 2013-05-20 11:04:28,474 INFO [cloud.agent.AgentShell] (main:null)
> Implementation Version is 4.0.2.20130420145617
> 2013-05-20 11:04:28,475 INFO [cloud.agent.AgentShell] (main:null)
> agent.properties found at /etc/cloud/agent/agent.properties
> 2013-05-20 11:04:28,476 INFO [cloud.agent.AgentShell] (main:null)
> Defaulting to using properties file for storage
> 2013-05-20 11:04:28,478 INFO [cloud.agent.AgentShell] (main:null)
> Defaulting to the constant time backoff algorithm
> 2013-05-20 11:04:28,534 INFO [cloud.agent.Agent] (main:null) id is
> 2013-05-20 11:04:28,544 ERROR [cloud.resource.ServerResourceBase]
> (main:null) Nics are not configured!
> 2013-05-20 11:04:28,550 INFO [cloud.resource.ServerResourceBase]
> (main:null) Designating private to be nic em1.100
> 2013-05-20 11:04:28,638 INFO
> [resource.virtualnetwork.VirtualRoutingResource] (main:null)
> VirtualRoutingResource _scriptDir to use: scripts/network/domr/kvm
> 2013-05-20 11:04:28,838 ERROR [cloud.agent.AgentShell] (main:null)
> Unable to start agent: Failed to get public nic name
>
> 2013-05-20 11:52:05,347 INFO [utils.component.ComponentLocator]
> (main:null) Unable to find components.xml
> 2013-05-20 11:52:05,348 INFO [utils.component.ComponentLocator]
> (main:null) Skipping configuration using components.xml
> 2013-05-20 11:52:05,348 INFO [cloud.agent.AgentShell] (main:null)
> Implementation Version is 4.0.2.20130420145617
> 2013-05-20 11:52:05,349 INFO [cloud.agent.AgentShell] (main:null)
> agent.properties found at /etc/cloud/agent/agent.properties
> 2013-05-20 11:52:05,350 INFO [cloud.agent.AgentShell] (main:null)
> Defaulting to using properties file for storage
> 2013-05-20 11:52:05,352 INFO [cloud.agent.AgentShell] (main:null)
> Defaulting to the constant time backoff algorithm
> 2013-05-20 11:52:05,409 INFO [cloud.agent.Agent] (main:null) id is
> 2013-05-20 11:52:05,418 ERROR [cloud.resource.ServerResourceBase]
> (main:null) Nics are not configured!
> 2013-05-20 11:52:05,424 INFO [cloud.resource.ServerResourceBase]
> (main:null) Designating private to be nic em1.100
> 2013-05-20 11:52:05,513 INFO
> [resource.virtualnetwork.VirtualRoutingResource] (main:null)
> VirtualRoutingResource _scriptDir to use: scripts/network/domr/kvm
> 2013-05-20 11:52:05,712 ERROR [cloud.agent.AgentShell] (main:null)
> Unable to start agent: Failed to get public nic name
>
> ===== END ==
>
> I have iptables disabled and SELinux set to permissive on both sides.
> I'm guessing there might be a problem with the bridge configuration
> in the kvm node (mnode-1). This is what I have at the moment:
>
> network scripts:
> ===========
>
> ::::::::::::::
> ifcfg-em1
> ::::::::::::::
> DEVICE=em1
> BOOTPROTO=none
> BROADCAST=172.16.255.255
> DNS1=192.168.1.204
> GATEWAY=172.16.0.1
> HWADDR=90:B1:1C:39:33:8A
> IPADDR=172.16.2.3
> NETMASK=255.255.0.0
> NM_CONTROLLED=no
> ONBOOT=yes
> HOTPLUG=no
> TYPE=Ethernet
>
> ::::::::::::::
> ifcfg-em1.100
> ::::::::::::::
> DEVICE=em1.100
> HWADDR=90:B1:1C:39:33:8A
> ONBOOT=yes
> HOTPLUG=no
> BOOTPROTO=none
> TYPE=Ethernet
> VLAN=yes
> IPADDR=172.16.5.1
> GATEWAY=172.16.0.1
> NETMASK=255.255.0.0
>
> ::::::::::::::
> ifcfg-em1.200
> ::::::::::::::
> DEVICE=em1.200
> HWADDR=90:B1:1C:39:33:8A
> ONBOOT=yes
> HOTPLUG=no
> BOOTPROTO=none
> TYPE=Ethernet
> VLAN=yes
> BRIDGE=cloudbr0
>
> ::::::::::::::
> ifcfg-em1.300
> ::::::::::::::
> DEVICE=em1.300
> HWADDR=90:B1:1C:39:33:8A
> ONBOOT=yes
> HOTPLUG=no
> BOOTPROTO=none
> TYPE=Ethernet
> VLAN=yes
> BRIDGE=cloudbr1
>
> ::::::::::::::
> ifcfg-cloudbr0
> ::::::::::::::
> DEVICE=cloudbr0
> TYPE=Bridge
> ONBOOT=yes
> BOOTPROTO=none
> IPV6INIT=no
> IPV6_AUTOCONF=no
> DELAY=5
> STP=yes
>
> ::::::::::::::
> ifcfg-cloudbr1
> ::::::::::::::
> DEVICE=cloudbr1
> TYPE=Bridge
> ONBOOT=yes
> BOOTPROTO=none
> IPV6INIT=no
> IPV6_AUTOCONF=no
> DELAY=5
> STP=yes
>
>
> ifconfig output:
> ===========
>
> cloudbr0  Link encap:Ethernet HWaddr 90:B1:1C:39:33:8A
>           inet6 addr: fe80::92b1:1cff:fe39:338a/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:14 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:0 (0.0 b) TX bytes:1188 (1.1 KiB)
>
> cloudbr1  Link encap:Ethernet HWaddr 90:B1:1C:39:33:8A
>           inet6 addr: fe80::92b1:1cff:fe39:338a/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:14 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:0 (0.0 b) TX bytes:1188 (1.1 KiB)
>
> em1       Link encap:Ethernet HWaddr 90:B1:1C:39:33:8A
>           inet addr:172.16.2.3 Bcast:172.16.255.255 Mask:255.255.0.0
>           inet6 addr: fe80::92b1:1cff:fe39:338a/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
>           RX packets:89868 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:6620 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:20755835 (19.7 MiB) TX bytes:479334 (468.0 KiB)
>           Interrupt:16
>
> em1.100   Link encap:Ethernet HWaddr 90:B1:1C:39:33:8A
>           inet addr:172.16.5.1 Bcast:172.16.255.255 Mask:255.255.0.0
>           inet6 addr: fe80::92b1:1cff:fe39:338a/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:20 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:0 (0.0 b) TX bytes:1440 (1.4 KiB)
>
> em1.200   Link encap:Ethernet HWaddr 90:B1:1C:39:33:8A
>           inet6 addr: fe80::92b1:1cff:fe39:338a/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:3019 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:0 (0.0 b) TX bytes:158774 (155.0 KiB)
>
> em1.300   Link encap:Ethernet HWaddr 90:B1:1C:39:33:8A
>           inet6 addr: fe80::92b1:1cff:fe39:338a/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:3020 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:0 (0.0 b) TX bytes:158844 (155.1 KiB)
>
> lo        Link encap:Local Loopback
>           inet addr:127.0.0.1 Mask:255.0.0.0
>           inet6 addr: ::1/128 Scope:Host
>           UP LOOPBACK RUNNING MTU:16436 Metric:1
>           RX packets:118 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:118 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:10544 (10.2 KiB) TX bytes:10544 (10.2 KiB)
>
> virbr0    Link encap:Ethernet HWaddr 52:54:00:37:05:54
>           inet addr:192.168.122.1 Bcast:192.168.122.255 Mask:255.255.255.0
>           UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
>
>
>
> The current status of cloud-agent is "cloud-agent dead but subsys
> locked". Attempting to restart the service results in the same
> error(s) being printed out to the agent.log.
>
> I ran the cloud-setup-agent command I found in
> management-server.log, and it seems to work fine, but I'm still not
> able to bring the cloud-agent service up:
>
> [root@mnode-1 network-scripts]# cloud-setup-agent -m 172.16.2.2 -z
> 2 -p 2 -c 2 -g 8224e6a1-e640-3bd1-b5d0-dd43c74b8b08 -a
> --pubNic=cloudbr0 --prvNic=cloudbr0 --guestNic=cloudbr0
> Starting to configure your system:
> Configure Cgroup ... [OK]
> Configure SElinux ... [OK]
> Configure Network ... [OK]
> Configure Libvirt ... [OK]
> Configure Firewall ... [OK]
> Configure Nfs ... [OK]
> Configure cloudAgent ... [OK]
> CloudStack Agent setup is done!
> [root@mnode-1 network-scripts]# service cloud-agent restart
> Stopping Cloud Agent:
> Starting Cloud Agent:
> [root@mnode-1 network-scripts]# service cloud-agent status
> cloud-agent dead but subsys locked
>
>
> What am I doing wrong? Thanks in advance for the help.
>
>
> --
> kind regards,
>
> Javier Rodríguez
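
Since the agent.log quoted above stops at "Failed to get public nic name", it may also be worth checking which devices the agent is configured to use and whether those bridges have any member interfaces. The following is only a sketch: it assumes agent.properties uses the public.network.device, private.network.device and guest.network.device keys, and the get_prop and bridge_ports helper names and the sample file are made up for illustration. On the host, point props at the real agent.properties instead of the sample.

```shell
#!/bin/sh
# Hypothetical helper: read one key from a Java-style properties file (last match wins).
get_prop() {
    sed -n "s/^$2=//p" "$1" | tail -n 1
}

# Hypothetical helper: list member interfaces of a bridge via sysfs.
# Prints nothing if the device does not exist or is not a bridge.
bridge_ports() {
    if [ -n "$1" ] && [ -d "/sys/class/net/$1/brif" ]; then
        ls "/sys/class/net/$1/brif"
    fi
}

# Demo against a throwaway sample file (assumed contents, not the real config).
props=$(mktemp /tmp/agent-props.XXXXXX)
printf 'public.network.device=cloudbr0\nprivate.network.device=cloudbr0\n' > "$props"

for key in public.network.device private.network.device guest.network.device; do
    dev=$(get_prop "$props" "$key")
    echo "$key=${dev:-<unset>} ports: $(bridge_ports "$dev" | tr '\n' ' ')"
done
```

An unset key or a bridge with no ports listed would be a reasonable place to start looking, though it does not by itself prove that is the cause.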