Kambiz, can you please try one more thing?

1) Locate all the firewall rules for your guest network (205, right?):

select id, ip_address_id from firewall_rules where network_id=205;

2) Now get all static NAT enabled IP addresses for those rules:

select vm_id, network_id from user_ip_address where id in (select ip_address_id from firewall_rules where network_id=205);

For each vm_id/network_id combination, check whether there is a
non-removed nic and a non-expunged VM. There might be an incorrect static
NAT IP/VM reference pointing to a VM that has already been removed. If you
find any, let me know and I will tell you how to clean it up.

-Alena.

On 3/22/14, 5:41 AM, "Kambiz Darabi" <dar...@m-creations.com> wrote:

>Hi Alena,
>
>thank you for your help.
>
>The query returns no rows, i.e. nics.removed was not null, but I removed
>the row anyway to see what happens: a new virtual router was created
>which also couldn't be started due to the same NPE. I reverted the
>change by restoring from the dump.
>
>I have to mention that prior to the restart, r-7-VM was the router which
>was used by my instances. I deleted the router using the UI after the
>first occurrence of the NPE, because a post with a similar problem
>suggested that the deleted router would be recreated again (and this
>procedure solved the problem).
>
>Below I have attached the state of the two tables.
>
>Anything else I can try?
>
>Thank you
>
>
>Kambiz
>
>mysql> select n.id, n.removed, n.ip4_address, n.netmask, n.gateway,
>n.ip_type, n.reserver_name, n.network_id, i.id as instance_id, i.name,
>i.state, i.type from vm_instance i join nics n on n.instance_id = i.id
>where i.type = 'DomainRouter';
>+----+---------------------+---------------+---------------+-------------+---------+--------------------------+------------+-------------+---------+-----------+--------------+
>| id | removed             | ip4_address   | netmask       | gateway     | ip_type | reserver_name            | network_id | instance_id | name    | state     | type         |
>+----+---------------------+---------------+---------------+-------------+---------+--------------------------+------------+-------------+---------+-----------+--------------+
>|  9 | 2014-03-17 11:27:58 | 10.124.99.1   | 255.255.255.0 | NULL        | NULL    | ExternalGuestNetworkGuru |        204 |           4 | r-4-VM  | Expunging | DomainRouter |
>| 10 | 2014-03-17 11:27:58 | NULL          | NULL          | NULL        | NULL    | ControlNetworkGuru       |        202 |           4 | r-4-VM  | Expunging | DomainRouter |
>| 11 | 2014-03-17 11:27:58 | 10.193.17.139 | 255.255.255.0 | 10.193.17.1 | NULL    | PublicNetworkGuru        |        200 |           4 | r-4-VM  | Expunging | DomainRouter |
>| 14 | 2014-03-17 11:27:52 | 10.124.99.1   | 255.255.255.0 | NULL        | NULL    | ExternalGuestNetworkGuru |        205 |           7 | r-7-VM  | Expunging | DomainRouter |
>| 15 | 2014-03-17 11:27:52 | NULL          | NULL          | NULL        | NULL    | ControlNetworkGuru       |        202 |           7 | r-7-VM  | Expunging | DomainRouter |
>| 16 | 2014-03-17 11:27:52 | 10.193.17.190 | 255.255.255.0 | 10.193.17.1 | NULL    | PublicNetworkGuru        |        200 |           7 | r-7-VM  | Expunging | DomainRouter |
>| 26 | 2014-03-18 08:11:16 | 10.124.99.1   | 255.255.255.0 | NULL        | NULL    | ExternalGuestNetworkGuru |        205 |          18 | r-18-VM | Expunging | DomainRouter |
>| 27 | 2014-03-18 08:11:16 | NULL          | NULL          | NULL        | NULL    | ControlNetworkGuru       |        202 |          18 | r-18-VM | Expunging | DomainRouter |
>| 28 | 2014-03-18 08:11:16 | 10.193.17.190 | 255.255.255.0 | 10.193.17.1 | NULL    | PublicNetworkGuru        |        200 |          18 | r-18-VM | Expunging | DomainRouter |
>| 29 | NULL                | 10.124.99.1   | 255.255.255.0 | NULL        | NULL    | ExternalGuestNetworkGuru |        205 |          19 | r-19-VM | Stopped   | DomainRouter |
>| 30 | NULL                | NULL          | NULL          | NULL        | NULL    | ControlNetworkGuru       |        202 |          19 | r-19-VM | Stopped   | DomainRouter |
>| 31 | NULL                | 10.193.17.190 | 255.255.255.0 | 10.193.17.1 | NULL    | PublicNetworkGuru        |        200 |          19 | r-19-VM | Stopped   | DomainRouter |
>+----+---------------------+---------------+---------------+-------------+---------+--------------------------+------------+-------------+---------+-----------+--------------+
>
>mysql> select * from router_network_ref;
>+----+-----------+------------+------------+
>| id | router_id | network_id | guest_type |
>+----+-----------+------------+------------+
>|  1 |         4 |        204 | Isolated   |
>|  2 |         7 |        205 | Isolated   |
>|  3 |        18 |        205 | Isolated   |
>|  4 |        19 |        205 | Isolated   |
>+----+-----------+------------+------------+
>
>
>
>Alena Prokharchyk <alena.prokharc...@citrix.com> wrote:
>>
>> The error happens not because the IP is null, but because the nic in a
>> certain network can't be found. It looks like there is a bug in the VPC
>> nic plug/unplug process for Guest networks.
>>
>> Kambiz, please do the following to fix it:
>>
>> 1) Stop the MS
>> 2) Take a DB dump of the cloud db in case you have to revert.
>> 3) Run the query:
>>
>> select * from router_network_ref where router_id=<ID of your VR> and
>> network_id not in (select network_id from nics where instance_id=<ID of
>> your VR> and removed is null);
>>
>> It will give you the list of network refs that somehow weren't cleaned
>> up during the nic detach. Remove the entries returned from the
>> router_network_ref table.
>>
>> Let me know how it works.
>>
>> -Alena.
>>
>>
>> On 3/21/14, 3:36 PM, "Kambiz Darabi" <dar...@m-creations.com> wrote:
>>
>>>Hello,
>>>
>>>as this is my first post to the list, I would like to thank all
>>>contributors for CloudStack, which I have been using since last fall
>>>without any problems. I run 4.1.1 with KVM and advanced networking.
>>>
>>>After a restart of the management server (stopping and starting the java
>>>process), the virtual domain router doesn't start and
>>>management-server.log shows a NullPointerException in
>>>NetworkModelImpl.getIpInNetwork (cf. stack trace below).
>>>
>>>By putting the server in debug mode and remote debugging, I found out
>>>that the reason is a row in the table nics which has NULL in ip (cf. row
>>>with id 30 in the result of the select statement below).
>>>
>>>What can I do to quickly solve this problem? Any pointers or suggestions
>>>are appreciated as the system is currently unusable.
>>>
>>>Thank you for your help
>>>
>>>
>>>Kambiz
>>>
>>>
>>>management-server.log:
>>>
>>>2014-03-18 10:03:27,151 DEBUG [cloud.network.NetworkManagerImpl] (Job-Executor-1:job-176) Asking VirtualRouter to prepare for Nic[29-19-30e229ba-21bd-4ab5-8570-9f495bce5019-10.124.99.1]
>>>2014-03-18 10:03:27,151 DEBUG [cloud.network.NetworkManagerImpl] (Job-Executor-1:job-176) Asking Ovs to prepare for Nic[29-19-30e229ba-21bd-4ab5-8570-9f495bce5019-10.124.99.1]
>>>2014-03-18 10:03:27,151 DEBUG [cloud.network.NetworkManagerImpl] (Job-Executor-1:job-176) Asking SecurityGroupProvider to prepare for Nic[29-19-30e229ba-21bd-4ab5-8570-9f495bce5019-10.124.99.1]
>>>2014-03-18 10:03:27,151 DEBUG [cloud.network.NetworkManagerImpl] (Job-Executor-1:job-176) Asking VpcVirtualRouter to prepare for Nic[29-19-30e229ba-21bd-4ab5-8570-9f495bce5019-10.124.99.1]
>>>2014-03-18 10:03:27,151 WARN [network.element.VpcVirtualRouterElement] (Job-Executor-1:job-176) Network Ntwk[205|Guest|8] is not associated with any VPC
>>>2014-03-18 10:03:27,151 DEBUG [cloud.network.NetworkManagerImpl] (Job-Executor-1:job-176) Asking NiciraNvp to prepare for Nic[29-19-30e229ba-21bd-4ab5-8570-9f495bce5019-10.124.99.1]
>>>2014-03-18 10:03:27,151 DEBUG [network.element.NiciraNvpElement] (Job-Executor-1:job-176) Checking if NiciraNvpElement can handle service Connectivity on network net1
>>>2014-03-18 10:03:27,153 DEBUG [cloud.network.NetworkModelImpl] (Job-Executor-1:job-176) Service SecurityGroup is not supported in the network id=205
>>>2014-03-18 10:03:27,156 DEBUG [cloud.network.NetworkManagerImpl] (Job-Executor-1:job-176) Lock is acquired for network id 202 as a part of network implement
>>>2014-03-18 10:03:27,156 DEBUG [cloud.network.NetworkManagerImpl] (Job-Executor-1:job-176) Network id=202 is already implemented
>>>2014-03-18 10:03:27,157 DEBUG [cloud.network.NetworkManagerImpl] (Job-Executor-1:job-176) Lock is released for network id 202 as a part of network implement
>>>2014-03-18 10:03:27,187 DEBUG [cloud.network.NetworkManagerImpl] (Job-Executor-1:job-176) Asking VirtualRouter to prepare for Nic[30-19-30e229ba-21bd-4ab5-8570-9f495bce5019-169.254.3.99]
>>>2014-03-18 10:03:27,187 DEBUG [cloud.network.NetworkManagerImpl] (Job-Executor-1:job-176) Asking Ovs to prepare for Nic[30-19-30e229ba-21bd-4ab5-8570-9f495bce5019-169.254.3.99]
>>>2014-03-18 10:03:27,187 DEBUG [cloud.network.NetworkManagerImpl] (Job-Executor-1:job-176) Asking SecurityGroupProvider to prepare for Nic[30-19-30e229ba-21bd-4ab5-8570-9f495bce5019-169.254.3.99]
>>>2014-03-18 10:03:27,187 DEBUG [cloud.network.NetworkManagerImpl] (Job-Executor-1:job-176) Asking VpcVirtualRouter to prepare for Nic[30-19-30e229ba-21bd-4ab5-8570-9f495bce5019-169.254.3.99]
>>>2014-03-18 10:03:27,187 WARN [network.element.VpcVirtualRouterElement] (Job-Executor-1:job-176) Network Ntwk[202|Control|3] is not associated with any VPC
>>>2014-03-18 10:03:27,188 DEBUG [cloud.network.NetworkManagerImpl] (Job-Executor-1:job-176) Asking NiciraNvp to prepare for Nic[30-19-30e229ba-21bd-4ab5-8570-9f495bce5019-169.254.3.99]
>>>2014-03-18 10:03:27,188 DEBUG [network.element.NiciraNvpElement] (Job-Executor-1:job-176) Checking if NiciraNvpElement can handle service Connectivity on network null
>>>2014-03-18 10:03:27,190 DEBUG [cloud.storage.StorageManagerImpl] (Job-Executor-1:job-176) Checking if we need to prepare 1 volumes for VM[DomainRouter|r-19-VM]
>>>2014-03-18 10:03:27,190 DEBUG [cloud.storage.StorageManagerImpl] (Job-Executor-1:job-176) No need to recreate the volume: Vol[24|vm=19|ROOT], since it already has a pool assigned: 200, adding disk to VM
>>>2014-03-18 10:03:27,224 DEBUG [network.router.VirtualNetworkApplianceManagerImpl] (Job-Executor-1:job-176) Boot Args for VM[DomainRouter|r-19-VM]: template=domP name=r-19-VM eth2ip=10.193.17.190 eth2mask=255.255.255.0 gateway=10.193.17.1 eth0ip=10.124.99.1 eth0mask=255.255.255.0 domain=cs6cloud.internal dhcprange=10.124.99.1 eth0ip=169.254.3.99 eth0mask=255.255.0.0 type=router disable_rp_filter=true dns1=10.193.17.1
>>>2014-03-18 10:03:27,343 DEBUG [network.router.VirtualNetworkApplianceManagerImpl] (Job-Executor-1:job-176) Found 8 ip(s) to apply as a part of domR VM[DomainRouter|r-19-VM] start.
>>>2014-03-18 10:03:27,415 DEBUG [network.router.VirtualNetworkApplianceManagerImpl] (Job-Executor-1:job-176) Resending ipAssoc, port forwarding, load balancing rules as a part of Virtual router start
>>>2014-03-18 10:03:27,499 DEBUG [network.router.VirtualNetworkApplianceManagerImpl] (Job-Executor-1:job-176) Found 12 firewall Egress rule(s) to apply as a part of domR VM[DomainRouter|r-19-VM] start.
>>>2014-03-18 10:03:27,593 ERROR [cloud.vm.VirtualMachineManagerImpl] (Job-Executor-1:job-176) Failed to start instance VM[DomainRouter|r-19-VM]
>>>java.lang.NullPointerException
>>>        at com.cloud.network.NetworkModelImpl.getIpInNetwork(NetworkModelImpl.java:763)
>>>        at com.cloud.network.router.VirtualNetworkApplianceManagerImpl.finalizeNetworkRulesForNetwork(VirtualNetworkApplianceManagerImpl.java:2346)
>>>        at com.cloud.network.router.VpcVirtualNetworkApplianceManagerImpl.finalizeNetworkRulesForNetwork(VpcVirtualNetworkApplianceManagerImpl.java:928)
>>>        at com.cloud.network.router.VirtualNetworkApplianceManagerImpl.finalizeCommandsOnStart(VirtualNetworkApplianceManagerImpl.java:2241)
>>>        at com.cloud.network.router.VpcVirtualNetworkApplianceManagerImpl.finalizeCommandsOnStart(VpcVirtualNetworkApplianceManagerImpl.java:767)
>>>        at com.cloud.network.router.VirtualNetworkApplianceManagerImpl.finalizeDeployment(VirtualNetworkApplianceManagerImpl.java:2205)
>>>        at com.cloud.vm.VirtualMachineManagerImpl.advanceStart(VirtualMachineManagerImpl.java:763)
>>>        at com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:471)
>>>        at com.cloud.network.router.VirtualNetworkApplianceManagerImpl.start(VirtualNetworkApplianceManagerImpl.java:2616)
>>>        at com.cloud.network.router.VirtualNetworkApplianceManagerImpl.startVirtualRouter(VirtualNetworkApplianceManagerImpl.java:1824)
>>>        at com.cloud.network.router.VirtualNetworkApplianceManagerImpl.startRouters(VirtualNetworkApplianceManagerImpl.java:1924)
>>>        at com.cloud.network.router.VirtualNetworkApplianceManagerImpl.deployVirtualRouterInGuestNetwork(VirtualNetworkApplianceManagerImpl.java:1902)
>>>        at com.cloud.network.element.VirtualRouterElement.implement(VirtualRouterElement.java:175)
>>>        at com.cloud.network.NetworkManagerImpl.implementNetworkElementsAndResources(NetworkManagerImpl.java:1518)
>>>        at com.cloud.network.NetworkManagerImpl.implementNetwork(NetworkManagerImpl.java:1434)
>>>        at com.cloud.utils.component.ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept(ComponentInstantiationPostProcessor.java:125)
>>>        at com.cloud.network.NetworkManagerImpl.startNetwork(NetworkManagerImpl.java:2435)
>>>        at com.cloud.network.router.VirtualNetworkApplianceManagerImpl.startRouter(VirtualNetworkApplianceManagerImpl.java:2855)
>>>        at com.cloud.network.router.VirtualNetworkApplianceManagerImpl.startRouter(VirtualNetworkApplianceManagerImpl.java:2824)
>>>        at com.cloud.utils.component.ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept(ComponentInstantiationPostProcessor.java:125)
>>>        at org.apache.cloudstack.api.command.admin.router.StartRouterCmd.execute(StartRouterCmd.java:103)
>>>
>>>
>>>table nics:
>>>
>>>mysql> select * from nics where reserver_name = 'ControlNetworkGuru';
>>>+----+--------------------------------------+-------------+-------------------+---------------+-------------+-------------+---------+---------------+------------+--------+--------------+----------+--------------------+--------------------------------------+-----------+---------------------+---------------+-------------+-------------+--------------------+---------------------+---------------------+-------------+----------+
>>>| id | uuid                                 | instance_id | mac_address       | ip4_address   | netmask     | gateway     | ip_type | broadcast_uri | network_id | mode   | state        | strategy | reserver_name      | reservation_id                       | device_id | update_time         | isolation_uri | ip6_address | default_nic | vm_type            | created             | removed             | ip6_gateway | ip6_cidr |
>>>+----+--------------------------------------+-------------+-------------------+---------------+-------------+-------------+---------+---------------+------------+--------+--------------+----------+--------------------+--------------------------------------+-----------+---------------------+---------------+-------------+-------------+--------------------+---------------------+---------------------+-------------+----------+
>>>|  2 | 289aacb8-cfd7-4879-a632-6cfbda36cbf4 |           1 | 0e:00:a9:fe:00:55 | 169.254.0.85  | 255.255.0.0 | 169.254.0.1 | Ip4     | NULL          |        202 | Static | Reserved     | Start    | ControlNetworkGuru | 993864b4-9dde-47d6-8fd6-cf94050442c6 |         0 | 2014-03-17 22:21:38 | NULL          | NULL        |           0 | SecondaryStorageVm | 2013-09-06 12:44:42 | NULL                | NULL        | NULL     |
>>>|  6 | 5fdf4b1a-b90c-4c79-9d42-9eaf87eaa042 |           2 | 0e:00:a9:fe:02:d3 | 169.254.2.211 | 255.255.0.0 | 169.254.0.1 | Ip4     | NULL          |        202 | Static | Reserved     | Start    | ControlNetworkGuru | 852e0a65-c72a-448f-ac71-2bb3549a5a41 |         0 | 2014-03-17 22:21:38 | NULL          | NULL        |           0 | ConsoleProxy       | 2013-09-06 12:44:42 | NULL                | NULL        | NULL     |
>>>| 10 | 4c4e6368-95d7-419a-a9b3-a5bb394197f0 |           4 | NULL              | NULL          | NULL        | NULL        | NULL    | NULL          |        202 | Static | Deallocating | Start    | ControlNetworkGuru | c28e8ddc-c106-462e-96c8-5d5216dad9b7 |         1 | 2014-03-17 12:27:58 | NULL          | NULL        |           0 | DomainRouter       | 2013-09-10 08:08:39 | 2014-03-17 11:27:58 | NULL        | NULL     |
>>>| 15 | 1f2e99c0-9cd9-47aa-ab10-f190efd7a2dc |           7 | NULL              | NULL          | NULL        | NULL        | NULL    | NULL          |        202 | Static | Deallocating | Start    | ControlNetworkGuru | ca1aa99e-e630-4533-9642-523d8a8b1fea |         1 | 2014-03-17 12:27:52 | NULL          | NULL        |           0 | DomainRouter       | 2013-09-12 10:58:03 | 2014-03-17 11:27:52 | NULL        | NULL     |
>>>| 27 | 1c98c4f2-f604-4a38-a813-f68833b1d250 |          18 | NULL              | NULL          | NULL        | NULL        | NULL    | NULL          |        202 | Static | Deallocating | Start    | ControlNetworkGuru | ad8e0e50-72aa-4c68-8634-8dc89f12fe01 |         1 | 2014-03-18 09:11:16 | NULL          | NULL        |           0 | DomainRouter       | 2014-03-17 11:28:50 | 2014-03-18 08:11:16 | NULL        | NULL     |
>>>| 30 | cabd4cd9-c39f-423f-ad6a-ee3affe0bd9d |          19 | NULL              | NULL          | NULL        | NULL        | NULL    | NULL          |        202 | Static | Allocated    | Start    | ControlNetworkGuru | e81ba56d-a101-4c60-b44f-a0890d56aad9 |         1 | 2014-03-18 09:11:44 | NULL          | NULL        |           0 | DomainRouter       | 2014-03-18 08:11:32 | NULL                | NULL        | NULL     |
>>>+----+--------------------------------------+-------------+-------------------+---------------+-------------+-------------+---------+---------------+------------+--------+--------------+----------+--------------------+--------------------------------------+-----------+---------------------+---------------+-------------+-------------+--------------------+---------------------+---------------------+-------------+----------+
>>>
>>>
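
The check requested at the top of this thread (for every vm_id/network_id pair behind the network's firewall rules, confirm that a non-removed nic and a non-expunged VM exist) could probably be folded into a single query. What follows is only a minimal sketch, not a tested CloudStack procedure. It uses Python's sqlite3 with cut-down stand-ins for firewall_rules, user_ip_address, nics and vm_instance, so the sample rows, the demo() function and the helper name find_stale_static_nat are all hypothetical. It also assumes that vm_instance has a removed column and that the state 'Expunging' marks a VM as expunged; neither is confirmed anywhere in this thread.

```python
import sqlite3

# Hypothetical, cut-down copies of the tables named in the thread.
# The real CloudStack tables have many more columns.
SCHEMA = """
CREATE TABLE firewall_rules (id INTEGER PRIMARY KEY, ip_address_id INTEGER, network_id INTEGER);
CREATE TABLE user_ip_address (id INTEGER PRIMARY KEY, vm_id INTEGER, network_id INTEGER);
CREATE TABLE nics (id INTEGER PRIMARY KEY, instance_id INTEGER, network_id INTEGER, removed TEXT);
CREATE TABLE vm_instance (id INTEGER PRIMARY KEY, name TEXT, state TEXT, removed TEXT);
"""

# IPs used by the network's firewall rules whose VM has no live nic in that
# network, or whose VM is removed / in state 'Expunging' (assumed meaning of
# "non-expunged").
STALE_STATIC_NAT = """
SELECT ip.id, ip.vm_id, ip.network_id
FROM user_ip_address ip
WHERE ip.vm_id IS NOT NULL
  AND ip.id IN (SELECT ip_address_id FROM firewall_rules WHERE network_id = ?)
  AND (
        NOT EXISTS (SELECT 1 FROM nics n
                    WHERE n.instance_id = ip.vm_id
                      AND n.network_id = ip.network_id
                      AND n.removed IS NULL)
     OR NOT EXISTS (SELECT 1 FROM vm_instance v
                    WHERE v.id = ip.vm_id
                      AND v.removed IS NULL
                      AND v.state <> 'Expunging')
  )
ORDER BY ip.id
"""


def find_stale_static_nat(conn, network_id):
    """Return (ip_id, vm_id, network_id) rows that look like stale references."""
    return conn.execute(STALE_STATIC_NAT, (network_id,)).fetchall()


def demo():
    """Run the query against made-up rows; none of this data is from the thread."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO vm_instance VALUES (?, ?, ?, ?)", [
        (40, "i-2-40-VM", "Running", None),
        (41, "i-2-41-VM", "Expunging", "2014-03-17 11:00:00"),
    ])
    conn.executemany("INSERT INTO nics VALUES (?, ?, ?, ?)", [
        (100, 40, 205, None),                   # live nic
        (101, 41, 205, "2014-03-17 11:00:00"),  # removed nic
    ])
    conn.executemany("INSERT INTO user_ip_address VALUES (?, ?, ?)", [
        (5, 40, 205),    # healthy static NAT mapping
        (6, 41, 205),    # points at a removed VM
        (7, None, 205),  # no static NAT VM attached
    ])
    conn.executemany("INSERT INTO firewall_rules VALUES (?, ?, ?)", [
        (1, 5, 205), (2, 6, 205), (3, 7, 205),
    ])
    return find_stale_static_nat(conn, 205)


if __name__ == "__main__":
    for row in demo():
        print(row)
```

With the made-up rows above, only the address with id 6 is reported. The SELECT itself would likely carry over to MySQL with the placeholder replaced by the network id, but it is best tried against a restored dump first rather than the live cloud db.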