So, I vaguely remember an issue introduced a little over a year ago where the broadcast domain value of the NIC was changed from a URI to just a VLAN ID, which worked for VLANs but broke VXLAN and some other things. If I remember correctly, a small set of installs during this period would have created their NICs without the correct broadcast domain value. I don't remember which versions were doing this, but I do know there's a JIRA ticket and a paper trail on how people were fixing it. The code that broke the URI was backed out. VMs created with the bad code would not be compatible with either the new or the old versions of the code.
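To make the difference between the two formats concrete, here is a rough Python sketch. It is my own illustration, not CloudStack code: the function name and the category labels are hypothetical, and it assumes you have already pulled the stored broadcast domain values out of the database yourself. It only sorts a value into "uri" (something like vlan://2014), "bare-id" (just a number), "missing" (NULL or empty), or "other".

```python
import re
from typing import Optional

# Hypothetical pattern: a scheme, "://", then a non-empty value,
# e.g. "vlan://2014" or "vxlan://5000".
_URI_RE = re.compile(r"^[a-z][a-z0-9+.-]*://.+$", re.IGNORECASE)


def classify_broadcast_value(value: Optional[str]) -> str:
    """Classify a stored broadcast domain value (illustrative helper only)."""
    if value is None or value.strip() == "":
        return "missing"   # NULL or empty in the database
    value = value.strip()
    if value.isdigit():
        return "bare-id"   # just a VLAN number, no scheme
    if _URI_RE.match(value):
        return "uri"       # scheme://value form
    return "other"         # anything else needs a manual look


if __name__ == "__main__":
    for sample in ["vlan://2014", "1151", None, "vxlan://5000"]:
        print(repr(sample), "->", classify_broadcast_value(sample))
```

Anything that comes back as "bare-id" or "missing" would be a candidate for the kind of row I'm describing, but treat that as a hint for where to look rather than a diagnosis.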
I was under the impression at the time that some SQL was provided to update the values during an upgrade; perhaps that never made it in, or it somehow got skipped during your upgrade process. At any rate, since there is a null pointer on the broadcast domain type, you may want to check your nics/networks in the MySQL db and verify that the broadcast/isolation values are in URI format and not just a number. Or try to find the bug I'm referring to from around April last year.

On May 14, 2015 5:04 AM, "Andrei Mikhailovsky" <and...@arhont.com> wrote:
> Hi guys,
>
> Forwarding the message to the dev list as I've not had much reply in the
> users list.
>
> In summary, after upgrading from ACS 4.4.2 to 4.5.1 I started having
> migration issues with a lot of vms. Some vms are successfully migrating and
> others are not.
>
> The logs are shown below
>
> Could someone help me to get to the bottom of this problem?
>
> Thanks
>
> Andrei
>
> ----- Forwarded Message -----
> From: "Andrei Mikhailovsky" <and...@arhont.com>
> To: us...@cloudstack.apache.org
> Sent: Wednesday, 13 May, 2015 10:44:29 AM
> Subject: Re: ACS 4.5.1 KVM live migration problem
>
> Hi Rohit,
>
> forgot to answer you on the cloud.vlan table.
>
> That particular vm has a network with vlan id 1151 as shown when I look at
> the network details in the ACS GUI. However, this vlan is not shown in the
> cloud.vlan table. From what I can see the cloud.vlan table shows only the
> public and management network vlan interfaces and does not show the guest
> network vlans.
>
> In terms of the public network vlan which is used for routing traffic to
> the internet from this particular vm, it is:
>
> mysql> select * from vlan where id=12;
> +----+--------------------------------------+-------------+---------------+-----------------+-------------------------------+----------------+----------------+------------+---------------------+-------------+----------+-----------+---------+---------+
> | id | uuid                                 | vlan_id     | vlan_gateway  | vlan_netmask    | description                   | vlan_type      | data_center_id | network_id | physical_network_id | ip6_gateway | ip6_cidr | ip6_range | removed | created |
> +----+--------------------------------------+-------------+---------------+-----------------+-------------------------------+----------------+----------------+------------+---------------------+-------------+----------+-----------+---------+---------+
> | 12 | d13ea4b3-2087-4376-9d0a-f54efe2a55af | vlan://2030 | 178.XXX.XXX.1 | 255.255.255.128 | 178.XXX.XXX.2-178.XXX.XXX.119 | VirtualNetwork |              1 |        200 |                 200 | NULL        | NULL     | NULL      | NULL    | NULL    |
> +----+--------------------------------------+-------------+---------------+-----------------+-------------------------------+----------------+----------------+------------+---------------------+-------------+----------+-----------+---------+---------+
> 1 row in set (0.00 sec)
>
> Hope that helps
>
> Andrei
>
> ----- Original Message -----
> From: "Rohit Yadav" <rohit.ya...@shapeblue.com>
> To: us...@cloudstack.apache.org
> Sent: Wednesday, 13 May, 2015 8:55:55 AM
> Subject: Re: ACS 4.5.1 KVM live migration problem
>
> Hi Andrei,
>
> This looks like an issue similar to
> https://issues.apache.org/jira/browse/CLOUDSTACK-6893
> Can you share the row from your cloud.vlan table and the value of "select
> cache_mode from volume_view where vm_id=<put the vm id here>\G;" for the VM
> causing the NPE?
> > > On 12-May-2015, at 10:51 pm, Andrei Mikhailovsky <and...@arhont.com> > wrote: > > > > > > > > It seems that the problem is worse than i've initially thought. In fact, > I can't migrate most of my vms apart from a handful and I can't determine a > correlation between the migrateable vms and once that produce exception. > > > > Thanks for any help. > > > > Andrei > > > > ----- Original Message ----- > > > > From: "Andrei Mikhailovsky" <and...@arhont.com> > > To: us...@cloudstack.apache.org > > Sent: Tuesday, 12 May, 2015 8:53:16 PM > > Subject: ACS 4.5.1 KVM live migration problem > > > > Hi, > > > > I am having an issue migrating some of vms after recently upgrading to > ACS 4.5.1. I am running Ubuntu 14.04 on both host and management servers. > Here is the output from the log file on a client agent : > > > > > > 2015-05-12 20:42:34,154 DEBUG [kvm.resource.LibvirtComputingResource] > (agentRequest-Handler-1:null) Preparing host for migrating > com.cloud.agent.api.to.VirtualMachineTO@21a038ac > > 2015-05-12 20:42:34,157 DEBUG [kvm.resource.LibvirtConnection] > (agentRequest-Handler-1:null) can't find connection: KVM, for vm: > i-9-1162-VM, continue > > 2015-05-12 20:42:34,159 DEBUG [kvm.resource.LibvirtConnection] > (agentRequest-Handler-1:null) can't find connection: LXC, for vm: > i-9-1162-VM, continue > > 2015-05-12 20:42:34,159 DEBUG [kvm.resource.LibvirtConnection] > (agentRequest-Handler-1:null) can't find which hypervisor the vm used , > then use the default hypervisor > > 2015-05-12 20:42:34,160 DEBUG [kvm.resource.BridgeVifDriver] > (agentRequest-Handler-1:null) nic=[Nic:Guest-178.248.108.205-vlan://2014] > > 2015-05-12 20:42:34,160 DEBUG [kvm.resource.BridgeVifDriver] > (agentRequest-Handler-1:null) creating a vNet dev and bridge for guest > traffic per traffic label cloudstackbr0 > > 2015-05-12 20:42:34,160 DEBUG [kvm.resource.BridgeVifDriver] > (agentRequest-Handler-1:null) Executing: > 
/usr/share/cloudstack-common/scripts/vm/network/vnet/modifyvlan.sh -v 2014 > -p bond0 -b brbond0-2014 -o add > > 2015-05-12 20:42:34,211 DEBUG [kvm.resource.BridgeVifDriver] > (agentRequest-Handler-1:null) Execution is successful. > > 2015-05-12 20:42:34,211 DEBUG [kvm.resource.BridgeVifDriver] > (agentRequest-Handler-1:null) nic=[Nic:Guest-10.1.1.66-null] > > 2015-05-12 20:42:34,212 DEBUG [kvm.storage.KVMStoragePoolManager] > (agentRequest-Handler-1:null) Disconnecting disk > 23add201-e4ee-447b-a448-ecd152aea4ad > > 2015-05-12 20:42:34,212 DEBUG [kvm.storage.LibvirtStorageAdaptor] > (agentRequest-Handler-1:null) Trying to fetch storage pool > cf771bc7-8998-354d-8e10-5564585a3c20 from libvirt > > 2015-05-12 20:42:34,223 DEBUG [kvm.storage.KVMStoragePoolManager] > (agentRequest-Handler-1:null) Disconnecting disk > 55100d25-410e-4fa3-a38b-7717f74d2afe > > 2015-05-12 20:42:34,223 DEBUG [kvm.storage.LibvirtStorageAdaptor] > (agentRequest-Handler-1:null) Trying to fetch storage pool > cf771bc7-8998-354d-8e10-5564585a3c20 from libvirt > > 2015-05-12 20:42:34,232 DEBUG [kvm.storage.KVMStoragePoolManager] > (agentRequest-Handler-1:null) Disconnecting disk > 2db59d16-d17f-49a1-b913-7fbe4025a549 > > 2015-05-12 20:42:34,233 DEBUG [kvm.storage.LibvirtStorageAdaptor] > (agentRequest-Handler-1:null) Trying to fetch storage pool > cf771bc7-8998-354d-8e10-5564585a3c20 from libvirt > > 2015-05-12 20:42:34,243 DEBUG [kvm.storage.KVMStoragePoolManager] > (agentRequest-Handler-1:null) Disconnecting disk > 17afbf31-ac89-46f7-a2c8-f8aed796e4c6 > > 2015-05-12 20:42:34,243 DEBUG [kvm.storage.LibvirtStorageAdaptor] > (agentRequest-Handler-1:null) Trying to fetch storage pool > d8d5ec36-3cb0-39af-8fc6-084a4abd5d28 from libvirt > > 2015-05-12 20:42:34,254 WARN [cloud.agent.Agent] > (agentRequest-Handler-1:null) Caught: > > java.lang.NullPointerException > > at > com.cloud.network.Networks$BroadcastDomainType.getSchemeValue(Networks.java:172) > > at > 
com.cloud.network.Networks$BroadcastDomainType.getValue(Networks.java:226) > > at > com.cloud.hypervisor.kvm.resource.BridgeVifDriver.plug(BridgeVifDriver.java:105) > > at > com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.execute(LibvirtComputingResource.java:3230) > > at > com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.executeRequest(LibvirtComputingResource.java:1307) > > at com.cloud.agent.Agent.processRequest(Agent.java:503) > > at com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:808) > > at com.cloud.utils.nio.Task.run(Task.java:84) > > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > > at java.lang.Thread.run(Thread.java:745) > > 2015-05-12 20:42:34,256 DEBUG [cloud.agent.Agent] > (agentRequest-Handler-1:null) Seq 7-7525233502359390941: { Ans: , MgmtId: > 115129173025118, via: 7, Ver: v1, Flags: 110, > [{"com.cloud.agent.api.Answer":{"result":false,"details":"java.lang.NullPointerException\n\tat > com.cloud.network.Networks$BroadcastDomainType.getSchemeValue(Networks.java:172)\n\tat > com.cloud.network.Networks$BroadcastDomainType.getValue(Networks.java:226)\n\tat > com.cloud.hypervisor.kvm.resource.BridgeVifDriver.plug(BridgeVifDriver.java:105)\n\tat > com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.execute(LibvirtComputingResource.java:3230)\n\tat > com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.executeRequest(LibvirtComputingResource.java:1307)\n\tat > com.cloud.agent.Agent.processRequest(Agent.java:503)\n\tat > com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:808)\n\tat > com.cloud.utils.nio.Task.run(Task.java:84)\n\tat > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)\n\tat > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)\n\tat > java.lang.Thread.run(Thread.java:745)\n","wait":0}}] } > > > > > > > > 
Any idea how to get this fixed? Not sure why all of a sudden the
> > migration stopped working for a handful of vms. I can successfully migrate
> > some vms, but not others.
> >
> > Thanks
> >
> > Andrei
>
> Regards,
> Rohit Yadav
> Software Architect, ShapeBlue
> M. +91 88 262 30892 | rohit.ya...@shapeblue.com
> Blog: bhaisaab.org | Twitter: @_bhaisaab