Thanks Sumit for reporting the libvirt XML! I have asked the original reporter of the bug for the same information.
If the XML has two interfaces, this means that two ports are present in
network_info - which is produced by _allocate_network. In that case we
can exclude the re-scheduling issue, as the problem is probably confined
to nova.network.quantumv2.api

Salvatore

PS: Perhaps it is better to move the discussion to the lp answer or to
the experimental ask.openstack.org, so more people might contribute.

On 26 March 2013 18:36, Sumit Naiksatam <[email protected]> wrote:
> We saw this issue in Grizzly as well. I investigated the Quantum logs
> and I did not find anything bad. The VM actually does get two
> interfaces in this case, so it seemed like some race condition on the
> nova side:
>
> <interface type="bridge">
>   <mac address="fa:16:3e:a8:fe:96"/>
>   <model type="virtio"/>
>   <driver name="qemu"/>
>   <source bridge="qbr7aac0341-19"/>
>   <target dev="tap7aac0341-19"/>
>   <filterref filter="nova-instance-instance-00000004-fa163ea8fe96">
>     <parameter name="IP" value="10.194.193.4"/>
>     <parameter name="DHCPSERVER" value="10.194.193.2"/>
>     <parameter name="PROJNET" value="10.194.193.0"/>
>     <parameter name="PROJMASK" value="255.255.255.0"/>
>   </filterref>
> </interface>
> <interface type="bridge">
>   <mac address="fa:16:3e:9a:b9:33"/>
>   <model type="virtio"/>
>   <driver name="qemu"/>
>   <source bridge="qbr7ddce382-c7"/>
>   <target dev="tap7ddce382-c7"/>
>   <filterref filter="nova-instance-instance-00000004-fa163e9ab933">
>     <parameter name="IP" value="10.194.193.5"/>
>     <parameter name="DHCPSERVER" value="10.194.193.2"/>
>     <parameter name="PROJNET" value="10.194.193.0"/>
>     <parameter name="PROJMASK" value="255.255.255.0"/>
>   </filterref>
> </interface>
>
> Thanks,
> ~Sumit.
>
> On Tue, Mar 26, 2013 at 10:24 AM, Salvatore Orlando <[email protected]>
> wrote:
>> The reschedule process is apparently safe (at least in my experience).
>> I'm not sure how much the sequentiality of the IPs might be a hint of
>> a different problem, as in the lp answer I see a case where the
>> duplicated addresses are not sequential.
>> Also, the script that is launching these VMs might spawn requests in
>> parallel, so even without a reschedule event there would be no
>> guarantee about the sequentiality of the resulting IPs.
>>
>> It would be interesting to understand whether the total number of
>> ports is the same as the total number of VMs, i.e. is there any VM
>> without a port?
>>
>> Salvatore
>>
>> On 26 March 2013 17:45, Dan Wendlandt <[email protected]> wrote:
>>>
>>> On Tue, Mar 26, 2013 at 9:36 AM, Gary Kotton <[email protected]> wrote:
>>>>
>>>> Hi,
>>>> I have seen something like this with stable folsom. We have yet to
>>>> be able to reproduce it. In our setup we saw that there were
>>>> timeouts with the quantum service. In addition to this we had 2
>>>> compute nodes. My gut feeling was that one of the nodes had a
>>>> failure and the scheduler selected another node; the second node
>>>> would then allocate another port.
>>>> In allocate_for_instance:
>>>> https://github.com/openstack/nova/blob/master/nova/network/quantumv2/api.py#L129
>>>> I think that we should check whether a port already exists on the
>>>> requested network.
>>>> I am away with my family so will not have a chance to look at this
>>>> till I get back.
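For illustration, the existing-port check Gary suggests could look
roughly like the sketch below, written directly against
python-quantumclient. This is only a sketch, not the actual nova code:
get_or_create_port, instance_uuid and net_id are names made up for the
example, and the credentials are a hypothetical devstack-style setup.

    from quantumclient.v2_0 import client as qclient

    # Hypothetical credentials; adjust for the deployment in question.
    quantum = qclient.Client(username='demo', password='secret',
                             tenant_name='demo',
                             auth_url='http://127.0.0.1:5000/v2.0/')

    def get_or_create_port(client, instance_uuid, net_id, tenant_id):
        # A port bound to this instance on this network may already
        # exist, e.g. one left behind by a failed boot on another
        # compute node before the instance was rescheduled.
        existing = client.list_ports(device_id=instance_uuid,
                                     network_id=net_id)['ports']
        if existing:
            return existing[0]
        body = {'port': {'network_id': net_id,
                         'device_id': instance_uuid,
                         'tenant_id': tenant_id,
                         'admin_state_up': True}}
        return client.create_port(body)['port']

Whether re-using such a leftover port is actually safe is exactly the
question Dan raises below.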
>>>
>>> Yeah, that was my initial thought as well, though in the case that
>>> was reported in the bug, the fact that the IPs are sequential
>>> suggested to me that it may not be a "reschedule": the user is
>>> spinning up many VMs at once, so sequential IPs would seem unlikely
>>> if there were a lot of lag time between the creation of the first
>>> port and the second port.
>>>
>>> It may make sense to have a check that deletes any previous ports
>>> with the same device_id before allocating new ones (I probably
>>> wouldn't just re-use such a port, as we don't quite know what state
>>> it may be in).
>>>
>>> dan
>>>
>>>>
>>>> Sorry
>>>> Gary
>>>>
>>>> On 03/26/2013 05:58 PM, Dan Wendlandt wrote:
>>>>
>>>> This is interesting. I'll be in customer meetings and flying for
>>>> the next few hours, so I thought I'd send it out in case anyone
>>>> else has time to investigate first.
>>>>
>>>> https://bugs.launchpad.net/quantum/+bug/1160442
>>>>
>>>> For details, see the associated question:
>>>> https://answers.launchpad.net/quantum/+question/225158
>>>>
>>>> Also, we discussed this a while back, but we should double-check
>>>> that there is no risk of a malicious user changing the device_id of
>>>> its port to match a VM of another tenant, causing confusion by
>>>> making that tenant suddenly see multiple floating IPs for its VM.
>>>> I think at the time we decided that all queries from nova to
>>>> quantum were properly scoped by tenant, but it is worth making
>>>> sure.
>>>>
>>>> Dan
>>>>
>>>> --
>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>> Dan Wendlandt
>>>> Nicira, Inc: www.nicira.com
>>>> twitter: danwendlandt
>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>
>>> --
>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> Dan Wendlandt
>>> Nicira, Inc: www.nicira.com
>>> twitter: danwendlandt
>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~
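Dan's cleanup variant above - deleting rather than re-using ports of
unknown state - might look along these lines. Again just a sketch:
purge_stale_ports is a hypothetical helper, and client is the same
quantumclient instance as in the earlier sketch.

    def purge_stale_ports(client, instance_uuid):
        # Any port still carrying this instance's device_id is left
        # over from an earlier attempt; its state is unknown, so it is
        # safer to recreate it than to re-use it.
        for port in client.list_ports(device_id=instance_uuid)['ports']:
            client.delete_port(port['id'])

Calling something like this at the start of port allocation would let a
rescheduled boot start from a clean slate, at the cost of one extra
round-trip to quantum per boot.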

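On the device_id spoofing concern Dan raises: the defensive pattern is
to never query ports by device_id alone, but to scope every lookup by
tenant as well. A minimal sketch, assuming the same client as above:

    def ports_for_instance(client, tenant_id, instance_uuid):
        # Filtering on tenant_id as well as device_id means a port
        # whose device_id was spoofed by another tenant is never
        # returned here.
        return client.list_ports(tenant_id=tenant_id,
                                 device_id=instance_uuid)['ports']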
