Thanks Sumit for reporting the libvirt XML! I have asked the original reporter of the bug for the same information.
If the XML has two interfaces, this means that two ports are present in
network_info - which is produced by _allocate_network. In that case we
can exclude the re-scheduling issue, as the problem is probably confined
to nova.network.quantumv2.api

Salvatore

PS: Perhaps it is better to move the discussion to the lp answer or to
the experimental ask.openstack.org, so more people might contribute.

On 26 March 2013 18:36, Sumit Naiksatam <[email protected]> wrote:
> We saw this issue in Grizzly as well. I investigated the Quantum logs
> and I did not find anything bad. The VM actually does get two
> interfaces in this case, so it seemed like some race condition on the
> nova side:
>
> <interface type="bridge">
>   <mac address="fa:16:3e:a8:fe:96"/>
>   <model type="virtio"/>
>   <driver name="qemu"/>
>   <source bridge="qbr7aac0341-19"/>
>   <target dev="tap7aac0341-19"/>
>   <filterref filter="nova-instance-instance-00000004-fa163ea8fe96">
>     <parameter name="IP" value="10.194.193.4"/>
>     <parameter name="DHCPSERVER" value="10.194.193.2"/>
>     <parameter name="PROJNET" value="10.194.193.0"/>
>     <parameter name="PROJMASK" value="255.255.255.0"/>
>   </filterref>
> </interface>
> <interface type="bridge">
>   <mac address="fa:16:3e:9a:b9:33"/>
>   <model type="virtio"/>
>   <driver name="qemu"/>
>   <source bridge="qbr7ddce382-c7"/>
>   <target dev="tap7ddce382-c7"/>
>   <filterref filter="nova-instance-instance-00000004-fa163e9ab933">
>     <parameter name="IP" value="10.194.193.5"/>
>     <parameter name="DHCPSERVER" value="10.194.193.2"/>
>     <parameter name="PROJNET" value="10.194.193.0"/>
>     <parameter name="PROJMASK" value="255.255.255.0"/>
>   </filterref>
> </interface>
>
> Thanks,
> ~Sumit.
>
> On Tue, Mar 26, 2013 at 10:24 AM, Salvatore Orlando <[email protected]>
> wrote:
>> The reschedule process is apparently safe (at least in my experience).
>> I'm not sure how much the sequentiality of the IPs might be a hint of
>> a different problem, as in the lp answer I see a case where the
>> duplicated addresses are not sequential.
>> Also, the script that is launching these VMs might spawn requests in
>> parallel, so even without a reschedule event there would be no
>> guarantee about the sequentiality of the resulting IPs.
>>
>> It would be interesting to understand whether the total number of
>> ports is the same as the total number of VMs, i.e. is there any VM
>> without a port?
>>
>> Salvatore
>>
>> On 26 March 2013 17:45, Dan Wendlandt <[email protected]> wrote:
>>>
>>> On Tue, Mar 26, 2013 at 9:36 AM, Gary Kotton <[email protected]> wrote:
>>>>
>>>> Hi,
>>>> I have seen something like this with stable folsom. We have yet to
>>>> be able to reproduce it. In our setup we saw that there were
>>>> timeouts with the quantum service. In addition to this we had 2
>>>> compute nodes. My gut feeling was that one of the nodes had a
>>>> failure and the scheduler selected another node; the second node
>>>> would then allocate another port.
>>>> In allocate_for_instance:
>>>> https://github.com/openstack/nova/blob/master/nova/network/quantumv2/api.py#L129
>>>> I think that we should check whether a port already exists on the
>>>> requested network.
>>>> I am away with my family so will not have a chance to look at this
>>>> till I get back.
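For illustration, the existing-port check Gary suggests could look
roughly like the sketch below, written directly against
python-quantumclient. This is only a sketch, not the actual nova code:
get_or_create_port, instance_uuid and net_id are names made up for the
example, and the credentials are a hypothetical devstack-style setup.

    from quantumclient.v2_0 import client as qclient

    # Hypothetical credentials; adjust for the deployment in question.
    quantum = qclient.Client(username='demo', password='secret',
                             tenant_name='demo',
                             auth_url='http://127.0.0.1:5000/v2.0/')

    def get_or_create_port(client, instance_uuid, net_id, tenant_id):
        # A port bound to this instance on this network may already
        # exist, e.g. one left behind by a failed boot on another
        # compute node before the instance was rescheduled.
        existing = client.list_ports(device_id=instance_uuid,
                                     network_id=net_id)['ports']
        if existing:
            return existing[0]
        body = {'port': {'network_id': net_id,
                         'device_id': instance_uuid,
                         'tenant_id': tenant_id,
                         'admin_state_up': True}}
        return client.create_port(body)['port']

Whether re-using such a leftover port is actually safe is exactly the
question Dan raises below.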
>>>
>>> Yeah, that was my initial thought as well, though in the case that
>>> was reported in the bug, the fact that the IPs are sequential
>>> suggested to me that it may not be a "reschedule": the user is
>>> spinning up many VMs at once, so sequential IPs would seem unlikely
>>> if there were a lot of lag time between the creation of the first
>>> port and the second port.
>>>
>>> It may make sense to have a check that deletes any previous ports
>>> with the same device_id before allocating new ones (I probably
>>> wouldn't just re-use such a port, as we don't quite know what state
>>> it may be in).
>>>
>>> dan
>>>
>>>>
>>>> Sorry
>>>> Gary
>>>>
>>>> On 03/26/2013 05:58 PM, Dan Wendlandt wrote:
>>>>
>>>> This is interesting. I'll be in customer meetings and flying for
>>>> the next few hours, so I thought I'd send it out in case anyone
>>>> else has time to investigate first.
>>>>
>>>> https://bugs.launchpad.net/quantum/+bug/1160442
>>>>
>>>> For details, see the associated question:
>>>> https://answers.launchpad.net/quantum/+question/225158
>>>>
>>>> Also, we discussed this a while back, but we should double-check
>>>> that there is no risk of a malicious user changing the device_id of
>>>> its port to match a VM of another tenant, causing confusion by
>>>> making that tenant suddenly see multiple floating IPs for its VM.
>>>> I think at the time we decided that all queries from nova to
>>>> quantum were properly scoped by tenant, but it is worth making
>>>> sure.
>>>>
>>>> Dan
>>>>
>>>> --
>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>> Dan Wendlandt
>>>> Nicira, Inc: www.nicira.com
>>>> twitter: danwendlandt
>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>
>>> --
>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> Dan Wendlandt
>>> Nicira, Inc: www.nicira.com
>>> twitter: danwendlandt
>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~
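Dan's cleanup variant above - deleting rather than re-using ports of
unknown state - might look along these lines. Again just a sketch:
purge_stale_ports is a hypothetical helper, and client is the same
quantumclient instance as in the earlier sketch.

    def purge_stale_ports(client, instance_uuid):
        # Any port still carrying this instance's device_id is left
        # over from an earlier attempt; its state is unknown, so it is
        # safer to recreate it than to re-use it.
        for port in client.list_ports(device_id=instance_uuid)['ports']:
            client.delete_port(port['id'])

Calling something like this at the start of port allocation would let a
rescheduled boot start from a clean slate, at the cost of one extra
round-trip to quantum per boot.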

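On the device_id spoofing concern Dan raises: the defensive pattern is
to never query ports by device_id alone, but to scope every lookup by
tenant as well. A minimal sketch, assuming the same client as above:

    def ports_for_instance(client, tenant_id, instance_uuid):
        # Filtering on tenant_id as well as device_id means a port
        # whose device_id was spoofed by another tenant is never
        # returned here.
        return client.list_ports(tenant_id=tenant_id,
                                 device_id=instance_uuid)['ports']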
