Re: [openstack-dev] [neutron] possible race condition with nova instance and neutron ports

2017-08-10 Thread Saverio Proto
Thank you.
It worked with the new settings. Now the instances are correctly put in
ERROR state if the network is not functional.
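
For reference, a quick way to list the instances that ended up in ERROR
state is:

openstack server list --status ERROR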

To solve the performance problem and have zero instances in ERROR state I
had to enable the neutron-rootwrap-daemon. Without it the DHCP agents
were not fast enough to consume the RabbitMQ queues.
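
For reference, this is roughly how the daemon can be enabled in the agent
configuration (the exact file and paths may differ per deployment):

[agent]
# Run rootwrap as a long-lived daemon instead of spawning a new process
# for every privileged command; this speeds up the DHCP agents a lot.
root_helper_daemon = sudo neutron-rootwrap-daemon /etc/neutron/rootwrap.conf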

thanks

Saverio


On 09/08/17 14:40, Sławomir Kapłoński wrote:
> With these settings Nova does not wait for info from Neutron, and that is why
> your instances start before the network is ready.
> If you change this timeout to a value higher than 0, the instance will be
> paused and Nova will wait for Neutron to report that the port is active
> (you should also check the credentials config in the Neutron server).
> If you also set vif_plugging_is_fatal=True, Nova will put the instance in
> ERROR state if the port is not active when the timeout expires.
> 


-- 
SWITCH
Saverio Proto, Peta Solutions
Werdstrasse 2, P.O. Box, 8021 Zurich, Switzerland
phone +41 44 268 15 15, direct +41 44 268 1573
saverio.pr...@switch.ch, http://www.switch.ch

http://www.switch.ch/stories



Re: [openstack-dev] [neutron] possible race condition with nova instance and neutron ports

2017-08-09 Thread Sławomir Kapłoński
With these settings Nova does not wait for info from Neutron, and that is why
your instances start before the network is ready.
If you change this timeout to a value higher than 0, the instance will be
paused and Nova will wait for Neutron to report that the port is active (you
should also check the credentials config in the Neutron server).
If you also set vif_plugging_is_fatal=True, Nova will put the instance in
ERROR state if the port is not active when the timeout expires.
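
For reference, a sketch of the corresponding nova.conf section on the compute
nodes, using the Ocata default values mentioned elsewhere in this thread:

[DEFAULT]
# Put the instance in ERROR state if Neutron never reports the port active.
vif_plugging_is_fatal = True
# Seconds to wait for the network-vif-plugged event from Neutron.
vif_plugging_timeout = 300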

-- 
Slawek

> Message written by Saverio Proto  on
> 09.08.2017 at 14:24:
> 
> Hello,
> 
> thanks for the tip.
> I checked on a compute node, in the nova.conf file I have the following:
> 
> vif_plugging_is_fatal=False
> vif_plugging_timeout=0
> 
> both options are in the [DEFAULT] section.
> I guess these settings were never changed since we were running Icehouse.
> 
> Should I change to the Ocata default values? (True and 300)
> 
> I will try that today.
> Thank you
> 
> Saverio
> 
> 
> On 09/08/17 09:23, Sławomir Kapłoński wrote:
>> Hello,
>> 
>> Do you have the vif_plugging_timeout and vif_plugging_is_fatal options
>> configured in nova-compute?
>> With those options Nova should pause the VM until the port is set to ACTIVE
>> in Neutron.
>> 
> 
> 
> -- 
> SWITCH
> Saverio Proto, Peta Solutions
> Werdstrasse 2, P.O. Box, 8021 Zurich, Switzerland
> phone +41 44 268 15 15, direct +41 44 268 1573
> saverio.pr...@switch.ch, http://www.switch.ch
> 
> http://www.switch.ch/stories
> 




Re: [openstack-dev] [neutron] possible race condition with nova instance and neutron ports

2017-08-09 Thread Saverio Proto
Hello,

thanks for the tip.
I checked on a compute node, in the nova.conf file I have the following:

vif_plugging_is_fatal=False
vif_plugging_timeout=0

both options are in the [DEFAULT] section.
I guess these settings were never changed since we were running Icehouse.

Should I change to the Ocata default values? (True and 300)

I will try that today.
Thank you

Saverio


On 09/08/17 09:23, Sławomir Kapłoński wrote:
> Hello,
> 
> Do you have the vif_plugging_timeout and vif_plugging_is_fatal options
> configured in nova-compute?
> With those options Nova should pause the VM until the port is set to ACTIVE
> in Neutron.
> 


-- 
SWITCH
Saverio Proto, Peta Solutions
Werdstrasse 2, P.O. Box, 8021 Zurich, Switzerland
phone +41 44 268 15 15, direct +41 44 268 1573
saverio.pr...@switch.ch, http://www.switch.ch

http://www.switch.ch/stories



Re: [openstack-dev] [neutron] possible race condition with nova instance and neutron ports

2017-08-09 Thread Sławomir Kapłoński
Hello,

Do you have the vif_plugging_timeout and vif_plugging_is_fatal options
configured in nova-compute?
With those options Nova should pause the VM until the port is set to ACTIVE
in Neutron.
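
A quick way to check the current values on a compute node (assuming the usual
configuration path) is something like:

grep -E 'vif_plugging_(is_fatal|timeout)' /etc/nova/nova.conf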

-- 
Slawek

> Message written by Saverio Proto  on
> 09.08.2017 at 09:06:
> 
> Hello,
> 
> I see this in OpenStack Newton.
> 
> I start 200 instances with a one-liner:
> 
> openstack server create \
> --image "Ubuntu Xenial 16.04 (SWITCHengines)" \
> --flavor c1.small \
> --network demonetwork \
> --user-data cloud-init.txt \
> --key-name macsp \
> --min 200 \
> --max 200 test
> 
> When I do this I see a problem where all the instances boot and are in
> Running state before all the Neutron ports are ACTIVE.
> 
> With `openstack port list` I see Neutron ports still in BUILD state. It
> takes a long time until they are all ACTIVE; the instances that use
> those ports boot up much faster.
> 
> In RabbitMQ I see the dhcp_agent. queues growing.
> 
> In the end all the Neutron ports become ACTIVE and the RabbitMQ queues go
> back to normal. But many VMs fail to get an address via DHCP because, at
> the time they were trying, the dnsmasq process did not yet have a
> corresponding entry for the instance in its `host` file. Unfortunately
> the instance gives up trying to obtain an address via DHCP after 5 minutes.
> 
> When I start 200 instances I always end up with 15 to 30 instances
> without an IP address. I check carefully using cloud-init: I make the
> instances phone home to check whether they are alive.
> 
> Should I open a bug for this? It looks like a race condition where
> Nova boots the instance before the Neutron port is really ready.
> 
> thank you for your feedback.
> 
> Saverio
> 
> 
> 
> 
> 
> -- 
> SWITCH
> Saverio Proto, Peta Solutions
> Werdstrasse 2, P.O. Box, 8021 Zurich, Switzerland
> phone +41 44 268 15 15, direct +41 44 268 1573
> saverio.pr...@switch.ch, http://www.switch.ch
> 
> http://www.switch.ch/stories
> 




[openstack-dev] [neutron] possible race condition with nova instance and neutron ports

2017-08-09 Thread Saverio Proto
Hello,

I see this in OpenStack Newton.

I start 200 instances with a one-liner:

openstack server create \
--image "Ubuntu Xenial 16.04 (SWITCHengines)" \
--flavor c1.small \
--network demonetwork \
--user-data cloud-init.txt \
--key-name macsp \
--min 200 \
--max 200 test

When I do this I see a problem where all the instances boot and are in
Running state before all the Neutron ports are ACTIVE.

With `openstack port list` I see Neutron ports still in BUILD state. It
takes a long time until they are all ACTIVE; the instances that use
those ports boot up much faster.
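
For reference, one way to spot the ports that are still building (column
names may vary with the client version) is roughly:

openstack port list -c ID -c Status | grep BUILD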

In RabbitMQ I see the dhcp_agent. queues growing.
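
The queue depths can be watched roughly like this (assuming direct access to
the RabbitMQ node):

rabbitmqctl list_queues name messages | grep dhcp_agent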

In the end all the Neutron ports become ACTIVE and the RabbitMQ queues go
back to normal. But many VMs fail to get an address via DHCP because, at the
time they were trying, the dnsmasq process did not yet have a corresponding
entry for the instance in its `host` file. Unfortunately the instance gives
up trying to obtain an address via DHCP after 5 minutes.

When I start 200 instances I always end up with 15 to 30 instances
without an IP address. I check carefully using cloud-init: I make the
instances phone home to check whether they are alive.
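
A minimal sketch of the phone-home part of such a cloud-init user-data file
(the collector URL is only a placeholder):

#cloud-config
# phone_home posts instance data to the given URL at the end of boot,
# which only succeeds if the instance actually obtained an IP address.
phone_home:
  url: http://collector.example.com/$INSTANCE_ID/
  post: all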

Should I open a bug for this? It looks like a race condition where
Nova boots the instance before the Neutron port is really ready.

thank you for your feedback.

Saverio





-- 
SWITCH
Saverio Proto, Peta Solutions
Werdstrasse 2, P.O. Box, 8021 Zurich, Switzerland
phone +41 44 268 15 15, direct +41 44 268 1573
saverio.pr...@switch.ch, http://www.switch.ch

http://www.switch.ch/stories

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev