Bug submitted:
https://bugs.launchpad.net/neutron/+bug/1482521

Thanks,
Artur

From: Oleg Bondarev [mailto:obonda...@mirantis.com]
Sent: Thursday, August 6, 2015 5:18 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [neutron][dvr] Removing fip namespace when 
restarting L3 agent.



On Thu, Aug 6, 2015 at 5:23 PM, Korzeniewski, Artur 
<artur.korzeniew...@intel.com> wrote:
Thanks Kevin for that hint.
But it does not resolve the connectivity problem; it only stops the agent from 
removing the namespace when it is asked to.
The real question is: why is the FipNamespace.delete() method in 
/neutron/neutron/agent/l3/dvr_fip_ns.py invoked in the first place?

I’ve captured the traceback for this situation:
2015-08-06 06:35:28.469 DEBUG neutron.agent.linux.utils [-] Unable to access /opt/openstack/data/neutron/external/pids/8223e12e-837b-49d4-9793-63603fccbc9f.pid from (pid=70216) get_value_from_file /opt/openstack/neutron/neutron/agent/linux/utils.py:222
2015-08-06 06:35:28.469 DEBUG neutron.agent.linux.utils [-] Unable to access /opt/openstack/data/neutron/external/pids/8223e12e-837b-49d4-9793-63603fccbc9f.pid from (pid=70216) get_value_from_file /opt/openstack/neutron/neutron/agent/linux/utils.py:222
2015-08-06 06:35:28.469 DEBUG neutron.agent.linux.external_process [-] No process started for 8223e12e-837b-49d4-9793-63603fccbc9f from (pid=70216) disable /opt/openstack/neutron/neutron/agent/linux/external_process.py:113
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/eventlet/queue.py", line 117, in switch
    self.greenlet.switch(value)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/greenthread.py", line 214, in main
    result = function(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/oslo_service/service.py", line 612, in run_service
    service.start()
  File "/opt/openstack/neutron/neutron/service.py", line 233, in start
    self.manager.after_start()
  File "/opt/openstack/neutron/neutron/agent/l3/agent.py", line 641, in after_start
    self.periodic_sync_routers_task(self.context)
  File "/opt/openstack/neutron/neutron/agent/l3/agent.py", line 519, in periodic_sync_routers_task
    self.fetch_and_sync_all_routers(context, ns_manager)
  File "/opt/openstack/neutron/neutron/agent/l3/namespace_manager.py", line 91, in __exit__
    self._cleanup(_ns_prefix, ns_id)
  File "/opt/openstack/neutron/neutron/agent/l3/namespace_manager.py", line 140, in _cleanup
    ns.delete()
  File "/opt/openstack/neutron/neutron/agent/l3/dvr_fip_ns.py", line 147, in delete
    raise TypeError("ss")
TypeError: ss

It seems that the fip namespace is not processed at startup of the L3 agent, so 
the cleanup removes the namespace.
It also removes the interface that connects to the local DVR router, so the VM 
has no internet access via its floating IP:
Command: ['ip', 'netns', 'exec', 'fip-8223e12e-837b-49d4-9793-63603fccbc9f', 
'ip', 'link', 'del', u'fpr-fe517b4b-d']

If the interface inside the fip namespace is not deleted, the VM has full 
internet access without any downtime.
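
To illustrate the behaviour suspected above, here is a minimal sketch of a 
"keep or delete" cleanup done in a context manager's __exit__, which is the 
call path the traceback shows (__exit__ -> _cleanup -> ns.delete()). This is 
not the actual Neutron code: SketchNamespaceManager, keep() and run_sync() are 
hypothetical names, and skipping the cleanup after a failed sync is an 
assumption made for the example.

```python
class SketchNamespaceManager:
    """Hypothetical model: namespaces not marked as kept are deleted on exit."""

    def __init__(self, existing):
        self._existing = set(existing)  # namespaces found on the host
        self._kept = set()              # namespaces touched during the sync
        self.deleted = []               # record of what cleanup removed

    def __enter__(self):
        self._kept.clear()
        return self

    def keep(self, name):
        # Called by the sync loop for every namespace it processes.
        self._kept.add(name)

    def __exit__(self, exc_type, exc, tb):
        if exc_type is not None:
            # Assumption: a failed sync should not trigger any cleanup.
            return False
        for name in sorted(self._existing - self._kept):
            # The real agent would call ns.delete() here.
            self.deleted.append(name)
        return False


def run_sync(existing, processed):
    """Run one full sync and return the namespaces the cleanup deleted."""
    manager = SketchNamespaceManager(existing)
    with manager as mgr:
        for name in processed:
            mgr.keep(name)
    return manager.deleted


if __name__ == "__main__":
    host = ["qrouter-a", "fip-x"]
    # The sync loop only processes the router namespace, so "fip-x" is removed.
    print(run_sync(host, ["qrouter-a"]))
    # If the fip namespace were also marked as kept, nothing would be removed.
    print(run_sync(host, ["qrouter-a", "fip-x"]))
```

In this model, the fip namespace would survive a restart only if the full sync 
marks it as kept before the with-block exits.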

Can we consider it a bug? I guess it is something in the startup/full-sync 
logic, since the log is saying:
/opt/openstack/data/neutron/external/pids/8223e12e-837b-49d4-9793-63603fccbc9f.pid

I think yes, we can consider it a bug. Can you please file one? I can take it 
and probably fix it.


And after finishing the sync loop, the fip namespace is deleted…

Regards,
Artur

From: Kevin Benton [mailto:blak...@gmail.com]
Sent: Thursday, August 6, 2015 7:40 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [neutron][dvr] Removing fip namespace when 
restarting L3 agent.

Can you try setting the following to False:
https://github.com/openstack/neutron/blob/dc0944f2d4e347922054bba679ba7f5d1ae6ffe2/etc/l3_agent.ini#L97

On Wed, Aug 5, 2015 at 3:36 PM, Korzeniewski, Artur 
<artur.korzeniew...@intel.com> wrote:
Hi all,
While testing Neutron upgrades, I have found that restarting the L3 agent 
in DVR mode causes network downtime for VMs with a configured floating IP.
The downtime is visible when pinging the VM from an external network: 2-3 pings 
are lost.
The responsible place in code is:
DVR: destroy fip ns: fip-8223e12e-837b-49d4-9793-63603fccbc9f from (pid=156888) 
delete /opt/openstack/neutron/neutron/agent/l3/dvr_fip_ns.py:164

Can someone explain why the fip namespace is deleted? Can we work out a 
solution in which there is no downtime of VM access?

Artur Korzeniewski
--------------------------------------------
Intel Technology Poland sp. z o.o.
KRS 101882
ul. Slowackiego 173, 80-298 Gdansk


__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



--
Kevin Benton
