[openstack-dev] [Neutron] Crash Issue: OVS-Agent status needs to be fully represented/processed
Recently we encountered some ovs-agent crash issues. [1][2][3]

*[Root cause]*

1. Currently only a 'restarted' flag is used in rpc_loop() to identify ovs status.

   ovs_restarted = self.check_ovs_restart()

   *True*: ovs is running, but a restart happened before this loop. rpc_loop() resets bridges and re-processes ports.
   *False*: ovs has been running since the last loop, and rpc_loop() continues to process in the normal way.

   But if ovs is dead, or is not up yet during a restart, check_ovs_restart() incorrectly returns "True". rpc_loop() then goes on to reset bridges and apply other ovs operations until they raise exceptions and the agent crashes.
   Related Bug: [1] [2]

2. Also, during agent boot up, ovs status is not checked at all. When ovs is dead, the agent crashes without any useful log info.
   Related Bug: [3]

*[Proposal]*

1. Add constants {NORMAL, DEAD, RESTARTED} to represent ovs status.

   NORMAL - ovs has been running since the last loop, and rpc_loop() continues to process in the normal way.
   RESTARTED - ovs is running, but a restart happened before this loop. rpc_loop() resets bridges and re-processes ports.
   DEAD - keep the agent running, but rpc_loop() doesn't apply ovs operations, to prevent unnecessary exceptions/crashes. When ovs comes back up, the agent enters RESTARTED mode.

2. Check ovs status during agent boot up. If it's DEAD, exit gracefully, since subsequent operations would cause a crash, and write a log entry stating that the dead ovs caused the agent to terminate.

*[Code Review]*

https://review.openstack.org/#/c/110538/

It would be appreciated if you could share some thoughts or do a quick code review. Thanks.

Best,
Robin

[1] https://bugs.launchpad.net/neutron/+bug/1296202
[2] https://bugs.launchpad.net/neutron/+bug/1350179
[3] https://bugs.launchpad.net/neutron/+bug/1351135

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
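
The three-state proposal in the message above can be sketched roughly as follows. This is only an illustration, not the code under review: the names OvsStatus, check_ovs_status(), plan_iteration() and check_at_boot() are hypothetical, the two boolean inputs stand in for whatever probes the agent really uses, and the exit code of 1 is an assumption.

```python
import enum
import logging

LOG = logging.getLogger(__name__)


class OvsStatus(enum.Enum):
    """The three states named in the proposal."""
    NORMAL = "normal"
    RESTARTED = "restarted"
    DEAD = "dead"


def check_ovs_status(ovs_reachable, restart_detected):
    """Hypothetical stand-in for a three-state check_ovs_restart().

    Both arguments are assumed probe results; how the agent obtains
    them is outside this sketch.
    """
    if not ovs_reachable:
        # Dead, or not yet up during a restart: must not be reported as restarted.
        return OvsStatus.DEAD
    if restart_detected:
        return OvsStatus.RESTARTED
    return OvsStatus.NORMAL


def plan_iteration(status):
    """Return the (hypothetical) step names one rpc_loop() pass would run."""
    if status is OvsStatus.DEAD:
        # Keep the agent alive but skip every ovs operation.
        return []
    if status is OvsStatus.RESTARTED:
        return ["reset_bridges", "reprocess_all_ports"]
    return ["process_ports"]


def check_at_boot(status):
    """Exit with a clear log message if ovs is dead at agent start-up."""
    if status is OvsStatus.DEAD:
        LOG.error("ovs is dead; terminating agent")
        raise SystemExit(1)  # exit code 1 is an assumption


if __name__ == "__main__":
    for reachable, restarted in [(True, False), (False, False), (True, True)]:
        status = check_ovs_status(reachable, restarted)
        print(status.name, plan_iteration(status))
```

The point of the separate DEAD state is that an unreachable ovs no longer falls through to the "restarted" branch, so bridge resets are only attempted once ovs is reachable again.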
Re: [openstack-dev] Neutron Distributed Virtual Router
NSX also distributes its firewall. So besides VPN, firewalling may be another reason people still need the existing router, at least until Neutron implements FW in a distributed fashion as well.

The discussion about advanced services and DVR is recorded here:
https://etherpad.openstack.org/p/Distributed-Virtual-Router

Best,
Robin Wang

2013/12/10 Nachi Ueno

> Hi Yong
>
> NSX has two kinds of router:
> edge and distributed.
>
> Edge nodes will work as VPN service and advanced service nodes.
>
> Actually, the VPNaaS OSS impl is running in the l3-agent,
> so IMO we also need the l3-agent as the basis for some edge services.
>
> 2013/12/9 Yongsheng Gong:
> > If distributed router is good enough, why do we still need
> > non-distributed router?
> >
> > On Tue, Dec 10, 2013 at 9:04 AM, Ian Wells wrote:
> >>
> >> I would imagine that, from the Neutron perspective, you get a single
> >> router whether or not it's distributed. I think that if a router is
> >> distributed - regardless of whether it's tenant-tenant or tenant-outside -
> >> it certainly *could* have some sort of SLA flag, but I don't think a simple
> >> 'distributed' flag is either here or there; it's not telling the tenant
> >> anything meaningful.
> >>
> >> On 10 December 2013 00:48, Mike Wilson wrote:
> >>>
> >>> I guess the question that immediately comes to mind is, is there anyone
> >>> that doesn't want a distributed router? I guess there could be someone out
> >>> there that hates the idea of traffic flowing in a balanced fashion, but
> >>> can't they just run a single router then? Does there really need to be some
> >>> flag to disable/enable this behavior? Maybe I am oversimplifying things...
> >>> you tell me.
> >>>
> >>> -Mike Wilson
> >>>
> >>> On Mon, Dec 9, 2013 at 3:01 PM, Vasudevan, Swaminathan (PNB Roseville)
> >>> wrote:
> >>>>
> >>>> Hi Folks,
> >>>>
> >>>> We are in the process of defining the API for the Neutron Distributed
> >>>> Virtual Router, and we have a question.
> >>>>
> >>>> Just wanted to get the feedback from the community before we implement
> >>>> and post for review.
> >>>>
> >>>> We are planning to use the “distributed” flag for the routers that are
> >>>> supposed to be routing traffic locally (both East West and North South).
> >>>>
> >>>> This “distributed” flag is already there in the “neutronclient” API, but
> >>>> currently only utilized by the “Nicira Plugin”.
> >>>>
> >>>> We would like to go ahead and use the same “distributed” flag and add an
> >>>> extension to the router table to accommodate the “distributed” flag.
> >>>>
> >>>> Please let us know your feedback.
> >>>>
> >>>> Thanks.
> >>>>
> >>>> Swaminathan Vasudevan
> >>>> Systems Software Engineer (TC)
> >>>> HP Networking
> >>>> Hewlett-Packard
> >>>> 8000 Foothills Blvd
> >>>> M/S 5541
> >>>> Roseville, CA - 95747
> >>>> tel: 916.785.0937
> >>>> fax: 916.785.1815
> >>>> email: swaminathan.vasude...@hp.com
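
To make the "distributed" flag from this thread concrete, here is a minimal sketch of what a router-create request body carrying it might look like. The helper build_router_request() is hypothetical; the {"router": {...}} wrapper follows the usual Neutron request layout, and whether a given plugin accepts the attribute is exactly the open question above.

```python
import json


def build_router_request(name, distributed=None):
    """Build a router-create body (hypothetical helper, not a Neutron API).

    The flag is omitted when `distributed` is None, so a plugin that does
    not support it still receives a plain router request.
    """
    router = {"name": name}
    if distributed is not None:
        router["distributed"] = bool(distributed)
    return {"router": router}


if __name__ == "__main__":
    # "router1" is an arbitrary example name.
    print(json.dumps(build_router_request("router1", distributed=True), sort_keys=True))
    print(json.dumps(build_router_request("router1"), sort_keys=True))
```

Leaving the key out by default mirrors the point raised in the replies: a tenant asks for a router, and the flag is only sent when someone explicitly wants to choose.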
Re: [openstack-dev] [Neutron] Distributed Virtual Router Discussion
Hi Artem,

Very happy to see more stackers working on this feature. : )

"Note that the images in your document are badly corrupted - maybe my questions could already be answered by your diagrams."

I ran into the same issue at first. Downloading the doc and opening it locally may help; it worked for me.

Also, a wiki page has been created for the DVR/VDR feature, including some interesting performance test output. Thanks.
https://wiki.openstack.org/wiki/Distributed_Router_for_OVS

Best,
Robin Wang

From: Artem Dmytrenko
Date: 2013-10-22 02:51
To: yong sheng gong (gong...@unitedstack.com); cloudbe...@gmail.com; OpenStack Development Mailing List
Subject: Re: [openstack-dev] Distributed Virtual Router Discussion

Hi Swaminathan.

I work for a virtual networking startup called Midokura and I'm very interested in joining the discussion. We currently have a distributed router implementation using the existing Neutron API. Could you clarify why distributed and centrally located routing implementations need to be distinguished? Another question: are you proposing a distributed routing implementation for tenant routers, or for the router connecting the virtual cloud to the external network? The reason I'm asking is that our company would also like to propose a router implementation that would eliminate single-point uplink failures. We have submitted a couple of blueprints on that topic (https://blueprints.launchpad.net/neutron/+spec/provider-router-support, https://blueprints.launchpad.net/neutron/+spec/bgp-dynamic-routing) and would appreciate an opportunity to collaborate on making it a reality.

Note that the images in your document are badly corrupted - maybe my questions could already be answered by your diagrams. Could you update your document with legible diagrams?

Looking forward to further discussing this topic with you!
Sincerely,
Artem Dmytrenko

On Mon, 10/21/13, Vasudevan, Swaminathan (PNB Roseville) wrote:

Subject: [openstack-dev] Distributed Virtual Router Discussion
To: "yong sheng gong (gong...@unitedstack.com)", "cloudbe...@gmail.com", "OpenStack Development Mailing List (openstack-dev@lists.openstack.org)"
Date: Monday, October 21, 2013, 12:18 PM

Hi Folks,

I am currently working on a blueprint for Distributed Virtual Router. If anyone is interested in being part of the discussion, please let me know.

I have put together a first draft of my blueprint and have posted it on Launchpad for review.
https://blueprints.launchpad.net/neutron/+spec/neutron-ovs-dvr

Thanks.

Swaminathan Vasudevan
Systems Software Engineer (TC)
HP Networking
Hewlett-Packard
8000 Foothills Blvd
M/S 5541
Roseville, CA - 95747
tel: 916.785.0937
fax: 916.785.1815
email: swaminathan.vasude...@hp.com
[openstack-dev] [Neutron] Serious Performance Issue in OVS plugin
Hi all,

In our Grizzly deployment, we found a significant drop in networking throughput. When using the hybrid vif-driver together with the ovs plugin, throughput falls to 2.34 Gb/s. If we switch back to the common vif-driver, throughput is 12.7 Gb/s.

A bug has been filed for it: https://bugs.launchpad.net/neutron/+bug/1223267

The hybrid vif-driver makes it possible to leverage the iptables-based security group feature. However, this dramatic performance reduction might be a big cost, especially in a 10 GE environment.

It would be appreciated if you could share some suggestions on how to solve this issue. Thanks very much.

Best,
Robin Wang