Summary: Liberty OVS agent restarts are better, but still need work. See: https://bugs.launchpad.net/neutron/+bug/1514056
As many of you know, Liberty has a fix for OVS agent restarts so that the agent no longer dumps all flows when starting, which used to result in a loss of traffic. Unfortunately, Liberty neutron still has issues with OVS agent restarts. The fix that went into Liberty prevents the agent from dropping flows on the br-tun and br-int bridges, and that helps greatly, but the br-ex bridge still has its flows cleared on startup.

You may be thinking: Wait, br-ex only has like 3 flows on it, how can that be a problem? The issue appears to be that the br-ex flows are cleared early and not set up again until late in the process. This means that routers on the node where the OVS agent is restarting lose network connectivity for the majority of the restart time.

I did some testing with this yesterday, comparing a few scenarios with 100 floating IPs (FIPs), 100 instances, and various router layouts. You can find the complete data here: https://docs.google.com/spreadsheets/d/1ZGra_MszBlL0fNsFqd4nOvh1PsgWu58-GxEeh1m1BPw/edit?usp=sharing

The summary looks like this:

100 routers, 100 networks, 100 floating IPs, 100 instances, single node test:
  Kilo average outage time: 47 seconds
  Liberty average outage time: 37 seconds

1 router, 1 network, 100 floating IPs, 100 instances, single node test:
  Kilo average outage time: 46 seconds
  Liberty average outage time: 13 seconds

1 router, 1 network, 100 floating IPs, 100 instances, router on a separate node, all instances on a single node, OVS restart on compute node:
  Kilo average outage time: 25 seconds
  Liberty average outage time: 0 to 1 seconds

I did my testing by using fping to send 1-second pings to all of the floating IPs. In the last test, the Liberty run frequently lost no packets at all, so I was not really able to measure that scenario other than to qualify it as good.

This is a huge operational issue for us, and I suspect for many of the rest of you using OVS.
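For anyone who wants to repeat the measurement, here is a minimal sketch of how an average outage time could be derived from 1-second pings. It is not the script used for the numbers above. It assumes the pings were captured in the per-target format that "fping -C <count> -q" prints (one line per target, one field per probe, "-" for a lost probe), and it approximates the outage for each floating IP as its longest run of consecutive lost probes. The sample addresses and the function names are hypothetical.

```python
from statistics import mean


def parse_fping_line(line):
    """Split one 'target : r1 r2 - r3' line into (target, replies).

    replies is a list of booleans, True where the probe was answered.
    Assumes the 'fping -C' per-target format, where '-' marks a lost probe.
    """
    target, _, results = line.partition(" : ")
    replies = [field != "-" for field in results.split()]
    return target.strip(), replies


def outage_seconds(replies, interval=1.0):
    """Approximate the outage as the longest run of consecutive lost probes."""
    longest = current = 0
    for answered in replies:
        current = 0 if answered else current + 1
        longest = max(longest, current)
    return longest * interval


def average_outage(fping_output, interval=1.0):
    """Return (per-target outage dict, average outage) for captured output."""
    outages = {}
    for line in fping_output.splitlines():
        if " : " not in line:
            continue  # skip blank lines and anything that is not a result line
        target, replies = parse_fping_line(line)
        outages[target] = outage_seconds(replies, interval)
    average = mean(outages.values()) if outages else 0.0
    return outages, average


# Hypothetical capture for two floating IPs (documentation address range).
SAMPLE = """\
192.0.2.10 : 0.51 0.48 - - - 0.50
192.0.2.11 : 0.47 - - - - 0.49
"""

outages, avg = average_outage(SAMPLE)
print(outages, avg)
```

With a 1-second probe interval the resolution is about one second, which is why a result like "0 to 1 seconds" cannot be narrowed down any further with this method.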
I’d encourage everyone who is using OVS to register interest in having this fixed in the LP bug (https://bugs.launchpad.net/neutron/+bug/1514056). Right now this bug is marked as low priority.

_______________________________________________
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators