> On 15 Jun 2016, at 17:27, Ihar Hrachyshka <[email protected]> wrote:
>
> First, some context: we talked it thru with Eugene on IRC, and Eugene
> reported that he cannot reproduce the issue on his setup using Ubuntu
> hypervisor with ovs 2.4:
>
> http://eavesdrop.openstack.org/irclogs/%23openstack-neutron/%23openstack-neutron.2016-06-13.log.html#t2016-06-13T19:45:22
>
> So I went and did some testing with the functional test I have implemented. I
> validated the following setups:
>
> - ubuntu 14.04 + ovs 2.0.x
> - centos 7 + ovs 2.4
> - centos 7 + ovs 2.5
>
> All of them fail to pass the test. I also pushed the test without the fix
> into gate, and it failed too:
>
> https://review.openstack.org/329558
>
> So we definitely have some sort of issue that is independent of underlying
> distribution or Open vSwitch.
>
> With that, I believe we should go forward with the fix as a short term
> solution: https://review.openstack.org/327651 (I removed WIP from it.)
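
For anyone skimming the thread, here is a minimal sketch of the ordering that the short-term fix relies on, as described further down in the quoted messages: move the router port into its namespace first, and only then set the MTU from inside that namespace. This is not the code from the review. The function name and the example device and namespace names are made up for illustration, and the sketch only builds the iproute2 command lines without running them.

```python
def plan_router_port_mtu(device, namespace, mtu):
    """Return iproute2 argv lists in the order the workaround needs.

    Hypothetical helper for illustration only; not Neutron code.
    """
    if mtu <= 0:
        raise ValueError("mtu must be a positive integer")
    return [
        # Step 1: move the device out of the root namespace.
        ["ip", "link", "set", "dev", device, "netns", namespace],
        # Step 2: set the MTU from inside the target namespace.
        ["ip", "netns", "exec", namespace,
         "ip", "link", "set", "dev", device, "mtu", str(mtu)],
    ]


if __name__ == "__main__":
    # Example names are placeholders, not taken from a real deployment.
    for argv in plan_router_port_mtu("qr-example", "qrouter-example", 9000):
        print(" ".join(argv))
```

As the quoted reply from the Open vSwitch side says, this ordering leans on behaviour that is considered a bug there, so it should be read as a stopgap rather than a supported setup.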

The patch landed in master, and I seek to backport it to Liberty/Mitaka
(backports proposed).

>
> I will also reach out to ovs developers on the matter to see if they can
> somehow allow us to disable the mtu curtailing, and still stay supported.

I just dropped an email to [email protected] to seek their guidance:
http://openvswitch.org/pipermail/dev/2016-June/073190.html

>
> Ihar
>
>> On 13 Jun 2016, at 19:43, Eugene Nikanorov <[email protected]> wrote:
>>
>> That's interesting.
>>
>>
>> In our deployments we do something like br-ex (linux bridge, mtu 9000) -
>> OVSIntPort (mtu 65000) - br-floating (ovs bridge, mtu 1500) - br-int (ovs
>> bridge, mtu 1500).
>> qgs then are getting created in br-int, traffic goes all the way and that
>> altogether allows jumbo frames over external network.
>>
>> For that reason I thought that mtu inside OVS doesn't really matter.
>> This, however, is for ovs 2.4.1.
>>
>> I wonder if that behavior has changed and if the description is available
>> anywhere.
>>
>> Thanks,
>> Eugene.
>>
>> On Mon, Jun 13, 2016 at 9:49 AM, Ihar Hrachyshka <[email protected]> wrote:
>> Hi all,
>>
>> in Mitaka, we introduced a bunch of changes to the way we handle MTU in
>> Neutron/Nova, making sure that the whole instance data path, starting from
>> instance internal interface, thru hybrid bridge, into the br-int; as well as
>> router data path (qr) have proper MTU value set on all participating
>> devices. On hypervisor side, both Nova and Neutron take part in it, setting
>> it with ip-link tool based on what Neutron plugin calculates for us. So far
>> so good.
>>
>> Turns out that for OVS, it does not work as expected with regard to br-int.
>> There was a bug reported lately: https://launchpad.net/bugs/1590397
>>
>> Briefly, when we try to set MTU on a device that is plugged into a bridge,
>> and if the bridge already has another port with lower MTU, the bridge itself
>> inherits MTU from that latter port, and Linux kernel (?) does not allow
>> setting MTU on the first device at all, making ip link calls ineffective.
>>
>> AFAIU this behaviour is consistent with Linux bridging rules: you can’t have
>> ports of different MTU plugged into the same bridge.
>>
>> Now, that’s a huge problem for Neutron, because we plug ports that belong to
>> different networks (and that hence may have different MTUs) into the same
>> br-int bridge.
>>
>> So I played with the code locally a bit and spotted that currently, we set
>> MTU for router ports before we move their devices into router namespaces.
>> And once the device is in a namespace, ip-link actually works. So I wrote a
>> fix with a functional test that proves the point:
>> https://review.openstack.org/#/c/327651/ The fix was validated by the
>> reporter of the original bug and seems to fix the issue for him.
>>
>> It’s suspicious that it works from inside a namespace but not when the
>> device is still in the root namespace. So I reached out to Jiri Benc from
>> our local Open vSwitch team, and here is a quote:
>>
>> ===
>>
>> "It's a bug in ovs-vswitchd. It doesn't see the interface that's in
>> other netns and thus cannot enforce the correct MTU.
>>
>> We'll hopefully fix it and disallow incorrect MTU setting even across
>> namespaces. However, it requires significant effort and rework of ovs
>> namespace handling.
>>
>> You should not depend on the current buggy behavior. Don't set MTU of
>> the internal interfaces higher than the rest of the bridge, it's not
>> supported. Hacking this around by moving the interface to a netns is
>> exploiting a bug.
>>
>> We can certainly discuss whether this limitation could be relaxed.
>> Honestly, I don't know, it's for a discussion upstream. But as of now,
>> it's not supported and you should not do it."
>>
>> So basically, as long as we try to plug ports with different MTUs into the
>> same bridge, we are utilizing a bug in Open vSwitch, which may break us at
>> any time.
>>
>> I guess our alternatives are:
>> - either redesign bridge setup for openvswitch to e.g. maintain a bridge per
>>   network;
>> - or talk to ovs folks on whether they may support that for us.
>>
>> I understand the former option is too scary. It opens lots of questions,
>> including upgrade impact since it will obviously introduce a dataplane
>> downtime. That would be a huge shift in paradigm, probably too huge to
>> swallow. The latter option may not fly with vswitch folks. Any better ideas?
>>
>> It’s also not clear whether we want to proceed with my immediate fix.
>> Advice is welcome.
>>
>> Thanks,
>> Ihar
>> __________________________________________________________________________
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: [email protected]?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
