Turns out my issue is simply that Havana requires Open vSwitch >=1.10 and my compute nodes were still running 1.4. I somehow managed to miss the errors in the logs (I didn't look far enough back) indicating the failure to properly create the flooding flow on the tunnel bridge:

/var/log/neutron/openvswitch-agent.log.4.gz:Command: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ovs-ofctl', 'mod-flows', 'br-tun', 'hard_timeout=0,idle_timeout=0,priority=1,table=21,dl_vlan=1,actions=strip_vlan,set_tunnel:3,output:4,58,56,11,12,47,13,48,49,44,43,45,46,30,31,29,28,26,27,24,25,32,19,21,59,60,57,6,5,20,18,17,16,15,14,7,9,8,53,10,3,2,38,37,39,40,34,23,36,35,22,42,41,54,52,51,50,55,33']
/var/log/neutron/openvswitch-agent.log.4.gz:Stderr: 'ovs-ofctl: unknown keyword hard_timeout\n'

v1.10 was installed on upgrade, but since it's tied to a kernel module, it requires a reboot of the compute nodes to make it go.

On Fri, Jan 31, 2014 at 3:30 AM, Ruzicka, Marek <[email protected]> wrote:
> Hi Jon,
>
> By any chance, do you have any kind of asymmetric routing in place?
> This is definitely a long shot, since I have no idea about your setup,
> but we have experienced similar issues ourselves.
> In our case it was a problem with asymmetric routing and rather dumb
> Linux defaults when it comes to ARP settings.
>
> Check what your current settings are, and if they differ, try these:
>
> net.ipv4.conf.all.arp_announce=1
> net.ipv4.conf.default.arp_announce=1
> net.ipv4.conf.all.arp_notify=1
> net.ipv4.conf.default.arp_notify=1
> net.ipv4.conf.all.rp_filter=0
> net.ipv4.conf.default.rp_filter=0
>
> Just shooting from the hip here, so sorry if I'm completely wrong.
>
> Marek
>
> -----Original Message-----
> From: Jonathan Proulx [mailto:[email protected]]
> Sent: 30 January 2014 19:11
> To: Robert Collins
> Cc: [email protected]
> Subject: Re: [Openstack] [Neutron] asymetric DHCP brokenness on tenant GRE networks
>
> Still can't quite sort this out but I am circling in on where the
> problem is.
>
> To recap, bootpc and ARP requests from instances using GRE tenant
> networks are not making it onto the physical network; I suspect this is
> "all broadcast traffic". If IP is configured statically and the ARP
> cache is set (by pinging from the other end, network controller in this
> case) I can communicate over the link, until the ARP cache times out...
>
> By fiddling with OVS port mirroring I've been able to determine where
> the packets disappear from my expected path (and verified that packets
> are visible at these points when traffic is passing):
>
> tap (has packets) -> patch-tun (has packets) -> patch-int (still there) -> gre-<N> (no packets) -> eth0 (no packets)
> \__________________________________________/    \_____________________________________________/    (GRE wrapped)
>                    br-int                                           br-tun                         IP of tunnel endpoint
>
> That will probably get mangled by line wrapping, but packets make it to
> the tunnel bridge, br-tun, on the patch-int interface but do not make it
> onto the gre-<N> interface. This is consistent across multiple GRE
> networks including newly created ones. The provider VLAN networks most
> of our instances use function normally (on a much different path), and
> GRE definitely used to work with Grizzly, though I'm not sure if it
> broke on upgrade or since then, as GRE networks are not widely used.
>
> so my basic question remains WTF?
>
> -Jon
>
> _______________________________________________
> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
> Post to     : [email protected]
> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
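
The root cause named at the top of this message (Open vSwitch 1.4 where Havana wants >=1.10) is easy to misjudge by eye, because a plain string comparison puts "1.4" after "1.10". Below is a minimal Python sketch of a version check, assuming the first line of `ovs-vsctl --version` output ends in a dotted version number. The function names and the `MIN_OVS_VERSION` constant are hypothetical, not part of Neutron or Open vSwitch.

```python
import re
import subprocess

# Minimum Open vSwitch version cited in this thread for Havana (assumption:
# taken from the message above, not from any official compatibility table).
MIN_OVS_VERSION = (1, 10)


def parse_ovs_version(text):
    """Return (major, minor) from text such as 'ovs-vsctl (Open vSwitch) 1.4.0'."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no version number found in %r" % text)
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version, minimum=MIN_OVS_VERSION):
    """Compare as integer tuples so that 1.10 sorts after 1.4."""
    return tuple(version) >= tuple(minimum)


def userspace_ovs_version():
    """Ask the installed ovs-vsctl for its version (needs OVS on PATH)."""
    out = subprocess.check_output(["ovs-vsctl", "--version"])
    return parse_ovs_version(out.decode().splitlines()[0])
```

On a compute node, `meets_minimum(userspace_ovs_version())` would report whether the installed tools are new enough. Note that this only looks at the userspace tools; as described above, the kernel module actually loaded can still be the old one until the node is rebooted.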
