Turns out my issue is simply that Havana requires Open vSwitch >=1.10
and my compute nodes were still running 1.4.  I somehow managed to
miss the errors in the logs (didn't look far enough back) indicating
the failure to properly create the flooding flow on the tunnel bridge:

/var/log/neutron/openvswitch-agent.log.4.gz:Command: ['sudo',
'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ovs-ofctl',
'mod-flows', 'br-tun',
'hard_timeout=0,idle_timeout=0,priority=1,table=21,dl_vlan=1,actions=strip_vlan,set_tunnel:3,output:4,58,56,11,12,47,13,48,49,44,43,45,46,30,31,29,28,26,27,24,25,32,19,21,59,60,57,6,5,20,18,17,16,15,14,7,9,8,53,10,3,2,38,37,39,40,34,23,36,35,22,42,41,54,52,51,50,55,33']
/var/log/neutron/openvswitch-agent.log.4.gz:Stderr: 'ovs-ofctl:
unknown keyword hard_timeout\n'
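
Since the telling lines were several rotations back, it may help to scan the current and rotated agent logs in one pass. Below is a minimal Python sketch, not a tested tool. It assumes the logs live under /var/log/neutron/ with the openvswitch-agent.log* naming seen above, and that failed commands are logged as lines containing "Stderr:" followed by a non-empty quoted string, as in the excerpt. The function names are hypothetical.

```python
import glob
import gzip


def open_log(path):
    """Open a plain or gzip-compressed log file in text mode."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", errors="replace")
    return open(path, "r", errors="replace")


def stderr_lines(lines):
    """Yield lines that report non-empty stderr from a command.

    Assumes failures are logged as "Stderr: '<text>'", as in the
    excerpt above; an empty "Stderr: ''" is skipped.
    """
    for line in lines:
        _, sep, rest = line.partition("Stderr:")
        if sep and rest.strip() not in ("", "''", '""'):
            yield line.rstrip("\n")


def scan(pattern="/var/log/neutron/openvswitch-agent.log*"):
    """Print every stderr report found in the current and rotated logs."""
    for path in sorted(glob.glob(pattern)):
        with open_log(path) as handle:
            for hit in stderr_lines(handle):
                print(f"{path}: {hit}")


if __name__ == "__main__":
    scan()
```

Run on a compute node, this prints one line per stderr report, prefixed with the file it came from, so older rotations are not missed.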

v1.10 was installed on upgrade, but since it's tied to a kernel module,
it requires a reboot of the compute nodes to take effect.
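
One thing to watch when scripting the version check: compared as plain strings, "1.4" sorts after "1.10", so the comparison has to be numeric. Below is a minimal Python sketch. It assumes the version text comes from something like the first line of `ovs-vsctl --version`, and that it contains a dotted number such as "1.10.0"; the exact output format is an assumption. The function names are hypothetical.

```python
import re


def parse_version(text):
    """Extract the first dotted version number in text as a tuple of ints."""
    match = re.search(r"\b(\d+(?:\.\d+)+)\b", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(version_text, minimum=(1, 10)):
    """Return True if the reported version is at least `minimum`.

    Tuples compare element by element, so (1, 4) < (1, 10),
    unlike the string comparison "1.4" < "1.10".
    """
    return parse_version(version_text) >= minimum
```

This only checks the version that a tool reports. As noted above, the kernel module also has to match, which here took a reboot.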

On Fri, Jan 31, 2014 at 3:30 AM, Ruzicka, Marek
<[email protected]> wrote:
> Hi Jon,
>
> By any chance, do you have any kind of asymmetric routing in place?
> This is definitely a long shot, since I have no idea about your setup, but we 
> have experienced similar issues ourselves.
> In our case it was a problem with asymmetric routing and rather dumb Linux 
> defaults when it comes to ARP settings.
>
> Check what your current settings are, and if they differ, try these:
>
> net.ipv4.conf.all.arp_announce=1
> net.ipv4.conf.default.arp_announce=1
> net.ipv4.conf.all.arp_notify=1
> net.ipv4.conf.default.arp_notify=1
> net.ipv4.conf.all.rp_filter=0
> net.ipv4.conf.default.rp_filter=0
>
> Just shooting from the hip here, so sorry if I'm completely wrong.
>
> Marek
>
> -----Original Message-----
> From: Jonathan Proulx [mailto:[email protected]]
> Sent: 30 January 2014 19:11
> To: Robert Collins
> Cc: [email protected]
> Subject: Re: [Openstack] [Neutron] asymetric DHCP brokenness on tenant GRE 
> networks
>
> Still can't quite sort this out, but I am circling in on where the problem is.
>
> To recap: bootpc and ARP requests from instances using GRE tenant networks are 
> not making it onto the physical network; I suspect this applies to all 
> broadcast traffic.  If IP is configured statically and the ARP cache is set 
> (by pinging from the other end, the network controller in this case), I can 
> communicate over the link until the ARP cache times out...
>
> By fiddling with OVS port mirroring I've been able to determine where the 
> packets disappear from my expected path (and verified that packets are 
> visible at these points when traffic is passing):
>
>
> tap (has packets) -> patch-tun (has packets)          [on br-int]
>   -> patch-int (still there) -> gre-<N> (no packets)  [on br-tun]
>   -> eth0 (no packets)    [IP of tunnel endpoint, GRE wrapped]
>
>
> That will probably get mangled by line wrapping, but packets make it to the 
> tunnel bridge, br-tun, on the patch-int interface but do not make it onto the 
> gre-<N> interface.  This is consistent across multiple GRE networks, including 
> newly created ones.  The provider VLAN networks most of our instances use 
> function normally (on a much different path), and GRE definitely used to work 
> with Grizzly, though I'm not sure whether it broke on upgrade or since then, 
> as these networks are not widely used.
>
> so my basic question remains WTF?
>
> -Jon
>
> _______________________________________________
> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
> Post to     : [email protected]
> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
