Reviewed:  https://review.openstack.org/368553
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=4361f7543f984cf5f09c0c7070ac6b0f22f3b6b1
Submitter: Jenkins
Branch:    master

commit 4361f7543f984cf5f09c0c7070ac6b0f22f3b6b1
Author: IWAMOTO Toshihiro <iwam...@valinux.co.jp>
Date:   Mon Sep 12 14:36:18 2016 +0900

    of_interface: Use vlan_tci instead of vlan_vid

    To pop VLAN tags in learn action generated flows, vlan_tci should
    be used instead of vlan_vid.  Otherwise, VLAN tags with VID=0 are
    left.

    Change-Id: Ie38ab860424f6e2e2448abac82c428dae3a8a544
    Closes-bug: #1622017

** Changed in: neutron
       Status: In Progress => Fix Released

https://bugs.launchpad.net/bugs/1622017

Title:
  OVS agent is not removing VLAN tags before tunnels when configured with native OF interface

Status in neutron:
  Fix Released

Bug description:
  In investigating an MTU issue, an unaccounted-for overhead of 4 bytes was discovered. A spurious 802.1q header was discovered using tcpdump when attempting to connect to a guest via floating IP. The tenant network type is VXLAN and the VXLAN endpoints themselves are on a VLAN. This issue effectively breaks communication with guests via floating IP for some system configurations.

  The test system is configured with a default global_physnet_mtu of 1500, and inspection of the router namespace confirms that the tenant network's router interface has been automatically configured with an MTU of 1450.

  Ping was used to test, e.g. ping -M do -s 1422 192.0.2.58 (1422 is the maximum that should fit in the 1450 MTU without fragmentation). With the system configured as described, "ping -s 1420 <floating ip>" fails.

  tcpdump on the controller reveals:

    [root@overcloud-controller-0 heat-admin]# tcpdump -vvv -e -i any icmp
    tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
    18:32:49.163223 P 52:54:00:01:09:3c (oui Unknown) ethertype IPv4 (0x0800), length 1464: (tos 0x0, ttl 64, id 37535, offset 0, flags [DF], proto ICMP (1), length 1448)
        192.0.2.1 > 192.0.2.58: ICMP echo request, id 16083, seq 1, length 1428
    18:32:49.163340 In 00:00:00:00:00:00 (oui Ethernet) ethertype IPv4 (0x0800), length 592: (tos 0xc0, ttl 64, id 4395, offset 0, flags [none], proto ICMP (1), length 576)
        overcloud-controller-0.tenant.localdomain > overcloud-controller-0.tenant.localdomain: ICMP overcloud-novacompute-0.tenant.localdomain unreachable - need to frag (mtu 1500), length 556
        (tos 0x0, ttl 64, id 22077, offset 0, flags [DF], proto UDP (17), length 1502)
        overcloud-controller-0.tenant.localdomain.51706 > overcloud-novacompute-0.tenant.localdomain.4789: [no cksum] VXLAN, flags [I] (0x08), vni 36

  Adjusting the ping size to allow for a 4 byte header (e.g. ping -s 1418 <floating ip>) succeeds.

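  The packet sizes in the captures can be cross-checked with a little arithmetic. The following is a minimal sketch, assuming standard IPv4, ICMP, UDP, VXLAN, Ethernet and 802.1Q header sizes with no IP options; the function and constant names are illustrative only and are not taken from neutron or Open vSwitch.

```python
# Header sizes in bytes (assumed: IPv4 without options, plain VXLAN over UDP).
IPV4_HEADER = 20
ICMP_HEADER = 8
UDP_HEADER = 8
VXLAN_HEADER = 8
ETHERNET_HEADER = 14
DOT1Q_TAG = 4


def inner_ip_length(ping_payload):
    """IP length of the ICMP echo request produced by `ping -s <ping_payload>`."""
    return ping_payload + ICMP_HEADER + IPV4_HEADER


def outer_ip_length(ping_payload, inner_vlan_tag=False):
    """IP length of the VXLAN-encapsulated packet on the underlay network."""
    inner_frame = inner_ip_length(ping_payload) + ETHERNET_HEADER
    if inner_vlan_tag:
        # The spurious 802.1Q header adds 4 bytes to the encapsulated frame.
        inner_frame += DOT1Q_TAG
    return inner_frame + VXLAN_HEADER + UDP_HEADER + IPV4_HEADER


if __name__ == "__main__":
    for payload in (1422, 1420, 1418):
        print(payload,
              outer_ip_length(payload, inner_vlan_tag=False),
              outer_ip_length(payload, inner_vlan_tag=True))
```

  Under these assumptions, a 1420-byte ping with the stray tag gives an outer IP length of 1502, matching the "length 1502" reported in the "need to frag (mtu 1500)" message, while a 1418-byte ping gives exactly 1500. Without the tag, 1422 is the largest payload that still fits in 1500.
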
  Using an alternate tcpdump command to get information from the VXLAN traffic reveals an unusual extra 802.1q header with a VLAN ID of 0:

    [root@overcloud-controller-0 heat-admin]# tcpdump -vvv -n -e -i any udp
    tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
    18:36:48.095985 Out 56:13:19:d8:af:27 ethertype IPv4 (0x0800), length 1516: (tos 0x0, ttl 64, id 22088, offset 0, flags [DF], proto UDP (17), length 1500)
        172.16.0.5.51706 > 172.16.0.10.4789: [no cksum] VXLAN, flags [I] (0x08), vni 36
    fa:16:3e:99:37:ce > fa:16:3e:06:65:6f, ethertype 802.1Q (0x8100), length 1464: vlan 0, p 0, ethertype IPv4, (tos 0x0, ttl 63, id 37541, offset 0, flags [DF], proto ICMP (1), length 1446)
        192.0.2.1 > 192.168.2.101: ICMP echo request, id 16422, seq 1, length 1426
    18:36:48.097861 P ea:0c:37:f7:69:5e ethertype 802.1Q (0x8100), length 1520: vlan 50, p 0, ethertype IPv4, (tos 0x0, ttl 64, id 22354, offset 0, flags [DF], proto UDP (17), length 1500)
        172.16.0.10.50337 > 172.16.0.5.4789: [no cksum] VXLAN, flags [I] (0x08), vni 36

  The flow table is similar to the following (taken from the compute node, not the controller, but the br-tun flow tables follow the same form, differing only in the values for local segment IDs):

    [root@overcloud-novacompute-0 ml2]# ovs-ofctl -O OpenFlow13 dump-flows br-tun
    OFPST_FLOW reply (OF1.3) (xid=0x2):
     cookie=0xb13175655506ca2e, duration=11.785s, table=0, n_packets=0, n_bytes=0, priority=1,in_port=1 actions=goto_table:2
     cookie=0xb13175655506ca2e, duration=10.955s, table=0, n_packets=0, n_bytes=0, priority=1,in_port=2 actions=goto_table:4
     cookie=0xb13175655506ca2e, duration=11.783s, table=0, n_packets=0, n_bytes=0, priority=0 actions=drop
     cookie=0xb13175655506ca2e, duration=11.781s, table=2, n_packets=0, n_bytes=0, priority=0,dl_dst=00:00:00:00:00:00/01:00:00:00:00:00 actions=goto_table:20
     cookie=0xb13175655506ca2e, duration=11.779s, table=2, n_packets=0, n_bytes=0, priority=0,dl_dst=01:00:00:00:00:00/01:00:00:00:00:00 actions=goto_table:22
     cookie=0xb13175655506ca2e, duration=11.778s, table=3, n_packets=0, n_bytes=0, priority=0 actions=drop
     cookie=0xb13175655506ca2e, duration=10.677s, table=4, n_packets=0, n_bytes=0, priority=1,tun_id=0x24 actions=push_vlan:0x8100,set_field:4097->vlan_vid,goto_table:10
     cookie=0xb13175655506ca2e, duration=11.777s, table=4, n_packets=0, n_bytes=0, priority=0 actions=drop
     cookie=0xb13175655506ca2e, duration=11.776s, table=6, n_packets=0, n_bytes=0, priority=0 actions=drop
     cookie=0xb13175655506ca2e, duration=11.774s, table=10, n_packets=0, n_bytes=0, priority=1 actions=learn(table=20,hard_timeout=300,priority=1,cookie=0xb13175655506ca2e,OXM_OF_VLAN_VID[],NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:0->OXM_OF_VLAN_VID[],load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[],output:OXM_OF_IN_PORT[]),output:1
     cookie=0xb13175655506ca2e, duration=11.772s, table=20, n_packets=0, n_bytes=0, priority=0 actions=goto_table:22
     cookie=0xb13175655506ca2e, duration=10.680s, table=22, n_packets=0, n_bytes=0, priority=1,dl_vlan=1 actions=pop_vlan,set_field:0x24->tun_id,output:2
     cookie=0xb13175655506ca2e, duration=11.771s, table=22, n_packets=0, n_bytes=0, priority=0 actions=drop

  On a hunch, the same trials were performed with the openvswitch agents on the controller and compute nodes configured to use the ovs-ofctl OF interface. With that configuration, ping -s 1422 192.0.2.58 works, as do ssh to the guests and copies of large amounts of data.

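  The commit message's distinction between vlan_vid and vlan_tci can be illustrated with a simplified bit-level model. This is only a sketch, not Open vSwitch's implementation: it assumes that the 16-bit TCI keeps the VLAN ID in its low 12 bits and uses bit 12 (0x1000) as the "tag present" marker, and that loading 0 into the VID field overwrites only those 12 bits. The helper names are hypothetical.

```python
VLAN_VID_MASK = 0x0FFF  # low 12 bits of the TCI: the VLAN ID
VLAN_PRESENT = 0x1000   # bit 12: marks an 802.1Q tag as present (OFPVID_PRESENT / CFI)


def load_zero_into_vid(tci):
    """Model of load:0->OXM_OF_VLAN_VID[]: only the 12 VID bits are cleared (assumption)."""
    return tci & ~VLAN_VID_MASK & 0xFFFF


def load_zero_into_tci(tci):
    """Model of load:0->NXM_OF_VLAN_TCI[]: the whole TCI, including the present bit, is cleared."""
    return 0


def describe(tci):
    """Render a TCI value roughly the way tcpdump reports the frame."""
    if tci & VLAN_PRESENT:
        return "vlan %d" % (tci & VLAN_VID_MASK)
    return "untagged"


if __name__ == "__main__":
    # set_field:4097->vlan_vid in table 4 is the present bit plus local VLAN 1.
    tagged = VLAN_PRESENT | 1
    print(tagged, describe(tagged))
    print(describe(load_zero_into_vid(tagged)))
    print(describe(load_zero_into_tci(tagged)))
```

  In this model, clearing only the VID leaves a tag with VLAN ID 0 in place, which is consistent with the "vlan 0, p 0" header seen in the capture, whereas clearing the whole TCI leaves the frame untagged.
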
  The same tcpdump command shows that the extra 802.1q information is not present:

    # with ofctl instead of native
    [root@overcloud-controller-0 ml2]# tcpdump -vvv -n -e -i any udp
    tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
    19:10:31.570425 Out 56:13:19:d8:af:27 ethertype IPv4 (0x0800), length 1512: (tos 0x0, ttl 64, id 22104, offset 0, flags [DF], proto UDP (17), length 1496)
        172.16.0.5.51706 > 172.16.0.10.4789: [no cksum] VXLAN, flags [I] (0x08), vni 36
    fa:16:3e:99:37:ce > fa:16:3e:06:65:6f, ethertype IPv4 (0x0800), length 1460: (tos 0x0, ttl 63, id 37549, offset 0, flags [DF], proto ICMP (1), length 1446)
        192.0.2.1 > 192.168.2.101: ICMP echo request, id 19062, seq 1, length 1426
    19:10:31.572143 P ea:0c:37:f7:69:5e ethertype 802.1Q (0x8100), length 1520: vlan 50, p 0, ethertype IPv4, (tos 0x0, ttl 64, id 22370, offset 0, flags [DF], proto UDP (17), length 1500)
        172.16.0.10.50337 > 172.16.0.5.4789: [no cksum] VXLAN, flags [I] (0x08), vni 36

  The flow table is also different, using strip_vlan instead of pop_vlan (as well as other obvious differences):

    [root@overcloud-novacompute-0 ml2]# ovs-ofctl dump-flows br-tun
    NXST_FLOW reply (xid=0x4):
     cookie=0xb4814c0ff5ea6fd4, duration=2095.101s, table=0, n_packets=115156, n_bytes=8744100, idle_age=546, priority=1,in_port=1 actions=resubmit(,2)
     cookie=0xb4814c0ff5ea6fd4, duration=2094.475s, table=0, n_packets=346419, n_bytes=274503223, idle_age=546, priority=1,in_port=2 actions=resubmit(,4)
     cookie=0xb4814c0ff5ea6fd4, duration=2095.100s, table=0, n_packets=0, n_bytes=0, idle_age=2095, priority=0 actions=drop
     cookie=0xb4814c0ff5ea6fd4, duration=2095.099s, table=2, n_packets=115155, n_bytes=8744058, idle_age=546, priority=0,dl_dst=00:00:00:00:00:00/01:00:00:00:00:00 actions=resubmit(,20)
     cookie=0xb4814c0ff5ea6fd4, duration=2095.099s, table=2, n_packets=1, n_bytes=42, idle_age=1263, priority=0,dl_dst=01:00:00:00:00:00/01:00:00:00:00:00 actions=resubmit(,22)
     cookie=0xb4814c0ff5ea6fd4, duration=2095.098s, table=3, n_packets=0, n_bytes=0, idle_age=2095, priority=0 actions=drop
     cookie=0xb4814c0ff5ea6fd4, duration=2094.227s, table=4, n_packets=346419, n_bytes=274503223, idle_age=546, priority=1,tun_id=0x24 actions=mod_vlan_vid:1,resubmit(,10)
     cookie=0xb4814c0ff5ea6fd4, duration=2095.097s, table=4, n_packets=0, n_bytes=0, idle_age=2095, priority=0 actions=drop
     cookie=0xb4814c0ff5ea6fd4, duration=2095.097s, table=6, n_packets=0, n_bytes=0, idle_age=2095, priority=0 actions=drop
     cookie=0xb4814c0ff5ea6fd4, duration=2095.096s, table=10, n_packets=346419, n_bytes=274503223, idle_age=546, priority=1 actions=learn(table=20,hard_timeout=300,priority=1,cookie=0xb4814c0ff5ea6fd4,NXM_OF_VLAN_TCI[0..11],NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:0->NXM_OF_VLAN_TCI[],load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[],output:NXM_OF_IN_PORT[]),output:1
     cookie=0xb4814c0ff5ea6fd4, duration=2095.096s, table=20, n_packets=0, n_bytes=0, idle_age=2095, priority=0 actions=resubmit(,22)
     cookie=0xb4814c0ff5ea6fd4, duration=2094.235s, table=22, n_packets=1, n_bytes=42, idle_age=1263, dl_vlan=1 actions=strip_vlan,set_tunnel:0x24,output:2
     cookie=0xb4814c0ff5ea6fd4, duration=2095.086s, table=22, n_packets=0, n_bytes=0, idle_age=2095, priority=0 actions=drop

  System details follow:

  System info: CentOS Linux release 7.2.1511 (Core)
  Kernel version: 3.10.0-327.28.3.el7.x86_6

  The system is a tripleo deployment using a network isolation type network environment (see docs for details).

  Deployment command line:

    openstack overcloud deploy --templates ./tripleo-heat-templates -e ~/tripleo-heat-templates/environments/network-isolation.yaml -e ~/tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml -e ~/for_net_isolation.yaml

  All templates are "stock" except for the last, which contains:

    parameter_defaults:
      EC2MetadataIp: 192.0.2.1
      ControlPlaneDefaultRoute: 192.0.2.1

  OpenStack packages:

    openvswitch.x86_64                     2.5.0-2.el7                                   @delorean-newton-testing
    openstack-neutron-openvswitch.noarch   1:9.0.0-0.20160907193737.dc6508a.el7.centos   @delorean

    [root@overcloud-controller-0 ~]# ovs-vsctl --version
    ovs-vsctl (Open vSwitch) 2.5.0
    Compiled Mar 18 2016 15:00:11
    DB Schema 7.12.1

    [root@overcloud-controller-0 ~]# ovs-ofctl --version
    ovs-ofctl (Open vSwitch) 2.5.0
    Compiled Mar 18 2016 15:00:11
    OpenFlow versions 0x1:0x4

    python-ryu-common.noarch               4.3-2.el7                                     @delorean-newton-testing
    python2-ryu.noarch                     4.3-2.el7                                     @delorean-newton-testing