All,

First, apologies in advance for a lengthy message.

I have seen some strange VLAN behavior when there is a mismatch on the VLAN TCI. It can be reproduced on every commit I have tried since version 2.3, but it cannot be reproduced in version 2.2.

The example below is from commit 7cc398cb8561a16ae3be5ffc687be5620981d619 in the current master branch.

To reproduce, add a flow rule that matches on a particular TCI. Every incoming packet with a mismatched VLAN tag then triggers the following error from the Linux datapath function ovs_nla_get_match():

[449232.750362] openvswitch: netlink: VLAN tag present bit must have an exact match (tci_mask=100).
[449233.767975] openvswitch: netlink: VLAN tag present bit must have an exact match (tci_mask=100).
[449234.765656] openvswitch: netlink: VLAN tag present bit must have an exact match (tci_mask=100).

This is the flow rule to reproduce the problem:

ovs-ofctl --protocols=OpenFlow13 add-flow br0 in_port=2,vlan_tci=0x1e36,priority=20000,actions=pop_vlan,output:1
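For readers less familiar with the vlan_tci field, here is a minimal sketch of how the value in the rule above breaks down. It assumes the usual 802.1Q layout as OVS exposes it (3 bits of PCP, one bit at 0x1000 that OVS sets when a tag is present, 12 bits of VID). The helper name decode_tci is made up for illustration and is not part of OVS.

```python
# Assumed layout of the 16-bit vlan_tci match field (per the ovs-ofctl docs):
#   bits 15..13  PCP (priority)
#   bit  12      CFI, which OVS sets to 1 when an 802.1Q tag is present
#   bits 11..0   VID
VLAN_PCP_SHIFT = 13
VLAN_CFI = 0x1000
VLAN_VID_MASK = 0x0FFF


def decode_tci(tci):
    """Hypothetical helper: split a 16-bit TCI into (pcp, tag_present, vid)."""
    pcp = (tci >> VLAN_PCP_SHIFT) & 0x7
    tag_present = bool(tci & VLAN_CFI)
    vid = tci & VLAN_VID_MASK
    return pcp, tag_present, vid


if __name__ == "__main__":
    # The rule above matches vlan_tci=0x1e36.
    print(decode_tci(0x1E36))  # (0, True, 3638)
```

So the rule matches only tagged packets with VID 3638 and PCP 0; the test packets in the log further down carry VID 999.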

I believe that the mismatched packet is getting upcalled, but when the resulting flow comes back to the kernel, its netlink attributes do not have the mask set correctly, so the kernel fails to parse it. The flow is never installed in the kernel, so every subsequent packet also misses.
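To make the suspected failure concrete, here is a simplified model of the check implied by the kernel message. It is a sketch based only on the error text and the masks seen in the logs, not a copy of the kernel code. The constant 0x1000 for the tag-present bit and both function names are assumptions for illustration.

```python
# Assumption: the "VLAN tag present" bit is 0x1000, the same bit used by the
# vlan_tci=0x1000/0x1000 match in the workaround.
VLAN_TAG_PRESENT = 0x1000


def mask_is_acceptable(tci_mask):
    """Model of the kernel check: the tag-present bit must be exact-matched."""
    return bool(tci_mask & VLAN_TAG_PRESENT)


def bit_distinguishes(rule_tci, packet_tci, bit):
    """True if rule and packet differ in the given bit, i.e. matching on that
    single bit is enough to tell the packet apart from the rule."""
    return bool((rule_tci ^ packet_tci) & bit)


if __name__ == "__main__":
    rule_tci = 0x1E36            # from the flow rule
    packet_tci = 0x1000 | 999    # tagged packet with vid=999, pcp=0

    # The failed flow put in the log shows vlan(vid=999/0x100,...): only one
    # VID bit is un-wildcarded, which is enough to tell vid 999 from 0xe36.
    print(bit_distinguishes(rule_tci, packet_tci, 0x100))

    # That mask lacks the tag-present bit, so the modeled check rejects it.
    print(mask_is_acceptable(0x0100))

    # A mask that also covered the tag-present bit would pass.
    print(mask_is_acceptable(0x1100))
```

If this model is right, the mask sent down with the flow is minimal for distinguishing the packet from the rule, but it never includes the one bit the kernel insists on.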

=====Workaround=============
The problem can be made to disappear by adding a lower-priority flow that explicitly drops all other incoming tagged packets whose VLAN TCIs don't match, as follows:

ovs-ofctl --protocols=OpenFlow13 add-flow br0 in_port=2,vlan_tci=0x1000/0x1000,priority=20,actions=drop
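As a rough illustration of how the two rules divide up traffic, here is a toy model of value/mask matching on vlan_tci with priorities. It ignores in_port and every other field, and the RULES table and lookup function are invented for this sketch; the real OVS classifier is far more involved.

```python
# Toy rule table: (priority, tci_value, tci_mask, actions).
# A match without an explicit mask is treated as exact (mask 0xffff).
RULES = [
    (20000, 0x1E36, 0xFFFF, "pop_vlan,output:1"),
    (20, 0x1000, 0x1000, "drop"),
]


def matches(tci, value, mask):
    """Standard value/mask comparison on the 16-bit TCI."""
    return (tci & mask) == (value & mask)


def lookup(tci):
    """Return the actions of the highest-priority matching rule, or None."""
    for _prio, value, mask, actions in sorted(RULES, key=lambda r: -r[0]):
        if matches(tci, value, mask):
            return actions
    return None


if __name__ == "__main__":
    print(lookup(0x1E36))          # tagged, vid 3638: hits the specific rule
    print(lookup(0x1000 | 999))    # tagged, vid 999: hits the drop rule
    print(lookup(0x0000))          # untagged: matches neither rule
```

With the drop rule in place, every tagged packet matches a rule whose mask includes bit 0x1000, which may be why the mask pushed to the kernel then passes the check.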

I am pretty sure this is a bug, but I would like feedback before I fix it: some may argue that the above is correct behavior, though I have my doubts. Also, having every mismatched incoming packet bounce to user space and back causes performance to drop by two orders of magnitude.

=============log with debug set===============

Mar 9 19:43:47 lubuntu1310 ovs-vswitchd: ovs|07894|poll_loop|DBG|wakeup due to [POLLIN] on fd 14 (FIFO pipe:[3806013]) at ofproto/ofproto-dpif.c:1633 (0% CPU usage)
Mar 9 19:43:47 lubuntu1310 ovs-vswitchd: ovs|09287|poll_loop(handler7)|DBG|wakeup due to [POLLIN] on fd 19 (unknown anon_inode:[eventpoll]) at lib/dpif-netlink.c:2201 (0% CPU usage)
Mar 9 19:43:47 lubuntu1310 ovs-vswitchd: ovs|09288|dpif(handler7)|DBG|system@ovs-system: miss upcall:
Mar 9 19:43:47 lubuntu1310 ovs-vswitchd: recirc_id(0),dp_hash(0),skb_priority(0),in_port(3),skb_mark(0),eth(src=26:58:26:16:3c:38,dst=ff:ff:ff:ff:ff:ff),eth_type(0x8100),vlan(vid=999,pcp=0),encap(eth_type(0x0806),arp(sip=192.168.1.3,tip=192.168.1.2,op=1,sha=26:58:26:16:3c:38,tha=00:00:00:00:00:00))
Mar 9 19:43:47 lubuntu1310 ovs-vswitchd: arp,in_port=0,dl_vlan=999,dl_vlan_pcp=0,dl_src=26:58:26:16:3c:38,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=192.168.1.3,arp_tpa=192.168.1.2,arp_op=1,arp_sha=26:58:26:16:3c:38,arp_tha=00:00:00:00:00:00
Mar 9 19:43:47 lubuntu1310 ovs-vswitchd: ovs|09289|dpif(handler7)|WARN|system@ovs-system: failed to put[create] (Invalid argument) ufid:09fbc634c913835be0aed9d5f728f5a1 recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(3),skb_mark(0/0),eth(src=26:58:26:16:3c:38/00:00:00:00:00:00,dst=ff:ff:ff:ff:ff:ff/00:00:00:00:00:00),eth_type(0x8100),vlan(vid=999/0x100,pcp=0/0x0),encap(eth_type(0x0806),arp(sip=192.168.1.3/0.0.0.0,tip=192.168.1.2/0.0.0.0,op=1/0,sha=26:58:26:16:3c:38/00:00:00:00:00:00,tha=00:00:00:00:00:00/00:00:00:00:00:00))
Mar 9 19:43:47 lubuntu1310 ovs-vswitchd: ovs|09290|poll_loop(handler7)|DBG|wakeup due to 0-ms timeout at ofproto/ofproto-dpif-upcall.c:622 (0% CPU usage)
Mar 9 19:43:47 lubuntu1310 kernel: [450932.727558] openvswitch: netlink: VLAN tag present bit must have an exact match (tci_mask=100).
Mar 9 19:43:48 lubuntu1310 ovs-vswitchd: ovs|15966|poll_loop(revalidator6)|DBG|wakeup due to 499-ms timeout at ofproto/ofproto-dpif-upcall.c:802 (0% CPU usage)
Mar 9 19:43:48 lubuntu1310 ovs-vswitchd: ovs|15967|netlink_socket(revalidator6)|DBG|Dropped 21 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
Mar 9 19:43:48 lubuntu1310 ovs-vswitchd: ovs|15968|netlink_socket(revalidator6)|DBG|nl_sock_transact_multiple__ (Success): nl(len:24, type=128(ovs_datapath), flags=9[REQUEST][ECHO], seq=2a4e, pid=4294962693,genl(cmd=3,version=2)
Mar 9 19:43:48 lubuntu1310 ovs-vswitchd: ovs|15969|dpif(revalidator6)|DBG|system@ovs-system: get_stats success
Mar 9 19:43:48 lubuntu1310 ovs-vswitchd: ovs|15970|dpif(revalidator6)|DBG|system@ovs-system: dumped all flows
Mar 9 19:43:48 lubuntu1310 ovs-vswitchd: ovs|05115|poll_loop(urcu3)|DBG|wakeup due to [POLLIN] on fd 25 (FIFO pipe:[3806112]) at lib/ovs-rcu.c:273 (0% CPU usage)
Mar 9 19:43:48 lubuntu1310 ovs-vswitchd: ovs|05116|poll_loop(urcu3)|DBG|wakeup due to [POLLIN] on fd 25 (FIFO pipe:[3806112]) at lib/ovs-rcu.c:205 (0% CPU usage)
Mar 9 19:43:48 lubuntu1310 ovs-vswitchd: ovs|05117|poll_loop(urcu3)|DBG|wakeup due to [POLLIN] on fd 25 (FIFO pipe:[3806112]) at lib/ovs-rcu.c:205 (0% CPU usage)
Mar 9 19:43:48 lubuntu1310 ovs-vswitchd: ovs|15971|dpif(revalidator6)|DBG|system@ovs-system: flow_dump_destroy success
Mar 9 19:43:48 lubuntu1310 ovs-vswitchd: ovs|07895|poll_loop|DBG|wakeup due to [POLLIN] on fd 14 (FIFO pipe:[3806013]) at ofproto/ofproto-dpif.c:1633 (0% CPU usage)

Thanks in advance,

--Tom



--
Thomas F. Herbert
