All,
First, apologies in advance for a lengthy message.
I have seen some strange VLAN behavior when there is a mismatch on the
VLAN TCI. It can be reproduced on every commit I have tried since version
2.3, but cannot be reproduced in version 2.2.
The example below is from commit
7cc398cb8561a16ae3be5ffc687be5620981d619 in the current master branch.
To reproduce, add a flow rule that matches on a particular TCI. Every
incoming packet with a mismatched VLAN tag then triggers the following
error from the Linux datapath function ovs_nla_get_match():
[449232.750362] openvswitch: netlink: VLAN tag present bit must have an exact match (tci_mask=100).
[449233.767975] openvswitch: netlink: VLAN tag present bit must have an exact match (tci_mask=100).
[449234.765656] openvswitch: netlink: VLAN tag present bit must have an exact match (tci_mask=100).
This is the flow rule to reproduce the problem:
ovs-ofctl --protocols=OpenFlow13 add-flow br0 \
    in_port=2,vlan_tci=0x1e36,priority=20000,actions=pop_vlan,output:1
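To make the match value in the rule above easier to read, here is a minimal Python sketch that splits a 16-bit TCI into its fields. The helper name decode_tci is made up for illustration; the bit layout is the standard 802.1Q one, and treating bit 0x1000 as "tag present" follows the way OVS documents the vlan_tci field.

```python
# Field layout of the 16-bit 802.1Q TCI.
PCP_SHIFT = 13      # priority code point: top 3 bits
CFI_BIT = 0x1000    # CFI/DEI bit; OVS uses it in vlan_tci as "VLAN tag present"
VID_MASK = 0x0FFF   # VLAN ID: low 12 bits


def decode_tci(tci):
    """Split a 16-bit TCI into (pcp, tag_present, vid). Hypothetical helper."""
    return (tci >> PCP_SHIFT) & 0x7, bool(tci & CFI_BIT), tci & VID_MASK


pcp, present, vid = decode_tci(0x1E36)  # value from the flow rule above
print(f"pcp={pcp} present={present} vid={vid}")
```

So vlan_tci=0x1e36 with no mask asks for a tagged packet with VLAN ID 3638 and priority 0.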
I believe that the mismatched packet is upcalled, but when the flow comes
back to the kernel its netlink attributes do not have the mask set
correctly, so the kernel fails to parse it. The flow is never installed in
the kernel, so every subsequent packet also misses.
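The condition the kernel seems to be enforcing can be modeled in a few lines of Python. This is only a sketch inferred from the wording of the error message, not a copy of the datapath code; the function name kernel_accepts_tci_mask is hypothetical, and 0x1000 is assumed to be the "VLAN tag present" bit.

```python
VLAN_TAG_PRESENT = 0x1000  # assumed value of the "tag present" bit in the TCI


def kernel_accepts_tci_mask(tci_mask):
    """Model of the check implied by the error message: the mask must
    cover the tag-present bit, otherwise the flow is rejected."""
    return bool(tci_mask & VLAN_TAG_PRESENT)


print(kernel_accepts_tci_mask(0x0100))  # mask reported as tci_mask=100
print(kernel_accepts_tci_mask(0x1FFF))  # a mask that covers the present bit
```

Under this model the mask 0x0100 from the kernel message is refused because it leaves bit 0x1000 wildcarded, which would explain the "Invalid argument" result on the flow put.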
=====Workaround=============
The problem disappears when a lower-priority flow is added to explicitly
drop all other incoming packets whose VLAN TCIs don't match, as follows:
ovs-ofctl --protocols=OpenFlow13 add-flow br0 \
    in_port=2,vlan_tci=0x1000/0x1000,priority=20,actions=drop
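For readers less familiar with value/mask matches, the sketch below shows how the two rules treat a packet like the one in the debug log (vid=999, pcp=0). The matches helper is hypothetical, and the packet TCI is assumed to be the VLAN ID with the present bit 0x1000 set.

```python
def matches(packet_tci, value, mask=0xFFFF):
    """value/mask match: only bits set in mask are compared.
    A rule written without a mask is treated as an exact match."""
    return (packet_tci & mask) == (value & mask)


PACKET_TCI = 0x1000 | 999  # tagged packet, vid=999, pcp=0 (assumed encoding)

print(matches(PACKET_TCI, 0x1E36))          # priority=20000 rule, exact match
print(matches(PACKET_TCI, 0x1000, 0x1000))  # priority=20 drop rule
print(matches(0x0000, 0x1000, 0x1000))      # untagged packet vs. drop rule
```

The drop rule matches any tagged packet regardless of VLAN ID, and it does so by examining bit 0x1000. That would be consistent with the workaround putting the present bit into the mask sent to the kernel, but this is an inference and not something the log confirms.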
I am pretty sure this is a bug, but I am looking for feedback before I
fix it, because some may argue that the above is correct behavior. I have
my doubts. Also, having every incoming mismatched packet bounced to user
space and back causes performance to drop by two orders of magnitude.
=============log with debug set===============
Mar 9 19:43:47 lubuntu1310 ovs-vswitchd: ovs|07894|poll_loop|DBG|wakeup
due to [POLLIN] on fd 14 (FIFO pipe:[3806013]) at
ofproto/ofproto-dpif.c:1633 (0% CPU usage)
Mar 9 19:43:47 lubuntu1310 ovs-vswitchd:
ovs|09287|poll_loop(handler7)|DBG|wakeup due to [POLLIN] on fd 19
(unknown anon_inode:[eventpoll]) at lib/dpif-netlink.c:2201 (0% CPU usage)
Mar 9 19:43:47 lubuntu1310 ovs-vswitchd:
ovs|09288|dpif(handler7)|DBG|system@ovs-system: miss upcall:
Mar 9 19:43:47 lubuntu1310 ovs-vswitchd:
recirc_id(0),dp_hash(0),skb_priority(0),in_port(3),skb_mark(0),eth(src=26:58:26:16:3c:38,dst=ff:ff:ff:ff:ff:ff),eth_type(0x8100),vlan(vid=999,pcp=0),encap(eth_type(0x0806),arp(sip=192.168.1.3,tip=192.168.1.2,op=1,sha=26:58:26:16:3c:38,tha=00:00:00:00:00:00))
Mar 9 19:43:47 lubuntu1310 ovs-vswitchd:
arp,in_port=0,dl_vlan=999,dl_vlan_pcp=0,dl_src=26:58:26:16:3c:38,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=192.168.1.3,arp_tpa=192.168.1.2,arp_op=1,arp_sha=26:58:26:16:3c:38,arp_tha=00:00:00:00:00:00
Mar 9 19:43:47 lubuntu1310 ovs-vswitchd:
ovs|09289|dpif(handler7)|WARN|system@ovs-system: failed to put[create]
(Invalid argument) ufid:09fbc634c913835be0aed9d5f728f5a1
recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(3),skb_mark(0/0),eth(src=26:58:26:16:3c:38/00:00:00:00:00:00,dst=ff:ff:ff:ff:ff:ff/00:00:00:00:00:00),eth_type(0x8100),vlan(vid=999/0x100,pcp=0/0x0),encap(eth_type(0x0806),arp(sip=192.168.1.3/0.0.0.0,tip=192.168.1.2/0.0.0.0,op=1/0,sha=26:58:26:16:3c:38/00:00:00:00:00:00,tha=00:00:00:00:00:00/00:00:00:00:00:00))
Mar 9 19:43:47 lubuntu1310 ovs-vswitchd:
ovs|09290|poll_loop(handler7)|DBG|wakeup due to 0-ms timeout at
ofproto/ofproto-dpif-upcall.c:622 (0% CPU usage)
Mar 9 19:43:47 lubuntu1310 kernel: [450932.727558] openvswitch:
netlink: VLAN tag present bit must have an exact match (tci_mask=100).
Mar 9 19:43:48 lubuntu1310 ovs-vswitchd:
ovs|15966|poll_loop(revalidator6)|DBG|wakeup due to 499-ms timeout at
ofproto/ofproto-dpif-upcall.c:802 (0% CPU usage)
Mar 9 19:43:48 lubuntu1310 ovs-vswitchd:
ovs|15967|netlink_socket(revalidator6)|DBG|Dropped 21 log messages in
last 1 seconds (most recently, 1 seconds ago) due to excessive rate
Mar 9 19:43:48 lubuntu1310 ovs-vswitchd:
ovs|15968|netlink_socket(revalidator6)|DBG|nl_sock_transact_multiple__
(Success): nl(len:24, type=128(ovs_datapath), flags=9[REQUEST][ECHO],
seq=2a4e, pid=4294962693,genl(cmd=3,version=2)
Mar 9 19:43:48 lubuntu1310 ovs-vswitchd:
ovs|15969|dpif(revalidator6)|DBG|system@ovs-system: get_stats success
Mar 9 19:43:48 lubuntu1310 ovs-vswitchd:
ovs|15970|dpif(revalidator6)|DBG|system@ovs-system: dumped all flows
Mar 9 19:43:48 lubuntu1310 ovs-vswitchd:
ovs|05115|poll_loop(urcu3)|DBG|wakeup due to [POLLIN] on fd 25 (FIFO
pipe:[3806112]) at lib/ovs-rcu.c:273 (0% CPU usage)
Mar 9 19:43:48 lubuntu1310 ovs-vswitchd:
ovs|05116|poll_loop(urcu3)|DBG|wakeup due to [POLLIN] on fd 25 (FIFO
pipe:[3806112]) at lib/ovs-rcu.c:205 (0% CPU usage)
Mar 9 19:43:48 lubuntu1310 ovs-vswitchd:
ovs|05117|poll_loop(urcu3)|DBG|wakeup due to [POLLIN] on fd 25 (FIFO
pipe:[3806112]) at lib/ovs-rcu.c:205 (0% CPU usage)
Mar 9 19:43:48 lubuntu1310 ovs-vswitchd:
ovs|15971|dpif(revalidator6)|DBG|system@ovs-system: flow_dump_destroy
success
Mar 9 19:43:48 lubuntu1310 ovs-vswitchd: ovs|07895|poll_loop|DBG|wakeup
due to [POLLIN] on fd 14 (FIFO pipe:[3806013]) at
ofproto/ofproto-dpif.c:1633 (0% CPU usage)
Thanks in advance,
--Tom
--
Thomas F. Herbert