While it's obviously not supported or fully in mainline, we've been using the latest conntrack development code and have been seeing a number of crashes with a very similar failure mode; the stack trace is below. We're curious to know whether this issue is fixed in master - it doesn't really seem to be a conntrack-specific issue, but we'd like to get past this and continue to use and test the conntrack code.
I can reproduce this two ways. The first is to remove a Linux bond interface that is in a bridge via "ip link del bond0" and then remove the bridge via "ovs-vsctl del-br <br>". The backtrace below is from this case, and I can reproduce it pretty quickly by doing this in a loop (delete the bond, delete the bridge, wait, create the bridge, create and add the bond and another interface, repeat).

We weren't really trying to be that forceful, and I figured that if the bridge remained up and we just removed the bond from the bridge cleanly we might avoid the problem. But simply doing "ovs-vsctl del-port <br> bond0" followed by "ip link del bond0" also causes a similar crash, though with much lower frequency. (I will attach that backtrace once I get it again, if useful.)

Before the crashes, we always see messages like these in ovs-vswitchd.log:

2015-11-12T01:29:03.945Z|01200|netdev_linux|WARN|Dropped 52 log messages in last 74 seconds (most recently, 73 seconds ago) due to excessive rate
2015-11-12T01:29:03.945Z|01201|netdev_linux|WARN|dock0: removing policing failed: Operation not supported
2015-11-12T01:29:03.945Z|01202|netdev_linux|WARN|br0: removing policing failed: Operation not supported
2015-11-12T01:29:03.945Z|01203|netdev_linux|WARN|vi_l3_1: removing policing failed: Operation not supported
2015-11-12T01:29:03.946Z|01204|netdev_linux|WARN|lan0: removing policing failed: Operation not supported
2015-11-12T01:29:03.959Z|01205|bridge|WARN|could not open network device bond0 (No such device)
2015-11-12T01:29:03.960Z|01206|netdev_linux|WARN|dock0: removing policing failed: Operation not supported
2015-11-12T01:29:03.960Z|01207|netdev_linux|WARN|br0: removing policing failed: Operation not supported
2015-11-12T01:29:04.435Z|01208|dpif|WARN|system@ovs-system: failed to flow_del (No such file or directory) ufid:734136a4-5748-4691-a52c-cd5bfad4da68 recirc_id(0),dp_hash(0),skb_priority(0),in_port(2),skb_mark(0),ct_state(0),ct_zone(0),ct_mark(0),ct_label(0),eth(src=04:00:00:00:00:02,dst=04:00:00:00:00:fe),eth_type(0x0800),ipv4(src=192.168.27.2,dst=10.131.0.87,proto=6,tos=0,ttl=64,frag=no),tcp(src=51582,dst=443),tcp_flags(psh|ack)
2015-11-12T01:29:04.435Z|01209|util|EMER|lib/cmap.c:846: assertion ok failed in cmap_replace()

We are currently using the conntrack branch at this commit: https://github.com/justinpettit/ovs/commit/86e6bfcb999ed134aa12bf947cea4da1426af2c2 . We know this is about a month old, but we updated, didn't see an improvement, and ran into a few other issues.

Here is the backtrace from repro method #1:

Reading symbols from /tmp/ovs-sym/usr/lib/debug/usr/sbin/ovs-vswitchd...done.
(gdb) back
#0  0x000000f37a6959dc in raise () from /lib/mips64el-linux-gnuabi64/libc.so.6
#1  0x000000f37a697470 in abort () from /lib/mips64el-linux-gnuabi64/libc.so.6
#2  0x000000012014cf18 in ovs_abort_valist (err_no=<optimized out>, format=<optimized out>, args=<optimized out>) at lib/util.c:323
#3  0x0000000120156bac in vlog_abort_valist (module_=<optimized out>, message=0x1201d3fe0 "%s: assertion %s failed in %s()", args=0xf57c5ad230) at lib/vlog.c:1129
#4  0x0000000120156bf4 in vlog_abort (module=<optimized out>, message=<optimized out>) at lib/vlog.c:1143
#5  0x000000012014cb7c in ovs_assert_failure (where=<optimized out>, function=<optimized out>, condition=<optimized out>) at lib/util.c:72
#6  0x00000001200919c8 in cmap_replace (cmap=<optimized out>, old_node=<optimized out>, new_node=0x0, hash=<optimized out>) at lib/cmap.c:846
#7  0x0000000120066b88 in cmap_remove (hash=<optimized out>, node=0xf2ec00af30, cmap=0x1209031a8) at ./lib/cmap.h:265
#8  ukey_delete (ukey=0xf2ec00af30, umap=0x120903178) at ofproto/ofproto-dpif-upcall.c:1729
#9  push_ukey_ops (udpif=udpif@entry=0x12092ea40, umap=umap@entry=0x120903178, ops=ops@entry=0xf57c5ad2e0, n_ops=n_ops@entry=1) at ofproto/ofproto-dpif-upcall.c:2046
#10 0x0000000120067f18 in revalidator_sweep__ (revalidator=<optimized out>, purge=purge@entry=true) at ofproto/ofproto-dpif-upcall.c:2257
#11 0x0000000120068128 in revalidator_purge (revalidator=<optimized out>) at ofproto/ofproto-dpif-upcall.c:2274
#12 udpif_stop_threads (udpif=udpif@entry=0x12092ea40) at ofproto/ofproto-dpif-upcall.c:461
#13 0x0000000120068d50 in udpif_stop_threads (udpif=0x12092ea40) at ofproto/ofproto-dpif-upcall.c:590
#14 udpif_synchronize (udpif=0x12092ea40) at ofproto/ofproto-dpif-upcall.c:587
#15 0x0000000120054458 in destruct (ofproto_=0x1209ad410) at ofproto/ofproto-dpif.c:1451
#16 0x00000001200473f0 in ofproto_destroy (p=0x1209ad410) at ofproto/ofproto.c:1609
#17 0x000000012002c92c in bridge_destroy (br=br@entry=0x120962d30) at vswitchd/bridge.c:3207
#18 0x000000012002ce88 in add_del_bridges (cfg=0x12092e460, cfg=0x12092e460) at vswitchd/bridge.c:1713
#19 0x000000012002e404 in bridge_reconfigure (ovs_cfg=ovs_cfg@entry=0x12092e460) at vswitchd/bridge.c:597
#20 0x0000000120032c3c in bridge_run () at vswitchd/bridge.c:2973
#21 0x0000000120025a28 in main (argc=11, argv=0xf57c5af888) at vswitchd/ovs-vswitchd.c:120

Again, if it helps, I can get a backtrace from repro method #2 as well. Does anyone else see anything similar? Or does anyone know of a fix in master that may be relevant to this? Thanks in advance for any time spent or help.
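In case it saves anyone time, here is a rough sketch of the method #1 loop as a script. The bond0/br0/lan0 names are just the ones from our setup and logs; adjust for yours. It defaults to dry-run (only printing the commands), since running it for real needs root and will eventually crash ovs-vswitchd:

```shell
#!/bin/sh
# Sketch of repro method #1: delete the bond while it is still in the bridge,
# delete the bridge, wait, recreate both, repeat.
# Interface/bridge names (bond0, br0, lan0) are assumptions from our setup.
DRY_RUN="${DRY_RUN:-1}"   # 1 = just print commands; set to 0 (as root) to really run
ITERS="${ITERS:-50}"      # number of delete/recreate cycles
DELAY="${DELAY:-1}"       # seconds to wait between delete and recreate

run() {
    if [ "$DRY_RUN" = "1" ]; then echo "$@"; else "$@"; fi
}

repro_loop() {
    i=0
    while [ "$i" -lt "$ITERS" ]; do
        run ip link del bond0             # remove the bond while it is in the bridge
        run ovs-vsctl del-br br0          # then delete the bridge itself
        [ "$DRY_RUN" = "1" ] || sleep "$DELAY"
        run ovs-vsctl add-br br0          # recreate the bridge
        run ip link add bond0 type bond   # recreate the bond
        run ovs-vsctl add-port br0 bond0  # add the bond and another interface back
        run ovs-vsctl add-port br0 lan0
        i=$((i + 1))
    done
}

# Repro method #2 (lower frequency) is just:
#   ovs-vsctl del-port br0 bond0
#   ip link del bond0
# with the bridge left up.

repro_loop
```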
_______________________________________________
discuss mailing list
[email protected]
http://openvswitch.org/mailman/listinfo/discuss
