"dev" <dev-boun...@openvswitch.org> wrote on 07/18/2016 11:29:28 AM:
>
> > From: "Lance Richardson" <lrich...@redhat.com>
> > To: dev@openvswitch.org
> > Sent: Wednesday, July 6, 2016 7:39:52 PM
> > Subject: [ovs-dev] [PATCH] netdev-dummy: fix crash with more than one passive connection
> >
> > Investigation found that some of the occasional failures in the
> > "ovn -- vtep: 3 HVs, 1 VIFs/HV, 1 GW, 1 LS" test case are caused
> > by ovs-vswitchd crashing with SIGSEGV. It turns out that the
> > crash occurs when the number of netdev-dummy passive connections
> > transitions from 1 to 2. When xrealloc() copies the array of
> > dummy_packet_stream structures from the original buffer to a
> > newly allocated one, the struct ovs_list txq member of the structure
> > becomes corrupt (e.g. if ovs_list_is_empty() would have returned
> > true before the copy, it will return false after the copy, which
> > will lead to a crash when the bogus packet buffer on the list is
> > dereferenced).
> >
> > Fix by taking a hint from David Wheeler and adding a level of
> > indirection.
> >
> > Signed-off-by: Lance Richardson <lrich...@redhat.com>
[snip]
>
> Here is a small script that reliably reproduces the crash in ovs-vswitchd.
> I don't have an explanation for why we have two connections to the same
> port in the "ovn -- vtep: 3 HVs, 1 VIFs/HV, 1 GW, 1 LS" test case (this
> happens infrequently); perhaps it's something like the active connect from
> a peer ovs-vswitchd being interrupted and re-tried.
>
> #!/bin/bash
>
> set -ex
>
> PWD=$(pwd)
> export PATH=${PWD}/vswitchd:${PWD}/utilities:${PWD}/ovsdb:$PATH
>
> ovs_setenv() {
>     export OVS_RUNDIR=${PWD}/$1
>     export OVS_LOGDIR=${PWD}/$1
>     export OVS_DBDIR=${PWD}/$1
>     export OVS_SYSCONFDIR=${PWD}/$1
>     export OVS_PKGDATADIR=${PWD}/$1
> }
>
> for x in repro_main repro_hv1 repro_hv2; do
>     mkdir -p $x
>     rm -f $x/*
>     ovs_setenv $x
>     : > $OVS_RUNDIR/.conf.db.~lock~
>     ovsdb-tool create $OVS_RUNDIR/conf.db vswitchd/vswitch.ovsschema
>     ovsdb-server --remote=punix:$OVS_RUNDIR/db.sock -vconsole:off \
>         --detach --no-chdir --pidfile --log-file
>     ovs-vsctl --no-wait -- init
>     ovs-vswitchd --enable-dummy=system -vvconn -vofproto_dpif -vunixctl \
>         -vconsole:off --detach --no-chdir --pidfile --log-file
> done
>
> ovs_setenv repro_main
> ovs-vsctl add-br foo \
>     -- add-port foo p1 \
>     -- set Interface p1 options:pstream="punix:$PWD/repro_main/p1.sock"
>
> ovs_setenv repro_hv1
> ovs-vsctl add-br br1 \
>     -- add-port br1 p1 \
>     -- set Interface p1 options:stream="unix:$PWD/repro_main/p1.sock"
>
> ovs_setenv repro_hv2
> ovs-vsctl add-br br1 \
>     -- add-port br1 p1 \
>     -- set Interface p1 options:stream="unix:$PWD/repro_main/p1.sock"

I like this, but I think I'm going to be consistent and ask if you can
spin a new version with the above script as a unit test, so that we can
see the crash before applying the rest of the patch and then verify that
it doesn't happen with the patch.

Ryan
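
For anyone following the thread, here is a minimal standalone sketch of the failure mode described in the quoted commit message, and of why a level of indirection avoids it. It does not use the OVS sources: struct list_node, struct stream, and both function names are hypothetical stand-ins for struct ovs_list and struct dummy_packet_stream, and a malloc() plus memcpy() pair stands in for an xrealloc() call that moves the buffer. I am also assuming that the "level of indirection" means an array of pointers to separately allocated structures; the actual patch may differ in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct ovs_list: a circular doubly linked
 * list whose empty state is "the head points at itself". */
struct list_node {
    struct list_node *prev, *next;
};

static void list_init(struct list_node *head)
{
    head->prev = head->next = head;
}

static bool list_is_empty(const struct list_node *head)
{
    return head->next == head;
}

/* Hypothetical stand-in for struct dummy_packet_stream. */
struct stream {
    int id;
    struct list_node txq;
};

/* Embedded layout: the array holds the structures themselves, so moving
 * the array copies the list head's self-pointers verbatim.  Returns
 * whether element 0's (empty) txq still looks empty after the move. */
bool embedded_txq_empty_after_move(void)
{
    struct stream *old = malloc(sizeof *old);
    struct stream *grown = malloc(2 * sizeof *grown);
    if (!old || !grown) {
        abort();
    }
    old[0].id = 0;
    list_init(&old[0].txq);

    /* Simulates a reallocation that moves the buffer. */
    memcpy(grown, old, sizeof *old);

    /* grown[0].txq.next still points into the old buffer. */
    bool empty = list_is_empty(&grown[0].txq);
    free(grown);
    free(old);
    return empty;
}

/* Indirect layout: the array holds pointers, so only the pointers are
 * copied and each structure (and its list head) keeps its address. */
bool indirect_txq_empty_after_move(void)
{
    struct stream **old = malloc(sizeof *old);
    struct stream **grown = malloc(2 * sizeof *grown);
    if (!old || !grown) {
        abort();
    }
    old[0] = malloc(sizeof *old[0]);
    if (!old[0]) {
        abort();
    }
    old[0]->id = 0;
    list_init(&old[0]->txq);

    memcpy(grown, old, sizeof *old);

    bool empty = list_is_empty(&grown[0]->txq);
    free(grown[0]);
    free(grown);
    free(old);
    return empty;
}
```

In the embedded case the copied head no longer points at itself, so an empty queue is reported as non-empty and a consumer would go on to dereference a packet that does not exist. In the indirect case the head is never moved, so the check stays correct however often the pointer array grows.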