"dev" <dev-boun...@openvswitch.org> wrote on 07/18/2016 11:29:28 AM:

>
> > From: "Lance Richardson" <lrich...@redhat.com>
> > To: dev@openvswitch.org
> > Sent: Wednesday, July 6, 2016 7:39:52 PM
> > Subject: [ovs-dev] [PATCH] netdev-dummy: fix crash with more than
> one   passive connection
> >
> > Investigation found that Some of the occasional failures in the
> > "ovn -- vtep: 3 HVs, 1 VIFs/HV, 1 GW, 1 LS" test case are caused
> > by ovs-vswitchd crashing with SIGSEGV. It turns out that the
> > crash occurrs when the number of netdev-dummy passive connections
> > transitions from 1 to 2.  When xrealloc() copies the array of
> > dummy_packet_stream structures from the original buffer to a
> > newly allocated one, the struct ovs_list txq member of the structure
> > becomes corrupt (e.g. if ovs_list_is_empty() would have returned
> > false before the copy, it will return true after the copy, which
> > will lead to a crash when the bogus packet buffer on the list is
> > dereferenced).
> >
> > Fix by taking a hint from David Wheeler and adding a level of
> > indirection.
> >
> > Signed-off-by: Lance Richardson <lrich...@redhat.com>

[snip]

>
> Here is a small script that reliably reproduces the crash in
ovs-vswitchd.
> I don't have an explanation for why we have two connections to the same
> port in the "ovn -- vtep: 3 HVs, 1 VIFs/HV, 1 GW, 1 LS" test case (this
> happens infrequently), perhaps it's something like the active connect
from
> a peer ovs-vswitchd being interrupted and re-tried.
>
> #!/bin/bash
>
> set -ex
>
> PWD=$(pwd)
> export PATH=${PWD}/vswitchd:${PWD}/utilities:${PWD}/ovsdb:$PATH
>
> ovs_setenv() {
>     export OVS_RUNDIR=${PWD}/$1
>     export OVS_LOGDIR=${PWD}/$1
>     export OVS_DBDIR=${PWD}/$1
>     export OVS_SYSCONFDIR=${PWD}/$1
>     export OVS_PKGDATADIR=${PWD}/$1
> }
>
> for x in repro_main repro_hv1 repro_hv2; do
>     mkdir -p $x
>     rm -f $x/*
>     ovs_setenv $x
>     : > $OVS_RUNDIR/.conf.db.~lock~
>     ovsdb-tool create $OVS_RUNDIR/conf.db vswitchd/vswitch.ovsschema
>     ovsdb-server --remote=punix:$OVS_RUNDIR/db.sock  -vconsole:off
> --detach --no-chdir --pidfile --log-file
>     ovs-vsctl --no-wait -- init
>     ovs-vswitchd --enable-dummy=system -vvconn -vofproto_dpif -
> vunixctl -vconsole:off --detach --no-chdir --pidfile --log-file
> done
>
> ovs_setenv repro_main
> ovs-vsctl add-br foo  \
>           -- add-port foo p1 \
>           -- set Interface p1
options:pstream="punix:$PWD/repro_main/p1.sock"
>
> ovs_setenv repro_hv1
> ovs-vsctl add-br br1 \
>           -- add-port br1 p1 \
>           -- set Interface p1
options:stream="unix:$PWD/repro_main/p1.sock"
>
> ovs_setenv repro_hv2
> ovs-vsctl add-br br1 \
>           -- add-port br1 p1 \
>           -- set Interface p1
options:stream="unix:$PWD/repro_main/p1.sock"

I like this, but I think I'm going to be consistent and ask if you
can spin a new version with the above script as a unit test so that
we can see the crash before applying the rest of the patch and then
verify that it doesn't happen with the patch.

Ryan
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Reply via email to