Thanks for testing.

I discovered that this exact patch causes another problem.  I posted a
slight revision without that issue.  Would you mind re-testing?  Thanks
a lot.

The new version is here:
https://patchwork.ozlabs.org/patch/1012261/
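
For anyone following the thread, here is a minimal sketch of the idea behind the guard in the patch quoted below. It is not OVS code. The sketch_* types and functions are hypothetical stand-ins, and it assumes the hypothesis from the thread is right: a connection can be "connected" before it has an OpenFlow protocol, and mapping that empty protocol to a wire version aborts.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration only; these are NOT the real
 * OVS definitions. */
enum sketch_protocol {
    SKETCH_PROTO_NONE = 0,        /* No protocol negotiated yet. */
    SKETCH_PROTO_OF10 = 1 << 0,
    SKETCH_PROTO_OF13 = 1 << 1,
};

struct sketch_conn {
    bool connected;                 /* Transport is up. */
    enum sketch_protocol protocol;  /* SKETCH_PROTO_NONE until negotiated. */
};

/* Maps a protocol to its OpenFlow wire version.  Aborts on anything else,
 * which mirrors the abort seen in frame #2 of the backtrace. */
static uint8_t
sketch_protocol_to_version(enum sketch_protocol protocol)
{
    switch (protocol) {
    case SKETCH_PROTO_OF10:
        return 0x01;
    case SKETCH_PROTO_OF13:
        return 0x04;
    case SKETCH_PROTO_NONE:
    default:
        abort();
    }
}

/* The guard: a connection with no protocol is skipped, so the caller never
 * tries to encode a message for it. */
static bool
sketch_receives_async_msg(const struct sketch_conn *conn)
{
    if (!conn->connected || !conn->protocol) {
        return false;
    }
    return true;
}
```

A caller would check sketch_receives_async_msg() first and only then call sketch_protocol_to_version(), so the abort path is unreachable for connections that are still negotiating.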

On Tue, Dec 11, 2018 at 12:35:15PM +1300, Josh Bailey wrote:
> Yes sir. That fixes it - vswitchd no longer crashes.
> 
> Thanks,
> 
> 
> On Tue, Dec 11, 2018 at 12:31 PM Ben Pfaff <b...@ovn.org> wrote:
> 
> > Here's a more specific patch that, if my hypothesis is correct, would
> > solve the issue.
> >
> > diff --git a/ofproto/connmgr.c b/ofproto/connmgr.c
> > index 7c0f16b321f1..ebee5817710e 100644
> > --- a/ofproto/connmgr.c
> > +++ b/ofproto/connmgr.c
> > @@ -1493,7 +1493,7 @@ ofconn_receives_async_msg(const struct ofconn *ofconn,
> >      ovs_assert(reason < 32);
> >      ovs_assert((unsigned int) type < OAM_N_TYPES);
> >
> > -    if (!rconn_is_connected(ofconn->rconn)) {
> > +    if (!rconn_is_connected(ofconn->rconn) || !ofconn->protocol) {
> >          return false;
> >      }
> >
> > On Mon, Dec 10, 2018 at 12:37:14PM -0800, Ben Pfaff wrote:
> > > Probably, this is a different issue.  My guess is that the connection in
> > > question doesn't have an OpenFlow protocol at the moment.  We've dealt
> > > with problems like this before, see e.g. commit 903f6c4f8a9b ("connmgr:
> > > Fix vswitchd abort when a port is added and the controller is down").
> > > Either that fix wasn't sufficient or it wasn't backported or it's some
> > > other slightly different issue.
> > >
> > > I've had a restructuring that should improve things in this area out for
> > > review since the end of October.  So far it hasn't attracted a review:
> > > https://patchwork.ozlabs.org/patch/990599/
> > >
> > > On Mon, Dec 10, 2018 at 06:16:19PM -0200, Flavio Leitner wrote:
> > > >
> > > > Looks like you're using an unsupported OpenFlow protocol:
> > > >
> > > > https://github.com/openvswitch/ovs/blob/5f361a2a320717c46289fc30d65a186f2f5d3ba0/lib/ofp-protocol.c#L123
> > > >
> > > > I see that you are configuring a controller in OVS and you are
> > > > running Ryu, maybe it's using the wrong protocol version?
> > > >
> > > > fbl
> > > >
> > > > On Tue, Dec 11, 2018 at 08:07:51AM +1300, Josh Bailey wrote:
> > > > > Certainly:
> > > > >
> > > > > 2018-12-04 21:23:59 josh #1  0x00007f870edf9801 in __GI_abort () at abort.c:79
> > > > >
> > > > > 2018-12-04 21:23:59 josh #2  0x00005634f368e0a8 in ofputil_protocol_to_ofp_version (protocol=<optimized out>) at lib/ofp-protocol.c:123
> > > > >
> > > > > 2018-12-04 21:23:59 josh #3  0x00005634f36890ae in ofputil_encode_port_status (ps=ps@entry=0x7ffef1dc7880, protocol=<optimized out>) at lib/ofp-port.c:938
> > > > >
> > > > > 2018-12-04 21:23:59 josh #4  0x00005634f35f7ab2 in connmgr_send_port_status (mgr=0x5634f518a9a0, source=source@entry=0x0, pp=pp@entry=0x5634f5247310, reason=reason@entry=2 '\002') at ofproto/connmgr.c:1654
> > > > >
> > > > > 2018-12-04 21:23:59 josh #5  0x00005634f35bcfe3 in ofproto_port_set_state (port=port@entry=0x5634f52472f0, state=<optimized out>) at ofproto/ofproto.c:2485
> > > > >
> > > > > 2018-12-04 21:23:59 josh #6  0x00005634f35d07e3 in port_run (ofport=0x5634f52472e0) at ofproto/ofproto-dpif.c:3629
> > > > >
> > > > > 2018-12-04 21:23:59 josh #7  run (ofproto_=0x5634f51dd2c0) at ofproto/ofproto-dpif.c:1666
> > > > >
> > > > > 2018-12-04 21:23:59 josh #8  0x00005634f35be5ee in ofproto_run (p=0x5634f51dd2c0) at ofproto/ofproto.c:1741
> > > > >
> > > > > 2018-12-04 21:23:59 josh #9  0x00005634f35abe9c in bridge_run__ () at vswitchd/bridge.c:2944
> > > > >
> > > > > 2018-12-04 21:23:59 josh #10 0x00005634f35b19e0 in bridge_run () at vswitchd/bridge.c:3002
> > > > >
> > > > > 2018-12-04 21:23:59 josh #11 0x00005634f3211595 in main (argc=<optimized out>, argv=<optimized out>) at vswitchd/ovs-vswitchd.c:125
> > > > >
> > > > > On Tue, Dec 11, 2018 at 7:50 AM Flavio Leitner <f...@sysclose.org> wrote:
> > > > >
> > > > > > On Wed, Dec 05, 2018 at 12:11:28PM +1300, Josh Bailey via discuss wrote:
> > > > > > > Hello OVS colleagues,
> > > > > > >
> > > > > > > vswitchd appears to crash handling a port add/mod. Please see
> > > > > > > following to reproduce.
> > > > > > >
> > > > > > > Run two Ryu OF controllers:
> > > > > > >
> > > > > > > $ ryu-manager --ofp-tcp-listen-port 6653 --ofp-listen-host 127.0.0.1 --verbose --app-lists ryu.app.simple_switch_stp
> > > > > > >
> > > > > > > $ ryu-manager --ofp-tcp-listen-port 6654 --ofp-listen-host 127.0.0.1 --verbose --app-lists ryu.app.simple_switch_stp
> > > > > > >
> > > > > > >
> > > > > > > Now set up a bridge with no interfaces:
> > > > > > >
> > > > > > >
> > > > > > > root@faucet:~/faucet# /usr/local/share/openvswitch/scripts/ovs-ctl start
> > > > > > > * Starting ovsdb-server
> > > > > > > * system ID not configured, please use --system-id
> > > > > > > * Configuring Open vSwitch system IDs
> > > > > > > * Starting ovs-vswitchd
> > > > > > > * Enabling remote OVSDB managers
> > > > > > > root@faucet:~/faucet# ovs-vsctl --version
> > > > > > > ovs-vsctl (Open vSwitch) 2.9.3
> > > > > > > DB Schema 7.15.1
> > > > > > > root@faucet:~/faucet# ovs-vsctl add-br br0
> > > > > > > root@faucet:~/faucet# ovs-vsctl set-controller br0 tcp:127.0.0.1:6653 tcp:127.0.0.1:6654
> > > > > > >
> > > > > > >
> > > > > > > Now add a physical interface known to be up:
> > > > > > >
> > > > > > >
> > > > > > > root@faucet:~/faucet# ovs-vsctl add-port br0 enp2s0f0
> > > > > > >
> > > > > > >
> > > > > > > Observe crash in log:
> > > > > > >
> > > > > > >
> > > > > > > 2018-12-04T23:03:06.663Z|00036|bridge|INFO|bridge br0: added interface enp2s0f0 on port 1
> > > > > > > 2018-12-04T23:03:06.663Z|00037|bridge|INFO|bridge br0: using datapath ID 000090e2ba7e7558
> > > > > > > 2018-12-04T23:03:06.663Z|00038|rconn|INFO|br0<->tcp:127.0.0.1:6653: disconnecting
> > > > > > > 2018-12-04T23:03:06.663Z|00039|rconn|INFO|br0<->tcp:127.0.0.1:6654: disconnecting
> > > > > > > 2018-12-04T23:03:06.664Z|00040|fail_open|WARN|Could not connect to controller (or switch failed controller's post-connection admission control policy) for 19 seconds, failing open
> > > > > > > 2018-12-04T23:03:06.710Z|00002|daemon_unix(monitor)|ERR|1 crashes: pid 5620 died, killed (Aborted), core dumped, restarting
> > > > > >
> > > > > > Please open the coredump using gdb and provide at least the
> > > > > > backtrace.
> > > > > > Thanks,
> > > > > > --
> > > > > > fbl
> > > > > >
> > > > > >
> > > >
> > > > --
> > > > Flavio
> > > >
> > > > _______________________________________________
> > > > discuss mailing list
> > > > disc...@openvswitch.org
> > > > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
> >
