Thanks for testing. I discovered that this exact patch causes another problem. I posted a slight revision without that issue. Would you mind re-testing? Thanks a lot.
The new version is here: https://patchwork.ozlabs.org/patch/1012261/

On Tue, Dec 11, 2018 at 12:35:15PM +1300, Josh Bailey wrote:
> Yes sir. That fixes it - vswitchd no longer crashes.
>
> Thanks,
>
>
> On Tue, Dec 11, 2018 at 12:31 PM Ben Pfaff <b...@ovn.org> wrote:
>
> > Here's a more specific patch that, if my hypothesis is correct, would
> > solve the issue.
> >
> > diff --git a/ofproto/connmgr.c b/ofproto/connmgr.c
> > index 7c0f16b321f1..ebee5817710e 100644
> > --- a/ofproto/connmgr.c
> > +++ b/ofproto/connmgr.c
> > @@ -1493,7 +1493,7 @@ ofconn_receives_async_msg(const struct ofconn *ofconn,
> >      ovs_assert(reason < 32);
> >      ovs_assert((unsigned int) type < OAM_N_TYPES);
> >
> > -    if (!rconn_is_connected(ofconn->rconn)) {
> > +    if (!rconn_is_connected(ofconn->rconn) || !ofconn->protocol) {
> >          return false;
> >      }
> >
> > On Mon, Dec 10, 2018 at 12:37:14PM -0800, Ben Pfaff wrote:
> > > Probably, this is a different issue. My guess is that the connection in
> > > question doesn't have an OpenFlow protocol at the moment. We've dealt
> > > with problems like this before, see e.g. commit 903f6c4f8a9b ("connmgr:
> > > Fix vswitchd abort when a port is added and the controller is down").
> > > Either that fix wasn't sufficient or it wasn't backported or it's some
> > > other slightly different issue.
> > >
> > > I've had a restructuring that should improve things in this area out for
> > > review since the end of October. So far it hasn't attracted a review:
> > > https://patchwork.ozlabs.org/patch/990599/
> > >
> > > On Mon, Dec 10, 2018 at 06:16:19PM -0200, Flavio Leitner wrote:
> > > >
> > > > Looks like you're using an unsupported OpenFlow protocol:
> > > > https://github.com/openvswitch/ovs/blob/5f361a2a320717c46289fc30d65a186f2f5d3ba0/lib/ofp-protocol.c#L123
> > > >
> > > > I see that you are configuring a controller in OVS and you are
> > > > running Ryu, maybe it's using the wrong protocol version?
> > > >
> > > > fbl
> > > >
> > > > On Tue, Dec 11, 2018 at 08:07:51AM +1300, Josh Bailey wrote:
> > > > > Certainly:
> > > > >
> > > > > 2018-12-04 21:23:59 josh #1 0x00007f870edf9801 in __GI_abort () at abort.c:79
> > > > >
> > > > > 2018-12-04 21:23:59 josh #2 0x00005634f368e0a8 in ofputil_protocol_to_ofp_version (protocol=<optimized out>) at lib/ofp-protocol.c:123
> > > > >
> > > > > 2018-12-04 21:23:59 josh #3 0x00005634f36890ae in ofputil_encode_port_status (ps=ps@entry=0x7ffef1dc7880, protocol=<optimized out>) at lib/ofp-port.c:938
> > > > >
> > > > > 2018-12-04 21:23:59 josh #4 0x00005634f35f7ab2 in connmgr_send_port_status (mgr=0x5634f518a9a0, source=source@entry=0x0, pp=pp@entry=0x5634f5247310, reason=reason@entry=2 '\002') at ofproto/connmgr.c:1654
> > > > >
> > > > > 2018-12-04 21:23:59 josh #5 0x00005634f35bcfe3 in ofproto_port_set_state (port=port@entry=0x5634f52472f0, state=<optimized out>) at ofproto/ofproto.c:2485
> > > > >
> > > > > 2018-12-04 21:23:59 josh #6 0x00005634f35d07e3 in port_run (ofport=0x5634f52472e0) at ofproto/ofproto-dpif.c:3629
> > > > >
> > > > > 2018-12-04 21:23:59 josh #7 run (ofproto_=0x5634f51dd2c0) at ofproto/ofproto-dpif.c:1666
> > > > >
> > > > > 2018-12-04 21:23:59 josh #8 0x00005634f35be5ee in ofproto_run (p=0x5634f51dd2c0) at ofproto/ofproto.c:1741
> > > > >
> > > > > 2018-12-04 21:23:59 josh #9 0x00005634f35abe9c in bridge_run__ () at vswitchd/bridge.c:2944
> > > > >
> > > > > 2018-12-04 21:23:59 josh #10 0x00005634f35b19e0 in bridge_run () at vswitchd/bridge.c:3002
> > > > >
> > > > > 2018-12-04 21:23:59 josh #11 0x00005634f3211595 in main (argc=<optimized out>, argv=<optimized out>) at vswitchd/ovs-vswitchd.c:125
> > > > >
> > > > > On Tue, Dec 11, 2018 at 7:50 AM Flavio Leitner <f...@sysclose.org> wrote:
> > > > >
> > > > > > On Wed, Dec 05, 2018 at 12:11:28PM +1300, Josh Bailey via discuss wrote:
> > > > > > > Hello OVS colleagues,
> > > > > > >
> > > > > > > vswitchd appears to crash handling a port add/mod. Please see following to
> > > > > > > reproduce.
> > > > > > >
> > > > > > > Run two Ryu OF controllers:
> > > > > > >
> > > > > > > $ ryu-manager --ofp-tcp-listen-port 6653 --ofp-listen-host 127.0.0.1 --verbose --app-lists ryu.app.simple_switch_stp
> > > > > > >
> > > > > > > $ ryu-manager --ofp-tcp-listen-port 6654 --ofp-listen-host 127.0.0.1 --verbose --app-lists ryu.app.simple_switch_stp
> > > > > > >
> > > > > > >
> > > > > > > Now set up a bridge with no interfaces:
> > > > > > >
> > > > > > >
> > > > > > > root@faucet:~/faucet# /usr/local/share/openvswitch/scripts/ovs-ctl start
> > > > > > >  * Starting ovsdb-server
> > > > > > >  * system ID not configured, please use --system-id
> > > > > > >  * Configuring Open vSwitch system IDs
> > > > > > >  * Starting ovs-vswitchd
> > > > > > >  * Enabling remote OVSDB managers
> > > > > > > root@faucet:~/faucet# ovs-vsctl --version
> > > > > > > ovs-vsctl (Open vSwitch) 2.9.3
> > > > > > > DB Schema 7.15.1
> > > > > > > root@faucet:~/faucet# ovs-vsctl add-br br0
> > > > > > > root@faucet:~/faucet# ovs-vsctl set-controller br0 tcp:127.0.0.1:6653 tcp:127.0.0.1:6654
> > > > > > >
> > > > > > >
> > > > > > > Now add a physical interface known to be up:
> > > > > > >
> > > > > > >
> > > > > > > root@faucet:~/faucet# ovs-vsctl add-port br0 enp2s0f0
> > > > > > >
> > > > > > >
> > > > > > > Observe crash in log:
> > > > > > >
> > > > > > >
> > > > > > > 2018-12-04T23:03:06.663Z|00036|bridge|INFO|bridge br0: added interface enp2s0f0 on port 1
> > > > > > > 2018-12-04T23:03:06.663Z|00037|bridge|INFO|bridge br0: using datapath ID 000090e2ba7e7558
> > > > > > > 2018-12-04T23:03:06.663Z|00038|rconn|INFO|br0<->tcp:127.0.0.1:6653: disconnecting
> > > > > > > 2018-12-04T23:03:06.663Z|00039|rconn|INFO|br0<->tcp:127.0.0.1:6654: disconnecting
> > > > > > > 2018-12-04T23:03:06.664Z|00040|fail_open|WARN|Could not connect to controller (or switch failed controller's post-connection admission control policy) for 19 seconds, failing open
> > > > > > > 2018-12-04T23:03:06.710Z|00002|daemon_unix(monitor)|ERR|1 crashes: pid 5620 died, killed (Aborted), core dumped, restarting
> > > > > >
> > > > > > Please open the coredump using gdb and provide the backtrace at least,
> > > > > > Thanks,
> > > > > > --
> > > > > > fbl
> > > > > >
> > > >
> > > > --
> > > > Flavio
> > > >
> > > > _______________________________________________
> > > > discuss mailing list
> > > > disc...@openvswitch.org
> > > > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss

_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
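
For anyone reading the archive later, here is a rough standalone sketch of the idea behind the first patch quoted above (not the revision on patchwork): a connection that is up but has not settled on an OpenFlow protocol yet is skipped when sending asynchronous messages such as port status. Everything named sketch_* or SKETCH_* is hypothetical and only models the check; none of it is the actual OVS code, and the -1 return stands in for the abort seen in the backtrace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the protocol bitmap; not the OVS enum. */
enum sketch_protocol {
    SKETCH_PROTO_NONE = 0,        /* no protocol negotiated yet */
    SKETCH_PROTO_OF10 = 1 << 0,
    SKETCH_PROTO_OF13 = 1 << 1
};

/* Hypothetical stand-in for the two facts the check looks at. */
struct sketch_conn {
    bool connected;                 /* models rconn_is_connected() */
    enum sketch_protocol protocol;  /* models ofconn->protocol */
};

/* Models the patched condition: a connection only receives async
 * messages if it is connected AND has a protocol. */
bool
sketch_receives_async_msg(const struct sketch_conn *conn, unsigned int reason)
{
    assert(reason < 32);
    if (!conn->connected || !conn->protocol) {
        return false;
    }
    return true;
}

/* Models the protocol-to-wire-version step from the backtrace.
 * Returns -1 where this sketch assumes the real code would abort. */
int
sketch_protocol_to_version(enum sketch_protocol protocol)
{
    switch (protocol) {
    case SKETCH_PROTO_OF10: return 0x01;   /* OpenFlow 1.0 */
    case SKETCH_PROTO_OF13: return 0x04;   /* OpenFlow 1.3 */
    default: return -1;
    }
}

/* Models a port-status broadcast: returns how many connections the
 * message would be encoded for. */
size_t
sketch_broadcast_port_status(const struct sketch_conn *conns, size_t n,
                             unsigned int reason)
{
    size_t sent = 0;
    for (size_t i = 0; i < n; i++) {
        if (!sketch_receives_async_msg(&conns[i], reason)) {
            continue;   /* skipped before any version lookup happens */
        }
        int version = sketch_protocol_to_version(conns[i].protocol);
        assert(version > 0);   /* cannot fail once the guard has passed */
        sent++;
    }
    return sent;
}
```

With the guard in place, a connected peer whose protocol is still SKETCH_PROTO_NONE never reaches the version lookup. Dropping the `!conn->protocol` half of the condition would let it through and trip the second assert, which is roughly the shape of the crash in the backtrace.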