On Wed, Oct 17, 2018 at 05:33:05PM +0530, nusid...@redhat.com wrote:
> From: Numan Siddique <nusid...@redhat.com>
> 
> We see the below trace when a port is added to a bridge and the configured
> controller is down
> 
> 0x00007fb002f8b207 in raise () from /lib64/libc.so.6
> 0x00007fb002f8c8f8 in abort () from /lib64/libc.so.6
> 0x00007fb004953026 in ofputil_protocol_to_ofp_version () from 
> /lib64/libopenvswitch-2.10.so.0
> 0x00007fb00494e38e in ofputil_encode_port_status () from 
> /lib64/libopenvswitch-2.10.so.0
> 0x00007fb004ef1c5b in connmgr_send_port_status () from 
> /lib64/libofproto-2.10.so.0
> 0x00007fb004efa9f4 in ofport_install () from /lib64/libofproto-2.10.so.0
> 0x00007fb004efbfb2 in update_port () from /lib64/libofproto-2.10.so.0
> 0x00007fb004efc7f9 in ofproto_port_add () from /lib64/libofproto-2.10.so.0
> 0x0000556d540a3f95 in bridge_add_ports__ ()
> 0x0000556d540a5a47 in bridge_reconfigure ()
> 0x0000556d540a9199 in bridge_run ()
> 0x0000556d540a02a5 in main ()
> 
> When connmgr detects that the connection to the controller is down, it
> resets the ofconn's protocol to 'OFPUTIL_P_NONE' and that's why we
> see the above abort. This patch fixes the issue by also checking the
> connection status before sending the port status in the
>  connmgr_send_port_status().
> 
> The issue can be reproduced by running the test added in this patch
> without the fix.
> 
> Signed-off-by: Numan Siddique <nusid...@redhat.com>

This is good work.  Thank you.

In general there's some bad factoring here, but a proper fix would be a
lot more work.

I think that we could generalize this by adjusting
ofconn_receives_async_msg() to return false if the connection is down.
At least one caller checks already, but it seems like all the callers
should ignore connections that are down.  What do you think of that?

Thanks,

Ben.
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to