On 5 Sep 2024, at 18:35, Kevin Traynor wrote:
> On 08/08/2024 10:57, Xinxin Zhao wrote:
>> When the ovs control thread del vhost-user port and
>> the vhost-event thread process the vhost-user port down concurrently,
>> the main thread may fall into a deadlock.
>>
>> E.g., vhostuser port is created as client.
>> The ovs control thread executes the following process:
>> rte_vhost_driver_unregister->fdset_try_del.
>> At the same time, the vhost-event thread executes the following process:
>> fdset_event_dispatch->vhost_user_read_cb->destroy_device.
>> At this time, vhost-event will wait for rcu scheduling,
>> and the ovs control thread is waiting for pfdentry->busy to be 0.
>> The two threads are waiting for each other and fall into a deadlock.
>>
>
> Hi Xinxin,
>
> Thanks for the patch. I managed to reproduce this with a little bit of
> hacking. Indeed, a deadlock can occur with some unlucky timing.
>
> Acked-by: Kevin Traynor <[email protected]>

Kevin or Xinxin, can you add some more explanation on where the deadlock
is occurring?

Also, how do we guarantee that it’s safe to go to quiesce state and that
no others in the call chain hold/use any RCU-protected data?

Thanks,

Eelco

>> Fixes: afee281 ("netdev-dpdk: Fix dpdk_watchdog failure to quiesce.")
>>
>> Signed-off-by: Xinxin Zhao <[email protected]>
>> ---
>>  lib/netdev-dpdk.c | 11 ++++++++++-
>>  1 file changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>> index 02cef6e45..0c02357f5 100644
>> --- a/lib/netdev-dpdk.c
>> +++ b/lib/netdev-dpdk.c
>> @@ -1808,7 +1808,16 @@ dpdk_vhost_driver_unregister(struct netdev_dpdk *dev
>> OVS_UNUSED,
>>      OVS_EXCLUDED(dpdk_mutex)
>>      OVS_EXCLUDED(dev->mutex)
>>  {
>> -    return rte_vhost_driver_unregister(vhost_id);
>> +    int ret;
>> +    /* Due to the rcu wait of the vhost-event thread,
>> +     * rte_vhost_driver_unregister() may loop endlessly.
>> +     * So the unregister action needs to be removed from the rcu_list.
>> +     */
>> +    ovsrcu_quiesce_start();
>> +    ret = rte_vhost_driver_unregister(vhost_id);
>> +    ovsrcu_quiesce_end();
>> +
>> +    return ret;
>>  }
>>
>>  static void

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
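
For reference, the wait cycle described in the commit message can be modelled with a small stand-alone sketch. This is not OVS or DPDK code: `struct scenario`, `event_thread()` and `run_scenario()` are hypothetical names, the `busy` flag only stands in for pfdentry->busy, and the `control_quiescent` flag is a deliberately crude stand-in for the RCU grace-period wait. A timeout and a `give_up` flag are added so that the demo reports the stall instead of hanging.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Simplified model only; all names here are hypothetical. */
struct scenario {
    atomic_bool busy;              /* Stand-in for pfdentry->busy. */
    atomic_bool control_quiescent; /* Control thread reported quiescence. */
    atomic_bool give_up;           /* Lets the demo exit instead of hanging. */
};

static void sleep_ms(long ms)
{
    struct timespec ts = { .tv_sec = ms / 1000,
                           .tv_nsec = (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/* Models the vhost-event thread: its callback is in progress (busy is set)
 * and it cannot finish until the control thread has quiesced. */
static void *event_thread(void *arg)
{
    struct scenario *s = arg;

    while (!atomic_load(&s->control_quiescent)
           && !atomic_load(&s->give_up)) {
        sleep_ms(1);
    }
    atomic_store(&s->busy, false);
    return NULL;
}

/* Models the control thread doing the unregister. Returns true if busy
 * dropped to 0 within roughly 200 ms, false if the wait stalled. */
static bool run_scenario(bool quiesce_around_unregister)
{
    struct scenario s;
    pthread_t tid;
    bool done = false;

    atomic_init(&s.busy, true);
    atomic_init(&s.control_quiescent, false);
    atomic_init(&s.give_up, false);
    if (pthread_create(&tid, NULL, event_thread, &s) != 0) {
        return false;
    }

    if (quiesce_around_unregister) {
        atomic_store(&s.control_quiescent, true);   /* "quiesce start" */
    }
    for (int i = 0; i < 200; i++) {                 /* "wait for busy == 0" */
        if (!atomic_load(&s.busy)) {
            done = true;
            break;
        }
        sleep_ms(1);
    }
    atomic_store(&s.control_quiescent, false);      /* "quiesce end" */

    atomic_store(&s.give_up, true);                 /* Unblock the model. */
    pthread_join(tid, NULL);
    return done;
}
```

In this model, run_scenario(false) stalls until the timeout because each side waits on the other, while run_scenario(true) completes. It says nothing about whether callers higher up the chain still hold RCU-protected pointers, which is the open question above.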
