Hi Kevin,
Sorry for taking so long to reply.

We encountered this problem in an actual production environment. The
trigger was deleting the vhost port while the virtio front-end set the
vhost port link down, concurrently.

>>>> Also, how do we guarantee that it’s safe to go to quiesce state and that
>>>> no others in the call chain hold/use any RCU-protected data?

The idea behind my modification is that DPDK already has a lock protecting
the DPDK vhost port resources, and the OVS netdev data is protected by the
netdev_mutex lock, so we think the data is relatively safe. We have not
come up with a better solution yet, and the current modification has been
running stably in production for a year. For reference, I have put a short
sketch of the patched function, with the reasoning written out as comments,
at the bottom of this mail.

At 2024-09-11 17:19:06, "Kevin Traynor" <[email protected]> wrote:
>On 09/09/2024 13:34, Eelco Chaudron wrote:
>>
>>
>> On 6 Sep 2024, at 16:03, Kevin Traynor wrote:
>>
>>> On 06/09/2024 07:38, Eelco Chaudron wrote:
>>>>
>>>>
>>>> On 5 Sep 2024, at 18:35, Kevin Traynor wrote:
>>>>
>>>>> On 08/08/2024 10:57, Xinxin Zhao wrote:
>>>>>> When the ovs control thread del vhost-user port and
>>>>>> the vhost-event thread process the vhost-user port down concurrently,
>>>>>> the main thread may fall into a deadlock.
>>>>>>
>>>>>> E.g., vhostuser port is created as client.
>>>>>> The ovs control thread executes the following process:
>>>>>> rte_vhost_driver_unregister->fdset_try_del.
>>>>>> At the same time, the vhost-event thread executes the following process:
>>>>>> fdset_event_dispatch->vhost_user_read_cb->destroy_device.
>>>>>> At this time, vhost-event will wait for rcu scheduling,
>>>>>> and the ovs control thread is waiting for pfdentry->busy to be 0.
>>>>>> The two threads are waiting for each other and fall into a deadlock.
>>>>>>
>>>>>
>
>Hi Xinxin,
>
>Just wondering if you observed this during normal usage, or did you
>identify from code inspection/tools or some other stress test etc ?
>
>thanks,
>Kevin.
>
>>>>> Hi Xinxin,
>>>>>
>>>>> Thanks for the patch. I managed to reproduce this with a little bit of
>>>>> hacking. Indeed, a deadlock can occur with some unlucky timing.
>>>>>
>>>>> Acked-by: Kevin Traynor <[email protected]>
>>>>
>>>> Kevin or Xinxin, can you add some more explanation on where the deadlock
>>>> is occurring?
>>>>
>>>
>>> vhost-event thread is blocking on the synchronize, waiting for main
>>> thread to quiesce, i.e.
>>>
>>> #3 0x0000000003690c09 in ovsrcu_synchronize () at lib/ovs-rcu.c:237
>>> #4 0x00000000037162fa in destroy_device (vid=0) at lib/netdev-dpdk.c:4919
>>> #5 0x0000000003238a12 in vhost_destroy_device_notify (dev=0x14ba693c0)
>>> at ../lib/vhost/vhost.c:756
>>>
>>> While main thread is looping in dpdk unregister, waiting for vhost-event
>>> callback to finish, i.e.:
>>>
>>> #1 0x00000000032336e8 in rte_vhost_driver_unregister (path=0xd217a30
>>> "/tmp/vhost0") at ../lib/vhost/socket.c:1075
>>> #2 0x000000000370d5f3 in dpdk_vhost_driver_unregister (dev=0x17d6e8b40,
>>> vhost_id=0xd217a30 "/tmp/vhost0") at lib/netdev-dpdk.c:1811
>>> #3 0x000000000370d709 in netdev_dpdk_vhost_destruct
>>> (netdev=0x17d6e8bc0) at lib/netdev-dpdk.c:1843
>>>
>>>> Also, how do we guarantee that it’s safe to go to quiesce state and that
>>>> no others in the call chain hold/use any RCU-protected data?
>>>>
>>>
>>> It should be fine for anything rcu freed in the main thread (possible
>>> mirror bridge struct) as other threads would need to quiesce too, but
>>> you have a point about anything used in main thread.
>>> We are deep into bridge_run(), so I'm not sure how we could test for every scenario.
>>>
>>> If we can't guarantee it, then maybe another approach is needed, perhaps
>>> we could hijack the ovs_vhost thread mechanism to call the unregister ?
>>> but i'm not sure if there's other implications doing it asynchronously.
>>
>>
>> This callback is called by netdev_unref(), which is called for example by
>> netdev_close(). netdev_close() is called all over the place which makes it
>> unsafe to just go through quiesce state. I guess we need another way to fix
>> this.
>>
>>>> Thanks,
>>>>
>>>> Eelco
>>>>
>>>>>> Fixes: afee281 ("netdev-dpdk: Fix dpdk_watchdog failure to quiesce.")
>>>>>>
>>>>>> Signed-off-by: Xinxin Zhao <[email protected]>
>>>>>> ---
>>>>>>  lib/netdev-dpdk.c | 11 ++++++++++-
>>>>>>  1 file changed, 10 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>>>>>> index 02cef6e45..0c02357f5 100644
>>>>>> --- a/lib/netdev-dpdk.c
>>>>>> +++ b/lib/netdev-dpdk.c
>>>>>> @@ -1808,7 +1808,16 @@ dpdk_vhost_driver_unregister(struct netdev_dpdk
>>>>>> *dev OVS_UNUSED,
>>>>>>      OVS_EXCLUDED(dpdk_mutex)
>>>>>>      OVS_EXCLUDED(dev->mutex)
>>>>>>  {
>>>>>> -    return rte_vhost_driver_unregister(vhost_id);
>>>>>> +    int ret;
>>>>>> +    /* Due to the rcu wait of the vhost-event thread,
>>>>>> +     * rte_vhost_driver_unregister() may loop endlessly.
>>>>>> +     * So the unregister action needs to be removed from the rcu_list.
>>>>>> +     */
>>>>>> +    ovsrcu_quiesce_start();
>>>>>> +    ret = rte_vhost_driver_unregister(vhost_id);
>>>>>> +    ovsrcu_quiesce_end();
>>>>>> +
>>>>>> +    return ret;
>>>>>>  }
>>>>>>
>>>>>>  static void
>>>>>
>>>>> _______________________________________________
>>>>> dev mailing list
>>>>> [email protected]
>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>>>
>>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
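
P.S. For anyone skimming the thread, below is roughly what
dpdk_vhost_driver_unregister() looks like with the patch above applied,
with the reasoning from the two backtraces written out as comments. This
is only a sketch of the idea, not a copy from the tree: the "static int"
return type and the "char *vhost_id" parameter are inferred from the hunk
header and the backtrace, and the two includes are only there to show
where the symbols come from.

    #include <rte_vhost.h>   /* rte_vhost_driver_unregister() (DPDK)       */
    #include "ovs-rcu.h"     /* ovsrcu_quiesce_start()/ovsrcu_quiesce_end() */

    static int
    dpdk_vhost_driver_unregister(struct netdev_dpdk *dev OVS_UNUSED,
                                 char *vhost_id)
        OVS_EXCLUDED(dpdk_mutex)
        OVS_EXCLUDED(dev->mutex)
    {
        int ret;

        /* The vhost-event thread can be blocked in destroy_device() ->
         * ovsrcu_synchronize(), waiting for this (main) thread to quiesce,
         * while rte_vhost_driver_unregister() keeps looping until that same
         * callback finishes (pfdentry->busy drops to 0).  Entering the
         * quiesce state around the unregister call lets the RCU grace
         * period complete and breaks the cycle. */
        ovsrcu_quiesce_start();
        ret = rte_vhost_driver_unregister(vhost_id);
        ovsrcu_quiesce_end();

        return ret;
    }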
