Hi Kevin,



Sorry for the long delay in replying.

We encountered this problem in an actual production environment.

The triggering condition was deleting the vhost port while the virtio front-end
concurrently set the vhost port link_down.

>>>> Also, how do we guarantee that it’s safe to go to quiesce state and that 
>>>> no others in the call chain hold/use any RCU-protected data?

   The idea behind my modification is that DPDK already has a lock protecting
the DPDK vhost port resources.

   Regarding the OVS netdev data, OVS protects it with the netdev_mutex lock,
so we consider that data relatively safe.
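
   To make the concern concrete, the pattern that would be unsafe looks roughly
like the sketch below. It is illustrative only, not OVS code: foo_ptr,
do_blocking_call() and use() are made-up names; only the ovs-rcu primitives
(OVSRCU_TYPE, ovsrcu_get, ovsrcu_quiesce_start/end) are real.

#include "ovs-rcu.h"                 /* OVS tree header for the RCU primitives */

struct foo { int x; };
static OVSRCU_TYPE(struct foo *) foo_ptr;     /* an RCU-protected pointer      */

extern void do_blocking_call(void);
extern void use(struct foo *);

static void
example(void)
{
    struct foo *p = ovsrcu_get(struct foo *, &foo_ptr);

    ovsrcu_quiesce_start();          /* this thread is now quiescent, so a     */
    do_blocking_call();              /* grace period may complete and another  */
    ovsrcu_quiesce_end();            /* thread may free what 'p' points to     */

    use(p);                          /* possible use-after-free                */
}

   This is the situation that the locking described above is meant to rule out
for the unregister path.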

   We have not thought of a better solution yet, and the current modification
has been running stably in our production environment for 1 year.


At 2024-09-11 17:19:06, "Kevin Traynor" <[email protected]> wrote:
>On 09/09/2024 13:34, Eelco Chaudron wrote:
>> 
>> 
>> On 6 Sep 2024, at 16:03, Kevin Traynor wrote:
>> 
>>> On 06/09/2024 07:38, Eelco Chaudron wrote:
>>>>
>>>>
>>>> On 5 Sep 2024, at 18:35, Kevin Traynor wrote:
>>>>
>>>>> On 08/08/2024 10:57, Xinxin Zhao wrote:
>>>>>> When the ovs control thread deletes a vhost-user port and the
>>>>>> vhost-event thread processes the vhost-user port going down concurrently,
>>>>>> the main thread may fall into a deadlock.
>>>>>>
>>>>>> E.g., vhostuser port is created as client.
>>>>>> The ovs control thread executes the following process:
>>>>>> rte_vhost_driver_unregister->fdset_try_del.
>>>>>> At the same time, the vhost-event thread executes the following process:
>>>>>> fdset_event_dispatch->vhost_user_read_cb->destroy_device.
>>>>>> At this point, the vhost-event thread waits for RCU synchronization,
>>>>>> while the ovs control thread waits for pfdentry->busy to become 0.
>>>>>> The two threads wait for each other and fall into a deadlock.
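
A toy model of the two waits (the names are simplified stand-ins, not the
actual OVS/DPDK code: callback_busy plays the role of pfdentry->busy and
main_quiesced the role of the main thread's quiescent state):

/* Build with: gcc -pthread deadlock_model.c */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <unistd.h>

static atomic_int callback_busy = 1;   /* vhost-user callback still running   */
static atomic_int main_quiesced = 0;   /* main thread reached quiescence      */

/* vhost-event thread: destroy_device() cannot return (and clear 'busy')
 * until every OVS thread, including main, passes a quiescent state. */
static void *
vhost_event_thread(void *arg)
{
    (void) arg;
    while (!atomic_load(&main_quiesced)) {    /* models ovsrcu_synchronize()  */
        usleep(1000);
    }
    atomic_store(&callback_busy, 0);          /* callback finally finishes    */
    return NULL;
}

int
main(void)
{
    pthread_t tid;
    int spins = 0;

    pthread_create(&tid, NULL, vhost_event_thread, NULL);

    /* main thread: rte_vhost_driver_unregister() keeps retrying
     * fdset_try_del() while the callback is busy, and never goes
     * through a quiescent state while looping. */
    while (atomic_load(&callback_busy)) {
        if (++spins > 2000) {                 /* bail out so the demo ends    */
            puts("deadlock: each thread is waiting on the other");
            return 1;
        }
        usleep(1000);
    }
    pthread_join(tid, NULL);
    return 0;
}

With the patch below, the main thread quiesces around the blocking call, which
is what breaks this cycle.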
>>>>>>
>>>>>
>
>Hi Xinxin,
>
>Just wondering if you observed this during normal usage, or did you
>identify from code inspection/tools or some other stress test etc ?
>
>thanks,
>Kevin.
>
>>>>> Hi Xinxin,
>>>>>
>>>>> Thanks for the patch. I managed to reproduce this with a little bit of
>>>>> hacking. Indeed, a deadlock can occur with some unlucky timing.
>>>>>
>>>>> Acked-by: Kevin Traynor <[email protected]>
>>>>
>>>> Kevin or Xinxin, can you add some more explanation on where the deadlock 
>>>> is occurring?
>>>>
>>>
>>> vhost-event thread is blocking on the synchronize, waiting for the main
>>> thread to quiesce, i.e.:
>>>
>>> #3  0x0000000003690c09 in ovsrcu_synchronize () at lib/ovs-rcu.c:237
>>> #4  0x00000000037162fa in destroy_device (vid=0) at lib/netdev-dpdk.c:4919
>>> #5  0x0000000003238a12 in vhost_destroy_device_notify (dev=0x14ba693c0)
>>> at ../lib/vhost/vhost.c:756
>>>
>>> While main thread is looping in dpdk unregister, waiting for vhost-event
>>> callback to finish, i.e.:
>>>
>>> #1  0x00000000032336e8 in rte_vhost_driver_unregister (path=0xd217a30
>>> "/tmp/vhost0") at ../lib/vhost/socket.c:1075
>>> #2  0x000000000370d5f3 in dpdk_vhost_driver_unregister (dev=0x17d6e8b40,
>>> vhost_id=0xd217a30 "/tmp/vhost0") at lib/netdev-dpdk.c:1811
>>> #3  0x000000000370d709 in netdev_dpdk_vhost_destruct
>>> (netdev=0x17d6e8bc0) at lib/netdev-dpdk.c:1843
>>>
>>>> Also, how do we guarantee that it’s safe to go to quiesce state and that 
>>>> no others in the call chain hold/use any RCU-protected data?
>>>>
>>>
>>> It should be fine for anything rcu freed in the main thread (possible
>>> mirror bridge struct) as other threads would need to quiesce too, but
>>> you have a point about anything used in main thread. We are deep into
>>> bridge_run(), so I'm not sure how we could test for every scenario.
>>>
>>> If we can't guarantee it, then maybe another approach is needed, perhaps
>>> we could hijack the ovs_vhost thread mechanism to call the unregister?
>>> But I'm not sure if there are other implications to doing it asynchronously.
>> 
>> 
>> This callback is called by netdev_unref(), which is called for example by 
>> netdev_close(). netdev_close() is called all over the place which makes it 
>> unsafe to just go through quiesce state. I guess we need another way to fix 
>> this.
>> 
>>>> Thanks,
>>>>
>>>> Eelco
>>>>
>>>>>> Fixes: afee281 ("netdev-dpdk: Fix dpdk_watchdog failure to quiesce.")
>>>>>>
>>>>>> Signed-off-by: Xinxin Zhao <[email protected]>
>>>>>> ---
>>>>>>  lib/netdev-dpdk.c | 11 ++++++++++-
>>>>>>  1 file changed, 10 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>>>>>> index 02cef6e45..0c02357f5 100644
>>>>>> --- a/lib/netdev-dpdk.c
>>>>>> +++ b/lib/netdev-dpdk.c
>>>>>> @@ -1808,7 +1808,16 @@ dpdk_vhost_driver_unregister(struct netdev_dpdk 
>>>>>> *dev OVS_UNUSED,
>>>>>>      OVS_EXCLUDED(dpdk_mutex)
>>>>>>      OVS_EXCLUDED(dev->mutex)
>>>>>>  {
>>>>>> -    return rte_vhost_driver_unregister(vhost_id);
>>>>>> +    int ret;
>>>>>> +    /* Due to the rcu wait of the vhost-event thread,
>>>>>> +     * rte_vhost_driver_unregister() may loop endlessly.
>>>>>> +     * So the unregister action needs to be removed from the rcu_list.
>>>>>> +     */
>>>>>> +    ovsrcu_quiesce_start();
>>>>>> +    ret = rte_vhost_driver_unregister(vhost_id);
>>>>>> +    ovsrcu_quiesce_end();
>>>>>> +
>>>>>> +    return ret;
>>>>>>  }
>>>>>>
>>>>>>  static void
>>>>>
>>>>
>> 