Hi Maxime,

> -----Original Message-----
> From: Maxime Coquelin <maxime.coque...@redhat.com>
> Sent: Friday, April 16, 2021 2:35 PM
> To: Hu, Jiayu <jiayu...@intel.com>; dev@dpdk.org
> Cc: Xia, Chenbo <chenbo....@intel.com>; Wang, Yinan <yinan.w...@intel.com>;
> Pai G, Sunil <sunil.pa...@intel.com>; Jiang, Cheng1 <cheng1.ji...@intel.com>
> Subject: Re: [PATCH v2 3/4] vhost: avoid deadlock on async register
>
> Hi Jiayu,
>
> On 4/16/21 4:19 AM, Hu, Jiayu wrote:
> > Hi Maxime,
> >>>> On 4/14/21 3:40 AM, Hu, Jiayu wrote:
> >>>>> Hi Maxime,
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Maxime Coquelin <maxime.coque...@redhat.com>
> >>>>>> Sent: Tuesday, April 13, 2021 7:33 PM
> >>>>>> To: Hu, Jiayu <jiayu...@intel.com>; dev@dpdk.org
> >>>>>> Cc: Xia, Chenbo <chenbo....@intel.com>; Wang, Yinan <yinan.w...@intel.com>;
> >>>>>> Pai G, Sunil <sunil.pa...@intel.com>; Jiang, Cheng1 <cheng1.ji...@intel.com>
> >>>>>> Subject: Re: [PATCH v2 3/4] vhost: avoid deadlock on async register
> >>>>>>
> >>>>>> On 4/2/21 3:04 PM, Jiayu Hu wrote:
> >>>>>>> Users can register async copy device in vring_state_changed(),
> >>>>>>> when vhost queue is enabled. However, a deadlock occurs inside
> >>>>>>> rte_vhost_async_channel_register(), if VHOST_USER_F_PROTOCOL_FEATURES
> >>>>>>> is not supported, as vhost_user_msg_handler() takes vq->access_lock
> >>>>>>> before calling vhost_user_set_vring_kick().
> >>>>>>>
> >>>>>>> This patch avoids async register deadlock by removing calling
> >>>>>>> vring_state_changed() in vhost_user_set_vring_kick(). It's safe
> >>>>>>> as vhost_user_msg_handler() will call vring_state_changed() anyway.
> >>>>>>>
> >>>>>>> Signed-off-by: Jiayu Hu <jiayu...@intel.com>
> >>>>>>> ---
> >>>>>>>  lib/librte_vhost/vhost_user.c | 3 ---
> >>>>>>>  1 file changed, 3 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
> >>>>>>> index 44c0452..8f0eba6 100644
> >>>>>>> --- a/lib/librte_vhost/vhost_user.c
> >>>>>>> +++ b/lib/librte_vhost/vhost_user.c
> >>>>>>> @@ -1918,9 +1918,6 @@ vhost_user_set_vring_kick(struct virtio_net **pdev, struct VhostUserMsg *msg,
> >>>>>>>  	 */
> >>>>>>>  	if (!(dev->features & (1ULL << VHOST_USER_F_PROTOCOL_FEATURES))) {
> >>>>>>>  		vq->enabled = true;
> >>>>>>> -		if (dev->notify_ops->vring_state_changed)
> >>>>>>> -			dev->notify_ops->vring_state_changed(
> >>>>>>> -				dev->vid, file.index, 1);
> >>>>>>>  	}
> >>>>>>>
> >>>>>>>  	if (vq->ready) {
> >>>>>>>
> >>>>>>
> >>>>>> As replied earlier on v1, I agree the call to vring_state_changed here
> >>>>>> is not needed. But it might not be enough, there are other cases where
> >>>>>> you could have issues.
> >>>>>
> >>>>> vhost_user_notify_queue_state() can be called in three cases:
> >>>>> 1. when vq ready status changes, vhost_user_msg_handler() calls it to
> >>>>> notify backend. But vhost_user_msg_handler() doesn't take lock before
> >>>>> calling it. So in this case, no deadlock occurs in async register.
> >>>>>
> >>>>> 2. if vq->ready is true, vhost_user_set_vring_call() calls it to notify
> >>>>> backend vq is not enabled. Although vhost_user_set_vring_call() is
> >>>>> protected by lock, async register is called only if vq is enabled, so
> >>>>> async register will not be called in this case.
> >>>>>
> >>>>> 3. If vq->ready is true, vhost_user_set_vring_kick() calls it to notify
> >>>>> backend vq is not enabled.
> >>>>> Same as #2, async register is called only when vq is enabled. Even if
> >>>>> vhost_user_set_vring_kick() is protected by lock, there is no deadlock
> >>>>> in async register, as it will not be called in this case.
> >>>>>
> >>>>> In summary, I think there is no deadlock issue in async register if we
> >>>>> can remove calling vring_state_changed() in vhost_user_set_vring_kick().
> >>>>
> >>>> But unregister one could be called in theory no? Otherwise it would look
> >>>> unbalanced. At least on disabled notification, the app should make sure
> >>>> the DMA transfers to and from the vring are stopped before it returns
> >>>> from the callback. Otherwise it could lead to undefined behavior.
> >>>
> >>> Right, users need to call unregister, but we cannot remove calling
> >>> vhost_user_notify_queue_state() in case #2 and #3, IMHO. So to avoid
> >>> deadlock, we recommend users to call async unregister in
> >>> destroy_device(), instead of on vring disabled notification. Does it
> >>> make sense to you?
> >>
> >> Calling async unregister in destroy device is fine by me. But I'm more
> >> concerned about DMA transactions not being stopped when the ring becomes
> >> disabled.
> >
> > If the ring becomes disabled, no more new pkts go into the async data path,
> > as virtio_dev_rx_async_submit() returns if vq->enabled is false.
> >
> >> I cannot say if you are doing it right, because the vhost example does
> >> not implement the vring_state_changed callback.
> >> It is not a problem with the sync datapath as we have the lock
> >> protection + enabled variable that prevents processing the rings when a
> >> ring gets stopped.
> >> But for the async path, if you have programmed DMA transfers, you need
> >> to rely on vring_state_changed to block the control path while the
> >> transfers are cancelled or done.
> >
> > I am not sure if I understand your concern correctly, but for the async data
> > path, the enabled variable can prevent it from enqueuing new pkts when the
> > ring is disabled, as virtio_dev_rx_async_submit() checks the enabled variable
> > before processing the ring; in addition, lock protection can stop the async
> > data path as virtio_dev_rx_async_submit() acquires the lock too. For example,
> > in the case that the front-end updates the kickfd, set_vring_kick() will
> > notify the backend that the vring is stopped by calling
> > vhost_user_notify_queue_state(). In this case, the sync data path will stop
> > as a result of lock protection, and I think it's the same for the async data
> > path, as it acquires the lock before processing the ring.
>
> What about memory hotplug happening while the DMA transfers are on-going,
> for example?
>
> In this case, the lock is enough for the sync datapath, but it is not for
> the async one.
>
> If a VHOST_USER_SET_MEM_TABLE request is received while
> virtio_dev_rx_async_submit() is running, it will be blocked until
> virtio_dev_rx_async_submit() returns, but the DMA transfers may not be
> finished yet. When unblocked, the control thread will call
> vhost_user_set_mem_table(), which will munmap the current memory regions
> before mmaping the new ones while DMA transfers are still on-going.
>
> To fix this, we need to call the vring_state_changed callback for all
> the virtqueues with disabled state. In your app, you need to stop DMA
> transfers when the disabled state notification happens, or block in the
> callback until the transfers are done.
Let me summarize the problem: when frontend memory is hot-plugged, the host
application needs to stop DMA transfers for all virtqueues, which is done by
emptying the in-flight DMA copies and unregistering the async channel inside
vring_state_changed(). But the vhost library takes the lock before entering
vring_state_changed(), and async unregister and rte_vhost_poll_enqueue_completed()
both acquire the lock, so a deadlock occurs inside
rte_vhost_poll_enqueue_completed() or async unregister. (Please correct me if
I am wrong.)

There may be two possible solutions:
1. remove the lock before calling vring_state_changed() in the vhost library;
2. provide a new callback, invoked without acquiring the lock, for the DMA
accelerated case. The callback is used to notify the backend that it needs to
stop DMA transfers.
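To make the locking problem concrete, below is a rough, untested sketch of the
application-side vring_state_changed() callback I have in mind (only an
illustration, not part of the patch). It assumes the current async APIs
rte_vhost_poll_enqueue_completed() and rte_vhost_async_channel_unregister();
stop_dma_transfers() is a hypothetical placeholder for application-specific
logic, not a vhost API. With today's locking, both library calls in the
disable path try to take vq->access_lock, which the library already holds
while invoking the callback, hence the deadlock:

#include <rte_mbuf.h>
#include <rte_vhost.h>
#include <rte_vhost_async.h>

#define COMPLETED_BURST 512

/* Hypothetical placeholder for application-specific logic that cancels
 * or waits for the DMA jobs still in flight for this virtqueue. */
static void
stop_dma_transfers(int vid, uint16_t queue_id)
{
	(void)vid;
	(void)queue_id;
}

static int
app_vring_state_changed(int vid, uint16_t queue_id, int enable)
{
	struct rte_mbuf *pkts[COMPLETED_BURST];
	uint16_t n;

	if (enable)
		return 0;

	/* Ring is being disabled (e.g. frontend memory hotplug): stop new
	 * DMA submissions and drain what has already been programmed,
	 * before the library remaps guest memory. */
	stop_dma_transfers(vid, queue_id);

	do {
		/* Deadlocks today: the library holds vq->access_lock while
		 * invoking this callback, and this API takes it again. */
		n = rte_vhost_poll_enqueue_completed(vid, queue_id, pkts,
						     COMPLETED_BURST);
		rte_pktmbuf_free_bulk(pkts, n);
	} while (n > 0);

	/* Same issue: unregister also acquires vq->access_lock. */
	rte_vhost_async_channel_unregister(vid, queue_id);

	return 0;
}

static const struct vhost_device_ops app_notify_ops = {
	.vring_state_changed = app_vring_state_changed,
};

The callback would be hooked via rte_vhost_driver_callback_register() as
usual. With solution 1, the code above could stay as it is; with solution 2,
the drain and unregister would move into the new lock-free callback (or the
unregister could stay in destroy_device(), as discussed above).

Thanks,
Jiayu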