Hi Maxime,

> -----Original Message-----
> From: Maxime Coquelin <maxime.coque...@redhat.com>
> Sent: Friday, April 16, 2021 2:35 PM
> To: Hu, Jiayu <jiayu...@intel.com>; dev@dpdk.org
> Cc: Xia, Chenbo <chenbo....@intel.com>; Wang, Yinan
> <yinan.w...@intel.com>; Pai G, Sunil <sunil.pa...@intel.com>; Jiang, Cheng1
> <cheng1.ji...@intel.com>
> Subject: Re: [PATCH v2 3/4] vhost: avoid deadlock on async register
> 
> Hi Jiayu,
> 
> On 4/16/21 4:19 AM, Hu, Jiayu wrote:
> > Hi Maxime,
> >>>> On 4/14/21 3:40 AM, Hu, Jiayu wrote:
> >>>>> Hi Maxime,
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Maxime Coquelin <maxime.coque...@redhat.com>
> >>>>>> Sent: Tuesday, April 13, 2021 7:33 PM
> >>>>>> To: Hu, Jiayu <jiayu...@intel.com>; dev@dpdk.org
> >>>>>> Cc: Xia, Chenbo <chenbo....@intel.com>; Wang, Yinan
> >>>>>> <yinan.w...@intel.com>; Pai G, Sunil <sunil.pa...@intel.com>; Jiang,
> >>>> Cheng1
> >>>>>> <cheng1.ji...@intel.com>
> >>>>>> Subject: Re: [PATCH v2 3/4] vhost: avoid deadlock on async register
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 4/2/21 3:04 PM, Jiayu Hu wrote:
> >>>>>>> Users can register async copy device in vring_state_changed(),
> >>>>>>> when vhost queue is enabled. However, a deadlock occurs inside
> >>>>>>> rte_vhost_async_channel_register(), if
> >>>>>> VHOST_USER_F_PROTOCOL_FEATURES
> >>>>>>> is not supported, as vhost_user_msg_handler() takes vq-
> >access_lock
> >>>>>>> before calling vhost_user_set_vring_kick().
> >>>>>>>
> >>>>>>> This patch avoids async register deadlock by removing calling
> >>>>>>> vring_state_changed() in vhost_user_set_vring_kick(). It's safe
> >>>>>>> as vhost_user_msg_handler() will call vring_state_changed()
> anyway.
> >>>>>>>
> >>>>>>> Signed-off-by: Jiayu Hu <jiayu...@intel.com>
> >>>>>>> ---
> >>>>>>>  lib/librte_vhost/vhost_user.c | 3 ---
> >>>>>>>  1 file changed, 3 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/lib/librte_vhost/vhost_user.c
> >> b/lib/librte_vhost/vhost_user.c
> >>>>>>> index 44c0452..8f0eba6 100644
> >>>>>>> --- a/lib/librte_vhost/vhost_user.c
> >>>>>>> +++ b/lib/librte_vhost/vhost_user.c
> >>>>>>> @@ -1918,9 +1918,6 @@ vhost_user_set_vring_kick(struct
> virtio_net
> >>>>>> **pdev, struct VhostUserMsg *msg,
> >>>>>>>        */
> >>>>>>>       if (!(dev->features & (1ULL <<
> >>>>>> VHOST_USER_F_PROTOCOL_FEATURES))) {
> >>>>>>>               vq->enabled = true;
> >>>>>>> -             if (dev->notify_ops->vring_state_changed)
> >>>>>>> -                     dev->notify_ops->vring_state_changed(
> >>>>>>> -                             dev->vid, file.index, 1);
> >>>>>>>       }
> >>>>>>>
> >>>>>>>       if (vq->ready) {
> >>>>>>>
> >>>>>>
> >>>>>> As replied earlier on v1, I agree the call to vring_state_changed here
> >>>>>> is not needed. But it might not be enough, there are other cases
> where
> >>>>>> you could have issues.
> >>>>>
> >>>>> vhost_user_notify_queue_state() can be called in three cases:
> >>>>> 1. when vq ready status changes, vhost_user_msg_handler() calls it to
> >>>> notify
> >>>>> backend. But vhost_user_msg_handler() doesn't take lock before
> calling
> >> it.
> >>>>> So in this case, no deadlock occurs in async register.
> >>>>>
> >>>>> 2. if vq->ready is true, vhost_user_set_vring_call() calls it to notify
> >> backend
> >>>>> vq is not enabled. Although vhost_user_set_vring_call() is protected by
> >> lock,
> >>>>> async register is called only if vq is enabled, so async register will 
> >>>>> not
> be
> >>>> called
> >>>>> in this case.
> >>>>>
> >>>>> 3. If vq->ready is true, vhost_user_set_vring_kick() calls it to notify
> >> backend
> >>>>> vq is not enabled. Same as #2, async register is called only when vq is
> >>>> enabled.
> >>>>> Even if vhost_user_set_vring_kick() is protected by lock, there is no
> >>>> deadlock in
> >>>>> async register, as it will not be called in this case.
> >>>>>
> >>>>> In summary,  I think there is no deadlock issue in async register if we
> >>>>> can remove calling vring_state_change() in
> vhost_user_set_vring_kick().
> >>>>
> >>>>
> >>>> But unregister one could be called in theory no? Otherwise it would
> look
> >>>> unbalanced. At least on disabled notification, the app should make sure
> >>>> the DMA transfers to and from the vring are stopped before it returns
> >>>> from the callabck. Otherwise it could lead to undefined behavior.
> >>>
> >>> Right, users need to call unregister, but we cannot remove calling
> >>> vhost_user_notify_queue_state() in case #2 and #3, IMHO. So to
> >>> avoid deadlock, we recommended users to call async unregister in
> >>> destroy_device(), instead of on vring disabled notification. Does it
> >>> make sense to you?
> >>
> >> Calling async unregister in destroy device is fine by me. But I'm more
> >> concerned about DMA transations not being stopped when the ring
> becomes
> >> disabled.
> > If ring becomes disabled, no more new pkts go into async data path, as
> > virtio_dev_rx_async_submit() returns if vq->enabled is false.
> >
> >>
> >> I cannot say if you are doing it right, because the vhost example does
> >> not implement the vring_state_changed callback.
> >> It is not a problem with the sync datapath as we have the lock
> >> protection + enabled variable that prevents to process the rings when it
> >> gets stopped.
> >> But for the async path, if you have programmed DMA transfers, you need
> >> to rely on the vring_state_change to block the control path while the
> >> transfers are cancelled or done.
> >
> > I am not sure if I understand your concern correctly, but for async data
> path,
> > enable variable can prevent it from enqueue new pkts when ring is
> disabled, as
> > virtio_dev_rx_async_submit() check enable variable before processing ring;
> > in addition, lock protection can stop async data path as
> virtio_dev_rx_async_submit()
> > acquires lock too. For example, in the case that front-end updates kickfd,
> > set_vring_kick() will notify backend that vring is stopped by calling
> > vhost_user_notify_queue_state(). In this case, sync data path will stop as a
> result of
> > lock protection, and I think it's the same for async data path, as it 
> > acquires
> lock before
> > processing ring.
> 
> 
> What about memory hotplug happening while the DMA transfers are on-
> going
> for example?
> 
> In this case, the lock is enough for sync datapath, but is not for async
> one.
> 
> If a VHOST_USER_SET_MEM_TABLE request is received while
> virtio_dev_rx_async_submit(), it will be blocked until
> virtio_dev_rx_async_submit() returns but the DMA transfers may not be
> finished yet.
> When unblocked the control thread will call vhost_user_set_mem_table(),
> which will mnumap the current memory regions before mmaping the new
> ones while DMA transfers are on-going.
> 
> To fix this, we need to call the vring_state_changed callback for all
> the virtqueues with disable state. in your app, you need to stop DMA
> transfers when disabled state notification happens, or block the
> callback while the transfer is done.

Let me summarize the problem:
when frontend memory is hot-plug, host application needs to stop
DMA transfer for all virtqueues, which is done by emptying in-flight
DMA copies and unregister async inside vring_state_changed(). But
vhost library takes lock before entering vring_state_changed(), and
async unregister and rte_vhost_poll_enqueue_completed() both
acquire lock, so deadlock occurs inside rte_vhost_poll_enqueue_
completed() or async unregister. (please correct me if am wrong)

There may be two possible solutions:
1. remove lock before calling vring_state_change() in vhost library;
2. provide a new callback without acquiring lock for DMA accelerated
case. The callback is used to notify backend that you need to stop
DMA transfer.

Thanks,
Jiayu

Reply via email to