> -----Original Message----- > From: Maxime Coquelin [mailto:maxime.coque...@redhat.com] > Sent: Tuesday, March 6, 2018 5:27 PM > To: Kulasek, TomaszX <tomaszx.kula...@intel.com>; y...@fridaylinux.org > Cc: Verkamp, Daniel <daniel.verk...@intel.com>; Harris, James R > <james.r.har...@intel.com>; Wodkowski, PawelX > <pawelx.wodkow...@intel.com>; email@example.com; Stojaczyk, DariuszX > <dariuszx.stojac...@intel.com> > Subject: Re: [dpdk-dev] [PATCH] vhost: stop device before updating public > vring > data > > Hi Tomasz, > > On 03/05/2018 05:11 PM, Tomasz Kulasek wrote: > > For now DPDK assumes that callfd, kickfd and last_idx are being set just > > once during vring initialization and device cannot be running while DPDK > > receives SET_VRING_KICK, SET_VRING_CALL and SET_VRING_BASE messages. > > However, that assumption is wrong. For Vhost SCSI messages might arrive > > at any point of time, possibly multiple times, one after another. > > > > QEMU issues SET_VRING_CALL once during device initialization, then again > > during device start. The second message will close previous callfd, > > which is still being used by the user-implementation of vhost device. > > This results in writing to invalid (closed) callfd. > > > > Other messages like SET_FEATURES, SET_VRING_ADDR etc also will change > > internal state of VQ or device. To prevent race condition device should > > also be stopped before updateing vring data. > > > > Signed-off-by: Dariusz Stojaczyk<dariuszx.stojac...@intel.com> > > Signed-off-by: Pawel Wodkowski<pawelx.wodkow...@intel.com> > > Signed-off-by: Tomasz Kulasek<tomaszx.kula...@intel.com> > > --- > > lib/librte_vhost/vhost_user.c | 40 > ++++++++++++++++++++++++++++++++++++++++ > > 1 file changed, 40 insertions(+) > > In last release, we have introduced a per-virtqueue lock to protect > vring handling against asynchronous device changes. > > I think that would solve the issue you are facing, but you would need > to export the VQs locking functions to the vhost-user lib API to be > able to use it. > > I don't think your current patch is the right solution anyway, because > it destroys the device in case we don't want it to remain alive, like > set_log_base, or set_features when only the logging feature gets > enabled.
Please correct me if I can't see something obvious, but how this lock protect against eg SET_MEM_TABLE message? Current flow you are thinking of is: DPDK vhost-user thread 1.1. vhost_user_lock_all_queue_pairs() 1.2. vhost_user_set_mem_table() 1.3. vhost_user_unlock_all_queue_pairs() BACKEND: virito-net: 2.1. rte_spinlock_lock(&vq->access_lock); 2.2. Process vrings and copy all data 2.3. rte_spinlock_unlock(&vq->access_lock); Yes, it will synchronize access to virtio_net structure but what if the BACKEND is in zero copy mode and/or pass buffers to physical device? The request will not end in 2.2 and you unmap the memory regions in the middle of request. Even worse, the physical device will just abort the request but BACKEND can segfault or write random memory because BACKEND try to use invalid memory address (retrieved at request start). To use this per-virtqueue lock: 1. the lock need to be held from request start to the end - but this can starve DPDK vhost-user thread as there might be many request on-the-fly and when one is done the new one might be started. 2. Becouse we don't know if something changed between requst start and request end BACKEND need walk through all descriptors chain at the request end and do the rte_vhost_gpa_to_vva() again. The SET_MEM_TABLE is most obvious message but the same is true for other like VHOST_IOTLB_INVALIDATE or SET_FEATURES. Pawel > > Cheers, > Maxime