Re: [PATCH v4 5/7] vhost-vdpa: clean iotlb map during reset for older userspace
On Sat, Oct 21, 2023 at 5:28 PM Si-Wei Liu wrote: > > Using .compat_reset op from the previous patch, the buggy .reset > behaviour can be kept as-is on older userspace apps, which don't ack the > IOTLB_PERSIST backend feature. As this compatibility quirk is limited to > those drivers that used to be buggy in the past, it won't affect change > the behaviour or affect ABI on the setups with API compliant driver. > > The separation of .compat_reset from the regular .reset allows > vhost-vdpa able to know which driver had broken behaviour before, so it > can apply the corresponding compatibility quirk to the individual driver > whenever needed. Compared to overloading the existing .reset with > flags, .compat_reset won't cause any extra burden to the implementation > of every compliant driver. > > Signed-off-by: Si-Wei Liu > --- > drivers/vhost/vdpa.c | 17 + > drivers/virtio/virtio_vdpa.c | 2 +- > include/linux/vdpa.h | 7 +-- > 3 files changed, 19 insertions(+), 7 deletions(-) > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c > index acc7c74ba7d6..9ce40003793b 100644 > --- a/drivers/vhost/vdpa.c > +++ b/drivers/vhost/vdpa.c > @@ -227,13 +227,22 @@ static void vhost_vdpa_unsetup_vq_irq(struct vhost_vdpa > *v, u16 qid) > irq_bypass_unregister_producer(&vq->call_ctx.producer); > } > > -static int vhost_vdpa_reset(struct vhost_vdpa *v) > +static int _compat_vdpa_reset(struct vhost_vdpa *v) > { > struct vdpa_device *vdpa = v->vdpa; > + u32 flags = 0; > > - v->in_batch = 0; > + flags |= !vhost_backend_has_feature(v->vdev.vqs[0], > + VHOST_BACKEND_F_IOTLB_PERSIST) ? 
> +VDPA_RESET_F_CLEAN_MAP : 0; > + > + return vdpa_reset(vdpa, flags); > +} > > - return vdpa_reset(vdpa); > +static int vhost_vdpa_reset(struct vhost_vdpa *v) > +{ > + v->in_batch = 0; > + return _compat_vdpa_reset(v); > } > > static long vhost_vdpa_bind_mm(struct vhost_vdpa *v) > @@ -312,7 +321,7 @@ static long vhost_vdpa_set_status(struct vhost_vdpa *v, > u8 __user *statusp) > vhost_vdpa_unsetup_vq_irq(v, i); > > if (status == 0) { > - ret = vdpa_reset(vdpa); > + ret = _compat_vdpa_reset(v); > if (ret) > return ret; > } else > diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c > index 06ce6d8c2e00..8d63e5923d24 100644 > --- a/drivers/virtio/virtio_vdpa.c > +++ b/drivers/virtio/virtio_vdpa.c > @@ -100,7 +100,7 @@ static void virtio_vdpa_reset(struct virtio_device *vdev) > { > struct vdpa_device *vdpa = vd_get_vdpa(vdev); > > - vdpa_reset(vdpa); > + vdpa_reset(vdpa, 0); > } > > static bool virtio_vdpa_notify(struct virtqueue *vq) > diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h > index 6b8cbf75712d..db15ac07f8a6 100644 > --- a/include/linux/vdpa.h > +++ b/include/linux/vdpa.h > @@ -519,14 +519,17 @@ static inline struct device *vdpa_get_dma_dev(struct > vdpa_device *vdev) > return vdev->dma_dev; > } > > -static inline int vdpa_reset(struct vdpa_device *vdev) > +static inline int vdpa_reset(struct vdpa_device *vdev, u32 flags) > { > const struct vdpa_config_ops *ops = vdev->config; > int ret; > > down_write(&vdev->cf_lock); > vdev->features_valid = false; > - ret = ops->reset(vdev); > + if (ops->compat_reset && flags) > + ret = ops->compat_reset(vdev, flags); > + else > + ret = ops->reset(vdev); Instead of inventing a new API that carries the flags. Tweak the existing one seems to be simpler and better? 
As compat_reset(vdev, 0) == reset(vdev) Then you don't need the switch in the parent as well +static int vdpasim_reset(struct vdpa_device *vdpa) +{ + return vdpasim_compat_reset(vdpa, 0); +} Thanks > up_write(&vdev->cf_lock); > return ret; > } > -- > 2.39.3 > ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [RFC] vhost: vmap virtio descriptor table kernel/kvm
On Tue, Oct 24, 2023 at 11:17 AM Liang Chen wrote: > > The current vhost code uses 'copy_from_user' to copy descriptors from > userspace to vhost. We attempted to 'vmap' the descriptor table to > reduce the overhead associated with 'copy_from_user' during descriptor > access, because it can be accessed quite frequently. This change > resulted in a moderate performance improvement (approximately 3%) in > our pktgen test, as shown below. Additionally, the latency in the > 'vhost_get_vq_desc' function shows a noticeable decrease. > > current code: > IFACE rxpck/s txpck/srxkB/stxkB/s > rxcmp/s txcmp/s rxmcst/s %ifutil > Average:vnet0 0.31 1330807.03 0.02 77976.98 > 0.00 0.00 0.00 6.39 > # /usr/share/bcc/tools/funclatency -d 10 vhost_get_vq_desc > avg = 145 nsecs, total: 1455980341 nsecs, count: 9985224 > > vmap: > IFACE rxpck/s txpck/srxkB/stxkB/s > rxcmp/s txcmp/s rxmcst/s %ifutil > Average:vnet0 0.07 1371870.49 0.00 80383.04 > 0.00 0.00 0.00 6.58 > # /usr/share/bcc/tools/funclatency -d 10 vhost_get_vq_desc > avg = 122 nsecs, total: 1286983929 nsecs, count: 10478134 > > We are uncertain if there are any aspects we may have overlooked and > would appreciate any advice before we submit an actual patch. So the idea is to use a shadow page table instead of the userspace one to avoid things like spec barriers or SMAP. I've tried this in the past: commit 7f466032dc9e5a61217f22ea34b2df932786bbfc (HEAD) Author: Jason Wang Date: Fri May 24 04:12:18 2019 -0400 vhost: access vq metadata through kernel virtual address It was noticed that the copy_to/from_user() friends that was used to access virtqueue metdata tends to be very expensive for dataplane implementation like vhost since it involves lots of software checks, speculation barriers, hardware feature toggling (e.g SMAP). The extra cost will be more obvious when transferring small packets since the time spent on metadata accessing become more significant. ... 
Note that it tries to use a direct map instead of a vmap, as Andrea suggested. But it led to several fallouts that were tricky to fix[1] (like the use of MMU notifiers to do synchronization). So it was finally reverted. I'm not saying it's a dead end. But we need to find a way to solve the issues or use something different. I'm happy to offer help. 1) Avoid using SMAP for the vhost kthread, for example using a sched notifier; I'm not sure whether this is possible 2) A new iov iterator that doesn't do SMAP at all; this looks dangerous and Al might not like it 3) (Re)using HMM ... You may want to see the archives for more information. We've had a lot of discussions. Thanks [1] https://lore.kernel.org/lkml/20190731084655.7024-1-jasow...@redhat.com/ > Thanks, > Liang
[PATCH vhost 2/2] virtio: set broken when re-enabling vq fails
In vp_modern_enable_vq_after_reset, we do some checks to ensure that the vq is ready to re-enable. If those checks fail, the vq is still good. If vp_active_vq() fails, the vq is broken: the driver can no longer use it, so this commit sets the vq to broken. Signed-off-by: Xuan Zhuo --- drivers/virtio/virtio_pci_modern.c | 6 +++--- drivers/virtio/virtio_ring.c | 11 ++++++++--- include/linux/virtio_config.h | 2 ++ 3 files changed, 13 insertions(+), 6 deletions(-) diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c index ee6a386d250b..56a4075ca5fb 100644 --- a/drivers/virtio/virtio_pci_modern.c +++ b/drivers/virtio/virtio_pci_modern.c @@ -286,16 +286,16 @@ static int vp_modern_enable_vq_after_reset(struct virtqueue *vq) int err; if (!vq->reset) - return -EBUSY; + return -EINVAL; index = vq->index; info = vp_dev->vqs[index]; if (vp_modern_get_queue_reset(mdev, index)) - return -EBUSY; + return -EINVAL; if (vp_modern_get_queue_enable(mdev, index)) - return -EBUSY; + return -EINVAL; err = vp_active_vq(vq, info->msix_vector); if (err) diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c index af310418e66e..91e63c57c112 100644 --- a/drivers/virtio/virtio_ring.c +++ b/drivers/virtio/virtio_ring.c @@ -2183,11 +2183,16 @@ static int virtqueue_enable_after_reset(struct virtqueue *_vq) { struct vring_virtqueue *vq = to_vvq(_vq); struct virtio_device *vdev = vq->vq.vdev; + int err; + + err = vdev->config->enable_vq_after_reset(_vq); + if (err == -EINVAL || !err) + return err; - if (vdev->config->enable_vq_after_reset(_vq)) - return -EBUSY; + dev_warn(&vdev->dev, "Failed to re-enable the vq.%u error:%d\n", _vq->index, err); + __virtqueue_break(_vq); - return 0; + return err; } /* diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h index 2b3438de2c4d..f96bebf9b632 100644 --- a/include/linux/virtio_config.h +++ b/include/linux/virtio_config.h @@ -91,6 +91,8 @@ typedef void vq_callback_t(struct virtqueue *); *
@enable_vq_after_reset: enable a reset queue * vq: the virtqueue * Returns 0 on success or error status + * -EINVAL: the vq is not in the reset status or is not ready to enable. + * Other error: enabling vq fails. The vq is in broken status. * If disable_vq_and_reset is set, then enable_vq_after_reset must also be * set. */ -- 2.32.0.3.g01195cf9f
[PATCH vhost 1/2] virtio_ring: remove unused code
Remove the ignored return values. Signed-off-by: Xuan Zhuo --- drivers/virtio/virtio_ring.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c index 51d8f3299c10..af310418e66e 100644 --- a/drivers/virtio/virtio_ring.c +++ b/drivers/virtio/virtio_ring.c @@ -2755,9 +2755,9 @@ int virtqueue_resize(struct virtqueue *_vq, u32 num, return err; if (vq->packed_ring) - err = virtqueue_resize_packed(_vq, num); + virtqueue_resize_packed(_vq, num); else - err = virtqueue_resize_split(_vq, num); + virtqueue_resize_split(_vq, num); return virtqueue_enable_after_reset(_vq); } -- 2.32.0.3.g01195cf9f
[PATCH vhost 0/2] virtio: set broken when re-enable vq fails
In vp_modern_enable_vq_after_reset, we do some checks to ensure that the vq is in reset status. If those checks fail, the vq is still good. If vp_active_vq() fails, the vq is broken: the driver can no longer use it, so this series sets the vq to broken. Xuan Zhuo (2): virtio_ring: remove unused code virtio: set broken when re-enabling vq fails drivers/virtio/virtio_pci_modern.c | 6 +++--- drivers/virtio/virtio_ring.c | 15 ++++++++++----- include/linux/virtio_config.h | 2 ++ 3 files changed, 15 insertions(+), 8 deletions(-) -- 2.32.0.3.g01195cf9f
[RFC] vhost: vmap virtio descriptor table kernel/kvm
The current vhost code uses 'copy_from_user' to copy descriptors from userspace to vhost. We attempted to 'vmap' the descriptor table to reduce the overhead associated with 'copy_from_user' during descriptor access, because it can be accessed quite frequently. This change resulted in a moderate performance improvement (approximately 3%) in our pktgen test, as shown below. Additionally, the latency in the 'vhost_get_vq_desc' function shows a noticeable decrease. current code: IFACE rxpck/s txpck/srxkB/stxkB/s rxcmp/s txcmp/s rxmcst/s %ifutil Average:vnet0 0.31 1330807.03 0.02 77976.98 0.00 0.00 0.00 6.39 # /usr/share/bcc/tools/funclatency -d 10 vhost_get_vq_desc avg = 145 nsecs, total: 1455980341 nsecs, count: 9985224 vmap: IFACE rxpck/s txpck/srxkB/stxkB/s rxcmp/s txcmp/s rxmcst/s %ifutil Average:vnet0 0.07 1371870.49 0.00 80383.04 0.00 0.00 0.00 6.58 # /usr/share/bcc/tools/funclatency -d 10 vhost_get_vq_desc avg = 122 nsecs, total: 1286983929 nsecs, count: 10478134 We are uncertain if there are any aspects we may have overlooked and would appreciate any advice before we submit an actual patch. Thanks, Liang ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[RFC] vmap virtio descriptor table
The current vhost code uses 'copy_from_user' to copy descriptors from userspace to vhost. We attempted to 'vmap' the descriptor table to reduce the overhead associated with 'copy_from_user' during descriptor access, because it can be accessed quite frequently. This change resulted in a moderate performance improvement (approximately 3%) in our pktgen test, as shown below. Additionally, the latency in the 'vhost_get_vq_desc' function shows a noticeable decrease. current code: IFACE rxpck/s txpck/srxkB/stxkB/s rxcmp/s txcmp/s rxmcst/s %ifutil Average:vnet0 0.31 1330807.03 0.02 77976.98 0.00 0.00 0.00 6.39 # /usr/share/bcc/tools/funclatency -d 10 vhost_get_vq_desc avg = 145 nsecs, total: 1455980341 nsecs, count: 9985224 kmap: IFACE rxpck/s txpck/srxkB/stxkB/s rxcmp/s txcmp/s rxmcst/s %ifutil Average:vnet0 0.07 1371870.49 0.00 80383.04 0.00 0.00 0.00 6.58 # /usr/share/bcc/tools/funclatency -d 10 vhost_get_vq_desc avg = 122 nsecs, total: 1286983929 nsecs, count: 10478134 We are uncertain if there are any aspects we may have overlooked and would appreciate any advice before we submit an actual patch. Thanks, Liang ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH 1/2] virtio_pci: Don't make an extra copy of cpu affinity mask
On Mon, 23 Oct 2023 18:52:45 +0200, Jakub Sitnicki wrote: > On Thu, Oct 19, 2023 at 08:55 PM +08, Xuan Zhuo wrote: > > On Thu, 19 Oct 2023 12:16:24 +0200, Jakub Sitnicki > > wrote: > >> Since commit 19e226e8cc5d ("virtio: Make vp_set_vq_affinity() take a > >> mask.") it is actually not needed to have a local copy of the cpu mask. > > > > > > Could you give more info to prove this? Actually, my question is that can we pass a val on the stack(or temp value) to the irq_set_affinity_hint()? Such as the virtio-net uses zalloc_cpumask_var to alloc a cpu_mask, and that will be released. int __irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m, bool setaffinity) { unsigned long flags; struct irq_desc *desc = irq_get_desc_lock(irq, &flags, IRQ_GET_DESC_CHECK_GLOBAL); if (!desc) return -EINVAL; -> desc->affinity_hint = m; irq_put_desc_unlock(desc, flags); if (m && setaffinity) __irq_set_affinity(irq, m, false); return 0; } EXPORT_SYMBOL_GPL(__irq_apply_affinity_hint); The above code directly refers the mask pointer. If the mask is a temp value, I think that is a bug. And I notice that many places directly pass the temp value to this API. And I am a little confused. ^_^ Or I missed something. Thanks. > > > > If you are right, I think you should delete all code about > > msix_affinity_masks? > > Sorry for the late reply. I've been away. > > It looks that msix_affinity_masks became unused - intentionally - in > 2015, after commit 210d150e1f5d ("virtio_pci: Clear stale cpumask when > setting irq affinity") [1]. > > Originally introduced in 2012 in commit 75a0a52be3c2 ("virtio: introduce > an API to set affinity for a virtqueue") [2]. As I understand, it was > meant to make it possible to set VQ affinity to more than once CPU. > > Now that we can pass a CPU mask, listing all CPUs, to set_vq_affinity, > msix_affinity_masks seems to no longer have a purpose. > > So, IMO, you're right. We can remove it. > > Happy to do that in a follow up series. 
> > That is - if you're okay with these two patches in the current form. > > Thanks for reviewing. > > [1] > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=210d150e1f5da506875e376422ba31ead2d49621 > [2] > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=75a0a52be3c27b58654fbed2c8f2ff401482b9a4
Re: [PATCH v2] ALSA: virtio: use copy and fill_silence callbacks
Hi Takashi, On 19.10.2023 16:48, Takashi Iwai wrote: On Thu, 19 Oct 2023 03:20:19 +0200, Anton Yakovlev wrote: Hi Takashi, On 19.10.2023 03:07, Takashi Iwai wrote: On Wed, 18 Oct 2023 12:48:23 +0200, Matias Ezequiel Vara Larsen wrote: This commit replaces the mmap mechanism with the copy() and fill_silence() callbacks for both capturing and playback for the virtio-sound driver. This change is required to prevent the updating of the content of a buffer that is already in the available ring. The current mechanism splits a dma buffer into descriptors that are exposed to the device. This dma buffer is shared with the user application. When the device consumes a buffer, the driver moves the request from the used ring to available ring. The driver exposes the buffer to the device without knowing if the content has been updated from the user. The section 2.8.21.1 of the virtio spec states that: "The device MAY access the descriptor chains the driver created and the memory they refer to immediately". If the device picks up buffers from the available ring just after it is notified, it happens that the content may be old. By providing the copy() callback, the driver first updates the content of the buffer, and then, exposes the buffer to the device by enqueuing it in the available ring. Thus, device always picks up a buffer that is updated. During copy(), the number of requests enqueued depends on the "pos" and "bytes" arguments. The length of each request is period_size bytes. For capturing, the driver starts by exposing all the available buffers to device. After device updates the content of a buffer, it enqueues it in the used ring. It is only after the copy() for capturing is issued that the driver re-enqueues the buffer in the available ring. Co-developed-by: Anton Yakovlev Signed-off-by: Matias Ezequiel Vara Larsen --- Changelog: v1 -> v2: * Use snd_pcm_set_managed_buffer_all()for buffer allocation/freeing. 
* Make virtsnd_pcm_msg_send() generic by specifying the offset and size for the modified part of the buffer; this way no assumptions need to be made. * Disable SNDRV_PCM_INFO_NO_REWINDS since now only sequential reading/writing of frames is supported. * Correct comment at virtsnd_pcm_msg_send(). * v1 patch at: https://ddec1-0-en-ctp.trendmicro.com:443/wis/clicktime/v1/query?url=https%3a%2f%2flore.kernel.org%2flkml%2f20231016151000.GE119987%40fedora%2ft%2f&umid=323acbff-2d10-45a8-bbe1-78fc8583abec&auth=53c7c7de28b92dfd96e93d9dd61a23e634d2fbec-6526c5e588ae6e2c247d0c967e0504e4fc7f307a sound/virtio/virtio_pcm.c | 7 ++- sound/virtio/virtio_pcm.h | 9 ++-- sound/virtio/virtio_pcm_msg.c | 93 ++- sound/virtio/virtio_pcm_ops.c | 81 +- 4 files changed, 137 insertions(+), 53 deletions(-) Most of the code changes look good, but I wonder: diff --git a/sound/virtio/virtio_pcm.c b/sound/virtio/virtio_pcm.c index c10d91fff2fb..66d67eef1bcc 100644 --- a/sound/virtio/virtio_pcm.c +++ b/sound/virtio/virtio_pcm.c @@ -104,12 +104,11 @@ static int virtsnd_pcm_build_hw(struct virtio_pcm_substream *vss, * only message-based transport. */ vss->hw.info = - SNDRV_PCM_INFO_MMAP | - SNDRV_PCM_INFO_MMAP_VALID | Do we need the removal of those MMAP features inevitably? Usually mmap can still work even if the driver implements the copy ops. Those aren't always mutual exclusive. The driver uses a message queue to communicate with the device. Thus, the audio buffer is sliced into several I/O requests (= number of periods) of the same size (= period size). Before this, all such requests were enqueued when the substream started, and immediately re-enqueued once the request is completed. This approach made it possible to add mmap support. But for mmap there are no explicit notifications from the application how many frames were written or read. Thus, it was assumed that the virtual device should read/write frames to requests based on timings. And there are some problems here: 1. 
This was found to violate the virtio specification: if a request is already in the queue, the device can safely read/write there at any time. 2. It looks like this breaks the use case with swiotlb. Personally I'm not sure how the application handles DMA ownership in the case of mmaped buffer. To correctly implement mmap support, instead of transferring data via a message queue, the driver and device must have a shared memory region. We can add mmap in the future when we expand the functionality of the device to support such shared memory. Ah, then this implementation might be an overkill. You're still using the (intermediate) vmalloc buffer allocated via PCM managed mode, and the actual data is copied from/to there. So it doesn't conflict with the mmap operation at all. I guess that the problem you're trying to solve (the immediate d
Re: [PATCH v3] ALSA: virtio: use ack callback
Hi Matias, Thanks for the new patch! On 24.10.2023 00:06, Matias Ezequiel Vara Larsen wrote: This commit uses the ack() callback to determine when a buffer has been updated, then exposes it to guest. The current mechanism splits a dma buffer into descriptors that are exposed to the device. This dma buffer is shared with the user application. When the device consumes a buffer, the driver moves the request from the used ring to available ring. The driver exposes the buffer to the device without knowing if the content has been updated from the user. The section 2.8.21.1 of the virtio spec states that: "The device MAY access the descriptor chains the driver created and the memory they refer to immediately". If the device picks up buffers from the available ring just after it is notified, it happens that the content may be old. When the ack() callback is invoked, the driver exposes only the buffers that have already been updated, i.e., enqueued in the available ring. Thus, the device always picks up a buffer that is updated. For capturing, the driver starts by exposing all the available buffers to device. After device updates the content of a buffer, it enqueues it in the used ring. It is only after the ack() for capturing is issued that the driver re-enqueues the buffer in the available ring. 
Co-developed-by: Anton Yakovlev Signed-off-by: Anton Yakovlev Signed-off-by: Matias Ezequiel Vara Larsen --- Changelog: v2 -> v3: * Use ack() callback instead of copy()/fill_silence() * Change commit name * Rewrite commit message * Remove virtsnd_pcm_msg_send_locked() * Use single callback for both capture and playback * Fix kernel test robot warnings regarding documentation * v2 patch at: https://ddec1-0-en-ctp.trendmicro.com:443/wis/clicktime/v1/query?url=https%3a%2f%2flore.kernel.org%2flkml%2f87y1fzkq8c.wl%2dtiwai%40suse.de%2fT%2f&umid=6a6586e6-1bcb-48d2-8c0c-75ef6bb15df9&auth=53c7c7de28b92dfd96e93d9dd61a23e634d2fbec-090fe82db9a03f0dc8c4f214d4d2e2bf916e7f1f v1 -> v2: * Use snd_pcm_set_managed_buffer_all()for buffer allocation/freeing. * Make virtsnd_pcm_msg_send() generic by specifying the offset and size for the modified part of the buffer; this way no assumptions need to be made. * Disable SNDRV_PCM_INFO_NO_REWINDS since now only sequential reading/writing of frames is supported. * Correct comment at virtsnd_pcm_msg_send(). 
* v1 patch at: https://ddec1-0-en-ctp.trendmicro.com:443/wis/clicktime/v1/query?url=https%3a%2f%2flore.kernel.org%2flkml%2f20231016151000.GE119987%40fedora%2ft%2f&umid=6a6586e6-1bcb-48d2-8c0c-75ef6bb15df9&auth=53c7c7de28b92dfd96e93d9dd61a23e634d2fbec-2d4d809544c877beff1a631a29db01290c65d879 sound/virtio/virtio_pcm.c | 1 + sound/virtio/virtio_pcm.h | 6 ++- sound/virtio/virtio_pcm_msg.c | 80 --- sound/virtio/virtio_pcm_ops.c | 30 +++-- 4 files changed, 79 insertions(+), 38 deletions(-) diff --git a/sound/virtio/virtio_pcm.c b/sound/virtio/virtio_pcm.c index c10d91fff2fb..9cc5a95b4913 100644 --- a/sound/virtio/virtio_pcm.c +++ b/sound/virtio/virtio_pcm.c @@ -124,6 +124,7 @@ static int virtsnd_pcm_build_hw(struct virtio_pcm_substream *vss, values = le64_to_cpu(info->formats); vss->hw.formats = 0; + vss->appl_ptr = 0; for (i = 0; i < ARRAY_SIZE(g_v2a_format_map); ++i) if (values & (1ULL << i)) { diff --git a/sound/virtio/virtio_pcm.h b/sound/virtio/virtio_pcm.h index 062eb8e8f2cf..ea3c2845ae9b 100644 --- a/sound/virtio/virtio_pcm.h +++ b/sound/virtio/virtio_pcm.h @@ -27,6 +27,7 @@ struct virtio_pcm_msg; *substream operators. * @buffer_bytes: Current buffer size in bytes. * @hw_ptr: Substream hardware pointer value in bytes [0 ... buffer_bytes). + * @appl_ptr: Substream application pointer value in bytes [0 ... buffer_bytes). * @xfer_enabled: Data transfer state (0 - off, 1 - on). * @xfer_xrun: Data underflow/overflow state (0 - no xrun, 1 - xrun). 
* @stopped: True if the substream is stopped and must be released on the device @@ -51,13 +52,13 @@ struct virtio_pcm_substream { spinlock_t lock; size_t buffer_bytes; size_t hw_ptr; + size_t appl_ptr; bool xfer_enabled; bool xfer_xrun; bool stopped; bool suspended; struct virtio_pcm_msg **msgs; unsigned int nmsgs; - int msg_last_enqueued; unsigned int msg_count; wait_queue_head_t msg_empty; }; @@ -117,7 +118,8 @@ int virtsnd_pcm_msg_alloc(struct virtio_pcm_substream *vss, void virtsnd_pcm_msg_free(struct virtio_pcm_substream *vss); -int virtsnd_pcm_msg_send(struct virtio_pcm_substream *vss); +int virtsnd_pcm_msg_send(struct virtio_pcm_substream *vss, unsigned long offset, +unsigned long bytes); unsigned int virtsnd_pcm_msg_pending_num(struct virtio_pcm_substream *vss); diff --git a/sound/virtio/virtio_pcm_msg.c b/sound/virtio
[PATCH] vhost-vdpa: fix NULL pointer deref in _compat_vdpa_reset
As subject. There's a vhost_vdpa_reset() done earlier before vhost_dev is initialized via vhost_dev_init(), ending up with NULL pointer dereference. Fix is to check if vqs is initialized before checking backend features and resetting the device. BUG: kernel NULL pointer dereference, address: #PF: supervisor read access in kernel mode #PF: error_code(0x) - not-present page PGD 0 P4D 0 Oops: [#1] SMP CPU: 3 PID: 1727 Comm: qemu-system-x86 Not tainted 6.6.0-rc6+ #2 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel- a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:_compat_vdpa_reset+0x47/0xc0 [vhost_vdpa] Code: c7 c7 fb 12 56 a0 4c 8d a5 b8 02 00 00 48 89 ea e8 7e b8 c4 48 89 ee 48 c7 c7 19 13 56 a0 4c 8b ad b0 02 00 00 <48> 8b 00 49 00 48 8b 80 88 45 00 00 48 c1 e8 08 48 RSP: 0018:8881063c3c38 EFLAGS: 00010246 RAX: RBX: 8881074eb800 RCX: RDX: RSI: 888103ab4000 RDI: a0561319 RBP: 888103ab4000 R08: dfff R09: 0001 R10: 0003 R11: 7fecbac0 R12: 888103ab42b8 R13: 888106dbe850 R14: 0003 R15: 8881074ebc18 FS: 7f02fba6ef00() GS:5f8c() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: CR3: 0001325e5003 CR4: 00372ea0 DR0: DR1: DR2: DR3: DR6: fffe0ff0 DR7: 0400 Call Trace: ? __die+0x1f/0x60 ? page_fault_oops+0x14c/0x3b0 ? exc_page_fault+0x74/0x140 ? asm_exc_page_fault+0x22/0x30 ? _compat_vdpa_reset+0x47/0xc0 [vhost_vdpa] ? _compat_vdpa_reset+0x32/0xc0 [vhost_vdpa] vhost_vdpa_open+0x55/0x270 [vhost_vdpa] ? sb_init_dio_done_wq+0x50/0x50 chrdev_open+0xc0/0x210 ? __unregister_chrdev+0x50/0x50 do_dentry_open+0x1fc/0x4f0 path_openat+0xc2d/0xf20 do_filp_open+0xb4/0x160 ? 
kmem_cache_alloc+0x3c/0x490 do_sys_openat2+0x8d/0xc0 __x64_sys_openat+0x6a/0xa0 do_syscall_64+0x3c/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 Fixes: 10cbf8dfaf93 ("vhost-vdpa: clean iotlb map during reset for older userspace") Reported-by: Dragos Tatulea Closes: https://lore.kernel.org/all/b4913f84-8b52-4d28-af51-8573dc361...@oracle.com/ Signed-off-by: Si-Wei Liu --- drivers/vhost/vdpa.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index 9ce40003793b..9a2343c45df0 100644 --- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -232,9 +232,11 @@ static int _compat_vdpa_reset(struct vhost_vdpa *v) struct vdpa_device *vdpa = v->vdpa; u32 flags = 0; - flags |= !vhost_backend_has_feature(v->vdev.vqs[0], - VHOST_BACKEND_F_IOTLB_PERSIST) ? -VDPA_RESET_F_CLEAN_MAP : 0; + if (v->vdev.vqs) { + flags |= !vhost_backend_has_feature(v->vdev.vqs[0], + VHOST_BACKEND_F_IOTLB_PERSIST) ? +VDPA_RESET_F_CLEAN_MAP : 0; + } return vdpa_reset(vdpa, flags); } -- 2.39.3
Re: [PATCH v4 5/7] vhost-vdpa: clean iotlb map during reset for older userspace
(+ linux-next) Hi Michael, Dragos reported below oops for which I have a fix at hand (having it fully tested), ready to be posted to linux-next. Please let me know if you want me to respin the original patch series, or you would think it'd be fine to fix it on top. On 10/23/2023 11:59 AM, Dragos Tatulea wrote: On Sat, 2023-10-21 at 02:25 -0700, Si-Wei Liu wrote: Using .compat_reset op from the previous patch, the buggy .reset behaviour can be kept as-is on older userspace apps, which don't ack the IOTLB_PERSIST backend feature. As this compatibility quirk is limited to those drivers that used to be buggy in the past, it won't affect change the behaviour or affect ABI on the setups with API compliant driver. The separation of .compat_reset from the regular .reset allows vhost-vdpa able to know which driver had broken behaviour before, so it can apply the corresponding compatibility quirk to the individual driver whenever needed. Compared to overloading the existing .reset with flags, .compat_reset won't cause any extra burden to the implementation of every compliant driver. Signed-off-by: Si-Wei Liu --- drivers/vhost/vdpa.c | 17 + drivers/virtio/virtio_vdpa.c | 2 +- include/linux/vdpa.h | 7 +-- 3 files changed, 19 insertions(+), 7 deletions(-) diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index acc7c74ba7d6..9ce40003793b 100644 --- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -227,13 +227,22 @@ static void vhost_vdpa_unsetup_vq_irq(struct vhost_vdpa *v, u16 qid) irq_bypass_unregister_producer(&vq->call_ctx.producer); } -static int vhost_vdpa_reset(struct vhost_vdpa *v) +static int _compat_vdpa_reset(struct vhost_vdpa *v) { struct vdpa_device *vdpa = v->vdpa; + u32 flags = 0; - v->in_batch = 0; + flags |= !vhost_backend_has_feature(v->vdev.vqs[0], + VHOST_BACKEND_F_IOTLB_PERSIST) ? + VDPA_RESET_F_CLEAN_MAP : 0; Hi Si-Wei, I am getting a Oops due to the vqs not being initialized here. 
Here's how it looks:
[ 37.817075] BUG: kernel NULL pointer dereference, address:
[ 37.817674] #PF: supervisor read access in kernel mode
[ 37.818150] #PF: error_code(0x) - not-present page
[ 37.818615] PGD 0 P4D 0
[ 37.818893] Oops: [#1] SMP
[ 37.819223] CPU: 3 PID: 1727 Comm: qemu-system-x86 Not tainted 6.6.0-rc6+ #2
[ 37.819829] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[ 37.820791] RIP: 0010:_compat_vdpa_reset+0x47/0xc0 [vhost_vdpa]
[ 37.821316] Code: c7 c7 fb 12 56 a0 4c 8d a5 b8 02 00 00 48 89 ea e8 7e b8 c4 e0 48 8b 43 28 48 89 ee 48 c7 c7 19 13 56 a0 4c 8b ad b0 02 00 00 <48> 8b 00 49 8b 95 d8 00 00 00 48 8b 80 88 45 00 00 48 c1 e8 08 48
[ 37.822811] RSP: 0018:8881063c3c38 EFLAGS: 00010246
[ 37.823285] RAX: RBX: 8881074eb800 RCX:
[ 37.823893] RDX: RSI: 888103ab4000 RDI: a0561319
[ 37.824506] RBP: 888103ab4000 R08: dfff R09: 0001
[ 37.825116] R10: 0003 R11: 7fecbac0 R12: 888103ab42b8
[ 37.825721] R13: 888106dbe850 R14: 0003 R15: 8881074ebc18
[ 37.826326] FS: 7f02fba6ef00() GS:5f8c() knlGS:
[ 37.827035] CS: 0010 DS: ES: CR0: 80050033
[ 37.827552] CR2: CR3: 0001325e5003 CR4: 00372ea0
[ 37.828162] DR0: DR1: DR2:
[ 37.828772] DR3: DR6: fffe0ff0 DR7: 0400
[ 37.829381] Call Trace:
[ 37.829660]
[ 37.829911] ? __die+0x1f/0x60
[ 37.830234] ? page_fault_oops+0x14c/0x3b0
[ 37.830623] ? exc_page_fault+0x74/0x140
[ 37.830999] ? asm_exc_page_fault+0x22/0x30
[ 37.831402] ? _compat_vdpa_reset+0x47/0xc0 [vhost_vdpa]
[ 37.831888] ? _compat_vdpa_reset+0x32/0xc0 [vhost_vdpa]
[ 37.832366] vhost_vdpa_open+0x55/0x270 [vhost_vdpa]
[ 37.832821] ? sb_init_dio_done_wq+0x50/0x50
[ 37.833225] chrdev_open+0xc0/0x210
[ 37.833582] ? __unregister_chrdev+0x50/0x50
[ 37.833990] do_dentry_open+0x1fc/0x4f0
[ 37.834363] path_openat+0xc2d/0xf20
[ 37.834721] do_filp_open+0xb4/0x160
[ 37.835082] ? kmem_cache_alloc+0x3c/0x490
[ 37.835474] do_sys_openat2+0x8d/0xc0
[ 37.835834] __x64_sys_openat+0x6a/0xa0
[ 37.836208] do_syscall_64+0x3c/0x80
[ 37.836564] entry_SYSCALL_64_after_hwframe+0x46/0xb0
[ 37.837021] RIP: 0033:0x7f02fcc2c085
[ 37.837378] Code: 8b 55 d0 48 89 45 b0 75 a0 44 89 55 9c e8 63 7d f8 ff 44 8b 55 9c 89 da 4c 89 e6 41 89 c0 bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 33 44 89 c7 89 45 9c e8 b8 7d f8 ff 8b 45 9c
[ 37.838891] RSP: 002b:7ffdea3c8cc0 EFLAGS: 0293 ORIG_RAX: 0101
[ 37.839571] RAX: ffda R
Re: [PATCH v4 0/7] vdpa: decouple reset of iotlb mapping from device reset
On 10/22/2023 8:51 PM, Jason Wang wrote: Hi Si-Wei: On Sat, Oct 21, 2023 at 5:28 PM Si-Wei Liu wrote: In order to reduce the needlessly high setup and teardown cost of iotlb mapping during live migration, it's crucial to decouple the vhost-vdpa iotlb abstraction from the virtio device life cycle, i.e. iotlb mappings should be left intact across virtio device reset [1]. For it to work, the on-chip IOMMU parent device could implement a separate .reset_map() operation callback to restore 1:1 DMA mapping without having to resort to the .reset() callback, the latter of which is mainly used to reset virtio device state. This new .reset_map() callback will be invoked only before the vhost-vdpa driver is to be removed and detached from the vdpa bus, such that other vdpa bus drivers, e.g. virtio-vdpa, can start with 1:1 DMA mapping when they are attached. For context, those on-chip IOMMU parent devices create the 1:1 DMA mapping at vdpa device creation, and they would implicitly destroy the 1:1 mapping when the first .set_map or .dma_map callback is invoked. This patchset is rebased on top of the latest vhost tree. [1] Reducing vdpa migration downtime because of memory pin / maps https://www.mail-archive.com/qemu-devel@nongnu.org/msg953755.html --- v4: - Rework compatibility using new .compat_reset driver op I still think having a set_backend_feature() This will overload backend features with the role of carrying over compatibility quirks, which I tried to avoid. The .compat_reset from the v4 code works together with the backend features acknowledgement (and maybe other signals as well) to determine the quirk, but doesn't directly tie it to backend features itself. These two have different implications in terms of requirement, scope and maintenance/deprecation, so it's better to cope with compat quirks in an explicit and driver-visible way. or reset_map(clean=true) might be better. An explicit op might be marginally better from the driver writer's point of view.
A compliant driver doesn't have to assert that clean_map can never be true, so its code never has to deal with this case, as explained in the commit log for patch 5 "vhost-vdpa: clean iotlb map during reset for older userspace": " The separation of .compat_reset from the regular .reset allows vhost-vdpa to know which driver had broken behavior before, so it can apply the corresponding compatibility quirk to the individual driver whenever needed. Compared to overloading the existing .reset with flags, .compat_reset won't cause any extra burden to the implementation of every compliant driver. " As it tries hard to not introduce new stuff on the bus. Honestly I don't see a substantial difference between these other than the color. There's no single best solution that stands out among the 3. And I assume you already noticed that all of the above 3 approaches have to go through backend feature negotiation: the 1st vdpa reset, issued before backend feature negotiation, will use the compliant version of .reset that doesn't clean up the map. I don't think this nuance matters much to existing older userspace apps, as the maps should already get cleaned up by the previous process in vhost_vdpa_cleanup(); but if bug-for-bug behavioral compatibility is what you want, a module parameter would be the single best answer. Regards, -Siwei But we can listen to others for sure. Thanks ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [GIT PULL] virtio: last minute fixes
The pull request you sent on Mon, 23 Oct 2023 01:02:07 -0400: > https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/7c14564010fc1d0f16ca7d39b0ff948b43344209 Thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/prtracker.html
Re: [PATCH 1/2] virtio_pci: Don't make an extra copy of cpu affinity mask
On Thu, Oct 19, 2023 at 08:55 PM +08, Xuan Zhuo wrote: > On Thu, 19 Oct 2023 12:16:24 +0200, Jakub Sitnicki > wrote: >> Since commit 19e226e8cc5d ("virtio: Make vp_set_vq_affinity() take a >> mask.") it is actually not needed to have a local copy of the cpu mask. > > > Could you give more info to prove this? > > If you are right, I think you should delete all code about > msix_affinity_masks? Sorry for the late reply. I've been away. It looks like msix_affinity_masks became unused - intentionally - in 2015, after commit 210d150e1f5d ("virtio_pci: Clear stale cpumask when setting irq affinity") [1]. Originally introduced in 2012 in commit 75a0a52be3c2 ("virtio: introduce an API to set affinity for a virtqueue") [2]. As I understand, it was meant to make it possible to set VQ affinity to more than one CPU. Now that we can pass a CPU mask, listing all CPUs, to set_vq_affinity, msix_affinity_masks seems to no longer have a purpose. So, IMO, you're right. We can remove it. Happy to do that in a follow-up series. That is - if you're okay with these two patches in the current form. Thanks for reviewing. [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=210d150e1f5da506875e376422ba31ead2d49621 [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=75a0a52be3c27b58654fbed2c8f2ff401482b9a4
Re: [PATCH V1 vfio 0/9] Introduce a vfio driver over virtio devices
On Mon, 23 Oct 2023 13:20:43 -0300 Jason Gunthorpe wrote: > On Mon, Oct 23, 2023 at 10:09:13AM -0600, Alex Williamson wrote: > > On Mon, 23 Oct 2023 12:42:57 -0300 > > Jason Gunthorpe wrote: > > > > > On Mon, Oct 23, 2023 at 09:33:23AM -0600, Alex Williamson wrote: > > > > > > > > Alex, > > > > > Are you fine to leave the provisioning of the VF including the > > > > > control > > > > > of its transitional capability in the device hands as was suggested > > > > > by > > > > > Jason ? > > > > > > > > If this is the standard we're going to follow, ie. profiling of a > > > > device is expected to occur prior to the probe of the vfio-pci variant > > > > driver, then we should get the out-of-tree NVIDIA vGPU driver on board > > > > with this too. > > > > > > Those GPU drivers are using mdev not vfio-pci.. > > > > The SR-IOV mdev vGPUs rely on the IOMMU backing device support which > > was removed from upstream. > > It wasn't, but it changed forms. > > mdev is a sysfs framework for managing lifecycle with GUIDs only. > > The thing using mdev can call vfio_register_emulated_iommu_dev() or > vfio_register_group_dev(). > > It doesn't matter to the mdev stuff. > > The thing using mdev is responsible to get the struct device to pass > to vfio_register_group_dev() Are we describing what can be done (possibly limited to out-of-tree drivers) or what should be done and would be accepted upstream? I'm under the impression that mdev has been redefined to be more narrowly focused for emulated IOMMU devices and that devices based around a PCI VF should be making use of a vfio-pci variant driver. Are you suggesting it's the vendor's choice based on whether they want the mdev lifecycle support? We've defined certain aspects of the vfio-mdev interface as only available for emulated IOMMU devices, ex. page pinning. Thanks, Alex
Re: [PATCH v2] vsock/virtio: initialize the_virtio_vsock before using VQs
On Mon, Oct 23, 2023 at 06:36:21PM +0300, Alexandru Matei wrote: On 10/23/2023 6:13 PM, Stefano Garzarella wrote: On Mon, Oct 23, 2023 at 05:59:45PM +0300, Alexandru Matei wrote: On 10/23/2023 5:52 PM, Alexandru Matei wrote: On 10/23/2023 5:29 PM, Stefano Garzarella wrote: On Mon, Oct 23, 2023 at 05:08:33PM +0300, Alexandru Matei wrote: Once VQs are filled with empty buffers and we kick the host, it can send connection requests. If 'the_virtio_vsock' is not initialized before, replies are silently dropped and do not reach the host. Fixes: 0deab087b16a ("vsock/virtio: use RCU to avoid use-after-free on the_virtio_vsock") Signed-off-by: Alexandru Matei --- v2: - split virtio_vsock_vqs_init in vqs_init and vqs_fill and moved the_virtio_vsock initialization after vqs_init net/vmw_vsock/virtio_transport.c | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c index e95df847176b..92738d1697c1 100644 --- a/net/vmw_vsock/virtio_transport.c +++ b/net/vmw_vsock/virtio_transport.c @@ -559,6 +559,11 @@ static int virtio_vsock_vqs_init(struct virtio_vsock *vsock) vsock->tx_run = true; mutex_unlock(&vsock->tx_lock); + return 0; +} + +static void virtio_vsock_vqs_fill(struct virtio_vsock *vsock) What about renaming this function in virtio_vsock_vqs_start() and move also the setting of `tx_run` here? It works but in this case we also need to move rcu_assign_pointer in virtio_vsock_vqs_start(), the assignment needs to be right after setting tx_run to true and before filling the VQs. Why? If `rx_run` is false, we shouldn't need to send replies to the host IIUC. If we need this instead, please add a comment in the code, but also in the commit, because it's not clear why. We need rcu_assign_pointer after setting tx_run to true for connections that are initiated from the guest -> host. virtio_transport_connect() calls virtio_transport_send_pkt(). 
Once 'the_virtio_vsock' is initialized, virtio_transport_send_pkt() will queue the packet, but virtio_transport_send_pkt_work() will exit if tx_run is false. Okay, but in this case we could safely queue &vsock->send_pkt_work after finishing initialization to send those packets queued earlier. In the meantime I'll try to see if we can leave the initialization of `the_virtio_vsock` as the ultimate step and maybe bail out early in the workers if it's not set. That way we just queue all the workers after everything is done and we should be fine. And if we move rcu_assign_pointer then there is no need to split the function in two. We can move rcu_assign_pointer() directly inside virtio_vsock_vqs_init() after setting tx_run. Yep, this could be another option, but we need to change the name of that function in this case. OK, how does virtio_vsock_vqs_setup() sound? Or virtio_vsock_start() (without vqs) Stefano Stefano Thanks, Stefano +{ mutex_lock(&vsock->rx_lock); virtio_vsock_rx_fill(vsock); vsock->rx_run = true; @@ -568,8 +573,6 @@ static int virtio_vsock_vqs_init(struct virtio_vsock *vsock) virtio_vsock_event_fill(vsock); vsock->event_run = true; mutex_unlock(&vsock->event_lock); - - return 0; } static void virtio_vsock_vqs_del(struct virtio_vsock *vsock) @@ -664,6 +667,7 @@ static int virtio_vsock_probe(struct virtio_device *vdev) goto out; rcu_assign_pointer(the_virtio_vsock, vsock); + virtio_vsock_vqs_fill(vsock); mutex_unlock(&the_virtio_vsock_mutex); @@ -736,6 +740,7 @@ static int virtio_vsock_restore(struct virtio_device *vdev) goto out; rcu_assign_pointer(the_virtio_vsock, vsock); + virtio_vsock_vqs_fill(vsock); out: mutex_unlock(&the_virtio_vsock_mutex); -- 2.34.1
Re: [PATCH V1 vfio 0/9] Introduce a vfio driver over virtio devices
On Mon, 23 Oct 2023 12:42:57 -0300 Jason Gunthorpe wrote: > On Mon, Oct 23, 2023 at 09:33:23AM -0600, Alex Williamson wrote: > > > > Alex, > > > Are you fine to leave the provisioning of the VF including the control > > > of its transitional capability in the device hands as was suggested by > > > Jason ? > > > > If this is the standard we're going to follow, ie. profiling of a > > device is expected to occur prior to the probe of the vfio-pci variant > > driver, then we should get the out-of-tree NVIDIA vGPU driver on board > > with this too. > > Those GPU drivers are using mdev not vfio-pci.. The SR-IOV mdev vGPUs rely on the IOMMU backing device support which was removed from upstream. They only exist in the mdev form on downstreams which have retained this interface for compatibility and continuity. I'm not aware of any other means by which the SR-IOV RID can be used in the mdev model, therefore only the pre-SR-IOV GPUs should continue to use the mdev interface. > mdev doesn't have a way in its uapi to configure the mdev before it is > created. Of course. Thanks, Alex
Re: [PATCH V1 vfio 0/9] Introduce a vfio driver over virtio devices
On Sun, 22 Oct 2023 11:20:31 +0300 Yishai Hadas wrote: > On 17/10/2023 16:42, Yishai Hadas wrote: > > This series introduces a vfio driver over virtio devices to support the > > legacy interface functionality for VFs. > > > > Background, from the virtio spec [1]. > > > > In some systems, there is a need to support a virtio legacy driver with > > a device that does not directly support the legacy interface. In such > > scenarios, a group owner device can provide the legacy interface > > functionality for the group member devices. The driver of the owner > > device can then access the legacy interface of a member device on behalf > > of the legacy member device driver. > > > > For example, with the SR-IOV group type, group members (VFs) cannot > > present the legacy interface in an I/O BAR in BAR0 as expected by the > > legacy pci driver. If the legacy driver is running inside a virtual > > machine, the hypervisor executing the virtual machine can present a > > virtual device with an I/O BAR in BAR0. The hypervisor intercepts the > > legacy driver accesses to this I/O BAR and forwards them to the group > > owner device (PF) using group administration commands. > > > > > > The first 6 patches are in the virtio area and handle the below: > > - Fix common config map for modern device as was reported by Michael > > Tsirkin. > > - Introduce the admin virtqueue infrastructure. > > - Expose the layout of the commands that should be used for > >supporting the legacy access. > > - Expose APIs to enable upper layers such as vfio, net, etc. > >to execute admin commands. > > > > The above follows the virtio spec that was lastly accepted in that area > > [1]. > > > > The last 3 patches are in the vfio area and handle the below: > > - Expose some APIs from vfio/pci to be used by the vfio/virtio driver. > > - Introduce a vfio driver over virtio devices to support the legacy > >interface functionality for VFs. 
> > > > The series was tested successfully over virtio-net VFs in the host, > > while running in the guest both modern and legacy drivers. > > > > [1] > > https://github.com/oasis-tcs/virtio-spec/commit/03c2d32e5093ca9f2a17797242fbef88efe94b8c > > > > Changes from V0: > > https://www.spinics.net/lists/linux-virtualization/msg63802.html > > > > Virtio: > > - Fix the common config map size issue that was reported by Michael > >Tsirkin. > > - Do not use vp_dev->vqs[] array upon vp_del_vqs() as was asked by > >Michael, instead skip the AQ specifically. > > - Move admin vq implementation into virtio_pci_modern.c as was asked by > >Michael. > > - Rename structure virtio_avq to virtio_pci_admin_vq and some extra > >corresponding renames. > > - Remove exported symbols virtio_pci_vf_get_pf_dev(), > >virtio_admin_cmd_exec() as now callers are local to the module. > > - Handle inflight commands as part of the device reset flow. > > - Introduce APIs per admin command in virtio-pci as was asked by Michael. > > > > Vfio: > > - Change to use EXPORT_SYMBOL_GPL instead of EXPORT_SYMBOL for > >vfio_pci_core_setup_barmap() and vfio_pci_iowrite#xxx() as pointed by > >Alex. > > - Drop the intermediate patch which prepares the commands and calls the > >generic virtio admin command API (i.e. virtio_admin_cmd_exec()). > > - Instead, call directly to the new APIs per admin command that are > >exported from Virtio - based on Michael's request. > > - Enable only virtio-net as part of the pci_device_id table to enforce > >upon binding only what is supported as suggested by Alex. > > - Add support for byte-wise access (read/write) over the device config > >region as was asked by Alex. > > - Consider whether MSIX is practically enabled/disabled to choose the > >right opcode upon issuing read/write admin command, as mentioned > >by Michael. > > - Move to use VIRTIO_PCI_CONFIG_OFF instead of adding some new defines > >as was suggested by Michael. 
> > - Set the '.close_device' op to vfio_pci_core_close_device() as was > >pointed by Alex. > > - Adapt to Vfio multi-line comment style in a few places. > > - Add virtualization@lists.linux-foundation.org in the MAINTAINERS file > >to be CCed for the new driver as was suggested by Jason. > > > > Yishai > > > > Feng Liu (5): > >virtio-pci: Fix common config map for modern device > >virtio: Define feature bit for administration virtqueue > >virtio-pci: Introduce admin virtqueue > >virtio-pci: Introduce admin command sending function > >virtio-pci: Introduce admin commands > > > > Yishai Hadas (4): > >virtio-pci: Introduce APIs to execute legacy IO admin commands > >vfio/pci: Expose vfio_pci_core_setup_barmap() > >vfio/pci: Expose vfio_pci_iowrite/read##size() > >vfio/virtio: Introduce a vfio driver over virtio devices > > > > MAINTAINERS
Re: [PATCH v4 4/4] vduse: Add LSM hooks to check Virtio device type
On 10/23/2023 12:28 AM, Maxime Coquelin wrote: > > > On 10/21/23 00:20, Casey Schaufler wrote: >> On 10/20/2023 8:58 AM, Maxime Coquelin wrote: >>> This patch introduces LSM hooks for device creation, >>> destruction and opening operations, checking that the >>> application is allowed to perform these operations for >>> the Virtio device type. >> >> Why do you think that there needs to be a special LSM check for virtio >> devices? Why can't existing device attributes be used? > > Michael asked for a way for SELinux to allow/prevent the creation of > some types of devices [0]. > > A device is created using ioctl() on the VDUSE control chardev. Its type is > specified via a field in the structure passed in argument. > > I didn't see any other way than adding dedicated LSM hooks to achieve this, > but it is possible that there is a better way to do it? At the very least the hook should be made more general, and I'd have to see a proposal before commenting on that. security_dev_destroy(dev) might be a better approach. If there's reason to control destruction of vduse devices it's reasonable to assume that there are other devices with the same or similar properties. Since SELinux is your target use case, can you explain why you can't create SELinux policy to enforce the restrictions you're after? I believe (but can be proven wrong, of course) that SELinux has a mechanism for dealing with controls on ioctls. > > Thanks, > Maxime > > [0]: > https://lore.kernel.org/all/20230829130430-mutt-send-email-...@kernel.org/ >
Re: [PATCH v2] vsock/virtio: initialize the_virtio_vsock before using VQs
On Mon, Oct 23, 2023 at 05:59:45PM +0300, Alexandru Matei wrote: On 10/23/2023 5:52 PM, Alexandru Matei wrote: On 10/23/2023 5:29 PM, Stefano Garzarella wrote: On Mon, Oct 23, 2023 at 05:08:33PM +0300, Alexandru Matei wrote: Once VQs are filled with empty buffers and we kick the host, it can send connection requests. If 'the_virtio_vsock' is not initialized before, replies are silently dropped and do not reach the host. Fixes: 0deab087b16a ("vsock/virtio: use RCU to avoid use-after-free on the_virtio_vsock") Signed-off-by: Alexandru Matei --- v2: - split virtio_vsock_vqs_init in vqs_init and vqs_fill and moved the_virtio_vsock initialization after vqs_init net/vmw_vsock/virtio_transport.c | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c index e95df847176b..92738d1697c1 100644 --- a/net/vmw_vsock/virtio_transport.c +++ b/net/vmw_vsock/virtio_transport.c @@ -559,6 +559,11 @@ static int virtio_vsock_vqs_init(struct virtio_vsock *vsock) vsock->tx_run = true; mutex_unlock(&vsock->tx_lock); + return 0; +} + +static void virtio_vsock_vqs_fill(struct virtio_vsock *vsock) What about renaming this function in virtio_vsock_vqs_start() and move also the setting of `tx_run` here? It works but in this case we also need to move rcu_assign_pointer in virtio_vsock_vqs_start(), the assignment needs to be right after setting tx_run to true and before filling the VQs. Why? If `rx_run` is false, we shouldn't need to send replies to the host IIUC. If we need this instead, please add a comment in the code, but also in the commit, because it's not clear why. And if we move rcu_assign_pointer then there is no need to split the function in two, We can move rcu_assign_pointer() directly inside virtio_vsock_vqs_init() after setting tx_run. Yep, this could be another option, but we need to change the name of that function in this case. 
Stefano Thanks, Stefano +{ mutex_lock(&vsock->rx_lock); virtio_vsock_rx_fill(vsock); vsock->rx_run = true; @@ -568,8 +573,6 @@ static int virtio_vsock_vqs_init(struct virtio_vsock *vsock) virtio_vsock_event_fill(vsock); vsock->event_run = true; mutex_unlock(&vsock->event_lock); - - return 0; } static void virtio_vsock_vqs_del(struct virtio_vsock *vsock) @@ -664,6 +667,7 @@ static int virtio_vsock_probe(struct virtio_device *vdev) goto out; rcu_assign_pointer(the_virtio_vsock, vsock); + virtio_vsock_vqs_fill(vsock); mutex_unlock(&the_virtio_vsock_mutex); @@ -736,6 +740,7 @@ static int virtio_vsock_restore(struct virtio_device *vdev) goto out; rcu_assign_pointer(the_virtio_vsock, vsock); + virtio_vsock_vqs_fill(vsock); out: mutex_unlock(&the_virtio_vsock_mutex); -- 2.34.1
[PATCH v3] ALSA: virtio: use ack callback
This commit uses the ack() callback to determine when a buffer has been updated, then exposes it to the guest. The current mechanism splits a dma buffer into descriptors that are exposed to the device. This dma buffer is shared with the user application. When the device consumes a buffer, the driver moves the request from the used ring to the available ring. The driver exposes the buffer to the device without knowing if the content has been updated by the user. Section 2.8.21.1 of the virtio spec states that: "The device MAY access the descriptor chains the driver created and the memory they refer to immediately". If the device picks up buffers from the available ring just after it is notified, the content may be stale. When the ack() callback is invoked, the driver exposes only the buffers that have already been updated, i.e., enqueued in the available ring. Thus, the device always picks up a buffer that is updated. For capturing, the driver starts by exposing all the available buffers to the device. After the device updates the content of a buffer, it enqueues it in the used ring. It is only after the ack() for capturing is issued that the driver re-enqueues the buffer in the available ring. Co-developed-by: Anton Yakovlev Signed-off-by: Anton Yakovlev Signed-off-by: Matias Ezequiel Vara Larsen --- Changelog: v2 -> v3: * Use ack() callback instead of copy()/fill_silence() * Change commit name * Rewrite commit message * Remove virtsnd_pcm_msg_send_locked() * Use single callback for both capture and playback * Fix kernel test robot warnings regarding documentation * v2 patch at: https://lore.kernel.org/lkml/87y1fzkq8c.wl-ti...@suse.de/T/ v1 -> v2: * Use snd_pcm_set_managed_buffer_all() for buffer allocation/freeing. * Make virtsnd_pcm_msg_send() generic by specifying the offset and size for the modified part of the buffer; this way no assumptions need to be made. 
* Disable SNDRV_PCM_INFO_NO_REWINDS since now only sequential reading/writing of frames is supported. * Correct comment at virtsnd_pcm_msg_send(). * v1 patch at: https://lore.kernel.org/lkml/20231016151000.GE119987@fedora/t/ sound/virtio/virtio_pcm.c | 1 + sound/virtio/virtio_pcm.h | 6 ++- sound/virtio/virtio_pcm_msg.c | 80 --- sound/virtio/virtio_pcm_ops.c | 30 +++-- 4 files changed, 79 insertions(+), 38 deletions(-) diff --git a/sound/virtio/virtio_pcm.c b/sound/virtio/virtio_pcm.c index c10d91fff2fb..9cc5a95b4913 100644 --- a/sound/virtio/virtio_pcm.c +++ b/sound/virtio/virtio_pcm.c @@ -124,6 +124,7 @@ static int virtsnd_pcm_build_hw(struct virtio_pcm_substream *vss, values = le64_to_cpu(info->formats); vss->hw.formats = 0; + vss->appl_ptr = 0; for (i = 0; i < ARRAY_SIZE(g_v2a_format_map); ++i) if (values & (1ULL << i)) { diff --git a/sound/virtio/virtio_pcm.h b/sound/virtio/virtio_pcm.h index 062eb8e8f2cf..ea3c2845ae9b 100644 --- a/sound/virtio/virtio_pcm.h +++ b/sound/virtio/virtio_pcm.h @@ -27,6 +27,7 @@ struct virtio_pcm_msg; *substream operators. * @buffer_bytes: Current buffer size in bytes. * @hw_ptr: Substream hardware pointer value in bytes [0 ... buffer_bytes). + * @appl_ptr: Substream application pointer value in bytes [0 ... buffer_bytes). * @xfer_enabled: Data transfer state (0 - off, 1 - on). * @xfer_xrun: Data underflow/overflow state (0 - no xrun, 1 - xrun). 
* @stopped: True if the substream is stopped and must be released on the device @@ -51,13 +52,13 @@ struct virtio_pcm_substream { spinlock_t lock; size_t buffer_bytes; size_t hw_ptr; + size_t appl_ptr; bool xfer_enabled; bool xfer_xrun; bool stopped; bool suspended; struct virtio_pcm_msg **msgs; unsigned int nmsgs; - int msg_last_enqueued; unsigned int msg_count; wait_queue_head_t msg_empty; }; @@ -117,7 +118,8 @@ int virtsnd_pcm_msg_alloc(struct virtio_pcm_substream *vss, void virtsnd_pcm_msg_free(struct virtio_pcm_substream *vss); -int virtsnd_pcm_msg_send(struct virtio_pcm_substream *vss); +int virtsnd_pcm_msg_send(struct virtio_pcm_substream *vss, unsigned long offset, +unsigned long bytes); unsigned int virtsnd_pcm_msg_pending_num(struct virtio_pcm_substream *vss); diff --git a/sound/virtio/virtio_pcm_msg.c b/sound/virtio/virtio_pcm_msg.c index aca2dc1989ba..106e8e847746 100644 --- a/sound/virtio/virtio_pcm_msg.c +++ b/sound/virtio/virtio_pcm_msg.c @@ -155,7 +155,6 @@ int virtsnd_pcm_msg_alloc(struct virtio_pcm_substream *vss, sizeof(msg->xfer)); sg_init_one(&msg->sgs[PCM_MSG_SG_STATUS], &msg->status, sizeof(msg->status)); - msg->length = period_bytes; virtsnd_pcm_sg_from(&msg->sgs[PCM_MSG_SG_DATA], sg_num, data, period_bytes); @@ -18
Re: [PATCH v2] vsock/virtio: initialize the_virtio_vsock before using VQs
On Mon, Oct 23, 2023 at 05:08:33PM +0300, Alexandru Matei wrote: Once VQs are filled with empty buffers and we kick the host, it can send connection requests. If 'the_virtio_vsock' is not initialized before, replies are silently dropped and do not reach the host. Fixes: 0deab087b16a ("vsock/virtio: use RCU to avoid use-after-free on the_virtio_vsock") Signed-off-by: Alexandru Matei --- v2: - split virtio_vsock_vqs_init in vqs_init and vqs_fill and moved the_virtio_vsock initialization after vqs_init net/vmw_vsock/virtio_transport.c | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c index e95df847176b..92738d1697c1 100644 --- a/net/vmw_vsock/virtio_transport.c +++ b/net/vmw_vsock/virtio_transport.c @@ -559,6 +559,11 @@ static int virtio_vsock_vqs_init(struct virtio_vsock *vsock) vsock->tx_run = true; mutex_unlock(&vsock->tx_lock); + return 0; +} + +static void virtio_vsock_vqs_fill(struct virtio_vsock *vsock) What about renaming this function in virtio_vsock_vqs_start() and move also the setting of `tx_run` here? 
Thanks, Stefano +{ mutex_lock(&vsock->rx_lock); virtio_vsock_rx_fill(vsock); vsock->rx_run = true; @@ -568,8 +573,6 @@ static int virtio_vsock_vqs_init(struct virtio_vsock *vsock) virtio_vsock_event_fill(vsock); vsock->event_run = true; mutex_unlock(&vsock->event_lock); - - return 0; } static void virtio_vsock_vqs_del(struct virtio_vsock *vsock) @@ -664,6 +667,7 @@ static int virtio_vsock_probe(struct virtio_device *vdev) goto out; rcu_assign_pointer(the_virtio_vsock, vsock); + virtio_vsock_vqs_fill(vsock); mutex_unlock(&the_virtio_vsock_mutex); @@ -736,6 +740,7 @@ static int virtio_vsock_restore(struct virtio_device *vdev) goto out; rcu_assign_pointer(the_virtio_vsock, vsock); + virtio_vsock_vqs_fill(vsock); out: mutex_unlock(&the_virtio_vsock_mutex); -- 2.34.1
Re: [PATCH] virtio_ring: add an error code check in virtqueue_resize
On Mon, Oct 23, 2023 at 06:52:34PM +0800, Xuan Zhuo wrote: > On Mon, 23 Oct 2023 17:52:02 +0800, Xuan Zhuo > wrote: > > On Mon, 23 Oct 2023 17:50:46 +0800, Su Hui wrote: > > > On 2023/10/23 13:46, Xuan Zhuo wrote: > > > Well, what are the cases where it can happen practically? > > > >>> Device error. Such as vp_active_vq() > > > >>> > > > >>> Thanks. > > > >> Hmm interesting. OK. But do callers know to recover? > > > > No. > > > > > > > > So I think WARN + broken is suitable. > > > > > > > > Thanks. > > > Sorry for the late, is the following code okay? > > > > > > @@ -2739,7 +2739,7 @@ int virtqueue_resize(struct virtqueue *_vq, > > > u32 num, > > > void (*recycle)(struct virtqueue *vq, void > > > *buf)) > > > { > > > struct vring_virtqueue *vq = to_vvq(_vq); > > > - int err; > > > + int err, err_reset; > > > > > > if (num > vq->vq.num_max) > > > return -E2BIG; > > > @@ -2759,7 +2759,15 @@ int virtqueue_resize(struct virtqueue *_vq, > > > u32 num, > > > else > > > err = virtqueue_resize_split(_vq, num); > > > > > > - return virtqueue_enable_after_reset(_vq); > > > + err_reset = virtqueue_enable_after_reset(_vq); > > > + > > > + if (err) { > > > >>> No err. > > > >>> > > > >>> err is not important. > > > >>> You can remove that. > > > >> Emm, I'm a little confused that which code should I remove ? > > > >> > > > >> > > > >> like this: > > > >>if (vq->packed_ring) > > > >>virtqueue_resize_packed(_vq, num); > > > >>else > > > >>virtqueue_resize_split(_vq, num); > > > >> > > > >> And we should set broken and warn inside > > > >> virtqueue_enable_after_reset()? > > > > > > In my opinion, we should return the error code of > > > virtqueue_resize_packed() / virtqueue_resize_split(). > > > But if this err is not important, this patch makes no sense. > > > Maybe I misunderstand somewhere... > > > If you think it's worth sending a patch, you can send it :).(I'm not > > > familiar with this code). > > > > OK. 
> > Hi Michael, > > The queue reset code is written around CONFIG_VIRTIO_HARDEN_NOTIFICATION. > > When we disable the vq, the broken flag is true until we re-enable it. > > So when re-enabling fails, the vq stays in the broken state. > > Normally, this just happens on a buggy device. > So I think that is enough. > > Thanks. I don't know what to do about CONFIG_VIRTIO_HARDEN_NOTIFICATION. It's known to be broken and it does not look like there's active effort to revive it - should we just drop it for now? > > static int vp_modern_disable_vq_and_reset(struct virtqueue *vq) > { > [...] > > vp_modern_set_queue_reset(mdev, vq->index); > > [...] > > #ifdef CONFIG_VIRTIO_HARDEN_NOTIFICATION > ->> __virtqueue_break(vq); > #endif > > [...] > } > > static int vp_modern_enable_vq_after_reset(struct virtqueue *vq) > { > [...] > > if (vp_modern_get_queue_reset(mdev, index)) > return -EBUSY; > > if (vp_modern_get_queue_enable(mdev, index)) > return -EBUSY; > > err = vp_active_vq(vq, info->msix_vector); > if (err) > return err; > > } > > [...] > > #ifdef CONFIG_VIRTIO_HARDEN_NOTIFICATION > ->> __virtqueue_unbreak(vq); > #endif > > [...] > } > > > > > > > > Thanks. > > > > > > > > > > Thanks, > > > Su Hui
Re: [PATCH] virtio_ring: add an error code check in virtqueue_resize
On Mon, Oct 23, 2023 at 05:52:02PM +0800, Xuan Zhuo wrote: > On Mon, 23 Oct 2023 17:50:46 +0800, Su Hui wrote: > > On 2023/10/23 13:46, Xuan Zhuo wrote: > > Well, what are the cases where it can happen practically? > > >>> Device error. Such as vp_active_vq() > > >>> > > >>> Thanks. > > >> Hmm interesting. OK. But do callers know to recover? > > > No. > > > > > > So I think WARN + broken is suitable. > > > > > > Thanks. > > Sorry for the late, is the following code okay? > > > > @@ -2739,7 +2739,7 @@ int virtqueue_resize(struct virtqueue *_vq, u32 > > num, > > void (*recycle)(struct virtqueue *vq, void > > *buf)) > > { > > struct vring_virtqueue *vq = to_vvq(_vq); > > - int err; > > + int err, err_reset; > > > > if (num > vq->vq.num_max) > > return -E2BIG; > > @@ -2759,7 +2759,15 @@ int virtqueue_resize(struct virtqueue *_vq, u32 > > num, > > else > > err = virtqueue_resize_split(_vq, num); > > > > - return virtqueue_enable_after_reset(_vq); > > + err_reset = virtqueue_enable_after_reset(_vq); > > + > > + if (err) { > > >>> No err. > > >>> > > >>> err is not important. > > >>> You can remove that. > > >> Emm, I'm a little confused that which code should I remove ? > > >> > > >> > > >> like this: > > >> if (vq->packed_ring) > > >> virtqueue_resize_packed(_vq, num); > > >> else > > >> virtqueue_resize_split(_vq, num); > > >> > > >> And we should set broken and warn inside virtqueue_enable_after_reset()? > > > > In my opinion, we should return the error code of virtqueue_resize_packed() > > / virtqueue_resize_split(). > > But if this err is not important, this patch makes no sense. > > Maybe I misunderstand somewhere... > > If you think it's worth sending a patch, you can send it :).(I'm not > > familiar with this code). > > OK. > > Thanks. I would first try to recover by re-enabling. If that fails we can set broken. 
> > > > > Thanks, > > Su Hui
Re: [PATCH] virtio_ring: add an error code check in virtqueue_resize
On Mon, 23 Oct 2023 17:52:02 +0800, Xuan Zhuo wrote: > On Mon, 23 Oct 2023 17:50:46 +0800, Su Hui wrote: > > On 2023/10/23 13:46, Xuan Zhuo wrote: > > Well, what are the cases where it can happen practically? > > >>> Device error. Such as vp_active_vq() > > >>> > > >>> Thanks. > > >> Hmm interesting. OK. But do callers know to recover? > > > No. > > > > > > So I think WARN + broken is suitable. > > > > > > Thanks. > > Sorry for the late, is the following code okay? > > > > @@ -2739,7 +2739,7 @@ int virtqueue_resize(struct virtqueue *_vq, u32 > > num, > > void (*recycle)(struct virtqueue *vq, void > > *buf)) > > { > > struct vring_virtqueue *vq = to_vvq(_vq); > > - int err; > > + int err, err_reset; > > > > if (num > vq->vq.num_max) > > return -E2BIG; > > @@ -2759,7 +2759,15 @@ int virtqueue_resize(struct virtqueue *_vq, u32 > > num, > > else > > err = virtqueue_resize_split(_vq, num); > > > > - return virtqueue_enable_after_reset(_vq); > > + err_reset = virtqueue_enable_after_reset(_vq); > > + > > + if (err) { > > >>> No err. > > >>> > > >>> err is not important. > > >>> You can remove that. > > >> Emm, I'm a little confused that which code should I remove ? > > >> > > >> > > >> like this: > > >> if (vq->packed_ring) > > >> virtqueue_resize_packed(_vq, num); > > >> else > > >> virtqueue_resize_split(_vq, num); > > >> > > >> And we should set broken and warn inside virtqueue_enable_after_reset()? > > > > In my opinion, we should return the error code of virtqueue_resize_packed() > > / virtqueue_resize_split(). > > But if this err is not important, this patch makes no sense. > > Maybe I misunderstand somewhere... > > If you think it's worth sending a patch, you can send it :).(I'm not > > familiar with this code). > > OK. Hi Michael, The queue reset code is wrote with the CONFIG_VIRTIO_HARDEN_NOTIFICATION. When we disable the vq, the broken is true until we re-enable it. So when we re-enable it fail, the vq is broken status. 
Normally, this just happens on the buggy device. So I think that is enough. Thanks. static int vp_modern_disable_vq_and_reset(struct virtqueue *vq) { [...] vp_modern_set_queue_reset(mdev, vq->index); [...] #ifdef CONFIG_VIRTIO_HARDEN_NOTIFICATION ->> __virtqueue_break(vq); #endif [...] } static int vp_modern_enable_vq_after_reset(struct virtqueue *vq) { [...] if (vp_modern_get_queue_reset(mdev, index)) return -EBUSY; if (vp_modern_get_queue_enable(mdev, index)) return -EBUSY; err = vp_active_vq(vq, info->msix_vector); if (err) return err; } [...] #ifdef CONFIG_VIRTIO_HARDEN_NOTIFICATION ->> __virtqueue_unbreak(vq); #endif [...] } > > Thanks. > > > > > > Thanks, > > Su Hui > > ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH] virtio_ring: add an error code check in virtqueue_resize
On Mon, 23 Oct 2023 17:50:46 +0800, Su Hui wrote: > On 2023/10/23 13:46, Xuan Zhuo wrote: > Well, what are the cases where it can happen practically? > >>> Device error. Such as vp_active_vq() > >>> > >>> Thanks. > >> Hmm interesting. OK. But do callers know to recover? > > No. > > > > So I think WARN + broken is suitable. > > > > Thanks. > Sorry for the late, is the following code okay? > > @@ -2739,7 +2739,7 @@ int virtqueue_resize(struct virtqueue *_vq, u32 > num, > void (*recycle)(struct virtqueue *vq, void *buf)) > { > struct vring_virtqueue *vq = to_vvq(_vq); > - int err; > + int err, err_reset; > > if (num > vq->vq.num_max) > return -E2BIG; > @@ -2759,7 +2759,15 @@ int virtqueue_resize(struct virtqueue *_vq, u32 > num, > else > err = virtqueue_resize_split(_vq, num); > > - return virtqueue_enable_after_reset(_vq); > + err_reset = virtqueue_enable_after_reset(_vq); > + > + if (err) { > >>> No err. > >>> > >>> err is not important. > >>> You can remove that. > >> Emm, I'm a little confused that which code should I remove ? > >> > >> > >> like this: > >>if (vq->packed_ring) > >>virtqueue_resize_packed(_vq, num); > >>else > >>virtqueue_resize_split(_vq, num); > >> > >> And we should set broken and warn inside virtqueue_enable_after_reset()? > > In my opinion, we should return the error code of virtqueue_resize_packed() / > virtqueue_resize_split(). > But if this err is not important, this patch makes no sense. > Maybe I misunderstand somewhere... > If you think it's worth sending a patch, you can send it :).(I'm not familiar > with this code). OK. Thanks. > > Thanks, > Su Hui > ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
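The error-handling shape discussed in this thread (try to re-enable first; if that fails, WARN and mark the queue broken; the resize error itself is secondary) can be modeled in a few lines. This is a hedged userspace sketch, not the actual virtio_ring code -- the struct and helper names are illustrative only:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Minimal model of the proposed flow; names mirror the discussion. */
struct vq_model {
	bool broken;
	unsigned num;
};

static int resize_ring(struct vq_model *vq, unsigned num, bool dev_ok)
{
	if (!dev_ok)
		return -5;	/* device refused, e.g. a vp_active_vq()-style failure */
	vq->num = num;
	return 0;
}

static int enable_after_reset(struct vq_model *vq, bool dev_ok)
{
	if (!dev_ok) {
		/* Recovery failed: warn and leave the vq broken so later
		 * operations fail fast instead of touching a dead queue. */
		fprintf(stderr, "WARN: could not re-enable vq\n");
		vq->broken = true;
		return -16;	/* -EBUSY-like */
	}
	vq->broken = false;
	return 0;
}

static int virtqueue_resize_model(struct vq_model *vq, unsigned num,
				  bool resize_ok, bool enable_ok)
{
	int err = resize_ring(vq, num, resize_ok);
	int err_enable = enable_after_reset(vq, enable_ok);

	/* The enable result dominates: a failed re-enable means a broken
	 * queue; the resize error is only reported when re-enable worked. */
	return err_enable ? err_enable : err;
}
```

The key property is that a caller always learns when the queue ended up broken, which is the case it cannot recover from.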
Re: [PATCH v6 9/9] drm: Introduce documentation for hotspot properties
On Mon, 23 Oct 2023 09:46:13 +0200 Albert Esteve wrote: > From: Michael Banack > > To clarify the intent and reasoning behind the hotspot properties > introduce userspace documentation that goes over cursor handling > in para-virtualized environments. > > The documentation is generic enough to not special case for any > specific hypervisor and should apply equally to all. > > Signed-off-by: Zack Rusin Hi, the below doc text: Acked-by: Pekka Paalanen Thanks, pq > --- > Documentation/gpu/drm-kms.rst | 6 > drivers/gpu/drm/drm_plane.c | 58 ++- > 2 files changed, 63 insertions(+), 1 deletion(-) > > diff --git a/Documentation/gpu/drm-kms.rst b/Documentation/gpu/drm-kms.rst > index a0c83fc481264..158cdcc9351f9 100644 > --- a/Documentation/gpu/drm-kms.rst > +++ b/Documentation/gpu/drm-kms.rst > @@ -577,6 +577,12 @@ Variable Refresh Properties > .. kernel-doc:: drivers/gpu/drm/drm_connector.c > :doc: Variable refresh properties > > +Cursor Hotspot Properties > +--- > + > +.. kernel-doc:: drivers/gpu/drm/drm_plane.c > + :doc: hotspot properties > + > Existing KMS Properties > --- > > diff --git a/drivers/gpu/drm/drm_plane.c b/drivers/gpu/drm/drm_plane.c > index 1dc00ad4c33c3..f3f2eae83cca8 100644 > --- a/drivers/gpu/drm/drm_plane.c > +++ b/drivers/gpu/drm/drm_plane.c > @@ -230,6 +230,61 @@ static int create_in_format_blob(struct drm_device *dev, > struct drm_plane *plane > return 0; > } > > +/** > + * DOC: hotspot properties > + * > + * HOTSPOT_X: property to set mouse hotspot x offset. > + * HOTSPOT_Y: property to set mouse hotspot y offset. > + * > + * When the plane is being used as a cursor image to display a mouse pointer, > + * the "hotspot" is the offset within the cursor image where mouse events > + * are expected to go. > + * > + * Positive values move the hotspot from the top-left corner of the cursor > + * plane towards the right and bottom. 
> + * > + * Most display drivers do not need this information because the > + * hotspot is not actually connected to anything visible on screen. > + * However, this is necessary for display drivers like the para-virtualized > + * drivers (eg qxl, vbox, virtio, vmwgfx), that are attached to a user > console > + * with a mouse pointer. Since these consoles are often being remoted over a > + * network, they would otherwise have to wait to display the pointer > movement to > + * the user until a full network round-trip has occurred. New mouse events > have > + * to be sent from the user's console, over the network to the virtual input > + * devices, forwarded to the desktop for processing, and then the cursor > plane's > + * position can be updated and sent back to the user's console over the > network. > + * Instead, with the hotspot information, the console can anticipate the new > + * location, and draw the mouse cursor there before the confirmation comes > in. > + * To do that correctly, the user's console must be able predict how the > + * desktop will process mouse events, which normally requires the desktop's > + * mouse topology information, ie where each CRTC sits in the mouse > coordinate > + * space. This is typically sent to the para-virtualized drivers using some > + * driver-specific method, and the driver then forwards it to the console by > + * way of the virtual display device or hypervisor. > + * > + * The assumption is generally made that there is only one cursor plane being > + * used this way at a time, and that the desktop is feeding all mouse devices > + * into the same global pointer. Para-virtualized drivers that require this > + * should only be exposing a single cursor plane, or find some other way > + * to coordinate with a userspace desktop that supports multiple pointers. 
> + * If the hotspot properties are set, the cursor plane is therefore assumed > to be > + * used only for displaying a mouse cursor image, and the position of the > combined > + * cursor plane + offset can therefore be used for coordinating with input > from a > + * mouse device. > + * > + * The cursor will then be drawn either at the location of the plane in the > CRTC > + * console, or as a free-floating cursor plane on the user's console > + * corresponding to their desktop mouse position. > + * > + * DRM clients which would like to work correctly on drivers which expose > + * hotspot properties should advertise DRM_CLIENT_CAP_CURSOR_PLANE_HOTSPOT. > + * Setting this property on drivers which do not special case > + * cursor planes will return EOPNOTSUPP, which can be used by userspace to > + * gauge requirements of the hardware/drivers they're running on. Advertising > + * DRM_CLIENT_CAP_CURSOR_PLANE_HOTSPOT implies that the userspace client > will be > + * correctly setting the hotspot properties. > + */ > + > /** > * drm_plane_create_hotspot_properties - creates the mouse hotspot > * properties and attaches them to the giv
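The hotspot semantics in the documentation above reduce to simple arithmetic: the point that receives mouse events is the cursor plane's top-left position plus the HOTSPOT_X/HOTSPOT_Y offsets, and a console predicting pointer motion inverts that to place the plane. A small sketch (helper names are hypothetical, not libdrm API):

```c
#include <assert.h>

struct point { int x, y; };

/* Given the cursor plane's top-left position and the HOTSPOT_X/HOTSPOT_Y
 * property values, compute where the pointer logically is.  Positive
 * hotspot values move right/down from the top-left corner. */
static struct point pointer_from_plane(struct point plane_topleft,
				       int hotspot_x, int hotspot_y)
{
	struct point p = {
		.x = plane_topleft.x + hotspot_x,
		.y = plane_topleft.y + hotspot_y,
	};
	return p;
}

/* Conversely, a para-virtualized console that anticipates pointer motion
 * places the plane so the hotspot lands on the predicted position. */
static struct point plane_from_pointer(struct point pointer,
				       int hotspot_x, int hotspot_y)
{
	struct point p = {
		.x = pointer.x - hotspot_x,
		.y = pointer.y - hotspot_y,
	};
	return p;
}
```

This is why the console only needs the two offsets plus the plane position to draw the cursor at the right spot before the network round-trip completes.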
Re: [PATCH vhost v4 12/16] vdpa/mlx5: Improve mr update flow
On Fri, 2023-10-20 at 18:01 +0200, Eugenio Perez Martin wrote: > On Wed, Oct 18, 2023 at 7:21 PM Dragos Tatulea wrote: > > > > On Wed, 2023-10-18 at 20:14 +0300, Dragos Tatulea wrote: > > > The current flow for updating an mr works directly on mvdev->mr which > > > makes it cumbersome to handle multiple new mr structs. > > > > > > This patch makes the flow more straightforward by having > > > mlx5_vdpa_create_mr return a new mr which will update the old mr (if > > > any). The old mr will be deleted and unlinked from mvdev. For the case > > > when the iotlb is empty (not NULL), the old mr will be cleared. > > > > > > This change paves the way for adding mrs for different ASIDs. > > > > > > The initialized bool is no longer needed as mr is now a pointer in the > > > mlx5_vdpa_dev struct which will be NULL when not initialized. > > > > > > Acked-by: Eugenio Pérez > > > Acked-by: Jason Wang > > > Signed-off-by: Dragos Tatulea > > > --- > > > drivers/vdpa/mlx5/core/mlx5_vdpa.h | 14 +++-- > > > drivers/vdpa/mlx5/core/mr.c | 87 -- > > > drivers/vdpa/mlx5/net/mlx5_vnet.c | 53 +- > > > 3 files changed, 82 insertions(+), 72 deletions(-) > > > > > > diff --git a/drivers/vdpa/mlx5/core/mlx5_vdpa.h > > > b/drivers/vdpa/mlx5/core/mlx5_vdpa.h > > > index 9c6ac42c21e1..bbe4335106bd 100644 > > > --- a/drivers/vdpa/mlx5/core/mlx5_vdpa.h > > > +++ b/drivers/vdpa/mlx5/core/mlx5_vdpa.h > > > @@ -31,8 +31,6 @@ struct mlx5_vdpa_mr { > > > struct list_head head; > > > unsigned long num_directs; > > > unsigned long num_klms; > > > - /* state of dvq mr */ > > > - bool initialized; > > > > > > bool user_mr; > > > }; > > > @@ -91,7 +89,7 @@ struct mlx5_vdpa_dev { > > > u16 max_idx; > > > u32 generation; > > > > > > - struct mlx5_vdpa_mr mr; > > > + struct mlx5_vdpa_mr *mr; > > > /* serialize mr access */ > > > struct mutex mr_mtx; > > > struct mlx5_control_vq cvq; > > > @@ -114,14 +112,14 @@ void mlx5_vdpa_free_resources(struct mlx5_vdpa_dev > > > *mvdev); > > > int 
mlx5_vdpa_create_mkey(struct mlx5_vdpa_dev *mvdev, u32 *mkey, u32 > > > *in, > > > int inlen); > > > int mlx5_vdpa_destroy_mkey(struct mlx5_vdpa_dev *mvdev, u32 mkey); > > > -int mlx5_vdpa_handle_set_map(struct mlx5_vdpa_dev *mvdev, struct > > > vhost_iotlb > > > *iotlb, > > > - bool *change_map, unsigned int asid); > > > -int mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev, > > > - struct mlx5_vdpa_mr *mr, > > > - struct vhost_iotlb *iotlb); > > > +struct mlx5_vdpa_mr *mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev, > > > + struct vhost_iotlb *iotlb); > > > void mlx5_vdpa_destroy_mr_resources(struct mlx5_vdpa_dev *mvdev); > > > void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev, > > > struct mlx5_vdpa_mr *mr); > > > +void mlx5_vdpa_update_mr(struct mlx5_vdpa_dev *mvdev, > > > + struct mlx5_vdpa_mr *mr, > > > + unsigned int asid); > > > int mlx5_vdpa_update_cvq_iotlb(struct mlx5_vdpa_dev *mvdev, > > > struct vhost_iotlb *iotlb, > > > unsigned int asid); > > > diff --git a/drivers/vdpa/mlx5/core/mr.c b/drivers/vdpa/mlx5/core/mr.c > > > index abd6a6fb122f..00eff5a07152 100644 > > > --- a/drivers/vdpa/mlx5/core/mr.c > > > +++ b/drivers/vdpa/mlx5/core/mr.c > > > @@ -495,30 +495,51 @@ static void destroy_user_mr(struct mlx5_vdpa_dev > > > *mvdev, > > > struct mlx5_vdpa_mr *mr > > > > > > static void _mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev, struct > > > mlx5_vdpa_mr *mr) > > > { > > > - if (!mr->initialized) > > > - return; > > > - > > > if (mr->user_mr) > > > destroy_user_mr(mvdev, mr); > > > else > > > destroy_dma_mr(mvdev, mr); > > > - > > > - mr->initialized = false; > > > } > > > > > > void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev, > > > struct mlx5_vdpa_mr *mr) > > > { > > > + if (!mr) > > > + return; > > > + > > > mutex_lock(&mvdev->mr_mtx); > > > > > > _mlx5_vdpa_destroy_mr(mvdev, mr); > > > > > > + if (mvdev->mr == mr) > > > + mvdev->mr = NULL; > > > + > > > + mutex_unlock(&mvdev->mr_mtx); > > > + > > > + kfree(mr); > > > +} > > > + > > > 
+void mlx5_vdpa_update_mr(struct mlx5_vdpa_dev *mvdev, > > > + struct mlx5_vdpa_mr *new_mr, > > > + unsigned int asid) > > > +{ > > > + struct mlx5_vdpa_mr *old_mr = mvdev->mr; > > > + > > > + mutex_lock(&mvdev->mr_mtx); > > > + > > > + mvdev->mr = new_mr; > > > + if (old_mr) { > > > + _mlx5_vdpa_destroy_mr(mvdev, old_mr);
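The update flow in the patch above follows a swap-then-destroy pattern: install the new mr pointer, then tear down the old one, with NULL standing in for the removed `initialized` bool. A simplified userspace model of that shape (locking is elided; in the kernel this runs under mvdev->mr_mtx, and the names here are illustrative):

```c
#include <assert.h>
#include <stdlib.h>

struct mr { int user_mr; };

struct dev {
	struct mr *mr;		/* NULL means "no mr"; replaces 'initialized' */
	int destroyed;		/* counts teardowns, for this model only */
};

static void destroy_mr_resources(struct dev *d, struct mr *m)
{
	d->destroyed++;
	free(m);
}

/* Shape of the patch's mlx5_vdpa_update_mr(): publish the new mr first,
 * then destroy the old mapping (kernel code holds mr_mtx around this). */
static void update_mr(struct dev *d, struct mr *new_mr)
{
	struct mr *old_mr = d->mr;

	d->mr = new_mr;
	if (old_mr)
		destroy_mr_resources(d, old_mr);
}
```

Returning a fresh mr from creation and swapping it in one place is what makes supporting one mr per ASID straightforward later: each slot is just another pointer to swap.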
Re: [PATCH v4 3/4] vduse: Temporarily disable control queue features
On 10/23/23 05:08, Jason Wang wrote: On Fri, Oct 20, 2023 at 11:58 PM Maxime Coquelin wrote: Virtio-net driver control queue implementation is not safe when used with VDUSE. If the VDUSE application does not reply to control queue messages, it currently ends up hanging the kernel thread sending this command. Some work is on-going to make the control queue implementation robust with VDUSE. Until it is completed, let's disable control virtqueue and features that depend on it. Signed-off-by: Maxime Coquelin I wonder if it's better to do this with patch 2 or before patch 2 to unbreak the bisection? I think it would be better to keep it in a dedicated patch to ease the revert later when your work will have been accepted, so before patch 2. Thanks, Maxime Thanks --- drivers/vdpa/vdpa_user/vduse_dev.c | 37 ++ 1 file changed, 37 insertions(+) diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c index 73ad3b7efd8e..0243dee9cf0e 100644 --- a/drivers/vdpa/vdpa_user/vduse_dev.c +++ b/drivers/vdpa/vdpa_user/vduse_dev.c @@ -28,6 +28,7 @@ #include #include #include +#include #include #include "iova_domain.h" @@ -46,6 +47,30 @@ #define IRQ_UNBOUND -1 +#define VDUSE_NET_VALID_FEATURES_MASK \ + (BIT_ULL(VIRTIO_NET_F_CSUM) | \ +BIT_ULL(VIRTIO_NET_F_GUEST_CSUM) | \ +BIT_ULL(VIRTIO_NET_F_MTU) |\ +BIT_ULL(VIRTIO_NET_F_MAC) |\ +BIT_ULL(VIRTIO_NET_F_GUEST_TSO4) | \ +BIT_ULL(VIRTIO_NET_F_GUEST_TSO6) | \ +BIT_ULL(VIRTIO_NET_F_GUEST_ECN) | \ +BIT_ULL(VIRTIO_NET_F_GUEST_UFO) | \ +BIT_ULL(VIRTIO_NET_F_HOST_TSO4) | \ +BIT_ULL(VIRTIO_NET_F_HOST_TSO6) | \ +BIT_ULL(VIRTIO_NET_F_HOST_ECN) | \ +BIT_ULL(VIRTIO_NET_F_HOST_UFO) | \ +BIT_ULL(VIRTIO_NET_F_MRG_RXBUF) | \ +BIT_ULL(VIRTIO_NET_F_STATUS) | \ +BIT_ULL(VIRTIO_NET_F_HOST_USO) | \ +BIT_ULL(VIRTIO_F_ANY_LAYOUT) | \ +BIT_ULL(VIRTIO_RING_F_INDIRECT_DESC) | \ +BIT_ULL(VIRTIO_RING_F_EVENT_IDX) | \ +BIT_ULL(VIRTIO_F_VERSION_1) | \ +BIT_ULL(VIRTIO_F_ACCESS_PLATFORM) | \ +BIT_ULL(VIRTIO_F_RING_PACKED) |\ 
+BIT_ULL(VIRTIO_F_IN_ORDER)) + struct vduse_virtqueue { u16 index; u16 num_max; @@ -1778,6 +1803,16 @@ static struct attribute *vduse_dev_attrs[] = { ATTRIBUTE_GROUPS(vduse_dev); +static void vduse_dev_features_filter(struct vduse_dev_config *config) +{ + /* +* Temporarily filter out virtio-net's control virtqueue and features +* that depend on it while CVQ is being made more robust for VDUSE. +*/ + if (config->device_id == VIRTIO_ID_NET) + config->features &= VDUSE_NET_VALID_FEATURES_MASK; +} + static int vduse_create_dev(struct vduse_dev_config *config, void *config_buf, u64 api_version) { @@ -1793,6 +1828,8 @@ static int vduse_create_dev(struct vduse_dev_config *config, if (!dev) goto err; + vduse_dev_features_filter(config); + dev->api_version = api_version; dev->device_features = config->features; dev->device_id = config->device_id; -- 2.41.0 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
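The filtering in the patch above is a plain bitmask intersection: any advertised feature outside the allow-list -- notably VIRTIO_NET_F_CTRL_VQ (bit 17) and everything gated on it -- is cleared before the device is created. A minimal model, using a deliberately reduced subset of the mask (bit numbers taken from the virtio spec):

```c
#include <assert.h>
#include <stdint.h>

/* Feature bit numbers from the virtio spec (only the ones used here). */
#define VIRTIO_NET_F_CSUM	0
#define VIRTIO_NET_F_MAC	5
#define VIRTIO_NET_F_STATUS	16
#define VIRTIO_NET_F_CTRL_VQ	17
#define VIRTIO_NET_F_MQ		22
#define VIRTIO_ID_NET		1

/* Illustrative subset of VDUSE_NET_VALID_FEATURES_MASK: CTRL_VQ and the
 * features that depend on it (MQ, CTRL_RX, ...) are deliberately absent. */
static const uint64_t net_valid_mask =
	(1ULL << VIRTIO_NET_F_CSUM) |
	(1ULL << VIRTIO_NET_F_MAC)  |
	(1ULL << VIRTIO_NET_F_STATUS);

/* Model of vduse_dev_features_filter(): only net devices are touched. */
static uint64_t filter_features(uint32_t device_id, uint64_t features)
{
	if (device_id == VIRTIO_ID_NET)
		features &= net_valid_mask;
	return features;
}
```

Because the mask is applied at creation time, the guest driver never negotiates the control queue, so no control command can ever hang on an unresponsive VDUSE application.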
Re: [PATCH v4 1/4] vduse: validate block features only with block devices
On 10/21/23 00:07, Casey Schaufler wrote: On 10/20/2023 8:58 AM, Maxime Coquelin wrote: This patch is preliminary work to enable network device type support to VDUSE. As VIRTIO_BLK_F_CONFIG_WCE shares the same value as VIRTIO_NET_F_HOST_TSO4, we need to restrict its check to Virtio-blk device type. Acked-by: Jason Wang Reviewed-by: Xie Yongji Signed-off-by: Maxime Coquelin --- drivers/vdpa/vdpa_user/vduse_dev.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c index df7869537ef1..5b3879976b3d 100644 --- a/drivers/vdpa/vdpa_user/vduse_dev.c +++ b/drivers/vdpa/vdpa_user/vduse_dev.c @@ -1662,13 +1662,14 @@ static bool device_is_allowed(u32 device_id) return false; } -static bool features_is_valid(u64 features) +static bool features_is_valid(struct vduse_dev_config *config) This should either be features_are_valid() or feature_is_valid(). Correct pluralization is important in the English language. Indeed, I will change to features_are_valid() in next revision. Thanks, Maxime { - if (!(features & (1ULL << VIRTIO_F_ACCESS_PLATFORM))) + if (!(config->features & (1ULL << VIRTIO_F_ACCESS_PLATFORM))) return false; /* Now we only support read-only configuration space */ - if (features & (1ULL << VIRTIO_BLK_F_CONFIG_WCE)) + if ((config->device_id == VIRTIO_ID_BLOCK) && + (config->features & (1ULL << VIRTIO_BLK_F_CONFIG_WCE))) return false; return true; @@ -1695,7 +1696,7 @@ static bool vduse_validate_config(struct vduse_dev_config *config) if (!device_is_allowed(config->device_id)) return false; - if (!features_is_valid(config->features)) + if (!features_is_valid(config)) return false; return true; ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
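The collision the patch above fixes is concrete: VIRTIO_BLK_F_CONFIG_WCE and VIRTIO_NET_F_HOST_TSO4 are both feature bit 11, so a check on the raw bit must be scoped by device id or it will wrongly reject net devices offering TSO4. A sketch of the device-scoped check (simplified from the patch; names are illustrative):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_ID_NET			1
#define VIRTIO_ID_BLOCK			2
#define VIRTIO_F_ACCESS_PLATFORM	33
#define VIRTIO_BLK_F_CONFIG_WCE		11	/* same bit number as... */
#define VIRTIO_NET_F_HOST_TSO4		11	/* ...this net feature   */

struct cfg { uint32_t device_id; uint64_t features; };

static bool features_are_valid(const struct cfg *c)
{
	if (!(c->features & (1ULL << VIRTIO_F_ACCESS_PLATFORM)))
		return false;
	/* Bit 11 only means CONFIG_WCE on a block device, so reject it
	 * (writeback cache needs a writable config space) only there. */
	if (c->device_id == VIRTIO_ID_BLOCK &&
	    (c->features & (1ULL << VIRTIO_BLK_F_CONFIG_WCE)))
		return false;
	return true;
}
```

Passing the whole config instead of just the feature bits is what makes the device-id scoping possible.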
Re: [PATCH v4 4/4] vduse: Add LSM hooks to check Virtio device type
On 10/21/23 00:20, Casey Schaufler wrote: On 10/20/2023 8:58 AM, Maxime Coquelin wrote: This patch introduces LSM hooks for device creation, destruction and opening operations, checking the application is allowed to perform these operations for the Virtio device type. Why do you think that there needs to be a special LSM check for virtio devices? Why can't existing device attributes be used? Michael asked for a way for SELinux to allow/prevent the creation of some types of devices [0]. A device is created using ioctl() on the VDUSE control chardev. Its type is specified via a field in the structure passed as argument. I didn't see any other way than adding dedicated LSM hooks to achieve this, but it is possible that there is a better way to do it? Thanks, Maxime [0]: https://lore.kernel.org/all/20230829130430-mutt-send-email-...@kernel.org/