Re: [PATCH v4 5/7] vhost-vdpa: clean iotlb map during reset for older userspace

2023-10-23 Thread Jason Wang
On Sat, Oct 21, 2023 at 5:28 PM Si-Wei Liu  wrote:
>
> Using .compat_reset op from the previous patch, the buggy .reset
> behaviour can be kept as-is on older userspace apps, which don't ack the
> IOTLB_PERSIST backend feature. As this compatibility quirk is limited to
> those drivers that used to be buggy in the past, it won't change the
> behaviour or affect the ABI on setups with an API-compliant driver.
>
> The separation of .compat_reset from the regular .reset allows
> vhost-vdpa to know which driver had broken behaviour before, so it
> can apply the corresponding compatibility quirk to the individual driver
> whenever needed.  Compared to overloading the existing .reset with
> flags, .compat_reset won't cause any extra burden to the implementation
> of every compliant driver.
>
> Signed-off-by: Si-Wei Liu 
> ---
>  drivers/vhost/vdpa.c | 17 +
>  drivers/virtio/virtio_vdpa.c |  2 +-
>  include/linux/vdpa.h |  7 +--
>  3 files changed, 19 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index acc7c74ba7d6..9ce40003793b 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -227,13 +227,22 @@ static void vhost_vdpa_unsetup_vq_irq(struct vhost_vdpa *v, u16 qid)
> irq_bypass_unregister_producer(&vq->call_ctx.producer);
>  }
>
> -static int vhost_vdpa_reset(struct vhost_vdpa *v)
> +static int _compat_vdpa_reset(struct vhost_vdpa *v)
>  {
> struct vdpa_device *vdpa = v->vdpa;
> +   u32 flags = 0;
>
> -   v->in_batch = 0;
> +   flags |= !vhost_backend_has_feature(v->vdev.vqs[0],
> +   VHOST_BACKEND_F_IOTLB_PERSIST) ?
> +VDPA_RESET_F_CLEAN_MAP : 0;
> +
> +   return vdpa_reset(vdpa, flags);
> +}
>
> -   return vdpa_reset(vdpa);
> +static int vhost_vdpa_reset(struct vhost_vdpa *v)
> +{
> +   v->in_batch = 0;
> +   return _compat_vdpa_reset(v);
>  }
>
>  static long vhost_vdpa_bind_mm(struct vhost_vdpa *v)
> @@ -312,7 +321,7 @@ static long vhost_vdpa_set_status(struct vhost_vdpa *v, u8 __user *statusp)
> vhost_vdpa_unsetup_vq_irq(v, i);
>
> if (status == 0) {
> -   ret = vdpa_reset(vdpa);
> +   ret = _compat_vdpa_reset(v);
> if (ret)
> return ret;
> } else
> diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
> index 06ce6d8c2e00..8d63e5923d24 100644
> --- a/drivers/virtio/virtio_vdpa.c
> +++ b/drivers/virtio/virtio_vdpa.c
> @@ -100,7 +100,7 @@ static void virtio_vdpa_reset(struct virtio_device *vdev)
>  {
> struct vdpa_device *vdpa = vd_get_vdpa(vdev);
>
> -   vdpa_reset(vdpa);
> +   vdpa_reset(vdpa, 0);
>  }
>
>  static bool virtio_vdpa_notify(struct virtqueue *vq)
> diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> index 6b8cbf75712d..db15ac07f8a6 100644
> --- a/include/linux/vdpa.h
> +++ b/include/linux/vdpa.h
> @@ -519,14 +519,17 @@ static inline struct device *vdpa_get_dma_dev(struct vdpa_device *vdev)
> return vdev->dma_dev;
>  }
>
> -static inline int vdpa_reset(struct vdpa_device *vdev)
> +static inline int vdpa_reset(struct vdpa_device *vdev, u32 flags)
>  {
> const struct vdpa_config_ops *ops = vdev->config;
> int ret;
>
> down_write(&vdev->cf_lock);
> vdev->features_valid = false;
> -   ret = ops->reset(vdev);
> +   if (ops->compat_reset && flags)
> +   ret = ops->compat_reset(vdev, flags);
> +   else
> +   ret = ops->reset(vdev);

Instead of inventing a new API that carries the flags, tweaking the
existing one seems simpler and better?

As compat_reset(vdev, 0) == reset(vdev)

Then you don't need the switch in the parent either:

+static int vdpasim_reset(struct vdpa_device *vdpa)
+{
+   return vdpasim_compat_reset(vdpa, 0);
+}

Thanks


> up_write(&vdev->cf_lock);
> return ret;
>  }
> --
> 2.39.3
>

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [RFC] vhost: vmap virtio descriptor table kernel/kvm

2023-10-23 Thread Jason Wang
On Tue, Oct 24, 2023 at 11:17 AM Liang Chen  wrote:
>
> The current vhost code uses 'copy_from_user' to copy descriptors from
> userspace to vhost. We attempted to 'vmap' the descriptor table to
> reduce the overhead associated with 'copy_from_user' during descriptor
> access, because it can be accessed quite frequently. This change
> resulted in a moderate performance improvement (approximately 3%) in
> our pktgen test, as shown below. Additionally, the latency in the
> 'vhost_get_vq_desc' function shows a noticeable decrease.
>
> current code:
> IFACE   rxpck/s   txpck/srxkB/stxkB/s
> rxcmp/s   txcmp/s  rxmcst/s   %ifutil
> Average:vnet0  0.31 1330807.03  0.02  77976.98
> 0.00  0.00  0.00  6.39
> # /usr/share/bcc/tools/funclatency -d 10 vhost_get_vq_desc
> avg = 145 nsecs, total: 1455980341 nsecs, count: 9985224
>
> vmap:
> IFACE   rxpck/s   txpck/srxkB/stxkB/s
> rxcmp/s   txcmp/s  rxmcst/s   %ifutil
> Average:vnet0  0.07 1371870.49  0.00  80383.04
> 0.00  0.00  0.00  6.58
> # /usr/share/bcc/tools/funclatency -d 10 vhost_get_vq_desc
> avg = 122 nsecs, total: 1286983929 nsecs, count: 10478134
>
> We are uncertain if there are any aspects we may have overlooked and
> would appreciate any advice before we submit an actual patch.

So the idea is to use a shadow page table instead of the userspace one
to avoid things like spec barriers or SMAP.

I've tried this in the past:

commit 7f466032dc9e5a61217f22ea34b2df932786bbfc (HEAD)
Author: Jason Wang 
Date:   Fri May 24 04:12:18 2019 -0400

vhost: access vq metadata through kernel virtual address

It was noticed that the copy_to/from_user() friends that was used to
access virtqueue metadata tends to be very expensive for dataplane
implementation like vhost since it involves lots of software checks,
speculation barriers, hardware feature toggling (e.g SMAP). The
extra cost will be more obvious when transferring small packets since
the time spent on metadata accessing become more significant.
...

Note that it tries to use a direct map instead of a VMAP as Andrea
suggests. But it led to several fallouts which were tricky to
fix[1] (like the use of MMU notifiers to do synchronization). So it
was finally reverted.

I'm not saying it's a dead end. But we need to find a way to solve the
issues or use something different. I'm happy to offer help.

1) Avoid using SMAP for the vhost kthread, for example using a sched
notifier; I'm not sure whether this is possible
2) A new iov iterator that doesn't do SMAP at all, this looks
dangerous and Al might not like it
3) (Re)using HMM
...

You may want to see archives for more information. We've had a lot of
discussions.

Thanks

[1] https://lore.kernel.org/lkml/20190731084655.7024-1-jasow...@redhat.com/

>
>
> Thanks,
> Liang
>


[PATCH vhost 2/2] virtio: set broken when re-enabling vq fails

2023-10-23 Thread Xuan Zhuo
In vp_modern_enable_vq_after_reset, we do some checks to ensure
that the vq is ready to re-enable. If one of those checks fails, the vq
is still good.

If vp_active_vq() fails, that means the vq is broken. The driver can no
longer use that vq, so this commit marks the vq as broken.

Signed-off-by: Xuan Zhuo 
---
 drivers/virtio/virtio_pci_modern.c |  6 +++---
 drivers/virtio/virtio_ring.c   | 11 ---
 include/linux/virtio_config.h  |  2 ++
 3 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c
index ee6a386d250b..56a4075ca5fb 100644
--- a/drivers/virtio/virtio_pci_modern.c
+++ b/drivers/virtio/virtio_pci_modern.c
@@ -286,16 +286,16 @@ static int vp_modern_enable_vq_after_reset(struct virtqueue *vq)
int err;
 
if (!vq->reset)
-   return -EBUSY;
+   return -EINVAL;
 
index = vq->index;
info = vp_dev->vqs[index];
 
if (vp_modern_get_queue_reset(mdev, index))
-   return -EBUSY;
+   return -EINVAL;
 
if (vp_modern_get_queue_enable(mdev, index))
-   return -EBUSY;
+   return -EINVAL;
 
err = vp_active_vq(vq, info->msix_vector);
if (err)
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index af310418e66e..91e63c57c112 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -2183,11 +2183,16 @@ static int virtqueue_enable_after_reset(struct virtqueue *_vq)
 {
struct vring_virtqueue *vq = to_vvq(_vq);
struct virtio_device *vdev = vq->vq.vdev;
+   int err;
+
+   err = vdev->config->enable_vq_after_reset(_vq);
+   if (err == -EINVAL || !err)
+   return err;
 
-   if (vdev->config->enable_vq_after_reset(_vq))
-   return -EBUSY;
+   dev_warn(&vdev->dev, "Fail to re-enable the vq.%u error:%d\n", _vq->index, err);
+   __virtqueue_break(_vq);
 
-   return 0;
+   return err;
 }
 
 /*
diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
index 2b3438de2c4d..f96bebf9b632 100644
--- a/include/linux/virtio_config.h
+++ b/include/linux/virtio_config.h
@@ -91,6 +91,8 @@ typedef void vq_callback_t(struct virtqueue *);
  * @enable_vq_after_reset: enable a reset queue
  * vq: the virtqueue
  * Returns 0 on success or error status
+ * -EINVAL: the vq is not in the reset status or is not ready to enable.
+ * Other error: enabling vq fails. The vq is in broken status.
  * If disable_vq_and_reset is set, then enable_vq_after_reset must also be
  * set.
  */
-- 
2.32.0.3.g01195cf9f



[PATCH vhost 1/2] virtio_ring: remove unused code

2023-10-23 Thread Xuan Zhuo
Remove the assignments whose return values are ignored.

Signed-off-by: Xuan Zhuo 
---
 drivers/virtio/virtio_ring.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 51d8f3299c10..af310418e66e 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -2755,9 +2755,9 @@ int virtqueue_resize(struct virtqueue *_vq, u32 num,
return err;
 
if (vq->packed_ring)
-   err = virtqueue_resize_packed(_vq, num);
+   virtqueue_resize_packed(_vq, num);
else
-   err = virtqueue_resize_split(_vq, num);
+   virtqueue_resize_split(_vq, num);
 
return virtqueue_enable_after_reset(_vq);
 }
-- 
2.32.0.3.g01195cf9f



[PATCH vhost 0/2] virtio: set broken when re-enable vq fails

2023-10-23 Thread Xuan Zhuo
In vp_modern_enable_vq_after_reset, we do some checks to ensure
that the vq is in reset status. If one of those checks fails, the vq is
still good.

If vp_active_vq() fails, that means the vq is broken. The driver can no
longer use that vq, so this commit marks the vq as broken.

Xuan Zhuo (2):
  virtio_ring: remove unused code
  virtio: set broken when re-enabling vq fails

 drivers/virtio/virtio_pci_modern.c |  6 +++---
 drivers/virtio/virtio_ring.c   | 15 ++-
 include/linux/virtio_config.h  |  2 ++
 3 files changed, 15 insertions(+), 8 deletions(-)

--
2.32.0.3.g01195cf9f



[RFC] vhost: vmap virtio descriptor table kernel/kvm

2023-10-23 Thread Liang Chen
The current vhost code uses 'copy_from_user' to copy descriptors from
userspace to vhost. We attempted to 'vmap' the descriptor table to
reduce the overhead associated with 'copy_from_user' during descriptor
access, because it can be accessed quite frequently. This change
resulted in a moderate performance improvement (approximately 3%) in
our pktgen test, as shown below. Additionally, the latency in the
'vhost_get_vq_desc' function shows a noticeable decrease.

current code:
IFACE   rxpck/s   txpck/srxkB/stxkB/s
rxcmp/s   txcmp/s  rxmcst/s   %ifutil
Average:vnet0  0.31 1330807.03  0.02  77976.98
0.00  0.00  0.00  6.39
# /usr/share/bcc/tools/funclatency -d 10 vhost_get_vq_desc
avg = 145 nsecs, total: 1455980341 nsecs, count: 9985224

vmap:
IFACE   rxpck/s   txpck/srxkB/stxkB/s
rxcmp/s   txcmp/s  rxmcst/s   %ifutil
Average:vnet0  0.07 1371870.49  0.00  80383.04
0.00  0.00  0.00  6.58
# /usr/share/bcc/tools/funclatency -d 10 vhost_get_vq_desc
avg = 122 nsecs, total: 1286983929 nsecs, count: 10478134

We are uncertain if there are any aspects we may have overlooked and
would appreciate any advice before we submit an actual patch.


Thanks,
Liang



Re: [PATCH 1/2] virtio_pci: Don't make an extra copy of cpu affinity mask

2023-10-23 Thread Xuan Zhuo
On Mon, 23 Oct 2023 18:52:45 +0200, Jakub Sitnicki  wrote:
> On Thu, Oct 19, 2023 at 08:55 PM +08, Xuan Zhuo wrote:
> > On Thu, 19 Oct 2023 12:16:24 +0200, Jakub Sitnicki  
> > wrote:
> >> Since commit 19e226e8cc5d ("virtio: Make vp_set_vq_affinity() take a
> >> mask.") it is actually not needed to have a local copy of the cpu mask.
> >
> >
> > Could you give more info to prove this?


Actually, my question is: can we pass a value on the stack (or another
temporary value) to irq_set_affinity_hint()?

For example, virtio-net uses zalloc_cpumask_var() to allocate a
cpu_mask, which will later be released.



int __irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m,
  bool setaffinity)
{
unsigned long flags;
struct irq_desc *desc = irq_get_desc_lock(irq, &flags, 
IRQ_GET_DESC_CHECK_GLOBAL);

if (!desc)
return -EINVAL;
->  desc->affinity_hint = m;
irq_put_desc_unlock(desc, flags);
if (m && setaffinity)
__irq_set_affinity(irq, m, false);
return 0;
}
EXPORT_SYMBOL_GPL(__irq_apply_affinity_hint);

The above code directly stores the mask pointer. If the mask is a
temporary value, I think that is a bug.

And I notice that many places directly pass a temporary value to this
API, so I am a little confused. ^_^ Or maybe I missed something.

Thanks.


> >
> > If you are right, I think you should delete all code about 
> > msix_affinity_masks?
>
> Sorry for the late reply. I've been away.
>
> It looks like msix_affinity_masks became unused - intentionally - in
> 2015, after commit 210d150e1f5d ("virtio_pci: Clear stale cpumask when
> setting irq affinity") [1].
>
> Originally introduced in 2012 in commit 75a0a52be3c2 ("virtio: introduce
> an API to set affinity for a virtqueue") [2]. As I understand, it was
> meant to make it possible to set VQ affinity to more than one CPU.
>
> Now that we can pass a CPU mask, listing all CPUs, to set_vq_affinity,
> msix_affinity_masks seems to no longer have a purpose.
>
> So, IMO, you're right. We can remove it.
>
> Happy to do that in a follow up series.
>
> That is - if you're okay with these two patches in the current form.
>
> Thanks for reviewing.
>
> [1] 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=210d150e1f5da506875e376422ba31ead2d49621
> [2] 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=75a0a52be3c27b58654fbed2c8f2ff401482b9a4


Re: [PATCH v2] ALSA: virtio: use copy and fill_silence callbacks

2023-10-23 Thread Anton Yakovlev via Virtualization

Hi Takashi,

On 19.10.2023 16:48, Takashi Iwai wrote:

On Thu, 19 Oct 2023 03:20:19 +0200,
Anton Yakovlev wrote:


Hi Takashi,

On 19.10.2023 03:07, Takashi Iwai wrote:

On Wed, 18 Oct 2023 12:48:23 +0200,
Matias Ezequiel Vara Larsen wrote:


This commit replaces the mmap mechanism with the copy() and
fill_silence() callbacks for both capturing and playback for the
virtio-sound driver. This change is required to prevent the updating of
the content of a buffer that is already in the available ring.

The current mechanism splits a dma buffer into descriptors that are
exposed to the device. This dma buffer is shared with the user
application. When the device consumes a buffer, the driver moves the
request from the used ring to the available ring.

The driver exposes the buffer to the device without knowing if the
content has been updated from the user. The section 2.8.21.1 of the
virtio spec states that: "The device MAY access the descriptor chains
the driver created and the memory they refer to immediately". If the
device picks up buffers from the available ring just after it is
notified, it happens that the content may be old.

By providing the copy() callback, the driver first updates the content
of the buffer, and then, exposes the buffer to the device by enqueuing
it in the available ring. Thus, device always picks up a buffer that is
updated. During copy(), the number of requests enqueued depends on the
"pos" and "bytes" arguments. The length of each request is period_size
bytes.

For capturing, the driver starts by exposing all the available buffers
to device. After device updates the content of a buffer, it enqueues it
in the used ring. It is only after the copy() for capturing is issued
that the driver re-enqueues the buffer in the available ring.

Co-developed-by: Anton Yakovlev 
Signed-off-by: Matias Ezequiel Vara Larsen 
---
Changelog:
v1 -> v2:
   * Use snd_pcm_set_managed_buffer_all() for buffer allocation/freeing.
   * Make virtsnd_pcm_msg_send() generic by specifying the offset and size
 for the modified part of the buffer; this way no assumptions need to
 be made.
   * Disable SNDRV_PCM_INFO_NO_REWINDS since now only sequential
 reading/writing of frames is supported.
   * Correct comment at virtsnd_pcm_msg_send().
   * v1 patch at:
 
https://lore.kernel.org/lkml/20231016151000.GE119987@fedora/t/

   sound/virtio/virtio_pcm.c |  7 ++-
   sound/virtio/virtio_pcm.h |  9 ++--
   sound/virtio/virtio_pcm_msg.c | 93 ++-
   sound/virtio/virtio_pcm_ops.c | 81 +-
   4 files changed, 137 insertions(+), 53 deletions(-)


Most of the code changes look good, but I wonder:



diff --git a/sound/virtio/virtio_pcm.c b/sound/virtio/virtio_pcm.c
index c10d91fff2fb..66d67eef1bcc 100644
--- a/sound/virtio/virtio_pcm.c
+++ b/sound/virtio/virtio_pcm.c
@@ -104,12 +104,11 @@ static int virtsnd_pcm_build_hw(struct virtio_pcm_substream *vss,
 * only message-based transport.
 */
vss->hw.info =
-   SNDRV_PCM_INFO_MMAP |
-   SNDRV_PCM_INFO_MMAP_VALID |


Do we need the removal of those MMAP features inevitably?
Usually mmap can still work even if the driver implements the copy
ops.  Those aren't always mutually exclusive.


The driver uses a message queue to communicate with the device. Thus,
the audio buffer is sliced into several I/O requests (= number of
periods) of the same size (= period size).

Before this, all such requests were enqueued when the substream started,
and immediately re-enqueued once the request is completed. This approach
made it possible to add mmap support. But for mmap there are no explicit
notifications from the application about how many frames were written or read.
Thus, it was assumed that the virtual device should read/write frames to
requests based on timings. And there are some problems here:

   1. This was found to violate the virtio specification: if a request is
  already in the queue, the device can safely read/write there at any
  time.
   2. It looks like this breaks the use case with swiotlb. Personally I'm
  not sure how the application handles DMA ownership in the case of
  mmaped buffer.

To correctly implement mmap support, instead of transferring data via a
message queue, the driver and device must have a shared memory region.
We can add mmap in the future when we expand the functionality of the
device to support such shared memory.


Ah, then this implementation might be overkill.  You're still using
the (intermediate) vmalloc buffer allocated via PCM managed mode, and
the actual data is copied from/to there.  So it doesn't conflict with
the mmap operation at all.

I guess that the problem you're trying to solve (the immediate d

Re: [PATCH v3] ALSA: virtio: use ack callback

2023-10-23 Thread Anton Yakovlev via Virtualization

Hi Matias,

Thanks for the new patch!



On 24.10.2023 00:06, Matias Ezequiel Vara Larsen wrote:

This commit uses the ack() callback to determine when a buffer has been
updated, then exposes it to the guest.

The current mechanism splits a dma buffer into descriptors that are
exposed to the device. This dma buffer is shared with the user
application. When the device consumes a buffer, the driver moves the
request from the used ring to the available ring.

The driver exposes the buffer to the device without knowing if the
content has been updated from the user. The section 2.8.21.1 of the
virtio spec states that: "The device MAY access the descriptor chains
the driver created and the memory they refer to immediately". If the
device picks up buffers from the available ring just after it is
notified, it happens that the content may be old.

When the ack() callback is invoked, the driver exposes only the buffers
that have already been updated, i.e., enqueued in the available ring.
Thus, the device always picks up a buffer that is updated.

For capturing, the driver starts by exposing all the available buffers
to device. After device updates the content of a buffer, it enqueues it
in the used ring. It is only after the ack() for capturing is issued
that the driver re-enqueues the buffer in the available ring.

Co-developed-by: Anton Yakovlev 
Signed-off-by: Anton Yakovlev 
Signed-off-by: Matias Ezequiel Vara Larsen 
---
Changelog:
v2 -> v3:
  * Use ack() callback instead of copy()/fill_silence()
  * Change commit name
  * Rewrite commit message
  * Remove virtsnd_pcm_msg_send_locked()
  * Use single callback for both capture and playback
  * Fix kernel test robot warnings regarding documentation
  * v2 patch at:

https://lore.kernel.org/lkml/87y1fzkq8c.wl-tiwai@suse.de/T/
v1 -> v2:
  * Use snd_pcm_set_managed_buffer_all() for buffer allocation/freeing.
  * Make virtsnd_pcm_msg_send() generic by specifying the offset and size
for the modified part of the buffer; this way no assumptions need to
be made.
  * Disable SNDRV_PCM_INFO_NO_REWINDS since now only sequential
reading/writing of frames is supported.
  * Correct comment at virtsnd_pcm_msg_send().
  * v1 patch at:

https://lore.kernel.org/lkml/20231016151000.GE119987@fedora/t/

  sound/virtio/virtio_pcm.c |  1 +
  sound/virtio/virtio_pcm.h |  6 ++-
  sound/virtio/virtio_pcm_msg.c | 80 ---
  sound/virtio/virtio_pcm_ops.c | 30 +++--
  4 files changed, 79 insertions(+), 38 deletions(-)

diff --git a/sound/virtio/virtio_pcm.c b/sound/virtio/virtio_pcm.c
index c10d91fff2fb..9cc5a95b4913 100644
--- a/sound/virtio/virtio_pcm.c
+++ b/sound/virtio/virtio_pcm.c
@@ -124,6 +124,7 @@ static int virtsnd_pcm_build_hw(struct virtio_pcm_substream *vss,
values = le64_to_cpu(info->formats);
  
  	vss->hw.formats = 0;

+   vss->appl_ptr = 0;
  
  	for (i = 0; i < ARRAY_SIZE(g_v2a_format_map); ++i)

if (values & (1ULL << i)) {
diff --git a/sound/virtio/virtio_pcm.h b/sound/virtio/virtio_pcm.h
index 062eb8e8f2cf..ea3c2845ae9b 100644
--- a/sound/virtio/virtio_pcm.h
+++ b/sound/virtio/virtio_pcm.h
@@ -27,6 +27,7 @@ struct virtio_pcm_msg;
   *substream operators.
   * @buffer_bytes: Current buffer size in bytes.
   * @hw_ptr: Substream hardware pointer value in bytes [0 ... buffer_bytes).
+ * @appl_ptr: Substream application pointer value in bytes [0 ... buffer_bytes).
   * @xfer_enabled: Data transfer state (0 - off, 1 - on).
   * @xfer_xrun: Data underflow/overflow state (0 - no xrun, 1 - xrun).
  * @stopped: True if the substream is stopped and must be released on the device
@@ -51,13 +52,13 @@ struct virtio_pcm_substream {
spinlock_t lock;
size_t buffer_bytes;
size_t hw_ptr;
+   size_t appl_ptr;
bool xfer_enabled;
bool xfer_xrun;
bool stopped;
bool suspended;
struct virtio_pcm_msg **msgs;
unsigned int nmsgs;
-   int msg_last_enqueued;
unsigned int msg_count;
wait_queue_head_t msg_empty;
  };
@@ -117,7 +118,8 @@ int virtsnd_pcm_msg_alloc(struct virtio_pcm_substream *vss,
  
  void virtsnd_pcm_msg_free(struct virtio_pcm_substream *vss);
  
-int virtsnd_pcm_msg_send(struct virtio_pcm_substream *vss);

+int virtsnd_pcm_msg_send(struct virtio_pcm_substream *vss, unsigned long offset,
+unsigned long bytes);
  
  unsigned int virtsnd_pcm_msg_pending_num(struct virtio_pcm_substream *vss);
  
diff --git a/sound/virtio/virtio_pcm_msg.c b/sound/virtio

[PATCH] vhost-vdpa: fix NULL pointer deref in _compat_vdpa_reset

2023-10-23 Thread Si-Wei Liu
As subject. There's a vhost_vdpa_reset() done earlier, before
vhost_dev is initialized via vhost_dev_init(), ending up with a
NULL pointer dereference. The fix is to check whether vqs is
initialized before checking backend features and resetting the device.

  BUG: kernel NULL pointer dereference, address: 
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x) - not-present page
  PGD 0 P4D 0
  Oops:  [#1] SMP
  CPU: 3 PID: 1727 Comm: qemu-system-x86 Not tainted 6.6.0-rc6+ #2
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-
  a4aeb02-prebuilt.qemu.org 04/01/2014
  RIP: 0010:_compat_vdpa_reset+0x47/0xc0 [vhost_vdpa]
  Code: c7 c7 fb 12 56 a0 4c 8d a5 b8 02 00 00 48 89 ea e8 7e b8 c4
  48 89 ee 48 c7 c7 19 13 56 a0 4c 8b ad b0 02 00 00 <48> 8b 00 49
  00 48 8b 80 88 45 00 00 48 c1 e8 08 48
  RSP: 0018:8881063c3c38 EFLAGS: 00010246
  RAX:  RBX: 8881074eb800 RCX: 
  RDX:  RSI: 888103ab4000 RDI: a0561319
  RBP: 888103ab4000 R08: dfff R09: 0001
  R10: 0003 R11: 7fecbac0 R12: 888103ab42b8
  R13: 888106dbe850 R14: 0003 R15: 8881074ebc18
  FS:  7f02fba6ef00() GS:5f8c()
  knlGS:
  CS:  0010 DS:  ES:  CR0: 80050033
  CR2:  CR3: 0001325e5003 CR4: 00372ea0
  DR0:  DR1:  DR2: 
  DR3:  DR6: fffe0ff0 DR7: 0400
  Call Trace:
   
   ? __die+0x1f/0x60
   ? page_fault_oops+0x14c/0x3b0
   ? exc_page_fault+0x74/0x140
   ? asm_exc_page_fault+0x22/0x30
   ? _compat_vdpa_reset+0x47/0xc0 [vhost_vdpa]
   ? _compat_vdpa_reset+0x32/0xc0 [vhost_vdpa]
   vhost_vdpa_open+0x55/0x270 [vhost_vdpa]
   ? sb_init_dio_done_wq+0x50/0x50
   chrdev_open+0xc0/0x210
   ? __unregister_chrdev+0x50/0x50
   do_dentry_open+0x1fc/0x4f0
   path_openat+0xc2d/0xf20
   do_filp_open+0xb4/0x160
   ? kmem_cache_alloc+0x3c/0x490
   do_sys_openat2+0x8d/0xc0
   __x64_sys_openat+0x6a/0xa0
   do_syscall_64+0x3c/0x80
   entry_SYSCALL_64_after_hwframe+0x46/0xb0

Fixes: 10cbf8dfaf93 ("vhost-vdpa: clean iotlb map during reset for older 
userspace")
Reported-by: Dragos Tatulea 
Closes: 
https://lore.kernel.org/all/b4913f84-8b52-4d28-af51-8573dc361...@oracle.com/
Signed-off-by: Si-Wei Liu 
---
 drivers/vhost/vdpa.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 9ce40003793b..9a2343c45df0 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -232,9 +232,11 @@ static int _compat_vdpa_reset(struct vhost_vdpa *v)
struct vdpa_device *vdpa = v->vdpa;
u32 flags = 0;
 
-   flags |= !vhost_backend_has_feature(v->vdev.vqs[0],
-   VHOST_BACKEND_F_IOTLB_PERSIST) ?
-VDPA_RESET_F_CLEAN_MAP : 0;
+   if (v->vdev.vqs) {
+   flags |= !vhost_backend_has_feature(v->vdev.vqs[0],
+   VHOST_BACKEND_F_IOTLB_PERSIST) ?
+VDPA_RESET_F_CLEAN_MAP : 0;
+   }
 
return vdpa_reset(vdpa, flags);
 }
-- 
2.39.3



Re: [PATCH v4 5/7] vhost-vdpa: clean iotlb map during reset for older userspace

2023-10-23 Thread Si-Wei Liu

(+ linux-next)

Hi Michael,

Dragos reported the oops below, for which I have a fix at hand (being
fully tested), ready to be posted to linux-next. Please let me know if
you want me to respin the original patch series, or whether you think
it'd be fine to fix it on top.


On 10/23/2023 11:59 AM, Dragos Tatulea wrote:

On Sat, 2023-10-21 at 02:25 -0700, Si-Wei Liu wrote:

Using .compat_reset op from the previous patch, the buggy .reset
behaviour can be kept as-is on older userspace apps, which don't ack the
IOTLB_PERSIST backend feature. As this compatibility quirk is limited to
those drivers that used to be buggy in the past, it won't change the
behaviour or affect the ABI on setups with an API-compliant driver.

The separation of .compat_reset from the regular .reset allows
vhost-vdpa to know which driver had broken behaviour before, so it
can apply the corresponding compatibility quirk to the individual driver
whenever needed.  Compared to overloading the existing .reset with
flags, .compat_reset won't cause any extra burden to the implementation
of every compliant driver.

Signed-off-by: Si-Wei Liu 
---
  drivers/vhost/vdpa.c | 17 +
  drivers/virtio/virtio_vdpa.c |  2 +-
  include/linux/vdpa.h |  7 +--
  3 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index acc7c74ba7d6..9ce40003793b 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -227,13 +227,22 @@ static void vhost_vdpa_unsetup_vq_irq(struct vhost_vdpa *v, u16 qid)
 irq_bypass_unregister_producer(&vq->call_ctx.producer);
  }
  
-static int vhost_vdpa_reset(struct vhost_vdpa *v)

+static int _compat_vdpa_reset(struct vhost_vdpa *v)
  {
 struct vdpa_device *vdpa = v->vdpa;
+   u32 flags = 0;
  
-   v->in_batch = 0;

+   flags |= !vhost_backend_has_feature(v->vdev.vqs[0],
+   VHOST_BACKEND_F_IOTLB_PERSIST) ?
+VDPA_RESET_F_CLEAN_MAP : 0;

Hi Si-Wei,

I am getting an Oops due to the vqs not being initialized here. Here's
how it looks:

[   37.817075] BUG: kernel NULL pointer dereference, address: 
[   37.817674] #PF: supervisor read access in kernel mode
[   37.818150] #PF: error_code(0x) - not-present page
[   37.818615] PGD 0 P4D 0
[   37.818893] Oops:  [#1] SMP
[   37.819223] CPU: 3 PID: 1727 Comm: qemu-system-x86 Not tainted 6.6.0-rc6+ #2
[   37.819829] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-
1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[   37.820791] RIP: 0010:_compat_vdpa_reset+0x47/0xc0 [vhost_vdpa]
[   37.821316] Code: c7 c7 fb 12 56 a0 4c 8d a5 b8 02 00 00 48 89 ea e8 7e b8 c4
e0 48 8b 43 28 48 89 ee 48 c7 c7 19 13 56 a0 4c 8b ad b0 02 00 00 <48> 8b 00 49
8b 95 d8 00 00 00 48 8b 80 88 45 00 00 48 c1 e8 08 48
[   37.822811] RSP: 0018:8881063c3c38 EFLAGS: 00010246
[   37.823285] RAX:  RBX: 8881074eb800 RCX: 
[   37.823893] RDX:  RSI: 888103ab4000 RDI: a0561319
[   37.824506] RBP: 888103ab4000 R08: dfff R09: 0001
[   37.825116] R10: 0003 R11: 7fecbac0 R12: 888103ab42b8
[   37.825721] R13: 888106dbe850 R14: 0003 R15: 8881074ebc18
[   37.826326] FS:  7f02fba6ef00() GS:5f8c()
knlGS:
[   37.827035] CS:  0010 DS:  ES:  CR0: 80050033
[   37.827552] CR2:  CR3: 0001325e5003 CR4: 00372ea0
[   37.828162] DR0:  DR1:  DR2: 
[   37.828772] DR3:  DR6: fffe0ff0 DR7: 0400
[   37.829381] Call Trace:
[   37.829660]  
[   37.829911]  ? __die+0x1f/0x60
[   37.830234]  ? page_fault_oops+0x14c/0x3b0
[   37.830623]  ? exc_page_fault+0x74/0x140
[   37.830999]  ? asm_exc_page_fault+0x22/0x30
[   37.831402]  ? _compat_vdpa_reset+0x47/0xc0 [vhost_vdpa]
[   37.831888]  ? _compat_vdpa_reset+0x32/0xc0 [vhost_vdpa]
[   37.832366]  vhost_vdpa_open+0x55/0x270 [vhost_vdpa]
[   37.832821]  ? sb_init_dio_done_wq+0x50/0x50
[   37.833225]  chrdev_open+0xc0/0x210
[   37.833582]  ? __unregister_chrdev+0x50/0x50
[   37.833990]  do_dentry_open+0x1fc/0x4f0
[   37.834363]  path_openat+0xc2d/0xf20
[   37.834721]  do_filp_open+0xb4/0x160
[   37.835082]  ? kmem_cache_alloc+0x3c/0x490
[   37.835474]  do_sys_openat2+0x8d/0xc0
[   37.835834]  __x64_sys_openat+0x6a/0xa0
[   37.836208]  do_syscall_64+0x3c/0x80
[   37.836564]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
[   37.837021] RIP: 0033:0x7f02fcc2c085
[   37.837378] Code: 8b 55 d0 48 89 45 b0 75 a0 44 89 55 9c e8 63 7d f8 ff 44 8b
55 9c 89 da 4c 89 e6 41 89 c0 bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0
ff ff 77 33 44 89 c7 89 45 9c e8 b8 7d f8 ff 8b 45 9c
[   37.838891] RSP: 002b:7ffdea3c8cc0 EFLAGS: 0293 ORIG_RAX:
0101
[   37.839571] RAX: ffda R

Re: [PATCH v4 0/7] vdpa: decouple reset of iotlb mapping from device reset

2023-10-23 Thread Si-Wei Liu



On 10/22/2023 8:51 PM, Jason Wang wrote:

Hi Si-Wei:

On Sat, Oct 21, 2023 at 5:28 PM Si-Wei Liu  wrote:

In order to reduce needlessly high setup and teardown cost
of iotlb mapping during live migration, it's crucial to
decouple the vhost-vdpa iotlb abstraction from the virtio
device life cycle, i.e. iotlb mappings should be left
intact across virtio device reset [1]. For it to work, the
on-chip IOMMU parent device could implement a separate
.reset_map() operation callback to restore 1:1 DMA mapping
without having to resort to the .reset() callback, the
latter of which is mainly used to reset virtio device state.
This new .reset_map() callback will be invoked only before
the vhost-vdpa driver is to be removed and detached from
the vdpa bus, such that other vdpa bus drivers, e.g.
virtio-vdpa, can start with 1:1 DMA mapping when they
are attached. For the context, those on-chip IOMMU parent
devices, create the 1:1 DMA mapping at vdpa device creation,
and they would implicitly destroy the 1:1 mapping when
the first .set_map or .dma_map callback is invoked.

This patchset is rebased on top of the latest vhost tree.

[1] Reducing vdpa migration downtime because of memory pin / maps
https://www.mail-archive.com/qemu-devel@nongnu.org/msg953755.html

---
v4:
- Rework compatibility using new .compat_reset driver op

> I still think having a set_backend_feature()
This will overload backend features with the role of carrying over 
compatibility quirks, which I tried to avoid. The .compat_reset in the 
v4 code uses the backend feature acknowledgement (and maybe other 
signals as well) to determine compatibility, but it doesn't tie the 
quirk directly to backend features itself. These two have different 
implications in terms of requirement, scope and maintenance/deprecation; 
better to cope with compat quirks in an explicit and driver-visible way.



> or reset_map(clean=true) might be better.
An explicit op might be marginally better from the driver writer's point 
of view. A compliant driver doesn't have to assert that clean_map is 
never true, so its code never has to deal with this case, as explained 
in the commit log for patch 5 "vhost-vdpa: clean iotlb map during reset 
for older userspace":


"
    The separation of .compat_reset from the regular .reset allows
    vhost-vdpa to know which driver had broken behavior before, so it
    can apply the corresponding compatibility quirk to the individual
    driver whenever needed.  Compared to overloading the existing .reset
    with flags, .compat_reset won't cause any extra burden to the
    implementation of every compliant driver.
"


> As it tries hard to not introduce new stuff on the bus.
Honestly I don't see a substantial difference between these other than 
the color. There's no single best solution that stands out among the 3. 
And I assume you already noticed that all of the above 3 approaches will 
have to go with backend features negotiation: the 1st vdpa reset before 
backend feature negotiation will use the compliant version of .reset 
that doesn't clean up the map. While I don't think this nuance matters 
much to existing older userspace apps, as the maps should already have 
been cleaned by the previous process in vhost_vdpa_cleanup(), if 
bug-for-bug behavioral compatibility is what you want, a module 
parameter would be the single best answer.


Regards,
-Siwei


> But we can listen to others for sure.
>
> Thanks



___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [GIT PULL] virtio: last minute fixes

2023-10-23 Thread pr-tracker-bot
The pull request you sent on Mon, 23 Oct 2023 01:02:07 -0400:

> https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/7c14564010fc1d0f16ca7d39b0ff948b43344209

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


Re: [PATCH 1/2] virtio_pci: Don't make an extra copy of cpu affinity mask

2023-10-23 Thread Jakub Sitnicki via Virtualization
On Thu, Oct 19, 2023 at 08:55 PM +08, Xuan Zhuo wrote:
> On Thu, 19 Oct 2023 12:16:24 +0200, Jakub Sitnicki  
> wrote:
>> Since commit 19e226e8cc5d ("virtio: Make vp_set_vq_affinity() take a
>> mask.") it is actually not needed to have a local copy of the cpu mask.
>
>
> Could you give more info to prove this?
>
> If you are right, I think you should delete all code about 
> msix_affinity_masks?

Sorry for the late reply. I've been away.

It looks that msix_affinity_masks became unused - intentionally - in
2015, after commit 210d150e1f5d ("virtio_pci: Clear stale cpumask when
setting irq affinity") [1].

Originally introduced in 2012 in commit 75a0a52be3c2 ("virtio: introduce
an API to set affinity for a virtqueue") [2]. As I understand, it was
meant to make it possible to set VQ affinity to more than one CPU.

Now that we can pass a CPU mask, listing all CPUs, to set_vq_affinity,
msix_affinity_masks seems to no longer have a purpose.

So, IMO, you're right. We can remove it.

Happy to do that in a follow up series.

That is - if you're okay with these two patches in the current form.

Thanks for reviewing.

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=210d150e1f5da506875e376422ba31ead2d49621
[2] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=75a0a52be3c27b58654fbed2c8f2ff401482b9a4


Re: [PATCH V1 vfio 0/9] Introduce a vfio driver over virtio devices

2023-10-23 Thread Alex Williamson
On Mon, 23 Oct 2023 13:20:43 -0300
Jason Gunthorpe  wrote:

> On Mon, Oct 23, 2023 at 10:09:13AM -0600, Alex Williamson wrote:
> > On Mon, 23 Oct 2023 12:42:57 -0300
> > Jason Gunthorpe  wrote:
> >   
> > > On Mon, Oct 23, 2023 at 09:33:23AM -0600, Alex Williamson wrote:
> > >   
> > > > > Alex,
> > > > > Are you fine to leave the provisioning of the VF including the 
> > > > > control 
> > > > > of its transitional capability in the device hands as was suggested 
> > > > > by 
> > > > > Jason ?
> > > > 
> > > > If this is the standard we're going to follow, ie. profiling of a
> > > > device is expected to occur prior to the probe of the vfio-pci variant
> > > > driver, then we should get the out-of-tree NVIDIA vGPU driver on board
> > > > with this too.
> > > 
> > > Those GPU drivers are using mdev not vfio-pci..  
> > 
> > The SR-IOV mdev vGPUs rely on the IOMMU backing device support which
> > was removed from upstream.
> 
> It wasn't, but it changed forms.
> 
> mdev is a sysfs framework for managing lifecycle with GUIDs only.
> 
> The thing using mdev can call vfio_register_emulated_iommu_dev() or
> vfio_register_group_dev(). 
> 
> It doesn't matter to the mdev stuff.
> 
> The thing using mdev is responsible to get the struct device to pass
> to vfio_register_group_dev()

Are we describing what can be done (possibly limited to out-of-tree
drivers) or what should be done and would be accepted upstream?

I'm under the impression that mdev has been redefined to be more
narrowly focused for emulated IOMMU devices and that devices based
around a PCI VF should be making use of a vfio-pci variant driver.

Are you suggesting it's the vendor's choice based on whether they want
the mdev lifecycle support?

We've defined certain aspects of the vfio-mdev interface as only
available for emulated IOMMU devices, ex. page pinning.  Thanks,

Alex



Re: [PATCH v2] vsock/virtio: initialize the_virtio_vsock before using VQs

2023-10-23 Thread Stefano Garzarella

On Mon, Oct 23, 2023 at 06:36:21PM +0300, Alexandru Matei wrote:

On 10/23/2023 6:13 PM, Stefano Garzarella wrote:

On Mon, Oct 23, 2023 at 05:59:45PM +0300, Alexandru Matei wrote:

On 10/23/2023 5:52 PM, Alexandru Matei wrote:

On 10/23/2023 5:29 PM, Stefano Garzarella wrote:

On Mon, Oct 23, 2023 at 05:08:33PM +0300, Alexandru Matei wrote:

Once VQs are filled with empty buffers and we kick the host,
it can send connection requests.  If 'the_virtio_vsock' is not
initialized before, replies are silently dropped and do not reach the host.

Fixes: 0deab087b16a ("vsock/virtio: use RCU to avoid use-after-free on 
the_virtio_vsock")
Signed-off-by: Alexandru Matei 
---
v2:
- split virtio_vsock_vqs_init in vqs_init and vqs_fill and moved
 the_virtio_vsock initialization after vqs_init

net/vmw_vsock/virtio_transport.c | 9 +++--
1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index e95df847176b..92738d1697c1 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -559,6 +559,11 @@ static int virtio_vsock_vqs_init(struct virtio_vsock 
*vsock)
vsock->tx_run = true;
mutex_unlock(&vsock->tx_lock);

+    return 0;
+}
+
+static void virtio_vsock_vqs_fill(struct virtio_vsock *vsock)


What about renaming this function to virtio_vsock_vqs_start() and also moving the 
setting of `tx_run` here?


It works but in this case we also need to move rcu_assign_pointer in 
virtio_vsock_vqs_start(),
the assignment needs to be right after setting tx_run to true and before 
filling the VQs.


Why?

If `rx_run` is false, we shouldn't need to send replies to the host IIUC.

If we need this instead, please add a comment in the code, but also in the 
commit, because it's not clear why.



We need rcu_assign_pointer after setting tx_run to true for connections 
that are initiated from the guest -> host.
virtio_transport_connect() calls virtio_transport_send_pkt().  Once 
'the_virtio_vsock' is initialized, virtio_transport_send_pkt() will 
queue the packet,

but virtio_transport_send_pkt_work() will exit if tx_run is false.


Okay, but in this case we could safely queue &vsock->send_pkt_work after 
finishing initialization to send those packets queued earlier.


In the meantime I'll try to see if we can leave the initialization of 
`the_virtio_vsock` as the ultimate step and maybe go out first in the 
workers if it's not set.


That way just queue all the workers after everything is done and we 
should be fine.

And if we move rcu_assign_pointer then there is no need to split the function 
in two,
We can move rcu_assign_pointer() directly inside virtio_vsock_vqs_init() after 
setting tx_run.


Yep, this could be another option, but we need to change the name of that 
function in this case.



OK, how does virtio_vsock_vqs_setup() sound?


Or virtio_vsock_start() (without vqs)

Stefano




Stefano





Thanks,
Stefano


+{
mutex_lock(&vsock->rx_lock);
virtio_vsock_rx_fill(vsock);
vsock->rx_run = true;
@@ -568,8 +573,6 @@ static int virtio_vsock_vqs_init(struct virtio_vsock *vsock)
virtio_vsock_event_fill(vsock);
vsock->event_run = true;
mutex_unlock(&vsock->event_lock);
-
-    return 0;
}

static void virtio_vsock_vqs_del(struct virtio_vsock *vsock)
@@ -664,6 +667,7 @@ static int virtio_vsock_probe(struct virtio_device *vdev)
    goto out;

rcu_assign_pointer(the_virtio_vsock, vsock);
+    virtio_vsock_vqs_fill(vsock);

mutex_unlock(&the_virtio_vsock_mutex);

@@ -736,6 +740,7 @@ static int virtio_vsock_restore(struct virtio_device *vdev)
    goto out;

rcu_assign_pointer(the_virtio_vsock, vsock);
+    virtio_vsock_vqs_fill(vsock);

out:
mutex_unlock(&the_virtio_vsock_mutex);
-- 
2.34.1



Re: [PATCH V1 vfio 0/9] Introduce a vfio driver over virtio devices

2023-10-23 Thread Alex Williamson
On Mon, 23 Oct 2023 12:42:57 -0300
Jason Gunthorpe  wrote:

> On Mon, Oct 23, 2023 at 09:33:23AM -0600, Alex Williamson wrote:
> 
> > > Alex,
> > > Are you fine to leave the provisioning of the VF including the control 
> > > of its transitional capability in the device hands as was suggested by 
> > > Jason ?  
> > 
> > If this is the standard we're going to follow, ie. profiling of a
> > device is expected to occur prior to the probe of the vfio-pci variant
> > driver, then we should get the out-of-tree NVIDIA vGPU driver on board
> > with this too.  
> 
> Those GPU drivers are using mdev not vfio-pci..

The SR-IOV mdev vGPUs rely on the IOMMU backing device support which
was removed from upstream.  They only exist in the mdev form on
downstreams which have retained this interface for compatibility and
continuity.  I'm not aware of any other means by which the SR-IOV RID
can be used in the mdev model, therefore only the pre-SR-IOV GPUs
should continue to use the mdev interface.

> mdev doesn't have a way in its uapi to configure the mdev before it is
> created.

Of course.  Thanks,

Alex



Re: [PATCH V1 vfio 0/9] Introduce a vfio driver over virtio devices

2023-10-23 Thread Alex Williamson
On Sun, 22 Oct 2023 11:20:31 +0300
Yishai Hadas  wrote:

> On 17/10/2023 16:42, Yishai Hadas wrote:
> > This series introduce a vfio driver over virtio devices to support the
> > legacy interface functionality for VFs.
> >
> > Background, from the virtio spec [1].
> > 
> > In some systems, there is a need to support a virtio legacy driver with
> > a device that does not directly support the legacy interface. In such
> > scenarios, a group owner device can provide the legacy interface
> > functionality for the group member devices. The driver of the owner
> > device can then access the legacy interface of a member device on behalf
> > of the legacy member device driver.
> >
> > For example, with the SR-IOV group type, group members (VFs) can not
> > present the legacy interface in an I/O BAR in BAR0 as expected by the
> > legacy pci driver. If the legacy driver is running inside a virtual
> > machine, the hypervisor executing the virtual machine can present a
> > virtual device with an I/O BAR in BAR0. The hypervisor intercepts the
> > legacy driver accesses to this I/O BAR and forwards them to the group
> > owner device (PF) using group administration commands.
> > 
> >
> > The first 6 patches are in the virtio area and handle the below:
> > - Fix common config map for modern device as was reported by Michael 
> > Tsirkin.
> > - Introduce the admin virtqueue infrastcture.
> > - Expose the layout of the commands that should be used for
> >supporting the legacy access.
> > - Expose APIs to enable upper layers as of vfio, net, etc
> >to execute admin commands.
> >
> > The above follows the virtio spec that was lastly accepted in that area
> > [1].
> >
> > The last 3 patches are in the vfio area and handle the below:
> > - Expose some APIs from vfio/pci to be used by the vfio/virtio driver.
> > - Introduce a vfio driver over virtio devices to support the legacy
> >interface functionality for VFs.
> >
> > The series was tested successfully over virtio-net VFs in the host,
> > while running in the guest both modern and legacy drivers.
> >
> > [1]
> > https://github.com/oasis-tcs/virtio-spec/commit/03c2d32e5093ca9f2a17797242fbef88efe94b8c
> >
> > Changes from V0: 
> > https://www.spinics.net/lists/linux-virtualization/msg63802.html
> >
> > Virtio:
> > - Fix the common config map size issue that was reported by Michael
> >Tsirkin.
> > - Do not use vp_dev->vqs[] array upon vp_del_vqs() as was asked by
> >Michael, instead skip the AQ specifically.
> > - Move admin vq implementation into virtio_pci_modern.c as was asked by
> >Michael.
> > - Rename structure virtio_avq to virtio_pci_admin_vq and some extra
> >corresponding renames.
> > - Remove exported symbols virtio_pci_vf_get_pf_dev(),
> >virtio_admin_cmd_exec() as now callers are local to the module.
> > - Handle inflight commands as part of the device reset flow.
> > - Introduce APIs per admin command in virtio-pci as was asked by Michael.
> >
> > Vfio:
> > - Change to use EXPORT_SYMBOL_GPL instead of EXPORT_SYMBOL for
> >vfio_pci_core_setup_barmap() and vfio_pci_iowrite#xxx() as pointed by
> >Alex.
> > - Drop the intermediate patch which prepares the commands and calls the
> >generic virtio admin command API (i.e. virtio_admin_cmd_exec()).
> > - Instead, call directly to the new APIs per admin command that are
> >exported from Virtio - based on Michael's request.
> > - Enable only virtio-net as part of the pci_device_id table to enforce
> >upon binding only what is supported as suggested by Alex.
> > - Add support for byte-wise access (read/write) over the device config
> >region as was asked by Alex.
> > - Consider whether MSIX is practically enabled/disabled to choose the
> >right opcode upon issuing read/write admin command, as mentioned
> >by Michael.
> > - Move to use VIRTIO_PCI_CONFIG_OFF instead of adding some new defines
> >as was suggested by Michael.
> > - Set the '.close_device' op to vfio_pci_core_close_device() as was
> >pointed by Alex.
> > - Adapt to Vfio multi-line comment style in a few places.
> > - Add virtualization@lists.linux-foundation.org in the MAINTAINERS file
> >to be CCed for the new driver as was suggested by Jason.
> >
> > Yishai
> >
> > Feng Liu (5):
> >virtio-pci: Fix common config map for modern device
> >virtio: Define feature bit for administration virtqueue
> >virtio-pci: Introduce admin virtqueue
> >virtio-pci: Introduce admin command sending function
> >virtio-pci: Introduce admin commands
> >
> > Yishai Hadas (4):
> >virtio-pci: Introduce APIs to execute legacy IO admin commands
> >vfio/pci: Expose vfio_pci_core_setup_barmap()
> >vfio/pci: Expose vfio_pci_iowrite/read##size()
> >vfio/virtio: Introduce a vfio driver over virtio devices
> >
> >   MAINTAINERS 

Re: [PATCH v4 4/4] vduse: Add LSM hooks to check Virtio device type

2023-10-23 Thread Casey Schaufler
On 10/23/2023 12:28 AM, Maxime Coquelin wrote:
>
>
> On 10/21/23 00:20, Casey Schaufler wrote:
>> On 10/20/2023 8:58 AM, Maxime Coquelin wrote:
>>> This patch introduces LSM hooks for devices creation,
>>> destruction and opening operations, checking the
>>> application is allowed to perform these operations for
>>> the Virtio device type.
>>
>> Why do you think that there needs to be a special LSM check for virtio
>> devices? What can't existing device attributes be used?
>
> Michael asked for a way for SELinux to allow/prevent the creation of
> some types of devices [0].
>
> A device is created using ioctl() on VDUSE control chardev. Its type is
> specified via a field in the structure passed in argument.
>
> I didn't see any other way than adding dedicated LSM hooks to achieve this,
> but it is possible that there is a better way to do it?

At the very least the hook should be made more general, and I'd have to
see a proposal before commenting on that. security_dev_destroy(dev) might
be a better approach. If there's reason to control destruction of vduse
devices it's reasonable to assume that there are other devices with the
same or similar properties.

Since SELinux is your target use case, can you explain why you can't
create SELinux policy to enforce the restrictions you're after? I believe
(but can be proven wrong, of course) that SELinux has a mechanism for dealing
with controls on ioctls.
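For context, SELinux can indeed filter individual ioctl commands via extended permissions (`allowxperm`). A hypothetical policy sketch follows; the domain/type names and the command-number range are invented for illustration and would need to match the real VDUSE ioctl encoding:

```
# Allow the application domain basic access to the VDUSE control chardev
allow vduse_app_t vduse_control_device_t:chr_file { open read write ioctl };

# Then whitelist only a specific range of ioctl command numbers
allowxperm vduse_app_t vduse_control_device_t:chr_file ioctl { 0x8100-0x81ff };
```

The limitation Maxime's hooks address is that the device *type* is a field inside the structure passed to the ioctl, which ioctl-number filtering alone cannot inspect.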


>
> Thanks,
> Maxime
>
> [0]:
> https://lore.kernel.org/all/20230829130430-mutt-send-email-...@kernel.org/
>


Re: [PATCH v2] vsock/virtio: initialize the_virtio_vsock before using VQs

2023-10-23 Thread Stefano Garzarella

On Mon, Oct 23, 2023 at 05:59:45PM +0300, Alexandru Matei wrote:

On 10/23/2023 5:52 PM, Alexandru Matei wrote:

On 10/23/2023 5:29 PM, Stefano Garzarella wrote:

On Mon, Oct 23, 2023 at 05:08:33PM +0300, Alexandru Matei wrote:

Once VQs are filled with empty buffers and we kick the host,
it can send connection requests.  If 'the_virtio_vsock' is not
initialized before, replies are silently dropped and do not reach the host.

Fixes: 0deab087b16a ("vsock/virtio: use RCU to avoid use-after-free on 
the_virtio_vsock")
Signed-off-by: Alexandru Matei 
---
v2:
- split virtio_vsock_vqs_init in vqs_init and vqs_fill and moved
 the_virtio_vsock initialization after vqs_init

net/vmw_vsock/virtio_transport.c | 9 +++--
1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index e95df847176b..92738d1697c1 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -559,6 +559,11 @@ static int virtio_vsock_vqs_init(struct virtio_vsock 
*vsock)
vsock->tx_run = true;
mutex_unlock(&vsock->tx_lock);

+    return 0;
+}
+
+static void virtio_vsock_vqs_fill(struct virtio_vsock *vsock)


What about renaming this function to virtio_vsock_vqs_start() and also moving the 
setting of `tx_run` here?


It works but in this case we also need to move rcu_assign_pointer in 
virtio_vsock_vqs_start(),
the assignment needs to be right after setting tx_run to true and before 
filling the VQs.


Why?

If `rx_run` is false, we shouldn't need to send replies to the host 
IIUC.


If we need this instead, please add a comment in the code, but also in 
the commit, because it's not clear why.






And if we move rcu_assign_pointer then there is no need to split the function 
in two,
We can move rcu_assign_pointer() directly inside virtio_vsock_vqs_init() after 
setting tx_run.


Yep, this could be another option, but we need to change the name of 
that function in this case.


Stefano





Thanks,
Stefano


+{
mutex_lock(&vsock->rx_lock);
virtio_vsock_rx_fill(vsock);
vsock->rx_run = true;
@@ -568,8 +573,6 @@ static int virtio_vsock_vqs_init(struct virtio_vsock *vsock)
virtio_vsock_event_fill(vsock);
vsock->event_run = true;
mutex_unlock(&vsock->event_lock);
-
-    return 0;
}

static void virtio_vsock_vqs_del(struct virtio_vsock *vsock)
@@ -664,6 +667,7 @@ static int virtio_vsock_probe(struct virtio_device *vdev)
    goto out;

rcu_assign_pointer(the_virtio_vsock, vsock);
+    virtio_vsock_vqs_fill(vsock);

mutex_unlock(&the_virtio_vsock_mutex);

@@ -736,6 +740,7 @@ static int virtio_vsock_restore(struct virtio_device *vdev)
    goto out;

rcu_assign_pointer(the_virtio_vsock, vsock);
+    virtio_vsock_vqs_fill(vsock);

out:
mutex_unlock(&the_virtio_vsock_mutex);
-- 
2.34.1









[PATCH v3] ALSA: virtio: use ack callback

2023-10-23 Thread Matias Ezequiel Vara Larsen
This commit uses the ack() callback to determine when a buffer has been
updated, then exposes it to the device.

The current mechanism splits a dma buffer into descriptors that are
exposed to the device. This dma buffer is shared with the user
application. When the device consumes a buffer, the driver moves the
request from the used ring to the available ring.

The driver exposes the buffer to the device without knowing if the
content has been updated from the user. The section 2.8.21.1 of the
virtio spec states that: "The device MAY access the descriptor chains
the driver created and the memory they refer to immediately". If the
device picks up buffers from the available ring just after it is
notified, it happens that the content may be old.

When the ack() callback is invoked, the driver exposes only the buffers
that have already been updated, i.e., enqueued in the available ring.
Thus, the device always picks up a buffer that is updated.

For capturing, the driver starts by exposing all the available buffers
to the device. After the device updates the content of a buffer, it enqueues it
in the used ring. It is only after the ack() for capturing is issued
that the driver re-enqueues the buffer in the available ring.

Co-developed-by: Anton Yakovlev 
Signed-off-by: Anton Yakovlev 
Signed-off-by: Matias Ezequiel Vara Larsen 
---
Changelog:
v2 -> v3:
 * Use ack() callback instead of copy()/fill_silence()
 * Change commit name
 * Rewrite commit message
 * Remove virtsnd_pcm_msg_send_locked()
 * Use single callback for both capture and playback
 * Fix kernel test robot warnings regarding documentation
 * v2 patch at:
   https://lore.kernel.org/lkml/87y1fzkq8c.wl-ti...@suse.de/T/
v1 -> v2:
 * Use snd_pcm_set_managed_buffer_all()for buffer allocation/freeing.
 * Make virtsnd_pcm_msg_send() generic by specifying the offset and size
   for the modified part of the buffer; this way no assumptions need to
   be made.
 * Disable SNDRV_PCM_INFO_NO_REWINDS since now only sequential
   reading/writing of frames is supported.
 * Correct comment at virtsnd_pcm_msg_send().
 * v1 patch at:
   https://lore.kernel.org/lkml/20231016151000.GE119987@fedora/t/

 sound/virtio/virtio_pcm.c |  1 +
 sound/virtio/virtio_pcm.h |  6 ++-
 sound/virtio/virtio_pcm_msg.c | 80 ---
 sound/virtio/virtio_pcm_ops.c | 30 +++--
 4 files changed, 79 insertions(+), 38 deletions(-)

diff --git a/sound/virtio/virtio_pcm.c b/sound/virtio/virtio_pcm.c
index c10d91fff2fb..9cc5a95b4913 100644
--- a/sound/virtio/virtio_pcm.c
+++ b/sound/virtio/virtio_pcm.c
@@ -124,6 +124,7 @@ static int virtsnd_pcm_build_hw(struct virtio_pcm_substream 
*vss,
values = le64_to_cpu(info->formats);
 
vss->hw.formats = 0;
+   vss->appl_ptr = 0;
 
for (i = 0; i < ARRAY_SIZE(g_v2a_format_map); ++i)
if (values & (1ULL << i)) {
diff --git a/sound/virtio/virtio_pcm.h b/sound/virtio/virtio_pcm.h
index 062eb8e8f2cf..ea3c2845ae9b 100644
--- a/sound/virtio/virtio_pcm.h
+++ b/sound/virtio/virtio_pcm.h
@@ -27,6 +27,7 @@ struct virtio_pcm_msg;
  *substream operators.
  * @buffer_bytes: Current buffer size in bytes.
  * @hw_ptr: Substream hardware pointer value in bytes [0 ... buffer_bytes).
+ * @appl_ptr: Substream application pointer value in bytes [0 ... 
buffer_bytes).
  * @xfer_enabled: Data transfer state (0 - off, 1 - on).
  * @xfer_xrun: Data underflow/overflow state (0 - no xrun, 1 - xrun).
  * @stopped: True if the substream is stopped and must be released on the 
device
@@ -51,13 +52,13 @@ struct virtio_pcm_substream {
spinlock_t lock;
size_t buffer_bytes;
size_t hw_ptr;
+   size_t appl_ptr;
bool xfer_enabled;
bool xfer_xrun;
bool stopped;
bool suspended;
struct virtio_pcm_msg **msgs;
unsigned int nmsgs;
-   int msg_last_enqueued;
unsigned int msg_count;
wait_queue_head_t msg_empty;
 };
@@ -117,7 +118,8 @@ int virtsnd_pcm_msg_alloc(struct virtio_pcm_substream *vss,
 
 void virtsnd_pcm_msg_free(struct virtio_pcm_substream *vss);
 
-int virtsnd_pcm_msg_send(struct virtio_pcm_substream *vss);
+int virtsnd_pcm_msg_send(struct virtio_pcm_substream *vss, unsigned long 
offset,
+unsigned long bytes);
 
 unsigned int virtsnd_pcm_msg_pending_num(struct virtio_pcm_substream *vss);
 
diff --git a/sound/virtio/virtio_pcm_msg.c b/sound/virtio/virtio_pcm_msg.c
index aca2dc1989ba..106e8e847746 100644
--- a/sound/virtio/virtio_pcm_msg.c
+++ b/sound/virtio/virtio_pcm_msg.c
@@ -155,7 +155,6 @@ int virtsnd_pcm_msg_alloc(struct virtio_pcm_substream *vss,
sizeof(msg->xfer));
sg_init_one(&msg->sgs[PCM_MSG_SG_STATUS], &msg->status,
sizeof(msg->status));
-   msg->length = period_bytes;
virtsnd_pcm_sg_from(&msg->sgs[PCM_MSG_SG_DATA], sg_num, data,
period_bytes);
 
@@ -18

Re: [PATCH v2] vsock/virtio: initialize the_virtio_vsock before using VQs

2023-10-23 Thread Stefano Garzarella

On Mon, Oct 23, 2023 at 05:08:33PM +0300, Alexandru Matei wrote:

Once VQs are filled with empty buffers and we kick the host,
it can send connection requests.  If 'the_virtio_vsock' is not
initialized before, replies are silently dropped and do not reach the host.

Fixes: 0deab087b16a ("vsock/virtio: use RCU to avoid use-after-free on 
the_virtio_vsock")
Signed-off-by: Alexandru Matei 
---
v2:
- split virtio_vsock_vqs_init in vqs_init and vqs_fill and moved
 the_virtio_vsock initialization after vqs_init

net/vmw_vsock/virtio_transport.c | 9 +++--
1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index e95df847176b..92738d1697c1 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -559,6 +559,11 @@ static int virtio_vsock_vqs_init(struct virtio_vsock 
*vsock)
vsock->tx_run = true;
mutex_unlock(&vsock->tx_lock);

+   return 0;
+}
+
+static void virtio_vsock_vqs_fill(struct virtio_vsock *vsock)


What about renaming this function to virtio_vsock_vqs_start() and also 
moving the setting of `tx_run` here?


Thanks,
Stefano


+{
mutex_lock(&vsock->rx_lock);
virtio_vsock_rx_fill(vsock);
vsock->rx_run = true;
@@ -568,8 +573,6 @@ static int virtio_vsock_vqs_init(struct virtio_vsock *vsock)
virtio_vsock_event_fill(vsock);
vsock->event_run = true;
mutex_unlock(&vsock->event_lock);
-
-   return 0;
}

static void virtio_vsock_vqs_del(struct virtio_vsock *vsock)
@@ -664,6 +667,7 @@ static int virtio_vsock_probe(struct virtio_device *vdev)
goto out;

rcu_assign_pointer(the_virtio_vsock, vsock);
+   virtio_vsock_vqs_fill(vsock);

mutex_unlock(&the_virtio_vsock_mutex);

@@ -736,6 +740,7 @@ static int virtio_vsock_restore(struct virtio_device *vdev)
goto out;

rcu_assign_pointer(the_virtio_vsock, vsock);
+   virtio_vsock_vqs_fill(vsock);

out:
mutex_unlock(&the_virtio_vsock_mutex);
--
2.34.1





Re: [PATCH] virtio_ring: add an error code check in virtqueue_resize

2023-10-23 Thread Michael S. Tsirkin
On Mon, Oct 23, 2023 at 06:52:34PM +0800, Xuan Zhuo wrote:
> On Mon, 23 Oct 2023 17:52:02 +0800, Xuan Zhuo  
> wrote:
> > On Mon, 23 Oct 2023 17:50:46 +0800, Su Hui  wrote:
> > > On 2023/10/23 13:46, Xuan Zhuo wrote:
> > >  Well, what are the cases where it can happen practically?
> > > >>> Device error. Such as vp_active_vq()
> > > >>>
> > > >>> Thanks.
> > > >> Hmm interesting. OK. But do callers know to recover?
> > > > No.
> > > >
> > > > So I think WARN + broken is suitable.
> > > >
> > > > Thanks.
> > >  Sorry for the late, is the following code okay?
> > > 
> > >  @@ -2739,7 +2739,7 @@ int virtqueue_resize(struct virtqueue *_vq, u32 num,
> > >  void (*recycle)(struct virtqueue *vq, void *buf))
> > >  {
> > > struct vring_virtqueue *vq = to_vvq(_vq);
> > >  -   int err;
> > >  +   int err, err_reset;
> > > 
> > > if (num > vq->vq.num_max)
> > > return -E2BIG;
> > >  @@ -2759,7 +2759,15 @@ int virtqueue_resize(struct virtqueue *_vq, u32 num,
> > > else
> > > err = virtqueue_resize_split(_vq, num);
> > > 
> > >  -   return virtqueue_enable_after_reset(_vq);
> > >  +   err_reset = virtqueue_enable_after_reset(_vq);
> > >  +
> > >  +   if (err) {
> > > >>> No err.
> > > >>>
> > > >>> err is not important.
> > > >>> You can remove that.
> > > >> Emm, I'm a little confused that which code should I remove ?
> > > >>
> > > >>
> > > >> like this:
> > > >>if (vq->packed_ring)
> > > >>virtqueue_resize_packed(_vq, num);
> > > >>else
> > > >>virtqueue_resize_split(_vq, num);
> > > >>
> > > >> And we should set broken and warn inside 
> > > >> virtqueue_enable_after_reset()?
> > >
> > > In my opinion, we should return the error code of 
> > > virtqueue_resize_packed() / virtqueue_resize_split().
> > > But if this err is not important, this patch makes no sense.
> > > Maybe I misunderstand somewhere...
> > > If you think it's worth sending a patch, you can send it :).(I'm not 
> > > familiar with this code).
> >
> > OK.
> 
> Hi Michael,
> 
> The queue reset code is written with CONFIG_VIRTIO_HARDEN_NOTIFICATION in mind.
> 
> When we disable the vq, broken is true until we re-enable it.
> 
> So when re-enabling fails, the vq stays in broken status.
> 
> Normally, this only happens on a buggy device.
> So I think that is enough.
> 
> Thanks.

I don't know what to do about CONFIG_VIRTIO_HARDEN_NOTIFICATION.
It's known to be broken and it does not look like there's
active effort to revive it - should we just drop it for now?


> 
>   static int vp_modern_disable_vq_and_reset(struct virtqueue *vq)
>   {
>   [...]
> 
>   vp_modern_set_queue_reset(mdev, vq->index);
> 
>   [...]
> 
>   #ifdef CONFIG_VIRTIO_HARDEN_NOTIFICATION
> ->>   __virtqueue_break(vq);
>   #endif
> 
>   [...]
>   }
> 
>   static int vp_modern_enable_vq_after_reset(struct virtqueue *vq)
>   {
>   [...]
> 
>   if (vp_modern_get_queue_reset(mdev, index))
>   return -EBUSY;
> 
>   if (vp_modern_get_queue_enable(mdev, index))
>   return -EBUSY;
> 
>   err = vp_active_vq(vq, info->msix_vector);
>   if (err)
>   return err;
> 
>   }
> 
>   [...]
> 
>   #ifdef CONFIG_VIRTIO_HARDEN_NOTIFICATION
> ->>   __virtqueue_unbreak(vq);
>   #endif
> 
>   [...]
>   }
> 
> 
> 
> 
> >
> > Thanks.
> >
> >
> > >
> > > Thanks,
> > > Su Hui
> > >



Re: [PATCH] virtio_ring: add an error code check in virtqueue_resize

2023-10-23 Thread Michael S. Tsirkin
On Mon, Oct 23, 2023 at 05:52:02PM +0800, Xuan Zhuo wrote:
> On Mon, 23 Oct 2023 17:50:46 +0800, Su Hui  wrote:
> > On 2023/10/23 13:46, Xuan Zhuo wrote:
> >  Well, what are the cases where it can happen practically?
> > >>> Device error. Such as vp_active_vq()
> > >>>
> > >>> Thanks.
> > >> Hmm interesting. OK. But do callers know to recover?
> > > No.
> > >
> > > So I think WARN + broken is suitable.
> > >
> > > Thanks.
> >  Sorry for the late, is the following code okay?
> > 
> >  @@ -2739,7 +2739,7 @@ int virtqueue_resize(struct virtqueue *_vq, u32 num,
> >  void (*recycle)(struct virtqueue *vq, void *buf))
> >  {
> > struct vring_virtqueue *vq = to_vvq(_vq);
> >  -   int err;
> >  +   int err, err_reset;
> > 
> > if (num > vq->vq.num_max)
> > return -E2BIG;
> >  @@ -2759,7 +2759,15 @@ int virtqueue_resize(struct virtqueue *_vq, u32 num,
> > else
> > err = virtqueue_resize_split(_vq, num);
> > 
> >  -   return virtqueue_enable_after_reset(_vq);
> >  +   err_reset = virtqueue_enable_after_reset(_vq);
> >  +
> >  +   if (err) {
> > >>> No err.
> > >>>
> > >>> err is not important.
> > >>> You can remove that.
> > >> Emm, I'm a little confused that which code should I remove ?
> > >>
> > >>
> > >> like this:
> > >>  if (vq->packed_ring)
> > >>  virtqueue_resize_packed(_vq, num);
> > >>  else
> > >>  virtqueue_resize_split(_vq, num);
> > >>
> > >> And we should set broken and warn inside virtqueue_enable_after_reset()?
> >
> > In my opinion, we should return the error code of virtqueue_resize_packed() 
> > / virtqueue_resize_split().
> > But if this err is not important, this patch makes no sense.
> > Maybe I misunderstand somewhere...
> > If you think it's worth sending a patch, you can send it :).(I'm not 
> > familiar with this code).
> 
> OK.
> 
> Thanks.

I would first try to recover by re-enabling.
If that fails we can set broken.


> 
> >
> > Thanks,
> > Su Hui
> >



Re: [PATCH] virtio_ring: add an error code check in virtqueue_resize

2023-10-23 Thread Xuan Zhuo
On Mon, 23 Oct 2023 17:52:02 +0800, Xuan Zhuo  
wrote:
> On Mon, 23 Oct 2023 17:50:46 +0800, Su Hui  wrote:
> > On 2023/10/23 13:46, Xuan Zhuo wrote:
> >  Well, what are the cases where it can happen practically?
> > >>> Device error. Such as vp_active_vq()
> > >>>
> > >>> Thanks.
> > >> Hmm interesting. OK. But do callers know to recover?
> > > No.
> > >
> > > So I think WARN + broken is suitable.
> > >
> > > Thanks.
> >  Sorry for the late, is the following code okay?
> > 
> >  @@ -2739,7 +2739,7 @@ int virtqueue_resize(struct virtqueue *_vq, u32 num,
> >  void (*recycle)(struct virtqueue *vq, void *buf))
> >  {
> > struct vring_virtqueue *vq = to_vvq(_vq);
> >  -   int err;
> >  +   int err, err_reset;
> > 
> > if (num > vq->vq.num_max)
> > return -E2BIG;
> >  @@ -2759,7 +2759,15 @@ int virtqueue_resize(struct virtqueue *_vq, u32 num,
> > else
> > err = virtqueue_resize_split(_vq, num);
> > 
> >  -   return virtqueue_enable_after_reset(_vq);
> >  +   err_reset = virtqueue_enable_after_reset(_vq);
> >  +
> >  +   if (err) {
> > >>> No err.
> > >>>
> > >>> err is not important.
> > >>> You can remove that.
> > >> Emm, I'm a little confused that which code should I remove ?
> > >>
> > >>
> > >> like this:
> > >>  if (vq->packed_ring)
> > >>  virtqueue_resize_packed(_vq, num);
> > >>  else
> > >>  virtqueue_resize_split(_vq, num);
> > >>
> > >> And we should set broken and warn inside virtqueue_enable_after_reset()?
> >
> > In my opinion, we should return the error code of virtqueue_resize_packed() 
> > / virtqueue_resize_split().
> > But if this err is not important, this patch makes no sense.
> > Maybe I misunderstand somewhere...
> > If you think it's worth sending a patch, you can send it :).(I'm not 
> > familiar with this code).
>
> OK.

Hi Michael,

The queue reset code is written with CONFIG_VIRTIO_HARDEN_NOTIFICATION in mind.

When we disable the vq, broken is true until we re-enable it.

So when re-enabling fails, the vq stays in broken status.

Normally, this only happens on a buggy device.
So I think that is enough.

Thanks.


static int vp_modern_disable_vq_and_reset(struct virtqueue *vq)
{
[...]

vp_modern_set_queue_reset(mdev, vq->index);

[...]

#ifdef CONFIG_VIRTIO_HARDEN_NOTIFICATION
->> __virtqueue_break(vq);
#endif

[...]
}

static int vp_modern_enable_vq_after_reset(struct virtqueue *vq)
{
[...]

if (vp_modern_get_queue_reset(mdev, index))
return -EBUSY;

if (vp_modern_get_queue_enable(mdev, index))
return -EBUSY;

err = vp_active_vq(vq, info->msix_vector);
if (err)
return err;

}

[...]

#ifdef CONFIG_VIRTIO_HARDEN_NOTIFICATION
->> __virtqueue_unbreak(vq);
#endif

[...]
}




>
> Thanks.
>
>
> >
> > Thanks,
> > Su Hui
> >


Re: [PATCH] virtio_ring: add an error code check in virtqueue_resize

2023-10-23 Thread Xuan Zhuo
On Mon, 23 Oct 2023 17:50:46 +0800, Su Hui  wrote:
> On 2023/10/23 13:46, Xuan Zhuo wrote:
>  Well, what are the cases where it can happen practically?
> >>> Device error. Such as vp_active_vq()
> >>>
> >>> Thanks.
> >> Hmm interesting. OK. But do callers know to recover?
> > No.
> >
> > So I think WARN + broken is suitable.
> >
> > Thanks.
>  Sorry for the late, is the following code okay?
> 
>  @@ -2739,7 +2739,7 @@ int virtqueue_resize(struct virtqueue *_vq, u32 num,
>  void (*recycle)(struct virtqueue *vq, void *buf))
>  {
> struct vring_virtqueue *vq = to_vvq(_vq);
>  -   int err;
>  +   int err, err_reset;
> 
> if (num > vq->vq.num_max)
> return -E2BIG;
>  @@ -2759,7 +2759,15 @@ int virtqueue_resize(struct virtqueue *_vq, u32 num,
> else
> err = virtqueue_resize_split(_vq, num);
> 
>  -   return virtqueue_enable_after_reset(_vq);
>  +   err_reset = virtqueue_enable_after_reset(_vq);
>  +
>  +   if (err) {
> >>> No err.
> >>>
> >>> err is not important.
> >>> You can remove that.
> >> Emm, I'm a little confused that which code should I remove ?
> >>
> >>
> >> like this:
> >>if (vq->packed_ring)
> >>virtqueue_resize_packed(_vq, num);
> >>else
> >>virtqueue_resize_split(_vq, num);
> >>
> >> And we should set broken and warn inside virtqueue_enable_after_reset()?
>
> In my opinion, we should return the error code of virtqueue_resize_packed() / 
> virtqueue_resize_split().
> But if this err is not important, this patch makes no sense.
> Maybe I misunderstand somewhere...
> If you think it's worth sending a patch, you can send it :).(I'm not familiar 
> with this code).

OK.

Thanks.


>
> Thanks,
> Su Hui
>


Re: [PATCH v6 9/9] drm: Introduce documentation for hotspot properties

2023-10-23 Thread Pekka Paalanen
On Mon, 23 Oct 2023 09:46:13 +0200
Albert Esteve  wrote:

> From: Michael Banack 
> 
> To clarify the intent and reasoning behind the hotspot properties,
> introduce userspace documentation that goes over cursor handling
> in para-virtualized environments.
> 
> The documentation is generic enough to not special case for any
> specific hypervisor and should apply equally to all.
> 
> Signed-off-by: Zack Rusin 


Hi,

the below doc text:

Acked-by: Pekka Paalanen 


Thanks,
pq


> ---
>  Documentation/gpu/drm-kms.rst |  6 
>  drivers/gpu/drm/drm_plane.c   | 58 ++-
>  2 files changed, 63 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/gpu/drm-kms.rst b/Documentation/gpu/drm-kms.rst
> index a0c83fc481264..158cdcc9351f9 100644
> --- a/Documentation/gpu/drm-kms.rst
> +++ b/Documentation/gpu/drm-kms.rst
> @@ -577,6 +577,12 @@ Variable Refresh Properties
>  .. kernel-doc:: drivers/gpu/drm/drm_connector.c
> :doc: Variable refresh properties
>  
> +Cursor Hotspot Properties
> +-------------------------
> +
> +.. kernel-doc:: drivers/gpu/drm/drm_plane.c
> +   :doc: hotspot properties
> +
>  Existing KMS Properties
>  -----------------------
>  
> diff --git a/drivers/gpu/drm/drm_plane.c b/drivers/gpu/drm/drm_plane.c
> index 1dc00ad4c33c3..f3f2eae83cca8 100644
> --- a/drivers/gpu/drm/drm_plane.c
> +++ b/drivers/gpu/drm/drm_plane.c
> @@ -230,6 +230,61 @@ static int create_in_format_blob(struct drm_device *dev, struct drm_plane *plane
>   return 0;
>  }
>  
> +/**
> + * DOC: hotspot properties
> + *
> + * HOTSPOT_X: property to set mouse hotspot x offset.
> + * HOTSPOT_Y: property to set mouse hotspot y offset.
> + *
> + * When the plane is being used as a cursor image to display a mouse pointer,
> + * the "hotspot" is the offset within the cursor image where mouse events
> + * are expected to go.
> + *
> + * Positive values move the hotspot from the top-left corner of the cursor
> + * plane towards the right and bottom.
> + *
> + * Most display drivers do not need this information because the
> + * hotspot is not actually connected to anything visible on screen.
> + * However, this is necessary for display drivers like the para-virtualized
> + * drivers (eg qxl, vbox, virtio, vmwgfx), that are attached to a user console
> + * with a mouse pointer.  Since these consoles are often being remoted over a
> + * network, they would otherwise have to wait to display the pointer movement to
> + * the user until a full network round-trip has occurred.  New mouse events have
> + * to be sent from the user's console, over the network to the virtual input
> + * devices, forwarded to the desktop for processing, and then the cursor plane's
> + * position can be updated and sent back to the user's console over the network.
> + * Instead, with the hotspot information, the console can anticipate the new
> + * location, and draw the mouse cursor there before the confirmation comes in.
> + * To do that correctly, the user's console must be able to predict how the
> + * desktop will process mouse events, which normally requires the desktop's
> + * mouse topology information, ie where each CRTC sits in the mouse coordinate
> + * space.  This is typically sent to the para-virtualized drivers using some
> + * driver-specific method, and the driver then forwards it to the console by
> + * way of the virtual display device or hypervisor.
> + *
> + * The assumption is generally made that there is only one cursor plane being
> + * used this way at a time, and that the desktop is feeding all mouse devices
> + * into the same global pointer.  Para-virtualized drivers that require this
> + * should only be exposing a single cursor plane, or find some other way
> + * to coordinate with a userspace desktop that supports multiple pointers.
> + * If the hotspot properties are set, the cursor plane is therefore assumed to be
> + * used only for displaying a mouse cursor image, and the position of the combined
> + * cursor plane + offset can therefore be used for coordinating with input from a
> + * mouse device.
> + *
> + * The cursor will then be drawn either at the location of the plane in the CRTC
> + * console, or as a free-floating cursor plane on the user's console
> + * corresponding to their desktop mouse position.
> + *
> + * DRM clients which would like to work correctly on drivers which expose
> + * hotspot properties should advertise DRM_CLIENT_CAP_CURSOR_PLANE_HOTSPOT.
> + * Setting this property on drivers which do not special case
> + * cursor planes will return EOPNOTSUPP, which can be used by userspace to
> + * gauge requirements of the hardware/drivers they're running on. Advertising
> + * DRM_CLIENT_CAP_CURSOR_PLANE_HOTSPOT implies that the userspace client will be
> + * correctly setting the hotspot properties.
> + */
> +
>  /**
>   * drm_plane_create_hotspot_properties - creates the mouse hotspot
>   * properties and attaches them to the given plane.

Re: [PATCH vhost v4 12/16] vdpa/mlx5: Improve mr update flow

2023-10-23 Thread Dragos Tatulea via Virtualization
On Fri, 2023-10-20 at 18:01 +0200, Eugenio Perez Martin wrote:
> On Wed, Oct 18, 2023 at 7:21 PM Dragos Tatulea  wrote:
> > 
> > On Wed, 2023-10-18 at 20:14 +0300, Dragos Tatulea wrote:
> > > The current flow for updating an mr works directly on mvdev->mr which
> > > makes it cumbersome to handle multiple new mr structs.
> > > 
> > > This patch makes the flow more straightforward by having
> > > mlx5_vdpa_create_mr return a new mr which will update the old mr (if
> > > any). The old mr will be deleted and unlinked from mvdev. For the case
> > > when the iotlb is empty (not NULL), the old mr will be cleared.
> > > 
> > > This change paves the way for adding mrs for different ASIDs.
> > > 
> > > The initialized bool is no longer needed as mr is now a pointer in the
> > > mlx5_vdpa_dev struct which will be NULL when not initialized.
> > > 
> > > Acked-by: Eugenio Pérez 
> > > Acked-by: Jason Wang 
> > > Signed-off-by: Dragos Tatulea 
> > > ---
> > >  drivers/vdpa/mlx5/core/mlx5_vdpa.h | 14 +++--
> > >  drivers/vdpa/mlx5/core/mr.c    | 87 --
> > >  drivers/vdpa/mlx5/net/mlx5_vnet.c  | 53 +-
> > >  3 files changed, 82 insertions(+), 72 deletions(-)
> > > 
> > > diff --git a/drivers/vdpa/mlx5/core/mlx5_vdpa.h b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
> > > index 9c6ac42c21e1..bbe4335106bd 100644
> > > --- a/drivers/vdpa/mlx5/core/mlx5_vdpa.h
> > > +++ b/drivers/vdpa/mlx5/core/mlx5_vdpa.h
> > > @@ -31,8 +31,6 @@ struct mlx5_vdpa_mr {
> > >     struct list_head head;
> > >     unsigned long num_directs;
> > >     unsigned long num_klms;
> > > -   /* state of dvq mr */
> > > -   bool initialized;
> > > 
> > >     bool user_mr;
> > >  };
> > > @@ -91,7 +89,7 @@ struct mlx5_vdpa_dev {
> > >     u16 max_idx;
> > >     u32 generation;
> > > 
> > > -   struct mlx5_vdpa_mr mr;
> > > +   struct mlx5_vdpa_mr *mr;
> > >     /* serialize mr access */
> > >     struct mutex mr_mtx;
> > >     struct mlx5_control_vq cvq;
> > > @@ -114,14 +112,14 @@ void mlx5_vdpa_free_resources(struct mlx5_vdpa_dev *mvdev);
> > >  int mlx5_vdpa_create_mkey(struct mlx5_vdpa_dev *mvdev, u32 *mkey, u32 *in,
> > >   int inlen);
> > >  int mlx5_vdpa_destroy_mkey(struct mlx5_vdpa_dev *mvdev, u32 mkey);
> > > -int mlx5_vdpa_handle_set_map(struct mlx5_vdpa_dev *mvdev, struct vhost_iotlb *iotlb,
> > > -    bool *change_map, unsigned int asid);
> > > -int mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev,
> > > -   struct mlx5_vdpa_mr *mr,
> > > -   struct vhost_iotlb *iotlb);
> > > +struct mlx5_vdpa_mr *mlx5_vdpa_create_mr(struct mlx5_vdpa_dev *mvdev,
> > > +    struct vhost_iotlb *iotlb);
> > >  void mlx5_vdpa_destroy_mr_resources(struct mlx5_vdpa_dev *mvdev);
> > >  void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev,
> > >   struct mlx5_vdpa_mr *mr);
> > > +void mlx5_vdpa_update_mr(struct mlx5_vdpa_dev *mvdev,
> > > +    struct mlx5_vdpa_mr *mr,
> > > +    unsigned int asid);
> > >  int mlx5_vdpa_update_cvq_iotlb(struct mlx5_vdpa_dev *mvdev,
> > >     struct vhost_iotlb *iotlb,
> > >     unsigned int asid);
> > > diff --git a/drivers/vdpa/mlx5/core/mr.c b/drivers/vdpa/mlx5/core/mr.c
> > > index abd6a6fb122f..00eff5a07152 100644
> > > --- a/drivers/vdpa/mlx5/core/mr.c
> > > +++ b/drivers/vdpa/mlx5/core/mr.c
> > > @@ -495,30 +495,51 @@ static void destroy_user_mr(struct mlx5_vdpa_dev *mvdev, struct mlx5_vdpa_mr *mr
> > > 
> > >  static void _mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev, struct mlx5_vdpa_mr *mr)
> > >  {
> > > -   if (!mr->initialized)
> > > -   return;
> > > -
> > >     if (mr->user_mr)
> > >     destroy_user_mr(mvdev, mr);
> > >     else
> > >     destroy_dma_mr(mvdev, mr);
> > > -
> > > -   mr->initialized = false;
> > >  }
> > > 
> > >  void mlx5_vdpa_destroy_mr(struct mlx5_vdpa_dev *mvdev,
> > >   struct mlx5_vdpa_mr *mr)
> > >  {
> > > +   if (!mr)
> > > +   return;
> > > +
> > >     mutex_lock(&mvdev->mr_mtx);
> > > 
> > >     _mlx5_vdpa_destroy_mr(mvdev, mr);
> > > 
> > > +   if (mvdev->mr == mr)
> > > +   mvdev->mr = NULL;
> > > +
> > > +   mutex_unlock(&mvdev->mr_mtx);
> > > +
> > > +   kfree(mr);
> > > +}
> > > +
> > > +void mlx5_vdpa_update_mr(struct mlx5_vdpa_dev *mvdev,
> > > +    struct mlx5_vdpa_mr *new_mr,
> > > +    unsigned int asid)
> > > +{
> > > +   struct mlx5_vdpa_mr *old_mr = mvdev->mr;
> > > +
> > > +   mutex_lock(&mvdev->mr_mtx);
> > > +
> > > +   mvdev->mr = new_mr;
> > > +   if (old_mr) {
> > > +   _mlx5_vdpa_destroy_mr(mvdev, old_mr);

Re: [PATCH v4 3/4] vduse: Temporarily disable control queue features

2023-10-23 Thread Maxime Coquelin



On 10/23/23 05:08, Jason Wang wrote:

On Fri, Oct 20, 2023 at 11:58 PM Maxime Coquelin
 wrote:


Virtio-net driver control queue implementation is not safe
when used with VDUSE. If the VDUSE application does not
reply to control queue messages, it currently ends up
hanging the kernel thread sending this command.

Some work is on-going to make the control queue
implementation robust with VDUSE. Until it is completed,
let's disable control virtqueue and features that depend on
it.

Signed-off-by: Maxime Coquelin 


I wonder if it's better to do this with patch 2 or before patch 2 to
unbreak the bisection?


I think it would be better to keep it in a dedicated patch to ease the
revert later when your work will have been accepted, so before patch 2.

Thanks,
Maxime


Thanks


---
  drivers/vdpa/vdpa_user/vduse_dev.c | 37 ++
  1 file changed, 37 insertions(+)

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
index 73ad3b7efd8e..0243dee9cf0e 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -28,6 +28,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 

  #include "iova_domain.h"
@@ -46,6 +47,30 @@

  #define IRQ_UNBOUND -1

+#define VDUSE_NET_VALID_FEATURES_MASK   \
+   (BIT_ULL(VIRTIO_NET_F_CSUM) |   \
+BIT_ULL(VIRTIO_NET_F_GUEST_CSUM) | \
+BIT_ULL(VIRTIO_NET_F_MTU) |\
+BIT_ULL(VIRTIO_NET_F_MAC) |\
+BIT_ULL(VIRTIO_NET_F_GUEST_TSO4) | \
+BIT_ULL(VIRTIO_NET_F_GUEST_TSO6) | \
+BIT_ULL(VIRTIO_NET_F_GUEST_ECN) |  \
+BIT_ULL(VIRTIO_NET_F_GUEST_UFO) |  \
+BIT_ULL(VIRTIO_NET_F_HOST_TSO4) |  \
+BIT_ULL(VIRTIO_NET_F_HOST_TSO6) |  \
+BIT_ULL(VIRTIO_NET_F_HOST_ECN) |   \
+BIT_ULL(VIRTIO_NET_F_HOST_UFO) |   \
+BIT_ULL(VIRTIO_NET_F_MRG_RXBUF) |  \
+BIT_ULL(VIRTIO_NET_F_STATUS) | \
+BIT_ULL(VIRTIO_NET_F_HOST_USO) |   \
+BIT_ULL(VIRTIO_F_ANY_LAYOUT) | \
+BIT_ULL(VIRTIO_RING_F_INDIRECT_DESC) | \
+BIT_ULL(VIRTIO_RING_F_EVENT_IDX) |  \
+BIT_ULL(VIRTIO_F_VERSION_1) |  \
+BIT_ULL(VIRTIO_F_ACCESS_PLATFORM) | \
+BIT_ULL(VIRTIO_F_RING_PACKED) |\
+BIT_ULL(VIRTIO_F_IN_ORDER))
+
  struct vduse_virtqueue {
 u16 index;
 u16 num_max;
@@ -1778,6 +1803,16 @@ static struct attribute *vduse_dev_attrs[] = {

  ATTRIBUTE_GROUPS(vduse_dev);

+static void vduse_dev_features_filter(struct vduse_dev_config *config)
+{
+   /*
+* Temporarily filter out virtio-net's control virtqueue and features
+* that depend on it while CVQ is being made more robust for VDUSE.
+*/
+   if (config->device_id == VIRTIO_ID_NET)
+   config->features &= VDUSE_NET_VALID_FEATURES_MASK;
+}
+
  static int vduse_create_dev(struct vduse_dev_config *config,
 void *config_buf, u64 api_version)
  {
@@ -1793,6 +1828,8 @@ static int vduse_create_dev(struct vduse_dev_config *config,
 if (!dev)
 goto err;

+   vduse_dev_features_filter(config);
+
 dev->api_version = api_version;
 dev->device_features = config->features;
 dev->device_id = config->device_id;
--
2.41.0






Re: [PATCH v4 1/4] vduse: validate block features only with block devices

2023-10-23 Thread Maxime Coquelin




On 10/21/23 00:07, Casey Schaufler wrote:

On 10/20/2023 8:58 AM, Maxime Coquelin wrote:

This patch is preliminary work to enable network device
type support to VDUSE.

As VIRTIO_BLK_F_CONFIG_WCE shares the same value as
VIRTIO_NET_F_HOST_TSO4, we need to restrict its check
to Virtio-blk device type.

Acked-by: Jason Wang 
Reviewed-by: Xie Yongji 
Signed-off-by: Maxime Coquelin 
---
  drivers/vdpa/vdpa_user/vduse_dev.c | 9 +
  1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
index df7869537ef1..5b3879976b3d 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -1662,13 +1662,14 @@ static bool device_is_allowed(u32 device_id)
return false;
  }
  
-static bool features_is_valid(u64 features)

+static bool features_is_valid(struct vduse_dev_config *config)


This should either be features_are_valid() or feature_is_valid().
Correct pluralization is important in the English language.


Indeed, I will change to features_are_valid() in next revision.

Thanks,
Maxime


  {
-   if (!(features & (1ULL << VIRTIO_F_ACCESS_PLATFORM)))
+   if (!(config->features & (1ULL << VIRTIO_F_ACCESS_PLATFORM)))
return false;
  
  	/* Now we only support read-only configuration space */

-   if (features & (1ULL << VIRTIO_BLK_F_CONFIG_WCE))
+   if ((config->device_id == VIRTIO_ID_BLOCK) &&
+   (config->features & (1ULL << VIRTIO_BLK_F_CONFIG_WCE)))
return false;
  
  	return true;

@@ -1695,7 +1696,7 @@ static bool vduse_validate_config(struct vduse_dev_config *config)
if (!device_is_allowed(config->device_id))
return false;
  
-	if (!features_is_valid(config->features))

+   if (!features_is_valid(config))
return false;
  
  	return true;






Re: [PATCH v4 4/4] vduse: Add LSM hooks to check Virtio device type

2023-10-23 Thread Maxime Coquelin




On 10/21/23 00:20, Casey Schaufler wrote:

On 10/20/2023 8:58 AM, Maxime Coquelin wrote:

This patch introduces LSM hooks for device creation,
destruction and opening operations, checking that the
application is allowed to perform these operations for
the Virtio device type.


Why do you think that there needs to be a special LSM check for virtio
devices? Why can't existing device attributes be used?


Michael asked for a way for SELinux to allow/prevent the creation of
some types of devices [0].

A device is created using ioctl() on VDUSE control chardev. Its type is
specified via a field in the structure passed in argument.

I didn't see any other way than adding dedicated LSM hooks to achieve
this, but it is possible that there is a better way to do it?

Thanks,
Maxime

[0]: https://lore.kernel.org/all/20230829130430-mutt-send-email-...@kernel.org/

