[PATCH v2 1/3] vdpa: factor out vdpa_set_features_unlocked for vdpa internal use

2022-01-14 Thread Si-Wei Liu
No functional change introduced. vdpa bus driver such as virtio_vdpa
or vhost_vdpa is not supposed to take care of the locking for core
by its own. The locked API vdpa_set_features should suffice the
bus driver's need.

Signed-off-by: Si-Wei Liu 
Reviewed-by: Eli Cohen 
---
 drivers/vdpa/vdpa.c  |  2 +-
 drivers/vhost/vdpa.c |  2 +-
 drivers/virtio/virtio_vdpa.c |  2 +-
 include/linux/vdpa.h | 18 --
 4 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index 9846c9d..1ea5254 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -393,7 +393,7 @@ static void vdpa_get_config_unlocked(struct vdpa_device 
*vdev,
 * If it does happen we assume a legacy guest.
 */
if (!vdev->features_valid)
-   vdpa_set_features(vdev, 0, true);
+   vdpa_set_features_unlocked(vdev, 0);
ops->get_config(vdev, offset, buf, len);
 }
 
diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 8515398..ec5249e 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -286,7 +286,7 @@ static long vhost_vdpa_set_features(struct vhost_vdpa *v, 
u64 __user *featurep)
if (copy_from_user(&features, featurep, sizeof(features)))
return -EFAULT;
 
-   if (vdpa_set_features(vdpa, features, false))
+   if (vdpa_set_features(vdpa, features))
return -EINVAL;
 
return 0;
diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
index 7767a7f..7650455 100644
--- a/drivers/virtio/virtio_vdpa.c
+++ b/drivers/virtio/virtio_vdpa.c
@@ -317,7 +317,7 @@ static int virtio_vdpa_finalize_features(struct 
virtio_device *vdev)
/* Give virtio_ring a chance to accept features. */
vring_transport_features(vdev);
 
-   return vdpa_set_features(vdpa, vdev->features, false);
+   return vdpa_set_features(vdpa, vdev->features);
 }
 
 static const char *virtio_vdpa_bus_name(struct virtio_device *vdev)
diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index 2de442e..721089b 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -401,18 +401,24 @@ static inline int vdpa_reset(struct vdpa_device *vdev)
return ret;
 }
 
-static inline int vdpa_set_features(struct vdpa_device *vdev, u64 features, 
bool locked)
+static inline int vdpa_set_features_unlocked(struct vdpa_device *vdev, u64 
features)
 {
const struct vdpa_config_ops *ops = vdev->config;
int ret;
 
-   if (!locked)
-   mutex_lock(&vdev->cf_mutex);
-
vdev->features_valid = true;
ret = ops->set_driver_features(vdev, features);
-   if (!locked)
-   mutex_unlock(&vdev->cf_mutex);
+
+   return ret;
+}
+
+static inline int vdpa_set_features(struct vdpa_device *vdev, u64 features)
+{
+   int ret;
+
+   mutex_lock(&vdev->cf_mutex);
+   ret = vdpa_set_features_unlocked(vdev, features);
+   mutex_unlock(&vdev->cf_mutex);
 
return ret;
 }
-- 
1.8.3.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH v2 3/3] vdpa/mlx5: add validation for VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command

2022-01-14 Thread Si-Wei Liu
When control vq receives a VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command
request from the driver, presently there is no validation against the
number of queue pairs to configure, or even if multiqueue had been
negotiated or not is unverified. This may lead to kernel panic due to
uninitialized resource for the queues were there any bogus request
sent down by untrusted driver. Tie up the loose ends there.

Fixes: 52893733f2c5 ("vdpa/mlx5: Add multiqueue support")
Signed-off-by: Si-Wei Liu 
---
 drivers/vdpa/mlx5/net/mlx5_vnet.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c 
b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index 6fa968f..86f84dc 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -1563,11 +1563,27 @@ static virtio_net_ctrl_ack handle_ctrl_mq(struct 
mlx5_vdpa_dev *mvdev, u8 cmd)
 
switch (cmd) {
case VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET:
+   /* This mq feature check aligns with pre-existing userspace
+* implementation.
+*
+* Without it, an untrusted driver could fake a multiqueue 
config
+* request down to a non-mq device that may cause kernel to
+* panic due to uninitialized resources for extra vqs. Even with
+* a well behaving guest driver, it is not expected to allow
+* changing the number of vqs on a non-mq device.
+*/
+   if (!MLX5_FEATURE(mvdev, VIRTIO_NET_F_MQ))
+   break;
+
read = vringh_iov_pull_iotlb(&cvq->vring, &cvq->riov, (void 
*)&mq, sizeof(mq));
if (read != sizeof(mq))
break;
 
newqps = mlx5vdpa16_to_cpu(mvdev, mq.virtqueue_pairs);
+   if (newqps < VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MIN ||
+   newqps > mlx5_vdpa_max_qps(mvdev->max_vqs))
+   break;
+
if (ndev->cur_num_vqs == 2 * newqps) {
status = VIRTIO_NET_OK;
break;
-- 
1.8.3.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH v2 2/3] vdpa/mlx5: should verify CTRL_VQ feature exists for MQ

2022-01-14 Thread Si-Wei Liu
Per VIRTIO v1.1 specification, section 5.1.3.1 Feature bit requirements:
"VIRTIO_NET_F_MQ Requires VIRTIO_NET_F_CTRL_VQ".

There's assumption in the mlx5_vdpa multiqueue code that MQ must come
together with CTRL_VQ. However, there's nowhere in the upper layer to
guarantee this assumption would hold. Were there an untrusted driver
sending down MQ without CTRL_VQ, it would compromise various spots for
e.g. is_index_valid() and is_ctrl_vq_idx(). Although this doesn't end
up with immediate panic or security loophole as of today's code, the
chance for this to be taken advantage of due to future code change is
not zero.

Harden the crispy assumption by failing the set_driver_features() call
when seeing (MQ && !CTRL_VQ). For that end, verify_min_features() is
renamed to verify_driver_features() to reflect the fact that it now does
more than just validate the minimum features. verify_driver_features()
is now used to accommodate various checks against the driver features
for set_driver_features().

Signed-off-by: Si-Wei Liu 
---
 drivers/vdpa/mlx5/net/mlx5_vnet.c | 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c 
b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index b53603d..6fa968f 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -1897,11 +1897,25 @@ static u64 mlx5_vdpa_get_device_features(struct 
vdpa_device *vdev)
return ndev->mvdev.mlx_features;
 }
 
-static int verify_min_features(struct mlx5_vdpa_dev *mvdev, u64 features)
+static int verify_driver_features(struct mlx5_vdpa_dev *mvdev, u64 features)
 {
+   /* Minimum features to expect */
if (!(features & BIT_ULL(VIRTIO_F_ACCESS_PLATFORM)))
return -EOPNOTSUPP;
 
+   /* Double check features combination sent down by the driver.
+* Fail invalid features due to absence of the depended feature.
+*
+* Per VIRTIO v1.1 specification, section 5.1.3.1 Feature bit
+* requirements: "VIRTIO_NET_F_MQ Requires VIRTIO_NET_F_CTRL_VQ".
+* By failing the invalid features sent down by untrusted drivers,
+* we're assured the assumption made upon is_index_valid() and
+* is_ctrl_vq_idx() will not be compromised.
+*/
+   if ((features & (BIT_ULL(VIRTIO_NET_F_MQ) | 
BIT_ULL(VIRTIO_NET_F_CTRL_VQ))) ==
+BIT_ULL(VIRTIO_NET_F_MQ))
+   return -EINVAL;
+
return 0;
 }
 
@@ -1977,7 +1991,7 @@ static int mlx5_vdpa_set_driver_features(struct 
vdpa_device *vdev, u64 features)
 
print_features(mvdev, features, true);
 
-   err = verify_min_features(mvdev, features);
+   err = verify_driver_features(mvdev, features);
if (err)
return err;
 
-- 
1.8.3.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH v2 0/3] fixes for mlx5_vdpa multiqueue support

2022-01-14 Thread Si-Wei Liu
This patchset contains the fixes and code cleanups for a couple issues
uncovered during the review for the "Allow for configuring max number
of virtqueue pairs" series.

It is based on Eli's fixes:
2e4cda633a22 ("vdpa/mlx5: Fix tracking of current number of VQs")
in the vhost tree.

--
v1 -> v2:
  - removed unneeded Fixes tag to avoid confusion
  - failing the set_features op instead of clearing invalid feature
  - add motivation to the commit log
  - fix or add code comments in place per review feedback
  - re-align the subject in some patches to better reflect the actual
code change
  - fixed corrupted email address format in SoB

Si-Wei Liu (3):
  vdpa: factor out vdpa_set_features_unlocked for vdpa internal use
  vdpa/mlx5: should verify CTRL_VQ feature exists for MQ
  vdpa/mlx5: add validation for VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command

 drivers/vdpa/mlx5/net/mlx5_vnet.c | 34 --
 drivers/vdpa/vdpa.c   |  2 +-
 drivers/vhost/vdpa.c  |  2 +-
 drivers/virtio/virtio_vdpa.c  |  2 +-
 include/linux/vdpa.h  | 18 --
 5 files changed, 47 insertions(+), 11 deletions(-)

-- 
1.8.3.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [GIT PULL] virtio,vdpa,qemu_fw_cfg: features, cleanups, fixes

2022-01-14 Thread Michael S. Tsirkin
On Fri, Jan 14, 2022 at 08:56:14PM +, Jean-Philippe Brucker wrote:
> Hi,
> 
> On Fri, Jan 14, 2022 at 03:35:15PM -0500, Michael S. Tsirkin wrote:
> > Jean-Philippe Brucker (5):
> >   iommu/virtio: Add definitions for VIRTIO_IOMMU_F_BYPASS_CONFIG
> >   iommu/virtio: Support bypass domains
> >   iommu/virtio: Sort reserved regions
> >   iommu/virtio: Pass end address to viommu_add_mapping()
> >   iommu/virtio: Support identity-mapped domains
> 
> Please could you drop these patches, they are from an old version of the
> series. The newer version was already in Joerg's pull request and was
> merged, so this will conflict.
> 
> Thanks,
> Jean

I just sent v2 without your changes, thanks.

-- 
MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[GIT PULL v2] virtio,vdpa,qemu_fw_cfg: features, cleanups, fixes

2022-01-14 Thread Michael S. Tsirkin
Changes from v1:

Dropped iommu changes at author's request. Pity this was only
requested after I sent pull v1 :(

Note that since these were the first in queue other hashes
do not match what was in linux-next, however as the changes
are in a separate driver, this should not matter.

The following changes since commit c9e6606c7fe92b50a02ce51dda82586ebdf99b48:

  Linux 5.16-rc8 (2022-01-02 14:23:25 -0800)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus

for you to fetch changes up to b03fc43e73877e180c1803a33aea3e7396642367:

  vdpa/mlx5: Fix tracking of current number of VQs (2022-01-14 18:50:54 -0500)


virtio,vdpa,qemu_fw_cfg: features, cleanups, fixes

partial support for < MAX_ORDER - 1 granularity for virtio-mem
driver_override for vdpa
sysfs ABI documentation for vdpa
multiqueue config support for mlx5 vdpa

Misc fixes, cleanups.

Signed-off-by: Michael S. Tsirkin 


Christophe JAILLET (1):
  eni_vdpa: Simplify 'eni_vdpa_probe()'

Dapeng Mi (1):
  virtio: fix a typo in function "vp_modern_remove" comments.

David Hildenbrand (2):
  virtio-mem: prepare page onlining code for granularity smaller than 
MAX_ORDER - 1
  virtio-mem: prepare fake page onlining code for granularity smaller than 
MAX_ORDER - 1

Eli Cohen (20):
  net/mlx5_vdpa: Offer VIRTIO_NET_F_MTU when setting MTU
  vdpa/mlx5: Fix wrong configuration of virtio_version_1_0
  vdpa: Provide interface to read driver features
  vdpa/mlx5: Distribute RX virtqueues in RQT object
  vdpa: Sync calls set/get config/status with cf_mutex
  vdpa: Read device configuration only if FEATURES_OK
  vdpa: Allow to configure max data virtqueues
  vdpa/mlx5: Fix config_attr_mask assignment
  vdpa/mlx5: Support configuring max data virtqueue
  vdpa: Add support for returning device configuration information
  vdpa/mlx5: Restore cur_num_vqs in case of failure in change_num_qps()
  vdpa: Support reporting max device capabilities
  vdpa/mlx5: Report max device capabilities
  vdpa/vdpa_sim: Configure max supported virtqueues
  vdpa: Use BIT_ULL for bit operations
  vdpa/vdpa_sim_net: Report max device capabilities
  vdpa: Avoid taking cf_mutex lock on get status
  vdpa: Protect vdpa reset with cf_mutex
  vdpa/mlx5: Fix is_index_valid() to refer to features
  vdpa/mlx5: Fix tracking of current number of VQs

Eugenio Pérez (2):
  vdpa: Avoid duplicate call to vp_vdpa get_status
  vdpa: Mark vdpa_config_ops.get_vq_notification as optional

Guanjun (1):
  vduse: moving kvfree into caller

Johan Hovold (4):
  firmware: qemu_fw_cfg: fix NULL-pointer deref on duplicate entries
  firmware: qemu_fw_cfg: fix kobject leak in probe error path
  firmware: qemu_fw_cfg: fix sysfs information leak
  firmware: qemu_fw_cfg: remove sysfs entries explicitly

Laura Abbott (1):
  vdpa: clean up get_config_size ret value handling

Michael S. Tsirkin (3):
  virtio: wrap config->reset calls
  hwrng: virtio - unregister device before reset
  virtio_ring: mark ring unused on error

Peng Hao (2):
  virtio/virtio_mem: handle a possible NULL as a memcpy parameter
  virtio/virtio_pci_legacy_dev: ensure the correct return value

Stefano Garzarella (2):
  docs: document sysfs ABI for vDPA bus
  vdpa: add driver_override support

Xianting Tian (1):
  vhost/test: fix memory leak of vhost virtqueues

Zhu Lingshan (1):
  ifcvf/vDPA: fix misuse virtio-net device config size for blk dev

王贇 (1):
  virtio-pci: fix the confusing error message

 Documentation/ABI/testing/sysfs-bus-vdpa   |  57 ++
 MAINTAINERS|   1 +
 arch/um/drivers/virt-pci.c |   2 +-
 drivers/block/virtio_blk.c |   4 +-
 drivers/bluetooth/virtio_bt.c  |   2 +-
 drivers/char/hw_random/virtio-rng.c|   2 +-
 drivers/char/virtio_console.c  |   4 +-
 drivers/crypto/virtio/virtio_crypto_core.c |   8 +-
 drivers/firmware/arm_scmi/virtio.c |   2 +-
 drivers/firmware/qemu_fw_cfg.c |  21 ++--
 drivers/gpio/gpio-virtio.c |   2 +-
 drivers/gpu/drm/virtio/virtgpu_kms.c   |   2 +-
 drivers/i2c/busses/i2c-virtio.c|   2 +-
 drivers/iommu/virtio-iommu.c   |   2 +-
 drivers/net/caif/caif_virtio.c |   2 +-
 drivers/net/virtio_net.c   |   4 +-
 drivers/net/wireless/mac80211_hwsim.c  |   2 +-
 drivers/nvdimm/virtio_pmem.c   |   2 +-
 drivers/rpmsg/virtio_rpmsg_bus.c   |   2 +-
 drivers/scsi/virtio_scsi.c |   2 +-
 drivers/vdpa/alibaba/eni_vdpa.c|  28 +++--
 drivers/vdpa/ifcvf/ifcvf_base.c|  41 ++--
 drivers/vdpa/ifcvf/ifcvf_base.h

Re: [GIT PULL] virtio,vdpa,qemu_fw_cfg: features, cleanups, fixes

2022-01-14 Thread Michael S. Tsirkin
On Fri, Jan 14, 2022 at 08:56:14PM +, Jean-Philippe Brucker wrote:
> Hi,
> 
> On Fri, Jan 14, 2022 at 03:35:15PM -0500, Michael S. Tsirkin wrote:
> > Jean-Philippe Brucker (5):
> >   iommu/virtio: Add definitions for VIRTIO_IOMMU_F_BYPASS_CONFIG
> >   iommu/virtio: Support bypass domains
> >   iommu/virtio: Sort reserved regions
> >   iommu/virtio: Pass end address to viommu_add_mapping()
> >   iommu/virtio: Support identity-mapped domains
> 
> Please could you drop these patches, they are from an old version of the
> series. The newer version was already in Joerg's pull request and was
> merged, so this will conflict.
> 
> Thanks,
> Jean

It's weird that this wasn't detected, these have been in linux-next
for a long time now. I'l drop, though it's unfortunate as
hashes will not match with what was tested in linux-next.

-- 
MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [GIT PULL] virtio,vdpa,qemu_fw_cfg: features, cleanups, fixes

2022-01-14 Thread Michael S. Tsirkin
On Fri, Jan 14, 2022 at 03:35:15PM -0500, Michael S. Tsirkin wrote:
> The following changes since commit c9e6606c7fe92b50a02ce51dda82586ebdf99b48:
> 
>   Linux 5.16-rc8 (2022-01-02 14:23:25 -0800)
> 
> are available in the Git repository at:
> 
>   https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus
> 
> for you to fetch changes up to f04ac267029c8063fc35116b385cd37656b3c81a:
> 
>   virtio: acknowledge all features before access (2022-01-14 14:58:41 -0500)
> 
> 
> virtio,vdpa,qemu_fw_cfg: features, cleanups, fixes
> 
> IOMMU bypass support in virtio-iommu
> partial support for < MAX_ORDER - 1 granularity for virtio-mem
> driver_override for vdpa
> sysfs ABI documentation for vdpa
> multiqueue config support for mlx5 vdpa
> 
> Misc fixes, cleanups.
> 
> Signed-off-by: Michael S. Tsirkin 

self-NACK at author's request.

Will send v2. Sorry.

> 
> Christophe JAILLET (1):
>   eni_vdpa: Simplify 'eni_vdpa_probe()'
> 
> Dapeng Mi (1):
>   virtio: fix a typo in function "vp_modern_remove" comments.
> 
> David Hildenbrand (2):
>   virtio-mem: prepare page onlining code for granularity smaller than 
> MAX_ORDER - 1
>   virtio-mem: prepare fake page onlining code for granularity smaller 
> than MAX_ORDER - 1
> 
> Eli Cohen (20):
>   net/mlx5_vdpa: Offer VIRTIO_NET_F_MTU when setting MTU
>   vdpa/mlx5: Fix wrong configuration of virtio_version_1_0
>   vdpa: Provide interface to read driver features
>   vdpa/mlx5: Distribute RX virtqueues in RQT object
>   vdpa: Sync calls set/get config/status with cf_mutex
>   vdpa: Read device configuration only if FEATURES_OK
>   vdpa: Allow to configure max data virtqueues
>   vdpa/mlx5: Fix config_attr_mask assignment
>   vdpa/mlx5: Support configuring max data virtqueue
>   vdpa: Add support for returning device configuration information
>   vdpa/mlx5: Restore cur_num_vqs in case of failure in change_num_qps()
>   vdpa: Support reporting max device capabilities
>   vdpa/mlx5: Report max device capabilities
>   vdpa/vdpa_sim: Configure max supported virtqueues
>   vdpa: Use BIT_ULL for bit operations
>   vdpa/vdpa_sim_net: Report max device capabilities
>   vdpa: Avoid taking cf_mutex lock on get status
>   vdpa: Protect vdpa reset with cf_mutex
>   vdpa/mlx5: Fix is_index_valid() to refer to features
>   vdpa/mlx5: Fix tracking of current number of VQs
> 
> Eugenio Pérez (2):
>   vdpa: Avoid duplicate call to vp_vdpa get_status
>   vdpa: Mark vdpa_config_ops.get_vq_notification as optional
> 
> Guanjun (1):
>   vduse: moving kvfree into caller
> 
> Jean-Philippe Brucker (5):
>   iommu/virtio: Add definitions for VIRTIO_IOMMU_F_BYPASS_CONFIG
>   iommu/virtio: Support bypass domains
>   iommu/virtio: Sort reserved regions
>   iommu/virtio: Pass end address to viommu_add_mapping()
>   iommu/virtio: Support identity-mapped domains
> 
> Johan Hovold (4):
>   firmware: qemu_fw_cfg: fix NULL-pointer deref on duplicate entries
>   firmware: qemu_fw_cfg: fix kobject leak in probe error path
>   firmware: qemu_fw_cfg: fix sysfs information leak
>   firmware: qemu_fw_cfg: remove sysfs entries explicitly
> 
> Laura Abbott (1):
>   vdpa: clean up get_config_size ret value handling
> 
> Michael S. Tsirkin (5):
>   virtio: wrap config->reset calls
>   hwrng: virtio - unregister device before reset
>   virtio_ring: mark ring unused on error
>   virtio: unexport virtio_finalize_features
>   virtio: acknowledge all features before access
> 
> Peng Hao (2):
>   virtio/virtio_mem: handle a possible NULL as a memcpy parameter
>   virtio/virtio_pci_legacy_dev: ensure the correct return value
> 
> Stefano Garzarella (2):
>   docs: document sysfs ABI for vDPA bus
>   vdpa: add driver_override support
> 
> Xianting Tian (1):
>   vhost/test: fix memory leak of vhost virtqueues
> 
> Zhu Lingshan (1):
>   ifcvf/vDPA: fix misuse virtio-net device config size for blk dev
> 
> 王贇 (1):
>   virtio-pci: fix the confusing error message
> 
>  Documentation/ABI/testing/sysfs-bus-vdpa   |  57 ++
>  MAINTAINERS|   1 +
>  arch/um/drivers/virt-pci.c |   2 +-
>  drivers/block/virtio_blk.c |   4 +-
>  drivers/bluetooth/virtio_bt.c  |   2 +-
>  drivers/char/hw_random/virtio-rng.c|   2 +-
>  drivers/char/virtio_console.c  |   4 +-
>  drivers/crypto/virtio/virtio_crypto_core.c |   8 +-
>  drivers/firmware/arm_scmi/virtio.c |   2 +-
>  drivers/firmware/qemu_fw_cfg.c |  21 ++--
>  drivers/gpio/gpio-virtio.c |   2 +-
>  drivers/gpu/drm/virtio/virtgpu_kms.c   |   2 +-
>  drivers/i2c/busses/i2c-virtio.c|   2 +-
>  drivers/iommu/vir

[PATCH v2 3/3] virtio_mem: break device on remove

2022-01-14 Thread Michael S. Tsirkin
A common pattern for device reset is currently:
vdev->config->reset(vdev);
.. cleanup ..

reset prevents new interrupts from arriving and waits for interrupt
handlers to finish.

However if - as is common - the handler queues a work request which is
flushed during the cleanup stage, we have code adding buffers / trying
to get buffers while device is reset. Not good.

This was reproduced by running
modprobe virtio_console
modprobe -r virtio_console
in a loop, and this reasoning seems to apply to virtio mem though
I could not reproduce it there.

Fix this up by calling virtio_break_device + flush before reset.

Signed-off-by: Michael S. Tsirkin 
---
 drivers/virtio/virtio_mem.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index 38becd8d578c..33b8a118a3ae 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -2888,6 +2888,8 @@ static void virtio_mem_remove(struct virtio_device *vdev)
virtio_mem_deinit_hotplug(vm);
 
/* reset the device and cleanup the queues */
+   virtio_break_device(vdev);
+   flush_work(&vm->wq);
virtio_reset_device(vdev);
vdev->config->del_vqs(vdev);
 
-- 
MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH v2 1/3] virtio: document virtio_reset_device

2022-01-14 Thread Michael S. Tsirkin
Looks like most callers get driver/device removal wrong.
Document what's expected of callers.

Signed-off-by: Michael S. Tsirkin 
---
 drivers/virtio/virtio.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
index 2ed6e2451fd8..631a346a3aa6 100644
--- a/drivers/virtio/virtio.c
+++ b/drivers/virtio/virtio.c
@@ -201,6 +201,22 @@ static int virtio_finalize_features(struct virtio_device 
*dev)
return 0;
 }
 
+/**
+ * virtio_reset_device - quiesce device for removal
+ * @dev: the device to reset
+ *
+ * Prevents device from sending interrupts and accessing memory.
+ *
+ * Generally used for cleanup during driver / device removal.
+ *
+ * Once this has been invoked, caller must ensure that
+ * virtqueue_notify / virtqueue_kick are not in progress.
+ *
+ * Note: this guarantees that vq callbacks are not in progress, however caller
+ * is responsible for preventing access from other contexts, such as a system
+ * call/workqueue/bh.  Invoking virtio_break_device then flushing any such
+ * contexts is one way to handle that.
+ * */
 void virtio_reset_device(struct virtio_device *dev)
 {
dev->config->reset(dev);
-- 
MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH v2 2/3] virtio_console: break out of buf poll on remove

2022-01-14 Thread Michael S. Tsirkin
A common pattern for device reset is currently:
vdev->config->reset(vdev);
.. cleanup ..

reset prevents new interrupts from arriving and waits for interrupt
handlers to finish.

However if - as is common - the handler queues a work request which is
flushed during the cleanup stage, we have code adding buffers / trying
to get buffers while device is reset. Not good.

This was reproduced by running
modprobe virtio_console
modprobe -r virtio_console
in a loop.

Fix this up by calling virtio_break_device + flush before reset.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1786239
Signed-off-by: Michael S. Tsirkin 
---
 drivers/char/virtio_console.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index 2359889a35a0..e3c430539a17 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -1957,6 +1957,13 @@ static void virtcons_remove(struct virtio_device *vdev)
list_del(&portdev->list);
spin_unlock_irq(&pdrvdata_lock);
 
+   /* Device is going away, exit any polling for buffers */
+   virtio_break_device(vdev);
+   if (use_multiport(portdev))
+   flush_work(&portdev->control_work);
+   else
+   flush_work(&portdev->config_work);
+
/* Disable interrupts for vqs */
virtio_reset_device(vdev);
/* Finish up work that's lined up */
-- 
MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] virtio_console: break out of buf poll on remove

2022-01-14 Thread Michael S. Tsirkin
On Tue, Oct 05, 2021 at 03:33:42PM +0200, Greg Kroah-Hartman wrote:
> On Tue, Oct 05, 2021 at 03:04:07AM -0400, Michael S. Tsirkin wrote:
> > A common pattern for device reset is currently:
> > vdev->config->reset(vdev);
> > .. cleanup ..
> > 
> > reset prevents new interrupts from arriving and waits for interrupt
> > handlers to finish.
> > 
> > However if - as is common - the handler queues a work request which is
> > flushed during the cleanup stage, we have code adding buffers / trying
> > to get buffers while device is reset. Not good.
> > 
> > This was reproduced by running
> > modprobe virtio_console
> > modprobe -r virtio_console
> > in a loop.
> 
> That's a pathological case that is not "in the field" except by people
> who want to abuse the system as root.  And they can do much worse things
> than that.
> 
> > Fixing this comprehensively needs some thought, and new APIs.
> > Let's at least handle the specific case of virtio_console
> > removal that was reported in the field.
> 
> Let's fix this correctly, don't just hack it up now.

Well I poked at it some more, and things are not as bad
as I thought. It's mostly just console and possibly virtio-mem.
Well and virtio-bt has a completely borken cleanup that
does not even bother to reset the device, but that's
a separate issue, discussing it with the maintainer.

So I wrote some patches to document the requirements better, added a
wrapper for reset and generally cleaned the API up a bit, and added a
patch for mem, but generally I no longer think we need a major API
change.


> > Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1786239
> > Signed-off-by: Michael S. Tsirkin 
> > ---
> >  drivers/char/virtio_console.c | 6 ++
> >  1 file changed, 6 insertions(+)
> > 
> > diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
> > index 7eaf303a7a86..c852ce0b4d56 100644
> > --- a/drivers/char/virtio_console.c
> > +++ b/drivers/char/virtio_console.c
> > @@ -1956,6 +1956,12 @@ static void virtcons_remove(struct virtio_device 
> > *vdev)
> > list_del(&portdev->list);
> > spin_unlock_irq(&pdrvdata_lock);
> >  
> > +   /* Device is going away, exit any polling for buffers */
> > +   virtio_break_device(vdev);
> > +   if (use_multiport(portdev))
> > +   flush_work(&portdev->control_work);
> > +   else
> > +   flush_work(&portdev->config_work);
> > /* Disable interrupts for vqs */
> 
> newline before comment?

sure

> thanks,
> 
> greg k-h

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH] virtio_mem: break device on remove

2022-01-14 Thread Michael S. Tsirkin
A common pattern for device reset is currently:
vdev->config->reset(vdev);
.. cleanup ..

reset prevents new interrupts from arriving and waits for interrupt
handlers to finish.

However if - as is common - the handler queues a work request which is
flushed during the cleanup stage, we have code adding buffers / trying
to get buffers while device is reset. Not good.

This was reproduced by running
modprobe virtio_console
modprobe -r virtio_console
in a loop, and this reasoning seems to apply to virtio mem though
I could not reproduce it there.

Fix this up by calling virtio_break_device + flush before reset.

Signed-off-by: Michael S. Tsirkin 
---
 drivers/virtio/virtio_mem.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index 38becd8d578c..33b8a118a3ae 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -2888,6 +2888,8 @@ static void virtio_mem_remove(struct virtio_device *vdev)
virtio_mem_deinit_hotplug(vm);
 
/* reset the device and cleanup the queues */
+   virtio_break_device(vdev);
+   flush_work(&vm->wq);
virtio_reset_device(vdev);
vdev->config->del_vqs(vdev);
 
-- 
MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [GIT PULL] virtio,vdpa,qemu_fw_cfg: features, cleanups, fixes

2022-01-14 Thread Jean-Philippe Brucker
Hi,

On Fri, Jan 14, 2022 at 03:35:15PM -0500, Michael S. Tsirkin wrote:
> Jean-Philippe Brucker (5):
>   iommu/virtio: Add definitions for VIRTIO_IOMMU_F_BYPASS_CONFIG
>   iommu/virtio: Support bypass domains
>   iommu/virtio: Sort reserved regions
>   iommu/virtio: Pass end address to viommu_add_mapping()
>   iommu/virtio: Support identity-mapped domains

Please could you drop these patches, they are from an old version of the
series. The newer version was already in Joerg's pull request and was
merged, so this will conflict.

Thanks,
Jean

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[GIT PULL] virtio,vdpa,qemu_fw_cfg: features, cleanups, fixes

2022-01-14 Thread Michael S. Tsirkin
The following changes since commit c9e6606c7fe92b50a02ce51dda82586ebdf99b48:

  Linux 5.16-rc8 (2022-01-02 14:23:25 -0800)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus

for you to fetch changes up to f04ac267029c8063fc35116b385cd37656b3c81a:

  virtio: acknowledge all features before access (2022-01-14 14:58:41 -0500)


virtio,vdpa,qemu_fw_cfg: features, cleanups, fixes

IOMMU bypass support in virtio-iommu
partial support for < MAX_ORDER - 1 granularity for virtio-mem
driver_override for vdpa
sysfs ABI documentation for vdpa
multiqueue config support for mlx5 vdpa

Misc fixes, cleanups.

Signed-off-by: Michael S. Tsirkin 


Christophe JAILLET (1):
  eni_vdpa: Simplify 'eni_vdpa_probe()'

Dapeng Mi (1):
  virtio: fix a typo in function "vp_modern_remove" comments.

David Hildenbrand (2):
  virtio-mem: prepare page onlining code for granularity smaller than 
MAX_ORDER - 1
  virtio-mem: prepare fake page onlining code for granularity smaller than 
MAX_ORDER - 1

Eli Cohen (20):
  net/mlx5_vdpa: Offer VIRTIO_NET_F_MTU when setting MTU
  vdpa/mlx5: Fix wrong configuration of virtio_version_1_0
  vdpa: Provide interface to read driver features
  vdpa/mlx5: Distribute RX virtqueues in RQT object
  vdpa: Sync calls set/get config/status with cf_mutex
  vdpa: Read device configuration only if FEATURES_OK
  vdpa: Allow to configure max data virtqueues
  vdpa/mlx5: Fix config_attr_mask assignment
  vdpa/mlx5: Support configuring max data virtqueue
  vdpa: Add support for returning device configuration information
  vdpa/mlx5: Restore cur_num_vqs in case of failure in change_num_qps()
  vdpa: Support reporting max device capabilities
  vdpa/mlx5: Report max device capabilities
  vdpa/vdpa_sim: Configure max supported virtqueues
  vdpa: Use BIT_ULL for bit operations
  vdpa/vdpa_sim_net: Report max device capabilities
  vdpa: Avoid taking cf_mutex lock on get status
  vdpa: Protect vdpa reset with cf_mutex
  vdpa/mlx5: Fix is_index_valid() to refer to features
  vdpa/mlx5: Fix tracking of current number of VQs

Eugenio Pérez (2):
  vdpa: Avoid duplicate call to vp_vdpa get_status
  vdpa: Mark vdpa_config_ops.get_vq_notification as optional

Guanjun (1):
  vduse: moving kvfree into caller

Jean-Philippe Brucker (5):
  iommu/virtio: Add definitions for VIRTIO_IOMMU_F_BYPASS_CONFIG
  iommu/virtio: Support bypass domains
  iommu/virtio: Sort reserved regions
  iommu/virtio: Pass end address to viommu_add_mapping()
  iommu/virtio: Support identity-mapped domains

Johan Hovold (4):
  firmware: qemu_fw_cfg: fix NULL-pointer deref on duplicate entries
  firmware: qemu_fw_cfg: fix kobject leak in probe error path
  firmware: qemu_fw_cfg: fix sysfs information leak
  firmware: qemu_fw_cfg: remove sysfs entries explicitly

Laura Abbott (1):
  vdpa: clean up get_config_size ret value handling

Michael S. Tsirkin (5):
  virtio: wrap config->reset calls
  hwrng: virtio - unregister device before reset
  virtio_ring: mark ring unused on error
  virtio: unexport virtio_finalize_features
  virtio: acknowledge all features before access

Peng Hao (2):
  virtio/virtio_mem: handle a possible NULL as a memcpy parameter
  virtio/virtio_pci_legacy_dev: ensure the correct return value

Stefano Garzarella (2):
  docs: document sysfs ABI for vDPA bus
  vdpa: add driver_override support

Xianting Tian (1):
  vhost/test: fix memory leak of vhost virtqueues

Zhu Lingshan (1):
  ifcvf/vDPA: fix misuse virtio-net device config size for blk dev

王贇 (1):
  virtio-pci: fix the confusing error message

 Documentation/ABI/testing/sysfs-bus-vdpa   |  57 ++
 MAINTAINERS|   1 +
 arch/um/drivers/virt-pci.c |   2 +-
 drivers/block/virtio_blk.c |   4 +-
 drivers/bluetooth/virtio_bt.c  |   2 +-
 drivers/char/hw_random/virtio-rng.c|   2 +-
 drivers/char/virtio_console.c  |   4 +-
 drivers/crypto/virtio/virtio_crypto_core.c |   8 +-
 drivers/firmware/arm_scmi/virtio.c |   2 +-
 drivers/firmware/qemu_fw_cfg.c |  21 ++--
 drivers/gpio/gpio-virtio.c |   2 +-
 drivers/gpu/drm/virtio/virtgpu_kms.c   |   2 +-
 drivers/i2c/busses/i2c-virtio.c|   2 +-
 drivers/iommu/virtio-iommu.c   | 115 
 drivers/net/caif/caif_virtio.c |   2 +-
 drivers/net/virtio_net.c   |   4 +-
 drivers/net/wireless/mac80211_hwsim.c  |   2 +-
 drivers/nvdimm/virtio_pmem.c   |   2 +-
 drivers/rpmsg/virtio_rpmsg_bus.c   |   2 +-
 drivers/scsi/virtio_scsi.c 

Re: [PATCH 4/4] vdpa/mlx5: Fix tracking of current number of VQs

2022-01-14 Thread Michael S. Tsirkin
On Tue, Jan 11, 2022 at 02:14:47PM -0800, Si-Wei Liu wrote:
> Reviewed-by: Si-Wei Liu

Note: the correct format is:

Reviewed-by: Si-Wei Liu 

(with a space before <>).

Pls take care in the future.
Thanks!

-- 
MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] Bluetooth: virtio_bt: fix device removal

2022-01-14 Thread Michael S. Tsirkin
On Thu, Dec 16, 2021 at 08:58:31PM +0100, Marcel Holtmann wrote:
> Hi Michael,
> 
> > Device removal is clearly out of virtio spec: it attempts to remove
> > unused buffers from a VQ before invoking device reset. To fix, make
> > open/close NOPs and do all cleanup/setup in probe/remove.
>  
>  so the virtbt_{open,close} as NOP is not really what a driver is suppose
>  to be doing. These are transport enable/disable callbacks from the BT
>  Core towards the driver. It maps to a device being enabled/disabled by
>  something like bluetoothd for example. So if disabled, I expect that no
>  resources/queues are in use.
>  
>  Maybe I misunderstand the virtio spec in that regard, but I would like
>  to keep this fundamental concept of a Bluetooth driver. It does work
>  with all other transports like USB, SDIO, UART etc.
>  
> > The cost here is a single skb wasted on an unused bt device - which
> > seems modest.
>  
>  There should be no buffer used if the device is powered off. We also 
>  don’t
>  have any USB URBs in-flight if the transport is not active.
>  
> > NB: with this fix in place driver still suffers from a race condition if
> > an interrupt triggers while device is being reset. Work on a fix for
> > that issue is in progress.
>  
>  In the virtbt_close() callback we should deactivate all interrupts.
>  
>  Regards
>  
>  Marcel
> >>> 
> >>> So Marcel, do I read it right that you are working on a fix
> >>> and I can drop this patch for now?
> >> 
> >> ping
> > 
> > 
> > If I don't hear otherwise I'll queue my version - it might not
> > be ideal but it at least does not violate the spec.
> > We can work on not allocating/freeing buffers later
> > as appropriate.
> 
> I have a patch, but it is not fully tested yet.
> 
> Regards
> 
> Marcel

ping

it's been a month ...

I'm working on cleaning up module/device removal in virtio and bt
is kind of sticking out.

-- 
MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[PATCH] virtio: acknowledge all features before access

2022-01-14 Thread Michael S. Tsirkin
The feature negotiation was designed in a way that
makes it possible for devices to know which config
fields will be accessed by drivers.

This is broken since commit 404123c2db79 ("virtio: allow drivers to
validate features") with fallout in at least block and net.
We have a partial work-around in commit 2f9a174f918e ("virtio: write
back F_VERSION_1 before validate") which at least lets devices
find out which format should config space have, but this
is a partial fix: guests should not access config space
without acknowledging features since otherwise we'll never
be able to change the config space format.

As a side effect, this also reduces the amount of hypervisor accesses -
we now only acknowledge features once unless we are clearing any
features when validating.

Cc: sta...@vger.kernel.org
Fixes: 404123c2db79 ("virtio: allow drivers to validate features")
Fixes: 2f9a174f918e ("virtio: write back F_VERSION_1 before validate")
Cc: "Halil Pasic" 
Signed-off-by: Michael S. Tsirkin 
---

Halil, I thought hard about our situation with transitional and
today I finally thought of something I am happy with.
Pls let me know what you think. Testing on big endian would
also be much appreciated!

 drivers/virtio/virtio.c | 31 +--
 1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
index d891b0a354b0..2ed6e2451fd8 100644
--- a/drivers/virtio/virtio.c
+++ b/drivers/virtio/virtio.c
@@ -168,12 +168,10 @@ EXPORT_SYMBOL_GPL(virtio_add_status);
 
 static int virtio_finalize_features(struct virtio_device *dev)
 {
-   int ret = dev->config->finalize_features(dev);
unsigned status;
+   int ret;
 
might_sleep();
-   if (ret)
-   return ret;
 
ret = arch_has_restricted_virtio_memory_access();
if (ret) {
@@ -244,17 +242,6 @@ static int virtio_dev_probe(struct device *_d)
driver_features_legacy = driver_features;
}
 
-   /*
-* Some devices detect legacy solely via F_VERSION_1. Write
-* F_VERSION_1 to force LE config space accesses before FEATURES_OK for
-* these when needed.
-*/
-   if (drv->validate && !virtio_legacy_is_little_endian()
- && device_features & BIT_ULL(VIRTIO_F_VERSION_1)) {
-   dev->features = BIT_ULL(VIRTIO_F_VERSION_1);
-   dev->config->finalize_features(dev);
-   }
-
if (device_features & (1ULL << VIRTIO_F_VERSION_1))
dev->features = driver_features & device_features;
else
@@ -265,10 +252,22 @@ static int virtio_dev_probe(struct device *_d)
if (device_features & (1ULL << i))
__virtio_set_bit(dev, i);
 
+   err = dev->config->finalize_features(dev);
+   if (err)
+   goto err;
+
if (drv->validate) {
+   u64 features = dev->features;
+
err = drv->validate(dev);
if (err)
goto err;
+
+   if (features != dev->features) {
+   err = dev->config->finalize_features(dev);
+   if (err)
+   goto err;
+   }
}
 
err = virtio_finalize_features(dev);
@@ -495,6 +494,10 @@ int virtio_device_restore(struct virtio_device *dev)
/* We have a driver! */
virtio_add_status(dev, VIRTIO_CONFIG_S_DRIVER);
 
+   ret = dev->config->finalize_features(dev);
+   if (ret)
+   goto err;
+
ret = virtio_finalize_features(dev);
if (ret)
goto err;
-- 
MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH] virtio: unexport virtio_finalize_features

2022-01-14 Thread Michael S. Tsirkin
virtio_finalize_features is only used internally within virtio.
No reason to export it.

Signed-off-by: Michael S. Tsirkin 
---
 drivers/virtio/virtio.c | 3 +--
 include/linux/virtio.h  | 1 -
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
index 00ac9db792a4..d891b0a354b0 100644
--- a/drivers/virtio/virtio.c
+++ b/drivers/virtio/virtio.c
@@ -166,7 +166,7 @@ void virtio_add_status(struct virtio_device *dev, unsigned 
int status)
 }
 EXPORT_SYMBOL_GPL(virtio_add_status);
 
-int virtio_finalize_features(struct virtio_device *dev)
+static int virtio_finalize_features(struct virtio_device *dev)
 {
int ret = dev->config->finalize_features(dev);
unsigned status;
@@ -202,7 +202,6 @@ int virtio_finalize_features(struct virtio_device *dev)
}
return 0;
 }
-EXPORT_SYMBOL_GPL(virtio_finalize_features);
 
 void virtio_reset_device(struct virtio_device *dev)
 {
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index 72292a62cd90..5464f398912a 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -133,7 +133,6 @@ bool is_virtio_device(struct device *dev);
 void virtio_break_device(struct virtio_device *dev);
 
 void virtio_config_changed(struct virtio_device *dev);
-int virtio_finalize_features(struct virtio_device *dev);
 #ifdef CONFIG_PM_SLEEP
 int virtio_device_freeze(struct virtio_device *dev);
 int virtio_device_restore(struct virtio_device *dev);
-- 
MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC PATCH v3 7/8] drivers: virtio_mem: use pageblock size as the minimum virtio_mem size.

2022-01-14 Thread David Hildenbrand
On 05.01.22 22:47, Zi Yan wrote:
> From: Zi Yan 
> 
> alloc_contig_range() now only needs to be aligned to pageblock_order,
> drop virtio_mem size requirement that it needs to be the max of
> pageblock_order and MAX_ORDER.
> 
> Signed-off-by: Zi Yan 
> ---
>  drivers/virtio/virtio_mem.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
> index a6a78685cfbe..2664dc16d0f9 100644
> --- a/drivers/virtio/virtio_mem.c
> +++ b/drivers/virtio/virtio_mem.c
> @@ -2481,8 +2481,7 @@ static int virtio_mem_init_hotplug(struct virtio_mem 
> *vm)
>* - Is required for now for alloc_contig_range() to work reliably -
>*   it doesn't properly handle smaller granularity on ZONE_NORMAL.
>*/

Please also update this comment.

-- 
Thanks,

David / dhildenb

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v1] vhost: cache avail index in vhost_enable_notify()

2022-01-14 Thread Michael S. Tsirkin
On Fri, Jan 14, 2022 at 02:38:16PM +0100, Stefano Garzarella wrote:
> On Fri, Jan 14, 2022 at 07:45:35AM -0500, Michael S. Tsirkin wrote:
> > On Fri, Jan 14, 2022 at 10:05:08AM +0100, Stefano Garzarella wrote:
> > > In vhost_enable_notify() we enable the notifications and we read
> > > the avail index to check if new buffers have become available in
> > > the meantime.
> > > 
> > > We are not caching the avail index, so when the device will call
> > > vhost_get_vq_desc(), it will find the old value in the cache and
> > > it will read the avail index again.
> > > 
> > > It would be better to refresh the cache every time we read avail
> > > index, so let's change vhost_enable_notify() caching the value in
> > > `avail_idx` and compare it with `last_avail_idx` to check if there
> > > are new buffers available.
> > > 
> > > Anyway, we don't expect a significant performance boost because
> > > the above path is not very common, indeed vhost_enable_notify()
> > > is often called with unlikely(), expecting that avail index has
> > > not been updated.
> > > 
> > > Signed-off-by: Stefano Garzarella 
> > 
> > ... and can in theory even hurt due to an extra memory write.
> > So ... performance test restults pls?
> 
> Right, could be.
> 
> I'll run some perf test with vsock, about net, do you have a test suite or
> common step to follow to test it?
> 
> Thanks,
> Stefano

You can use the vhost test as a unit test as well.

-- 
MST

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC PATCH v3 5/8] mm: page_isolation: check specified range for unmovable pages during isolation.

2022-01-14 Thread David Hildenbrand
On 05.01.22 22:47, Zi Yan wrote:
> From: Zi Yan 
> 
> Enable set_migratetype_isolate() to check specified sub-range for
> unmovable pages during isolation. Page isolation is done
> at max(MAX_ORDER_NR_PAEGS, pageblock_nr_pages) granularity, but not all
> pages within that granularity are intended to be isolated. For example,
> alloc_contig_range(), which uses page isolation, allows ranges without
> alignment. This commit makes unmovable page check only look for
> interesting pages, so that page isolation can succeed for any
> non-overlapping ranges.

Are you handling if we start checking in the middle of a compound page
and actually have to lookup the head to figure out if movable or not?

> 
> has_unmovable_pages() is moved to mm/page_isolation.c since it is only
> used by page isolation.

Please move that into a separate patch upfront, makes this patch much
easier to review.

-- 
Thanks,

David / dhildenb

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v1] vhost: cache avail index in vhost_enable_notify()

2022-01-14 Thread Stefano Garzarella

On Fri, Jan 14, 2022 at 07:45:35AM -0500, Michael S. Tsirkin wrote:

On Fri, Jan 14, 2022 at 10:05:08AM +0100, Stefano Garzarella wrote:

In vhost_enable_notify() we enable the notifications and we read
the avail index to check if new buffers have become available in
the meantime.

We are not caching the avail index, so when the device will call
vhost_get_vq_desc(), it will find the old value in the cache and
it will read the avail index again.

It would be better to refresh the cache every time we read avail
index, so let's change vhost_enable_notify() caching the value in
`avail_idx` and compare it with `last_avail_idx` to check if there
are new buffers available.

Anyway, we don't expect a significant performance boost because
the above path is not very common, indeed vhost_enable_notify()
is often called with unlikely(), expecting that avail index has
not been updated.

Signed-off-by: Stefano Garzarella 


... and can in theory even hurt due to an extra memory write.
So ... performance test restults pls?


Right, could be.

I'll run some perf test with vsock, about net, do you have a test suite 
or common step to follow to test it?


Thanks,
Stefano

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 2/3] vdpa/mlx5: set_features should nack MQ if no CTRL_VQ

2022-01-14 Thread Michael S. Tsirkin
On Fri, Jan 14, 2022 at 12:51:55AM -0800, Si-Wei Liu wrote:
> 
> 
> On 1/12/2022 10:57 PM, Michael S. Tsirkin wrote:
> > On Thu, Jan 13, 2022 at 12:10:50AM -0500, Si-Wei Liu wrote:
> > > Made corresponding change per spec:
> > 
> > > The device MUST NOT offer a feature which requires another feature
> > > which was not offered.
> > Says nothing about the driver though, and you seem to be
> > doing things to driver features?
> Yes, it's about validation for driver features, though the spec doesn't have
> clear way how to deal with this situation. I guess this in reality leaves
> quite some space for the implementation. To step back, in recent days with
> latent spec revision for feature negotiation due to endianness and MTU
> validation, what do we expect device to work if the driver is not compliant
> and comes up with invalid features set? To clear a subset of driver features
> unsupported by the device, such that driver may figure out by reading it
> from device config space later on? Or fail the entire features and have
> driver to re-try a different setting? Do you feel its possible for device to
> clear a subset of invalid or unsupported features sent down by the driver,
> which may allow driver continue to work without having to fail the entire
> feature negotiation?
> 
> The current userspace implementation in qemu may filter out invalid features
> from driver by clearing a subset and tailor it to fit what host/device can
> offer. I thought it should be safe to follow the existing practice. That way
> guest driver can get know of the effective features during feature
> negotiation, or after features_ok is set (that's what I call by "re-read" of
> features, sorry if I used the wrong term). Did I miss something? I can
> definitely add more explanation for the motivation, remove the reference to
> spec and delete the Fixes tag to avoid confusions. Do you think this would
> work?
> 
> Another option would be just return failure for the set_driver_features()
> call when seeing (MQ && !CTRL_VQ). Simple enough and easy to implement.
> Efficient to indicate which individual feature is failing? Probably not,
> driver has to retry a few times using binary search to know.
> 
> > pls explain the motivation. which config are you trying to
> > fix what is current and expected behaviour.
> The current mq code for mlx5_vdpa driver is written in the assumption that
> MQ must come together with CTRL_VQ. I would like to point out that right now
> there's nowhere in the host side even QEMU to guarantee this assumption
> would hold. Were there a malicious driver sending down MQ without CTRL_VQ,
> it would compromise various spots such as is_index_valid() and
> is_ctrl_vq_idx(). This doesn't end up with immediate panic or security
> loophole in the host currently, but still the chance for this being taken
> advantage of is not zero, especially when future code change is involved.
> You can say it's code cleanup, but the added check helps harden the crispy
> assumption and assures peace of mind.

I think that right now the right thing to do is to validate untrusted
input and fail invalid operations.
The spec does say "VIRTIO_NET_F_MQ Requires VIRTIO_NET_F_CTRL_VQ".
If there are existing legacy drivers
violating some rules, then we should consider working around that (and
maybe documenting that in the spec in the legacy section).


> > 
> > > Fixes: 52893733f2c5 ("vdpa/mlx5: Add multiqueue support")
> > 
> > It's all theoretical right? Fixes really means
> > "if you have commit ABC then you should pick this one up".
> > not really appropriate for theoretical fixes.
> Yeah. This was discovered in code review. Didn't see a real issue. I can
> remove the tag.
> 
> -Siwei
> > 
> > > Signed-off-by: Si-Wei Liu
> > > ---
> > >   drivers/vdpa/mlx5/net/mlx5_vnet.c | 16 +---
> > >   1 file changed, 13 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c 
> > > b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > index b53603d..46d4deb 100644
> > > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> > > @@ -1897,11 +1897,21 @@ static u64 mlx5_vdpa_get_device_features(struct 
> > > vdpa_device *vdev)
> > >   return ndev->mvdev.mlx_features;
> > >   }
> > > -static int verify_min_features(struct mlx5_vdpa_dev *mvdev, u64 features)
> > > +static int verify_driver_features(struct mlx5_vdpa_dev *mvdev, u64 
> > > *features)
> > 
> > Good rename actually but document in commit log with an
> > explanation.
> > 
> > >   {
> > > - if (!(features & BIT_ULL(VIRTIO_F_ACCESS_PLATFORM)))
> > > + /* minimum features to expect */
> > > + if (!(*features & BIT_ULL(VIRTIO_F_ACCESS_PLATFORM)))
> > >   return -EOPNOTSUPP;
> > > + /* Double check features combination sent down by the driver.
> > > +  * NACK invalid feature due to the absence of depended feature.
> > Pls rewrite this to make it grammatical.  There's no NACK in spec. What
> > does this do? Fai

Re: [PATCH v1] vhost: cache avail index in vhost_enable_notify()

2022-01-14 Thread Michael S. Tsirkin
On Fri, Jan 14, 2022 at 10:05:08AM +0100, Stefano Garzarella wrote:
> In vhost_enable_notify() we enable the notifications and we read
> the avail index to check if new buffers have become available in
> the meantime.
> 
> We are not caching the avail index, so when the device will call
> vhost_get_vq_desc(), it will find the old value in the cache and
> it will read the avail index again.
> 
> It would be better to refresh the cache every time we read avail
> index, so let's change vhost_enable_notify() caching the value in
> `avail_idx` and compare it with `last_avail_idx` to check if there
> are new buffers available.
> 
> Anyway, we don't expect a significant performance boost because
> the above path is not very common, indeed vhost_enable_notify()
> is often called with unlikely(), expecting that avail index has
> not been updated.
> 
> Signed-off-by: Stefano Garzarella 

... and can in theory even hurt due to an extra memory write.
So ... performance test restults pls?

> ---
> v1:
> - improved the commit description [MST, Jason]
> ---
>  drivers/vhost/vhost.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index 59edb5a1ffe2..07363dff559e 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -2543,8 +2543,9 @@ bool vhost_enable_notify(struct vhost_dev *dev, struct 
> vhost_virtqueue *vq)
>  &vq->avail->idx, r);
>   return false;
>   }
> + vq->avail_idx = vhost16_to_cpu(vq, avail_idx);
>  
> - return vhost16_to_cpu(vq, avail_idx) != vq->avail_idx;
> + return vq->avail_idx != vq->last_avail_idx;
>  }
>  EXPORT_SYMBOL_GPL(vhost_enable_notify);
>  
> -- 
> 2.31.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 0/3] fixes for mlx5_vdpa multiqueue support

2022-01-14 Thread Si-Wei Liu



On 1/12/2022 11:03 PM, Michael S. Tsirkin wrote:

On Thu, Jan 13, 2022 at 12:10:48AM -0500, Si-Wei Liu wrote:

This patchset contains the fixes for a few issues uncovered during the
review for the "Allow for configuring max number of virtqueue pairs"
series.

It is based on Eli's fixes:
2e4cda633a22 ("vdpa/mlx5: Fix tracking of current number of VQs")
in the vhost tree.

It's really cleanups more than fixes. I'm not sure about the
code changes (the vdpa change looks ok, mlx5 ones need review
by nvidia folks) but from documentation POV this patchset needs
more work.
Yeah, this changeset is aiming to tighten the lose end found during code 
review.  Will try to improve commit message/comment and get back.


There will be another set that I am working on for real issues. Stay tuned.

-Siwei



Si-Wei Liu (3):
   vdpa: factor out vdpa_set_features_unlocked for vdpa internal use
   vdpa/mlx5: set_features should nack MQ if no CTRL_VQ
   vdpa/mlx5: validate the queue pair value from driver

  drivers/vdpa/mlx5/net/mlx5_vnet.c | 26 +++---
  drivers/vdpa/vdpa.c   |  2 +-
  drivers/vhost/vdpa.c  |  2 +-
  drivers/virtio/virtio_vdpa.c  |  2 +-
  include/linux/vdpa.h  | 18 --
  5 files changed, 38 insertions(+), 12 deletions(-)

--
1.8.3.1


___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH 3/3] vdpa/mlx5: validate the queue pair value from driver

2022-01-14 Thread Si-Wei Liu




On 1/12/2022 11:00 PM, Michael S. Tsirkin wrote:

On Thu, Jan 13, 2022 at 12:10:51AM -0500, Si-Wei Liu wrote:

Fixes: 52893733f2c5 ("vdpa/mlx5: Add multiqueue support")
Signed-off-by: Si-Wei Liu

Add motivation for change in the commit log.



---
  drivers/vdpa/mlx5/net/mlx5_vnet.c | 10 ++
  1 file changed, 10 insertions(+)

diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c 
b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index 46d4deb..491127f 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -1563,11 +1563,21 @@ static virtio_net_ctrl_ack handle_ctrl_mq(struct 
mlx5_vdpa_dev *mvdev, u8 cmd)
  
  	switch (cmd) {

case VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET:
+   /* This mq feature check aligns with pre-existing userspace 
implementation,
+* although the spec doesn't mandate so.

And so ... why do we bother? what breaks if we don't?
Without it, a malicious driver could fake a config multiqueue request 
down to a non-mq backend that may cause kernel to panic due to 
uninitialized resources for the queue. Even with a well behaving guest 
driver, it is not expected to allow such kind of change.



+*/
+   if (!MLX5_FEATURE(mvdev, VIRTIO_NET_F_MQ))
+   break;
+


this part is not described in the commit log at all.
is it intentional?


No, I can add it.



read = vringh_iov_pull_iotlb(&cvq->vring, &cvq->riov, (void 
*)&mq, sizeof(mq));
if (read != sizeof(mq))
break;
  
  		newqps = mlx5vdpa16_to_cpu(mvdev, mq.virtqueue_pairs);

+   if (newqps < VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MIN ||
+   newqps > mlx5_vdpa_max_qps(mvdev->max_vqs))
+   break;
+
if (ndev->cur_num_vqs == 2 * newqps) {
status = VIRTIO_NET_OK;
break;
--
1.8.3.1


___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH v1] vhost: cache avail index in vhost_enable_notify()

2022-01-14 Thread Stefano Garzarella
In vhost_enable_notify() we enable the notifications and we read
the avail index to check if new buffers have become available in
the meantime.

We are not caching the avail index, so when the device will call
vhost_get_vq_desc(), it will find the old value in the cache and
it will read the avail index again.

It would be better to refresh the cache every time we read avail
index, so let's change vhost_enable_notify() caching the value in
`avail_idx` and compare it with `last_avail_idx` to check if there
are new buffers available.

Anyway, we don't expect a significant performance boost because
the above path is not very common, indeed vhost_enable_notify()
is often called with unlikely(), expecting that avail index has
not been updated.

Signed-off-by: Stefano Garzarella 
---
v1:
- improved the commit description [MST, Jason]
---
 drivers/vhost/vhost.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 59edb5a1ffe2..07363dff559e 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -2543,8 +2543,9 @@ bool vhost_enable_notify(struct vhost_dev *dev, struct 
vhost_virtqueue *vq)
   &vq->avail->idx, r);
return false;
}
+   vq->avail_idx = vhost16_to_cpu(vq, avail_idx);
 
-   return vhost16_to_cpu(vq, avail_idx) != vq->avail_idx;
+   return vq->avail_idx != vq->last_avail_idx;
 }
 EXPORT_SYMBOL_GPL(vhost_enable_notify);
 
-- 
2.31.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 2/3] vdpa/mlx5: set_features should nack MQ if no CTRL_VQ

2022-01-14 Thread Si-Wei Liu




On 1/12/2022 10:57 PM, Michael S. Tsirkin wrote:

On Thu, Jan 13, 2022 at 12:10:50AM -0500, Si-Wei Liu wrote:

Made corresponding change per spec:



The device MUST NOT offer a feature which requires another feature
which was not offered.

Says nothing about the driver though, and you seem to be
doing things to driver features?
Yes, it's about validation for driver features, though the spec doesn't 
have clear way how to deal with this situation. I guess this in reality 
leaves quite some space for the implementation. To step back, in recent 
days with latent spec revision for feature negotiation due to endianness 
and MTU validation, what do we expect device to work if the driver is 
not compliant and comes up with invalid features set? To clear a subset 
of driver features unsupported by the device, such that driver may 
figure out by reading it from device config space later on? Or fail the 
entire features and have driver to re-try a different setting? Do you 
feel its possible for device to clear a subset of invalid or unsupported 
features sent down by the driver, which may allow driver continue to 
work without having to fail the entire feature negotiation?


The current userspace implementation in qemu may filter out invalid 
features from driver by clearing a subset and tailor it to fit what 
host/device can offer. I thought it should be safe to follow the 
existing practice. That way guest driver can get know of the effective 
features during feature negotiation, or after features_ok is set (that's 
what I call by "re-read" of features, sorry if I used the wrong term). 
Did I miss something? I can definitely add more explanation for the 
motivation, remove the reference to spec and delete the Fixes tag to 
avoid confusions. Do you think this would work?


Another option would be just return failure for the 
set_driver_features() call when seeing (MQ && !CTRL_VQ). Simple enough 
and easy to implement. Efficient to indicate which individual feature is 
failing? Probably not, driver has to retry a few times using binary 
search to know.



pls explain the motivation. which config are you trying to
fix what is current and expected behaviour.
The current mq code for mlx5_vdpa driver is written in the assumption 
that MQ must come together with CTRL_VQ. I would like to point out that 
right now there's nowhere in the host side even QEMU to guarantee this 
assumption would hold. Were there a malicious driver sending down MQ 
without CTRL_VQ, it would compromise various spots such as 
is_index_valid() and is_ctrl_vq_idx(). This doesn't end up with 
immediate panic or security loophole in the host currently, but still 
the chance for this being taken advantage of is not zero, especially 
when future code change is involved. You can say it's code cleanup, but 
the added check helps harden the crispy assumption and assures peace of 
mind.





Fixes: 52893733f2c5 ("vdpa/mlx5: Add multiqueue support")


It's all theoretical right? Fixes really means
"if you have commit ABC then you should pick this one up".
not really appropriate for theoretical fixes.
Yeah. This was discovered in code review. Didn't see a real issue. I can 
remove the tag.


-Siwei



Signed-off-by: Si-Wei Liu
---
  drivers/vdpa/mlx5/net/mlx5_vnet.c | 16 +---
  1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c 
b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index b53603d..46d4deb 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -1897,11 +1897,21 @@ static u64 mlx5_vdpa_get_device_features(struct 
vdpa_device *vdev)
return ndev->mvdev.mlx_features;
  }
  
-static int verify_min_features(struct mlx5_vdpa_dev *mvdev, u64 features)

+static int verify_driver_features(struct mlx5_vdpa_dev *mvdev, u64 *features)


Good rename actually but document in commit log with an
explanation.


  {
-   if (!(features & BIT_ULL(VIRTIO_F_ACCESS_PLATFORM)))
+   /* minimum features to expect */
+   if (!(*features & BIT_ULL(VIRTIO_F_ACCESS_PLATFORM)))
return -EOPNOTSUPP;
  
+	/* Double check features combination sent down by the driver.

+* NACK invalid feature due to the absence of depended feature.

Pls rewrite this to make it grammatical.  There's no NACK in spec. What
does this do? Fails to set FEATURES_OK?


+* Driver is expected to re-read the negotiated features once
+* return from set_driver_features.

once return is ungrammatical. What to say here depends on what
you mean by this, so I'm not sure.


Here's text from spec:

\item\label{itm:General Initialization And Device Operation /
Device Initialization / Read feature bits} Read device feature bits, and write 
the subset of feature bits
understood by the OS and driver to the device.  During this step the
driver MAY read (but MUST NOT write) the device-specific configuration 
fields to check that it can support the device before accepting it.

Re: [RFC PATCH] vhost: cache avail index in vhost_enable_notify()

2022-01-14 Thread Stefano Garzarella

On Fri, Jan 14, 2022 at 02:18:01PM +0800, Jason Wang wrote:

On Thu, Jan 13, 2022 at 10:57 PM Stefano Garzarella  wrote:


In vhost_enable_notify() we enable the notifications and we read
the avail index to check if new buffers have become available in
the meantime. In this case, the device would go to re-read avail
index to access the descriptor.

As we already do in other place, we can cache the value in `avail_idx`
and compare it with `last_avail_idx` to check if there are new
buffers available.

Signed-off-by: Stefano Garzarella 


Patch looks fine but I guess we won't get performance improvement
since it doesn't save any userspace/VM memory access?


It should save the memory access when vhost_enable_notify() find 
something new in the VQ, so in this path:


vhost_enable_notify() <- VM memory access for avail index
  == true

vhost_disable_notify()
...

vhost_get_vq_desc()   <- VM memory access for avail index
 with the patch applied, this access is 
 avoided since avail index is cached


In any case, I don't expect this to be a very common path, indeed we
usually use unlikely() for this path:

if (unlikely(vhost_enable_notify(dev, vq))) {
vhost_disable_notify(dev, vq);
continue;
}

So I don't expect a significant performance increase.

v1 coming with a better commit description.

Thanks,
Stefano

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization