Re: [PATCH v3] drm/virtio: fix missing dma_fence_put() in virtio_gpu_execbuffer_ioctl()

2020-08-06 Thread Greg KH
On Fri, Aug 07, 2020 at 11:00:11AM +0800, Xin He wrote:
> Xin He  wrote on Tue, Jul 21, 2020 at 6:17 PM:
> >
> > From: Qi Liu 
> >
> > We should drop the reference to the fence after calling
> > virtio_gpu_cmd_submit(), so add the missing dma_fence_put().
> >
> > Fixes: 2cd7b6f08bc4 ("drm/virtio: add in/out fence support for explicit synchronization")
> > Co-developed-by: Xin He 
> > Signed-off-by: Xin He 
> > Signed-off-by: Qi Liu 
> > Reviewed-by: Muchun Song 
> > ---
> >
> > changelog in v3:
> > 1) Change the subject from "drm/virtio: fixed memory leak in 
> > virtio_gpu_execbuffer_ioctl()" to
> >"drm/virtio: fix missing dma_fence_put() in 
> > virtio_gpu_execbuffer_ioctl()"
> > 2) Rework the commit log
> >
> > changelog in v2:
> > 1) Add a change description
> >
> >  drivers/gpu/drm/virtio/virtgpu_ioctl.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
> > index 5df722072ba0..19c5bc01eb79 100644
> > --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c
> > +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
> > @@ -179,6 +179,7 @@ static int virtio_gpu_execbuffer_ioctl(struct 
> > drm_device *dev, void *data,
> >
> > virtio_gpu_cmd_submit(vgdev, buf, exbuf->size,
> >   vfpriv->ctx_id, buflist, out_fence);
> > +   dma_fence_put(&out_fence->f);
> > virtio_gpu_notify(vgdev);
> > return 0;
> >
> > --
> > 2.21.1 (Apple Git-122.3)
> >
> 
> cc Greg

Why?

$ ./scripts/get_maintainer.pl --file drivers/gpu/drm/virtio/virtgpu_ioctl.c
David Airlie  (maintainer:VIRTIO GPU DRIVER)
Gerd Hoffmann  (maintainer:VIRTIO GPU DRIVER)
Daniel Vetter  (maintainer:DRM DRIVERS)
Sumit Semwal  (maintainer:DMA BUFFER SHARING FRAMEWORK)
"Christian König"  (maintainer:DMA BUFFER SHARING 
FRAMEWORK)
dri-de...@lists.freedesktop.org (open list:VIRTIO GPU DRIVER)
virtualization@lists.linux-foundation.org (open list:VIRTIO GPU DRIVER)
linux-ker...@vger.kernel.org (open list)
linux-me...@vger.kernel.org (open list:DMA BUFFER SHARING FRAMEWORK)
linaro-mm-...@lists.linaro.org (moderated list:DMA BUFFER SHARING FRAMEWORK)
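
For readers following the thread, the lifetime rule the patch enforces is
roughly the following. This is a simplified sketch, assuming (as the driver
does) that virtio_gpu_cmd_submit() takes its own reference to the fence; it
is not the literal driver code:

	struct virtio_gpu_fence *out_fence;

	out_fence = virtio_gpu_fence_alloc(vgdev);	/* refcount == 1 */

	/* The submit path stores the fence and holds its own reference
	 * until the host signals completion. */
	virtio_gpu_cmd_submit(vgdev, buf, exbuf->size,
			      vfpriv->ctx_id, buflist, out_fence);

	/* The ioctl's reference is no longer needed; without this put,
	 * the refcount never drops to zero and the fence leaks. */
	dma_fence_put(&out_fence->f);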


Re: [PATCH] vdpa/mlx5: Fix erroneous null pointer checks

2020-08-06 Thread Jason Wang


On 2020/8/7 11:37 AM, Jason Wang wrote:


On 2020/8/7 3:18 AM, Alex Dewar wrote:

In alloc_inout() in net/mlx5_vnet.c, there are a few places where memory
is allocated to *in and *out, but only the values of in and out are
null-checked (i.e. there is a missing dereference). Fix this.

Addresses-Coverity: ("CID 1496603: (REVERSE_INULL)")
Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")

Signed-off-by: Alex Dewar 



Acked-by: Jason Wang 



Colin posted something similar: [PATCH][next] vdpa/mlx5: fix memory
allocation failure checks.

And I think his fix is better, since it prevents raw pointers from being
freed.

Thanks


Re: [PATCH][next] vdpa/mlx5: fix memory allocation failure checks

2020-08-06 Thread Jason Wang


On 2020/8/7 12:08 AM, Colin King wrote:

From: Colin Ian King 

The memory allocation failure checking for in and out is currently
checking if the pointers are valid rather than the contents of what
they point to. Hence the null check on failed memory allocations is
incorrect.  Fix this by adding the missing indirection in the check.
Also, for the default case, just set *in and *out to NULL, as
these don't have anything allocated to kfree. Finally, remove the
redundant *in and *out check, as these checks have already been done
on each allocation in the case statement.

Addresses-Coverity: ("Null pointer dereference")
Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")
Signed-off-by: Colin Ian King 



Acked-by: Jason Wang 



---
  drivers/vdpa/mlx5/net/mlx5_vnet.c | 13 ++---
  1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index 3ec44a4f0e45..55bc58e1dae9 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -867,7 +867,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int 
cmd, void **in, int *inl
*outlen = MLX5_ST_SZ_BYTES(qp_2rst_out);
*in = kzalloc(*inlen, GFP_KERNEL);
*out = kzalloc(*outlen, GFP_KERNEL);
-   if (!in || !out)
+   if (!*in || !*out)
goto outerr;
  
  		MLX5_SET(qp_2rst_in, *in, opcode, cmd);

@@ -879,7 +879,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int 
cmd, void **in, int *inl
*outlen = MLX5_ST_SZ_BYTES(rst2init_qp_out);
*in = kzalloc(*inlen, GFP_KERNEL);
*out = kzalloc(MLX5_ST_SZ_BYTES(rst2init_qp_out), GFP_KERNEL);
-   if (!in || !out)
+   if (!*in || !*out)
goto outerr;
  
  		MLX5_SET(rst2init_qp_in, *in, opcode, cmd);

@@ -896,7 +896,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int 
cmd, void **in, int *inl
*outlen = MLX5_ST_SZ_BYTES(init2rtr_qp_out);
*in = kzalloc(*inlen, GFP_KERNEL);
*out = kzalloc(MLX5_ST_SZ_BYTES(init2rtr_qp_out), GFP_KERNEL);
-   if (!in || !out)
+   if (!*in || !*out)
goto outerr;
  
  		MLX5_SET(init2rtr_qp_in, *in, opcode, cmd);

@@ -914,7 +914,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int 
cmd, void **in, int *inl
*outlen = MLX5_ST_SZ_BYTES(rtr2rts_qp_out);
*in = kzalloc(*inlen, GFP_KERNEL);
*out = kzalloc(MLX5_ST_SZ_BYTES(rtr2rts_qp_out), GFP_KERNEL);
-   if (!in || !out)
+   if (!*in || !*out)
goto outerr;
  
  		MLX5_SET(rtr2rts_qp_in, *in, opcode, cmd);

@@ -927,16 +927,15 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int 
cmd, void **in, int *inl
MLX5_SET(qpc, qpc, rnr_retry, 7);
break;
default:
-   goto outerr;
+   goto outerr_nullify;
}
-   if (!*in || !*out)
-   goto outerr;
  
  	return;
  
  outerr:

kfree(*in);
kfree(*out);
+outerr_nullify:
*in = NULL;
*out = NULL;
  }
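
The underlying mistake generalizes to any function that allocates through an
out-parameter: the pointer argument itself is always non-NULL, so only a
dereferenced check can catch allocation failure. A minimal standalone
illustration (not the driver code):

	#include <stdlib.h>

	/* 'in' mirrors the void **in out-parameter of alloc_inout(). */
	static int alloc_out_param(void **in, size_t len)
	{
		*in = calloc(1, len);
		/* if (!in)  -- BUG: 'in' is the caller's address, never NULL */
		if (!*in)	/* correct: test what 'in' points to */
			return -1;
		return 0;
	}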



Re: [PATCH] vdpa/mlx5: Fix uninitialised variable in core/mr.c

2020-08-06 Thread Jason Wang


On 2020/8/7 2:56 AM, Alex Dewar wrote:

If the kernel is unable to allocate memory for the variable dmr then
err will be returned without being set. Set err to -ENOMEM in this
case.

Fixes: 94abbccdf291 ("vdpa/mlx5: Add shared memory registration code")
Addresses-Coverity: ("Uninitialized variables")
Signed-off-by: Alex Dewar 



Acked-by: Jason Wang 



---
  drivers/vdpa/mlx5/core/mr.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/vdpa/mlx5/core/mr.c b/drivers/vdpa/mlx5/core/mr.c
index f5dec0274133..ef1c550f8266 100644
--- a/drivers/vdpa/mlx5/core/mr.c
+++ b/drivers/vdpa/mlx5/core/mr.c
@@ -319,8 +319,10 @@ static int add_direct_chain(struct mlx5_vdpa_dev *mvdev, 
u64 start, u64 size, u8
while (size) {
sz = (u32)min_t(u64, MAX_KLM_SIZE, size);
dmr = kzalloc(sizeof(*dmr), GFP_KERNEL);
-   if (!dmr)
+   if (!dmr) {
+   err = -ENOMEM;
goto err_alloc;
+   }

dmr->start = st;
dmr->end = st + sz;
--
2.28.0
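
The bug class here is the usual goto-unwind pattern with a shared error
label: every branch that jumps to the label must assign the error code
first, otherwise the function returns whatever happens to be on the stack.
A minimal sketch (illustrative, not the driver code):

	#include <errno.h>
	#include <stdlib.h>

	static int add_items(int n)
	{
		int err;

		for (int i = 0; i < n; i++) {
			void *item = malloc(64);

			if (!item) {
				err = -ENOMEM;	/* without this, 'err' is garbage below */
				goto err_alloc;
			}
			free(item);		/* stand-in for the real bookkeeping */
		}
		return 0;

	err_alloc:
		/* unwind previously added items here */
		return err;
	}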




Re: [PATCH 1/2] vdpa: ifcvf: return err when fail to request config irq

2020-08-06 Thread Jason Wang


On 2020/7/23 5:12 PM, Jason Wang wrote:

We ignore the error returned when requesting the config interrupt; fix this.

Fixes: e7991f376a4d ("ifcvf: implement config interrupt in IFCVF")
Cc: Zhu Lingshan 
Signed-off-by: Jason Wang 
---
  drivers/vdpa/ifcvf/ifcvf_main.c | 4 ++++
  1 file changed, 4 insertions(+)

diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
index f5a60c14b979..ae7110955a44 100644
--- a/drivers/vdpa/ifcvf/ifcvf_main.c
+++ b/drivers/vdpa/ifcvf/ifcvf_main.c
@@ -76,6 +76,10 @@ static int ifcvf_request_irq(struct ifcvf_adapter *adapter)
ret = devm_request_irq(&pdev->dev, irq,
   ifcvf_config_changed, 0,
   vf->config_msix_name, vf);
+   if (ret) {
+   IFCVF_ERR(pdev, "Failed to request config irq\n");
+   return ret;
+   }
  
  	for (i = 0; i < IFCVF_MAX_QUEUE_PAIRS * 2; i++) {

snprintf(vf->vring[i].msix_name, 256, "ifcvf[%s]-%d\n",



Hi Michael:

Any comments on this series?

Thanks



Re: [PATCH] vdpa/mlx5: Fix erroneous null pointer checks

2020-08-06 Thread Jason Wang


On 2020/8/7 3:18 AM, Alex Dewar wrote:

In alloc_inout() in net/mlx5_vnet.c, there are a few places where memory
is allocated to *in and *out, but only the values of in and out are
null-checked (i.e. there is a missing dereference). Fix this.

Addresses-Coverity: ("CID 1496603: (REVERSE_INULL)")
Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")
Signed-off-by: Alex Dewar 



Acked-by: Jason Wang 



---
  drivers/vdpa/mlx5/net/mlx5_vnet.c | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index 3ec44a4f0e45..bcb6600c2839 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -867,7 +867,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int 
cmd, void **in, int *inl
*outlen = MLX5_ST_SZ_BYTES(qp_2rst_out);
*in = kzalloc(*inlen, GFP_KERNEL);
*out = kzalloc(*outlen, GFP_KERNEL);
-   if (!in || !out)
+   if (!*in || !*out)
goto outerr;
  
  		MLX5_SET(qp_2rst_in, *in, opcode, cmd);

@@ -879,7 +879,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int 
cmd, void **in, int *inl
*outlen = MLX5_ST_SZ_BYTES(rst2init_qp_out);
*in = kzalloc(*inlen, GFP_KERNEL);
*out = kzalloc(MLX5_ST_SZ_BYTES(rst2init_qp_out), GFP_KERNEL);
-   if (!in || !out)
+   if (!*in || !*out)
goto outerr;
  
  		MLX5_SET(rst2init_qp_in, *in, opcode, cmd);

@@ -896,7 +896,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int 
cmd, void **in, int *inl
*outlen = MLX5_ST_SZ_BYTES(init2rtr_qp_out);
*in = kzalloc(*inlen, GFP_KERNEL);
*out = kzalloc(MLX5_ST_SZ_BYTES(init2rtr_qp_out), GFP_KERNEL);
-   if (!in || !out)
+   if (!*in || !*out)
goto outerr;
  
  		MLX5_SET(init2rtr_qp_in, *in, opcode, cmd);

@@ -914,7 +914,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int 
cmd, void **in, int *inl
*outlen = MLX5_ST_SZ_BYTES(rtr2rts_qp_out);
*in = kzalloc(*inlen, GFP_KERNEL);
*out = kzalloc(MLX5_ST_SZ_BYTES(rtr2rts_qp_out), GFP_KERNEL);
-   if (!in || !out)
+   if (!*in || !*out)
goto outerr;
  
  		MLX5_SET(rtr2rts_qp_in, *in, opcode, cmd);



Re: [PATCH v2 03/24] virtio: allow __virtioXX, __leXX in config space

2020-08-06 Thread Jason Wang


On 2020/8/6 1:58 PM, Michael S. Tsirkin wrote:

On Thu, Aug 06, 2020 at 11:37:38AM +0800, Jason Wang wrote:

On 2020/8/5 7:45 PM, Michael S. Tsirkin wrote:

#define virtio_cread(vdev, structname, member, ptr) \
do {\
might_sleep();  \
/* Must match the member's type, and be integer */  \
-   if (!typecheck(typeof((((structname*)0)->member)), *(ptr))) \
+   if (!__virtio_typecheck(structname, member, *(ptr)))\
(*ptr) = 1; \

A silly question: compared to using set()/get() directly, what's the value
of the accessor macros here?

Thanks

get/set don't convert to the native endianness; I guess that's why
drivers use cread/cwrite. It is also nice that there's type
safety, checking that the correct integer width is used.


Yes, but that is simply because a macro is used here. How about just doing
something similar to virtio_cread_bytes():

static inline void virtio_cread(struct virtio_device *vdev,
                   unsigned int offset,
                   void *buf, size_t len)


And do the endian conversion inside?

Thanks


Then you lose type safety. It's very easy to have an le32 field
and try to read it into a u16 by mistake.

These macros are all about preventing bugs: and the whole patchset
is about several bugs sparse found - that is what prompted me to make
type checks more strict.



Yes, but we would need to implement the macro with compiler extensions. I
wonder whether the kernel already has something like this, since this kind
of request is pretty common.


Thanks
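
For what it's worth, the compile-time check being discussed can be built
from standard GCC extensions, in the style of the kernel's
include/linux/typecheck.h. A sketch (the macro name is illustrative):

	/*
	 * Evaluates to 1, but warns (-Wcompare-distinct-pointer-types) if
	 * the type of 'x' differs from the member's declared type.
	 */
	#define typecheck_member(structname, member, x)			\
	({								\
		typeof(((structname *)0)->member) __dummy;		\
		typeof(x) __dummy2;					\
		(void)(&__dummy == &__dummy2);				\
		1;							\
	})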








Re: [PATCH 1/4] vdpa: introduce config op to get valid iova range

2020-08-06 Thread Jason Wang


On 2020/8/6 8:29 PM, Michael S. Tsirkin wrote:

On Thu, Aug 06, 2020 at 03:03:55PM +0300, Eli Cohen wrote:

On Wed, Aug 05, 2020 at 08:51:56AM -0400, Michael S. Tsirkin wrote:

On Wed, Jun 17, 2020 at 11:29:44AM +0800, Jason Wang wrote:

This patch introduces a config op to get the valid IOVA range from the vDPA
device.

Signed-off-by: Jason Wang
---
  include/linux/vdpa.h | 14 ++
  1 file changed, 14 insertions(+)

diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index 239db794357c..b7633ed2500c 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -41,6 +41,16 @@ struct vdpa_device {
unsigned int index;
  };
  
+/**

+ * vDPA IOVA range - the IOVA range supported by the device
+ * @start: start of the IOVA range
+ * @end: end of the IOVA range
+ */
+struct vdpa_iova_range {
+   u64 start;
+   u64 end;
+};
+

This is ambiguous. Is end in the range or just behind it?
How about first/last?

It is customary in the kernel to use start-end where end corresponds to
the byte following the last in the range. See struct vm_area_struct
vm_start and vm_end fields

Exactly my point:

include/linux/mm_types.h:   unsigned long vm_end;   /* The first 
byte after our end address

in this case Jason wants it to be the last byte, not one behind.



Ok, I somehow recall the reason :)

See:

struct iommu_domain_geometry {
    dma_addr_t aperture_start; /* First address that can be mapped    */
    dma_addr_t aperture_end;   /* Last address that can be mapped */
    bool force_aperture;   /* DMA only allowed in mappable range? */
};


So what I proposed here is to be consistent with it.

Thanks
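
To make the inclusive-end convention concrete, a bounds check against such a
range has to treat @end as the last valid address. An illustrative sketch:

	/* True iff [iova, iova + size) fits in the range; 'end' is the
	 * last valid IOVA, matching aperture_end above. */
	static bool iova_in_range(const struct vdpa_iova_range *range,
				  u64 iova, u64 size)
	{
		return size != 0 &&
		       iova >= range->start && iova <= range->end &&
		       size - 1 <= range->end - iova; /* iova + size - 1 <= end, overflow-safe */
	}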








Re: [PATCH 1/4] vdpa: introduce config op to get valid iova range

2020-08-06 Thread Jason Wang


On 2020/8/6 8:10 PM, Eli Cohen wrote:

On Wed, Jun 17, 2020 at 06:29:44AM +0300, Jason Wang wrote:

This patch introduces a config op to get the valid IOVA range from the vDPA
device.

Signed-off-by: Jason Wang
---
  include/linux/vdpa.h | 14 ++
  1 file changed, 14 insertions(+)

diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index 239db794357c..b7633ed2500c 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -41,6 +41,16 @@ struct vdpa_device {
unsigned int index;
  };
  
+/**

+ * vDPA IOVA range - the IOVA range supported by the device
+ * @start: start of the IOVA range
+ * @end: end of the IOVA range
+ */
+struct vdpa_iova_range {
+   u64 start;
+   u64 end;
+};
+

What do you do with this information? Suppose some device tells you it
supports some limited range, say, from 0x4000 to 0x8000. What
does qemu do with this information?



For qemu, it will fail the vDPA device creation when:

1) vIOMMU is not enabled and the GPA is out of this range
2) vIOMMU is enabled but it can't report such a range to the guest

Other userspace applications will know they can only use this range
as their IOVA.


Thanks
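
In pseudocode, the management-side check amounts to something like this (the
names are hypothetical, for illustration only):

	/* Hypothetical VMM-side validation. */
	static int check_vdpa_iova_range(const struct vdpa_iova_range *range,
					 u64 guest_mem_start, u64 guest_mem_last,
					 bool viommu_enabled)
	{
		if (viommu_enabled)
			return 0; /* the range must be reported to the vIOMMU instead */
		if (guest_mem_start < range->start || guest_mem_last > range->end)
			return -1; /* GPA falls outside the device's IOVA range */
		return 0;
	}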


Re: [PATCH v2 19/24] vdpa: make sure set_features in invoked for legacy

2020-08-06 Thread Jason Wang


On 2020/8/6 6:00 PM, Michael S. Tsirkin wrote:

On Thu, Aug 06, 2020 at 03:27:38PM +0800, Jason Wang wrote:

On 2020/8/6 1:53 PM, Michael S. Tsirkin wrote:

On Thu, Aug 06, 2020 at 11:23:05AM +0800, Jason Wang wrote:

On 2020/8/5 7:40 PM, Michael S. Tsirkin wrote:

On Wed, Aug 05, 2020 at 02:14:07PM +0800, Jason Wang wrote:

On 2020/8/4 5:00 AM, Michael S. Tsirkin wrote:

Some legacy guests just assume features are 0 after reset.
We detect that config space is accessed before features are
set and set features to 0 automatically.
Note: some legacy guests might not even access config space; if this is
reported in the field we might need to catch a kick to handle these.

I wonder whether it's easier to just support modern devices?

Thanks

Well hardware vendors are I think interested in supporting legacy
guests. Limiting vdpa to modern only would make it uncompetitive.

My understanding is that IOMMU_PLATFORM is mandatory for hardware vDPA to
work.

Hmm I don't really see why. Assume host maps guest memory properly,
VM does not have an IOMMU, legacy guest can just work.


Yes, guest may not set IOMMU_PLATFORM.



Care explaining what's wrong with this picture?


The problem is virtio_vdpa: without IOMMU_PLATFORM it uses PAs, which cannot
work if an IOMMU is enabled.

Thanks

So that's a virtio_vdpa limitation.



Probably not. I think this goes back to the long debate about whether to
use the DMA API unconditionally. If we did that, we could support legacy
virtio drivers.


The vDPA device needs to provide a DMA device to the virtio core, which can
then perform DMA API operations on that device; this should work for all of
the cases.


But a big question is: does upstream care about out-of-tree virtio drivers?

Thanks



In the same way, if a device
does not have an on-device iommu *and* is not behind an iommu,
then vdpa can't bind to it.

But this virtio_vdpa-specific hack does not belong in generic vdpa code.




So it can only work for modern devices ...

Thanks
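
To spell out the difference under discussion, a rough sketch (illustrative,
not the in-tree code):

	/* Without ACCESS_PLATFORM, virtio_vdpa hands the device raw PAs: */
	u64 addr = virt_to_phys(buf);	/* breaks once an IOMMU translates */

	/* Going through the DMA API with the vDPA parent device instead
	 * works whether a platform IOMMU or an on-chip IOMMU is present: */
	dma_addr_t daddr = dma_map_single(vdpa_get_dma_dev(vdpa), buf, len,
					  DMA_TO_DEVICE);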







Re: [PATCH v3 31/38] virtio_fs: convert to LE accessors

2020-08-06 Thread Vivek Goyal
On Wed, Aug 05, 2020 at 09:44:39AM -0400, Michael S. Tsirkin wrote:
> Virtio fs is modern-only. Use LE accessors for config space.
> 
> Signed-off-by: Michael S. Tsirkin 

Acked-by: Vivek Goyal 

Vivek

> ---
>  fs/fuse/virtio_fs.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
> index 4c4ef5d69298..104f35de5270 100644
> --- a/fs/fuse/virtio_fs.c
> +++ b/fs/fuse/virtio_fs.c
> @@ -606,8 +606,8 @@ static int virtio_fs_setup_vqs(struct virtio_device *vdev,
>   unsigned int i;
>   int ret = 0;
>  
> - virtio_cread(vdev, struct virtio_fs_config, num_request_queues,
> -  &fs->num_request_queues);
> + virtio_cread_le(vdev, struct virtio_fs_config, num_request_queues,
> + &fs->num_request_queues);
>   if (fs->num_request_queues == 0)
>   return -EINVAL;
>  
> -- 
> MST
> 
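
For reference, the _le accessor reads the field as little-endian regardless
of guest endianness, which for a modern-only device is roughly equivalent to
(a sketch, not the exact implementation):

	__le32 raw;

	vdev->config->get(vdev,
			  offsetof(struct virtio_fs_config, num_request_queues),
			  &raw, sizeof(raw));
	fs->num_request_queues = le32_to_cpu(raw);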



[PATCH][next] vdpa/mlx5: fix memory allocation failure checks

2020-08-06 Thread Colin King
From: Colin Ian King 

The memory allocation failure checking for in and out is currently
checking if the pointers are valid rather than the contents of what
they point to. Hence the null check on failed memory allocations is
incorrect.  Fix this by adding the missing indirection in the check.
Also, for the default case, just set *in and *out to NULL, as
these don't have anything allocated to kfree. Finally, remove the
redundant *in and *out check, as these checks have already been done
on each allocation in the case statement.

Addresses-Coverity: ("Null pointer dereference")
Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")
Signed-off-by: Colin Ian King 
---
 drivers/vdpa/mlx5/net/mlx5_vnet.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index 3ec44a4f0e45..55bc58e1dae9 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -867,7 +867,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int 
cmd, void **in, int *inl
*outlen = MLX5_ST_SZ_BYTES(qp_2rst_out);
*in = kzalloc(*inlen, GFP_KERNEL);
*out = kzalloc(*outlen, GFP_KERNEL);
-   if (!in || !out)
+   if (!*in || !*out)
goto outerr;
 
MLX5_SET(qp_2rst_in, *in, opcode, cmd);
@@ -879,7 +879,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int 
cmd, void **in, int *inl
*outlen = MLX5_ST_SZ_BYTES(rst2init_qp_out);
*in = kzalloc(*inlen, GFP_KERNEL);
*out = kzalloc(MLX5_ST_SZ_BYTES(rst2init_qp_out), GFP_KERNEL);
-   if (!in || !out)
+   if (!*in || !*out)
goto outerr;
 
MLX5_SET(rst2init_qp_in, *in, opcode, cmd);
@@ -896,7 +896,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int 
cmd, void **in, int *inl
*outlen = MLX5_ST_SZ_BYTES(init2rtr_qp_out);
*in = kzalloc(*inlen, GFP_KERNEL);
*out = kzalloc(MLX5_ST_SZ_BYTES(init2rtr_qp_out), GFP_KERNEL);
-   if (!in || !out)
+   if (!*in || !*out)
goto outerr;
 
MLX5_SET(init2rtr_qp_in, *in, opcode, cmd);
@@ -914,7 +914,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int 
cmd, void **in, int *inl
*outlen = MLX5_ST_SZ_BYTES(rtr2rts_qp_out);
*in = kzalloc(*inlen, GFP_KERNEL);
*out = kzalloc(MLX5_ST_SZ_BYTES(rtr2rts_qp_out), GFP_KERNEL);
-   if (!in || !out)
+   if (!*in || !*out)
goto outerr;
 
MLX5_SET(rtr2rts_qp_in, *in, opcode, cmd);
@@ -927,16 +927,15 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int 
cmd, void **in, int *inl
MLX5_SET(qpc, qpc, rnr_retry, 7);
break;
default:
-   goto outerr;
+   goto outerr_nullify;
}
-   if (!*in || !*out)
-   goto outerr;
 
return;
 
 outerr:
kfree(*in);
kfree(*out);
+outerr_nullify:
*in = NULL;
*out = NULL;
 }
-- 
2.27.0



Re: [PATCH v1 0/1] s390: virtio-ccw: PV needs VIRTIO I/O device protection

2020-08-06 Thread Cornelia Huck
On Thu,  6 Aug 2020 16:23:01 +0200
Pierre Morel  wrote:

> Hi all,
> 
> In another series I proposed to add an architecture-specific
> callback to fail feature negotiation when the architecture requires it.
> 
> In VIRTIO, we already have an entry to reject features on a
> per-transport basis.
> 
> Transport is not architecture, so I am sending a separate series in which
> we fail feature negotiation inside virtio_ccw_finalize_features,
> the virtio_config_ops.finalize_features implementation for the S390 CCW
> transport, when the device does not offer VIRTIO_F_IOMMU_PLATFORM.
> 
> This solves the problem of QEMU crashing when it is not using
> a CCW device with iommu_platform=on on S390.

This does work, and I'm tempted to queue this patch, but I'm wondering
whether we need to give up on a cross-architecture solution already
(especially keeping in mind that ccw is the only transport that is
really architecture-specific).

I know that we've gone through a few rounds already, and I'm not sure
whether we've been there already, but:

Could virtio_finalize_features() call an optional
arch_has_restricted_memory_access() function and enforce
IOMMU_PLATFORM? That would catch all transports, and things should work
once an architecture opts in. That direction also shouldn't be a
problem if virtio is a module.
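
Concretely, the suggestion is along these lines; the hook name and the
weak-default approach are assumptions on my side, not merged code:

	/* Weak default: no restriction unless an architecture opts in. */
	bool __weak arch_has_restricted_memory_access(void)
	{
		return false;
	}

	/* In virtio_finalize_features(), after the transport hooks: */
	if (arch_has_restricted_memory_access() &&
	    !virtio_has_feature(dev, VIRTIO_F_IOMMU_PLATFORM)) {
		dev_warn(&dev->dev,
			 "device must provide VIRTIO_F_IOMMU_PLATFORM\n");
		return -ENODEV;
	}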

> 
> Regards,
> Pierre
> 
> 
> Pierre Morel (1):
>   s390: virtio-ccw: PV needs VIRTIO I/O device protection
> 
>  drivers/s390/virtio/virtio_ccw.c | 24 +++++++++++++++++++-----
>  1 file changed, 19 insertions(+), 5 deletions(-)
> 



[PATCH v1 1/1] s390: virtio-ccw: PV needs VIRTIO I/O device protection

2020-08-06 Thread Pierre Morel
If protected virtualization is active on s390, the virtio queues are
not accessible to the host, unless VIRTIO_F_IOMMU_PLATFORM has been
negotiated. Use ccw_transport_features() to fail feature negotiation,
and consequently probing, if that's not the case, preventing a host
error on an access attempt.

Signed-off-by: Pierre Morel 
---
 drivers/s390/virtio/virtio_ccw.c | 24 +++++++++++++++++++-----
 1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index 5730572b52cd..cc8d8064c6c4 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -803,11 +803,23 @@ static u64 virtio_ccw_get_features(struct virtio_device 
*vdev)
return rc;
 }
 
-static void ccw_transport_features(struct virtio_device *vdev)
+static int ccw_transport_features(struct virtio_device *vdev)
 {
-   /*
-* Currently nothing to do here.
-*/
+   if (!is_prot_virt_guest())
+   return 0;
+
+   if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1)) {
+   dev_warn(&vdev->dev,
+"device must provide VIRTIO_F_VERSION_1\n");
+   return -ENODEV;
+   }
+
+   if (!virtio_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {
+   dev_warn(&vdev->dev,
+"device must provide VIRTIO_F_IOMMU_PLATFORM\n");
+   return -ENODEV;
+   }
+   return 0;
 }
 
 static int virtio_ccw_finalize_features(struct virtio_device *vdev)
@@ -837,7 +849,9 @@ static int virtio_ccw_finalize_features(struct 
virtio_device *vdev)
vring_transport_features(vdev);
 
/* Give virtio_ccw a chance to accept features. */
-   ccw_transport_features(vdev);
+   ret = ccw_transport_features(vdev);
+   if (ret)
+   goto out_free;
 
features->index = 0;
features->features = cpu_to_le32((u32)vdev->features);
-- 
2.25.1



[PATCH v1 0/1] s390: virtio-ccw: PV needs VIRTIO I/O device protection

2020-08-06 Thread Pierre Morel
Hi all,

In another series I proposed to add an architecture-specific
callback to fail feature negotiation when the architecture requires it.

In VIRTIO, we already have an entry to reject features on a
per-transport basis.

Transport is not architecture, so I am sending a separate series in which
we fail feature negotiation inside virtio_ccw_finalize_features,
the virtio_config_ops.finalize_features implementation for the S390 CCW
transport, when the device does not offer VIRTIO_F_IOMMU_PLATFORM.

This solves the problem of QEMU crashing when it is not using
a CCW device with iommu_platform=on on S390.

Regards,
Pierre


Pierre Morel (1):
  s390: virtio-ccw: PV needs VIRTIO I/O device protection

 drivers/s390/virtio/virtio_ccw.c | 24 +++++++++++++++++++-----
 1 file changed, 19 insertions(+), 5 deletions(-)

-- 
2.25.1



Re: [PATCH v2 4/6] mm/page_isolation: cleanup set_migratetype_isolate()

2020-08-06 Thread David Hildenbrand
On 06.08.20 15:35, Vlastimil Babka wrote:
> On 7/30/20 11:34 AM, David Hildenbrand wrote:
>> Let's clean it up a bit, simplifying error handling and getting rid of
>> the label.
> 
> Nit: the label was already removed by patch 1/6?
> 

Ack, leftover from reshuffling - thanks!

-- 
Thanks,

David / dhildenb



Re: [PATCH v7 2/2] s390: virtio: PV needs VIRTIO I/O device protection

2020-08-06 Thread Pierre Morel




On 2020-07-30 13:31, Pierre Morel wrote:
...snip...

What bothers me here is that arch code depends on virtio now.
It works even with a modular virtio when functions are inline,
but it seems fragile: e.g. it breaks virtio as an out of tree module,
since layout of struct virtio_device can change.



The code was only called from virtio.c so it should be fine.

And my understanding is that we don't need to care about the kABI issue
during upstream development?

Thanks


No, but so far it has been convenient at least for me, for development,
to just be able to unload all of virtio and load a different version.






I'm not sure what to do with this yet, will try to think about it
over the weekend. Thanks!


After reflection, I am not sure whether this problem should be handled at
the architecture level or inside the VIRTIO transport.

Consequently, I will propose another patch series based on the CCW transport.
This should also be more convenient for core development.

Regards,
Pierre

--
Pierre Morel
IBM Lab Boeblingen


Re: [PATCH v2 4/6] mm/page_isolation: cleanup set_migratetype_isolate()

2020-08-06 Thread Vlastimil Babka
On 7/30/20 11:34 AM, David Hildenbrand wrote:
> Let's clean it up a bit, simplifying error handling and getting rid of
> the label.

Nit: the label was already removed by patch 1/6?

> Reviewed-by: Baoquan He 
> Reviewed-by: Pankaj Gupta 
> Cc: Andrew Morton 
> Cc: Michal Hocko 
> Cc: Michael S. Tsirkin 
> Cc: Mike Kravetz 
> Signed-off-by: David Hildenbrand 
> ---
>  mm/page_isolation.c | 17 +++--
>  1 file changed, 7 insertions(+), 10 deletions(-)
> 
> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> index d099aac479601..e65fe5d770849 100644
> --- a/mm/page_isolation.c
> +++ b/mm/page_isolation.c
> @@ -17,12 +17,9 @@
>  
>  static int set_migratetype_isolate(struct page *page, int migratetype, int 
> isol_flags)
>  {
> - struct page *unmovable = NULL;
> - struct zone *zone;
> + struct zone *zone = page_zone(page);
> + struct page *unmovable;
>   unsigned long flags;
> - int ret = -EBUSY;
> -
> - zone = page_zone(page);
>  
>   spin_lock_irqsave(&zone->lock, flags);
>  
> @@ -51,13 +48,13 @@ static int set_migratetype_isolate(struct page *page, int 
> migratetype, int isol_
>   NULL);
>  
>   __mod_zone_freepage_state(zone, -nr_pages, mt);
> - ret = 0;
> + spin_unlock_irqrestore(&zone->lock, flags);
> + drain_all_pages(zone);
> + return 0;
>   }
>  
>   spin_unlock_irqrestore(&zone->lock, flags);
> - if (!ret) {
> - drain_all_pages(zone);
> - } else if ((isol_flags & REPORT_FAILURE) && unmovable) {
> + if (isol_flags & REPORT_FAILURE) {
>   /*
>* printk() with zone->lock held will likely trigger a
>* lockdep splat, so defer it here.
> @@ -65,7 +62,7 @@ static int set_migratetype_isolate(struct page *page, int 
> migratetype, int isol_
>   dump_page(unmovable, "unmovable page");
>   }
>  
> - return ret;
> + return -EBUSY;
>  }
>  
>  static void unset_migratetype_isolate(struct page *page, unsigned 
> migratetype)
> 



Re: [PATCH 1/4] vdpa: introduce config op to get valid iova range

2020-08-06 Thread Michael S. Tsirkin
On Thu, Aug 06, 2020 at 03:03:55PM +0300, Eli Cohen wrote:
> On Wed, Aug 05, 2020 at 08:51:56AM -0400, Michael S. Tsirkin wrote:
> > On Wed, Jun 17, 2020 at 11:29:44AM +0800, Jason Wang wrote:
> > > This patch introduces a config op to get the valid IOVA range from the vDPA
> > > device.
> > > 
> > > Signed-off-by: Jason Wang 
> > > ---
> > >  include/linux/vdpa.h | 14 ++
> > >  1 file changed, 14 insertions(+)
> > > 
> > > diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> > > index 239db794357c..b7633ed2500c 100644
> > > --- a/include/linux/vdpa.h
> > > +++ b/include/linux/vdpa.h
> > > @@ -41,6 +41,16 @@ struct vdpa_device {
> > >   unsigned int index;
> > >  };
> > >  
> > > +/**
> > > + * vDPA IOVA range - the IOVA range supported by the device
> > > + * @start: start of the IOVA range
> > > + * @end: end of the IOVA range
> > > + */
> > > +struct vdpa_iova_range {
> > > + u64 start;
> > > + u64 end;
> > > +};
> > > +
> > 
> > 
> > This is ambiguous. Is end in the range or just behind it?
> > How about first/last?
> 
> It is customary in the kernel to use start-end where end corresponds to
> the byte following the last in the range. See struct vm_area_struct
> vm_start and vm_end fields

Exactly my point:

include/linux/mm_types.h:   unsigned long vm_end;   /* The first 
byte after our end address

in this case Jason wants it to be the last byte, not one behind.


> > 
> > 
> > 
> > >  /**
> > >   * vDPA_config_ops - operations for configuring a vDPA device.
> > >   * Note: vDPA device drivers are required to implement all of the
> > > @@ -134,6 +144,9 @@ struct vdpa_device {
> > >   * @get_generation:  Get device config generation (optional)
> > >   *   @vdev: vdpa device
> > >   *   Returns u32: device generation
> > > + * @get_iova_range:  Get supported iova range (on-chip IOMMU)
> > > + *   @vdev: vdpa device
> > > + *   Returns the iova range supported by the 
> > > device
> > >   * @set_map: Set device memory mapping (optional)
> > >   *   Needed for device that using device
> > >   *   specific DMA translation (on-chip IOMMU)
> > > @@ -195,6 +208,7 @@ struct vdpa_config_ops {
> > >   void (*set_config)(struct vdpa_device *vdev, unsigned int offset,
> > >  const void *buf, unsigned int len);
> > >   u32 (*get_generation)(struct vdpa_device *vdev);
> > > + struct vdpa_iova_range (*get_iova_range)(struct vdpa_device *vdev);
> > >  
> > >   /* DMA ops */
> > >   int (*set_map)(struct vdpa_device *vdev, struct vhost_iotlb *iotlb);
> > > -- 
> > > 2.20.1
> > 



Re: [PATCH] x86/paravirt: Add missing noinstr to arch_local*() helpers

2020-08-06 Thread peterz
On Thu, Aug 06, 2020 at 09:47:23AM +0200, Marco Elver wrote:
> Testing my hypothesis that raw then nested non-raw
> local_irq_save/restore() breaks IRQ state tracking -- see the reproducer
> below. This is at least 1 case I can think of that we're bound to hit.

Aaargh!

> diff --git a/init/main.c b/init/main.c
> index 15bd0efff3df..0873319dcff4 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -1041,6 +1041,22 @@ asmlinkage __visible void __init start_kernel(void)
>   sfi_init_late();
>   kcsan_init();
>  
> + /* DEBUG CODE */
> + lockdep_assert_irqs_enabled(); /* Pass. */
> + {
> + unsigned long flags1;
> + raw_local_irq_save(flags1);

This disables IRQs but doesn't trace..

> + {
> + unsigned long flags2;
> + lockdep_assert_irqs_enabled(); /* Pass - expectedly 
> blind. */

Indeed, we didn't trace the above disable, so software state is still
on.

> + local_irq_save(flags2);

So here we save IRQ state, and unconditionally disable IRQs and trace
them disabled.

> + lockdep_assert_irqs_disabled(); /* Pass. */
> + local_irq_restore(flags2);

But here, we restore IRQ state to 'disabled' and explicitly trace it
disabled *again* (which is a bit daft, but whatever).

> + }
> + raw_local_irq_restore(flags1);

This then restores the IRQ state to enabled, but without tracing.

> + }
> + lockdep_assert_irqs_enabled(); /* FAIL! */

And we're out of sync... :/

/me goes ponder things...

How's something like this then?

---
 include/linux/sched.h |  3 ---
 kernel/kcsan/core.c   | 62 ---
 2 files changed, 44 insertions(+), 21 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 06ec60462af0..2f5aef57e687 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1193,9 +1193,6 @@ struct task_struct {
 
 #ifdef CONFIG_KCSAN
struct kcsan_ctxkcsan_ctx;
-#ifdef CONFIG_TRACE_IRQFLAGS
-   struct irqtrace_events  kcsan_save_irqtrace;
-#endif
 #endif
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
diff --git a/kernel/kcsan/core.c b/kernel/kcsan/core.c
index 9147ff6a12e5..9c4436bf0561 100644
--- a/kernel/kcsan/core.c
+++ b/kernel/kcsan/core.c
@@ -291,17 +291,50 @@ static inline unsigned int get_delay(void)
0);
 }
 
-void kcsan_save_irqtrace(struct task_struct *task)
+/*
+ * KCSAN hooks are everywhere, which means they're NMI like for interrupt
+ * tracing. In order to present a 'normal' as possible context to the code
+ * called by KCSAN when reporting errors we need to update the irq-tracing
+ * state.
+ *
+ * Save and restore the IRQ state trace touched by KCSAN, since KCSAN's
+ * runtime is entered for every memory access, and potentially useful
+ * information is lost if dirtied by KCSAN.
+ */
+
+struct kcsan_irq_state {
+   unsigned long   flags;
+#ifdef CONFIG_TRACE_IRQFLAGS
+   int hardirqs;
+   struct irqtrace_events  irqtrace;
+#endif
+};
+
+void kcsan_save_irqtrace(struct kcsan_irq_state *irq_state)
 {
 #ifdef CONFIG_TRACE_IRQFLAGS
-   task->kcsan_save_irqtrace = task->irqtrace;
+   irq_state->irqtrace = current->irqtrace;
+   irq_state->hardirqs = lockdep_hardirqs_enabled();
 #endif
+   if (!kcsan_interrupt_watcher) {
+   raw_local_irq_save(irq_state->flags);
+   lockdep_hardirqs_off(CALLER_ADDR0);
+   }
 }
 
-void kcsan_restore_irqtrace(struct task_struct *task)
+void kcsan_restore_irqtrace(struct kcsan_irq_state *irq_state)
 {
+   if (!kcsan_interrupt_watcher) {
+#ifdef CONFIG_TRACE_IRQFLAGS
+   if (irq_state->hardirqs) {
+   lockdep_hardirqs_on_prepare(CALLER_ADDR0);
+   lockdep_hardirqs_on(CALLER_ADDR0);
+   }
+#endif
+   raw_local_irq_restore(irq_state->flags);
+   }
 #ifdef CONFIG_TRACE_IRQFLAGS
-   task->irqtrace = task->kcsan_save_irqtrace;
+   current->irqtrace = irq_state->irqtrace;
 #endif
 }
 
@@ -350,11 +383,13 @@ static noinline void kcsan_found_watchpoint(const 
volatile void *ptr,
flags = user_access_save();
 
if (consumed) {
-   kcsan_save_irqtrace(current);
+   struct kcsan_irq_state irqstate;
+
+   kcsan_save_irqtrace(&irqstate);
kcsan_report(ptr, size, type, KCSAN_VALUE_CHANGE_MAYBE,
 KCSAN_REPORT_CONSUMED_WATCHPOINT,
 watchpoint - watchpoints);
-   kcsan_restore_irqtrace(current);
+   kcsan_restore_irqtrace(&irqstate);
} else {
/*
 * The other thread may not print any diagnostics, as it has
@@ -387,7 +422,7 @@ kcsan_setup_watchpoint(const volatile void *ptr, size_t 
size, int type)
unsigned long access_mask;

Re: [PATCH v2 19/24] vdpa: make sure set_features in invoked for legacy

2020-08-06 Thread Michael S. Tsirkin
On Thu, Aug 06, 2020 at 03:27:38PM +0800, Jason Wang wrote:
> 
> On 2020/8/6 1:53 PM, Michael S. Tsirkin wrote:
> > On Thu, Aug 06, 2020 at 11:23:05AM +0800, Jason Wang wrote:
> > > On 2020/8/5 7:40 PM, Michael S. Tsirkin wrote:
> > > > On Wed, Aug 05, 2020 at 02:14:07PM +0800, Jason Wang wrote:
> > > > > On 2020/8/4 5:00 AM, Michael S. Tsirkin wrote:
> > > > > > Some legacy guests just assume features are 0 after reset.
> > > > > > We detect that config space is accessed before features are
> > > > > > set and set features to 0 automatically.
> > > > > > Note: some legacy guests might not even access config space; if
> > > > > > this is reported in the field we might need to catch a kick to
> > > > > > handle these.
> > > > > I wonder whether it's easier to just support modern devices?
> > > > > 
> > > > > Thanks
> > > > Well hardware vendors are I think interested in supporting legacy
> > > > guests. Limiting vdpa to modern only would make it uncompetitive.
> > > 
> > > My understanding is that IOMMU_PLATFORM is mandatory for hardware vDPA to
> > > work.
> > Hmm I don't really see why. Assume host maps guest memory properly,
> > VM does not have an IOMMU, legacy guest can just work.
> 
> 
> Yes, guest may not set IOMMU_PLATFORM.
> 
> 
> > 
> > Care explaining what's wrong with this picture?
> 
> 
> The problem is virtio_vdpa: without IOMMU_PLATFORM it uses PAs, which
> cannot work if an IOMMU is enabled.
> 
> Thanks

So that's a virtio_vdpa limitation. In the same way, if a device
does not have an on-device iommu *and* is not behind an iommu,
then vdpa can't bind to it.

But this virtio_vdpa-specific hack does not belong in generic vdpa code.

> 
> > 
> > 
> > > So it can only work for modern devices ...
> > > 
> > > Thanks
> > > 
> > > 
> > > > 
> > > > 


Re: [PATCH v2 19/24] vdpa: make sure set_features in invoked for legacy

2020-08-06 Thread Jason Wang


On 2020/8/6 1:53 PM, Michael S. Tsirkin wrote:

On Thu, Aug 06, 2020 at 11:23:05AM +0800, Jason Wang wrote:

On 2020/8/5 7:40 PM, Michael S. Tsirkin wrote:

On Wed, Aug 05, 2020 at 02:14:07PM +0800, Jason Wang wrote:

On 2020/8/4 5:00 AM, Michael S. Tsirkin wrote:

Some legacy guests just assume features are 0 after reset.
We detect that config space is accessed before features are
set and set features to 0 automatically.
Note: some legacy guests might not even access config space; if this is
reported in the field we might need to catch a kick to handle these.

I wonder whether it's easier to just support modern devices?

Thanks

Well hardware vendors are I think interested in supporting legacy
guests. Limiting vdpa to modern only would make it uncompetitive.


My understanding is that IOMMU_PLATFORM is mandatory for hardware vDPA to
work.

Hmm I don't really see why. Assume host maps guest memory properly,
VM does not have an IOMMU, legacy guest can just work.



Yes, guest may not set IOMMU_PLATFORM.




Care explaining what's wrong with this picture?



The problem is virtio_vdpa: without IOMMU_PLATFORM it uses PAs, which
cannot work if an IOMMU is enabled.


Thanks






So it can only work for modern devices ...

Thanks






