Re: [PATCH v3] drm/virtio: fix missing dma_fence_put() in virtio_gpu_execbuffer_ioctl()
On Fri, Aug 07, 2020 at 11:00:11AM +0800, 何鑫 wrote: > Xin He 于2020年7月21日周二 下午6:17写道: > > > > From: Qi Liu > > > > We should put the reference count of the fence after calling > > virtio_gpu_cmd_submit(). So add the missing dma_fence_put(). > > > > Fixes: 2cd7b6f08bc4 ("drm/virtio: add in/out fence support for explicit > > synchronization") > > Co-developed-by: Xin He > > Signed-off-by: Xin He > > Signed-off-by: Qi Liu > > Reviewed-by: Muchun Song > > --- > > > > changelog in v3: > > 1) Change the subject from "drm/virtio: fixed memory leak in > > virtio_gpu_execbuffer_ioctl()" to > >"drm/virtio: fix missing dma_fence_put() in > > virtio_gpu_execbuffer_ioctl()" > > 2) Rework the commit log > > > > changelog in v2: > > 1) Add a change description > > > > drivers/gpu/drm/virtio/virtgpu_ioctl.c | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c > > b/drivers/gpu/drm/virtio/virtgpu_ioctl.c > > index 5df722072ba0..19c5bc01eb79 100644 > > --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c > > +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c > > @@ -179,6 +179,7 @@ static int virtio_gpu_execbuffer_ioctl(struct > > drm_device *dev, void *data, > > > > virtio_gpu_cmd_submit(vgdev, buf, exbuf->size, > > vfpriv->ctx_id, buflist, out_fence); > > + dma_fence_put(&out_fence->f); > > virtio_gpu_notify(vgdev); > > return 0; > > > > -- > > 2.21.1 (Apple Git-122.3) > > > > cc Greg Why? $ ./scripts/get_maintainer.pl --file drivers/gpu/drm/virtio/virtgpu_ioctl.c David Airlie (maintainer:VIRTIO GPU DRIVER) Gerd Hoffmann (maintainer:VIRTIO GPU DRIVER) Daniel Vetter (maintainer:DRM DRIVERS) Sumit Semwal (maintainer:DMA BUFFER SHARING FRAMEWORK) "Christian König" (maintainer:DMA BUFFER SHARING FRAMEWORK) dri-de...@lists.freedesktop.org (open list:VIRTIO GPU DRIVER) virtualization@lists.linux-foundation.org (open list:VIRTIO GPU DRIVER) linux-ker...@vger.kernel.org (open list) linux-me...@vger.kernel.org (open list:DMA BUFFER SHARING FRAMEWORK) linaro-mm-...@lists.linaro.org (moderated list:DMA BUFFER SHARING FRAMEWORK) ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH] vdpa/mlx5: Fix erroneous null pointer checks
On 2020/8/7 上午11:37, Jason Wang wrote: On 2020/8/7 上午3:18, Alex Dewar wrote: In alloc_inout() in net/mlx5_vnet.c, there are a few places where memory is allocated to *in and *out, but only the values of in and out are null-checked (i.e. there is a missing dereference). Fix this. Addresses-Coverity: ("CID 1496603: (REVERSE_INULL)") Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices") Signed-off-by: Alex Dewar Acked-by: Jason Wang Colin posted something similar: [PATCH][next] vdpa/mlx5: fix memory allocation failure checks And I think his fix is better since it prevent raw pointers to be freed. Thanks ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH][next] vdpa/mlx5: fix memory allocation failure checks
On 2020/8/7 上午12:08, Colin King wrote: From: Colin Ian King The memory allocation failure checking for in and out is currently checking if the pointers are valid rather than the contents of what they point to. Hence the null check on failed memory allocations is incorrect. Fix this by adding the missing indirection in the check. Also for the default case, just set the *in and *out to null as these don't have any thing allocated to kfree. Finally remove the redundant *in and *out check as these have been already done on each allocation in the case statement. Addresses-Coverity: ("Null pointer dereference") Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices") Signed-off-by: Colin Ian King Acked-by: Jason Wang --- drivers/vdpa/mlx5/net/mlx5_vnet.c | 13 ++--- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c index 3ec44a4f0e45..55bc58e1dae9 100644 --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c @@ -867,7 +867,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int cmd, void **in, int *inl *outlen = MLX5_ST_SZ_BYTES(qp_2rst_out); *in = kzalloc(*inlen, GFP_KERNEL); *out = kzalloc(*outlen, GFP_KERNEL); - if (!in || !out) + if (!*in || !*out) goto outerr; MLX5_SET(qp_2rst_in, *in, opcode, cmd); @@ -879,7 +879,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int cmd, void **in, int *inl *outlen = MLX5_ST_SZ_BYTES(rst2init_qp_out); *in = kzalloc(*inlen, GFP_KERNEL); *out = kzalloc(MLX5_ST_SZ_BYTES(rst2init_qp_out), GFP_KERNEL); - if (!in || !out) + if (!*in || !*out) goto outerr; MLX5_SET(rst2init_qp_in, *in, opcode, cmd); @@ -896,7 +896,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int cmd, void **in, int *inl *outlen = MLX5_ST_SZ_BYTES(init2rtr_qp_out); *in = kzalloc(*inlen, GFP_KERNEL); *out = kzalloc(MLX5_ST_SZ_BYTES(init2rtr_qp_out), GFP_KERNEL); - if (!in || !out) + if (!*in || !*out) goto outerr; MLX5_SET(init2rtr_qp_in, *in, opcode, cmd); @@ -914,7 +914,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int cmd, void **in, int *inl *outlen = MLX5_ST_SZ_BYTES(rtr2rts_qp_out); *in = kzalloc(*inlen, GFP_KERNEL); *out = kzalloc(MLX5_ST_SZ_BYTES(rtr2rts_qp_out), GFP_KERNEL); - if (!in || !out) + if (!*in || !*out) goto outerr; MLX5_SET(rtr2rts_qp_in, *in, opcode, cmd); @@ -927,16 +927,15 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int cmd, void **in, int *inl MLX5_SET(qpc, qpc, rnr_retry, 7); break; default: - goto outerr; + goto outerr_nullify; } - if (!*in || !*out) - goto outerr; return; outerr: kfree(*in); kfree(*out); +outerr_nullify: *in = NULL; *out = NULL; } ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH] vdpa/mlx5: Fix uninitialised variable in core/mr.c
On 2020/8/7 上午2:56, Alex Dewar wrote: If the kernel is unable to allocate memory for the variable dmr then err will be returned without being set. Set err to -ENOMEM in this case. Fixes: 94abbccdf291 ("vdpa/mlx5: Add shared memory registration code") Addresses-Coverity: ("Uninitialized variables") Signed-off-by: Alex Dewar Acked-by: Jason Wang --- drivers/vdpa/mlx5/core/mr.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/vdpa/mlx5/core/mr.c b/drivers/vdpa/mlx5/core/mr.c index f5dec0274133..ef1c550f8266 100644 --- a/drivers/vdpa/mlx5/core/mr.c +++ b/drivers/vdpa/mlx5/core/mr.c @@ -319,8 +319,10 @@ static int add_direct_chain(struct mlx5_vdpa_dev *mvdev, u64 start, u64 size, u8 while (size) { sz = (u32)min_t(u64, MAX_KLM_SIZE, size); dmr = kzalloc(sizeof(*dmr), GFP_KERNEL); - if (!dmr) + if (!dmr) { + err = -ENOMEM; goto err_alloc; + } dmr->start = st; dmr->end = st + sz; -- 2.28.0 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH 1/2] vdpa: ifcvf: return err when fail to request config irq
On 2020/7/23 下午5:12, Jason Wang wrote: We ignore the err of requesting config interrupt, fix this. Fixes: e7991f376a4d ("ifcvf: implement config interrupt in IFCVF") Cc: Zhu Lingshan Signed-off-by: Jason Wang --- drivers/vdpa/ifcvf/ifcvf_main.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c index f5a60c14b979..ae7110955a44 100644 --- a/drivers/vdpa/ifcvf/ifcvf_main.c +++ b/drivers/vdpa/ifcvf/ifcvf_main.c @@ -76,6 +76,10 @@ static int ifcvf_request_irq(struct ifcvf_adapter *adapter) ret = devm_request_irq(&pdev->dev, irq, ifcvf_config_changed, 0, vf->config_msix_name, vf); + if (ret) { + IFCVF_ERR(pdev, "Failed to request config irq\n"); + return ret; + } for (i = 0; i < IFCVF_MAX_QUEUE_PAIRS * 2; i++) { snprintf(vf->vring[i].msix_name, 256, "ifcvf[%s]-%d\n", Hi Michael: Any comments on this series? Thanks ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH] vdpa/mlx5: Fix erroneous null pointer checks
On 2020/8/7 上午3:18, Alex Dewar wrote: In alloc_inout() in net/mlx5_vnet.c, there are a few places where memory is allocated to *in and *out, but only the values of in and out are null-checked (i.e. there is a missing dereference). Fix this. Addresses-Coverity: ("CID 1496603: (REVERSE_INULL)") Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices") Signed-off-by: Alex Dewar Acked-by: Jason Wang --- drivers/vdpa/mlx5/net/mlx5_vnet.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c index 3ec44a4f0e45..bcb6600c2839 100644 --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c @@ -867,7 +867,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int cmd, void **in, int *inl *outlen = MLX5_ST_SZ_BYTES(qp_2rst_out); *in = kzalloc(*inlen, GFP_KERNEL); *out = kzalloc(*outlen, GFP_KERNEL); - if (!in || !out) + if (!*in || !*out) goto outerr; MLX5_SET(qp_2rst_in, *in, opcode, cmd); @@ -879,7 +879,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int cmd, void **in, int *inl *outlen = MLX5_ST_SZ_BYTES(rst2init_qp_out); *in = kzalloc(*inlen, GFP_KERNEL); *out = kzalloc(MLX5_ST_SZ_BYTES(rst2init_qp_out), GFP_KERNEL); - if (!in || !out) + if (!*in || !*out) goto outerr; MLX5_SET(rst2init_qp_in, *in, opcode, cmd); @@ -896,7 +896,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int cmd, void **in, int *inl *outlen = MLX5_ST_SZ_BYTES(init2rtr_qp_out); *in = kzalloc(*inlen, GFP_KERNEL); *out = kzalloc(MLX5_ST_SZ_BYTES(init2rtr_qp_out), GFP_KERNEL); - if (!in || !out) + if (!*in || !*out) goto outerr; MLX5_SET(init2rtr_qp_in, *in, opcode, cmd); @@ -914,7 +914,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int cmd, void **in, int *inl *outlen = MLX5_ST_SZ_BYTES(rtr2rts_qp_out); *in = kzalloc(*inlen, GFP_KERNEL); *out = kzalloc(MLX5_ST_SZ_BYTES(rtr2rts_qp_out), GFP_KERNEL); - if (!in || !out) + if (!*in || !*out) goto outerr; MLX5_SET(rtr2rts_qp_in, *in, opcode, cmd); ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v2 03/24] virtio: allow __virtioXX, __leXX in config space
On 2020/8/6 下午1:58, Michael S. Tsirkin wrote: On Thu, Aug 06, 2020 at 11:37:38AM +0800, Jason Wang wrote: On 2020/8/5 下午7:45, Michael S. Tsirkin wrote: #define virtio_cread(vdev, structname, member, ptr) \ do {\ might_sleep(); \ /* Must match the member's type, and be integer */ \ - if (!typecheck(typeofstructname*)0)->member)), *(ptr))) \ + if (!__virtio_typecheck(structname, member, *(ptr)))\ (*ptr) = 1; \ A silly question, compare to using set()/get() directly, what's the value of the accessors macro here? Thanks get/set don't convert to the native endian, I guess that's why drivers use cread/cwrite. It is also nice that there's type safety, checking the correct integer width is used. Yes, but this is simply because a macro is used here, how about just doing things similar like virtio_cread_bytes(): static inline void virtio_cread(struct virtio_device *vdev, unsigned int offset, void *buf, size_t len) And do the endian conversion inside? Thanks Then you lose type safety. It's very easy to have an le32 field and try to read it into a u16 by mistake. These macros are all about preventing bugs: and the whole patchset is about several bugs sparse found - that is what prompted me to make type checks more strict. Yes, but we need to do the macro with compiler extensions. I wonder whether or not the kernel has already had something since this request here is pretty common? Thanks ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH 1/4] vdpa: introduce config op to get valid iova range
On 2020/8/6 下午8:29, Michael S. Tsirkin wrote: On Thu, Aug 06, 2020 at 03:03:55PM +0300, Eli Cohen wrote: On Wed, Aug 05, 2020 at 08:51:56AM -0400, Michael S. Tsirkin wrote: On Wed, Jun 17, 2020 at 11:29:44AM +0800, Jason Wang wrote: This patch introduce a config op to get valid iova range from the vDPA device. Signed-off-by: Jason Wang --- include/linux/vdpa.h | 14 ++ 1 file changed, 14 insertions(+) diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h index 239db794357c..b7633ed2500c 100644 --- a/include/linux/vdpa.h +++ b/include/linux/vdpa.h @@ -41,6 +41,16 @@ struct vdpa_device { unsigned int index; }; +/** + * vDPA IOVA range - the IOVA range support by the device + * @start: start of the IOVA range + * @end: end of the IOVA range + */ +struct vdpa_iova_range { + u64 start; + u64 end; +}; + This is ambiguous. Is end in the range or just behind it? How about first/last? It is customary in the kernel to use start-end where end corresponds to the byte following the last in the range. See struct vm_area_struct vm_start and vm_end fields Exactly my point: include/linux/mm_types.h: unsigned long vm_end; /* The first byte after our end address in this case Jason wants it to be the last byte, not one behind. Ok, I somehow recall the reason :) See: struct iommu_domain_geometry { dma_addr_t aperture_start; /* First address that can be mapped */ dma_addr_t aperture_end; /* Last address that can be mapped */ bool force_aperture; /* DMA only allowed in mappable range? */ }; So what I proposed here is to be consistent with it. Thanks ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH 1/4] vdpa: introduce config op to get valid iova range
On 2020/8/6 下午8:10, Eli Cohen wrote: On Wed, Jun 17, 2020 at 06:29:44AM +0300, Jason Wang wrote: This patch introduce a config op to get valid iova range from the vDPA device. Signed-off-by: Jason Wang --- include/linux/vdpa.h | 14 ++ 1 file changed, 14 insertions(+) diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h index 239db794357c..b7633ed2500c 100644 --- a/include/linux/vdpa.h +++ b/include/linux/vdpa.h @@ -41,6 +41,16 @@ struct vdpa_device { unsigned int index; }; +/** + * vDPA IOVA range - the IOVA range support by the device + * @start: start of the IOVA range + * @end: end of the IOVA range + */ +struct vdpa_iova_range { + u64 start; + u64 end; +}; + What do you do with this information? Suppose some device tells you it supports some limited range, say, from 0x4000 to 0x8000. What does qemu do with this information? For qemu, when qemu will fail the vDPA device creation when: 1) vIOMMU is not enabled and GPA is out of this range 2) vIOMMU is enabled but it can't report such range to guest For other userspace application, it will know it can only use this range as its IOVA. Thanks ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v2 19/24] vdpa: make sure set_features in invoked for legacy
On 2020/8/6 下午6:00, Michael S. Tsirkin wrote: On Thu, Aug 06, 2020 at 03:27:38PM +0800, Jason Wang wrote: On 2020/8/6 下午1:53, Michael S. Tsirkin wrote: On Thu, Aug 06, 2020 at 11:23:05AM +0800, Jason Wang wrote: On 2020/8/5 下午7:40, Michael S. Tsirkin wrote: On Wed, Aug 05, 2020 at 02:14:07PM +0800, Jason Wang wrote: On 2020/8/4 上午5:00, Michael S. Tsirkin wrote: Some legacy guests just assume features are 0 after reset. We detect that config space is accessed before features are set and set features to 0 automatically. Note: some legacy guests might not even access config space, if this is reported in the field we might need to catch a kick to handle these. I wonder whether it's easier to just support modern device? Thanks Well hardware vendors are I think interested in supporting legacy guests. Limiting vdpa to modern only would make it uncompetitive. My understanding is that, IOMMU_PLATFORM is mandatory for hardware vDPA to work. Hmm I don't really see why. Assume host maps guest memory properly, VM does not have an IOMMU, legacy guest can just work. Yes, guest may not set IOMMU_PLATFORM. Care explaining what's wrong with this picture? The problem is virtio_vdpa, without IOMMU_PLATFORM it uses PA which can not work if IOMMU is enabled. Thanks So that's a virtio_vdpa limitation. Probably not, I think this goes back to the long debate of whether to use DMA API unconditionally. If we did that, we can support legacy virtio driver. The vDPA device needs to provide a DMA device and the virtio core and perform DMA API with that device which should work for all of the cases. But a big question is, do upstream care about out of tree virtio drivers? Thanks In the same way, if a device does not have an on-device iommu *and* is not behind an iommu, then vdpa can't bind to it. But this virtio_vdpa specific hack does not belong in a generic vdpa code. So it can only work for modern device ... Thanks ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v3 31/38] virtio_fs: convert to LE accessors
On Wed, Aug 05, 2020 at 09:44:39AM -0400, Michael S. Tsirkin wrote: > Virtio fs is modern-only. Use LE accessors for config space. > > Signed-off-by: Michael S. Tsirkin Acked-by: Vivek Goyal Vivek > --- > fs/fuse/virtio_fs.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c > index 4c4ef5d69298..104f35de5270 100644 > --- a/fs/fuse/virtio_fs.c > +++ b/fs/fuse/virtio_fs.c > @@ -606,8 +606,8 @@ static int virtio_fs_setup_vqs(struct virtio_device *vdev, > unsigned int i; > int ret = 0; > > - virtio_cread(vdev, struct virtio_fs_config, num_request_queues, > - &fs->num_request_queues); > + virtio_cread_le(vdev, struct virtio_fs_config, num_request_queues, > + &fs->num_request_queues); > if (fs->num_request_queues == 0) > return -EINVAL; > > -- > MST > ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[PATCH][next] vdpa/mlx5: fix memory allocation failure checks
From: Colin Ian King The memory allocation failure checking for in and out is currently checking if the pointers are valid rather than the contents of what they point to. Hence the null check on failed memory allocations is incorrect. Fix this by adding the missing indirection in the check. Also for the default case, just set the *in and *out to null as these don't have any thing allocated to kfree. Finally remove the redundant *in and *out check as these have been already done on each allocation in the case statement. Addresses-Coverity: ("Null pointer dereference") Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices") Signed-off-by: Colin Ian King --- drivers/vdpa/mlx5/net/mlx5_vnet.c | 13 ++--- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c index 3ec44a4f0e45..55bc58e1dae9 100644 --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c @@ -867,7 +867,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int cmd, void **in, int *inl *outlen = MLX5_ST_SZ_BYTES(qp_2rst_out); *in = kzalloc(*inlen, GFP_KERNEL); *out = kzalloc(*outlen, GFP_KERNEL); - if (!in || !out) + if (!*in || !*out) goto outerr; MLX5_SET(qp_2rst_in, *in, opcode, cmd); @@ -879,7 +879,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int cmd, void **in, int *inl *outlen = MLX5_ST_SZ_BYTES(rst2init_qp_out); *in = kzalloc(*inlen, GFP_KERNEL); *out = kzalloc(MLX5_ST_SZ_BYTES(rst2init_qp_out), GFP_KERNEL); - if (!in || !out) + if (!*in || !*out) goto outerr; MLX5_SET(rst2init_qp_in, *in, opcode, cmd); @@ -896,7 +896,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int cmd, void **in, int *inl *outlen = MLX5_ST_SZ_BYTES(init2rtr_qp_out); *in = kzalloc(*inlen, GFP_KERNEL); *out = kzalloc(MLX5_ST_SZ_BYTES(init2rtr_qp_out), GFP_KERNEL); - if (!in || !out) + if (!*in || !*out) goto outerr; MLX5_SET(init2rtr_qp_in, *in, opcode, cmd); @@ -914,7 +914,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int cmd, void **in, int *inl *outlen = MLX5_ST_SZ_BYTES(rtr2rts_qp_out); *in = kzalloc(*inlen, GFP_KERNEL); *out = kzalloc(MLX5_ST_SZ_BYTES(rtr2rts_qp_out), GFP_KERNEL); - if (!in || !out) + if (!*in || !*out) goto outerr; MLX5_SET(rtr2rts_qp_in, *in, opcode, cmd); @@ -927,16 +927,15 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int cmd, void **in, int *inl MLX5_SET(qpc, qpc, rnr_retry, 7); break; default: - goto outerr; + goto outerr_nullify; } - if (!*in || !*out) - goto outerr; return; outerr: kfree(*in); kfree(*out); +outerr_nullify: *in = NULL; *out = NULL; } -- 2.27.0 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v1 0/1] s390: virtio-ccw: PV needs VIRTIO I/O device protection
On Thu, 6 Aug 2020 16:23:01 +0200 Pierre Morel wrote: > Hi all, > > In another series I proposed to add an architecture specific > callback to fail feature negociation on architecture need. > > In VIRTIO, we already have an entry to reject the features on the > transport basis. > > Transport is not architecture so I send a separate series in which > we fail the feature negociation inside virtio_ccw_finalize_features, > the virtio_config_ops.finalize_features for S390 CCW transport, > when the device do not propose the VIRTIO_F_IOMMU_PLATFORM. > > This solves the problem of crashing QEMU when this one is not using > a CCW device with iommu_platform=on in S390. This does work, and I'm tempted to queue this patch, but I'm wondering whether we need to give up on a cross-architecture solution already (especially keeping in mind that ccw is the only transport that is really architecture-specific). I know that we've gone through a few rounds already, and I'm not sure whether we've been there already, but: Could virtio_finalize_features() call an optional arch_has_restricted_memory_access() function and do the enforcing of IOMMU_PLATFORM? That would catch all transports, and things should work once an architecture opts in. That direction also shouldn't be a problem if virtio is a module. > > Regards, > Pierre > > Regards, > Pierre > > Pierre Morel (1): > s390: virtio-ccw: PV needs VIRTIO I/O device protection > > drivers/s390/virtio/virtio_ccw.c | 24 +++- > 1 file changed, 19 insertions(+), 5 deletions(-) > ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[PATCH v1 1/1] s390: virtio-ccw: PV needs VIRTIO I/O device protection
If protected virtualization is active on s390, the virtio queues are not accessible to the host, unless VIRTIO_F_IOMMU_PLATFORM has been negotiated. Use ccw_transport_features() to fail feature negociation and consequently probe if that's not the case, preventing a host error on access attempt. Signed-off-by: Pierre Morel --- drivers/s390/virtio/virtio_ccw.c | 24 +++- 1 file changed, 19 insertions(+), 5 deletions(-) diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c index 5730572b52cd..cc8d8064c6c4 100644 --- a/drivers/s390/virtio/virtio_ccw.c +++ b/drivers/s390/virtio/virtio_ccw.c @@ -803,11 +803,23 @@ static u64 virtio_ccw_get_features(struct virtio_device *vdev) return rc; } -static void ccw_transport_features(struct virtio_device *vdev) +static int ccw_transport_features(struct virtio_device *vdev) { - /* -* Currently nothing to do here. -*/ + if (!is_prot_virt_guest()) + return 0; + + if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1)) { + dev_warn(&vdev->dev, +"device must provide VIRTIO_F_VERSION_1\n"); + return -ENODEV; + } + + if (!virtio_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) { + dev_warn(&vdev->dev, +"device must provide VIRTIO_F_IOMMU_PLATFORM\n"); + return -ENODEV; + } + return 0; } static int virtio_ccw_finalize_features(struct virtio_device *vdev) @@ -837,7 +849,9 @@ static int virtio_ccw_finalize_features(struct virtio_device *vdev) vring_transport_features(vdev); /* Give virtio_ccw a chance to accept features. */ - ccw_transport_features(vdev); + ret = ccw_transport_features(vdev); + if (ret) + goto out_free; features->index = 0; features->features = cpu_to_le32((u32)vdev->features); -- 2.25.1 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[PATCH v1 0/1] s390: virtio-ccw: PV needs VIRTIO I/O device protection
Hi all, In another series I proposed to add an architecture specific callback to fail feature negociation on architecture need. In VIRTIO, we already have an entry to reject the features on the transport basis. Transport is not architecture so I send a separate series in which we fail the feature negociation inside virtio_ccw_finalize_features, the virtio_config_ops.finalize_features for S390 CCW transport, when the device do not propose the VIRTIO_F_IOMMU_PLATFORM. This solves the problem of crashing QEMU when this one is not using a CCW device with iommu_platform=on in S390. Regards, Pierre Regards, Pierre Pierre Morel (1): s390: virtio-ccw: PV needs VIRTIO I/O device protection drivers/s390/virtio/virtio_ccw.c | 24 +++- 1 file changed, 19 insertions(+), 5 deletions(-) -- 2.25.1 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v2 4/6] mm/page_isolation: cleanup set_migratetype_isolate()
On 06.08.20 15:35, Vlastimil Babka wrote: > On 7/30/20 11:34 AM, David Hildenbrand wrote: >> Let's clean it up a bit, simplifying error handling and getting rid of >> the label. > > Nit: the label was already removed by patch 1/6? > Ack, leftover from reshuffling - thanks! -- Thanks, David / dhildenb ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v7 2/2] s390: virtio: PV needs VIRTIO I/O device protection
On 2020-07-30 13:31, Pierre Morel wrote: ...snip... What bothers me here is that arch code depends on virtio now. It works even with a modular virtio when functions are inline, but it seems fragile: e.g. it breaks virtio as an out of tree module, since layout of struct virtio_device can change. The code was only called from virtio.c so it should be fine. And my understanding is that we don't need to care about the kABI issue during upstream development? Thanks No, but so far it has been convenient at least for me, for development, to just be able to unload all of virtio and load a different version. I'm not sure what to do with this yet, will try to think about it over the weekend. Thanks! After reflection, I am not sure that this problem must be treated on the architecture level or inside the VIRTIO transport. Consequently, I will propose another patch series based on CCW transport. This also should be more convenient for core development. Regards, Pierre -- Pierre Morel IBM Lab Boeblingen ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v2 4/6] mm/page_isolation: cleanup set_migratetype_isolate()
On 7/30/20 11:34 AM, David Hildenbrand wrote: > Let's clean it up a bit, simplifying error handling and getting rid of > the label. Nit: the label was already removed by patch 1/6? > Reviewed-by: Baoquan He > Reviewed-by: Pankaj Gupta > Cc: Andrew Morton > Cc: Michal Hocko > Cc: Michael S. Tsirkin > Cc: Mike Kravetz > Signed-off-by: David Hildenbrand > --- > mm/page_isolation.c | 17 +++-- > 1 file changed, 7 insertions(+), 10 deletions(-) > > diff --git a/mm/page_isolation.c b/mm/page_isolation.c > index d099aac479601..e65fe5d770849 100644 > --- a/mm/page_isolation.c > +++ b/mm/page_isolation.c > @@ -17,12 +17,9 @@ > > static int set_migratetype_isolate(struct page *page, int migratetype, int > isol_flags) > { > - struct page *unmovable = NULL; > - struct zone *zone; > + struct zone *zone = page_zone(page); > + struct page *unmovable; > unsigned long flags; > - int ret = -EBUSY; > - > - zone = page_zone(page); > > spin_lock_irqsave(&zone->lock, flags); > > @@ -51,13 +48,13 @@ static int set_migratetype_isolate(struct page *page, int > migratetype, int isol_ > NULL); > > __mod_zone_freepage_state(zone, -nr_pages, mt); > - ret = 0; > + spin_unlock_irqrestore(&zone->lock, flags); > + drain_all_pages(zone); > + return 0; > } > > spin_unlock_irqrestore(&zone->lock, flags); > - if (!ret) { > - drain_all_pages(zone); > - } else if ((isol_flags & REPORT_FAILURE) && unmovable) { > + if (isol_flags & REPORT_FAILURE) { > /* >* printk() with zone->lock held will likely trigger a >* lockdep splat, so defer it here. > @@ -65,7 +62,7 @@ static int set_migratetype_isolate(struct page *page, int > migratetype, int isol_ > dump_page(unmovable, "unmovable page"); > } > > - return ret; > + return -EBUSY; > } > > static void unset_migratetype_isolate(struct page *page, unsigned > migratetype) > ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH 1/4] vdpa: introduce config op to get valid iova range
On Thu, Aug 06, 2020 at 03:03:55PM +0300, Eli Cohen wrote: > On Wed, Aug 05, 2020 at 08:51:56AM -0400, Michael S. Tsirkin wrote: > > On Wed, Jun 17, 2020 at 11:29:44AM +0800, Jason Wang wrote: > > > This patch introduce a config op to get valid iova range from the vDPA > > > device. > > > > > > Signed-off-by: Jason Wang > > > --- > > > include/linux/vdpa.h | 14 ++ > > > 1 file changed, 14 insertions(+) > > > > > > diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h > > > index 239db794357c..b7633ed2500c 100644 > > > --- a/include/linux/vdpa.h > > > +++ b/include/linux/vdpa.h > > > @@ -41,6 +41,16 @@ struct vdpa_device { > > > unsigned int index; > > > }; > > > > > > +/** > > > + * vDPA IOVA range - the IOVA range support by the device > > > + * @start: start of the IOVA range > > > + * @end: end of the IOVA range > > > + */ > > > +struct vdpa_iova_range { > > > + u64 start; > > > + u64 end; > > > +}; > > > + > > > > > > This is ambiguous. Is end in the range or just behind it? > > How about first/last? > > It is customary in the kernel to use start-end where end corresponds to > the byte following the last in the range. See struct vm_area_struct > vm_start and vm_end fields Exactly my point: include/linux/mm_types.h: unsigned long vm_end; /* The first byte after our end address in this case Jason wants it to be the last byte, not one behind. > > > > > > > > > /** > > > * vDPA_config_ops - operations for configuring a vDPA device. > > > * Note: vDPA device drivers are required to implement all of the > > > @@ -134,6 +144,9 @@ struct vdpa_device { > > > * @get_generation: Get device config generation (optional) > > > * @vdev: vdpa device > > > * Returns u32: device generation > > > + * @get_iova_range: Get supported iova range (on-chip IOMMU) > > > + * @vdev: vdpa device > > > + * Returns the iova range supported by the > > > device > > > * @set_map: Set device memory mapping (optional) > > > * Needed for device that using device > > > * specific DMA translation (on-chip IOMMU) > > > @@ -195,6 +208,7 @@ struct vdpa_config_ops { > > > void (*set_config)(struct vdpa_device *vdev, unsigned int offset, > > > const void *buf, unsigned int len); > > > u32 (*get_generation)(struct vdpa_device *vdev); > > > + struct vdpa_iova_range (*get_iova_range)(struct vdpa_device *vdev); > > > > > > /* DMA ops */ > > > int (*set_map)(struct vdpa_device *vdev, struct vhost_iotlb *iotlb); > > > -- > > > 2.20.1 > > ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH] x86/paravirt: Add missing noinstr to arch_local*() helpers
On Thu, Aug 06, 2020 at 09:47:23AM +0200, Marco Elver wrote: > Testing my hypothesis that raw then nested non-raw > local_irq_save/restore() breaks IRQ state tracking -- see the reproducer > below. This is at least 1 case I can think of that we're bound to hit. Aaargh! > diff --git a/init/main.c b/init/main.c > index 15bd0efff3df..0873319dcff4 100644 > --- a/init/main.c > +++ b/init/main.c > @@ -1041,6 +1041,22 @@ asmlinkage __visible void __init start_kernel(void) > sfi_init_late(); > kcsan_init(); > > + /* DEBUG CODE */ > + lockdep_assert_irqs_enabled(); /* Pass. */ > + { > + unsigned long flags1; > + raw_local_irq_save(flags1); This disables IRQs but doesn't trace.. > + { > + unsigned long flags2; > + lockdep_assert_irqs_enabled(); /* Pass - expectedly > blind. */ Indeed, we didn't trace the above disable, so software state is still on. > + local_irq_save(flags2); So here we save IRQ state, and unconditionally disable IRQs and trace them disabled. > + lockdep_assert_irqs_disabled(); /* Pass. */ > + local_irq_restore(flags2); But here, we restore IRQ state to 'disabled' and explicitly trace it disabled *again* (which is a bit daft, but whatever). > + } > + raw_local_irq_restore(flags1); This then restores the IRQ state to enable, but no tracing. > + } > + lockdep_assert_irqs_enabled(); /* FAIL! */ And we're out of sync... :/ /me goes ponder things... How's something like this then? --- include/linux/sched.h | 3 --- kernel/kcsan/core.c | 62 --- 2 files changed, 44 insertions(+), 21 deletions(-) diff --git a/include/linux/sched.h b/include/linux/sched.h index 06ec60462af0..2f5aef57e687 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1193,9 +1193,6 @@ struct task_struct { #ifdef CONFIG_KCSAN struct kcsan_ctxkcsan_ctx; -#ifdef CONFIG_TRACE_IRQFLAGS - struct irqtrace_events kcsan_save_irqtrace; -#endif #endif #ifdef CONFIG_FUNCTION_GRAPH_TRACER diff --git a/kernel/kcsan/core.c b/kernel/kcsan/core.c index 9147ff6a12e5..9c4436bf0561 100644 --- a/kernel/kcsan/core.c +++ b/kernel/kcsan/core.c @@ -291,17 +291,50 @@ static inline unsigned int get_delay(void) 0); } -void kcsan_save_irqtrace(struct task_struct *task) +/* + * KCSAN hooks are everywhere, which means they're NMI like for interrupt + * tracing. In order to present a 'normal' as possible context to the code + * called by KCSAN when reporting errors we need to update the irq-tracing + * state. + * + * Save and restore the IRQ state trace touched by KCSAN, since KCSAN's + * runtime is entered for every memory access, and potentially useful + * information is lost if dirtied by KCSAN. + */ + +struct kcsan_irq_state { + unsigned long flags; +#ifdef CONFIG_TRACE_IRQFLAGS + int hardirqs; + struct irqtrace_events irqtrace; +#endif +}; + +void kcsan_save_irqtrace(struct kcsan_irq_state *irq_state) { #ifdef CONFIG_TRACE_IRQFLAGS - task->kcsan_save_irqtrace = task->irqtrace; + irq_state->irqtrace = task->irqtrace; + irq_state->hardirq = lockdep_hardirqs_enabled(); #endif + if (!kcsan_interrupt_watcher) { + raw_local_irq_save(irq_state->flags); + lockdep_hardirqs_off(CALLER_ADDR0); + } } -void kcsan_restore_irqtrace(struct task_struct *task) +void kcsan_restore_irqtrace(struct kcsan_irq_state *irq_state) { + if (!kcsan_interrupt_watcher) { +#ifdef CONFIG_TRACE_IRQFLAGS + if (irq_state->hardirqs) { + lockdep_hardirqs_on_prepare(CALLER_ADDR0); + lockdep_hardirqs_on(CALLER_ADDR0); + } +#endif + raw_local_irq_restore(irq_state->flags); + } #ifdef CONFIG_TRACE_IRQFLAGS - task->irqtrace = task->kcsan_save_irqtrace; + task->irqtrace = irq_state->irqtrace; #endif } @@ -350,11 +383,13 @@ static noinline void kcsan_found_watchpoint(const volatile void *ptr, flags = user_access_save(); if (consumed) { - kcsan_save_irqtrace(current); + struct kcsan_irq_state irqstate; + + kcsan_save_irqtrace(&irqstate); kcsan_report(ptr, size, type, KCSAN_VALUE_CHANGE_MAYBE, KCSAN_REPORT_CONSUMED_WATCHPOINT, watchpoint - watchpoints); - kcsan_restore_irqtrace(current); + kcsan_restore_irqtrace(&irqstate); } else { /* * The other thread may not print any diagnostics, as it has @@ -387,7 +422,7 @@ kcsan_setup_watchpoint(const volatile void *ptr, size_t size, int type) unsigned long access_mask; e
Re: [PATCH v2 19/24] vdpa: make sure set_features in invoked for legacy
On Thu, Aug 06, 2020 at 03:27:38PM +0800, Jason Wang wrote: > > On 2020/8/6 下午1:53, Michael S. Tsirkin wrote: > > On Thu, Aug 06, 2020 at 11:23:05AM +0800, Jason Wang wrote: > > > On 2020/8/5 下午7:40, Michael S. Tsirkin wrote: > > > > On Wed, Aug 05, 2020 at 02:14:07PM +0800, Jason Wang wrote: > > > > > On 2020/8/4 上午5:00, Michael S. Tsirkin wrote: > > > > > > Some legacy guests just assume features are 0 after reset. > > > > > > We detect that config space is accessed before features are > > > > > > set and set features to 0 automatically. > > > > > > Note: some legacy guests might not even access config space, if > > > > > > this is > > > > > > reported in the field we might need to catch a kick to handle these. > > > > > I wonder whether it's easier to just support modern device? > > > > > > > > > > Thanks > > > > Well hardware vendors are I think interested in supporting legacy > > > > guests. Limiting vdpa to modern only would make it uncompetitive. > > > > > > My understanding is that, IOMMU_PLATFORM is mandatory for hardware vDPA to > > > work. > > Hmm I don't really see why. Assume host maps guest memory properly, > > VM does not have an IOMMU, legacy guest can just work. > > > Yes, guest may not set IOMMU_PLATFORM. > > > > > > Care explaining what's wrong with this picture? > > > The problem is virtio_vdpa, without IOMMU_PLATFORM it uses PA which can not > work if IOMMU is enabled. > > Thanks So that's a virtio_vdpa limitation. In the same way, if a device does not have an on-device iommu *and* is not behind an iommu, then vdpa can't bind to it. But this virtio_vdpa specific hack does not belong in a generic vdpa code. > > > > > > > > So it can only work for modern device ... > > > > > > Thanks > > > > > > > > > > > > > > ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v2 19/24] vdpa: make sure set_features in invoked for legacy
On 2020/8/6 下午1:53, Michael S. Tsirkin wrote: On Thu, Aug 06, 2020 at 11:23:05AM +0800, Jason Wang wrote: On 2020/8/5 下午7:40, Michael S. Tsirkin wrote: On Wed, Aug 05, 2020 at 02:14:07PM +0800, Jason Wang wrote: On 2020/8/4 上午5:00, Michael S. Tsirkin wrote: Some legacy guests just assume features are 0 after reset. We detect that config space is accessed before features are set and set features to 0 automatically. Note: some legacy guests might not even access config space, if this is reported in the field we might need to catch a kick to handle these. I wonder whether it's easier to just support modern device? Thanks Well hardware vendors are I think interested in supporting legacy guests. Limiting vdpa to modern only would make it uncompetitive. My understanding is that, IOMMU_PLATFORM is mandatory for hardware vDPA to work. Hmm I don't really see why. Assume host maps guest memory properly, VM does not have an IOMMU, legacy guest can just work. Yes, guest may not set IOMMU_PLATFORM. Care explaining what's wrong with this picture? The problem is virtio_vdpa, without IOMMU_PLATFORM it uses PA which can not work if IOMMU is enabled. Thanks So it can only work for modern device ... Thanks ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization