Re: [PATCH] virtio-blk: support polling I/O

2022-03-15 Thread Jason Wang
On Tue, Mar 15, 2022 at 10:43 PM Suwan Kim  wrote:
>
> On Tue, Mar 15, 2022 at 04:59:23PM +0800, Jason Wang wrote:
> > On Mon, Mar 14, 2022 at 8:33 PM Suwan Kim  wrote:
> >
> > > On Mon, Mar 14, 2022 at 02:14:53PM +0800, Jason Wang wrote:
> > > >
> > > > 在 2022/3/11 下午11:28, Suwan Kim 写道:
> > > > > diff --git a/include/uapi/linux/virtio_blk.h
> > > b/include/uapi/linux/virtio_blk.h
> > > > > index d888f013d9ff..3fcaf937afe1 100644
> > > > > --- a/include/uapi/linux/virtio_blk.h
> > > > > +++ b/include/uapi/linux/virtio_blk.h
> > > > > @@ -119,8 +119,9 @@ struct virtio_blk_config {
> > > > >  * deallocation of one or more of the sectors.
> > > > >  */
> > > > > __u8 write_zeroes_may_unmap;
> > > > > +   __u8 unused1;
> > > > > -   __u8 unused1[3];
> > > > > +   __virtio16 num_poll_queues;
> > > > >   } __attribute__((packed));
> > > >
> > > >
> > > > This looks like a implementation specific (virtio-blk-pci) optimization,
> > > how
> > > > about other implementation like vhost-user-blk?
> > >
> > > I didn’t consider vhost-user-blk yet. But does vhost-user-blk also
> > > use vritio_blk_config as kernel-qemu interface?
> > >
> >
> > Yes, but see below.
> >
> >
> > >
> > > Does vhost-user-blk need additional modification to support polling
> > > in kernel side?
> > >
> >
> >
> > No, but the issue is, things like polling looks not a good candidate for
> > the attributes belonging to the device but the driver. So I have more
> > questions:
> >
> > 1) what does it really mean for hardware virtio block devices?
> > 2) Does driver polling help for the qemu implementation without polling?
> > 3) Using blk_config means we can only get the benefit from the new device
>
> 1) what does it really mean for hardware virtio block devices?
> 3) Using blk_config means we can only get the benefit from the new device
>
> This patch adds dedicated HW queue for polling purpose to virtio
> block device.
>
> So I think it can be a new hw feature. And it can be a new device
> that supports hw poll queue.

One possible issue is that the "poll" looks more like a
software/driver concept other than the device/hardware.

>
> BTW, I have other idea about it.
>
> How about adding “num-poll-queues" property as a driver parameter
> like NVMe driver, not to QEMU virtio-blk-pci property?

It should be fine, but we need to listen to others.

>
> If then, we don’t need to modify virtio_blk_config.
> And we can apply the polling feature only to virtio-blk-pci.
> But can QEMU pass “num-poll-queues" to virtio-blk driver param?

As Michael said we can leave this to guest kernel / administrator.

>
>
>
> 2) Does driver polling help for the qemu implementation without polling?
>
> Sorry, I didn't understand your question. Could you please explain more about?

I mean does the polling work for the ordinary qemu block device
without busy polling?

Thanks

>
> Regards,
> Suwan Kim
>

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v4 05/11] PCI: Use driver_set_override() instead of open-coding

2022-03-15 Thread Andy Shevchenko
On Sat, Mar 12, 2022 at 4:09 PM Krzysztof Kozlowski
 wrote:
>
> Use a helper to set driver_override to reduce amount of duplicated code.

the amount

> Make the driver_override field const char, because it is not modified by
> the core and it matches other subsystems.


Seems like mine #4 here
https://gist.github.com/andy-shev/a2cb1ee4767d6d2f5d20db53ecb9aabc :-)
Reviewed-by: Andy Shevchenko 
Thanks!

> Signed-off-by: Krzysztof Kozlowski 
> Acked-by: Bjorn Helgaas 
> ---
>  drivers/pci/pci-sysfs.c | 28 
>  include/linux/pci.h |  6 +-
>  2 files changed, 9 insertions(+), 25 deletions(-)
>
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index 602f0fb0b007..5c42965c32c2 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -567,31 +567,11 @@ static ssize_t driver_override_store(struct device *dev,
>  const char *buf, size_t count)
>  {
> struct pci_dev *pdev = to_pci_dev(dev);
> -   char *driver_override, *old, *cp;
> -
> -   /* We need to keep extra room for a newline */
> -   if (count >= (PAGE_SIZE - 1))
> -   return -EINVAL;
> -
> -   driver_override = kstrndup(buf, count, GFP_KERNEL);
> -   if (!driver_override)
> -   return -ENOMEM;
> -
> -   cp = strchr(driver_override, '\n');
> -   if (cp)
> -   *cp = '\0';
> -
> -   device_lock(dev);
> -   old = pdev->driver_override;
> -   if (strlen(driver_override)) {
> -   pdev->driver_override = driver_override;
> -   } else {
> -   kfree(driver_override);
> -   pdev->driver_override = NULL;
> -   }
> -   device_unlock(dev);
> +   int ret;
>
> -   kfree(old);
> +   ret = driver_set_override(dev, >driver_override, buf, count);
> +   if (ret)
> +   return ret;
>
> return count;
>  }
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 60d423d8f0c4..415491fb85f4 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -516,7 +516,11 @@ struct pci_dev {
> u16 acs_cap;/* ACS Capability offset */
> phys_addr_t rom;/* Physical address if not from BAR */
> size_t  romlen; /* Length if not from BAR */
> -   char*driver_override; /* Driver name to force a match */
> +   /*
> +* Driver name to force a match.  Do not set directly, because core
> +* frees it.  Use driver_set_override() to set or clear it.
> +*/
> +   const char  *driver_override;
>
> unsigned long   priv_flags; /* Private flags for the PCI driver */
>
> --
> 2.32.0
>


-- 
With Best Regards,
Andy Shevchenko
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v4 01/11] driver: platform: Add helper for safer setting of driver_override

2022-03-15 Thread Andy Shevchenko
On Sat, Mar 12, 2022 at 5:16 PM Krzysztof Kozlowski
 wrote:
>
> Several core drivers and buses expect that driver_override is a
> dynamically allocated memory thus later they can kfree() it.
>
> However such assumption is not documented, there were in the past and
> there are already users setting it to a string literal. This leads to
> kfree() of static memory during device release (e.g. in error paths or
> during unbind):
>
> kernel BUG at ../mm/slub.c:3960!
> Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
> ...
> (kfree) from [] (platform_device_release+0x88/0xb4)
> (platform_device_release) from [] (device_release+0x2c/0x90)
> (device_release) from [] (kobject_put+0xec/0x20c)
> (kobject_put) from [] (exynos5_clk_probe+0x154/0x18c)
> (exynos5_clk_probe) from [] (platform_drv_probe+0x6c/0xa4)
> (platform_drv_probe) from [] (really_probe+0x280/0x414)
> (really_probe) from [] (driver_probe_device+0x78/0x1c4)
> (driver_probe_device) from [] (bus_for_each_drv+0x74/0xb8)
> (bus_for_each_drv) from [] (__device_attach+0xd4/0x16c)
> (__device_attach) from [] (bus_probe_device+0x88/0x90)
> (bus_probe_device) from [] (device_add+0x3dc/0x62c)
> (device_add) from [] (of_platform_device_create_pdata+0x94/0xbc)
> (of_platform_device_create_pdata) from [] 
> (of_platform_bus_create+0x1a8/0x4fc)
> (of_platform_bus_create) from [] 
> (of_platform_bus_create+0x20c/0x4fc)
> (of_platform_bus_create) from [] 
> (of_platform_populate+0x84/0x118)
> (of_platform_populate) from [] 
> (of_platform_default_populate_init+0xa0/0xb8)
> (of_platform_default_populate_init) from [] 
> (do_one_initcall+0x8c/0x404)

> (do_one_initcall) from [] (kernel_init_freeable+0x3d0/0x4d8)
> (kernel_init_freeable) from [] (kernel_init+0x8/0x114)
> (kernel_init) from [] (ret_from_fork+0x14/0x20)

I believe you may remove these three.

> Provide a helper which clearly documents the usage of driver_override.
> This will allow later to reuse the helper and reduce amount of

the amount

> duplicated code.

> Convert the platform driver to use new helper and make the

a new

> driver_override field const char (it is not modified by the core).

...

> +/**
> + * driver_set_override() - Helper to set or clear driver override.
> + * @dev: Device to change
> + * @override: Address of string to change (e.g. >driver_override);
> + *The contents will be freed and hold newly allocated override.
> + * @s: NUL terminated string, new driver name to force a match, pass empty

NUL-terminated? (44 vs 115 occurrences)

> + * string to clear it
> + * @len: length of @s
> + *
> + * Helper to set or clear driver override in a device, intended for the cases
> + * when the driver_override field is allocated by driver/bus code.
> + *
> + * Returns: 0 on success or a negative error code on failure.
> + */
> +int driver_set_override(struct device *dev, const char **override,
> +   const char *s, size_t len)
> +{
> +   const char *new, *old;
> +   char *cp;
> +
> +   if (!dev || !override || !s)
> +   return -EINVAL;
> +
> +   /*
> +* The stored value will be used in sysfs show callback 
> (sysfs_emit()),
> +* which has a length limit of PAGE_SIZE and adds a trailing newline.
> +* Thus we can store one character less to avoid truncation during 
> sysfs
> +* show.
> +*/
> +   if (len >= (PAGE_SIZE - 1))
> +   return -EINVAL;
> +
> +   new = kstrndup(s, len, GFP_KERNEL);
> +   if (!new)
> +   return -ENOMEM;
> +
> +   cp = strchr(new, '\n');
> +   if (cp)
> +   *cp = '\0';

AFAIU you may reduce memory footprint by

cp = strnchr(new, len, '\n');
if (cp)
  len = s - cp;

new = kstrndup(...);

> +   device_lock(dev);
> +   old = *override;
> +   if (cp != new) {
> +   *override = new;
> +   } else {
> +   kfree(new);
> +   *override = NULL;
> +   }
> +   device_unlock(dev);
> +
> +   kfree(old);
> +
> +   return 0;
> +}

-- 
With Best Regards,
Andy Shevchenko
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [GIT PULL] virtio: a last minute regression fix

2022-03-15 Thread pr-tracker-bot
The pull request you sent on Mon, 14 Mar 2022 07:59:51 -0400:

> https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/6665ca15746dc34606b5d79fae278a101a368437

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] vDPA/ifcvf: match pointer check to use

2022-03-15 Thread Michael S. Tsirkin
On Tue, Mar 15, 2022 at 08:03:26AM -0700, Tom Rix wrote:
> 
> On 3/15/22 6:28 AM, Michael S. Tsirkin wrote:
> > On Tue, Mar 15, 2022 at 05:41:30AM -0700, t...@redhat.com wrote:
> > > From: Tom Rix 
> > > 
> > > Clang static analysis reports this issue
> > > ifcvf_main.c:49:4: warning: Called function
> > >pointer is null (null dereference)
> > >vf->vring->cb.callback(vring->cb.private);
> > >^
> > > 
> > > The check
> > >vring = >vring[i];
> > >if (vring->cb.callback)
> > > 
> > > Does not match the use.  Change dereference so they match.
> > > 
> > > Fixes: 79333575b8bd ("vDPA/ifcvf: implement shared IRQ feature")
> > Thanks a lot! I squashed this into the offending patch - no point in
> > breaking bisect. Pushed to linux. However I'm now
> > having second thoughts about applying that patchset - I'd like
> > soma analysis explaining how this got through testing.
> 
> static analysis is something i do treewide.
> 
> There are currently ~2500 issues in linux-next, do not panic! many are false
> positives.
> 
> It is pretty easy to setup and once you have a baseline you can filter only
> your files.
> 
> Tom

Thanks for that info! I was actually directing this question to the
contributor since the code does not look like it could have ever
worked. I don't have the hardware in question myself.


> > > Signed-off-by: Tom Rix 
> > > ---
> > >   drivers/vdpa/ifcvf/ifcvf_main.c | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c 
> > > b/drivers/vdpa/ifcvf/ifcvf_main.c
> > > index 3b48e717e89f7..4366320fb68d3 100644
> > > --- a/drivers/vdpa/ifcvf/ifcvf_main.c
> > > +++ b/drivers/vdpa/ifcvf/ifcvf_main.c
> > > @@ -46,7 +46,7 @@ static irqreturn_t ifcvf_vqs_reused_intr_handler(int 
> > > irq, void *arg)
> > >   for (i = 0; i < vf->nr_vring; i++) {
> > >   vring = >vring[i];
> > >   if (vring->cb.callback)
> > > - vf->vring->cb.callback(vring->cb.private);
> > > + vring->cb.callback(vring->cb.private);
> > >   }
> > >   return IRQ_HANDLED;
> > > -- 
> > > 2.26.3

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] virtio-blk: support polling I/O

2022-03-15 Thread Michael S. Tsirkin
On Tue, Mar 15, 2022 at 11:43:18PM +0900, Suwan Kim wrote:
> On Tue, Mar 15, 2022 at 04:59:23PM +0800, Jason Wang wrote:
> > On Mon, Mar 14, 2022 at 8:33 PM Suwan Kim  wrote:
> > 
> > > On Mon, Mar 14, 2022 at 02:14:53PM +0800, Jason Wang wrote:
> > > >
> > > > 在 2022/3/11 下午11:28, Suwan Kim 写道:
> > > > > diff --git a/include/uapi/linux/virtio_blk.h
> > > b/include/uapi/linux/virtio_blk.h
> > > > > index d888f013d9ff..3fcaf937afe1 100644
> > > > > --- a/include/uapi/linux/virtio_blk.h
> > > > > +++ b/include/uapi/linux/virtio_blk.h
> > > > > @@ -119,8 +119,9 @@ struct virtio_blk_config {
> > > > >  * deallocation of one or more of the sectors.
> > > > >  */
> > > > > __u8 write_zeroes_may_unmap;
> > > > > +   __u8 unused1;
> > > > > -   __u8 unused1[3];
> > > > > +   __virtio16 num_poll_queues;
> > > > >   } __attribute__((packed));
> > > >
> > > >
> > > > This looks like a implementation specific (virtio-blk-pci) optimization,
> > > how
> > > > about other implementation like vhost-user-blk?
> > >
> > > I didn’t consider vhost-user-blk yet. But does vhost-user-blk also
> > > use vritio_blk_config as kernel-qemu interface?
> > >
> > 
> > Yes, but see below.
> > 
> > 
> > >
> > > Does vhost-user-blk need additional modification to support polling
> > > in kernel side?
> > >
> > 
> > 
> > No, but the issue is, things like polling looks not a good candidate for
> > the attributes belonging to the device but the driver. So I have more
> > questions:
> > 
> > 1) what does it really mean for hardware virtio block devices?
> > 2) Does driver polling help for the qemu implementation without polling?
> > 3) Using blk_config means we can only get the benefit from the new device
> 
> 1) what does it really mean for hardware virtio block devices?
> 3) Using blk_config means we can only get the benefit from the new device
> 
> This patch adds dedicated HW queue for polling purpose to virtio
> block device.
> 
> So I think it can be a new hw feature. And it can be a new device
> that supports hw poll queue.
> 
> BTW, I have other idea about it.
> 
> How about adding “num-poll-queues" property as a driver parameter
> like NVMe driver, not to QEMU virtio-blk-pci property?
> 
> If then, we don’t need to modify virtio_blk_config.
> And we can apply the polling feature only to virtio-blk-pci.
> But can QEMU pass “num-poll-queues" to virtio-blk driver param?

Same as any other driver parameter, pass it on kernel command line.

> 
> 
> 2) Does driver polling help for the qemu implementation without polling?
> 
> Sorry, I didn't understand your question. Could you please explain more about?
> 
> Regards,
> Suwan Kim

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH] vDPA/ifcvf: match pointer check to use

2022-03-15 Thread Michael S. Tsirkin
On Tue, Mar 15, 2022 at 05:41:30AM -0700, t...@redhat.com wrote:
> From: Tom Rix 
> 
> Clang static analysis reports this issue
> ifcvf_main.c:49:4: warning: Called function
>   pointer is null (null dereference)
>   vf->vring->cb.callback(vring->cb.private);
>   ^
> 
> The check
>   vring = >vring[i];
>   if (vring->cb.callback)
> 
> Does not match the use.  Change dereference so they match.
> 
> Fixes: 79333575b8bd ("vDPA/ifcvf: implement shared IRQ feature")

Thanks a lot! I squashed this into the offending patch - no point in
breaking bisect. Pushed to linux. However I'm now
having second thoughts about applying that patchset - I'd like
soma analysis explaining how this got through testing.

> Signed-off-by: Tom Rix 

> ---
>  drivers/vdpa/ifcvf/ifcvf_main.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
> index 3b48e717e89f7..4366320fb68d3 100644
> --- a/drivers/vdpa/ifcvf/ifcvf_main.c
> +++ b/drivers/vdpa/ifcvf/ifcvf_main.c
> @@ -46,7 +46,7 @@ static irqreturn_t ifcvf_vqs_reused_intr_handler(int irq, 
> void *arg)
>   for (i = 0; i < vf->nr_vring; i++) {
>   vring = >vring[i];
>   if (vring->cb.callback)
> - vf->vring->cb.callback(vring->cb.private);
> + vring->cb.callback(vring->cb.private);
>   }
>  
>   return IRQ_HANDLED;
> -- 
> 2.26.3

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] vDPA/ifcvf: match pointer check to use

2022-03-15 Thread Michael S. Tsirkin
On Tue, Mar 15, 2022 at 05:41:30AM -0700, t...@redhat.com wrote:
> From: Tom Rix 
> 
> Clang static analysis reports this issue
> ifcvf_main.c:49:4: warning: Called function
>   pointer is null (null dereference)
>   vf->vring->cb.callback(vring->cb.private);
>   ^
> 
> The check
>   vring = >vring[i];
>   if (vring->cb.callback)
> 
> Does not match the use.  Change dereference so they match.
> 
> Fixes: 79333575b8bd ("vDPA/ifcvf: implement shared IRQ feature")
> Signed-off-by: Tom Rix 
> ---
>  drivers/vdpa/ifcvf/ifcvf_main.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
> index 3b48e717e89f7..4366320fb68d3 100644
> --- a/drivers/vdpa/ifcvf/ifcvf_main.c
> +++ b/drivers/vdpa/ifcvf/ifcvf_main.c
> @@ -46,7 +46,7 @@ static irqreturn_t ifcvf_vqs_reused_intr_handler(int irq, 
> void *arg)
>   for (i = 0; i < vf->nr_vring; i++) {
>   vring = >vring[i];
>   if (vring->cb.callback)
> - vf->vring->cb.callback(vring->cb.private);
> + vring->cb.callback(vring->cb.private);
>   }
>  
>   return IRQ_HANDLED;


Oh, absolutely. In fact vf->vring->cb.callback is just
vf->vring[0].cb.callback so it's wrong for any ring except 0. Does not
make sense.

So how did it work in testing then? No idea.
Zhu Lingshan, care to comment?


> -- 
> 2.26.3

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC PATCH v1 3/3] af_vsock: SOCK_SEQPACKET broken buffer test

2022-03-15 Thread Stefano Garzarella

On Tue, Mar 15, 2022 at 12:43:13PM +, Krasnov Arseniy Vladimirovich wrote:

On 15.03.2022 12:35, Arseniy Krasnov wrote:

On 15.03.2022 11:36, Stefano Garzarella wrote:

On Fri, Mar 11, 2022 at 10:58:32AM +, Krasnov Arseniy Vladimirovich wrote:

Add test where sender sends two message, each with own
data pattern. Reader tries to read first to broken buffer:
it has three pages size, but middle page is unmapped. Then,
reader tries to read second message to valid buffer. Test
checks, that uncopied part of first message was dropped
and thus not copied as part of second message.

Signed-off-by: Arseniy Krasnov 
---
tools/testing/vsock/vsock_test.c | 121 +++
1 file changed, 121 insertions(+)

diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
index aa2de27d0f77..686af712b4ad 100644
--- a/tools/testing/vsock/vsock_test.c
+++ b/tools/testing/vsock/vsock_test.c
@@ -16,6 +16,7 @@
#include 
#include 
#include 
+#include 

#include "timeout.h"
#include "control.h"
@@ -435,6 +436,121 @@ static void test_seqpacket_timeout_server(const struct 
test_opts *opts)
close(fd);
}

+#define BUF_PATTERN_1 'a'
+#define BUF_PATTERN_2 'b'
+
+static void test_seqpacket_invalid_rec_buffer_client(const struct test_opts 
*opts)
+{
+    int fd;
+    unsigned char *buf1;
+    unsigned char *buf2;
+    int buf_size = getpagesize() * 3;
+
+    fd = vsock_seqpacket_connect(opts->peer_cid, 1234);
+    if (fd < 0) {
+    perror("connect");
+    exit(EXIT_FAILURE);
+    }
+
+    buf1 = malloc(buf_size);
+    if (buf1 == NULL) {
+    perror("'malloc()' for 'buf1'");
+    exit(EXIT_FAILURE);
+    }
+
+    buf2 = malloc(buf_size);
+    if (buf2 == NULL) {
+    perror("'malloc()' for 'buf2'");
+    exit(EXIT_FAILURE);
+    }
+
+    memset(buf1, BUF_PATTERN_1, buf_size);
+    memset(buf2, BUF_PATTERN_2, buf_size);
+
+    if (send(fd, buf1, buf_size, 0) != buf_size) {
+    perror("send failed");
+    exit(EXIT_FAILURE);
+    }
+
+    if (send(fd, buf2, buf_size, 0) != buf_size) {
+    perror("send failed");
+    exit(EXIT_FAILURE);
+    }
+
+    close(fd);
+}
+
+static void test_seqpacket_invalid_rec_buffer_server(const struct test_opts 
*opts)
+{
+    int fd;
+    unsigned char *broken_buf;
+    unsigned char *valid_buf;
+    int page_size = getpagesize();
+    int buf_size = page_size * 3;
+    ssize_t res;
+    int prot = PROT_READ | PROT_WRITE;
+    int flags = MAP_PRIVATE | MAP_ANONYMOUS;
+    int i;
+
+    fd = vsock_seqpacket_accept(VMADDR_CID_ANY, 1234, NULL);
+    if (fd < 0) {
+    perror("accept");
+    exit(EXIT_FAILURE);
+    }
+
+    /* Setup first buffer. */
+    broken_buf = mmap(NULL, buf_size, prot, flags, -1, 0);
+    if (broken_buf == MAP_FAILED) {
+    perror("mmap for 'broken_buf'");
+    exit(EXIT_FAILURE);
+    }
+
+    /* Unmap "hole" in buffer. */
+    if (munmap(broken_buf + page_size, page_size)) {
+    perror("'broken_buf' setup");
+    exit(EXIT_FAILURE);
+    }
+
+    valid_buf = mmap(NULL, buf_size, prot, flags, -1, 0);
+    if (valid_buf == MAP_FAILED) {
+    perror("mmap for 'valid_buf'");
+    exit(EXIT_FAILURE);
+    }
+
+    /* Try to fill buffer with unmapped middle. */
+    res = read(fd, broken_buf, buf_size);
+    if (res != -1) {
+    perror("invalid read result of 'broken_buf'");


if `res` is valid, errno is not set, better to use fprintf(stderr, ...) 
printing the expected and received result.
Take a look at test_stream_connection_reset()


Ack, fix it in v2




+    exit(EXIT_FAILURE);
+    }
+
+    if (errno != ENOMEM) {
+    perror("invalid errno of 'broken_buf'");


Instead of "invalid", I would say "unexpected".


Ack




+    exit(EXIT_FAILURE);
+    }




+
+    /* Try to fill valid buffer. */
+    res = read(fd, valid_buf, buf_size);
+    if (res != buf_size) {
+    perror("invalid read result of 'valid_buf'");


I would split in 2 checks:
- (res < 0) then use perror()
- (res != buf_size) then use fprintf(stderr, ...) printing the expected   and 
received result.


Ack




+    exit(EXIT_FAILURE);
+    }
+
+    for (i = 0; i < buf_size; i++) {
+    if (valid_buf[i] != BUF_PATTERN_2) {
+    perror("invalid pattern for valid buf");


errno is not set here, better to use fprintf(stderr, ...)


Ack




+    exit(EXIT_FAILURE);
+    }
+    }


What about replace this for with a memcmp()?


memcmp() will require special buffer with BUF_PATTERN_2 data as
second argument. I think 'if()' condition is better here, as we
compare each element of buffer with single byte. Anyway, 'memcmp()'
does the same things inside itself.


Ah, I see. Okay, I agree on for()/if(), maybe we can also print more 
info (index, expected value, current value).






Ack




+
+
+    /* Unmap buffers. */
+    munmap(broken_buf, page_size);
+    munmap(broken_buf + page_size * 2, page_size);
+    munmap(valid_buf, buf_size);
+    close(fd);
+}
+

Re: [PATCH v2 0/8] Add memory shrinker to VirtIO-GPU DRM driver

2022-03-15 Thread Emil Velikov
On Mon, 14 Mar 2022 at 22:44, Dmitry Osipenko
 wrote:

> Dmitry Osipenko (8):
>   drm/virtio: Correct drm_gem_shmem_get_sg_table() error handling
>   drm/virtio: Check whether transferred 2D BO is shmem
>   drm/virtio: Unlock GEM reservations in error code path

These three are legitimate fixes that we want regardless of the shrinker.

Please add the respective "Fixes" tag, so they find their way in the
stable trees. With that, them 3 are:
Reviewed-by: Emil Velikov 

HTH
Emil
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v2 7/8] drm/virtio: Support memory shrinking

2022-03-15 Thread Emil Velikov
Greetings everyone,

Food for thought: Would it make sense to have the madvise ioctl as
generic DRM one?
Looking around - i915, msm & panfrost already have one and the virtio
implementation [below] seems as generic as it gets.

On Mon, 14 Mar 2022 at 22:44, Dmitry Osipenko
 wrote:

> +#define VIRTGPU_MADV_WILLNEED 0
> +#define VIRTGPU_MADV_DONTNEED 1
> +struct drm_virtgpu_madvise {
> +   __u32 bo_handle;
> +   __u32 retained; /* out, non-zero if BO can be used */
> +   __u32 madv;
> +   __u32 pad;

This seems to be based on panfrost/msm yet names (bo_handle vs
handle), layout and documentation varies.
Why is that - copy/paste is cheap :-P

HTH

-Emil
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC PATCH v1 3/3] af_vsock: SOCK_SEQPACKET broken buffer test

2022-03-15 Thread Stefano Garzarella
On Tue, Mar 15, 2022 at 09:34:35AM +, Krasnov Arseniy Vladimirovich 
wrote:

On 15.03.2022 11:36, Stefano Garzarella wrote:


Is this the right behavior? If read() fails because the buffer is invalid, do 
we throw out the whole packet?

I was expecting the packet not to be consumed, have you tried AF_UNIX, does it 
have the same behavior?


I've just checked AF_UNIX implementation of SEQPACKET receive in 
net/unix/af_unix.c. So, if 'skb_copy_datagram_msg()'
fails, it calls 'skb_free_datagram()'. I think this means that whole sk buff 
will be dropped, but anyway, i'll check
this behaviour in practice. See '__unix_dgram_recvmsg()' in net/unix/af_unix.c.



Yep. you are right it seems to be discarded but I don't know that
code very well, so better to test as you said ;-)

Thanks,
Stefano

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v2 3/3] vdpa: support exposing the count of vqs to userspace

2022-03-15 Thread Stefano Garzarella

On Tue, Mar 15, 2022 at 11:25:53AM +0800, Longpeng(Mike) wrote:

From: Longpeng 

- GET_VQS_COUNT: the count of virtqueues that exposed

Signed-off-by: Longpeng 
---
drivers/vhost/vdpa.c   | 13 +
include/uapi/linux/vhost.h |  3 +++
2 files changed, 16 insertions(+)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 0c82eb5..69b3f05 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -370,6 +370,16 @@ static long vhost_vdpa_get_config_size(struct vhost_vdpa 
*v, u32 __user *argp)
return 0;
}

+static long vhost_vdpa_get_vqs_count(struct vhost_vdpa *v, u32 __user *argp)
+{
+   struct vdpa_device *vdpa = v->vdpa;
+
+   if (copy_to_user(argp, >nvqs, sizeof(vdpa->nvqs)))
+   return -EFAULT;
+
+   return 0;
+}
+
static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
   void __user *argp)
{
@@ -510,6 +520,9 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
case VHOST_VDPA_GET_CONFIG_SIZE:
r = vhost_vdpa_get_config_size(v, argp);
break;
+   case VHOST_VDPA_GET_VQS_COUNT:
+   r = vhost_vdpa_get_vqs_count(v, argp);
+   break;
default:
r = vhost_dev_ioctl(>vdev, cmd, argp);
if (r == -ENOIOCTLCMD)
diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
index bc74e95..5d99e7c 100644
--- a/include/uapi/linux/vhost.h
+++ b/include/uapi/linux/vhost.h
@@ -154,4 +154,7 @@
/* Get the config size */
#define VHOST_VDPA_GET_CONFIG_SIZE  _IOR(VHOST_VIRTIO, 0x79, __u32)

+/* Get the count of all virtqueues */
+#define VHOST_VDPA_GET_VQS_COUNT   _IOR(VHOST_VIRTIO, 0x80, __u32)


I'd prefer VHOST_VDPA_GET_NUM_QUEUES, since we use "num_queues" also in 
the spec [1].


But I'm fine also with this:

Reviewed-by: Stefano Garzarella 

[1] 
https://docs.oasis-open.org/virtio/virtio/v1.1/csprd01/virtio-v1.1-csprd01.html#x1-1120003


___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v2 2/3] vdpa: change the type of nvqs to u32

2022-03-15 Thread Stefano Garzarella

On Tue, Mar 15, 2022 at 11:25:52AM +0800, Longpeng(Mike) wrote:

From: Longpeng 

Change vdpa_device.nvqs and vhost_vdpa.nvqs to use u32

Signed-off-by: Longpeng 
---
drivers/vdpa/vdpa.c  |  6 +++---
drivers/vhost/vdpa.c | 10 ++
include/linux/vdpa.h |  6 +++---
3 files changed, 12 insertions(+), 10 deletions(-)


Reviewed-by: Stefano Garzarella 

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v2 1/3] vdpa: support exposing the config size to userspace

2022-03-15 Thread Stefano Garzarella

On Tue, Mar 15, 2022 at 11:25:51AM +0800, Longpeng(Mike) wrote:

From: Longpeng 

- GET_CONFIG_SIZE: return the size of the virtio config space.

The size contains the fields which are conditional on feature
bits.

Acked-by: Jason Wang 
Signed-off-by: Longpeng 
---
drivers/vhost/vdpa.c   | 17 +
include/linux/vdpa.h   |  3 ++-
include/uapi/linux/vhost.h |  4 
3 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index ec5249e..605c7ae 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -355,6 +355,20 @@ static long vhost_vdpa_get_iova_range(struct vhost_vdpa 
*v, u32 __user *argp)
return 0;
}

+static long vhost_vdpa_get_config_size(struct vhost_vdpa *v, u32 __user *argp)
+{
+   struct vdpa_device *vdpa = v->vdpa;
+   const struct vdpa_config_ops *ops = vdpa->config;
+   u32 size;
+
+   size = ops->get_config_size(vdpa);


get_config_size() returns a size_t, perhaps we could have a comment here 
where we say we don't expect there to be an overflow.


I don't have a strong opinion on this, and I wouldn't want to get you to 
repin just for that, so:


Reviewed-by: Stefano Garzarella 

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [bug report] vDPA/ifcvf: implement shared IRQ feature

2022-03-15 Thread Dan Carpenter
What I really want is to re-enable GCC's uninitialized variable warning.

$ rm drivers/vdpa/ifcvf/ifcvf_main.o
$ make W=2 drivers/vdpa/ifcvf/ifcvf_main.o

It prints a ton of output but this is the relevant bit.

drivers/vdpa/ifcvf/ifcvf_main.c: In function ‘ifcvf_vdpa_set_status’:
drivers/vdpa/ifcvf/ifcvf_main.c:291:6: warning: ‘config_vector’ may be used 
uninitialized in this function [-Wmaybe-uninitialized]
  int config_vector, ret;
  ^

regards,
dan carpenter


___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH] virtio-blk: support polling I/O

2022-03-15 Thread Jason Wang
On Mon, Mar 14, 2022 at 8:33 PM Suwan Kim  wrote:

> On Mon, Mar 14, 2022 at 02:14:53PM +0800, Jason Wang wrote:
> >
> > 在 2022/3/11 下午11:28, Suwan Kim 写道:
> > > diff --git a/include/uapi/linux/virtio_blk.h
> b/include/uapi/linux/virtio_blk.h
> > > index d888f013d9ff..3fcaf937afe1 100644
> > > --- a/include/uapi/linux/virtio_blk.h
> > > +++ b/include/uapi/linux/virtio_blk.h
> > > @@ -119,8 +119,9 @@ struct virtio_blk_config {
> > >  * deallocation of one or more of the sectors.
> > >  */
> > > __u8 write_zeroes_may_unmap;
> > > +   __u8 unused1;
> > > -   __u8 unused1[3];
> > > +   __virtio16 num_poll_queues;
> > >   } __attribute__((packed));
> >
> >
> > This looks like a implementation specific (virtio-blk-pci) optimization,
> how
> > about other implementation like vhost-user-blk?
>
> I didn’t consider vhost-user-blk yet. But does vhost-user-blk also
> use vritio_blk_config as kernel-qemu interface?
>

Yes, but see below.


>
> Does vhost-user-blk need additional modification to support polling
> in kernel side?
>


No, but the issue is, things like polling looks not a good candidate for
the attributes belonging to the device but the driver. So I have more
questions:

1) what does it really mean for hardware virtio block devices?
2) Does driver polling help for the qemu implementation without polling?
3) Using blk_config means we can only get the benefit from the new device

Thanks



> Regards,
> Suwan Kim
>
>
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [bug report] vDPA/ifcvf: implement shared IRQ feature

2022-03-15 Thread Dan Carpenter
On Tue, Mar 15, 2022 at 10:27:35AM +0800, Zhu, Lingshan wrote:
> 
> 
> On 3/14/2022 6:37 PM, Dan Carpenter wrote:
> > On Mon, Mar 14, 2022 at 10:22:03AM +0800, Zhu, Lingshan wrote:
> > > Hello Dan,
> > > 
> > > Thanks for your suggestions and this auto-testing efforts!
> > > On handling the vector for device config interrupt, there are three
> > > possibilities:
> > > (1)it has a dedicated vector(2)it shares a vector with datapath(3)no
> > > vectors.
> > > 
> > > So in these code below, it handles the three cases, or it should be 
> > > -EINVAL,
> > > so IMHO we don't need
> > > an else there, just leave it -EINVAL.
> > I'm confused about why you're talking about -EINVAL...  There is no
> > -EINVAL in this function.
> Oh, sorry I didn't explain this well. It is a series of patches, in other
> patches, we initialize hw->config_irq = -EINVAL
> as a safe default value. In this function ifcvf_request_config_irq(),
> hw->config_irq is generated by request_config_irq().
> 
> Then config_vector matters, there are only three possible values the
> config_vector can be(listed above),
> depending on vf->msix_vector_status. And vf->msix_vector_status depends on
> how many MSI vectors we got,
> 
> so there are only three values it can take, IMHO, it is a complete set of
> the possibilities, we already
> handled "else" in ifcvf_request_irq().

Alright, fine.  Yes, I verified this and it's a false positive.  But GCC
will also warn about this code if we manage to enable the GCC warning
again.

If we make a chart like this:

  [looks correct] [ looks buggy ]
[ no warnings] 1 2
  [ warnings ] 3 4


Where you want to be is in box 1 where it looks correct and has no
warnings.  Boxes 2 and 3 are a second best option because if it doesn't
have static checker warnings, people won't notice it.  Or if the
warnings are obvious false postives they can skip over it quickly.

But this code is box 4 where it is a big waste of time for anyone
running a static checker.

I almost prefer actual bugs to code which is deliberately written to
fit in box 4.

regards,
dan carpenter
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC PATCH v1 1/3] af_vsock: add two new tests for SOCK_SEQPACKET

2022-03-15 Thread Stefano Garzarella

Hi Arseniy,

On Fri, Mar 11, 2022 at 10:52:36AM +, Krasnov Arseniy Vladimirovich wrote:

This adds two tests: for receive timeout and reading to invalid
buffer provided by user. I forgot to put both patches to main
patchset.

Arseniy Krasnov(2):

af_vsock: SOCK_SEQPACKET receive timeout test
af_vsock: SOCK_SEQPACKET broken buffer test

tools/testing/vsock/vsock_test.c | 170 +++
1 file changed, 170 insertions(+)


Thank you for these tests!

I left a few comments and I'm not sure about the 'broken buffer test' 
behavior.


About the series, it sounds like something is wrong with your setup, 
usually the cover letter is "patch" 0. In this case I would have 
expected:


[0/2] af_vsock: add two new tests for SOCK_SEQPACKET
[1/2] af_vsock: SOCK_SEQPACKET receive timeout test
[2/2] af_vsock: SOCK_SEQPACKET broken buffer test

Are you using `git send-email` or `git publish`?


When you will remove the RFC, please add `net-next` label:
[PATCH net-next 0/2], etc..

Thanks,
Stefano

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC PATCH v1 3/3] af_vsock: SOCK_SEQPACKET broken buffer test

2022-03-15 Thread Stefano Garzarella

On Fri, Mar 11, 2022 at 10:58:32AM +, Krasnov Arseniy Vladimirovich wrote:

Add test where sender sends two message, each with own
data pattern. Reader tries to read first to broken buffer:
it has three pages size, but middle page is unmapped. Then,
reader tries to read second message to valid buffer. Test
checks, that uncopied part of first message was dropped
and thus not copied as part of second message.

Signed-off-by: Arseniy Krasnov 
---
tools/testing/vsock/vsock_test.c | 121 +++
1 file changed, 121 insertions(+)

diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
index aa2de27d0f77..686af712b4ad 100644
--- a/tools/testing/vsock/vsock_test.c
+++ b/tools/testing/vsock/vsock_test.c
@@ -16,6 +16,7 @@
#include 
#include 
#include 
+#include 

#include "timeout.h"
#include "control.h"
@@ -435,6 +436,121 @@ static void test_seqpacket_timeout_server(const struct 
test_opts *opts)
close(fd);
}

+#define BUF_PATTERN_1 'a'
+#define BUF_PATTERN_2 'b'
+
+static void test_seqpacket_invalid_rec_buffer_client(const struct test_opts 
*opts)
+{
+   int fd;
+   unsigned char *buf1;
+   unsigned char *buf2;
+   int buf_size = getpagesize() * 3;
+
+   fd = vsock_seqpacket_connect(opts->peer_cid, 1234);
+   if (fd < 0) {
+   perror("connect");
+   exit(EXIT_FAILURE);
+   }
+
+   buf1 = malloc(buf_size);
+   if (buf1 == NULL) {
+   perror("'malloc()' for 'buf1'");
+   exit(EXIT_FAILURE);
+   }
+
+   buf2 = malloc(buf_size);
+   if (buf2 == NULL) {
+   perror("'malloc()' for 'buf2'");
+   exit(EXIT_FAILURE);
+   }
+
+   memset(buf1, BUF_PATTERN_1, buf_size);
+   memset(buf2, BUF_PATTERN_2, buf_size);
+
+   if (send(fd, buf1, buf_size, 0) != buf_size) {
+   perror("send failed");
+   exit(EXIT_FAILURE);
+   }
+
+   if (send(fd, buf2, buf_size, 0) != buf_size) {
+   perror("send failed");
+   exit(EXIT_FAILURE);
+   }
+
+   close(fd);
+}
+
+static void test_seqpacket_invalid_rec_buffer_server(const struct test_opts 
*opts)
+{
+   int fd;
+   unsigned char *broken_buf;
+   unsigned char *valid_buf;
+   int page_size = getpagesize();
+   int buf_size = page_size * 3;
+   ssize_t res;
+   int prot = PROT_READ | PROT_WRITE;
+   int flags = MAP_PRIVATE | MAP_ANONYMOUS;
+   int i;
+
+   fd = vsock_seqpacket_accept(VMADDR_CID_ANY, 1234, NULL);
+   if (fd < 0) {
+   perror("accept");
+   exit(EXIT_FAILURE);
+   }
+
+   /* Setup first buffer. */
+   broken_buf = mmap(NULL, buf_size, prot, flags, -1, 0);
+   if (broken_buf == MAP_FAILED) {
+   perror("mmap for 'broken_buf'");
+   exit(EXIT_FAILURE);
+   }
+
+   /* Unmap "hole" in buffer. */
+   if (munmap(broken_buf + page_size, page_size)) {
+   perror("'broken_buf' setup");
+   exit(EXIT_FAILURE);
+   }
+
+   valid_buf = mmap(NULL, buf_size, prot, flags, -1, 0);
+   if (valid_buf == MAP_FAILED) {
+   perror("mmap for 'valid_buf'");
+   exit(EXIT_FAILURE);
+   }
+
+   /* Try to fill buffer with unmapped middle. */
+   res = read(fd, broken_buf, buf_size);
+   if (res != -1) {
+   perror("invalid read result of 'broken_buf'");


if `res` is valid, errno is not set, better to use fprintf(stderr, ...) 
printing the expected and received result.

Take a look at test_stream_connection_reset()


+   exit(EXIT_FAILURE);
+   }
+
+   if (errno != ENOMEM) {
+   perror("invalid errno of 'broken_buf'");


Instead of "invalid", I would say "unexpected".


+   exit(EXIT_FAILURE);
+   }




+
+   /* Try to fill valid buffer. */
+   res = read(fd, valid_buf, buf_size);
+   if (res != buf_size) {
+   perror("invalid read result of 'valid_buf'");


I would split in 2 checks:
- (res < 0) then use perror()
- (res != buf_size) then use fprintf(stderr, ...) printing the expected 
  and received result.



+   exit(EXIT_FAILURE);
+   }
+
+   for (i = 0; i < buf_size; i++) {
+   if (valid_buf[i] != BUF_PATTERN_2) {
+   perror("invalid pattern for valid buf");


errno is not set here, better to use fprintf(stderr, ...)


+   exit(EXIT_FAILURE);
+   }
+   }


What about replace this for with a memcmp()?


+
+
+   /* Unmap buffers. */
+   munmap(broken_buf, page_size);
+   munmap(broken_buf + page_size * 2, page_size);
+   munmap(valid_buf, buf_size);
+   close(fd);
+}
+
static struct test_case test_cases[] = {
{
.name = "SOCK_STREAM connection reset",
@@ -480,6 +596,11 @@ static struct test_case test_cases[] = {

Re: [RFC PATCH v1 2/3] af_vsock: SOCK_SEQPACKET receive timeout test

2022-03-15 Thread Stefano Garzarella

On Fri, Mar 11, 2022 at 10:55:42AM +, Krasnov Arseniy Vladimirovich wrote:

Test for receive timeout check: connection is established,
receiver sets timeout, but sender does nothing. Receiver's
'read()' call must return EAGAIN.

Signed-off-by: Arseniy Krasnov 
---
tools/testing/vsock/vsock_test.c | 49 
1 file changed, 49 insertions(+)

diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
index 2a3638c0a008..aa2de27d0f77 100644
--- a/tools/testing/vsock/vsock_test.c
+++ b/tools/testing/vsock/vsock_test.c
@@ -391,6 +391,50 @@ static void test_seqpacket_msg_trunc_server(const struct 
test_opts *opts)
close(fd);
}

+static void test_seqpacket_timeout_client(const struct test_opts *opts)
+{
+   int fd;
+   struct timeval tv;
+   char dummy;
+
+   fd = vsock_seqpacket_connect(opts->peer_cid, 1234);
+   if (fd < 0) {
+   perror("connect");
+   exit(EXIT_FAILURE);
+   }
+
+   tv.tv_sec = 1;
+   tv.tv_usec = 0;
+
+   if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, (void *), sizeof(tv)) == 
-1) {
+   perror("setsockopt 'SO_RCVTIMEO'");
+   exit(EXIT_FAILURE);
+   }
+
+   if ((read(fd, , sizeof(dummy)) != -1) ||
+   (errno != EAGAIN)) {
+   perror("EAGAIN expected");
+   exit(EXIT_FAILURE);
+   }


The patch LGTM, maybe the only thing I would add here is a check on the 
time spent in the read(), to see that it is approximately the timeout we 
have set.


Thanks,
Stefano

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v1 1/2] vdpa: Add support for querying vendor statistics

2022-03-15 Thread Si-Wei Liu



On 3/13/2022 11:25 PM, Jason Wang wrote:



On Sun, Mar 13, 2022 at 11:26 PM Eli Cohen  wrote:

> On 3/8/2022 9:07 PM, Eli Cohen wrote:
> >
> >> -Original Message-
> >> From: Si-Wei Liu 
> >> Sent: Wednesday, March 9, 2022 5:33 AM
> >> To: Eli Cohen 
> >> Cc: m...@redhat.com; jasow...@redhat.com;
virtualization@lists.linux-
> >> foundation.org

;
epere...@redhat.com; amore...@redhat.com;
> >> lviv...@redhat.com; sgarz...@redhat.com; Parav Pandit

> >> Subject: Re: [PATCH v1 1/2] vdpa: Add support for querying
vendor statistics
> >>
> >>
> >>
> >> On 3/8/2022 6:13 AM, Eli Cohen wrote:
>  -Original Message-
>  From: Si-Wei Liu 
>  Sent: Tuesday, March 8, 2022 8:16 AM
>  To: Eli Cohen 
>  Cc: m...@redhat.com; jasow...@redhat.com;
virtualization@lists.linux-
>  foundation.org

;
epere...@redhat.com; amore...@redhat.com;
>  lviv...@redhat.com; sgarz...@redhat.com; Parav Pandit
>  
>  Subject: Re: [PATCH v1 1/2] vdpa: Add support for querying
vendor
>  statistics
> 
> 
> 
>  On 3/6/2022 11:57 PM, Eli Cohen wrote:
> >> -Original Message-
> >> From: Si-Wei Liu 
> >> Sent: Saturday, March 5, 2022 12:34 AM
> >> To: Eli Cohen 
> >> Cc: m...@redhat.com; jasow...@redhat.com;
> >> virtualization@lists.linux- foundation.org

;
epere...@redhat.com;
> >> amore...@redhat.com; lviv...@redhat.com; sgarz...@redhat.com;
> >> Parav
> >> Pandit 
> >> Subject: Re: [PATCH v1 1/2] vdpa: Add support for
querying vendor
> >> statistics
> >>
> >> Sorry, I somehow missed this after my break. Please see
comments in
> >> line.
> >> On 2/16/2022 10:46 PM, Eli Cohen wrote:
> >>> On Wed, Feb 16, 2022 at 10:49:26AM -0800, Si-Wei Liu wrote:
>  On 2/16/2022 12:00 AM, Eli Cohen wrote:
> > Allows to read vendor statistics of a vdpa device. The
specific
> > statistics data is received by the upstream driver in
the form
> > of an (attribute name, attribute value) pairs.
> >
> > An example of statistics for mlx5_vdpa device are:
> >
> > received_desc - number of descriptors received by the
virtqueue
> > completed_desc - number of descriptors completed by the
> > virtqueue
> >
> > A descriptor using indirect buffers is still counted
as 1. In
> > addition, N chained descriptors are counted correctly
N times as
> > one
> >> would expect.
> > A new callback was added to vdpa_config_ops which
provides the
> > means for the vdpa driver to return statistics results.
> >
> > The interface allows for reading all the supported
virtqueues,
> > including the control virtqueue if it exists.
> >
> > Below are some examples taken from mlx5_vdpa which are
> > introduced in the following patch:
> >
> > 1. Read statistics for the virtqueue at index 1
> >
> > $ vdpa dev vstats show vdpa-a qidx 1
> > vdpa-a:
> > queue_type tx queue_index 1 received_desc 3844836
> >> completed_desc
> > 3844836
> >
> > 2. Read statistics for the virtqueue at index 32 $
vdpa dev
> > vstats show vdpa-a qidx 32
> > vdpa-a:
> > queue_type control_vq queue_index 32 received_desc 62
> > completed_desc
> > 62
> >
> > 3. Read statisitics for the virtqueue at index 0 with json
> > output $ vdpa -j dev vstats show vdpa-a qidx 0
> > {"vstats":{"vdpa-a":{
> >
> >>
"queue_type":"rx","queue_index":0,"name":"received_desc","value":41
> >> 77
> >> 76,\
> >  "name":"completed_desc","value":417548}}}
> >
> > 4. Read statistics for the virtqueue at index 0 with
preety json
> > output $ vdpa -jp dev vstats show vdpa-a qidx 0 {
> >          "vstats": {
> > "vdpa-a": {
> >
> > "queue_type": "rx",
>  I wonder where this info can be inferred? I don't see
relevant
>  

Re: [PATCH v1 1/2] vdpa: Add support for querying vendor statistics

2022-03-15 Thread Si-Wei Liu



On 3/13/2022 8:26 AM, Eli Cohen wrote:

On 3/8/2022 9:07 PM, Eli Cohen wrote:

-Original Message-
From: Si-Wei Liu 
Sent: Wednesday, March 9, 2022 5:33 AM
To: Eli Cohen 
Cc: m...@redhat.com; jasow...@redhat.com; virtualization@lists.linux-
foundation.org; epere...@redhat.com; amore...@redhat.com;
lviv...@redhat.com; sgarz...@redhat.com; Parav Pandit 
Subject: Re: [PATCH v1 1/2] vdpa: Add support for querying vendor statistics



On 3/8/2022 6:13 AM, Eli Cohen wrote:

-Original Message-
From: Si-Wei Liu 
Sent: Tuesday, March 8, 2022 8:16 AM
To: Eli Cohen 
Cc: m...@redhat.com; jasow...@redhat.com; virtualization@lists.linux-
foundation.org; epere...@redhat.com; amore...@redhat.com;
lviv...@redhat.com; sgarz...@redhat.com; Parav Pandit

Subject: Re: [PATCH v1 1/2] vdpa: Add support for querying vendor
statistics



On 3/6/2022 11:57 PM, Eli Cohen wrote:

-Original Message-
From: Si-Wei Liu 
Sent: Saturday, March 5, 2022 12:34 AM
To: Eli Cohen 
Cc: m...@redhat.com; jasow...@redhat.com;
virtualization@lists.linux- foundation.org; epere...@redhat.com;
amore...@redhat.com; lviv...@redhat.com; sgarz...@redhat.com;

Parav

Pandit 
Subject: Re: [PATCH v1 1/2] vdpa: Add support for querying vendor
statistics

Sorry, I somehow missed this after my break. Please see comments in

line.

On 2/16/2022 10:46 PM, Eli Cohen wrote:

On Wed, Feb 16, 2022 at 10:49:26AM -0800, Si-Wei Liu wrote:

On 2/16/2022 12:00 AM, Eli Cohen wrote:

Allows to read vendor statistics of a vdpa device. The specific
statistics data is received by the upstream driver in the form
of an (attribute name, attribute value) pairs.

An example of statistics for mlx5_vdpa device are:

received_desc - number of descriptors received by the virtqueue
completed_desc - number of descriptors completed by the
virtqueue

A descriptor using indirect buffers is still counted as 1. In
addition, N chained descriptors are counted correctly N times as
one

would expect.

A new callback was added to vdpa_config_ops which provides the
means for the vdpa driver to return statistics results.

The interface allows for reading all the supported virtqueues,
including the control virtqueue if it exists.

Below are some examples taken from mlx5_vdpa which are
introduced in the following patch:

1. Read statistics for the virtqueue at index 1

$ vdpa dev vstats show vdpa-a qidx 1
vdpa-a:
queue_type tx queue_index 1 received_desc 3844836

completed_desc

3844836

2. Read statistics for the virtqueue at index 32 $ vdpa dev
vstats show vdpa-a qidx 32
vdpa-a:
queue_type control_vq queue_index 32 received_desc 62
completed_desc
62

3. Read statisitics for the virtqueue at index 0 with json
output $ vdpa -j dev vstats show vdpa-a qidx 0
{"vstats":{"vdpa-a":{


"queue_type":"rx","queue_index":0,"name":"received_desc","value":41
77
76,\

   "name":"completed_desc","value":417548}}}

4. Read statistics for the virtqueue at index 0 with preety json
output $ vdpa -jp dev vstats show vdpa-a qidx 0 {
  "vstats": {
  "vdpa-a": {

  "queue_type": "rx",

I wonder where this info can be inferred? I don't see relevant
change in the patch series that helps gather the

VDPA_ATTR_DEV_QUEUE_TYPE?

Is this an arbitrary string defined by the vendor as well? If so,
how does the user expect to consume it?

The queue tupe is deduced from the index and whether we have a
virtqueue. Even numbers are rx, odd numbers are tx and if there is
CVQ, the last one is CVQ.

OK, then VDPA_ATTR_DEV_QUEUE_TYPE attribute introduced in this
patch might not be useful at all?

Right, will remove.


And how do you determine in the vdpa tool if CVQ is negotiated or
not?

I make a netlink call to get the same information as " vdpa dev config

show"

retrieves. I use the negotiated features to determine if a CVQ is
available. If it is, the number of VQs equals the control VQ index.
So there are two netlink calls under the hood.
The lock vdpa_dev_mutex won't hold across the two separate netlink
calls, and it may end up with inconsistent state - theoretically
things could happen like that the first call gets CVQ negotiated, but
the later call for
get_vendor_vq_stats() on the cvq might get -EINVAL due to device
reset. Can the negotiated status and stat query be done within one single

netlink call?

I see your concern.
The only reason I do the extra call is to know if we have a control VQ
and what index it is, just to print a descriptive string telling if it's a 
either rx,

tx or control VQ.

So the cure can be simple. Let's have a new attribute that returns the
type of virtqueue.

I am not sure I follow the cure. Wouldn't it be possible to get both negotiated
status and the queue stat in vdpa_nl_cmd_dev_stats_get_doit() under the
same vdpa_dev_mutex lock?

Yes we can, but I suggested to get only the type of the queue as a new 
attribute.
The kernel will do the digest and decide per a given VQ if it's rx, tx or 
control and
return the result in that new attribute.