Re: [PATCH 1/1] vhost: Provide a kernel warning if mutex is held whilst clean-up in progress

2022-03-03 Thread Stefano Garzarella

On Thu, Mar 03, 2022 at 04:01:06PM -0500, Michael S. Tsirkin wrote:

On Thu, Mar 03, 2022 at 09:14:36PM +0200, Leon Romanovsky wrote:

On Thu, Mar 03, 2022 at 03:19:29PM +, Lee Jones wrote:
> All workers/users should be halted before any clean-up takes place.
>
> Suggested-by:  Michael S. Tsirkin 
> Signed-off-by: Lee Jones 
> ---
>  drivers/vhost/vhost.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index bbaff6a5e21b8..d935d2506963f 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -693,6 +693,9 @@ void vhost_dev_cleanup(struct vhost_dev *dev)
>int i;
>
>for (i = 0; i < dev->nvqs; ++i) {
> +  /* Ideally all workers should be stopped prior to clean-up */
> +  /* Ideally all workers should be stopped prior to clean-up */
> +  WARN_ON(mutex_is_locked(&dev->vqs[i]->mutex));
> +
>    mutex_lock(&dev->vqs[i]->mutex);

I know nothing about vhost, but this construction and patch look
strange to me.

If all workers were stopped, you wouldn't need mutex_lock(). The mutex_lock
here suggests to me that workers can still run.

Thanks



"Ideally" here is misleading, we need a bigger detailed comment
along the lines of:

/*
* By design, no workers can run here. But if there's a bug and the
* driver did not flush all work properly then they might, and we
* encountered such bugs in the past.  With no proper flush guest won't
* work correctly but avoiding host memory corruption in this case
* sounds like a good idea.
*/


Can we use vhost_vq_get_backend() to check this situation?

IIUC all the vhost devices clear the backend to stop the workers.
This is not racy (if we do it after the mutex_lock) and should cover all 
cases.
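
For illustration, a minimal sketch of what that suggestion could look like
inside vhost_dev_cleanup(), assuming the existing vhost_vq_get_backend()
helper (it returns the vq's private_data); this is only a sketch of the
idea, not a submitted patch:

    for (i = 0; i < dev->nvqs; ++i) {
            mutex_lock(&dev->vqs[i]->mutex);
            /*
             * Every vhost device clears the backend (vq->private_data)
             * when it stops its workers, so a non-NULL backend here means
             * a worker may still be running while we clean up.
             */
            WARN_ON(vhost_vq_get_backend(dev->vqs[i]));
            /* ... existing per-vq clean-up ... */
            mutex_unlock(&dev->vqs[i]->mutex);
    }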


Thanks,
Stefano



Re: [PATCH 1/1] vhost: Provide a kernel warning if mutex is held whilst clean-up in progress

2022-03-03 Thread Leon Romanovsky
On Thu, Mar 03, 2022 at 04:01:06PM -0500, Michael S. Tsirkin wrote:
> On Thu, Mar 03, 2022 at 09:14:36PM +0200, Leon Romanovsky wrote:
> > On Thu, Mar 03, 2022 at 03:19:29PM +, Lee Jones wrote:
> > > All workers/users should be halted before any clean-up takes place.
> > > 
> > > Suggested-by:  Michael S. Tsirkin 
> > > Signed-off-by: Lee Jones 
> > > ---
> > >  drivers/vhost/vhost.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > > 
> > > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> > > index bbaff6a5e21b8..d935d2506963f 100644
> > > --- a/drivers/vhost/vhost.c
> > > +++ b/drivers/vhost/vhost.c
> > > @@ -693,6 +693,9 @@ void vhost_dev_cleanup(struct vhost_dev *dev)
> > >   int i;
> > >  
> > >   for (i = 0; i < dev->nvqs; ++i) {
> > > + /* Ideally all workers should be stopped prior to clean-up */
> > > + WARN_ON(mutex_is_locked(&dev->vqs[i]->mutex));
> > > +
> > >   mutex_lock(&dev->vqs[i]->mutex);
> > 
> > I know nothing about vhost, but this construction and patch look
> > strange to me.
> > 
> > If all workers were stopped, you wouldn't need mutex_lock(). The mutex_lock
> > here suggests to me that workers can still run.
> > 
> > Thanks
> 
> 
> "Ideally" here is misleading, we need a bigger detailed comment
> along the lines of:
> 
> /* 
>  * By design, no workers can run here. But if there's a bug and the
>  * driver did not flush all work properly then they might, and we
>  * encountered such bugs in the past.  With no proper flush guest won't
>  * work correctly but avoiding host memory corruption in this case
>  * sounds like a good idea.
>  */

This description looks better, but the check is inherently racy.
Why don't you add a comment and mutex_lock()? The WARN_ON here is
more of a distraction than actual help.

Thanks

> 
> > >   if (dev->vqs[i]->error_ctx)
> > >   eventfd_ctx_put(dev->vqs[i]->error_ctx);
> > > -- 
> > > 2.35.1.574.g5d30c73bfb-goog
> > > 
> 


Re: [PATCH 1/1] vhost: Protect the virtqueue from being cleared whilst still in use

2022-03-03 Thread Michael S. Tsirkin
On Wed, Mar 02, 2022 at 07:54:21AM +, Lee Jones wrote:
> vhost_vsock_handle_tx_kick() already holds the mutex during its call
> to vhost_get_vq_desc().  All we have to do is take the same lock
> during virtqueue clean-up and we mitigate the reported issues.
> 
> Link: https://syzkaller.appspot.com/bug?extid=279432d30d825e63ba00
> 
> Cc: 
> Reported-by: syzbot+adc3cb32385586bec...@syzkaller.appspotmail.com
> Signed-off-by: Lee Jones 

So combine with the warning patch and update description with
the comment I posted, explaining it's more a just in case thing.
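
A rough sketch of how the combined hunk could end up looking (illustrative
only; the actual squashed patch may differ):

    for (i = 0; i < dev->nvqs; ++i) {
            /*
             * By design, no workers can run here. But if there's a bug and
             * the driver did not flush all work properly then they might,
             * and we encountered such bugs in the past. With no proper
             * flush the guest won't work correctly, but avoiding host
             * memory corruption in this case sounds like a good idea.
             */
            WARN_ON(mutex_is_locked(&dev->vqs[i]->mutex));

            mutex_lock(&dev->vqs[i]->mutex);
            if (dev->vqs[i]->error_ctx)
                    eventfd_ctx_put(dev->vqs[i]->error_ctx);
            /* ... rest of the per-vq clean-up ... */
            vhost_vq_reset(dev, dev->vqs[i]);
            mutex_unlock(&dev->vqs[i]->mutex);
    }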

> ---
>  drivers/vhost/vhost.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index 59edb5a1ffe28..bbaff6a5e21b8 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -693,6 +693,7 @@ void vhost_dev_cleanup(struct vhost_dev *dev)
>   int i;
>  
>   for (i = 0; i < dev->nvqs; ++i) {
> + mutex_lock(&dev->vqs[i]->mutex);
>   if (dev->vqs[i]->error_ctx)
>   eventfd_ctx_put(dev->vqs[i]->error_ctx);
>   if (dev->vqs[i]->kick)
> @@ -700,6 +701,7 @@ void vhost_dev_cleanup(struct vhost_dev *dev)
>   if (dev->vqs[i]->call_ctx.ctx)
>   eventfd_ctx_put(dev->vqs[i]->call_ctx.ctx);
>   vhost_vq_reset(dev, dev->vqs[i]);
> + mutex_unlock(&dev->vqs[i]->mutex);
>   }
>   vhost_dev_free_iovecs(dev);
>   if (dev->log_ctx)
> -- 
> 2.35.1.574.g5d30c73bfb-goog



Re: [PATCH v2 09/14] vhost: Add VhostIOVATree

2022-03-03 Thread Jason Wang
On Fri, Mar 4, 2022 at 12:33 AM Eugenio Perez Martin
 wrote:
>
> On Mon, Feb 28, 2022 at 8:06 AM Jason Wang  wrote:
> >
> >
> > On 2022/2/27 21:41, Eugenio Pérez wrote:
> > > This tree is able to look for a translated address from an IOVA address.
> > >
> > > At first glance it is similar to util/iova-tree. However, SVQ working on
> > > devices with limited IOVA space needs more capabilities, like allocating
> > > IOVA chunks or performing reverse translations (qemu addresses to iova).
> > >
> > > The allocation capability, as "assign a free IOVA address to this chunk
> > > of memory in qemu's address space" allows shadow virtqueue to create a
> > > new address space that is not restricted by guest's addressable one, so
> > > we can allocate shadow vqs vrings outside of it.
> > >
> > > It duplicates the tree so it can search efficiently in both directions,
> > > and it will signal overlap if iova or the translated address is present
> > > in any tree.
> > >
> > > Signed-off-by: Eugenio Pérez 
> > > ---
> > >   hw/virtio/vhost-iova-tree.h |  27 +++
> > >   hw/virtio/vhost-iova-tree.c | 155 
> > >   hw/virtio/meson.build   |   2 +-
> > >   3 files changed, 183 insertions(+), 1 deletion(-)
> > >   create mode 100644 hw/virtio/vhost-iova-tree.h
> > >   create mode 100644 hw/virtio/vhost-iova-tree.c
> > >
> > > diff --git a/hw/virtio/vhost-iova-tree.h b/hw/virtio/vhost-iova-tree.h
> > > new file mode 100644
> > > index 00..6a4f24e0f9
> > > --- /dev/null
> > > +++ b/hw/virtio/vhost-iova-tree.h
> > > @@ -0,0 +1,27 @@
> > > +/*
> > > + * vhost software live migration iova tree
> > > + *
> > > + * SPDX-FileCopyrightText: Red Hat, Inc. 2021
> > > + * SPDX-FileContributor: Author: Eugenio Pérez 
> > > + *
> > > + * SPDX-License-Identifier: GPL-2.0-or-later
> > > + */
> > > +
> > > +#ifndef HW_VIRTIO_VHOST_IOVA_TREE_H
> > > +#define HW_VIRTIO_VHOST_IOVA_TREE_H
> > > +
> > > +#include "qemu/iova-tree.h"
> > > +#include "exec/memory.h"
> > > +
> > > +typedef struct VhostIOVATree VhostIOVATree;
> > > +
> > > +VhostIOVATree *vhost_iova_tree_new(uint64_t iova_first, uint64_t 
> > > iova_last);
> > > +void vhost_iova_tree_delete(VhostIOVATree *iova_tree);
> > > +G_DEFINE_AUTOPTR_CLEANUP_FUNC(VhostIOVATree, vhost_iova_tree_delete);
> > > +
> > > +const DMAMap *vhost_iova_tree_find_iova(const VhostIOVATree *iova_tree,
> > > +const DMAMap *map);
> > > +int vhost_iova_tree_map_alloc(VhostIOVATree *iova_tree, DMAMap *map);
> > > +void vhost_iova_tree_remove(VhostIOVATree *iova_tree, const DMAMap *map);
> > > +
> > > +#endif
> > > diff --git a/hw/virtio/vhost-iova-tree.c b/hw/virtio/vhost-iova-tree.c
> > > new file mode 100644
> > > index 00..03496ac075
> > > --- /dev/null
> > > +++ b/hw/virtio/vhost-iova-tree.c
> > > @@ -0,0 +1,155 @@
> > > +/*
> > > + * vhost software live migration iova tree
> > > + *
> > > + * SPDX-FileCopyrightText: Red Hat, Inc. 2021
> > > + * SPDX-FileContributor: Author: Eugenio Pérez 
> > > + *
> > > + * SPDX-License-Identifier: GPL-2.0-or-later
> > > + */
> > > +
> > > +#include "qemu/osdep.h"
> > > +#include "qemu/iova-tree.h"
> > > +#include "vhost-iova-tree.h"
> > > +
> > > +#define iova_min_addr qemu_real_host_page_size
> > > +
> > > +/**
> > > + * VhostIOVATree, able to:
> > > + * - Translate iova address
> > > + * - Reverse translate iova address (from translated to iova)
> > > + * - Allocate IOVA regions for translated range (linear operation)
> > > + */
> > > +struct VhostIOVATree {
> > > +/* First addressable iova address in the device */
> > > +uint64_t iova_first;
> > > +
> > > +/* Last addressable iova address in the device */
> > > +uint64_t iova_last;
> > > +
> > > +/* IOVA address to qemu memory maps. */
> > > +IOVATree *iova_taddr_map;
> > > +
> > > +/* QEMU virtual memory address to iova maps */
> > > +GTree *taddr_iova_map;
> > > +};
> > > +
> > > +static gint vhost_iova_tree_cmp_taddr(gconstpointer a, gconstpointer b,
> > > +  gpointer data)
> > > +{
> > > +const DMAMap *m1 = a, *m2 = b;
> > > +
> > > +if (m1->translated_addr > m2->translated_addr + m2->size) {
> > > +return 1;
> > > +}
> > > +
> > > +if (m1->translated_addr + m1->size < m2->translated_addr) {
> > > +return -1;
> > > +}
> > > +
> > > +/* Overlapped */
> > > +return 0;
> > > +}
> > > +
> > > +/**
> > > + * Create a new IOVA tree
> > > + *
> > > + * Returns the new IOVA tree
> > > + */
> > > +VhostIOVATree *vhost_iova_tree_new(hwaddr iova_first, hwaddr iova_last)
> > > +{
> > > +VhostIOVATree *tree = g_new(VhostIOVATree, 1);
> > > +
> > > +/* Some devices do not like 0 addresses */
> > > +tree->iova_first = MAX(iova_first, iova_min_addr);
> > > +tree->iova_last = iova_last;
> > > +
> > > +tree->iova_taddr_map = iova_tree_new();
> > > +tree->taddr_iova_map = 
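
[The quoted patch is truncated above.]  For orientation, a hedged usage
sketch of the API declared in vhost-iova-tree.h: the DMAMap fields and the
IOVA_OK return code come from QEMU's util iova-tree, while qemu_buf and
buf_size are made-up names for this illustration, not code from the patch:

    VhostIOVATree *tree = vhost_iova_tree_new(iova_first, iova_last);
    DMAMap map = {
        .translated_addr = (hwaddr)(uintptr_t)qemu_buf, /* qemu VA to map */
        .size = buf_size - 1,                           /* inclusive size */
        .perm = IOMMU_RW,
    };

    /* Forward direction: allocate a free iova range for this qemu buffer. */
    if (vhost_iova_tree_map_alloc(tree, &map) != IOVA_OK) {
        /* no free range, or it overlaps an already mapped region */
    }

    /* Reverse direction: look the mapping up again by its qemu address. */
    const DMAMap *found = vhost_iova_tree_find_iova(tree, &map);
    /* found->iova would hold the iova assigned by map_alloc above. */

    vhost_iova_tree_remove(tree, &map);
    vhost_iova_tree_delete(tree);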

Re: [PATCH v2 02/14] vhost: Add Shadow VirtQueue kick forwarding capabilities

2022-03-03 Thread Jason Wang
On Thu, Mar 3, 2022 at 5:25 PM Eugenio Perez Martin  wrote:
>
> On Thu, Mar 3, 2022 at 8:12 AM Jason Wang  wrote:
> >
> >
> > On 2022/3/2 02:49, Eugenio Perez Martin wrote:
> > > On Mon, Feb 28, 2022 at 3:57 AM Jason Wang  wrote:
> > >> On 2022/2/27 21:40, Eugenio Pérez wrote:
> > >>> In this mode no buffer forwarding will be performed in SVQ mode: Qemu
> > >>> will just forward the guest's kicks to the device.
> > >>>
> > >>> Host memory notifier regions are left out for simplicity, and they will
> > >>> not be addressed in this series.
> > >>>
> > >>> Signed-off-by: Eugenio Pérez
> > >>> ---
> > >>>hw/virtio/vhost-shadow-virtqueue.h |  14 +++
> > >>>include/hw/virtio/vhost-vdpa.h |   4 +
> > >>>hw/virtio/vhost-shadow-virtqueue.c |  52 +++
> > >>>hw/virtio/vhost-vdpa.c | 145 
> > >>> -
> > >>>4 files changed, 213 insertions(+), 2 deletions(-)
> > >>>
> > >>> diff --git a/hw/virtio/vhost-shadow-virtqueue.h 
> > >>> b/hw/virtio/vhost-shadow-virtqueue.h
> > >>> index f1519e3c7b..1cbc87d5d8 100644
> > >>> --- a/hw/virtio/vhost-shadow-virtqueue.h
> > >>> +++ b/hw/virtio/vhost-shadow-virtqueue.h
> > >>> @@ -18,8 +18,22 @@ typedef struct VhostShadowVirtqueue {
> > >>>EventNotifier hdev_kick;
> > >>>/* Shadow call notifier, sent to vhost */
> > >>>EventNotifier hdev_call;
> > >>> +
> > >>> +/*
> > >>> + * Borrowed virtqueue's guest to host notifier. To borrow it in 
> > >>> this event
> > >>> + * notifier allows to recover the VhostShadowVirtqueue from the 
> > >>> event loop
> > >>> + * easily. If we use the VirtQueue's one, we don't have an easy 
> > >>> way to
> > >>> + * retrieve VhostShadowVirtqueue.
> > >>> + *
> > >>> + * So shadow virtqueue must not clean it, or we would lose 
> > >>> VirtQueue one.
> > >>> + */
> > >>> +EventNotifier svq_kick;
> > >>>} VhostShadowVirtqueue;
> > >>>
> > >>> +void vhost_svq_set_svq_kick_fd(VhostShadowVirtqueue *svq, int 
> > >>> svq_kick_fd);
> > >>> +
> > >>> +void vhost_svq_stop(VhostShadowVirtqueue *svq);
> > >>> +
> > >>>VhostShadowVirtqueue *vhost_svq_new(void);
> > >>>
> > >>>void vhost_svq_free(gpointer vq);
> > >>> diff --git a/include/hw/virtio/vhost-vdpa.h 
> > >>> b/include/hw/virtio/vhost-vdpa.h
> > >>> index 3ce79a646d..009a9f3b6b 100644
> > >>> --- a/include/hw/virtio/vhost-vdpa.h
> > >>> +++ b/include/hw/virtio/vhost-vdpa.h
> > >>> @@ -12,6 +12,8 @@
> > >>>#ifndef HW_VIRTIO_VHOST_VDPA_H
> > >>>#define HW_VIRTIO_VHOST_VDPA_H
> > >>>
> > >>> +#include 
> > >>> +
> > >>>#include "hw/virtio/virtio.h"
> > >>>#include "standard-headers/linux/vhost_types.h"
> > >>>
> > >>> @@ -27,6 +29,8 @@ typedef struct vhost_vdpa {
> > >>>bool iotlb_batch_begin_sent;
> > >>>MemoryListener listener;
> > >>>struct vhost_vdpa_iova_range iova_range;
> > >>> +bool shadow_vqs_enabled;
> > >>> +GPtrArray *shadow_vqs;
> > >>>struct vhost_dev *dev;
> > >>>VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
> > >>>} VhostVDPA;
> > >>> diff --git a/hw/virtio/vhost-shadow-virtqueue.c 
> > >>> b/hw/virtio/vhost-shadow-virtqueue.c
> > >>> index 019cf1950f..a5d0659f86 100644
> > >>> --- a/hw/virtio/vhost-shadow-virtqueue.c
> > >>> +++ b/hw/virtio/vhost-shadow-virtqueue.c
> > >>> @@ -11,6 +11,56 @@
> > >>>#include "hw/virtio/vhost-shadow-virtqueue.h"
> > >>>
> > >>>#include "qemu/error-report.h"
> > >>> +#include "qemu/main-loop.h"
> > >>> +#include "linux-headers/linux/vhost.h"
> > >>> +
> > >>> +/** Forward guest notifications */
> > >>> +static void vhost_handle_guest_kick(EventNotifier *n)
> > >>> +{
> > >>> +VhostShadowVirtqueue *svq = container_of(n, VhostShadowVirtqueue,
> > >>> + svq_kick);
> > >>> +event_notifier_test_and_clear(n);
> > >>> +event_notifier_set(&svq->hdev_kick);
> > >>> +}
> > >>> +
> > >>> +/**
> > >>> + * Set a new file descriptor for the guest to kick the SVQ and notify 
> > >>> for avail
> > >>> + *
> > >>> + * @svq  The svq
> > >>> + * @svq_kick_fd  The svq kick fd
> > >>> + *
> > >>> + * Note that the SVQ will never close the old file descriptor.
> > >>> + */
> > >>> +void vhost_svq_set_svq_kick_fd(VhostShadowVirtqueue *svq, int 
> > >>> svq_kick_fd)
> > >>> +{
> > >>> +EventNotifier *svq_kick = &svq->svq_kick;
> > >>> +bool poll_stop = VHOST_FILE_UNBIND != 
> > >>> event_notifier_get_fd(svq_kick);
> > >> I wonder if this is robust. E.g is there any chance that may end up with
> > >> both poll_stop and poll_start are false?
> > >>
> > > I cannot make that happen in qemu, but the function supports that case
> > > well: It will do nothing. It's more or less the same code as used in
> > > the vhost kernel, and is the expected behaviour if you send two
> > > VHOST_FILE_UNBIND one right after another to me.
> >
> >
> > I would think it's just stop twice.
> >
> >
> > >
> > >> If not, can we simple 
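
[The quoted message is truncated above.]  For readers following the
poll_stop/poll_start question, here is a hedged reconstruction of the logic
under discussion, based on the hunk shown earlier and QEMU's event notifier
API; it is not a verbatim quote of the patch:

    EventNotifier *svq_kick = &svq->svq_kick;
    /* Are we currently polling a real fd? */
    bool poll_stop = VHOST_FILE_UNBIND != event_notifier_get_fd(svq_kick);
    /* Should we start polling the new fd? */
    bool poll_start = svq_kick_fd != VHOST_FILE_UNBIND;

    if (poll_stop) {
        event_notifier_set_handler(svq_kick, NULL);   /* stop polling old fd */
    }

    event_notifier_init_fd(svq_kick, svq_kick_fd);

    if (poll_start) {
        /* Catch kicks that may have arrived while the fd was being switched. */
        event_notifier_set(svq_kick);
        event_notifier_set_handler(svq_kick, vhost_handle_guest_kick);
    }
    /* Two consecutive VHOST_FILE_UNBINDs: both flags are false, nothing to do. */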

Re: [PATCH 1/1] vhost: Provide a kernel warning if mutex is held whilst clean-up in progress

2022-03-03 Thread Michael S. Tsirkin
On Thu, Mar 03, 2022 at 09:14:36PM +0200, Leon Romanovsky wrote:
> On Thu, Mar 03, 2022 at 03:19:29PM +, Lee Jones wrote:
> > All workers/users should be halted before any clean-up takes place.
> > 
> > Suggested-by:  Michael S. Tsirkin 
> > Signed-off-by: Lee Jones 
> > ---
> >  drivers/vhost/vhost.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> > index bbaff6a5e21b8..d935d2506963f 100644
> > --- a/drivers/vhost/vhost.c
> > +++ b/drivers/vhost/vhost.c
> > @@ -693,6 +693,9 @@ void vhost_dev_cleanup(struct vhost_dev *dev)
> > int i;
> >  
> > for (i = 0; i < dev->nvqs; ++i) {
> > +   /* Ideally all workers should be stopped prior to clean-up */
> > +   WARN_ON(mutex_is_locked(&dev->vqs[i]->mutex));
> > +
> > mutex_lock(&dev->vqs[i]->mutex);
> 
> I know nothing about vhost, but this construction and patch look
> strange to me.
> 
> If all workers were stopped, you wouldn't need mutex_lock(). The mutex_lock
> here suggests to me that workers can still run.
> 
> Thanks


"Ideally" here is misleading, we need a bigger detailed comment
along the lines of:

/* 
 * By design, no workers can run here. But if there's a bug and the
 * driver did not flush all work properly then they might, and we
 * encountered such bugs in the past.  With no proper flush guest won't
 * work correctly but avoiding host memory corruption in this case
 * sounds like a good idea.
 */

> > if (dev->vqs[i]->error_ctx)
> > eventfd_ctx_put(dev->vqs[i]->error_ctx);
> > -- 
> > 2.35.1.574.g5d30c73bfb-goog
> > 



Re: [PATCH 1/1] vhost: Provide a kernel warning if mutex is held whilst clean-up in progress

2022-03-03 Thread Lee Jones
On Thu, 03 Mar 2022, Leon Romanovsky wrote:

> On Thu, Mar 03, 2022 at 03:19:29PM +, Lee Jones wrote:
> > All workers/users should be halted before any clean-up takes place.
> > 
> > Suggested-by:  Michael S. Tsirkin 
> > Signed-off-by: Lee Jones 
> > ---
> >  drivers/vhost/vhost.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> > index bbaff6a5e21b8..d935d2506963f 100644
> > --- a/drivers/vhost/vhost.c
> > +++ b/drivers/vhost/vhost.c
> > @@ -693,6 +693,9 @@ void vhost_dev_cleanup(struct vhost_dev *dev)
> > int i;
> >  
> > for (i = 0; i < dev->nvqs; ++i) {
> > +   /* Ideally all workers should be stopped prior to clean-up */
> > +   WARN_ON(mutex_is_locked(&dev->vqs[i]->mutex));
> > +
> > mutex_lock(&dev->vqs[i]->mutex);
> 
> I know nothing about vhost, but this construction and patch look
> strange to me.
> 
> If all workers were stopped, you wouldn't need mutex_lock(). The mutex_lock
> here suggests to me that workers can still run.

The suggestion for this patch came from the maintainer.

Please see the conversation here:

https://lore.kernel.org/all/20220302082021-mutt-send-email-...@kernel.org/

-- 
Lee Jones [李琼斯]
Principal Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog

Re: [PATCH 1/1] vhost: Provide a kernel warning if mutex is held whilst clean-up in progress

2022-03-03 Thread Leon Romanovsky
On Thu, Mar 03, 2022 at 03:19:29PM +, Lee Jones wrote:
> All workers/users should be halted before any clean-up takes place.
> 
> Suggested-by:  Michael S. Tsirkin 
> Signed-off-by: Lee Jones 
> ---
>  drivers/vhost/vhost.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index bbaff6a5e21b8..d935d2506963f 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -693,6 +693,9 @@ void vhost_dev_cleanup(struct vhost_dev *dev)
>   int i;
>  
>   for (i = 0; i < dev->nvqs; ++i) {
> + /* Ideally all workers should be stopped prior to clean-up */
> + WARN_ON(mutex_is_locked(&dev->vqs[i]->mutex));
> +
>   mutex_lock(&dev->vqs[i]->mutex);

I know nothing about vhost, but this construction and patch look
strange to me.

If all workers were stopped, you wouldn't need mutex_lock(). The mutex_lock
here suggests to me that workers can still run.

Thanks

>   if (dev->vqs[i]->error_ctx)
>   eventfd_ctx_put(dev->vqs[i]->error_ctx);
> -- 
> 2.35.1.574.g5d30c73bfb-goog
> 


Call for Posters - ACM HPDC 2022

2022-03-03 Thread Ali Anwar
==
  CALL FOR POSTERS
  The 31st International ACM Symposium on
High-Performance Parallel and Distributed Computing (HPDC’22)
 Minneapolis, Minnesota, United States, June 27 - July 1, 2022
 http://www.hpdc.org/2022/posters/call-for-posters/
==

=== Overview ===
The ACM International Symposium on High-Performance Parallel and
Distributed Computing (HPDC) will include a poster session with an
associated 2-page extended abstract, which will be included in the HPDC
2022 proceedings. The topics of interest follow the regular paper
submission and include (but are not limited to): cloud computing, cluster
computing, edge computing, big data, massively multicore, AI for systems,
and extreme-scale computing systems. Accepted posters will be given the
opportunity to present their work in a dedicated lightning-/blitz-session
(using in-person presentation or virtual presentation) during the main
conference.

=== Submission Guidelines ===
We invite authors to submit poster contributions in the form of a two-page
extended abstract and a poster draft. Abstracts should be formatted in the
ACM Proceedings Style and submitted via the poster submission web site (to
be announced). Each submission should include a title, all author names and
affiliations, notes whether any of the authors are students, and an
indication of which author is corresponding/presenting the material. Poster
abstracts will be reviewed by the poster program committee through a
single-blind process and will be evaluated based on the following criteria:
• Submissions must describe new and interesting ideas related to HPDC
topics of interest
• Submissions can present work in progress, but authors are strongly
encouraged to include preliminary experimental results, if available
• Student submissions meeting the above criteria will be given preference

=== Poster Chair ===
Dong Li, University of California, Merced, USA

=== Poster PC Members ===
Bogdan Nicolae, Argonne National Lab, USA
Min Si, Facebook, USA
Kento Sato, RIKEN Center for Computational Science (RIKEN R-CCS), Japan

=== Submission ===
Submissions are now open here (hpdc22-posters.hotcrp.com).

=== Deadlines (Times are AOE) ===
Abstracts due: April 4th, 2022
Author notifications: April 19th, 2022
Camera-ready Upload: April 21st, 2022
Poster date: June 27th, 2022

[PATCH 1/1] vhost: Provide a kernel warning if mutex is held whilst clean-up in progress

2022-03-03 Thread Lee Jones
All workers/users should be halted before any clean-up takes place.

Suggested-by:  Michael S. Tsirkin 
Signed-off-by: Lee Jones 
---
 drivers/vhost/vhost.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index bbaff6a5e21b8..d935d2506963f 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -693,6 +693,9 @@ void vhost_dev_cleanup(struct vhost_dev *dev)
int i;
 
for (i = 0; i < dev->nvqs; ++i) {
+   /* Ideally all workers should be stopped prior to clean-up */
+   WARN_ON(mutex_is_locked(&dev->vqs[i]->mutex));
+
    mutex_lock(&dev->vqs[i]->mutex);
if (dev->vqs[i]->error_ctx)
eventfd_ctx_put(dev->vqs[i]->error_ctx);
-- 
2.35.1.574.g5d30c73bfb-goog



Re: [PATCH 1/1] vhost: Protect the virtqueue from being cleared whilst still in use

2022-03-03 Thread Lee Jones
On Wed, 02 Mar 2022, Stefano Garzarella wrote:

> On Wed, Mar 02, 2022 at 04:49:17PM +, Lee Jones wrote:
> > On Wed, 02 Mar 2022, Michael S. Tsirkin wrote:
> > 
> > > On Wed, Mar 02, 2022 at 05:28:31PM +0100, Stefano Garzarella wrote:
> > > > On Wed, Mar 2, 2022 at 3:57 PM Lee Jones  wrote:
> > > > >
> > > > > On Wed, 02 Mar 2022, Michael S. Tsirkin wrote:
> > > > >
> > > > > > On Wed, Mar 02, 2022 at 01:56:35PM +, Lee Jones wrote:
> > > > > > > On Wed, 02 Mar 2022, Michael S. Tsirkin wrote:
> > > > > > >
> > > > > > > > On Wed, Mar 02, 2022 at 07:54:21AM +, Lee Jones wrote:
> > > > > > > > > vhost_vsock_handle_tx_kick() already holds the mutex during 
> > > > > > > > > its call
> > > > > > > > > to vhost_get_vq_desc().  All we have to do is take the same 
> > > > > > > > > lock
> > > > > > > > > during virtqueue clean-up and we mitigate the reported issues.
> > > > > > > > >
> > > > > > > > > Link: 
> > > > > > > > > https://syzkaller.appspot.com/bug?extid=279432d30d825e63ba00
> > > > > > > > >
> > > > > > > > > Cc: 
> > > > > > > > > Reported-by: 
> > > > > > > > > syzbot+adc3cb32385586bec...@syzkaller.appspotmail.com
> > > > > > > > > Signed-off-by: Lee Jones 
> > > > > > > > > ---
> > > > > > > > >  drivers/vhost/vhost.c | 2 ++
> > > > > > > > >  1 file changed, 2 insertions(+)
> > > > > > > > >
> > > > > > > > > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> > > > > > > > > index 59edb5a1ffe28..bbaff6a5e21b8 100644
> > > > > > > > > --- a/drivers/vhost/vhost.c
> > > > > > > > > +++ b/drivers/vhost/vhost.c
> > > > > > > > > @@ -693,6 +693,7 @@ void vhost_dev_cleanup(struct vhost_dev 
> > > > > > > > > *dev)
> > > > > > > > > int i;
> > > > > > > > >
> > > > > > > > > for (i = 0; i < dev->nvqs; ++i) {
> > > > > > > > > > +   mutex_lock(&dev->vqs[i]->mutex);
> > > > > > > > > if (dev->vqs[i]->error_ctx)
> > > > > > > > > 
> > > > > > > > > eventfd_ctx_put(dev->vqs[i]->error_ctx);
> > > > > > > > > if (dev->vqs[i]->kick)
> > > > > > > > > @@ -700,6 +701,7 @@ void vhost_dev_cleanup(struct vhost_dev 
> > > > > > > > > *dev)
> > > > > > > > > if (dev->vqs[i]->call_ctx.ctx)
> > > > > > > > > 
> > > > > > > > > eventfd_ctx_put(dev->vqs[i]->call_ctx.ctx);
> > > > > > > > > vhost_vq_reset(dev, dev->vqs[i]);
> > > > > > > > > > +   mutex_unlock(&dev->vqs[i]->mutex);
> > > > > > > > > }
> > > > > > > >
> > > > > > > > So this is a mitigation plan but the bug is still there though
> > > > > > > > we don't know exactly what it is.  I would prefer adding 
> > > > > > > > something like
> > > > > > > > WARN_ON(mutex_is_locked(vqs[i]->mutex) here - does this make 
> > > > > > > > sense?
> > > > > > >
> > > > > > > As a rework to this, or as a subsequent patch?
> > > > > >
> > > > > > Can be a separate patch.
> > > > > >
> > > > > > > Just before the first lock I assume?
> > > > > >
> > > > > > I guess so, yes.
> > > > >
> > > > > No problem.  Patch to follow.
> > > > >
> > > > > I'm also going to attempt to debug the root cause, but I'm new to this
> > > > > subsystem to it might take a while for me to get my head around.
> > > >
> > > > IIUC the root cause should be the same as the one we solved here:
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a58da53ffd70294ebea8ecd0eb45fd0d74add9f9
> > > >
> > > > The worker was not stopped before calling vhost_dev_cleanup(). So while
> > > > the worker was still running we were going to free memory or initialize
> > > > fields while it was still using virtqueue.
> > > 
> > > Right, and I agree, but it's not the root cause, though; we do attempt to
> > > stop all workers.
> > 
> > Exactly.  This is what happens, but the question I'm going to attempt
> > to answer is *why* does this happen.
> 
> IIUC the worker was still running because the /dev/vhost-vsock file was not
> explicitly closed, so vhost_vsock_dev_release() was called in the do_exit()
> of the process.
> 
> That is where the issue arose, because vhost_dev_check_owner() returned
> false in vhost_vsock_stop() since current->mm was NULL.
> So it returned earlier, without calling vhost_vq_set_backend(vq, NULL).
> 
> This did not stop the worker from continuing to run, causing the multiple
> issues we are seeing.
> 
> current->mm was NULL, because in the do_exit() the address space is cleaned
> in the exit_mm(), which is called before releasing the files into the
> exit_task_work().
> 
> This can be seen from the logs, where we see first the warnings printed by
> vhost_dev_cleanup() and then the panic in the worker (e.g. here
> https://syzkaller.appspot.com/text?tag=CrashLog&x=16a61fce70)
> 
> Mike also added a few more helpful details in this thread:
> https://lore.kernel.org/virtualization/20220221100500.2x3s2sddqahgdfyt@sgarzare-redhat/T/#ree61316eac63245c9ba3050b44330e4034282cc2

I guess that about sums it up. :)
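
To make the sequence above easier to follow, a simplified sketch of the path
Stefano describes (the function names are real, the bodies are abridged and
not the exact upstream code):

    /* do_exit() when /dev/vhost-vsock was never explicitly closed:
     *   exit_mm()        -> current->mm becomes NULL
     *   exit_task_work() -> vhost_vsock_dev_release() -> vhost_vsock_stop()
     */
    static void vhost_vsock_stop(struct vhost_vsock *vsock)
    {
            mutex_lock(&vsock->dev.mutex);
            if (vhost_dev_check_owner(&vsock->dev)) /* fails: current->mm is NULL */
                    goto out;                       /* backend is never cleared   */

            /* ... vhost_vq_set_backend(vq, NULL) would stop the worker here ... */
    out:
            mutex_unlock(&vsock->dev.mutex);
    }
    /* ... so vhost_dev_cleanup() later races with the still-running worker. */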

Re: [virtio-comment] [PATCH v5 1/2] virtio-vsock: add description for datagram type

2022-03-03 Thread Michael S. Tsirkin
On Thu, Mar 03, 2022 at 03:29:31AM +, Bobby Eshleman wrote:
> On Wed, Mar 02, 2022 at 05:09:58PM +0100, Stefano Garzarella wrote:
> > Hi Bobby,
> > Sorry for the delay, but I saw these patches today.
> > Please, can you keep me in CC?
> > 
> 
> Hey Stefano, sorry about that. I'm not sure how I lost your CC on this
> one. I'll make sure you are there moving forward.
> 
> I want to mention that I'm taking a look at
> https://gitlab.com/vsock/vsock/-/issues/1 in parallel with my dgram work
> here. After sending out this series we noticed potential overlap between
> the two issues. The additional dgram queues may become redundant if a
> fairness mechanism that solves issue #1 above also applies to
> connection-less protocols (similar to how the TC subsystem works). I've
> just begun sorting out potential solutions so no hard results yet. Just
> putting on your radar that the proposal here in v5 may be impacted if my
> investigation into issue #1 yields something adequate.


With respect to datagram, there is actually another issue which also
exists for stream but is smaller there - per message overhead is not
accounted for.  For stream we can work around that by copying payload
data.  For datagram we can't as we need to preserve message boundaries.
One way to address that is to add config for host/guest per-message
overhead, and have sender decrement fwd counter by that value for
each message sent.
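
As a rough illustration of that idea (the config field and the credit struct
here are hypothetical; this is not part of any spec or implementation):

    /* Hypothetical sender-side credit check for datagrams, where msg_overhead
     * would come from a new per-message-overhead field advertised by the peer.
     */
    static bool dgram_has_space(const struct vsock_credit *c,
                                uint32_t payload_len, uint32_t msg_overhead)
    {
            uint32_t peer_free = c->peer_buf_alloc - (c->tx_cnt - c->peer_fwd_cnt);

            /* Charge each message for its payload plus the per-message overhead. */
            return peer_free >= payload_len + msg_overhead;
    }
    /* On send, tx_cnt would advance by payload_len + msg_overhead as well. */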

-- 
MST



Re: [virtio-comment] [PATCH v5 1/2] virtio-vsock: add description for datagram type

2022-03-03 Thread Stefano Garzarella

On Thu, Mar 03, 2022 at 03:29:31AM +, Bobby Eshleman wrote:

On Wed, Mar 02, 2022 at 05:09:58PM +0100, Stefano Garzarella wrote:

Hi Bobby,
Sorry for the delay, but I saw these patches today.
Please, can you keep me in CC?



Hey Stefano, sorry about that. I'm not sure how I lost your CC on this
one. I'll make sure you are there moving forward.


No problem :-)



I want to mention that I'm taking a look at
https://gitlab.com/vsock/vsock/-/issues/1 in parallel with my dgram work
here. After sending out this series we noticed potential overlap between
the two issues. The additional dgram queues may become redundant if a
fairness mechanism that solves issue #1 above also applies to
connection-less protocols (similar to how the TC subsystem works). I've
just begun sorting out potential solutions so no hard results yet. Just
putting on your radar that the proposal here in v5 may be impacted if my
investigation into issue #1 yields something adequate.


Oh, this would be great!




On Thu, Feb 24, 2022 at 10:15:46PM +, beshleman.dev...@gmail.com wrote:
> From: Jiang Wang 
>


... snip ...


>
> virtio-vsock.tex | 63 +++-
> 1 file changed, 53 insertions(+), 10 deletions(-)
>
> diff --git a/virtio-vsock.tex b/virtio-vsock.tex
> index d79984d..1a66a1b 100644
> --- a/virtio-vsock.tex
> +++ b/virtio-vsock.tex
> @@ -9,11 +9,26 @@ \subsection{Device ID}\label{sec:Device Types / Socket 
Device / Device ID}
>
> \subsection{Virtqueues}\label{sec:Device Types / Socket Device / Virtqueues}
> \begin{description}
> -\item[0] rx
> -\item[1] tx
> +\item[0] stream rx
> +\item[1] stream tx
> +\item[2] datagram rx
> +\item[3] datagram tx
> +\item[4] event
> +\end{description}
> +The virtio socket device uses 5 queues if feature bit VIRTIO_VSOCK_F_DRGAM 
is set. Otherwise, it
> +only uses 3 queues, as the following.

We are also adding a new flag (VIRTIO_VSOCK_F_NO_IMPLIED_STREAM) to
make it possible to support, for example, only dgrams.

So I think we should consider the case where we have only DGRAM queues
(and it will be similar to the stream-only case, so 3 queues).

Maybe we could describe this part better and say that if we have both
STREAM (or SEQPACKET) and DGRAM set we have 5 queues, otherwise
only 3 queues.



Roger that.


> \subsubsection{Buffer Space Management}\label{sec:Device Types / Socket 
Device / Device Operation / Buffer Space Management}
> \field{buf_alloc} and \field{fwd_cnt} are used for buffer space management of
> stream sockets. The guest and the device publish how much buffer space is
> @@ -170,7 +193,7 @@ \subsubsection{Buffer Space Management}\label{sec:Device 
Types / Socket Device /
> u32 peer_free = peer_buf_alloc - (tx_cnt - peer_fwd_cnt);
> \end{lstlisting}
>
> -If there is insufficient buffer space, the sender waits until virtqueue 
buffers
> +For stream sockets, if there is insufficient buffer space, the sender waits 
until virtqueue buffers

stream and seqpacket

> are returned and checks \field{buf_alloc} and \field{fwd_cnt} again. Sending
> the VIRTIO_VSOCK_OP_CREDIT_REQUEST packet queries how much buffer space is
> available. The reply to this query is a VIRTIO_VSOCK_OP_CREDIT_UPDATE packet.
> @@ -178,22 +201,32 @@ \subsubsection{Buffer Space 
Management}\label{sec:Device Types / Socket Device /
> previously receiving a VIRTIO_VSOCK_OP_CREDIT_REQUEST packet. This allows
> communicating updates any time a change in buffer space occurs.
>
> +Unlike stream sockets, dgram sockets do not use VIRTIO_VSOCK_OP_CREDIT_UPDATE
> +or VIRTIO_VSOCK_OP_CREDIT_REQUEST packets. The dgram buffer management is 
split
> +into two parts: senders and receivers. For senders, the packet is dropped if 
the
> +virtqueue is full. For receivers, the packet is dropped if there is no space
> +in the receive buffer.
> +
> \drivernormative{\paragraph}{Device Operation: Buffer Space 
Management}{Device Types / Socket Device / Device Operation / Buffer Space 
Management}
> -VIRTIO_VSOCK_OP_RW data packets MUST only be transmitted when the peer has
> -sufficient free buffer space for the payload.
> +For stream sockets, VIRTIO_VSOCK_OP_RW data packets MUST only be transmitted 
when the peer has

stream and seqpacket

> +sufficient free buffer space for the payload. For dgram sockets, 
VIRTIO_VSOCK_OP_RW data packets
> +MAY be transmitted when the peer rx buffer is full. Then the packet will be 
dropped by the peer,
> +and driver will not get any notification.
>
> All packets associated with a stream flow MUST contain valid information in
> \field{buf_alloc} and \field{fwd_cnt} fields.
>
> \devicenormative{\paragraph}{Device Operation: Buffer Space 
Management}{Device Types / Socket Device / Device Operation / Buffer Space 
Management}
> -VIRTIO_VSOCK_OP_RW data packets MUST only be transmitted when the peer has
> -sufficient free buffer space for the payload.
> +For stream sockets, VIRTIO_VSOCK_OP_RW data packets MUST only be transmitted 
when the peer has