Re: [PATCH v2] vhost: introduce mdev based hardware backend

2019-10-23 Thread Tiwei Bie
On Wed, Oct 23, 2019 at 06:29:21PM +0800, Jason Wang wrote:
> On 2019/10/23 下午6:11, Tiwei Bie wrote:
> > On Wed, Oct 23, 2019 at 03:25:00PM +0800, Jason Wang wrote:
> > > On 2019/10/23 下午3:07, Tiwei Bie wrote:
> > > > On Wed, Oct 23, 2019 at 01:46:23PM +0800, Jason Wang wrote:
> > > > > On 2019/10/23 上午11:02, Tiwei Bie wrote:
> > > > > > On Tue, Oct 22, 2019 at 09:30:16PM +0800, Jason Wang wrote:
> > > > > > > On 2019/10/22 下午5:52, Tiwei Bie wrote:
> > > > > > > > This patch introduces a mdev based hardware vhost backend.
> > > > > > > > This backend is built on top of the same abstraction used
> > > > > > > > in virtio-mdev and provides a generic vhost interface for
> > > > > > > > userspace to accelerate the virtio devices in guest.
> > > > > > > > 
> > > > > > > > This backend is implemented as a mdev device driver on top
> > > > > > > > of the same mdev device ops used in virtio-mdev but using
> > > > > > > > a different mdev class id, and it will register the device
> > > > > > > > as a VFIO device for userspace to use. Userspace can setup
> > > > > > > > the IOMMU with the existing VFIO container/group APIs and
> > > > > > > > then get the device fd with the device name. After getting
> > > > > > > > the device fd of this device, userspace can use vhost ioctls
> > > > > > > > to setup the backend.
> > > > > > > > 
> > > > > > > > Signed-off-by: Tiwei Bie 
> > > > > > > > ---
> > > > > > > > This patch depends on below series:
> > > > > > > > https://lkml.org/lkml/2019/10/17/286
> > > > > > > > 
> > > > > > > > v1 -> v2:
> > > > > > > > - Replace _SET_STATE with _SET_STATUS (MST);
> > > > > > > > - Check status bits at each step (MST);
> > > > > > > > - Report the max ring size and max number of queues (MST);
> > > > > > > > - Add missing MODULE_DEVICE_TABLE (Jason);
> > > > > > > > - Only support the network backend w/o multiqueue for now;
> > > > > > > Any idea on how to extend it to support devices other than net? I 
> > > > > > > think we
> > > > > > > want a generic API or an API that could be made generic in the 
> > > > > > > future.
> > > > > > > 
> > > > > > > Do we want to e.g having a generic vhost mdev for all kinds of 
> > > > > > > devices or
> > > > > > > introducing e.g vhost-net-mdev and vhost-scsi-mdev?
> > > > > > One possible way is to do what vhost-user does. I.e. Apart from
> > > > > > the generic ring, features, ... related ioctls, we also introduce
> > > > > > device specific ioctls when we need them. As vhost-mdev just needs
> > > > > > to forward configs between parent and userspace and even won't
> > > > > > cache any info when possible,
> > > > > So it looks to me this is only possible if we expose e.g set_config 
> > > > > and
> > > > > get_config to userspace.
> > > > The set_config and get_config interface isn't really everything
> > > > of device specific settings. We also have ctrlq in virtio-net.
> > > 
> > > Yes, but it could be processed by the exist API. Isn't it? Just set ctrl 
> > > vq
> > > address and let parent to deal with that.
> > I mean how to expose ctrlq related settings to userspace?
> 
> 
> I think it works like:
> 
> 1) userspace find ctrl_vq is supported
> 
> 2) then it can allocate memory for ctrl vq and set its address through
> vhost-mdev
> 
> 3) userspace can populate ctrl vq itself

I see. That is to say, userspace e.g. QEMU will program the
ctrl vq with the existing VHOST_*_VRING_* ioctls, and parent
drivers should know that the addresses used in ctrl vq are
host virtual addresses in vhost-mdev's case.

> 
> 
> > 
> > > 
> > > > > > I think it might be better to do
> > > > > > this in one generic vhost-mdev module.
> > > > > Looking at definitions of VhostUserRequest in qemu, it mixed generic 
> > > > > API
> > > > > with device specific API. If we want go this ways (a generic 
> > > > > vhost-mdev),
> > > > > more questions needs to be answered:
> > > > > 
> > > > > 1) How could userspace know which type of vhost it would use? Do we 
> > > > > need to
> > > > > expose virtio subsystem device in for userspace this case?
> > > > > 
> > > > > 2) That generic vhost-mdev module still need to filter out unsupported
> > > > > ioctls for a specific type. E.g if it probes a net device, it should 
> > > > > refuse
> > > > > API for other type. This in fact a vhost-mdev-net but just not 
> > > > > modularize it
> > > > > on top of vhost-mdev.
> > > > > 
> > > > > 
> > > > > > > > - Some minor fixes and improvements;
> > > > > > > > - Rebase on top of virtio-mdev series v4;
> > > > [...]
> > > > > > > > +
> > > > > > > > +static long vhost_mdev_get_features(struct vhost_mdev *m, u64 
> > > > > > > > __user *featurep)
> > > > > > > > +{
> > > > > > > > +   if (copy_to_user(featurep, &m->features, 
> > > > > > > > sizeof(m->features)))
> > > > > > > > +   return -EFAULT;
> > > > > > > As discussed in previous version do we need to filter out MQ 
> > > > > > > feature here?
> > > > > > I think it's more straightforward to let the parent drive

[PATCH net-next] vringh: fix copy direction of vringh_iov_push_kern()

2019-10-23 Thread Jason Wang
We want to copy from iov to buf, so the direction was wrong.

Note: no real user for the helper, but it will be used by future
features.

Signed-off-by: Jason Wang 
---
 drivers/vhost/vringh.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
index 08ad0d1f0476..a0a2d74967ef 100644
--- a/drivers/vhost/vringh.c
+++ b/drivers/vhost/vringh.c
@@ -852,6 +852,12 @@ static inline int xfer_kern(void *src, void *dst, size_t 
len)
return 0;
 }
 
+static inline int kern_xfer(void *dst, void *src, size_t len)
+{
+   memcpy(dst, src, len);
+   return 0;
+}
+
 /**
  * vringh_init_kern - initialize a vringh for a kernelspace vring.
  * @vrh: the vringh to initialize.
@@ -958,7 +964,7 @@ EXPORT_SYMBOL(vringh_iov_pull_kern);
 ssize_t vringh_iov_push_kern(struct vringh_kiov *wiov,
 const void *src, size_t len)
 {
-   return vringh_iov_xfer(wiov, (void *)src, len, xfer_kern);
+   return vringh_iov_xfer(wiov, (void *)src, len, kern_xfer);
 }
 EXPORT_SYMBOL(vringh_iov_push_kern);
 
-- 
2.19.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH V5 4/6] mdev: introduce virtio device and its device ops

2019-10-23 Thread Jason Wang


On 2019/10/24 上午5:57, Alex Williamson wrote:

On Wed, 23 Oct 2019 21:07:50 +0800
Jason Wang  wrote:


This patch implements basic support for mdev driver that supports
virtio transport for kernel virtio driver.

Signed-off-by: Jason Wang 
---
  drivers/vfio/mdev/mdev_core.c|  20 
  drivers/vfio/mdev/mdev_private.h |   2 +
  include/linux/mdev.h |   6 ++
  include/linux/virtio_mdev_ops.h  | 159 +++
  4 files changed, 187 insertions(+)
  create mode 100644 include/linux/virtio_mdev_ops.h

diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
index 555bd61d8c38..9b00c3513120 100644
--- a/drivers/vfio/mdev/mdev_core.c
+++ b/drivers/vfio/mdev/mdev_core.c
@@ -76,6 +76,26 @@ const struct vfio_mdev_device_ops *mdev_get_vfio_ops(struct 
mdev_device *mdev)
  }
  EXPORT_SYMBOL(mdev_get_vfio_ops);
  
+/* Specify the virtio device ops for the mdev device, this

+ * must be called during create() callback for virtio mdev device.
+ */
+void mdev_set_virtio_ops(struct mdev_device *mdev,
+const struct virtio_mdev_device_ops *virtio_ops)
+{
+   mdev_set_class(mdev, MDEV_CLASS_ID_VIRTIO);
+   mdev->virtio_ops = virtio_ops;
+}
+EXPORT_SYMBOL(mdev_set_virtio_ops);
+
+/* Get the virtio device ops for the mdev device. */
+const struct virtio_mdev_device_ops *
+mdev_get_virtio_ops(struct mdev_device *mdev)
+{
+   WARN_ON(mdev->class_id != MDEV_CLASS_ID_VIRTIO);
+   return mdev->virtio_ops;
+}
+EXPORT_SYMBOL(mdev_get_virtio_ops);
+
  struct device *mdev_dev(struct mdev_device *mdev)
  {
return &mdev->dev;
diff --git a/drivers/vfio/mdev/mdev_private.h b/drivers/vfio/mdev/mdev_private.h
index 0770410ded2a..7b47890c34e7 100644
--- a/drivers/vfio/mdev/mdev_private.h
+++ b/drivers/vfio/mdev/mdev_private.h
@@ -11,6 +11,7 @@
  #define MDEV_PRIVATE_H
  
  #include 

+#include 
  
  int  mdev_bus_register(void);

  void mdev_bus_unregister(void);
@@ -38,6 +39,7 @@ struct mdev_device {
u16 class_id;
union {
const struct vfio_mdev_device_ops *vfio_ops;
+   const struct virtio_mdev_device_ops *virtio_ops;
};
  };
  
diff --git a/include/linux/mdev.h b/include/linux/mdev.h

index 4625f1a11014..9b69b0bbebfd 100644
--- a/include/linux/mdev.h
+++ b/include/linux/mdev.h
@@ -17,6 +17,7 @@
  
  struct mdev_device;

  struct vfio_mdev_device_ops;
+struct virtio_mdev_device_ops;
  
  /*

   * Called by the parent device driver to set the device which represents
@@ -112,6 +113,10 @@ void mdev_set_class(struct mdev_device *mdev, u16 id);
  void mdev_set_vfio_ops(struct mdev_device *mdev,
   const struct vfio_mdev_device_ops *vfio_ops);
  const struct vfio_mdev_device_ops *mdev_get_vfio_ops(struct mdev_device 
*mdev);
+void mdev_set_virtio_ops(struct mdev_device *mdev,
+const struct virtio_mdev_device_ops *virtio_ops);
+const struct virtio_mdev_device_ops *
+mdev_get_virtio_ops(struct mdev_device *mdev);
  
  extern struct bus_type mdev_bus_type;
  
@@ -127,6 +132,7 @@ struct mdev_device *mdev_from_dev(struct device *dev);
  
  enum {

MDEV_CLASS_ID_VFIO = 1,
+   MDEV_CLASS_ID_VIRTIO = 2,
/* New entries must be added here */
  };
  
diff --git a/include/linux/virtio_mdev_ops.h b/include/linux/virtio_mdev_ops.h

new file mode 100644
index ..d417b41f2845
--- /dev/null
+++ b/include/linux/virtio_mdev_ops.h
@@ -0,0 +1,159 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Virtio mediated device driver
+ *
+ * Copyright 2019, Red Hat Corp.
+ * Author: Jason Wang 
+ */
+#ifndef _LINUX_VIRTIO_MDEV_H
+#define _LINUX_VIRTIO_MDEV_H
+
+#include 
+#include 
+#include 
+
+#define VIRTIO_MDEV_DEVICE_API_STRING  "virtio-mdev"
+#define VIRTIO_MDEV_F_VERSION_1 0x1
+
+struct virtio_mdev_callback {
+   irqreturn_t (*callback)(void *data);
+   void *private;
+};
+
+/**
+ * struct vfio_mdev_device_ops - Structure to be registered for each
+ * mdev device to register the device for virtio/vhost drivers.
+ *
+ * The device ops that is supported by VIRTIO_MDEV_F_VERSION_1, the
+ * callbacks are mandatory unless explicity mentioned.

If the version of the callbacks is returned by a callback within the
structure defined by the version... isn't that a bit circular?  This
seems redundant to me versus the class id.  The fact that the parent
driver defines the device as MDEV_CLASS_ID_VIRTIO should tell us this
already.  If it was incremented, we'd need an MDEV_CLASS_ID_VIRTIOv2,
which the virtio-mdev bus driver could add to its id table and handle
differently.



My understanding is versions are only allowed to increase monotonically, 
this seems less flexible than features. E.g we have features A, B, C, 
mdev device can choose to support only a subset. E.g when mdev device 
can support dirty page logging, it can add a new feature bit for driver 
to know that it support new device ops. MDEV_CLASS_ID_VIRTIOv2 may only

Re: [PATCH] virtio_ring: fix packed ring event may missing

2019-10-23 Thread Jason Wang


On 2019/10/24 上午11:26, Liu, Yong wrote:



-Original Message-
From: Jason Wang [mailto:jasow...@redhat.com]
Sent: Tuesday, October 22, 2019 9:06 PM
To: Liu, Yong ; m...@redhat.com; Bie, Tiwei

Cc: virtualization@lists.linux-foundation.org
Subject: Re: [PATCH] virtio_ring: fix packed ring event may missing


On 2019/10/22 下午2:48, Liu, Yong wrote:

Hi Jason,
My answers are inline.


-Original Message-
From: Jason Wang [mailto:jasow...@redhat.com]
Sent: Tuesday, October 22, 2019 10:45 AM
To: Liu, Yong ; m...@redhat.com; Bie, Tiwei

Cc: virtualization@lists.linux-foundation.org
Subject: Re: [PATCH] virtio_ring: fix packed ring event may missing


On 2019/10/22 上午1:10, Marvin Liu wrote:

When callback is delayed, virtio expect that vhost will kick when
rolling over event offset. Recheck should be taken as used index may
exceed event offset between status check and driver event update.

However, it is possible that flags was not modified if descriptors are
chained or in_order feature was negotiated. So flags at event offset
may not be valid for descriptor's status checking. Fix it by using last
used index as replacement. Tx queue will be stopped if there's not
enough freed buffers after recheck.

Signed-off-by: Marvin Liu 

diff --git a/drivers/virtio/virtio_ring.c

b/drivers/virtio/virtio_ring.c

index bdc08244a648..a8041e451e9e 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -1499,9 +1499,6 @@ static bool

virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq)

 * counter first before updating event flags.
 */
virtio_wmb(vq->weak_barriers);
-   } else {
-   used_idx = vq->last_used_idx;
-   wrap_counter = vq->packed.used_wrap_counter;
}

if (vq->packed.event_flags_shadow ==

VRING_PACKED_EVENT_FLAG_DISABLE)

{

@@ -1518,7 +1515,9 @@ static bool

virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq)

 */
virtio_mb(vq->weak_barriers);

-   if (is_used_desc_packed(vq, used_idx, wrap_counter)) {
+   if (is_used_desc_packed(vq,
+   vq->last_used_idx,
+   vq->packed.used_wrap_counter)) {
END_USE(vq);
return false;
}

Hi Marvin:

Two questions:

1) Do we support IN_ORDER in kernel driver?


Not support by now. But issue still can be possible if in_direct disabled

and meanwhile descs are chained.

Due to packed ring desc status should check one by one, chose arbitrary

position may cause issue.


Right, then it's better to mention IN_ORDER as future features.



2) Should we check IN_ORDER in this case otherwise we may end up with
interrupt storm when IN_ORDER is not negotiated?

Interrupt number will not increase here, event offset value calculated as

before.

Here just recheck whether new used descs is enough for next around xmit.
If backend was slow, most likely Tx queue will sleep for a while until

used index go over event offset.


Ok, but what if the backend is almost as fast as guest driver? E.g in
virtio-net we had:

      if (sq->vq->num_free < 2+MAX_SKB_FRAGS) {
          netif_stop_subqueue(dev, qnum);
          if (!use_napi &&
              unlikely(!virtqueue_enable_cb_delayed(sq->vq))) {
              /* More just got used, free them then recheck. */
              free_old_xmit_skbs(sq, false);
              if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) {
                  netif_start_subqueue(dev, qnum);
                  virtqueue_disable_cb(sq->vq);
              }
          }
      }

I worry that we may end up with toggling queue state in the case
(sq->vq->num_free is near 2 + MAX_SKB_FRAGS).


Yes, at this worst case each packet will add extra twice event flags write. Due 
to backend only read this value, the cost won't too much.



For driver, it means extra overheads, atomics, less batching, stats 
updating etc. For backend, cacheline will bounce between two cpus.




Even we can track down chained descs and figure out whether event offset 
indexed desc is used. There's still possibility that flags is invalid.
One case is that backend can buffer multiple descs by not updating the first 
one. We cannot guarantee that later flags is usable until check from the first 
one.



In this case, since we've stopped tx queue, so there's no new buffers 
added. It doesn't matter we get notified when the 3/4 or all of the 
descriptors has been used.


Thanks




Regards,
Marvin


It looks to me the correct thing to implement is to calculate the head
descriptor of a chain that sits at 3/4.

Thanks



Thanks,
Marvin


Thanks



___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH V5 2/6] modpost: add support for mdev class id

2019-10-23 Thread Jason Wang


On 2019/10/24 上午5:42, Alex Williamson wrote:

On Wed, 23 Oct 2019 21:07:48 +0800
Jason Wang  wrote:


Add support to parse mdev class id table.

Reviewed-by: Parav Pandit 
Signed-off-by: Jason Wang 
---
  drivers/vfio/mdev/vfio_mdev.c |  2 ++
  scripts/mod/devicetable-offsets.c |  3 +++
  scripts/mod/file2alias.c  | 10 ++
  3 files changed, 15 insertions(+)

diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
index 7b24ee9cb8dd..cb701cd646f0 100644
--- a/drivers/vfio/mdev/vfio_mdev.c
+++ b/drivers/vfio/mdev/vfio_mdev.c
@@ -125,6 +125,8 @@ static const struct mdev_class_id id_table[] = {
{ 0 },
  };
  
+MODULE_DEVICE_TABLE(mdev, id_table);

+

Two questions, first we have:

#define MODULE_DEVICE_TABLE(type, name) \
extern typeof(name) __mod_##type##__##name##_device_table   \
   __attribute__ ((unused, alias(__stringify(name

Therefore we're defining __mod_mdev__id_table_device_table with alias
id_table.  When the virtio mdev bus driver is added in 5/6 it uses the
same name value.  I see virtio types all register this way (virtio,
id_table), so I assume there's no conflict, but pci types mostly (not
entirely) seem to use unique names.  Is there a preference to one way
or the other or it simply doesn't matter?



It looks to me that those symbol were local, so it doesn't matter. But 
if you wish I can switch to use unique name.






  static struct mdev_driver vfio_mdev_driver = {
.name   = "vfio_mdev",
.probe  = vfio_mdev_probe,
diff --git a/scripts/mod/devicetable-offsets.c 
b/scripts/mod/devicetable-offsets.c
index 054405b90ba4..6cbb1062488a 100644
--- a/scripts/mod/devicetable-offsets.c
+++ b/scripts/mod/devicetable-offsets.c
@@ -231,5 +231,8 @@ int main(void)
DEVID(wmi_device_id);
DEVID_FIELD(wmi_device_id, guid_string);
  
+	DEVID(mdev_class_id);

+   DEVID_FIELD(mdev_class_id, id);
+
return 0;
  }
diff --git a/scripts/mod/file2alias.c b/scripts/mod/file2alias.c
index c91eba751804..d365dfe7c718 100644
--- a/scripts/mod/file2alias.c
+++ b/scripts/mod/file2alias.c
@@ -1335,6 +1335,15 @@ static int do_wmi_entry(const char *filename, void 
*symval, char *alias)
return 1;
  }
  
+/* looks like: "mdev:cN" */

+static int do_mdev_entry(const char *filename, void *symval, char *alias)
+{
+   DEF_FIELD(symval, mdev_class_id, id);
+
+   sprintf(alias, "mdev:c%02X", id);

A lot of entries call add_wildcard() here, should we?  Sorry for the
basic questions, I haven't played in this code.  Thanks,



It's really good question. My understanding is we won't have a module 
that can deal with all kinds of classes like CLASS_ID_ANY. So there's 
probably no need for the wildcard.


Thanks




Alex


+   return 1;
+}
+
  /* Does namelen bytes of name exactly match the symbol? */
  static bool sym_is(const char *name, unsigned namelen, const char *symbol)
  {
@@ -1407,6 +1416,7 @@ static const struct devtable devtable[] = {
{"typec", SIZE_typec_device_id, do_typec_entry},
{"tee", SIZE_tee_client_device_id, do_tee_entry},
{"wmi", SIZE_wmi_device_id, do_wmi_entry},
+   {"mdev", SIZE_mdev_class_id, do_mdev_entry},
  };
  
  /* Create MODULE_ALIAS() statements.


___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH V5 1/6] mdev: class id support

2019-10-23 Thread Jason Wang


On 2019/10/24 上午5:42, Alex Williamson wrote:

On Wed, 23 Oct 2019 21:07:47 +0800
Jason Wang  wrote:


Mdev bus only supports vfio driver right now, so it doesn't implement
match method. But in the future, we may add drivers other than vfio,
the first driver could be virtio-mdev. This means we need to add
device class id support in bus match method to pair the mdev device
and mdev driver correctly.

So this patch adds id_table to mdev_driver and class_id for mdev
device with the match method for mdev bus.

Signed-off-by: Jason Wang 
---
  .../driver-api/vfio-mediated-device.rst   |  5 +
  drivers/gpu/drm/i915/gvt/kvmgt.c  |  1 +
  drivers/s390/cio/vfio_ccw_ops.c   |  1 +
  drivers/s390/crypto/vfio_ap_ops.c |  1 +
  drivers/vfio/mdev/mdev_core.c | 18 +++
  drivers/vfio/mdev/mdev_driver.c   | 22 +++
  drivers/vfio/mdev/mdev_private.h  |  1 +
  drivers/vfio/mdev/vfio_mdev.c |  6 +
  include/linux/mdev.h  |  8 +++
  include/linux/mod_devicetable.h   |  8 +++
  samples/vfio-mdev/mbochs.c|  1 +
  samples/vfio-mdev/mdpy.c  |  1 +
  samples/vfio-mdev/mtty.c  |  1 +
  13 files changed, 74 insertions(+)

diff --git a/Documentation/driver-api/vfio-mediated-device.rst 
b/Documentation/driver-api/vfio-mediated-device.rst
index 25eb7d5b834b..6709413bee29 100644
--- a/Documentation/driver-api/vfio-mediated-device.rst
+++ b/Documentation/driver-api/vfio-mediated-device.rst
@@ -102,12 +102,14 @@ structure to represent a mediated device's driver::
* @probe: called when new device created
* @remove: called when device removed
* @driver: device driver structure
+  * @id_table: the ids serviced by this driver
*/
   struct mdev_driver {
 const char *name;
 int  (*probe)  (struct device *dev);
 void (*remove) (struct device *dev);
 struct device_driverdriver;
+const struct mdev_class_id *id_table;
   };
  
  A mediated bus driver for mdev should use this structure in the function calls

@@ -170,6 +172,9 @@ that a driver should use to unregister itself with the mdev 
core driver::
  
  	extern void mdev_unregister_device(struct device *dev);
  
+It is also required to specify the class_id in create() callback through::

+
+   int mdev_set_class(struct mdev_device *mdev, u16 id);
  
  Mediated Device Management Interface Through sysfs

  ==
diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c
index 343d79c1cb7e..6420f0dbd31b 100644
--- a/drivers/gpu/drm/i915/gvt/kvmgt.c
+++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
@@ -678,6 +678,7 @@ static int intel_vgpu_create(struct kobject *kobj, struct 
mdev_device *mdev)
 dev_name(mdev_dev(mdev)));
ret = 0;
  
+	mdev_set_class(mdev, MDEV_CLASS_ID_VFIO);

  out:
return ret;
  }
diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
index f0d71ab77c50..cf2c013ae32f 100644
--- a/drivers/s390/cio/vfio_ccw_ops.c
+++ b/drivers/s390/cio/vfio_ccw_ops.c
@@ -129,6 +129,7 @@ static int vfio_ccw_mdev_create(struct kobject *kobj, 
struct mdev_device *mdev)
   private->sch->schid.ssid,
   private->sch->schid.sch_no);
  
+	mdev_set_class(mdev, MDEV_CLASS_ID_VFIO);

return 0;
  }
  
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c

index 5c0f53c6dde7..07c31070afeb 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -343,6 +343,7 @@ static int vfio_ap_mdev_create(struct kobject *kobj, struct 
mdev_device *mdev)
list_add(&matrix_mdev->node, &matrix_dev->mdev_list);
mutex_unlock(&matrix_dev->lock);
  
+	mdev_set_class(mdev, MDEV_CLASS_ID_VFIO);

return 0;
  }
  
diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c

index b558d4cfd082..3a9c52d71b4e 100644
--- a/drivers/vfio/mdev/mdev_core.c
+++ b/drivers/vfio/mdev/mdev_core.c
@@ -45,6 +45,16 @@ void mdev_set_drvdata(struct mdev_device *mdev, void *data)
  }
  EXPORT_SYMBOL(mdev_set_drvdata);
  
+/* Specify the class for the mdev device, this must be called during

+ * create() callback.
+ */
+void mdev_set_class(struct mdev_device *mdev, u16 id)
+{
+   WARN_ON(mdev->class_id);
+   mdev->class_id = id;
+}
+EXPORT_SYMBOL(mdev_set_class);
+
  struct device *mdev_dev(struct mdev_device *mdev)
  {
return &mdev->dev;
@@ -135,6 +145,7 @@ static int mdev_device_remove_cb(struct device *dev, void 
*data)
   * mdev_register_device : Register a device
   * @dev: device structure representing parent device.
   * @ops: Parent device operation structure to be registered.
+ * @id: class id.
   *
   * Add device to list 

Re: [PATCH v2 1/3] kcov: remote coverage support

2019-10-23 Thread Andrew Morton
On Wed, 23 Oct 2019 17:24:29 +0200 Andrey Konovalov  
wrote:

> This patch adds background thread coverage collection ability to kcov.
> 
> With KCOV_ENABLE coverage is collected only for syscalls that are issued
> from the current process. With KCOV_REMOTE_ENABLE it's possible to collect
> coverage for arbitrary parts of the kernel code, provided that those parts
> are annotated with kcov_remote_start()/kcov_remote_stop().
> 
> This allows to collect coverage from two types of kernel background
> threads: the global ones, that are spawned during kernel boot and are
> always running (e.g. USB hub_event()); and the local ones, that are
> spawned when a user interacts with some kernel interface (e.g. vhost
> workers).
> 
> To enable collecting coverage from a global background thread, a unique
> global handle must be assigned and passed to the corresponding
> kcov_remote_start() call. Then a userspace process can pass a list of such
> handles to the KCOV_REMOTE_ENABLE ioctl in the handles array field of the
> kcov_remote_arg struct. This will attach the used kcov device to the code
> sections, that are referenced by those handles.
> 
> Since there might be many local background threads spawned from different
> userspace processes, we can't use a single global handle per annotation.
> Instead, the userspace process passes a non-zero handle through the
> common_handle field of the kcov_remote_arg struct. This common handle gets
> saved to the kcov_handle field in the current task_struct and needs to be
> passed to the newly spawned threads via custom annotations. Those threads
> should in turn be annotated with kcov_remote_start()/kcov_remote_stop().
> 
> Internally kcov stores handles as u64 integers. The top byte of a handle
> is used to denote the id of a subsystem that this handle belongs to, and
> the lower 4 bytes are used to denote a handle id within that subsystem.
> A reserved value 0 is used as a subsystem id for common handles as they
> don't belong to a particular subsystem. The bytes 4-7 are currently
> reserved and must be zero. In the future the number of bytes used for the
> subsystem or handle ids might be increased.
> 
> When a particular userspace proccess collects coverage by via a common
> handle, kcov will collect coverage for each code section that is annotated
> to use the common handle obtained as kcov_handle from the current
> task_struct. However non common handles allow to collect coverage
> selectively from different subsystems.
> 
> ...
>
> +static struct kcov_remote *kcov_remote_add(struct kcov *kcov, u64 handle)
> +{
> + struct kcov_remote *remote;
> +
> + if (kcov_remote_find(handle))
> + return ERR_PTR(-EEXIST);
> + remote = kmalloc(sizeof(*remote), GFP_ATOMIC);
> + if (!remote)
> + return ERR_PTR(-ENOMEM);
> + remote->handle = handle;
> + remote->kcov = kcov;
> + hash_add(kcov_remote_map, &remote->hnode, handle);
> + return remote;
> +}
> +
>
> ...
>
> + spin_lock(&kcov_remote_lock);
> + for (i = 0; i < remote_arg->num_handles; i++) {
> + kcov_debug("handle %llx\n", remote_arg->handles[i]);
> + if (!kcov_check_handle(remote_arg->handles[i],
> + false, true, false)) {
> + spin_unlock(&kcov_remote_lock);
> + kcov_disable(t, kcov);
> + return -EINVAL;
> + }
> + remote = kcov_remote_add(kcov, remote_arg->handles[i]);
> + if (IS_ERR(remote)) {
> + spin_unlock(&kcov_remote_lock);
> + kcov_disable(t, kcov);
> + return PTR_ERR(remote);
> + }
> + }

It's worrisome that this code can perform up to 65536 GFP_ATOMIC
allocations without coming up for air.  The possibility of ENOMEM or of
causing collateral problems is significant.  It doesn't look too hard
to change this to use GFP_KERNEL?

> +u64 kcov_common_handle(void)
> +{
> + return current->kcov_handle;
> +}

I don't immediately understand what this "common handle" thing is all about. 
Code is rather lacking in this sort of high-level commentary?


___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v2 0/3] kcov: collect coverage from usb and vhost

2019-10-23 Thread Andrew Morton
On Wed, 23 Oct 2019 17:24:28 +0200 Andrey Konovalov  
wrote:

> This patchset extends kcov to allow collecting coverage from the USB
> subsystem and vhost workers. See the first patch description for details
> about the kcov extension. The other two patches apply this kcov extension
> to USB and vhost.
> 
> These patches have been used to enable coverage-guided USB fuzzing with
> syzkaller for the last few years

I find it surprising that this material is so focused on USB.  Is
there something unique about USB that gave rise to this situation, or
is it expected that the new kcov feature will be used elsewhere in the
kernel?

If the latter, which are the expected subsystems?
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH V5 4/6] mdev: introduce virtio device and its device ops

2019-10-23 Thread Alex Williamson
On Wed, 23 Oct 2019 21:07:50 +0800
Jason Wang  wrote:

> This patch implements basic support for mdev driver that supports
> virtio transport for kernel virtio driver.
> 
> Signed-off-by: Jason Wang 
> ---
>  drivers/vfio/mdev/mdev_core.c|  20 
>  drivers/vfio/mdev/mdev_private.h |   2 +
>  include/linux/mdev.h |   6 ++
>  include/linux/virtio_mdev_ops.h  | 159 +++
>  4 files changed, 187 insertions(+)
>  create mode 100644 include/linux/virtio_mdev_ops.h
> 
> diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
> index 555bd61d8c38..9b00c3513120 100644
> --- a/drivers/vfio/mdev/mdev_core.c
> +++ b/drivers/vfio/mdev/mdev_core.c
> @@ -76,6 +76,26 @@ const struct vfio_mdev_device_ops 
> *mdev_get_vfio_ops(struct mdev_device *mdev)
>  }
>  EXPORT_SYMBOL(mdev_get_vfio_ops);
>  
> +/* Specify the virtio device ops for the mdev device, this
> + * must be called during create() callback for virtio mdev device.
> + */
> +void mdev_set_virtio_ops(struct mdev_device *mdev,
> +  const struct virtio_mdev_device_ops *virtio_ops)
> +{
> + mdev_set_class(mdev, MDEV_CLASS_ID_VIRTIO);
> + mdev->virtio_ops = virtio_ops;
> +}
> +EXPORT_SYMBOL(mdev_set_virtio_ops);
> +
> +/* Get the virtio device ops for the mdev device. */
> +const struct virtio_mdev_device_ops *
> +mdev_get_virtio_ops(struct mdev_device *mdev)
> +{
> + WARN_ON(mdev->class_id != MDEV_CLASS_ID_VIRTIO);
> + return mdev->virtio_ops;
> +}
> +EXPORT_SYMBOL(mdev_get_virtio_ops);
> +
>  struct device *mdev_dev(struct mdev_device *mdev)
>  {
>   return &mdev->dev;
> diff --git a/drivers/vfio/mdev/mdev_private.h 
> b/drivers/vfio/mdev/mdev_private.h
> index 0770410ded2a..7b47890c34e7 100644
> --- a/drivers/vfio/mdev/mdev_private.h
> +++ b/drivers/vfio/mdev/mdev_private.h
> @@ -11,6 +11,7 @@
>  #define MDEV_PRIVATE_H
>  
>  #include 
> +#include 
>  
>  int  mdev_bus_register(void);
>  void mdev_bus_unregister(void);
> @@ -38,6 +39,7 @@ struct mdev_device {
>   u16 class_id;
>   union {
>   const struct vfio_mdev_device_ops *vfio_ops;
> + const struct virtio_mdev_device_ops *virtio_ops;
>   };
>  };
>  
> diff --git a/include/linux/mdev.h b/include/linux/mdev.h
> index 4625f1a11014..9b69b0bbebfd 100644
> --- a/include/linux/mdev.h
> +++ b/include/linux/mdev.h
> @@ -17,6 +17,7 @@
>  
>  struct mdev_device;
>  struct vfio_mdev_device_ops;
> +struct virtio_mdev_device_ops;
>  
>  /*
>   * Called by the parent device driver to set the device which represents
> @@ -112,6 +113,10 @@ void mdev_set_class(struct mdev_device *mdev, u16 id);
>  void mdev_set_vfio_ops(struct mdev_device *mdev,
>  const struct vfio_mdev_device_ops *vfio_ops);
>  const struct vfio_mdev_device_ops *mdev_get_vfio_ops(struct mdev_device 
> *mdev);
> +void mdev_set_virtio_ops(struct mdev_device *mdev,
> +  const struct virtio_mdev_device_ops *virtio_ops);
> +const struct virtio_mdev_device_ops *
> +mdev_get_virtio_ops(struct mdev_device *mdev);
>  
>  extern struct bus_type mdev_bus_type;
>  
> @@ -127,6 +132,7 @@ struct mdev_device *mdev_from_dev(struct device *dev);
>  
>  enum {
>   MDEV_CLASS_ID_VFIO = 1,
> + MDEV_CLASS_ID_VIRTIO = 2,
>   /* New entries must be added here */
>  };
>  
> diff --git a/include/linux/virtio_mdev_ops.h b/include/linux/virtio_mdev_ops.h
> new file mode 100644
> index ..d417b41f2845
> --- /dev/null
> +++ b/include/linux/virtio_mdev_ops.h
> @@ -0,0 +1,159 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Virtio mediated device driver
> + *
> + * Copyright 2019, Red Hat Corp.
> + * Author: Jason Wang 
> + */
> +#ifndef _LINUX_VIRTIO_MDEV_H
> +#define _LINUX_VIRTIO_MDEV_H
> +
> +#include 
> +#include 
> +#include 
> +
> +#define VIRTIO_MDEV_DEVICE_API_STRING"virtio-mdev"
> +#define VIRTIO_MDEV_F_VERSION_1 0x1
> +
> +struct virtio_mdev_callback {
> + irqreturn_t (*callback)(void *data);
> + void *private;
> +};
> +
> +/**
> + * struct vfio_mdev_device_ops - Structure to be registered for each
> + * mdev device to register the device for virtio/vhost drivers.
> + *
> + * The device ops that is supported by VIRTIO_MDEV_F_VERSION_1, the
> + * callbacks are mandatory unless explicity mentioned.

If the version of the callbacks is returned by a callback within the
structure defined by the version... isn't that a bit circular?  This
seems redundant to me versus the class id.  The fact that the parent
driver defines the device as MDEV_CLASS_ID_VIRTIO should tell us this
already.  If it was incremented, we'd need an MDEV_CLASS_ID_VIRTIOv2,
which the virtio-mdev bus driver could add to its id table and handle
differently.

> + *
> + * @set_vq_address:  Set the address of virtqueue
> + *   @mdev: mediated device
> + *   @idx: virtqueue index
> + *   @desc_area: 

Re: [PATCH V5 2/6] modpost: add support for mdev class id

2019-10-23 Thread Alex Williamson
On Wed, 23 Oct 2019 21:07:48 +0800
Jason Wang  wrote:

> Add support to parse mdev class id table.
> 
> Reviewed-by: Parav Pandit 
> Signed-off-by: Jason Wang 
> ---
>  drivers/vfio/mdev/vfio_mdev.c |  2 ++
>  scripts/mod/devicetable-offsets.c |  3 +++
>  scripts/mod/file2alias.c  | 10 ++
>  3 files changed, 15 insertions(+)
> 
> diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
> index 7b24ee9cb8dd..cb701cd646f0 100644
> --- a/drivers/vfio/mdev/vfio_mdev.c
> +++ b/drivers/vfio/mdev/vfio_mdev.c
> @@ -125,6 +125,8 @@ static const struct mdev_class_id id_table[] = {
>   { 0 },
>  };
>  
> +MODULE_DEVICE_TABLE(mdev, id_table);
> +

Two questions, first we have:

#define MODULE_DEVICE_TABLE(type, name) \
extern typeof(name) __mod_##type##__##name##_device_table   \
  __attribute__ ((unused, alias(__stringify(name

Therefore we're defining __mod_mdev__id_table_device_table with alias
id_table.  When the virtio mdev bus driver is added in 5/6 it uses the
same name value.  I see virtio types all register this way (virtio,
id_table), so I assume there's no conflict, but pci types mostly (not
entirely) seem to use unique names.  Is there a preference to one way
or the other or it simply doesn't matter?

>  static struct mdev_driver vfio_mdev_driver = {
>   .name   = "vfio_mdev",
>   .probe  = vfio_mdev_probe,
> diff --git a/scripts/mod/devicetable-offsets.c 
> b/scripts/mod/devicetable-offsets.c
> index 054405b90ba4..6cbb1062488a 100644
> --- a/scripts/mod/devicetable-offsets.c
> +++ b/scripts/mod/devicetable-offsets.c
> @@ -231,5 +231,8 @@ int main(void)
>   DEVID(wmi_device_id);
>   DEVID_FIELD(wmi_device_id, guid_string);
>  
> + DEVID(mdev_class_id);
> + DEVID_FIELD(mdev_class_id, id);
> +
>   return 0;
>  }
> diff --git a/scripts/mod/file2alias.c b/scripts/mod/file2alias.c
> index c91eba751804..d365dfe7c718 100644
> --- a/scripts/mod/file2alias.c
> +++ b/scripts/mod/file2alias.c
> @@ -1335,6 +1335,15 @@ static int do_wmi_entry(const char *filename, void 
> *symval, char *alias)
>   return 1;
>  }
>  
> +/* looks like: "mdev:cN" */
> +static int do_mdev_entry(const char *filename, void *symval, char *alias)
> +{
> + DEF_FIELD(symval, mdev_class_id, id);
> +
> + sprintf(alias, "mdev:c%02X", id);

A lot of entries call add_wildcard() here, should we?  Sorry for the
basic questions, I haven't played in this code.  Thanks,

Alex

> + return 1;
> +}
> +
>  /* Does namelen bytes of name exactly match the symbol? */
>  static bool sym_is(const char *name, unsigned namelen, const char *symbol)
>  {
> @@ -1407,6 +1416,7 @@ static const struct devtable devtable[] = {
>   {"typec", SIZE_typec_device_id, do_typec_entry},
>   {"tee", SIZE_tee_client_device_id, do_tee_entry},
>   {"wmi", SIZE_wmi_device_id, do_wmi_entry},
> + {"mdev", SIZE_mdev_class_id, do_mdev_entry},
>  };
>  
>  /* Create MODULE_ALIAS() statements.

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH V5 1/6] mdev: class id support

2019-10-23 Thread Alex Williamson
On Wed, 23 Oct 2019 21:07:47 +0800
Jason Wang  wrote:

> Mdev bus only supports vfio driver right now, so it doesn't implement
> match method. But in the future, we may add drivers other than vfio,
> the first driver could be virtio-mdev. This means we need to add
> device class id support in bus match method to pair the mdev device
> and mdev driver correctly.
> 
> So this patch adds id_table to mdev_driver and class_id for mdev
> device with the match method for mdev bus.
> 
> Signed-off-by: Jason Wang 
> ---
>  .../driver-api/vfio-mediated-device.rst   |  5 +
>  drivers/gpu/drm/i915/gvt/kvmgt.c  |  1 +
>  drivers/s390/cio/vfio_ccw_ops.c   |  1 +
>  drivers/s390/crypto/vfio_ap_ops.c |  1 +
>  drivers/vfio/mdev/mdev_core.c | 18 +++
>  drivers/vfio/mdev/mdev_driver.c   | 22 +++
>  drivers/vfio/mdev/mdev_private.h  |  1 +
>  drivers/vfio/mdev/vfio_mdev.c |  6 +
>  include/linux/mdev.h  |  8 +++
>  include/linux/mod_devicetable.h   |  8 +++
>  samples/vfio-mdev/mbochs.c|  1 +
>  samples/vfio-mdev/mdpy.c  |  1 +
>  samples/vfio-mdev/mtty.c  |  1 +
>  13 files changed, 74 insertions(+)
> 
> diff --git a/Documentation/driver-api/vfio-mediated-device.rst 
> b/Documentation/driver-api/vfio-mediated-device.rst
> index 25eb7d5b834b..6709413bee29 100644
> --- a/Documentation/driver-api/vfio-mediated-device.rst
> +++ b/Documentation/driver-api/vfio-mediated-device.rst
> @@ -102,12 +102,14 @@ structure to represent a mediated device's driver::
>* @probe: called when new device created
>* @remove: called when device removed
>* @driver: device driver structure
> +  * @id_table: the ids serviced by this driver
>*/
>   struct mdev_driver {
>const char *name;
>int  (*probe)  (struct device *dev);
>void (*remove) (struct device *dev);
>struct device_driverdriver;
> +  const struct mdev_class_id *id_table;
>   };
>  
>  A mediated bus driver for mdev should use this structure in the function 
> calls
> @@ -170,6 +172,9 @@ that a driver should use to unregister itself with the 
> mdev core driver::
>  
>   extern void mdev_unregister_device(struct device *dev);
>  
> +It is also required to specify the class_id in create() callback through::
> +
> + int mdev_set_class(struct mdev_device *mdev, u16 id);
>  
>  Mediated Device Management Interface Through sysfs
>  ==
> diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c 
> b/drivers/gpu/drm/i915/gvt/kvmgt.c
> index 343d79c1cb7e..6420f0dbd31b 100644
> --- a/drivers/gpu/drm/i915/gvt/kvmgt.c
> +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
> @@ -678,6 +678,7 @@ static int intel_vgpu_create(struct kobject *kobj, struct 
> mdev_device *mdev)
>dev_name(mdev_dev(mdev)));
>   ret = 0;
>  
> + mdev_set_class(mdev, MDEV_CLASS_ID_VFIO);
>  out:
>   return ret;
>  }
> diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
> index f0d71ab77c50..cf2c013ae32f 100644
> --- a/drivers/s390/cio/vfio_ccw_ops.c
> +++ b/drivers/s390/cio/vfio_ccw_ops.c
> @@ -129,6 +129,7 @@ static int vfio_ccw_mdev_create(struct kobject *kobj, 
> struct mdev_device *mdev)
>  private->sch->schid.ssid,
>  private->sch->schid.sch_no);
>  
> + mdev_set_class(mdev, MDEV_CLASS_ID_VFIO);
>   return 0;
>  }
>  
> diff --git a/drivers/s390/crypto/vfio_ap_ops.c 
> b/drivers/s390/crypto/vfio_ap_ops.c
> index 5c0f53c6dde7..07c31070afeb 100644
> --- a/drivers/s390/crypto/vfio_ap_ops.c
> +++ b/drivers/s390/crypto/vfio_ap_ops.c
> @@ -343,6 +343,7 @@ static int vfio_ap_mdev_create(struct kobject *kobj, 
> struct mdev_device *mdev)
>   list_add(&matrix_mdev->node, &matrix_dev->mdev_list);
>   mutex_unlock(&matrix_dev->lock);
>  
> + mdev_set_class(mdev, MDEV_CLASS_ID_VFIO);
>   return 0;
>  }
>  
> diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
> index b558d4cfd082..3a9c52d71b4e 100644
> --- a/drivers/vfio/mdev/mdev_core.c
> +++ b/drivers/vfio/mdev/mdev_core.c
> @@ -45,6 +45,16 @@ void mdev_set_drvdata(struct mdev_device *mdev, void *data)
>  }
>  EXPORT_SYMBOL(mdev_set_drvdata);
>  
> +/* Specify the class for the mdev device, this must be called during
> + * create() callback.
> + */
> +void mdev_set_class(struct mdev_device *mdev, u16 id)
> +{
> + WARN_ON(mdev->class_id);
> + mdev->class_id = id;
> +}
> +EXPORT_SYMBOL(mdev_set_class);
> +
>  struct device *mdev_dev(struct mdev_device *mdev)
>  {
>   return &mdev->dev;
> @@ -135,6 +145,7 @@ static int mdev_device_remove_cb(struct device *dev, void 
> *data)
>   * mdev_register_device : Register a device
>   * @dev: device structure representi

Re: [PATCH 2/3] usb, kcov: collect coverage from hub_event

2019-10-23 Thread kbuild test robot
Hi Andrey,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[cannot apply to v5.4-rc4 next-20191023]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:
https://github.com/0day-ci/linux/commits/Andrey-Konovalov/kcov-collect-coverage-from-usb-and-vhost/20191023-185245
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
3b7c59a1950c75f2c0152e5a9cd77675b09233d6
config: s390-allmodconfig (attached as .config)
compiler: s390-linux-gcc (GCC) 7.4.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
GCC_VERSION=7.4.0 make.cross ARCH=s390 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot 

All errors (new ones prefixed by >>):

>> ERROR: "kcov_remote_stop" [drivers/usb/core/usbcore.ko] undefined!
>> ERROR: "kcov_remote_start" [drivers/usb/core/usbcore.ko] undefined!

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH 1/3] kcov: remote coverage support

2019-10-23 Thread kbuild test robot
Hi Andrey,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[cannot apply to v5.4-rc4 next-20191023]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:
https://github.com/0day-ci/linux/commits/Andrey-Konovalov/kcov-collect-coverage-from-usb-and-vhost/20191023-185245
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
3b7c59a1950c75f2c0152e5a9cd77675b09233d6
config: arm-allmodconfig (attached as .config)
compiler: arm-linux-gnueabi-gcc (GCC) 7.4.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
GCC_VERSION=7.4.0 make.cross ARCH=arm 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot 

All errors (new ones prefixed by >>):

   kernel/kcov.o: In function `kcov_remote_stop':
>> kcov.c:(.text+0x1094): undefined reference to `__aeabi_uldivmod'
   kcov.c:(.text+0x1144): undefined reference to `__aeabi_uldivmod'

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH] drm/simple-kms: Standardize arguments for callbacks

2019-10-23 Thread Linus Walleij
On Wed, Oct 23, 2019 at 12:13 PM Daniel Vetter  wrote:

> Passing the wrong type feels icky, everywhere else we use the pipe as
> the first parameter. Spotted while discussing patches with Thomas
> Zimmermann.
>
> v2: Make xen compile correctly
>
> Acked-By: Thomas Zimmermann  (v1)
> Cc: Thomas Zimmermann 
> Cc: Noralf Trønnes 
> Cc: Gerd Hoffmann 
> Cc: Eric Anholt 
> Cc: Emil Velikov 
> Cc: virtualization@lists.linux-foundation.org
> Cc: Linus Walleij 
> Signed-off-by: Daniel Vetter 

Makes perfect sense.
Reviewed-by: Linus Walleij 

Yours,
Linus Walleij
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH net-next 11/14] vsock: add multi-transports support

2019-10-23 Thread Stefano Garzarella
On Wed, Oct 23, 2019 at 11:59 AM Stefano Garzarella  wrote:
>
> This patch adds the support of multiple transports in the
> VSOCK core.
>
> With the multi-transports support, we can use vsock with nested VMs
> (using also different hypervisors) loading both guest->host and
> host->guest transports at the same time.
>
> Major changes:
> - vsock core module can be loaded regardless of the transports
> - vsock_core_init() and vsock_core_exit() are renamed to
>   vsock_core_register() and vsock_core_unregister()
> - vsock_core_register() has a feature parameter (H2G, G2H, DGRAM)
>   to identify which directions the transport can handle and if it's
>   support DGRAM (only vmci)
> - each stream socket is assigned to a transport when the remote CID
>   is set (during the connect() or when we receive a connection request
>   on a listener socket).
>   The remote CID is used to decide which transport to use:
>   - remote CID > VMADDR_CID_HOST will use host->guest transport
>   - remote CID <= VMADDR_CID_HOST will use guest->host transport
> - listener sockets are not bound to any transports since no transport
>   operations are done on it. In this way we can create a listener
>   socket, also if the transports are not loaded or with VMADDR_CID_ANY
>   to listen on all transports.
> - DGRAM sockets are handled as before, since only the vmci_transport
>   provides this feature.
>
> Signed-off-by: Stefano Garzarella 
> ---
> RFC -> v1:
> - documented VSOCK_TRANSPORT_F_* flags
> - fixed vsock_assign_transport() when the socket is already assigned
>   (e.g connection failed)
> - moved features outside of struct vsock_transport, and used as
>   parameter of vsock_core_register()
> ---
>  drivers/vhost/vsock.c   |   5 +-
>  include/net/af_vsock.h  |  17 +-
>  net/vmw_vsock/af_vsock.c| 237 ++--
>  net/vmw_vsock/hyperv_transport.c|  26 ++-
>  net/vmw_vsock/virtio_transport.c|   7 +-
>  net/vmw_vsock/virtio_transport_common.c |  28 ++-
>  net/vmw_vsock/vmci_transport.c  |  31 +++-
>  7 files changed, 270 insertions(+), 81 deletions(-)
>
> diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
> index 6d7e4f022748..b235f4bbe8ea 100644
> --- a/drivers/vhost/vsock.c
> +++ b/drivers/vhost/vsock.c
> @@ -831,7 +831,8 @@ static int __init vhost_vsock_init(void)
>  {
> int ret;
>
> -   ret = vsock_core_init(&vhost_transport.transport);
> +   ret = vsock_core_register(&vhost_transport.transport,
> + VSOCK_TRANSPORT_F_H2G);
> if (ret < 0)
> return ret;
> return misc_register(&vhost_vsock_misc);
> @@ -840,7 +841,7 @@ static int __init vhost_vsock_init(void)
>  static void __exit vhost_vsock_exit(void)
>  {
> misc_deregister(&vhost_vsock_misc);
> -   vsock_core_exit();
> +   vsock_core_unregister(&vhost_transport.transport);
>  };
>
>  module_init(vhost_vsock_init);
> diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
> index fa1570dc9f5c..27a3463e4892 100644
> --- a/include/net/af_vsock.h
> +++ b/include/net/af_vsock.h
> @@ -91,6 +91,14 @@ struct vsock_transport_send_notify_data {
> u64 data2; /* Transport-defined. */
>  };
>
> +/* Transport features flags */
> +/* Transport provides host->guest communication */
> +#define VSOCK_TRANSPORT_F_H2G  0x0001
> +/* Transport provides guest->host communication */
> +#define VSOCK_TRANSPORT_F_G2H  0x0002
> +/* Transport provides DGRAM communication */
> +#define VSOCK_TRANSPORT_F_DGRAM0x0004
> +
>  struct vsock_transport {
> /* Initialize/tear-down socket. */
> int (*init)(struct vsock_sock *, struct vsock_sock *);
> @@ -154,12 +162,8 @@ struct vsock_transport {
>
>  / CORE /
>
> -int __vsock_core_init(const struct vsock_transport *t, struct module *owner);
> -static inline int vsock_core_init(const struct vsock_transport *t)
> -{
> -   return __vsock_core_init(t, THIS_MODULE);
> -}
> -void vsock_core_exit(void);
> +int vsock_core_register(const struct vsock_transport *t, int features);
> +void vsock_core_unregister(const struct vsock_transport *t);
>
>  /* The transport may downcast this to access transport-specific functions */
>  const struct vsock_transport *vsock_core_get_transport(struct vsock_sock 
> *vsk);
> @@ -190,6 +194,7 @@ struct sock *vsock_find_connected_socket(struct 
> sockaddr_vm *src,
>  struct sockaddr_vm *dst);
>  void vsock_remove_sock(struct vsock_sock *vsk);
>  void vsock_for_each_connected_socket(void (*fn)(struct sock *sk));
> +int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk);
>
>  / TAP /
>
> diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
> index d89381166028..85d9a147 100644
> --- a/net/vmw_vsock/af_vsock.c
> +++ b/net/vmw_vsock/af_vsock.c
> @@ -130,7 +130,12 @@ static struct proto vsock_proto = 

Re: [PATCH -next] virtiofs: remove unused variable 'fc'

2019-10-23 Thread Stefan Hajnoczi
On Wed, Oct 23, 2019 at 02:21:30PM +0800, YueHaibing wrote:
> fs/fuse/virtio_fs.c:983:20: warning:
>  variable fc set but not used [-Wunused-but-set-variable]
> 
> It is not used since commit 7ee1e2e631db ("virtiofs:
> No need to check fpq->connected state")
> 
> Signed-off-by: YueHaibing 
> ---
>  fs/fuse/virtio_fs.c | 2 --
>  1 file changed, 2 deletions(-)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH] virtiofs: Remove set but not used variable 'fc'

2019-10-23 Thread Stefan Hajnoczi
On Wed, Oct 23, 2019 at 10:02:49AM +0800, zhengbin wrote:
> Fixes gcc '-Wunused-but-set-variable' warning:
> 
> fs/fuse/virtio_fs.c: In function virtio_fs_wake_pending_and_unlock:
> fs/fuse/virtio_fs.c:983:20: warning: variable fc set but not used 
> [-Wunused-but-set-variable]
> 
> It is not used since commit 7ee1e2e631db ("virtiofs:
> No need to check fpq->connected state")
> 
> Reported-by: Hulk Robot 
> Signed-off-by: zhengbin 
> ---
>  fs/fuse/virtio_fs.c | 2 --
>  1 file changed, 2 deletions(-)

Only affects the linux-next tree, not virtio-fs-dev or linux.git/master.
Same as "[PATCH -next] virtiofs: remove unused variable 'fc'"
(<20191023062130.23068-1-yuehaib...@huawei.com>).

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH 3/3] vhost, kcov: collect coverage from vhost_worker

2019-10-23 Thread Dmitry Vyukov via Virtualization
On Wed, Oct 23, 2019 at 3:35 PM Andrey Konovalov  wrote:
>
> On Wed, Oct 23, 2019 at 10:36 AM Dmitry Vyukov  wrote:
> >
> > On Tue, Oct 22, 2019 at 6:46 PM Andrey Konovalov  
> > wrote:
> > >
> > > This patch adds kcov_remote_start()/kcov_remote_stop() annotations to the
> > > vhost_worker() function, which is responsible for processing vhost works.
> > > Since vhost_worker() threads are spawned per vhost device instance
> > > the common kcov handle is used for kcov_remote_start()/stop() annotations
> > > (see Documentation/dev-tools/kcov.rst for details). As the result kcov can
> > > now be used to collect coverage from vhost worker threads.
> > >
> > > Signed-off-by: Andrey Konovalov 
> > > ---
> > >  drivers/vhost/vhost.c | 6 ++
> > >  drivers/vhost/vhost.h | 1 +
> > >  2 files changed, 7 insertions(+)
> > >
> > > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> > > index 36ca2cf419bf..a5a557c4b67f 100644
> > > --- a/drivers/vhost/vhost.c
> > > +++ b/drivers/vhost/vhost.c
> > > @@ -30,6 +30,7 @@
> > >  #include 
> > >  #include 
> > >  #include 
> > > +#include 
> > >
> > >  #include "vhost.h"
> > >
> > > @@ -357,7 +358,9 @@ static int vhost_worker(void *data)
> > > llist_for_each_entry_safe(work, work_next, node, node) {
> > > clear_bit(VHOST_WORK_QUEUED, &work->flags);
> > > __set_current_state(TASK_RUNNING);
> > > +   kcov_remote_start(dev->kcov_handle);
> > > work->fn(work);
> > > +   kcov_remote_stop();
> > > if (need_resched())
> > > schedule();
> > > }
> > > @@ -546,6 +549,7 @@ long vhost_dev_set_owner(struct vhost_dev *dev)
> > >
> > > /* No owner, become one */
> > > dev->mm = get_task_mm(current);
> > > +   dev->kcov_handle = current->kcov_handle;
> >
> > kcov_handle is not present in task_struct if !CONFIG_KCOV
> >
> > Also this does not use KCOV_SUBSYSTEM_COMMON.
> > We discussed something along the following lines:
> >
> > u64 kcov_remote_handle(u64 subsys, u64 id)
> > {
> >   WARN_ON(subsys or id has wrong bits set).
>
> Hm, we can't have warnings in kcov_remote_handle() that is exposed in
> uapi headers. What we can do is return 0 (invalid handle) if subsys/id
> have incorrect bits set. And then we can either have another
> kcov_remote_handle() internally (with a different name though) that
> has a warning, or have warning in kcov_remote_start(). WDYT?

I would probably add the warning to kcov_remote_start(). This avoids
the need for another function and will catch a wrong ID if caller
generated it by some other means.
And then ioctls should also detect bad handles passed in and return
EINVAL. Then we will cover errors for both kernel and user programs.

>
> >   return ...;
> > }
> >
> > kcov_remote_handle(KCOV_SUBSYSTEM_USB, bus);
> > kcov_remote_handle(KCOV_SUBSYSTEM_COMMON, current->kcov_handle);
> >
> >
> > > worker = kthread_create(vhost_worker, dev, "vhost-%d", 
> > > current->pid);
> > > if (IS_ERR(worker)) {
> > > err = PTR_ERR(worker);
> > > @@ -571,6 +575,7 @@ long vhost_dev_set_owner(struct vhost_dev *dev)
> > > if (dev->mm)
> > > mmput(dev->mm);
> > > dev->mm = NULL;
> > > +   dev->kcov_handle = 0;
> > >  err_mm:
> > > return err;
> > >  }
> > > @@ -682,6 +687,7 @@ void vhost_dev_cleanup(struct vhost_dev *dev)
> > > if (dev->worker) {
> > > kthread_stop(dev->worker);
> > > dev->worker = NULL;
> > > +   dev->kcov_handle = 0;
> > > }
> > > if (dev->mm)
> > > mmput(dev->mm);
> > > diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
> > > index e9ed2722b633..a123fd70847e 100644
> > > --- a/drivers/vhost/vhost.h
> > > +++ b/drivers/vhost/vhost.h
> > > @@ -173,6 +173,7 @@ struct vhost_dev {
> > > int iov_limit;
> > > int weight;
> > > int byte_weight;
> > > +   u64 kcov_handle;
> > >  };
> > >
> > >  bool vhost_exceeds_weight(struct vhost_virtqueue *vq, int pkts, int 
> > > total_len);
> > > --
> > > 2.23.0.866.gb869b98d4c-goog
> > >
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] drm/virtio: print a single line with device features

2019-10-23 Thread Daniel Vetter
On Wed, Oct 23, 2019 at 3:18 PM Jani Nikula  wrote:
>
> On Tue, 22 Oct 2019, Daniel Vetter  wrote:
> > On Fri, Oct 18, 2019 at 01:38:32PM +0200, Gerd Hoffmann wrote:
> >> Signed-off-by: Gerd Hoffmann 
> >> ---
> >>  drivers/gpu/drm/virtio/virtgpu_kms.c | 9 -
> >>  1 file changed, 4 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/virtio/virtgpu_kms.c 
> >> b/drivers/gpu/drm/virtio/virtgpu_kms.c
> >> index 0b3cdb0d83b0..2f5773e43557 100644
> >> --- a/drivers/gpu/drm/virtio/virtgpu_kms.c
> >> +++ b/drivers/gpu/drm/virtio/virtgpu_kms.c
> >> @@ -155,16 +155,15 @@ int virtio_gpu_init(struct drm_device *dev)
> >>  #ifdef __LITTLE_ENDIAN
> >>  if (virtio_has_feature(vgdev->vdev, VIRTIO_GPU_F_VIRGL))
> >>  vgdev->has_virgl_3d = true;
> >> -DRM_INFO("virgl 3d acceleration %s\n",
> >> - vgdev->has_virgl_3d ? "enabled" : "not supported by host");
> >> -#else
> >> -DRM_INFO("virgl 3d acceleration not supported by guest\n");
> >>  #endif
> >>  if (virtio_has_feature(vgdev->vdev, VIRTIO_GPU_F_EDID)) {
> >>  vgdev->has_edid = true;
> >> -DRM_INFO("EDID support available.\n");
> >>  }
> >>
> >> +DRM_INFO("features: %cvirgl %cedid\n",
> >> + vgdev->has_virgl_3d ? '+' : '-',
> >> + vgdev->has_edid ? '+' : '-');
> >
> > Maybe we should move the various yesno/onoff/enableddisabled helpers from
> > i915_utils.h to drm_utils.h and use them more widely?
>
> I'm trying to take it one step further by adding them to
> include/linux/string-choice.h [1]. Maybe, uh, fourth time's the charm?
>
> BR,
> Jani.
>
> [1] http://lore.kernel.org/r/20191023131308.9420-1-jani.nik...@intel.com

Oh nice, r-b: me on that.

I think the rule for new headers like this is "just do it" once you
have enough senior kernel maintainers' approval. Maybe ask Dave Airlie
for ack and with Greg's r-b then just stuff it into drm-misc-next or
so?
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH V5 6/6] docs: sample driver to demonstrate how to implement virtio-mdev framework

2019-10-23 Thread Jason Wang
This sample driver creates mdev device that simulate virtio net device
over virtio mdev transport. The device is implemented through vringh
and workqueue. A device specific dma ops is to make sure HVA is used
directly as the IOVA. This should be sufficient for kernel virtio
driver to work.

Only 'virtio' type is supported right now. I plan to add 'vhost' type
on top which requires some virtual IOMMU implemented in this sample
driver.

Signed-off-by: Jason Wang 
---
 MAINTAINERS|   1 +
 samples/Kconfig|   7 +
 samples/vfio-mdev/Makefile |   1 +
 samples/vfio-mdev/mvnet.c  | 691 +
 4 files changed, 700 insertions(+)
 create mode 100644 samples/vfio-mdev/mvnet.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 9e10ae9c2b4d..8b17927a81fd 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -17248,6 +17248,7 @@ F:  include/linux/virtio*.h
 F: include/uapi/linux/virtio_*.h
 F: drivers/crypto/virtio/
 F: mm/balloon_compaction.c
+F: samples/vfio-mdev/mvnet.c
 
 VIRTIO BLOCK AND SCSI DRIVERS
 M: "Michael S. Tsirkin" 
diff --git a/samples/Kconfig b/samples/Kconfig
index c8dacb4dda80..a1a1ca2c00b7 100644
--- a/samples/Kconfig
+++ b/samples/Kconfig
@@ -131,6 +131,13 @@ config SAMPLE_VFIO_MDEV_MDPY
  mediated device.  It is a simple framebuffer and supports
  the region display interface (VFIO_GFX_PLANE_TYPE_REGION).
 
+config SAMPLE_VIRTIO_MDEV_NET
+tristate "Build virtio mdev net example mediated device sample code -- 
loadable modules only"
+   depends on VIRTIO_MDEV_DEVICE && VHOST_RING && m
+   help
+ Build a networking sample device for use as a virtio
+ mediated device.
+
 config SAMPLE_VFIO_MDEV_MDPY_FB
tristate "Build VFIO mdpy example guest fbdev driver -- loadable module 
only"
depends on FB && m
diff --git a/samples/vfio-mdev/Makefile b/samples/vfio-mdev/Makefile
index 10d179c4fdeb..f34af90ed0a0 100644
--- a/samples/vfio-mdev/Makefile
+++ b/samples/vfio-mdev/Makefile
@@ -3,3 +3,4 @@ obj-$(CONFIG_SAMPLE_VFIO_MDEV_MTTY) += mtty.o
 obj-$(CONFIG_SAMPLE_VFIO_MDEV_MDPY) += mdpy.o
 obj-$(CONFIG_SAMPLE_VFIO_MDEV_MDPY_FB) += mdpy-fb.o
 obj-$(CONFIG_SAMPLE_VFIO_MDEV_MBOCHS) += mbochs.o
+obj-$(CONFIG_SAMPLE_VIRTIO_MDEV_NET) += mvnet.o
diff --git a/samples/vfio-mdev/mvnet.c b/samples/vfio-mdev/mvnet.c
new file mode 100644
index ..a2a902d59fb7
--- /dev/null
+++ b/samples/vfio-mdev/mvnet.c
@@ -0,0 +1,691 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Mediated virtual virtio-net device driver.
+ *
+ * Copyright (c) 2019, Red Hat Inc. All rights reserved.
+ * Author: Jason Wang 
+ *
+ * Sample driver that creates mdev device that simulates ethernet loopback
+ * device.
+ *
+ * Usage:
+ *
+ * # modprobe virtio_mdev
+ * # modprobe mvnet
+ * # cd /sys/devices/virtual/mvnet/mvnet/mdev_supported_types/mvnet-virtio
+ * # echo "83b8f4f2-509f-382f-3c1e-e6bfe0fa1001" > ./create
+ * # cd devices/83b8f4f2-509f-382f-3c1e-e6bfe0fa1001
+ * # ls -d virtio0
+ * virtio0
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define VERSION_STRING  "0.1"
+#define DRIVER_AUTHOR   "Red Hat Corporation"
+
+#define MVNET_CLASS_NAME "mvnet"
+#define MVNET_NAME   "mvnet"
+
+/*
+ * Global Structures
+ */
+
+static struct mvnet_dev {
+   struct class*vd_class;
+   struct idr  vd_idr;
+   struct device   dev;
+} mvnet_dev;
+
+struct mvnet_virtqueue {
+   struct vringh vring;
+   struct vringh_kiov iov;
+   unsigned short head;
+   bool ready;
+   u64 desc_addr;
+   u64 device_addr;
+   u64 driver_addr;
+   u32 num;
+   void *private;
+   irqreturn_t (*cb)(void *data);
+};
+
+#define MVNET_QUEUE_ALIGN PAGE_SIZE
+#define MVNET_QUEUE_MAX 256
+#define MVNET_DEVICE_ID 0x1
+#define MVNET_VENDOR_ID 0
+
+u64 mvnet_features = (1ULL << VIRTIO_F_ANY_LAYOUT) |
+(1ULL << VIRTIO_F_VERSION_1) |
+(1ULL << VIRTIO_F_IOMMU_PLATFORM);
+
+/* State of each mdev device */
+struct mvnet_state {
+   struct mvnet_virtqueue vqs[2];
+   struct work_struct work;
+   spinlock_t lock;
+   struct mdev_device *mdev;
+   struct virtio_net_config config;
+   void *buffer;
+   u32 status;
+   u32 generation;
+   u64 features;
+   struct list_head next;
+};
+
+static struct mutex mdev_list_lock;
+static struct list_head mdev_devices_list;
+
+static void mvnet_queue_ready(struct mvnet_state *mvnet, unsigned int idx)
+{
+   struct mvnet_virtqueue *vq = &mvnet->vqs[idx];
+   int ret;
+
+   ret = vringh_init_kern(&vq->vring, mvnet_features, MVNET_QUEUE_MAX,
+  false, (struct vring_desc *)vq->desc_addr,
+  (struct vring_avail *)vq->driver_addr,
+   

[PATCH V5 5/6] virtio: introduce a mdev based transport

2019-10-23 Thread Jason Wang
This patch introduces a new mdev transport for virtio. This is used to
use kernel virtio driver to drive the mediated device that is capable
of populating virtqueue directly.

A new virtio-mdev driver will be registered to the mdev bus, when a
new virtio-mdev device is probed, it will register the device with
mdev based config ops. This means it is a software transport between
mdev driver and mdev device. The transport was implemented through
device specific ops which is a part of mdev_parent_ops now.

Signed-off-by: Jason Wang 
---
 drivers/virtio/Kconfig   |   7 +
 drivers/virtio/Makefile  |   1 +
 drivers/virtio/virtio_mdev.c | 413 +++
 3 files changed, 421 insertions(+)
 create mode 100644 drivers/virtio/virtio_mdev.c

diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
index 078615cf2afc..8d18722ab572 100644
--- a/drivers/virtio/Kconfig
+++ b/drivers/virtio/Kconfig
@@ -43,6 +43,13 @@ config VIRTIO_PCI_LEGACY
 
  If unsure, say Y.
 
+config VIRTIO_MDEV_DEVICE
+   tristate "VIRTIO driver for Mediated devices"
+   depends on VFIO_MDEV && VIRTIO
+   default n
+   help
+ VIRTIO based driver for Mediated devices.
+
 config VIRTIO_PMEM
tristate "Support for virtio pmem driver"
depends on VIRTIO
diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
index 3a2b5c5dcf46..ebc7fa15ae82 100644
--- a/drivers/virtio/Makefile
+++ b/drivers/virtio/Makefile
@@ -6,3 +6,4 @@ virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o
 virtio_pci-$(CONFIG_VIRTIO_PCI_LEGACY) += virtio_pci_legacy.o
 obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o
 obj-$(CONFIG_VIRTIO_INPUT) += virtio_input.o
+obj-$(CONFIG_VIRTIO_MDEV_DEVICE) += virtio_mdev.o
diff --git a/drivers/virtio/virtio_mdev.c b/drivers/virtio/virtio_mdev.c
new file mode 100644
index ..eb2621f9182c
--- /dev/null
+++ b/drivers/virtio/virtio_mdev.c
@@ -0,0 +1,413 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * VIRTIO based driver for Mediated device
+ *
+ * Copyright (c) 2019, Red Hat. All rights reserved.
+ * Author: Jason Wang 
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define DRIVER_VERSION  "0.1"
+#define DRIVER_AUTHOR   "Red Hat Corporation"
+#define DRIVER_DESC "VIRTIO based driver for Mediated device"
+
+#define to_virtio_mdev_device(dev) \
+   container_of(dev, struct virtio_mdev_device, vdev)
+
+struct virtio_mdev_device {
+   struct virtio_device vdev;
+   struct mdev_device *mdev;
+   u64 features;
+
+   /* The lock to protect virtqueue list */
+   spinlock_t lock;
+   /* List of virtio_mdev_vq_info */
+   struct list_head virtqueues;
+};
+
+struct virtio_mdev_vq_info {
+   /* the actual virtqueue */
+   struct virtqueue *vq;
+
+   /* the list node for the virtqueues list */
+   struct list_head node;
+};
+
+static struct mdev_device *vm_get_mdev(struct virtio_device *vdev)
+{
+   struct virtio_mdev_device *vm_dev = to_virtio_mdev_device(vdev);
+   struct mdev_device *mdev = vm_dev->mdev;
+
+   return mdev;
+}
+
+static void virtio_mdev_get(struct virtio_device *vdev, unsigned offset,
+   void *buf, unsigned len)
+{
+   struct mdev_device *mdev = vm_get_mdev(vdev);
+   const struct virtio_mdev_device_ops *ops = mdev_get_virtio_ops(mdev);
+
+   ops->get_config(mdev, offset, buf, len);
+}
+
+static void virtio_mdev_set(struct virtio_device *vdev, unsigned offset,
+   const void *buf, unsigned len)
+{
+   struct mdev_device *mdev = vm_get_mdev(vdev);
+   const struct virtio_mdev_device_ops *ops = mdev_get_virtio_ops(mdev);
+
+   ops->set_config(mdev, offset, buf, len);
+}
+
+static u32 virtio_mdev_generation(struct virtio_device *vdev)
+{
+   struct mdev_device *mdev = vm_get_mdev(vdev);
+   const struct virtio_mdev_device_ops *ops = mdev_get_virtio_ops(mdev);
+
+   if (ops->get_generation)
+   return ops->get_generation(mdev);
+
+   return 0;
+}
+
+static u8 virtio_mdev_get_status(struct virtio_device *vdev)
+{
+   struct mdev_device *mdev = vm_get_mdev(vdev);
+   const struct virtio_mdev_device_ops *ops = mdev_get_virtio_ops(mdev);
+
+   return ops->get_status(mdev);
+}
+
+static void virtio_mdev_set_status(struct virtio_device *vdev, u8 status)
+{
+   struct mdev_device *mdev = vm_get_mdev(vdev);
+   const struct virtio_mdev_device_ops *ops = mdev_get_virtio_ops(mdev);
+
+   return ops->set_status(mdev, status);
+}
+
+static void virtio_mdev_reset(struct virtio_device *vdev)
+{
+   struct mdev_device *mdev = vm_get_mdev(vdev);
+   const struct virtio_mdev_device_ops *ops = mdev_get_virtio_ops(mdev);
+
+   return ops->set_status(mdev, 0);
+}
+
+static bool virtio_mdev_notify(struct virtqueue *vq)
+{
+   struct mdev_device *mdev = vm_get_mdev(vq->vd

[PATCH V5 3/6] mdev: introduce device specific ops

2019-10-23 Thread Jason Wang
Currently, except for the create and remove, the rest of
mdev_parent_ops is designed for vfio-mdev driver only and may not help
for kernel mdev driver. With the help of class id, this patch
introduces device specific callbacks inside mdev_device
structure. This allows different set of callback to be used by
vfio-mdev and virtio-mdev.

Signed-off-by: Jason Wang 
---
 .../driver-api/vfio-mediated-device.rst   | 35 +
 MAINTAINERS   |  1 +
 drivers/gpu/drm/i915/gvt/kvmgt.c  | 18 ---
 drivers/s390/cio/vfio_ccw_ops.c   | 18 ---
 drivers/s390/crypto/vfio_ap_ops.c | 14 +++--
 drivers/vfio/mdev/mdev_core.c | 25 -
 drivers/vfio/mdev/mdev_private.h  |  5 ++
 drivers/vfio/mdev/vfio_mdev.c | 37 ++---
 include/linux/mdev.h  | 43 ---
 include/linux/vfio_mdev_ops.h | 52 +++
 samples/vfio-mdev/mbochs.c| 20 ---
 samples/vfio-mdev/mdpy.c  | 20 ---
 samples/vfio-mdev/mtty.c  | 18 ---
 13 files changed, 206 insertions(+), 100 deletions(-)
 create mode 100644 include/linux/vfio_mdev_ops.h

diff --git a/Documentation/driver-api/vfio-mediated-device.rst 
b/Documentation/driver-api/vfio-mediated-device.rst
index 6709413bee29..0d8f9e7d7983 100644
--- a/Documentation/driver-api/vfio-mediated-device.rst
+++ b/Documentation/driver-api/vfio-mediated-device.rst
@@ -152,15 +152,6 @@ callbacks per mdev parent device, per mdev type, or any 
other categorization.
 Vendor drivers are expected to be fully asynchronous in this respect or
 provide their own internal resource protection.)
 
-The callbacks in the mdev_parent_ops structure are as follows:
-
-* open: open callback of mediated device
-* close: close callback of mediated device
-* ioctl: ioctl callback of mediated device
-* read : read emulation callback
-* write: write emulation callback
-* mmap: mmap emulation callback
-
 A driver should use the mdev_parent_ops structure in the function call to
 register itself with the mdev core driver::
 
@@ -172,10 +163,34 @@ that a driver should use to unregister itself with the 
mdev core driver::
 
extern void mdev_unregister_device(struct device *dev);
 
-It is also required to specify the class_id in create() callback through::
+As multiple types of mediated devices may be supported, class id needs
+to be specified in the create callback(). This could be done
+explicitly for the device that does not use on mdev bus for its
+operation through:
 
int mdev_set_class(struct mdev_device *mdev, u16 id);
 
+For the device that uses on the mdev bus for its operation, the class
+should provide helper function to set class id and device specific
+ops. E.g for vfio-mdev devices, the function to be called is::
+
+   int mdev_set_vfio_ops(struct mdev_device *mdev,
+  const struct vfio_mdev_ops *vfio_ops);
+
+The class id (set by this function to MDEV_CLASS_ID_VFIO) is used to
+match a device with an mdev driver via its id table. The device
+specific callbacks (specified in *vfio_ops) are obtainable via
+mdev_get_vfio_ops() (for use by the mdev bus driver). A vfio-mdev
+device (class id MDEV_CLASS_ID_VFIO) uses the following
+device-specific ops:
+
+* open: open callback of vfio mediated device
+* close: close callback of vfio mediated device
+* ioctl: ioctl callback of vfio mediated device
+* read : read emulation callback
+* write: write emulation callback
+* mmap: mmap emulation callback
+
 Mediated Device Management Interface Through sysfs
 ==
 
diff --git a/MAINTAINERS b/MAINTAINERS
index e51a68bf8ca8..9e10ae9c2b4d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -17121,6 +17121,7 @@ S:  Maintained
 F: Documentation/driver-api/vfio-mediated-device.rst
 F: drivers/vfio/mdev/
 F: include/linux/mdev.h
+F: include/linux/vfio_mdev_ops.h
 F: samples/vfio-mdev/
 
 VFIO PLATFORM DRIVER
diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c
index 6420f0dbd31b..c2b7f9dbe4d1 100644
--- a/drivers/gpu/drm/i915/gvt/kvmgt.c
+++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
@@ -42,6 +42,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -643,6 +644,8 @@ static void kvmgt_put_vfio_device(void *vgpu)
vfio_device_put(((struct intel_vgpu *)vgpu)->vdev.vfio_device);
 }
 
+static const struct vfio_mdev_device_ops intel_vfio_vgpu_dev_ops;
+
 static int intel_vgpu_create(struct kobject *kobj, struct mdev_device *mdev)
 {
struct intel_vgpu *vgpu = NULL;
@@ -678,7 +681,7 @@ static int intel_vgpu_create(struct kobject *kobj, struct 
mdev_device *mdev)
 dev_name(mdev_dev(mdev)));
ret = 0;
 
-   mdev_set_class(mdev, MDEV_CLASS_ID_VFIO);
+   mdev_set_vfio_ops(mdev, &intel_vfio_vgpu_dev_o

Re: [PATCH] drm/virtio: print a single line with device features

2019-10-23 Thread Jani Nikula
On Tue, 22 Oct 2019, Daniel Vetter  wrote:
> On Fri, Oct 18, 2019 at 01:38:32PM +0200, Gerd Hoffmann wrote:
>> Signed-off-by: Gerd Hoffmann 
>> ---
>>  drivers/gpu/drm/virtio/virtgpu_kms.c | 9 -
>>  1 file changed, 4 insertions(+), 5 deletions(-)
>> 
>> diff --git a/drivers/gpu/drm/virtio/virtgpu_kms.c 
>> b/drivers/gpu/drm/virtio/virtgpu_kms.c
>> index 0b3cdb0d83b0..2f5773e43557 100644
>> --- a/drivers/gpu/drm/virtio/virtgpu_kms.c
>> +++ b/drivers/gpu/drm/virtio/virtgpu_kms.c
>> @@ -155,16 +155,15 @@ int virtio_gpu_init(struct drm_device *dev)
>>  #ifdef __LITTLE_ENDIAN
>>  if (virtio_has_feature(vgdev->vdev, VIRTIO_GPU_F_VIRGL))
>>  vgdev->has_virgl_3d = true;
>> -DRM_INFO("virgl 3d acceleration %s\n",
>> - vgdev->has_virgl_3d ? "enabled" : "not supported by host");
>> -#else
>> -DRM_INFO("virgl 3d acceleration not supported by guest\n");
>>  #endif
>>  if (virtio_has_feature(vgdev->vdev, VIRTIO_GPU_F_EDID)) {
>>  vgdev->has_edid = true;
>> -DRM_INFO("EDID support available.\n");
>>  }
>>  
>> +DRM_INFO("features: %cvirgl %cedid\n",
>> + vgdev->has_virgl_3d ? '+' : '-',
>> + vgdev->has_edid ? '+' : '-');
>
> Maybe we should move the various yesno/onoff/enableddisabled helpers from
> i915_utils.h to drm_utils.h and use them more widely?

I'm trying to take it one step further by adding them to
include/linux/string-choice.h [1]. Maybe, uh, fourth time's the charm?

BR,
Jani.

[1] http://lore.kernel.org/r/20191023131308.9420-1-jani.nik...@intel.com


-- 
Jani Nikula, Intel Open Source Graphics Center
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH V5 2/6] modpost: add support for mdev class id

2019-10-23 Thread Jason Wang
Add support to parse mdev class id table.

Reviewed-by: Parav Pandit 
Signed-off-by: Jason Wang 
---
 drivers/vfio/mdev/vfio_mdev.c |  2 ++
 scripts/mod/devicetable-offsets.c |  3 +++
 scripts/mod/file2alias.c  | 10 ++
 3 files changed, 15 insertions(+)

diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
index 7b24ee9cb8dd..cb701cd646f0 100644
--- a/drivers/vfio/mdev/vfio_mdev.c
+++ b/drivers/vfio/mdev/vfio_mdev.c
@@ -125,6 +125,8 @@ static const struct mdev_class_id id_table[] = {
{ 0 },
 };
 
+MODULE_DEVICE_TABLE(mdev, id_table);
+
 static struct mdev_driver vfio_mdev_driver = {
.name   = "vfio_mdev",
.probe  = vfio_mdev_probe,
diff --git a/scripts/mod/devicetable-offsets.c 
b/scripts/mod/devicetable-offsets.c
index 054405b90ba4..6cbb1062488a 100644
--- a/scripts/mod/devicetable-offsets.c
+++ b/scripts/mod/devicetable-offsets.c
@@ -231,5 +231,8 @@ int main(void)
DEVID(wmi_device_id);
DEVID_FIELD(wmi_device_id, guid_string);
 
+   DEVID(mdev_class_id);
+   DEVID_FIELD(mdev_class_id, id);
+
return 0;
 }
diff --git a/scripts/mod/file2alias.c b/scripts/mod/file2alias.c
index c91eba751804..d365dfe7c718 100644
--- a/scripts/mod/file2alias.c
+++ b/scripts/mod/file2alias.c
@@ -1335,6 +1335,15 @@ static int do_wmi_entry(const char *filename, void 
*symval, char *alias)
return 1;
 }
 
+/* looks like: "mdev:cN" */
+static int do_mdev_entry(const char *filename, void *symval, char *alias)
+{
+   DEF_FIELD(symval, mdev_class_id, id);
+
+   sprintf(alias, "mdev:c%02X", id);
+   return 1;
+}
+
 /* Does namelen bytes of name exactly match the symbol? */
 static bool sym_is(const char *name, unsigned namelen, const char *symbol)
 {
@@ -1407,6 +1416,7 @@ static const struct devtable devtable[] = {
{"typec", SIZE_typec_device_id, do_typec_entry},
{"tee", SIZE_tee_client_device_id, do_tee_entry},
{"wmi", SIZE_wmi_device_id, do_wmi_entry},
+   {"mdev", SIZE_mdev_class_id, do_mdev_entry},
 };
 
 /* Create MODULE_ALIAS() statements.
-- 
2.19.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH V5 1/6] mdev: class id support

2019-10-23 Thread Jason Wang
Mdev bus only supports vfio driver right now, so it doesn't implement
match method. But in the future, we may add drivers other than vfio,
the first driver could be virtio-mdev. This means we need to add
device class id support in bus match method to pair the mdev device
and mdev driver correctly.

So this patch adds id_table to mdev_driver and class_id for mdev
device with the match method for mdev bus.

Signed-off-by: Jason Wang 
---
 .../driver-api/vfio-mediated-device.rst   |  5 +
 drivers/gpu/drm/i915/gvt/kvmgt.c  |  1 +
 drivers/s390/cio/vfio_ccw_ops.c   |  1 +
 drivers/s390/crypto/vfio_ap_ops.c |  1 +
 drivers/vfio/mdev/mdev_core.c | 18 +++
 drivers/vfio/mdev/mdev_driver.c   | 22 +++
 drivers/vfio/mdev/mdev_private.h  |  1 +
 drivers/vfio/mdev/vfio_mdev.c |  6 +
 include/linux/mdev.h  |  8 +++
 include/linux/mod_devicetable.h   |  8 +++
 samples/vfio-mdev/mbochs.c|  1 +
 samples/vfio-mdev/mdpy.c  |  1 +
 samples/vfio-mdev/mtty.c  |  1 +
 13 files changed, 74 insertions(+)

diff --git a/Documentation/driver-api/vfio-mediated-device.rst 
b/Documentation/driver-api/vfio-mediated-device.rst
index 25eb7d5b834b..6709413bee29 100644
--- a/Documentation/driver-api/vfio-mediated-device.rst
+++ b/Documentation/driver-api/vfio-mediated-device.rst
@@ -102,12 +102,14 @@ structure to represent a mediated device's driver::
   * @probe: called when new device created
   * @remove: called when device removed
   * @driver: device driver structure
+  * @id_table: the ids serviced by this driver
   */
  struct mdev_driver {
 const char *name;
 int  (*probe)  (struct device *dev);
 void (*remove) (struct device *dev);
 struct device_driverdriver;
+const struct mdev_class_id *id_table;
  };
 
 A mediated bus driver for mdev should use this structure in the function calls
@@ -170,6 +172,9 @@ that a driver should use to unregister itself with the mdev 
core driver::
 
extern void mdev_unregister_device(struct device *dev);
 
+It is also required to specify the class_id in create() callback through::
+
+   int mdev_set_class(struct mdev_device *mdev, u16 id);
 
 Mediated Device Management Interface Through sysfs
 ==
diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c
index 343d79c1cb7e..6420f0dbd31b 100644
--- a/drivers/gpu/drm/i915/gvt/kvmgt.c
+++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
@@ -678,6 +678,7 @@ static int intel_vgpu_create(struct kobject *kobj, struct 
mdev_device *mdev)
 dev_name(mdev_dev(mdev)));
ret = 0;
 
+   mdev_set_class(mdev, MDEV_CLASS_ID_VFIO);
 out:
return ret;
 }
diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
index f0d71ab77c50..cf2c013ae32f 100644
--- a/drivers/s390/cio/vfio_ccw_ops.c
+++ b/drivers/s390/cio/vfio_ccw_ops.c
@@ -129,6 +129,7 @@ static int vfio_ccw_mdev_create(struct kobject *kobj, 
struct mdev_device *mdev)
   private->sch->schid.ssid,
   private->sch->schid.sch_no);
 
+   mdev_set_class(mdev, MDEV_CLASS_ID_VFIO);
return 0;
 }
 
diff --git a/drivers/s390/crypto/vfio_ap_ops.c 
b/drivers/s390/crypto/vfio_ap_ops.c
index 5c0f53c6dde7..07c31070afeb 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -343,6 +343,7 @@ static int vfio_ap_mdev_create(struct kobject *kobj, struct 
mdev_device *mdev)
list_add(&matrix_mdev->node, &matrix_dev->mdev_list);
mutex_unlock(&matrix_dev->lock);
 
+   mdev_set_class(mdev, MDEV_CLASS_ID_VFIO);
return 0;
 }
 
diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
index b558d4cfd082..3a9c52d71b4e 100644
--- a/drivers/vfio/mdev/mdev_core.c
+++ b/drivers/vfio/mdev/mdev_core.c
@@ -45,6 +45,16 @@ void mdev_set_drvdata(struct mdev_device *mdev, void *data)
 }
 EXPORT_SYMBOL(mdev_set_drvdata);
 
+/* Specify the class for the mdev device, this must be called during
+ * create() callback.
+ */
+void mdev_set_class(struct mdev_device *mdev, u16 id)
+{
+   WARN_ON(mdev->class_id);
+   mdev->class_id = id;
+}
+EXPORT_SYMBOL(mdev_set_class);
+
 struct device *mdev_dev(struct mdev_device *mdev)
 {
return &mdev->dev;
@@ -135,6 +145,7 @@ static int mdev_device_remove_cb(struct device *dev, void 
*data)
  * mdev_register_device : Register a device
  * @dev: device structure representing parent device.
  * @ops: Parent device operation structure to be registered.
+ * @id: class id.
  *
  * Add device to list of registered parent devices.
  * Returns a negative value on error, otherwise 0.
@@ -324,6 +335,13 @@ int mdev_device_create(struct ko

[PATCH V5 0/6] mdev based hardware virtio offloading support

2019-10-23 Thread Jason Wang
Hi all:

There are hardwares that can do virtio datapath offloading while
having its own control path. This path tries to implement a mdev based
unified API to support using kernel virtio driver to drive those
devices. This is done by introducing a new mdev transport for virtio
(virtio_mdev) and register itself as a new kind of mdev driver. Then
it provides a unified way for kernel virtio driver to talk with mdev
device implementation.

Though the series only contains kernel driver support, the goal is to
make the transport generic enough to support userspace drivers. This
means vhost-mdev[1] could be built on top as well by resuing the
transport.

A sample driver is also implemented which simulate a virito-net
loopback ethernet device on top of vringh + workqueue. This could be
used as a reference implementation for real hardware driver.

Also a real ICF VF driver was also posted here[2] which is a good
reference for vendors who is interested in their own virtio datapath
offloading product.

Consider mdev framework only support VFIO device and driver right now,
this series also extend it to support other types. This is done
through introducing class id to the device and pairing it with
id_talbe claimed by the driver. On top, this seris also decouple
device specific parents ops out of the common ones.

Pktgen test was done with virito-net + mvnet loop back device.

Please review.

[1] https://lkml.org/lkml/2019/10/22/262
[2] https://lkml.org/lkml/2019/10/15/1226

Changes from V4:

- keep mdev_set_class() for the device that doesn't use device ops
- use union for device ops pointer in mdev_device
- introduce class specific helper for getting is device ops
- use WARN_ON instead of BUG_ON in mdev_set_virtio_ops
- explain details of get_mdev_features() and get_vendor_id()
- distinguish the optional virito device ops from mandatory ones and
  make get_generation() optional
- rename vfio_mdev.h to vfio_mdev_ops.h, rename virito_mdev.h to
  virtio_mdev_ops.h
- don't abuse version fileds in virtio_mdev structure, use features
  instead
- fix warning during device remove
- style & docs tweaks and typo fixes

Changes from V3:

- document that class id (device ops) must be specified in create()
- add WARN() when trying to set class_id when it has already set
- add WARN() when class_id is not specified in create() and correctly
  return an error in this case
- correct the prototype of mdev_set_class() in the doc
- add documention of mdev_set_class()
- remove the unnecessary "class_id_fail" label when class id is not
  specified in create()
- convert id_table in vfio_mdev to const
- move mdev_set_class and its friends after mdev_uuid()
- suqash the patch of bus uevent into patch of introducing class id
- tweak the words in the docs per Cornelia suggestion
- tie class_id and device ops through class specific initialization
  routine like mdev_set_vfio_ops()
- typos fixes in the docs of virtio-mdev callbacks
- document the usage of virtqueues in struct virtio_mdev_device
- remove the useless vqs array in struct virtio_mdev_device
- rename MDEV_ID_XXX to MDEV_CLASS_ID_XXX

Changes from V2:

- fail when class_id is not specified
- drop the vringh patch
- match the doc to the code
- tweak the commit log
- move device_ops from parent to mdev device
- remove the unused MDEV_ID_VHOST

Changes from V1:

- move virtio_mdev.c to drivers/virtio
- store class_id in mdev_device instead of mdev_parent
- store device_ops in mdev_device instead of mdev_parent
- reorder the patch, vringh fix comes first
- really silent compiling warnings
- really switch to use u16 for class_id
- uevent and modpost support for mdev class_id
- vraious tweaks per comments from Parav

Changes from RFC-V2:

- silent compile warnings on some specific configuration
- use u16 instead u8 for class id
- reseve MDEV_ID_VHOST for future vhost-mdev work
- introduce "virtio" type for mvnet and make "vhost" type for future
  work
- add entries in MAINTAINER
- tweak and typos fixes in commit log

Changes from RFC-V1:

- rename device id to class id
- add docs for class id and device specific ops (device_ops)
- split device_ops into seperate headers
- drop the mdev_set_dma_ops()
- use device_ops to implement the transport API, then it's not a part
  of UAPI any more
- use GFP_ATOMIC in mvnet sample device and other tweaks
- set_vring_base/get_vring_base support for mvnet device

Jason Wang (6):
  mdev: class id support
  modpost: add support for mdev class id
  mdev: introduce device specific ops
  mdev: introduce virtio device and its device ops
  virtio: introduce a mdev based transport
  docs: sample driver to demonstrate how to implement virtio-mdev
framework

 .../driver-api/vfio-mediated-device.rst   |  38 +-
 MAINTAINERS   |   2 +
 drivers/gpu/drm/i915/gvt/kvmgt.c  |  17 +-
 drivers/s390/cio/vfio_ccw_ops.c   |  17 +-
 drivers/s390/crypto/vfio_ap_ops.c |  13 +-
 drivers/vfio/mdev/mdev_core

Re: [RFC 1/2] vhost: IFC VF hardware operation layer

2019-10-23 Thread Jason Wang


On 2019/10/23 下午6:13, Simon Horman wrote:

On Tue, Oct 22, 2019 at 09:32:36AM +0800, Jason Wang wrote:

On 2019/10/22 上午12:31, Simon Horman wrote:

On Mon, Oct 21, 2019 at 05:55:33PM +0800, Zhu, Lingshan wrote:

On 10/16/2019 5:53 PM, Simon Horman wrote:

Hi Zhu,

thanks for your patch.

On Wed, Oct 16, 2019 at 09:10:40AM +0800, Zhu Lingshan wrote:

...


+static void ifcvf_read_dev_config(struct ifcvf_hw *hw, u64 offset,
+  void *dst, int length)
+{
+   int i;
+   u8 *p;
+   u8 old_gen, new_gen;
+
+   do {
+   old_gen = ioread8(&hw->common_cfg->config_generation);
+
+   p = dst;
+   for (i = 0; i < length; i++)
+   *p++ = ioread8((u8 *)hw->dev_cfg + offset + i);
+
+   new_gen = ioread8(&hw->common_cfg->config_generation);
+   } while (old_gen != new_gen);

Would it be wise to limit the number of iterations of the loop above?

Thanks but I don't quite get it. This is used to make sure the function
would get the latest config.

I am worried about the possibility that it will loop forever.
Could that happen?

...

My understanding is that the function here is similar to virtio config
generation [1]. So this can only happen for a buggy hardware.

Ok, so this circles back to my original question.
Should we put a bound on the number of times the loop runs
or should we accept that the kernel locks up if the HW is buggy?



I'm not sure, and similar logic has been used by virtio-pci drivers for 
years. Consider this logic is pretty simple and it should not be the 
only place that virito hardware can lock kernel, we can keep it as is.


Actually, there's no need for hardware to implement generation logic, it 
could be emulated by software or even ignored. In new version of 
virtio-mdev, get_generation() is optional, when it was not implemented, 
0 is simply returned by virtio-mdev transport.


Thanks

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v2] vhost: introduce mdev based hardware backend

2019-10-23 Thread Jason Wang


On 2019/10/23 下午6:11, Tiwei Bie wrote:

On Wed, Oct 23, 2019 at 03:25:00PM +0800, Jason Wang wrote:

On 2019/10/23 下午3:07, Tiwei Bie wrote:

On Wed, Oct 23, 2019 at 01:46:23PM +0800, Jason Wang wrote:

On 2019/10/23 上午11:02, Tiwei Bie wrote:

On Tue, Oct 22, 2019 at 09:30:16PM +0800, Jason Wang wrote:

On 2019/10/22 下午5:52, Tiwei Bie wrote:

This patch introduces a mdev based hardware vhost backend.
This backend is built on top of the same abstraction used
in virtio-mdev and provides a generic vhost interface for
userspace to accelerate the virtio devices in guest.

This backend is implemented as a mdev device driver on top
of the same mdev device ops used in virtio-mdev but using
a different mdev class id, and it will register the device
as a VFIO device for userspace to use. Userspace can setup
the IOMMU with the existing VFIO container/group APIs and
then get the device fd with the device name. After getting
the device fd of this device, userspace can use vhost ioctls
to setup the backend.

Signed-off-by: Tiwei Bie 
---
This patch depends on below series:
https://lkml.org/lkml/2019/10/17/286

v1 -> v2:
- Replace _SET_STATE with _SET_STATUS (MST);
- Check status bits at each step (MST);
- Report the max ring size and max number of queues (MST);
- Add missing MODULE_DEVICE_TABLE (Jason);
- Only support the network backend w/o multiqueue for now;

Any idea on how to extend it to support devices other than net? I think we
want a generic API or an API that could be made generic in the future.

Do we want to e.g having a generic vhost mdev for all kinds of devices or
introducing e.g vhost-net-mdev and vhost-scsi-mdev?

One possible way is to do what vhost-user does. I.e. Apart from
the generic ring, features, ... related ioctls, we also introduce
device specific ioctls when we need them. As vhost-mdev just needs
to forward configs between parent and userspace and even won't
cache any info when possible,

So it looks to me this is only possible if we expose e.g set_config and
get_config to userspace.

The set_config and get_config interface isn't really everything
of device specific settings. We also have ctrlq in virtio-net.


Yes, but it could be processed by the exist API. Isn't it? Just set ctrl vq
address and let parent to deal with that.

I mean how to expose ctrlq related settings to userspace?



I think it works like:

1) userspace find ctrl_vq is supported

2) then it can allocate memory for ctrl vq and set its address through 
vhost-mdev


3) userspace can populate ctrl vq itself







I think it might be better to do
this in one generic vhost-mdev module.

Looking at definitions of VhostUserRequest in qemu, it mixed generic API
with device specific API. If we want go this ways (a generic vhost-mdev),
more questions needs to be answered:

1) How could userspace know which type of vhost it would use? Do we need to
expose virtio subsystem device in for userspace this case?

2) That generic vhost-mdev module still need to filter out unsupported
ioctls for a specific type. E.g if it probes a net device, it should refuse
API for other type. This in fact a vhost-mdev-net but just not modularize it
on top of vhost-mdev.



- Some minor fixes and improvements;
- Rebase on top of virtio-mdev series v4;

[...]

+
+static long vhost_mdev_get_features(struct vhost_mdev *m, u64 __user *featurep)
+{
+   if (copy_to_user(featurep, &m->features, sizeof(m->features)))
+   return -EFAULT;

As discussed in previous version do we need to filter out MQ feature here?

I think it's more straightforward to let the parent drivers to
filter out the unsupported features. Otherwise it would be tricky
when we want to add more features in vhost-mdev module,

It's as simple as remove the feature from blacklist?

It's not really that easy. It may break the old drivers.


I'm not sure I understand here, we do feature negotiation anyhow. For old
drivers do you mean the guest drivers without MQ?

For old drivers I mean old parent drivers. It's possible
to compile old drivers on new kernels.



Yes, but if old parent driver itself can not support MQ it should just 
not advertise that feature.





I'm not quite sure how will we implement MQ support in
vhost-mdev.



Yes, that's why I ask here. I think we want the vhost-mdev to be generic 
which means it's better not let vhost-mdev to know anything which is 
device specific. So this is a question that should be considered.




If we need to introduce new virtio_mdev_device_ops
callbacks and an old driver exposed the MQ feature,
then the new vhost-mdev will see this old driver expose
MQ feature but not provide corresponding callbacks.ean



That's exact the issue which current API can not handle, so that's why I 
suggest to filter MQ out for vhost-mdev.


And in the future, we can:

1) invent new ioctls and convert them to config access or

2) just exposing config for userspace to access (then vhost-mdev work 
much more similar to virtio-mdev).








i.e. if
the par

Re: [PATCH v2] vhost: introduce mdev based hardware backend

2019-10-23 Thread Tiwei Bie
On Wed, Oct 23, 2019 at 03:25:00PM +0800, Jason Wang wrote:
> On 2019/10/23 下午3:07, Tiwei Bie wrote:
> > On Wed, Oct 23, 2019 at 01:46:23PM +0800, Jason Wang wrote:
> > > On 2019/10/23 上午11:02, Tiwei Bie wrote:
> > > > On Tue, Oct 22, 2019 at 09:30:16PM +0800, Jason Wang wrote:
> > > > > On 2019/10/22 下午5:52, Tiwei Bie wrote:
> > > > > > This patch introduces a mdev based hardware vhost backend.
> > > > > > This backend is built on top of the same abstraction used
> > > > > > in virtio-mdev and provides a generic vhost interface for
> > > > > > userspace to accelerate the virtio devices in guest.
> > > > > > 
> > > > > > This backend is implemented as a mdev device driver on top
> > > > > > of the same mdev device ops used in virtio-mdev but using
> > > > > > a different mdev class id, and it will register the device
> > > > > > as a VFIO device for userspace to use. Userspace can setup
> > > > > > the IOMMU with the existing VFIO container/group APIs and
> > > > > > then get the device fd with the device name. After getting
> > > > > > the device fd of this device, userspace can use vhost ioctls
> > > > > > to setup the backend.
> > > > > > 
> > > > > > Signed-off-by: Tiwei Bie 
> > > > > > ---
> > > > > > This patch depends on below series:
> > > > > > https://lkml.org/lkml/2019/10/17/286
> > > > > > 
> > > > > > v1 -> v2:
> > > > > > - Replace _SET_STATE with _SET_STATUS (MST);
> > > > > > - Check status bits at each step (MST);
> > > > > > - Report the max ring size and max number of queues (MST);
> > > > > > - Add missing MODULE_DEVICE_TABLE (Jason);
> > > > > > - Only support the network backend w/o multiqueue for now;
> > > > > Any idea on how to extend it to support devices other than net? I 
> > > > > think we
> > > > > want a generic API or an API that could be made generic in the future.
> > > > > 
> > > > > Do we want to e.g having a generic vhost mdev for all kinds of 
> > > > > devices or
> > > > > introducing e.g vhost-net-mdev and vhost-scsi-mdev?
> > > > One possible way is to do what vhost-user does. I.e. Apart from
> > > > the generic ring, features, ... related ioctls, we also introduce
> > > > device specific ioctls when we need them. As vhost-mdev just needs
> > > > to forward configs between parent and userspace and even won't
> > > > cache any info when possible,
> > > 
> > > So it looks to me this is only possible if we expose e.g set_config and
> > > get_config to userspace.
> > The set_config and get_config interface isn't really everything
> > of device specific settings. We also have ctrlq in virtio-net.
> 
> 
> Yes, but it could be processed by the exist API. Isn't it? Just set ctrl vq
> address and let parent to deal with that.

I mean how to expose ctrlq related settings to userspace?

> 
> 
> > 
> > > 
> > > > I think it might be better to do
> > > > this in one generic vhost-mdev module.
> > > 
> > > Looking at definitions of VhostUserRequest in qemu, it mixed generic API
> > > with device specific API. If we want go this ways (a generic vhost-mdev),
> > > more questions needs to be answered:
> > > 
> > > 1) How could userspace know which type of vhost it would use? Do we need 
> > > to
> > > expose virtio subsystem device in for userspace this case?
> > > 
> > > 2) That generic vhost-mdev module still need to filter out unsupported
> > > ioctls for a specific type. E.g if it probes a net device, it should 
> > > refuse
> > > API for other type. This in fact a vhost-mdev-net but just not modularize 
> > > it
> > > on top of vhost-mdev.
> > > 
> > > 
> > > > > > - Some minor fixes and improvements;
> > > > > > - Rebase on top of virtio-mdev series v4;
> > [...]
> > > > > > +
> > > > > > +static long vhost_mdev_get_features(struct vhost_mdev *m, u64 
> > > > > > __user *featurep)
> > > > > > +{
> > > > > > +   if (copy_to_user(featurep, &m->features, sizeof(m->features)))
> > > > > > +   return -EFAULT;
> > > > > As discussed in previous version do we need to filter out MQ feature 
> > > > > here?
> > > > I think it's more straightforward to let the parent drivers to
> > > > filter out the unsupported features. Otherwise it would be tricky
> > > > when we want to add more features in vhost-mdev module,
> > > 
> > > It's as simple as remove the feature from blacklist?
> > It's not really that easy. It may break the old drivers.
> 
> 
> I'm not sure I understand here, we do feature negotiation anyhow. For old
> drivers do you mean the guest drivers without MQ?

For old drivers I mean old parent drivers. It's possible
to compile old drivers on new kernels.

I'm not quite sure how will we implement MQ support in
vhost-mdev. If we need to introduce new virtio_mdev_device_ops
callbacks and an old driver exposed the MQ feature,
then the new vhost-mdev will see this old driver expose
MQ feature but not provide corresponding callbacks.

> 
> 
> > 
> > > 
> > > > i.e. if
> > > > the parent drivers may expose unsupported features and relay on
> > > > vhost-mdev to filter them o

[PATCH] drm/simple-kms: Standardize arguments for callbacks

2019-10-23 Thread Daniel Vetter
Passing the wrong type feels icky, everywhere else we use the pipe as
the first parameter. Spotted while discussing patches with Thomas
Zimmermann.

v2: Make xen compile correctly

Acked-By: Thomas Zimmermann  (v1)
Cc: Thomas Zimmermann 
Cc: Noralf Trønnes 
Cc: Gerd Hoffmann 
Cc: Eric Anholt 
Cc: Emil Velikov 
Cc: virtualization@lists.linux-foundation.org
Cc: Linus Walleij 
Signed-off-by: Daniel Vetter 
---
 drivers/gpu/drm/cirrus/cirrus.c | 2 +-
 drivers/gpu/drm/drm_simple_kms_helper.c | 2 +-
 drivers/gpu/drm/pl111/pl111_display.c   | 4 ++--
 drivers/gpu/drm/xen/xen_drm_front_kms.c | 7 ---
 include/drm/drm_simple_kms_helper.h | 2 +-
 5 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/cirrus/cirrus.c b/drivers/gpu/drm/cirrus/cirrus.c
index 7d08d067e1a4..248c9f765c45 100644
--- a/drivers/gpu/drm/cirrus/cirrus.c
+++ b/drivers/gpu/drm/cirrus/cirrus.c
@@ -390,7 +390,7 @@ static int cirrus_conn_init(struct cirrus_device *cirrus)
 /* -- */
 /* cirrus (simple) display pipe  */
 
-static enum drm_mode_status cirrus_pipe_mode_valid(struct drm_crtc *crtc,
+static enum drm_mode_status cirrus_pipe_mode_valid(struct 
drm_simple_display_pipe *pipe,
   const struct 
drm_display_mode *mode)
 {
if (cirrus_check_size(mode->hdisplay, mode->vdisplay, NULL) < 0)
diff --git a/drivers/gpu/drm/drm_simple_kms_helper.c 
b/drivers/gpu/drm/drm_simple_kms_helper.c
index 046055719245..15fb516ae2d8 100644
--- a/drivers/gpu/drm/drm_simple_kms_helper.c
+++ b/drivers/gpu/drm/drm_simple_kms_helper.c
@@ -43,7 +43,7 @@ drm_simple_kms_crtc_mode_valid(struct drm_crtc *crtc,
/* Anything goes */
return MODE_OK;
 
-   return pipe->funcs->mode_valid(crtc, mode);
+   return pipe->funcs->mode_valid(pipe, mode);
 }
 
 static int drm_simple_kms_crtc_check(struct drm_crtc *crtc,
diff --git a/drivers/gpu/drm/pl111/pl111_display.c 
b/drivers/gpu/drm/pl111/pl111_display.c
index 024771a4083e..703ddc803c55 100644
--- a/drivers/gpu/drm/pl111/pl111_display.c
+++ b/drivers/gpu/drm/pl111/pl111_display.c
@@ -48,10 +48,10 @@ irqreturn_t pl111_irq(int irq, void *data)
 }
 
 static enum drm_mode_status
-pl111_mode_valid(struct drm_crtc *crtc,
+pl111_mode_valid(struct drm_simple_display_pipe *pipe,
 const struct drm_display_mode *mode)
 {
-   struct drm_device *drm = crtc->dev;
+   struct drm_device *drm = pipe->crtc.dev;
struct pl111_drm_dev_private *priv = drm->dev_private;
u32 cpp = priv->variant->fb_bpp / 8;
u64 bw;
diff --git a/drivers/gpu/drm/xen/xen_drm_front_kms.c 
b/drivers/gpu/drm/xen/xen_drm_front_kms.c
index 21ad1c359b61..ff506bc99414 100644
--- a/drivers/gpu/drm/xen/xen_drm_front_kms.c
+++ b/drivers/gpu/drm/xen/xen_drm_front_kms.c
@@ -270,11 +270,12 @@ static void display_update(struct drm_simple_display_pipe 
*pipe,
 }
 
 static enum drm_mode_status
-display_mode_valid(struct drm_crtc *crtc, const struct drm_display_mode *mode)
+display_mode_valid(struct drm_simple_display_pipe *pipe,
+  const struct drm_display_mode *mode)
 {
struct xen_drm_front_drm_pipeline *pipeline =
-   container_of(crtc, struct xen_drm_front_drm_pipeline,
-pipe.crtc);
+   container_of(pipe, struct xen_drm_front_drm_pipeline,
+pipe);
 
if (mode->hdisplay != pipeline->width)
return MODE_ERROR;
diff --git a/include/drm/drm_simple_kms_helper.h 
b/include/drm/drm_simple_kms_helper.h
index 4d89cd0a60db..15afee9cf049 100644
--- a/include/drm/drm_simple_kms_helper.h
+++ b/include/drm/drm_simple_kms_helper.h
@@ -49,7 +49,7 @@ struct drm_simple_display_pipe_funcs {
 *
 * drm_mode_status Enum
 */
-   enum drm_mode_status (*mode_valid)(struct drm_crtc *crtc,
+   enum drm_mode_status (*mode_valid)(struct drm_simple_display_pipe *pipe,
   const struct drm_display_mode *mode);
 
/**
-- 
2.23.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH RFC v3 6/9] mm: Allow to offline PageOffline() pages with a reference count of 0

2019-10-23 Thread David Hildenbrand

On 23.10.19 11:43, Michal Hocko wrote:

On Tue 22-10-19 16:02:09, David Hildenbrand wrote:
[...]

MEM_CANCEL_OFFLINE could gain the reference back to balance the
MEM_GOING_OFFLINE step.


The pages are already unisolated and could be used by the buddy. But again,
I think you have an idea that tries to avoid putting pages to the buddy.


Yeah, set_page_count(page, 0) if you do not want to release that page
from the notifier context to reflect that the page is ok to be offlined
with the rest.
   


I neither see how you deal with __test_page_isolated_in_pageblock() nor with
__offline_isolated_pages(). Sorry, but what I read is incomplete and you
probably have a full proposal in your head. Please read below how I think
you want to solve it.


Yeah, sorry that I am throwing incomplete ideas at you. I am just trying
to really nail down how to deal with reference counting here because it
is an important aspect.


I think we collected all the missing pieces now :) Thanks!

[...]



I was reading

include/linux/mm_types.h:

"If you want to use the refcount field, it must be used in such a way
  that other CPUs temporarily incrementing and then decrementing the
  refcount does not cause problems"

And that made me think "anybody can go ahead and try get_page_unless_zero()".

If I am missing something here and this can indeed not happen (e.g.,
because PageOffline() pages are never mapped to user space), then I'll
happily remove this code.


The point is that if the owner of the page is holding the only reference
to the page then it is clear that nothing like that's happened.


Right, and I think the race I described won't happen in practice. Nobody 
should be trying to do a get_page_unless_zero() on random pages that are 
not even mapped to user space. I was (as so often) very careful :)



Let's recap what I suggest:

"PageOffline() pages that have a reference count of 0 will be treated
  like free pages when offlining pages, allowing the containing memory
  block to get offlined. In case a driver wants to revive such a page, it
  has to synchronize against memory onlining/offlining (e.g., using memory
  notifiers) while incrementing the reference count. Also, a driver that
  relies in this feature is aware that re-onlining the memory will require
  to re-set the pages PageOffline() - e.g., via the online_page_callback_t."


OK

[...]

d) __put_page() is modified to not return pages to the buddy in any
case as a safety net. We might be able to get rid of that.


I do not like exactly this part


Yeah, and I think I can drop it from this patch.

  

What I think you suggest:

a) has_unmovable_pages() skips over all PageOffline() pages.
This results in a lot of false negatives when trying to offline. Might be 
ok.

b) The driver decrements the reference count of the PageOffline pages
in MEM_GOING_OFFLINE.


Well, driver should make the page unreferenced or fail. What is done
really depends on the specific driver


c) The driver increments the reference count of the PageOffline pages
in MEM_CANCEL_OFFLINE. One issue might be that the pages are no longer
isolated once we get that call. Might be ok.


Only previous PageBuddy pages are returned to the allocator IIRC. Mostly
because of MovablePage()


d) How to make __test_page_isolated_in_pageblock() succeed?
Like I propose in this patch (PageOffline() + refcount == 0)?


Yep


e) How to make __offline_isolated_pages() succeed?
Like I propose in this patch (PageOffline() + refcount == 0)?


Simply skip over PageOffline pages. Reference count should never be != 0
at this stage.


Right, that should be guaranteed by d). (as long as people play by the 
rules) Same applies to my current patch.


  

In summary, is what you suggest simply delaying setting the reference count to 0
in MEM_GOING_OFFLINE instead of right away when the driver unpluggs the pages?


Yes


What's the big benefit you see and I fail to see?


Aparat from no hooks into __put_page it is also an explicit control over
the page via reference counting. Do you see any downsides?


The only downside I see is that we get more false negatives on 
has_unmovable_pages(), eventually resulting in the offlining stage after 
isolation to loop forever (as some PageOffline() pages are not movable 
(especially, XEN balloon, HyperV balloon), there won't be progress).


I somewhat don't like forcing everybody that uses PageOffline() 
(especially all users of balloon compaction) to implement memory 
notifiers just to avoid that. Maybe, we even want to use PageOffline() 
in the future in the core (e.g., for memory holes instead of PG_reserved 
or similar).


Thanks!

--

Thanks,

David / dhildenb

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH net-next 13/14] vsock: prevent transport modules unloading

2019-10-23 Thread Stefano Garzarella
This patch adds 'module' member in the 'struct vsock_transport'
in order to get/put the transport module. This prevents the
module unloading while sockets are assigned to it.

We increase the module refcnt when a socket is assigned to a
transport, and we decrease the module refcnt when the socket
is destructed.

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Stefano Garzarella 
---
RFC -> v1:
- fixed typo 's/tranport/transport/' in a comment (Stefan)
---
 drivers/vhost/vsock.c|  2 ++
 include/net/af_vsock.h   |  2 ++
 net/vmw_vsock/af_vsock.c | 20 
 net/vmw_vsock/hyperv_transport.c |  2 ++
 net/vmw_vsock/virtio_transport.c |  2 ++
 net/vmw_vsock/vmci_transport.c   |  1 +
 6 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index b235f4bbe8ea..fdda9ec625ad 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -386,6 +386,8 @@ static bool vhost_vsock_more_replies(struct vhost_vsock 
*vsock)
 
 static struct virtio_transport vhost_transport = {
.transport = {
+   .module   = THIS_MODULE,
+
.get_local_cid= vhost_transport_get_local_cid,
 
.init = virtio_transport_do_socket_init,
diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index 27a3463e4892..269e2f034789 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -100,6 +100,8 @@ struct vsock_transport_send_notify_data {
 #define VSOCK_TRANSPORT_F_DGRAM0x0004
 
 struct vsock_transport {
+   struct module *module;
+
/* Initialize/tear-down socket. */
int (*init)(struct vsock_sock *, struct vsock_sock *);
void (*destruct)(struct vsock_sock *);
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 85d9a147..1f2e707cae66 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -380,6 +380,16 @@ void vsock_enqueue_accept(struct sock *listener, struct 
sock *connected)
 }
 EXPORT_SYMBOL_GPL(vsock_enqueue_accept);
 
+static void vsock_deassign_transport(struct vsock_sock *vsk)
+{
+   if (!vsk->transport)
+   return;
+
+   vsk->transport->destruct(vsk);
+   module_put(vsk->transport->module);
+   vsk->transport = NULL;
+}
+
 /* Assign a transport to a socket and call the .init transport callback.
  *
  * Note: for stream socket this must be called when vsk->remote_addr is set
@@ -413,10 +423,13 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct 
vsock_sock *psk)
return 0;
 
vsk->transport->release(vsk);
-   vsk->transport->destruct(vsk);
+   vsock_deassign_transport(vsk);
}
 
-   if (!new_transport)
+   /* We increase the module refcnt to prevent the transport unloading
+* while there are open sockets assigned to it.
+*/
+   if (!new_transport || !try_module_get(new_transport->module))
return -ENODEV;
 
vsk->transport = new_transport;
@@ -737,8 +750,7 @@ static void vsock_sk_destruct(struct sock *sk)
 {
struct vsock_sock *vsk = vsock_sk(sk);
 
-   if (vsk->transport)
-   vsk->transport->destruct(vsk);
+   vsock_deassign_transport(vsk);
 
/* When clearing these addresses, there's no need to set the family and
 * possibly register the address family with the kernel.
diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c
index 0ea66d87af39..d0a349d85414 100644
--- a/net/vmw_vsock/hyperv_transport.c
+++ b/net/vmw_vsock/hyperv_transport.c
@@ -857,6 +857,8 @@ int hvs_notify_send_post_enqueue(struct vsock_sock *vsk, 
ssize_t written,
 }
 
 static struct vsock_transport hvs_transport = {
+   .module   = THIS_MODULE,
+
.get_local_cid= hvs_get_local_cid,
 
.init = hvs_sock_init,
diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index 83ad85050384..1458c5c8b64d 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -462,6 +462,8 @@ static void virtio_vsock_rx_done(struct virtqueue *vq)
 
 static struct virtio_transport virtio_transport = {
.transport = {
+   .module   = THIS_MODULE,
+
.get_local_cid= virtio_transport_get_local_cid,
 
.init = virtio_transport_do_socket_init,
diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c
index 04437f822d82..0cbf023fae11 100644
--- a/net/vmw_vsock/vmci_transport.c
+++ b/net/vmw_vsock/vmci_transport.c
@@ -2019,6 +2019,7 @@ static u32 vmci_transport_get_local_cid(void)
 }
 
 static struct vsock_transport vmci_transport = {
+   .module = THIS_MODULE,
.init = vmci_transport_socket_init,
.destruct = vmci_transport_destruct,
 

[PATCH net-next 14/14] vsock: fix bind() behaviour taking care of CID

2019-10-23 Thread Stefano Garzarella
When we are looking for a socket bound to a specific address,
we also have to take into account the CID.

This patch is useful with multi-transports support because it
allows the binding of the same port with different CID, and
it prevents a connection to a wrong socket bound to the same
port, but with different CID.

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Stefano Garzarella 
---
 net/vmw_vsock/af_vsock.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 1f2e707cae66..7183de277072 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -228,10 +228,16 @@ static struct sock *__vsock_find_bound_socket(struct 
sockaddr_vm *addr)
 {
struct vsock_sock *vsk;
 
-   list_for_each_entry(vsk, vsock_bound_sockets(addr), bound_table)
-   if (addr->svm_port == vsk->local_addr.svm_port)
+   list_for_each_entry(vsk, vsock_bound_sockets(addr), bound_table) {
+   if (vsock_addr_equals_addr(addr, &vsk->local_addr))
return sk_vsock(vsk);
 
+   if (addr->svm_port == vsk->local_addr.svm_port &&
+   (vsk->local_addr.svm_cid == VMADDR_CID_ANY ||
+addr->svm_cid == VMADDR_CID_ANY))
+   return sk_vsock(vsk);
+   }
+
return NULL;
 }
 
-- 
2.21.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH net-next 10/14] hv_sock: set VMADDR_CID_HOST in the hvs_remote_addr_init()

2019-10-23 Thread Stefano Garzarella
Remote peer is always the host, so we set VMADDR_CID_HOST as
remote CID instead of VMADDR_CID_ANY.

Reviewed-by: Dexuan Cui 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Stefano Garzarella 
---
 net/vmw_vsock/hyperv_transport.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c
index 0ce792a1bf6c..fc7e61765a4a 100644
--- a/net/vmw_vsock/hyperv_transport.c
+++ b/net/vmw_vsock/hyperv_transport.c
@@ -188,7 +188,8 @@ static void hvs_remote_addr_init(struct sockaddr_vm *remote,
static u32 host_ephemeral_port = MIN_HOST_EPHEMERAL_PORT;
struct sock *sk;
 
-   vsock_addr_init(remote, VMADDR_CID_ANY, VMADDR_PORT_ANY);
+   /* Remote peer is always the host */
+   vsock_addr_init(remote, VMADDR_CID_HOST, VMADDR_PORT_ANY);
 
while (1) {
/* Wrap around ? */
-- 
2.21.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH net-next 12/14] vsock/vmci: register vmci_transport only when VMCI guest/host are active

2019-10-23 Thread Stefano Garzarella
To allow other transports to be loaded with vmci_transport,
we register the vmci_transport as G2H or H2G only when a VMCI guest
or host is active.

To do that, this patch adds a callback registered in the vmci driver
that will be called when a new host or guest become active.
This callback will register the vmci_transport in the VSOCK core.
If the transport is already registered, we ignore the error coming
from vsock_core_register().

Cc: Jorgen Hansen 
Signed-off-by: Stefano Garzarella 
---
 drivers/misc/vmw_vmci/vmci_driver.c | 50 +
 drivers/misc/vmw_vmci/vmci_driver.h |  2 ++
 drivers/misc/vmw_vmci/vmci_guest.c  |  2 ++
 drivers/misc/vmw_vmci/vmci_host.c   |  7 
 include/linux/vmw_vmci_api.h|  2 ++
 net/vmw_vsock/vmci_transport.c  | 29 +++--
 6 files changed, 82 insertions(+), 10 deletions(-)

diff --git a/drivers/misc/vmw_vmci/vmci_driver.c 
b/drivers/misc/vmw_vmci/vmci_driver.c
index 819e35995d32..195afbd7edc1 100644
--- a/drivers/misc/vmw_vmci/vmci_driver.c
+++ b/drivers/misc/vmw_vmci/vmci_driver.c
@@ -28,6 +28,9 @@ MODULE_PARM_DESC(disable_guest,
 static bool vmci_guest_personality_initialized;
 static bool vmci_host_personality_initialized;
 
+static DEFINE_MUTEX(vmci_vsock_mutex); /* protects vmci_vsock_transport_cb */
+static vmci_vsock_cb vmci_vsock_transport_cb;
+
 /*
  * vmci_get_context_id() - Gets the current context ID.
  *
@@ -45,6 +48,53 @@ u32 vmci_get_context_id(void)
 }
 EXPORT_SYMBOL_GPL(vmci_get_context_id);
 
+/*
+ * vmci_register_vsock_callback() - Register the VSOCK vmci_transport callback.
+ *
+ * The callback will be called every time a new host or guest become active,
+ * or if they are already active when this function is called.
+ * To unregister the callback, call this function with NULL parameter.
+ *
+ * Returns 0 on success. -EBUSY if a callback is already registered.
+ */
+int vmci_register_vsock_callback(vmci_vsock_cb callback)
+{
+   int err = 0;
+
+   mutex_lock(&vmci_vsock_mutex);
+
+   if (vmci_vsock_transport_cb && callback) {
+   err = -EBUSY;
+   goto out;
+   }
+
+   vmci_vsock_transport_cb = callback;
+
+   if (!vmci_vsock_transport_cb)
+   goto out;
+
+   if (vmci_guest_code_active())
+   vmci_vsock_transport_cb(false);
+
+   if (vmci_host_users() > 0)
+   vmci_vsock_transport_cb(true);
+
+out:
+   mutex_unlock(&vmci_vsock_mutex);
+   return err;
+}
+EXPORT_SYMBOL_GPL(vmci_register_vsock_callback);
+
+void vmci_call_vsock_callback(bool is_host)
+{
+   mutex_lock(&vmci_vsock_mutex);
+
+   if (vmci_vsock_transport_cb)
+   vmci_vsock_transport_cb(is_host);
+
+   mutex_unlock(&vmci_vsock_mutex);
+}
+
 static int __init vmci_drv_init(void)
 {
int vmci_err;
diff --git a/drivers/misc/vmw_vmci/vmci_driver.h 
b/drivers/misc/vmw_vmci/vmci_driver.h
index aab81b67670c..990682480bf6 100644
--- a/drivers/misc/vmw_vmci/vmci_driver.h
+++ b/drivers/misc/vmw_vmci/vmci_driver.h
@@ -36,10 +36,12 @@ extern struct pci_dev *vmci_pdev;
 
 u32 vmci_get_context_id(void);
 int vmci_send_datagram(struct vmci_datagram *dg);
+void vmci_call_vsock_callback(bool is_host);
 
 int vmci_host_init(void);
 void vmci_host_exit(void);
 bool vmci_host_code_active(void);
+int vmci_host_users(void);
 
 int vmci_guest_init(void);
 void vmci_guest_exit(void);
diff --git a/drivers/misc/vmw_vmci/vmci_guest.c 
b/drivers/misc/vmw_vmci/vmci_guest.c
index 7a84a48c75da..cc8eeb361fcd 100644
--- a/drivers/misc/vmw_vmci/vmci_guest.c
+++ b/drivers/misc/vmw_vmci/vmci_guest.c
@@ -637,6 +637,8 @@ static int vmci_guest_probe_device(struct pci_dev *pdev,
  vmci_dev->iobase + VMCI_CONTROL_ADDR);
 
pci_set_drvdata(pdev, vmci_dev);
+
+   vmci_call_vsock_callback(false);
return 0;
 
 err_free_irq:
diff --git a/drivers/misc/vmw_vmci/vmci_host.c 
b/drivers/misc/vmw_vmci/vmci_host.c
index 833e2bd248a5..ff3c396146ff 100644
--- a/drivers/misc/vmw_vmci/vmci_host.c
+++ b/drivers/misc/vmw_vmci/vmci_host.c
@@ -108,6 +108,11 @@ bool vmci_host_code_active(void)
 atomic_read(&vmci_host_active_users) > 0);
 }
 
+int vmci_host_users(void)
+{
+   return atomic_read(&vmci_host_active_users);
+}
+
 /*
  * Called on open of /dev/vmci.
  */
@@ -338,6 +343,8 @@ static int vmci_host_do_init_context(struct vmci_host_dev 
*vmci_host_dev,
vmci_host_dev->ct_type = VMCIOBJ_CONTEXT;
atomic_inc(&vmci_host_active_users);
 
+   vmci_call_vsock_callback(true);
+
retval = 0;
 
 out:
diff --git a/include/linux/vmw_vmci_api.h b/include/linux/vmw_vmci_api.h
index acd9fafe4fc6..f28907345c80 100644
--- a/include/linux/vmw_vmci_api.h
+++ b/include/linux/vmw_vmci_api.h
@@ -19,6 +19,7 @@
 struct msghdr;
 typedef void (vmci_device_shutdown_fn) (void *device_registration,
void *user_data);
+typedef void (*vmci_vsock_cb) (bool is_host);
 
 int vmci_datagram_creat

[PATCH net-next 11/14] vsock: add multi-transports support

2019-10-23 Thread Stefano Garzarella
This patch adds the support of multiple transports in the
VSOCK core.

With the multi-transports support, we can use vsock with nested VMs
(using also different hypervisors) loading both guest->host and
host->guest transports at the same time.

Major changes:
- vsock core module can be loaded regardless of the transports
- vsock_core_init() and vsock_core_exit() are renamed to
  vsock_core_register() and vsock_core_unregister()
- vsock_core_register() has a feature parameter (H2G, G2H, DGRAM)
  to identify which directions the transport can handle and if it's
  support DGRAM (only vmci)
- each stream socket is assigned to a transport when the remote CID
  is set (during the connect() or when we receive a connection request
  on a listener socket).
  The remote CID is used to decide which transport to use:
  - remote CID > VMADDR_CID_HOST will use host->guest transport
  - remote CID <= VMADDR_CID_HOST will use guest->host transport
- listener sockets are not bound to any transports since no transport
  operations are done on it. In this way we can create a listener
  socket, also if the transports are not loaded or with VMADDR_CID_ANY
  to listen on all transports.
- DGRAM sockets are handled as before, since only the vmci_transport
  provides this feature.

Signed-off-by: Stefano Garzarella 
---
RFC -> v1:
- documented VSOCK_TRANSPORT_F_* flags
- fixed vsock_assign_transport() when the socket is already assigned
  (e.g connection failed)
- moved features outside of struct vsock_transport, and used as
  parameter of vsock_core_register()
---
 drivers/vhost/vsock.c   |   5 +-
 include/net/af_vsock.h  |  17 +-
 net/vmw_vsock/af_vsock.c| 237 ++--
 net/vmw_vsock/hyperv_transport.c|  26 ++-
 net/vmw_vsock/virtio_transport.c|   7 +-
 net/vmw_vsock/virtio_transport_common.c |  28 ++-
 net/vmw_vsock/vmci_transport.c  |  31 +++-
 7 files changed, 270 insertions(+), 81 deletions(-)

diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index 6d7e4f022748..b235f4bbe8ea 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -831,7 +831,8 @@ static int __init vhost_vsock_init(void)
 {
int ret;
 
-   ret = vsock_core_init(&vhost_transport.transport);
+   ret = vsock_core_register(&vhost_transport.transport,
+ VSOCK_TRANSPORT_F_H2G);
if (ret < 0)
return ret;
return misc_register(&vhost_vsock_misc);
@@ -840,7 +841,7 @@ static int __init vhost_vsock_init(void)
 static void __exit vhost_vsock_exit(void)
 {
misc_deregister(&vhost_vsock_misc);
-   vsock_core_exit();
+   vsock_core_unregister(&vhost_transport.transport);
 };
 
 module_init(vhost_vsock_init);
diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index fa1570dc9f5c..27a3463e4892 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -91,6 +91,14 @@ struct vsock_transport_send_notify_data {
u64 data2; /* Transport-defined. */
 };
 
+/* Transport features flags */
+/* Transport provides host->guest communication */
+#define VSOCK_TRANSPORT_F_H2G  0x0001
+/* Transport provides guest->host communication */
+#define VSOCK_TRANSPORT_F_G2H  0x0002
+/* Transport provides DGRAM communication */
+#define VSOCK_TRANSPORT_F_DGRAM0x0004
+
 struct vsock_transport {
/* Initialize/tear-down socket. */
int (*init)(struct vsock_sock *, struct vsock_sock *);
@@ -154,12 +162,8 @@ struct vsock_transport {
 
 / CORE /
 
-int __vsock_core_init(const struct vsock_transport *t, struct module *owner);
-static inline int vsock_core_init(const struct vsock_transport *t)
-{
-   return __vsock_core_init(t, THIS_MODULE);
-}
-void vsock_core_exit(void);
+int vsock_core_register(const struct vsock_transport *t, int features);
+void vsock_core_unregister(const struct vsock_transport *t);
 
 /* The transport may downcast this to access transport-specific functions */
 const struct vsock_transport *vsock_core_get_transport(struct vsock_sock *vsk);
@@ -190,6 +194,7 @@ struct sock *vsock_find_connected_socket(struct sockaddr_vm 
*src,
 struct sockaddr_vm *dst);
 void vsock_remove_sock(struct vsock_sock *vsk);
 void vsock_for_each_connected_socket(void (*fn)(struct sock *sk));
+int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk);
 
 / TAP /
 
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index d89381166028..85d9a147 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -130,7 +130,12 @@ static struct proto vsock_proto = {
 #define VSOCK_DEFAULT_BUFFER_MAX_SIZE (1024 * 256)
 #define VSOCK_DEFAULT_BUFFER_MIN_SIZE 128
 
-static const struct vsock_transport *transport_single;
+/* Transport used for host->guest communication */
+static const struct vsock_transport *transport_h2g;
+/* Transport use

[PATCH net-next 06/14] vsock: add 'struct vsock_sock *' param to vsock_core_get_transport()

2019-10-23 Thread Stefano Garzarella
Since now the 'struct vsock_sock' object contains a pointer to
the transport, this patch adds a parameter to the
vsock_core_get_transport() to return the right transport
assigned to the socket.

This patch modifies also the virtio_transport_get_ops(), that
uses the vsock_core_get_transport(), adding the
'struct vsock_sock *' parameter.

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Stefano Garzarella 
---
RFC -> v1:
- Removed comment about protecting transport_single (Stefan)
---
 include/net/af_vsock.h  | 2 +-
 net/vmw_vsock/af_vsock.c| 7 ++-
 net/vmw_vsock/virtio_transport_common.c | 9 +
 3 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index a5e1e134261d..2ca67d048de4 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -166,7 +166,7 @@ static inline int vsock_core_init(const struct 
vsock_transport *t)
 void vsock_core_exit(void);
 
 /* The transport may downcast this to access transport-specific functions */
-const struct vsock_transport *vsock_core_get_transport(void);
+const struct vsock_transport *vsock_core_get_transport(struct vsock_sock *vsk);
 
 / UTILS /
 
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index c3a14f853eb0..eaea159006c8 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -2001,12 +2001,9 @@ void vsock_core_exit(void)
 }
 EXPORT_SYMBOL_GPL(vsock_core_exit);
 
-const struct vsock_transport *vsock_core_get_transport(void)
+const struct vsock_transport *vsock_core_get_transport(struct vsock_sock *vsk)
 {
-   /* vsock_register_mutex not taken since only the transport uses this
-* function and only while registered.
-*/
-   return transport_single;
+   return vsk->transport;
 }
 EXPORT_SYMBOL_GPL(vsock_core_get_transport);
 
diff --git a/net/vmw_vsock/virtio_transport_common.c 
b/net/vmw_vsock/virtio_transport_common.c
index 9763394f7a61..37a1c7e7c7fe 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -29,9 +29,10 @@
 /* Threshold for detecting small packets to copy */
 #define GOOD_COPY_LEN  128
 
-static const struct virtio_transport *virtio_transport_get_ops(void)
+static const struct virtio_transport *
+virtio_transport_get_ops(struct vsock_sock *vsk)
 {
-   const struct vsock_transport *t = vsock_core_get_transport();
+   const struct vsock_transport *t = vsock_core_get_transport(vsk);
 
return container_of(t, struct virtio_transport, transport);
 }
@@ -168,7 +169,7 @@ static int virtio_transport_send_pkt_info(struct vsock_sock 
*vsk,
struct virtio_vsock_pkt *pkt;
u32 pkt_len = info->pkt_len;
 
-   src_cid = virtio_transport_get_ops()->transport.get_local_cid();
+   src_cid = virtio_transport_get_ops(vsk)->transport.get_local_cid();
src_port = vsk->local_addr.svm_port;
if (!info->remote_cid) {
dst_cid = vsk->remote_addr.svm_cid;
@@ -201,7 +202,7 @@ static int virtio_transport_send_pkt_info(struct vsock_sock 
*vsk,
 
virtio_transport_inc_tx_pkt(vvs, pkt);
 
-   return virtio_transport_get_ops()->send_pkt(pkt);
+   return virtio_transport_get_ops(vsk)->send_pkt(pkt);
 }
 
 static bool virtio_transport_inc_rx_pkt(struct virtio_vsock_sock *vvs,
-- 
2.21.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC 2/2] vhost: IFC VF vdpa layer

2019-10-23 Thread Jason Wang


On 2019/10/23 下午5:24, Zhu, Lingshan wrote:



set_config/get_config is missing. It looks to me they are not 
hard, just implementing the access to dev_cfg. It's key to make 
kernel virtio driver to work.


And in the new version of virito-mdev, features like _F_LOG_ALL 
should be advertised through get_mdev_features.
IMHO, currently the driver can work without set/get_config, 
vhost_mdev doesn't call them for now.



Yes, but it was required by virtio_mdev for host driver to work, and 
it looks to me it's not hard to add them. If possible please add 
them and "virtio" type then we can use the ops for both the case of 
VM and containers.

sure


Hello Jason,

Just want to double confirm the implementation of set/get_config, for 
now, dev_cfg only contains mac[6], status and max_virtqueue_pairs, is 
that enough to support virtio_mdev?


THanks!



Yes, and it depends on the features that you want to advertise. If you 
don't want to advertise MQ, there's no need to expose max_virtqueue_pairs.


Thanks

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[PATCH net-next 09/14] vsock: move vsock_insert_unbound() in the vsock_create()

2019-10-23 Thread Stefano Garzarella
vsock_insert_unbound() was called only when 'sock' parameter of
__vsock_create() was not null. This only happened when
__vsock_create() was called by vsock_create().

In order to simplify the multi-transports support, this patch
moves vsock_insert_unbound() at the end of vsock_create().

Reviewed-by: Dexuan Cui 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Stefano Garzarella 
---
 net/vmw_vsock/af_vsock.c | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 95878bed2c67..d89381166028 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -634,9 +634,6 @@ static struct sock *__vsock_create(struct net *net,
return NULL;
}
 
-   if (sock)
-   vsock_insert_unbound(vsk);
-
return sk;
 }
 
@@ -1889,6 +1886,8 @@ static const struct proto_ops vsock_stream_ops = {
 static int vsock_create(struct net *net, struct socket *sock,
int protocol, int kern)
 {
+   struct sock *sk;
+
if (!sock)
return -EINVAL;
 
@@ -1908,7 +1907,13 @@ static int vsock_create(struct net *net, struct socket 
*sock,
 
sock->state = SS_UNCONNECTED;
 
-   return __vsock_create(net, sock, NULL, GFP_KERNEL, 0, kern) ? 0 : 
-ENOMEM;
+   sk = __vsock_create(net, sock, NULL, GFP_KERNEL, 0, kern);
+   if (!sk)
+   return -ENOMEM;
+
+   vsock_insert_unbound(vsock_sk(sk));
+
+   return 0;
 }
 
 static const struct net_proto_family vsock_family_ops = {
-- 
2.21.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH net-next 05/14] vsock/virtio: add transport parameter to the virtio_transport_reset_no_sock()

2019-10-23 Thread Stefano Garzarella
We are going to add 'struct vsock_sock *' parameter to
virtio_transport_get_ops().

In some cases, like in the virtio_transport_reset_no_sock(),
we don't have any socket assigned to the packet received,
so we can't use the virtio_transport_get_ops().

In order to allow virtio_transport_reset_no_sock() to use the
'.send_pkt' callback from the 'vhost_transport' or 'virtio_transport',
we add the 'struct virtio_transport *' to it and to its caller:
virtio_transport_recv_pkt().

We moved the 'vhost_transport' and 'virtio_transport' definition,
to pass their address to the virtio_transport_recv_pkt().

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Stefano Garzarella 
---
 drivers/vhost/vsock.c   |  94 +++---
 include/linux/virtio_vsock.h|   3 +-
 net/vmw_vsock/virtio_transport.c| 160 
 net/vmw_vsock/virtio_transport_common.c |  12 +-
 4 files changed, 135 insertions(+), 134 deletions(-)

diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index 9f57736fe15e..92ab3852c954 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -384,6 +384,52 @@ static bool vhost_vsock_more_replies(struct vhost_vsock 
*vsock)
return val < vq->num;
 }
 
+static struct virtio_transport vhost_transport = {
+   .transport = {
+   .get_local_cid= vhost_transport_get_local_cid,
+
+   .init = virtio_transport_do_socket_init,
+   .destruct = virtio_transport_destruct,
+   .release  = virtio_transport_release,
+   .connect  = virtio_transport_connect,
+   .shutdown = virtio_transport_shutdown,
+   .cancel_pkt   = vhost_transport_cancel_pkt,
+
+   .dgram_enqueue= virtio_transport_dgram_enqueue,
+   .dgram_dequeue= virtio_transport_dgram_dequeue,
+   .dgram_bind   = virtio_transport_dgram_bind,
+   .dgram_allow  = virtio_transport_dgram_allow,
+
+   .stream_enqueue   = virtio_transport_stream_enqueue,
+   .stream_dequeue   = virtio_transport_stream_dequeue,
+   .stream_has_data  = virtio_transport_stream_has_data,
+   .stream_has_space = virtio_transport_stream_has_space,
+   .stream_rcvhiwat  = virtio_transport_stream_rcvhiwat,
+   .stream_is_active = virtio_transport_stream_is_active,
+   .stream_allow = virtio_transport_stream_allow,
+
+   .notify_poll_in   = virtio_transport_notify_poll_in,
+   .notify_poll_out  = virtio_transport_notify_poll_out,
+   .notify_recv_init = virtio_transport_notify_recv_init,
+   .notify_recv_pre_block= 
virtio_transport_notify_recv_pre_block,
+   .notify_recv_pre_dequeue  = 
virtio_transport_notify_recv_pre_dequeue,
+   .notify_recv_post_dequeue = 
virtio_transport_notify_recv_post_dequeue,
+   .notify_send_init = virtio_transport_notify_send_init,
+   .notify_send_pre_block= 
virtio_transport_notify_send_pre_block,
+   .notify_send_pre_enqueue  = 
virtio_transport_notify_send_pre_enqueue,
+   .notify_send_post_enqueue = 
virtio_transport_notify_send_post_enqueue,
+
+   .set_buffer_size  = virtio_transport_set_buffer_size,
+   .set_min_buffer_size  = 
virtio_transport_set_min_buffer_size,
+   .set_max_buffer_size  = 
virtio_transport_set_max_buffer_size,
+   .get_buffer_size  = virtio_transport_get_buffer_size,
+   .get_min_buffer_size  = 
virtio_transport_get_min_buffer_size,
+   .get_max_buffer_size  = 
virtio_transport_get_max_buffer_size,
+   },
+
+   .send_pkt = vhost_transport_send_pkt,
+};
+
 static void vhost_vsock_handle_tx_kick(struct vhost_work *work)
 {
struct vhost_virtqueue *vq = container_of(work, struct vhost_virtqueue,
@@ -438,7 +484,7 @@ static void vhost_vsock_handle_tx_kick(struct vhost_work 
*work)
 
/* Only accept correctly addressed packets */
if (le64_to_cpu(pkt->hdr.src_cid) == vsock->guest_cid)
-   virtio_transport_recv_pkt(pkt);
+   virtio_transport_recv_pkt(&vhost_transport, pkt);
else
virtio_transport_free_pkt(pkt);
 
@@ -786,52 +832,6 @@ static struct miscdevice vhost_vsock_misc = {
.fops = &vhost_vsock_fops,
 };
 
-static struct virtio_transport vhost_transport = {
-   .transport = {
-   .get_local_cid= vhost_transport_get_local_cid,
-
-   .init = virtio_transport_do_socket_init,
-   .destruct   

[PATCH net-next 07/14] vsock: handle buffer_size sockopts in the core

2019-10-23 Thread Stefano Garzarella
virtio_transport and vmci_transport handle the buffer_size
sockopts in a very similar way.

In order to support multiple transports, this patch moves this
handling in the core to allow the user to change the options
also if the socket is not yet assigned to any transport.

This patch also adds the '.notify_buffer_size' callback in the
'struct virtio_transport' in order to inform the transport,
when the buffer_size is changed by the user. It is also useful
to limit the 'buffer_size' requested (e.g. virtio transports).

Acked-by: Dexuan Cui 
Signed-off-by: Stefano Garzarella 
---
RFC -> v1:
- changed .notify_buffer_size return to void (Stefan)
- documented that .notify_buffer_size is called with sk_lock held (Stefan)
---
 drivers/vhost/vsock.c   |  7 +-
 include/linux/virtio_vsock.h| 15 +
 include/net/af_vsock.h  | 15 ++---
 net/vmw_vsock/af_vsock.c| 43 ++---
 net/vmw_vsock/hyperv_transport.c| 36 ---
 net/vmw_vsock/virtio_transport.c|  8 +--
 net/vmw_vsock/virtio_transport_common.c | 79 ---
 net/vmw_vsock/vmci_transport.c  | 86 +++--
 net/vmw_vsock/vmci_transport.h  |  3 -
 9 files changed, 65 insertions(+), 227 deletions(-)

diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index 92ab3852c954..6d7e4f022748 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -418,13 +418,8 @@ static struct virtio_transport vhost_transport = {
.notify_send_pre_block= 
virtio_transport_notify_send_pre_block,
.notify_send_pre_enqueue  = 
virtio_transport_notify_send_pre_enqueue,
.notify_send_post_enqueue = 
virtio_transport_notify_send_post_enqueue,
+   .notify_buffer_size   = virtio_transport_notify_buffer_size,
 
-   .set_buffer_size  = virtio_transport_set_buffer_size,
-   .set_min_buffer_size  = 
virtio_transport_set_min_buffer_size,
-   .set_max_buffer_size  = 
virtio_transport_set_max_buffer_size,
-   .get_buffer_size  = virtio_transport_get_buffer_size,
-   .get_min_buffer_size  = 
virtio_transport_get_min_buffer_size,
-   .get_max_buffer_size  = 
virtio_transport_get_max_buffer_size,
},
 
.send_pkt = vhost_transport_send_pkt,
diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
index 96d8132acbd7..b79befd2a5a4 100644
--- a/include/linux/virtio_vsock.h
+++ b/include/linux/virtio_vsock.h
@@ -7,9 +7,6 @@
 #include 
 #include 
 
-#define VIRTIO_VSOCK_DEFAULT_MIN_BUF_SIZE  128
-#define VIRTIO_VSOCK_DEFAULT_BUF_SIZE  (1024 * 256)
-#define VIRTIO_VSOCK_DEFAULT_MAX_BUF_SIZE  (1024 * 256)
 #define VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE   (1024 * 4)
 #define VIRTIO_VSOCK_MAX_BUF_SIZE  0xUL
 #define VIRTIO_VSOCK_MAX_PKT_BUF_SIZE  (1024 * 64)
@@ -25,11 +22,6 @@ enum {
 struct virtio_vsock_sock {
struct vsock_sock *vsk;
 
-   /* Protected by lock_sock(sk_vsock(trans->vsk)) */
-   u32 buf_size;
-   u32 buf_size_min;
-   u32 buf_size_max;
-
spinlock_t tx_lock;
spinlock_t rx_lock;
 
@@ -93,12 +85,6 @@ s64 virtio_transport_stream_has_space(struct vsock_sock 
*vsk);
 
 int virtio_transport_do_socket_init(struct vsock_sock *vsk,
 struct vsock_sock *psk);
-u64 virtio_transport_get_buffer_size(struct vsock_sock *vsk);
-u64 virtio_transport_get_min_buffer_size(struct vsock_sock *vsk);
-u64 virtio_transport_get_max_buffer_size(struct vsock_sock *vsk);
-void virtio_transport_set_buffer_size(struct vsock_sock *vsk, u64 val);
-void virtio_transport_set_min_buffer_size(struct vsock_sock *vsk, u64 val);
-void virtio_transport_set_max_buffer_size(struct vsock_sock *vs, u64 val);
 int
 virtio_transport_notify_poll_in(struct vsock_sock *vsk,
size_t target,
@@ -125,6 +111,7 @@ int virtio_transport_notify_send_pre_enqueue(struct 
vsock_sock *vsk,
struct vsock_transport_send_notify_data *data);
 int virtio_transport_notify_send_post_enqueue(struct vsock_sock *vsk,
ssize_t written, struct vsock_transport_send_notify_data *data);
+void virtio_transport_notify_buffer_size(struct vsock_sock *vsk, u64 *val);
 
 u64 virtio_transport_stream_rcvhiwat(struct vsock_sock *vsk);
 bool virtio_transport_stream_is_active(struct vsock_sock *vsk);
diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index 2ca67d048de4..4b5d16840fd4 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -65,6 +65,11 @@ struct vsock_sock {
bool sent_request;
bool ignore_connecting_rst;
 
+   /* Protected by lock_sock(sk) */
+   u64 buffer_size;
+   u64 buffer_min_size;
+   u64 buffer_max_size;
+
/* Private to transport. */
void *trans;
 };
@@ -140,18 +145,12 @@ struct vsock_transport {

[PATCH net-next 08/14] vsock: add vsock_create_connected() called by transports

2019-10-23 Thread Stefano Garzarella
All transports call __vsock_create() with the same parameters,
most of them depending on the parent socket. In order to simplify
the VSOCK core APIs exposed to the transports, this patch adds
the vsock_create_connected() callable from transports to create
a new socket when a connection request is received.
We also unexported the __vsock_create().

Suggested-by: Stefan Hajnoczi 
Signed-off-by: Stefano Garzarella 
---
 include/net/af_vsock.h  |  5 +
 net/vmw_vsock/af_vsock.c| 20 +---
 net/vmw_vsock/hyperv_transport.c|  3 +--
 net/vmw_vsock/virtio_transport_common.c |  3 +--
 net/vmw_vsock/vmci_transport.c  |  3 +--
 5 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index 4b5d16840fd4..fa1570dc9f5c 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -76,10 +76,7 @@ struct vsock_sock {
 
 s64 vsock_stream_has_data(struct vsock_sock *vsk);
 s64 vsock_stream_has_space(struct vsock_sock *vsk);
-struct sock *__vsock_create(struct net *net,
-   struct socket *sock,
-   struct sock *parent,
-   gfp_t priority, unsigned short type, int kern);
+struct sock *vsock_create_connected(struct sock *parent);
 
 / TRANSPORT /
 
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 90ac46ea12ef..95878bed2c67 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -567,12 +567,12 @@ static int __vsock_bind(struct sock *sk, struct 
sockaddr_vm *addr)
 
 static void vsock_connect_timeout(struct work_struct *work);
 
-struct sock *__vsock_create(struct net *net,
-   struct socket *sock,
-   struct sock *parent,
-   gfp_t priority,
-   unsigned short type,
-   int kern)
+static struct sock *__vsock_create(struct net *net,
+  struct socket *sock,
+  struct sock *parent,
+  gfp_t priority,
+  unsigned short type,
+  int kern)
 {
struct sock *sk;
struct vsock_sock *psk;
@@ -639,7 +639,6 @@ struct sock *__vsock_create(struct net *net,
 
return sk;
 }
-EXPORT_SYMBOL_GPL(__vsock_create);
 
 static void __vsock_release(struct sock *sk, int level)
 {
@@ -705,6 +704,13 @@ static int vsock_queue_rcv_skb(struct sock *sk, struct 
sk_buff *skb)
return err;
 }
 
+struct sock *vsock_create_connected(struct sock *parent)
+{
+   return __vsock_create(sock_net(parent), NULL, parent, GFP_KERNEL,
+ parent->sk_type, 0);
+}
+EXPORT_SYMBOL_GPL(vsock_create_connected);
+
 s64 vsock_stream_has_data(struct vsock_sock *vsk)
 {
return vsk->transport->stream_has_data(vsk);
diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c
index d62297a62ca6..0ce792a1bf6c 100644
--- a/net/vmw_vsock/hyperv_transport.c
+++ b/net/vmw_vsock/hyperv_transport.c
@@ -360,8 +360,7 @@ static void hvs_open_connection(struct vmbus_channel *chan)
if (sk->sk_ack_backlog >= sk->sk_max_ack_backlog)
goto out;
 
-   new = __vsock_create(sock_net(sk), NULL, sk, GFP_KERNEL,
-sk->sk_type, 0);
+   new = vsock_create_connected(sk);
if (!new)
goto out;
 
diff --git a/net/vmw_vsock/virtio_transport_common.c 
b/net/vmw_vsock/virtio_transport_common.c
index b2a310dfa158..f7d0ecbd8f97 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -1002,8 +1002,7 @@ virtio_transport_recv_listen(struct sock *sk, struct 
virtio_vsock_pkt *pkt)
return -ENOMEM;
}
 
-   child = __vsock_create(sock_net(sk), NULL, sk, GFP_KERNEL,
-  sk->sk_type, 0);
+   child = vsock_create_connected(sk);
if (!child) {
virtio_transport_reset(vsk, pkt);
return -ENOMEM;
diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c
index 8290d37b6587..5955238ffc13 100644
--- a/net/vmw_vsock/vmci_transport.c
+++ b/net/vmw_vsock/vmci_transport.c
@@ -1004,8 +1004,7 @@ static int vmci_transport_recv_listen(struct sock *sk,
return -ECONNREFUSED;
}
 
-   pending = __vsock_create(sock_net(sk), NULL, sk, GFP_KERNEL,
-sk->sk_type, 0);
+   pending = vsock_create_connected(sk);
if (!pending) {
vmci_transport_send_reset(sk, pkt);
return -ENOMEM;
-- 
2.21.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/

[PATCH net-next 04/14] vsock: add 'transport' member in the struct vsock_sock

2019-10-23 Thread Stefano Garzarella
As a preparation to support multiple transports, this patch adds
the 'transport' member at the 'struct vsock_sock'.
This new field is initialized during the creation in the
__vsock_create() function.

This patch also renames the global 'transport' pointer to
'transport_single', since for now we're only supporting a single
transport registered at run-time.

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Stefano Garzarella 
---
 include/net/af_vsock.h   |  1 +
 net/vmw_vsock/af_vsock.c | 56 +++-
 2 files changed, 39 insertions(+), 18 deletions(-)

diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index c660402b10f2..a5e1e134261d 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -27,6 +27,7 @@ extern spinlock_t vsock_table_lock;
 struct vsock_sock {
/* sk must be the first member. */
struct sock sk;
+   const struct vsock_transport *transport;
struct sockaddr_vm local_addr;
struct sockaddr_vm remote_addr;
/* Links for the global tables of bound and connected sockets. */
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 2f2582fb7fdd..c3a14f853eb0 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -126,7 +126,7 @@ static struct proto vsock_proto = {
  */
 #define VSOCK_DEFAULT_CONNECT_TIMEOUT (2 * HZ)
 
-static const struct vsock_transport *transport;
+static const struct vsock_transport *transport_single;
 static DEFINE_MUTEX(vsock_register_mutex);
 
 / UTILS /
@@ -408,7 +408,9 @@ static bool vsock_is_pending(struct sock *sk)
 
 static int vsock_send_shutdown(struct sock *sk, int mode)
 {
-   return transport->shutdown(vsock_sk(sk), mode);
+   struct vsock_sock *vsk = vsock_sk(sk);
+
+   return vsk->transport->shutdown(vsk, mode);
 }
 
 static void vsock_pending_work(struct work_struct *work)
@@ -518,7 +520,7 @@ static int __vsock_bind_stream(struct vsock_sock *vsk,
 static int __vsock_bind_dgram(struct vsock_sock *vsk,
  struct sockaddr_vm *addr)
 {
-   return transport->dgram_bind(vsk, addr);
+   return vsk->transport->dgram_bind(vsk, addr);
 }
 
 static int __vsock_bind(struct sock *sk, struct sockaddr_vm *addr)
@@ -536,7 +538,7 @@ static int __vsock_bind(struct sock *sk, struct sockaddr_vm 
*addr)
 * like AF_INET prevents binding to a non-local IP address (in most
 * cases), we only allow binding to the local CID.
 */
-   cid = transport->get_local_cid();
+   cid = vsk->transport->get_local_cid();
if (addr->svm_cid != cid && addr->svm_cid != VMADDR_CID_ANY)
return -EADDRNOTAVAIL;
 
@@ -586,6 +588,7 @@ struct sock *__vsock_create(struct net *net,
sk->sk_type = type;
 
vsk = vsock_sk(sk);
+   vsk->transport = transport_single;
vsock_addr_init(&vsk->local_addr, VMADDR_CID_ANY, VMADDR_PORT_ANY);
vsock_addr_init(&vsk->remote_addr, VMADDR_CID_ANY, VMADDR_PORT_ANY);
 
@@ -616,7 +619,7 @@ struct sock *__vsock_create(struct net *net,
vsk->connect_timeout = VSOCK_DEFAULT_CONNECT_TIMEOUT;
}
 
-   if (transport->init(vsk, psk) < 0) {
+   if (vsk->transport->init(vsk, psk) < 0) {
sk_free(sk);
return NULL;
}
@@ -641,7 +644,7 @@ static void __vsock_release(struct sock *sk, int level)
/* The release call is supposed to use lock_sock_nested()
 * rather than lock_sock(), if a sock lock should be acquired.
 */
-   transport->release(vsk);
+   vsk->transport->release(vsk);
 
/* When "level" is SINGLE_DEPTH_NESTING, use the nested
 * version to avoid the warning "possible recursive locking
@@ -670,7 +673,7 @@ static void vsock_sk_destruct(struct sock *sk)
 {
struct vsock_sock *vsk = vsock_sk(sk);
 
-   transport->destruct(vsk);
+   vsk->transport->destruct(vsk);
 
/* When clearing these addresses, there's no need to set the family and
 * possibly register the address family with the kernel.
@@ -694,13 +697,13 @@ static int vsock_queue_rcv_skb(struct sock *sk, struct 
sk_buff *skb)
 
 s64 vsock_stream_has_data(struct vsock_sock *vsk)
 {
-   return transport->stream_has_data(vsk);
+   return vsk->transport->stream_has_data(vsk);
 }
 EXPORT_SYMBOL_GPL(vsock_stream_has_data);
 
 s64 vsock_stream_has_space(struct vsock_sock *vsk)
 {
-   return transport->stream_has_space(vsk);
+   return vsk->transport->stream_has_space(vsk);
 }
 EXPORT_SYMBOL_GPL(vsock_stream_has_space);
 
@@ -869,6 +872,7 @@ static __poll_t vsock_poll(struct file *file, struct socket 
*sock,
mask |= EPOLLOUT | EPOLLWRNORM | EPOLLWRBAND;
 
} else if (sock->type == SOCK_STREAM) {
+   const struct vsock_transport *transport = vsk->transport;
lock_sock(sk);
 
/* Listening s

[PATCH net-next 03/14] vsock: remove include/linux/vm_sockets.h file

2019-10-23 Thread Stefano Garzarella
This header file now only includes the "uapi/linux/vm_sockets.h".
We can include directly it when needed.

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Stefano Garzarella 
---
 include/linux/vm_sockets.h| 13 -
 include/net/af_vsock.h|  2 +-
 include/net/vsock_addr.h  |  2 +-
 net/vmw_vsock/vmci_transport_notify.h |  1 -
 4 files changed, 2 insertions(+), 16 deletions(-)
 delete mode 100644 include/linux/vm_sockets.h

diff --git a/include/linux/vm_sockets.h b/include/linux/vm_sockets.h
deleted file mode 100644
index 7dd899ccb920..
--- a/include/linux/vm_sockets.h
+++ /dev/null
@@ -1,13 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-only */
-/*
- * VMware vSockets Driver
- *
- * Copyright (C) 2007-2013 VMware, Inc. All rights reserved.
- */
-
-#ifndef _VM_SOCKETS_H
-#define _VM_SOCKETS_H
-
-#include 
-
-#endif /* _VM_SOCKETS_H */
diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index 80ea0f93d3f7..c660402b10f2 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -10,7 +10,7 @@
 
 #include 
 #include 
-#include 
+#include 
 
 #include "vsock_addr.h"
 
diff --git a/include/net/vsock_addr.h b/include/net/vsock_addr.h
index 57d2db5c4bdf..cf8cc140d68d 100644
--- a/include/net/vsock_addr.h
+++ b/include/net/vsock_addr.h
@@ -8,7 +8,7 @@
 #ifndef _VSOCK_ADDR_H_
 #define _VSOCK_ADDR_H_
 
-#include 
+#include 
 
 void vsock_addr_init(struct sockaddr_vm *addr, u32 cid, u32 port);
 int vsock_addr_validate(const struct sockaddr_vm *addr);
diff --git a/net/vmw_vsock/vmci_transport_notify.h 
b/net/vmw_vsock/vmci_transport_notify.h
index 7843f08d4290..a1aa5a998c0e 100644
--- a/net/vmw_vsock/vmci_transport_notify.h
+++ b/net/vmw_vsock/vmci_transport_notify.h
@@ -11,7 +11,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "vmci_transport.h"
 
-- 
2.21.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH net-next 02/14] vsock: remove vm_sockets_get_local_cid()

2019-10-23 Thread Stefano Garzarella
vm_sockets_get_local_cid() is only used in virtio_transport_common.c.
We can replace it calling the virtio_transport_get_ops() and
using the get_local_cid() callback registered by the transport.

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Stefano Garzarella 
---
 include/linux/vm_sockets.h  |  2 --
 net/vmw_vsock/af_vsock.c| 10 --
 net/vmw_vsock/virtio_transport_common.c |  2 +-
 3 files changed, 1 insertion(+), 13 deletions(-)

diff --git a/include/linux/vm_sockets.h b/include/linux/vm_sockets.h
index 33f1a2ecd905..7dd899ccb920 100644
--- a/include/linux/vm_sockets.h
+++ b/include/linux/vm_sockets.h
@@ -10,6 +10,4 @@
 
 #include 
 
-int vm_sockets_get_local_cid(void);
-
 #endif /* _VM_SOCKETS_H */
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 2ab43b2bba31..2f2582fb7fdd 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -129,16 +129,6 @@ static struct proto vsock_proto = {
 static const struct vsock_transport *transport;
 static DEFINE_MUTEX(vsock_register_mutex);
 
-/ EXPORTS /
-
-/* Get the ID of the local context.  This is transport dependent. */
-
-int vm_sockets_get_local_cid(void)
-{
-   return transport->get_local_cid();
-}
-EXPORT_SYMBOL_GPL(vm_sockets_get_local_cid);
-
 / UTILS /
 
 /* Each bound VSocket is stored in the bind hash table and each connected
diff --git a/net/vmw_vsock/virtio_transport_common.c 
b/net/vmw_vsock/virtio_transport_common.c
index d02c9b41a768..b1cd16ed66ea 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -168,7 +168,7 @@ static int virtio_transport_send_pkt_info(struct vsock_sock 
*vsk,
struct virtio_vsock_pkt *pkt;
u32 pkt_len = info->pkt_len;
 
-   src_cid = vm_sockets_get_local_cid();
+   src_cid = virtio_transport_get_ops()->transport.get_local_cid();
src_port = vsk->local_addr.svm_port;
if (!info->remote_cid) {
dst_cid = vsk->remote_addr.svm_cid;
-- 
2.21.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH net-next 01/14] vsock/vmci: remove unused VSOCK_DEFAULT_CONNECT_TIMEOUT

2019-10-23 Thread Stefano Garzarella
The VSOCK_DEFAULT_CONNECT_TIMEOUT definition was introduced with
commit d021c344051af ("VSOCK: Introduce VM Sockets"), but it is
never used in the net/vmw_vsock/vmci_transport.c.

VSOCK_DEFAULT_CONNECT_TIMEOUT is used and defined in
net/vmw_vsock/af_vsock.c

Cc: Jorgen Hansen 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Stefano Garzarella 
---
 net/vmw_vsock/vmci_transport.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c
index 8c9c4ed90fa7..f8e3131ac480 100644
--- a/net/vmw_vsock/vmci_transport.c
+++ b/net/vmw_vsock/vmci_transport.c
@@ -78,11 +78,6 @@ static int PROTOCOL_OVERRIDE = -1;
 #define VMCI_TRANSPORT_DEFAULT_QP_SIZE   262144
 #define VMCI_TRANSPORT_DEFAULT_QP_SIZE_MAX   262144
 
-/* The default peer timeout indicates how long we will wait for a peer response
- * to a control message.
- */
-#define VSOCK_DEFAULT_CONNECT_TIMEOUT (2 * HZ)
-
 /* Helper function to convert from a VMCI error code to a VSock error code. */
 
 static s32 vmci_transport_error_to_vsock_error(s32 vmci_error)
-- 
2.21.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH net-next 00/14] vsock: add multi-transports support

2019-10-23 Thread Stefano Garzarella
This series adds the multi-transports support to vsock, following
this proposal: https://www.spinics.net/lists/netdev/msg575792.html

With the multi-transports support, we can use VSOCK with nested VMs
(using also different hypervisors) loading both guest->host and
host->guest transports at the same time.
Before this series, vmci-transport supported this behavior but only
using VMware hypervisor on L0, L1, etc.

RFC: https://patchwork.ozlabs.org/cover/1168442/
RFC -> v1:
- Added R-b/A-b from Dexuan and Stefan
- Fixed comments and typos in several patches (Stefan)
- Patch 7: changed .notify_buffer_size return to void (Stefan)
- Added patch 8 to simplify the API exposed to the transports (Stefan)
- Patch 11:
  + documented VSOCK_TRANSPORT_F_* flags (Stefan)
  + fixed vsock_assign_transport() when the socket is already assigned
  + moved features outside of struct vsock_transport, and used as
parameter of vsock_core_register() as a preparation of Patch 12
- Removed "vsock: add 'transport_hg' to handle g2h\h2g transports" patch
- Added patch 12 to register vmci_transport only when VMCI guest/host
  are active

The first 9 patches are cleanups and preparations, maybe some of
these can go regardless of this series.

Patch 10 changes the hvs_remote_addr_init(). setting the
VMADDR_CID_HOST as remote CID instead of VMADDR_CID_ANY to make
the choice of transport to be used work properly.

Patch 11 adds multi-transports support.

Patch 12 touch a little bit the vmci_transport and the vmci driver
to register the vmci_transport only when there are active host/guest.

Patch 13 prevents the transport modules unloading while sockets are
assigned to them.

Patch 14 fixes an issue in the bind() logic discoverable only with
the new multi-transport support.

I've tested this series with nested KVM (vsock-transport [L0,L1],
virtio-transport[L1,L2]) and with VMware (L0) + KVM (L1)
(vmci-transport [L0,L1], vhost-transport [L1], virtio-transport[L2]).

Dexuan successfully tested the RFC series on HyperV with a Linux guest.

Stefano Garzarella (14):
  vsock/vmci: remove unused VSOCK_DEFAULT_CONNECT_TIMEOUT
  vsock: remove vm_sockets_get_local_cid()
  vsock: remove include/linux/vm_sockets.h file
  vsock: add 'transport' member in the struct vsock_sock
  vsock/virtio: add transport parameter to the
virtio_transport_reset_no_sock()
  vsock: add 'struct vsock_sock *' param to vsock_core_get_transport()
  vsock: handle buffer_size sockopts in the core
  vsock: add vsock_create_connected() called by transports
  vsock: move vsock_insert_unbound() in the vsock_create()
  hv_sock: set VMADDR_CID_HOST in the hvs_remote_addr_init()
  vsock: add multi-transports support
  vsock/vmci: register vmci_transport only when VMCI guest/host are
active
  vsock: prevent transport modules unloading
  vsock: fix bind() behaviour taking care of CID

 drivers/misc/vmw_vmci/vmci_driver.c |  50 
 drivers/misc/vmw_vmci/vmci_driver.h |   2 +
 drivers/misc/vmw_vmci/vmci_guest.c  |   2 +
 drivers/misc/vmw_vmci/vmci_host.c   |   7 +
 drivers/vhost/vsock.c   |  96 +++---
 include/linux/virtio_vsock.h|  18 +-
 include/linux/vm_sockets.h  |  15 -
 include/linux/vmw_vmci_api.h|   2 +
 include/net/af_vsock.h  |  44 +--
 include/net/vsock_addr.h|   2 +-
 net/vmw_vsock/af_vsock.c| 376 ++--
 net/vmw_vsock/hyperv_transport.c|  70 ++---
 net/vmw_vsock/virtio_transport.c| 177 ++-
 net/vmw_vsock/virtio_transport_common.c | 131 +++--
 net/vmw_vsock/vmci_transport.c  | 137 +++--
 net/vmw_vsock/vmci_transport.h  |   3 -
 net/vmw_vsock/vmci_transport_notify.h   |   1 -
 17 files changed, 627 insertions(+), 506 deletions(-)
 delete mode 100644 include/linux/vm_sockets.h

-- 
2.21.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH RFC v3 6/9] mm: Allow to offline PageOffline() pages with a reference count of 0

2019-10-23 Thread Michal Hocko
On Tue 22-10-19 16:02:09, David Hildenbrand wrote:
[...]
> >>> MEM_CANCEL_OFFLINE could gain the reference back to balance the
> >>> MEM_GOING_OFFLINE step.
> >>
> >> The pages are already unisolated and could be used by the buddy. But again,
> >> I think you have an idea that tries to avoid putting pages to the buddy.
> > 
> > Yeah, set_page_count(page, 0) if you do not want to release that page
> > from the notifier context to reflect that the page is ok to be offlined
> > with the rest.
> >   
> 
> I neither see how you deal with __test_page_isolated_in_pageblock() nor with
> __offline_isolated_pages(). Sorry, but what I read is incomplete and you
> probably have a full proposal in your head. Please read below how I think
> you want to solve it.

Yeah, sorry that I am throwing incomplete ideas at you. I am just trying
to really nail down how to deal with reference counting here because it
is an important aspect.
 
> >>> explicit control via the reference count which is the standard way to
> >>> control the struct page life cycle.
> >>>
> >>> Anyway hooking into __put_page (which tends to be a hot path with
> >>> something that is barely used on most systems) doesn't sound nice to me.
> >>> This is the whole point which made me think about the whole reference
> >>> count approach in the first place.
> >>
> >> Again, the race I think that is possible
> >>
> >> somebody: get_page_unless_zero(page)
> >> virtio_mem: page_ref_dec(pfn_to_page(pfn)
> >> somebody: put_page() -> straight to the buddy
> > 
> > Who is that somebody? I thought that it is only the owner/driver to have
> > a control over the page. Also the above is not possible as long as the
> > owner/driver keeps a reference to the PageOffline page throughout the
> > time it is marked that way.
> >   
> 
> I was reading
> 
> include/linux/mm_types.h:
> 
> "If you want to use the refcount field, it must be used in such a way
>  that other CPUs temporarily incrementing and then decrementing the
>  refcount does not cause problems"
> 
> And that made me think "anybody can go ahead and try get_page_unless_zero()".
> 
> If I am missing something here and this can indeed not happen (e.g.,
> because PageOffline() pages are never mapped to user space), then I'll
> happily remove this code.

The point is that if the owner of the page is holding the only reference
to the page then it is clear that nothing like that's happened.
 
[...]

> Let's recap what I suggest:
> 
> "PageOffline() pages that have a reference count of 0 will be treated
>  like free pages when offlining pages, allowing the containing memory
>  block to get offlined. In case a driver wants to revive such a page, it
>  has to synchronize against memory onlining/offlining (e.g., using memory
>  notifiers) while incrementing the reference count. Also, a driver that
>  relies in this feature is aware that re-onlining the memory will require
>  to re-set the pages PageOffline() - e.g., via the online_page_callback_t."

OK

[...]
> d) __put_page() is modified to not return pages to the buddy in any
>case as a safety net. We might be able to get rid of that.

I do not like exactly this part
 
> What I think you suggest:
> 
> a) has_unmovable_pages() skips over all PageOffline() pages.
>This results in a lot of false negatives when trying to offline. Might be 
> ok.
> 
> b) The driver decrements the reference count of the PageOffline pages
>in MEM_GOING_OFFLINE.

Well, driver should make the page unreferenced or fail. What is done
really depends on the specific driver

> c) The driver increments the reference count of the PageOffline pages
>in MEM_CANCEL_OFFLINE. One issue might be that the pages are no longer
>isolated once we get that call. Might be ok.

Only previous PageBuddy pages are returned to the allocator IIRC. Mostly
because of MovablePage()

> d) How to make __test_page_isolated_in_pageblock() succeed?
>Like I propose in this patch (PageOffline() + refcount == 0)?

Yep

> e) How to make __offline_isolated_pages() succeed?
>Like I propose in this patch (PageOffline() + refcount == 0)?

Simply skip over PageOffline pages. Reference count should never be != 0
at this stage.
 
> In summary, is what you suggest simply delaying setting the reference count 
> to 0
> in MEM_GOING_OFFLINE instead of right away when the driver unpluggs the pages?

Yes

> What's the big benefit you see and I fail to see?

Aparat from no hooks into __put_page it is also an explicit control over
the page via reference counting. Do you see any downsides?
-- 
Michal Hocko
SUSE Labs
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] drm/simple-kms: Standardize arguments for callbacks

2019-10-23 Thread Daniel Vetter
On Wed, Oct 23, 2019 at 08:47:57AM +0200, Thomas Zimmermann wrote:
> Hi
> 
> Am 22.10.19 um 21:03 schrieb Daniel Vetter:
> > On Tue, Oct 22, 2019 at 7:16 PM Thomas Zimmermann  
> > wrote:
> >>
> >> Hi,
> >>
> >> there are two types of callbacks in struct
> >> drm_simple_display_pipe_funcs: functions that are genuine to simple KMS,
> >> and functions that are merely forwarded from another structure (crtc,
> >> plane, etc).
> >>
> >> In the former category are enable(), disable(), check(), and update().
> >> Those should probably receive a simple display pipe as their first 
> >> argument.
> > 
> > mode_valid _very_ much belongs to this category too, since there's
> > mode_valid hooks also on other objects. But simple pipe helper
> > condenses that down to one mode_valid hook (we could also put the
> > mode_valid onto encoder, wouldn't change anything).
> > 
> >> In the latter category are mode_valid(), prepare_fb(), cleanup_fb(),
> >> enable_vblank(), and disable_vblank(). IMHO those functions should
> >> receive a pointer to the original structure as their first argument.
> >> This type provides the context in which the operations make sense. (Even
> >> their documentation already refers to the original structure.)
> > 
> > Now on those you can maybe make a case that they only exist in one
> > object. But the entire point of simple helpers was to condense the zoo
> > of drm types down to one. Only reason you don't also get a
> > drm_simple_display_pipe_state is that this one would be a bit more
> > work to make work correctly. If we full on leak all the underlying
> > objects, then you might as well set them up yourself and set up all
> > the hooks, it's just a few more lines of code.
> > 
> > Imo for simple pipe we should go more into that direction, not less.
> > 
> >> I admit that not all are as shareable as prepare_fb() and enable_fb().
> >> But what else than boiler-plate wrappers do we get from simple display
> >> pipe here?
> > 
> > Boiler plate wrappers is pretty much the entire point of simple pipe
> > helpers. Anytime you're interested in the things it abstracts away
> > (crtc, plane, encoder) you probably want your own atomic
> > implementation.
> 
> I was speaking of boiler-plate code in drivers and other helpers (e.g.,
> drm_gem_fb_simple_display_pipe_prepare_fb() )
> 
> TBH I don't think it is possible to build and use simple pipe without
> exposing the underlying primitives (crtc, plane, connector). This would
> require a completely separate set of atomic helpers. IMHO the current
> simple pipe is a mid-layer and comes with typical mid-layer problems.

Helpers can be midlayers, as long as their optional. And for simple I
agree it's not a perfect illusion, it's a tradeoff between having a huge
helper library that remaps everything and still enabling useful code and
complexity savings in the tiny drivers.

Maybe another rule of thumb: If your driver has more than one source file,
simple pipe is maybe not what you're looking for. Exceptions apply ofc.
simple pipe was designed for drm/tiny, and it utterly excels at that. But
that doesn't make it a general purpose tool.

> Anyway, given your rational for the current design, I'll update my
> recent patches for prepare_fb() to support simple pipe.
> 
> For this patch
> 
>   Acked-By: Thomas Zimmermann 

Thanks, will apply.
-Daniel

> 
> Best regards
> Thomas
> 
> > conversion is a good fit, it's not meant to be useful for all small
> > drivers. Only for the _really_ simple ones.
> > 
> > Otherwise if we readd all the bells and whistles to simple pipe
> > helpers, then we just end back where we started. That's also why I
> > personally think explicit simple wrappers would fit better, instead of
> > wrestling the prepare/cleanup_fb functions to match full atomic
> > helpers.
> > -Daniel
> > 
> >>
> >> Best regards
> >> Thomas
> >>
> >> Am 22.10.19 um 17:55 schrieb Daniel Vetter:
> >>> Passing the wrong type feels icky, everywhere else we use the pipe as
> >>> the first parameter. Spotted while discussing patches with Thomas
> >>> Zimmermann.
> >>>
> >>> Cc: Thomas Zimmermann 
> >>> Cc: Noralf Trønnes 
> >>> Cc: Gerd Hoffmann 
> >>> Cc: Eric Anholt 
> >>> Cc: Emil Velikov 
> >>> Cc: virtualization@lists.linux-foundation.org
> >>> Cc: Linus Walleij 
> >>> Signed-off-by: Daniel Vetter 
> >>> ---
> >>>  drivers/gpu/drm/cirrus/cirrus.c | 2 +-
> >>>  drivers/gpu/drm/drm_simple_kms_helper.c | 2 +-
> >>>  drivers/gpu/drm/pl111/pl111_display.c   | 4 ++--
> >>>  include/drm/drm_simple_kms_helper.h | 2 +-
> >>>  4 files changed, 5 insertions(+), 5 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/cirrus/cirrus.c 
> >>> b/drivers/gpu/drm/cirrus/cirrus.c
> >>> index 7d08d067e1a4..248c9f765c45 100644
> >>> --- a/drivers/gpu/drm/cirrus/cirrus.c
> >>> +++ b/drivers/gpu/drm/cirrus/cirrus.c
> >>> @@ -390,7 +390,7 @@ static int cirrus_conn_init(struct cirrus_device 
> >>> *cirrus)
> >>>  /* ---

Re: [PATCH 0/3] kcov: collect coverage from usb and vhost

2019-10-23 Thread Dmitry Vyukov via Virtualization
On Tue, Oct 22, 2019 at 6:46 PM Andrey Konovalov  wrote:
>
> This patchset extends kcov to allow collecting coverage from the USB
> subsystem and vhost workers. See the first patch description for details
> about the kcov extension. The other two patches apply this kcov extension
> to USB and vhost.
>
> These patches have been used to enable coverage-guided USB fuzzing with
> syzkaller for the last few years, see the details here:
>
> https://github.com/google/syzkaller/blob/master/docs/linux/external_fuzzing_usb.md
>
> This patchset has been pushed to the public Linux kernel Gerrit instance:
>
> https://linux-review.googlesource.com/c/linux/kernel/git/torvalds/linux/+/1524

Oh, so much easier to review with side-by-side diffs, context and
smart in-line colouring!

> Changes from RFC v1:
> - Remove unnecessary #ifdef's from drivers/vhost/vhost.c.
> - Reset t->kcov when area allocation fails in kcov_remote_start().
> - Use struct_size to calculate array size in kcov_ioctl().
> - Add a limit on area_size in kcov_remote_arg.
> - Added kcov_disable() helper.
> - Changed encoding of kcov remote handle ids, see the documentation.
> - Added a comment reference for kcov_sequence task_struct field.
> - Change common_handle type to u32.
> - Add checks for handle validity into kcov_ioctl_locked() and
> kcov_remote_start().
> - Updated documentation to reflect the changes.
>
> Andrey Konovalov (3):
>   kcov: remote coverage support
>   usb, kcov: collect coverage from hub_event
>   vhost, kcov: collect coverage from vhost_worker
>
>  Documentation/dev-tools/kcov.rst | 120 
>  drivers/usb/core/hub.c   |   5 +
>  drivers/vhost/vhost.c|   6 +
>  drivers/vhost/vhost.h|   1 +
>  include/linux/kcov.h |   6 +
>  include/linux/sched.h|   6 +
>  include/uapi/linux/kcov.h|  20 ++
>  kernel/kcov.c| 464 ---
>  8 files changed, 593 insertions(+), 35 deletions(-)
>
> --
> 2.23.0.866.gb869b98d4c-goog
>
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 3/3] vhost, kcov: collect coverage from vhost_worker

2019-10-23 Thread Dmitry Vyukov via Virtualization
On Tue, Oct 22, 2019 at 6:46 PM Andrey Konovalov  wrote:
>
> This patch adds kcov_remote_start()/kcov_remote_stop() annotations to the
> vhost_worker() function, which is responsible for processing vhost works.
> Since vhost_worker() threads are spawned per vhost device instance
> the common kcov handle is used for kcov_remote_start()/stop() annotations
> (see Documentation/dev-tools/kcov.rst for details). As the result kcov can
> now be used to collect coverage from vhost worker threads.
>
> Signed-off-by: Andrey Konovalov 
> ---
>  drivers/vhost/vhost.c | 6 ++
>  drivers/vhost/vhost.h | 1 +
>  2 files changed, 7 insertions(+)
>
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index 36ca2cf419bf..a5a557c4b67f 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -30,6 +30,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #include "vhost.h"
>
> @@ -357,7 +358,9 @@ static int vhost_worker(void *data)
> llist_for_each_entry_safe(work, work_next, node, node) {
> clear_bit(VHOST_WORK_QUEUED, &work->flags);
> __set_current_state(TASK_RUNNING);
> +   kcov_remote_start(dev->kcov_handle);
> work->fn(work);
> +   kcov_remote_stop();
> if (need_resched())
> schedule();
> }
> @@ -546,6 +549,7 @@ long vhost_dev_set_owner(struct vhost_dev *dev)
>
> /* No owner, become one */
> dev->mm = get_task_mm(current);
> +   dev->kcov_handle = current->kcov_handle;

kcov_handle is not present in task_struct if !CONFIG_KCOV

Also this does not use KCOV_SUBSYSTEM_COMMON.
We discussed something along the following lines:

u64 kcov_remote_handle(u64 subsys, u64 id)
{
  WARN_ON(subsys or id has wrong bits set).
  return ...;
}

kcov_remote_handle(KCOV_SUBSYSTEM_USB, bus);
kcov_remote_handle(KCOV_SUBSYSTEM_COMMON, current->kcov_handle);


> worker = kthread_create(vhost_worker, dev, "vhost-%d", current->pid);
> if (IS_ERR(worker)) {
> err = PTR_ERR(worker);
> @@ -571,6 +575,7 @@ long vhost_dev_set_owner(struct vhost_dev *dev)
> if (dev->mm)
> mmput(dev->mm);
> dev->mm = NULL;
> +   dev->kcov_handle = 0;
>  err_mm:
> return err;
>  }
> @@ -682,6 +687,7 @@ void vhost_dev_cleanup(struct vhost_dev *dev)
> if (dev->worker) {
> kthread_stop(dev->worker);
> dev->worker = NULL;
> +   dev->kcov_handle = 0;
> }
> if (dev->mm)
> mmput(dev->mm);
> diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
> index e9ed2722b633..a123fd70847e 100644
> --- a/drivers/vhost/vhost.h
> +++ b/drivers/vhost/vhost.h
> @@ -173,6 +173,7 @@ struct vhost_dev {
> int iov_limit;
> int weight;
> int byte_weight;
> +   u64 kcov_handle;
>  };
>
>  bool vhost_exceeds_weight(struct vhost_virtqueue *vq, int pkts, int 
> total_len);
> --
> 2.23.0.866.gb869b98d4c-goog
>
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v2] vhost: introduce mdev based hardware backend

2019-10-23 Thread Jason Wang


On 2019/10/23 下午3:07, Tiwei Bie wrote:

On Wed, Oct 23, 2019 at 01:46:23PM +0800, Jason Wang wrote:

On 2019/10/23 上午11:02, Tiwei Bie wrote:

On Tue, Oct 22, 2019 at 09:30:16PM +0800, Jason Wang wrote:

On 2019/10/22 下午5:52, Tiwei Bie wrote:

This patch introduces a mdev based hardware vhost backend.
This backend is built on top of the same abstraction used
in virtio-mdev and provides a generic vhost interface for
userspace to accelerate the virtio devices in guest.

This backend is implemented as a mdev device driver on top
of the same mdev device ops used in virtio-mdev but using
a different mdev class id, and it will register the device
as a VFIO device for userspace to use. Userspace can setup
the IOMMU with the existing VFIO container/group APIs and
then get the device fd with the device name. After getting
the device fd of this device, userspace can use vhost ioctls
to setup the backend.

Signed-off-by: Tiwei Bie 
---
This patch depends on below series:
https://lkml.org/lkml/2019/10/17/286

v1 -> v2:
- Replace _SET_STATE with _SET_STATUS (MST);
- Check status bits at each step (MST);
- Report the max ring size and max number of queues (MST);
- Add missing MODULE_DEVICE_TABLE (Jason);
- Only support the network backend w/o multiqueue for now;

Any idea on how to extend it to support devices other than net? I think we
want a generic API or an API that could be made generic in the future.

Do we want to e.g having a generic vhost mdev for all kinds of devices or
introducing e.g vhost-net-mdev and vhost-scsi-mdev?

One possible way is to do what vhost-user does. I.e. Apart from
the generic ring, features, ... related ioctls, we also introduce
device specific ioctls when we need them. As vhost-mdev just needs
to forward configs between parent and userspace and even won't
cache any info when possible,


So it looks to me this is only possible if we expose e.g set_config and
get_config to userspace.

The set_config and get_config interface isn't really everything
of device specific settings. We also have ctrlq in virtio-net.



Yes, but it could be processed by the exist API. Isn't it? Just set ctrl 
vq address and let parent to deal with that.








I think it might be better to do
this in one generic vhost-mdev module.


Looking at definitions of VhostUserRequest in qemu, it mixed generic API
with device specific API. If we want go this ways (a generic vhost-mdev),
more questions needs to be answered:

1) How could userspace know which type of vhost it would use? Do we need to
expose virtio subsystem device in for userspace this case?

2) That generic vhost-mdev module still need to filter out unsupported
ioctls for a specific type. E.g if it probes a net device, it should refuse
API for other type. This in fact a vhost-mdev-net but just not modularize it
on top of vhost-mdev.



- Some minor fixes and improvements;
- Rebase on top of virtio-mdev series v4;

[...]

+
+static long vhost_mdev_get_features(struct vhost_mdev *m, u64 __user *featurep)
+{
+   if (copy_to_user(featurep, &m->features, sizeof(m->features)))
+   return -EFAULT;

As discussed in previous version do we need to filter out MQ feature here?

I think it's more straightforward to let the parent drivers to
filter out the unsupported features. Otherwise it would be tricky
when we want to add more features in vhost-mdev module,


It's as simple as remove the feature from blacklist?

It's not really that easy. It may break the old drivers.



I'm not sure I understand here, we do feature negotiation anyhow. For 
old drivers do you mean the guest drivers without MQ?








i.e. if
the parent drivers may expose unsupported features and relay on
vhost-mdev to filter them out, these features will be exposed
to userspace automatically when they are enabled in vhost-mdev
in the future.


The issue is, it's only that vhost-mdev knows its own limitation. E.g in
this patch, vhost-mdev only implements a subset of transport API, but parent
doesn't know about that.

Still MQ as an example, there's no way (or no need) for parent to know that
vhost-mdev does not support MQ.

The mdev is a MDEV_CLASS_ID_VHOST mdev device. When the parent
is being developed, it should know the currently supported features
of vhost-mdev.



How can parent know MQ is not supported by vhost-mdev?





And this allows old kenrel to work with new
parent drivers.

The new drivers should provide things like VIRTIO_MDEV_F_VERSION_1
to be compatible with the old kernels. When VIRTIO_MDEV_F_VERSION_1
is provided/negotiated, the behaviours should be consistent.



To be clear, I didn't mean a change in virtio-mdev API, I meant:

1) old vhost-mdev kernel driver that filters out MQ

2) new parent driver that support MQ





So basically we have three choices here:

1) Implement what vhost-user did and implement a generic vhost-mdev (but may
still have lots of device specific code). To support advanced feature which
requires the access to config, still lots of API tha

Re: [PATCH v2] vhost: introduce mdev based hardware backend

2019-10-23 Thread Tiwei Bie
On Wed, Oct 23, 2019 at 01:46:23PM +0800, Jason Wang wrote:
> On 2019/10/23 上午11:02, Tiwei Bie wrote:
> > On Tue, Oct 22, 2019 at 09:30:16PM +0800, Jason Wang wrote:
> > > On 2019/10/22 下午5:52, Tiwei Bie wrote:
> > > > This patch introduces a mdev based hardware vhost backend.
> > > > This backend is built on top of the same abstraction used
> > > > in virtio-mdev and provides a generic vhost interface for
> > > > userspace to accelerate the virtio devices in guest.
> > > > 
> > > > This backend is implemented as a mdev device driver on top
> > > > of the same mdev device ops used in virtio-mdev but using
> > > > a different mdev class id, and it will register the device
> > > > as a VFIO device for userspace to use. Userspace can setup
> > > > the IOMMU with the existing VFIO container/group APIs and
> > > > then get the device fd with the device name. After getting
> > > > the device fd of this device, userspace can use vhost ioctls
> > > > to setup the backend.
> > > > 
> > > > Signed-off-by: Tiwei Bie 
> > > > ---
> > > > This patch depends on below series:
> > > > https://lkml.org/lkml/2019/10/17/286
> > > > 
> > > > v1 -> v2:
> > > > - Replace _SET_STATE with _SET_STATUS (MST);
> > > > - Check status bits at each step (MST);
> > > > - Report the max ring size and max number of queues (MST);
> > > > - Add missing MODULE_DEVICE_TABLE (Jason);
> > > > - Only support the network backend w/o multiqueue for now;
> > > 
> > > Any idea on how to extend it to support devices other than net? I think we
> > > want a generic API or an API that could be made generic in the future.
> > > 
> > > Do we want to e.g having a generic vhost mdev for all kinds of devices or
> > > introducing e.g vhost-net-mdev and vhost-scsi-mdev?
> > One possible way is to do what vhost-user does. I.e. Apart from
> > the generic ring, features, ... related ioctls, we also introduce
> > device specific ioctls when we need them. As vhost-mdev just needs
> > to forward configs between parent and userspace and even won't
> > cache any info when possible,
> 
> 
> So it looks to me this is only possible if we expose e.g set_config and
> get_config to userspace.

The set_config and get_config interface isn't really everything
of device specific settings. We also have ctrlq in virtio-net.

> 
> 
> > I think it might be better to do
> > this in one generic vhost-mdev module.
> 
> 
> Looking at definitions of VhostUserRequest in qemu, it mixed generic API
> with device specific API. If we want go this ways (a generic vhost-mdev),
> more questions needs to be answered:
> 
> 1) How could userspace know which type of vhost it would use? Do we need to
> expose virtio subsystem device in for userspace this case?
> 
> 2) That generic vhost-mdev module still need to filter out unsupported
> ioctls for a specific type. E.g if it probes a net device, it should refuse
> API for other type. This in fact a vhost-mdev-net but just not modularize it
> on top of vhost-mdev.
> 
> 
> > 
> > > 
> > > > - Some minor fixes and improvements;
> > > > - Rebase on top of virtio-mdev series v4;
[...]
> > > > +
> > > > +static long vhost_mdev_get_features(struct vhost_mdev *m, u64 __user 
> > > > *featurep)
> > > > +{
> > > > +   if (copy_to_user(featurep, &m->features, sizeof(m->features)))
> > > > +   return -EFAULT;
> > > 
> > > As discussed in previous version do we need to filter out MQ feature here?
> > I think it's more straightforward to let the parent drivers to
> > filter out the unsupported features. Otherwise it would be tricky
> > when we want to add more features in vhost-mdev module,
> 
> 
> It's as simple as remove the feature from blacklist?

It's not really that easy. It may break the old drivers.

> 
> 
> > i.e. if
> > the parent drivers may expose unsupported features and relay on
> > vhost-mdev to filter them out, these features will be exposed
> > to userspace automatically when they are enabled in vhost-mdev
> > in the future.
> 
> 
> The issue is, it's only that vhost-mdev knows its own limitation. E.g in
> this patch, vhost-mdev only implements a subset of transport API, but parent
> doesn't know about that.
> 
> Still MQ as an example, there's no way (or no need) for parent to know that
> vhost-mdev does not support MQ.

The mdev is a MDEV_CLASS_ID_VHOST mdev device. When the parent
is being developed, it should know the currently supported features
of vhost-mdev.

> And this allows old kenrel to work with new
> parent drivers.

The new drivers should provide things like VIRTIO_MDEV_F_VERSION_1
to be compatible with the old kernels. When VIRTIO_MDEV_F_VERSION_1
is provided/negotiated, the behaviours should be consistent.

> 
> So basically we have three choices here:
> 
> 1) Implement what vhost-user did and implement a generic vhost-mdev (but may
> still have lots of device specific code). To support advanced feature which
> requires the access to config, still lots of API that needs to be added.
> 
> 2) Implement what vhost-kern