On Tue, Sep 17, 2019 at 11:32:03AM +0800, Jason Wang wrote:
> On 2019/9/17 上午9:02, Tiwei Bie wrote:
> > This RFC is to demonstrate below ideas,
> > 
> > a) Build vhost-mdev on top of the same abstraction defined in
> >     the virtio-mdev series [1];
> > 
> > b) Introduce /dev/vhost-mdev to do vhost ioctls and support
> >     setting mdev device as backend;
> > 
> > Now the userspace API looks like this:
> > 
> > - Userspace generates a compatible mdev device;
> > 
> > - Userspace opens this mdev device with VFIO API (including
> >    doing IOMMU programming for this mdev device with VFIO's
> >    container/group based interface);
> > 
> > - Userspace opens /dev/vhost-mdev and gets vhost fd;
> > 
> > - Userspace uses vhost ioctls to setup vhost (userspace should
> >    do VHOST_MDEV_SET_BACKEND ioctl with VFIO group fd and device
> >    fd first before doing other vhost ioctls);
> > 
> > Only compile test has been done for this series for now.
> 
> 
> Have a hard thought on the architecture:

Thanks a lot! Do appreciate it!

> 
> 1) Create a vhost char device and pass vfio mdev device fd to it as a
> backend and translate vhost-mdev ioctl to virtio mdev transport (e.g
> read/write). DMA was done through the VFIO DMA mapping on the container that
> is attached.

Yeah, that's what we are doing in this series.

> 
> We have two more choices:
> 
> 2) Use vfio-mdev but do not create vhost-mdev device, instead, just
> implement vhost ioctl on vfio_device_ops, and translate them into
> virtio-mdev transport or just pass ioctl to parent.

Yeah. Instead of introducing /dev/vhost-mdev char device, do
vhost ioctls on VFIO device fd directly. That's what we did
in RFC v3.

> 
> 3) Don't use vfio-mdev, create a new vhost-mdev driver, during probe still
> try to add dev to vfio group and talk to parent with device specific ops

If my understanding is correct, this means we need to introduce
a new VFIO device driver to replace the existing vfio-mdev driver
in our case. Below is a quick draft just to show my understanding:

#include <linux/init.h>
#include <linux/module.h>
#include <linux/device.h>
#include <linux/kernel.h>
#include <linux/slab.h>
#include <linux/vfio.h>
#include <linux/mdev.h>

#include "mdev_private.h"

/* XXX: we need a proper way to include below vhost header. */
#include "../../vhost/vhost.h"

static int vfio_vhost_mdev_open(void *device_data)
{
        if (!try_module_get(THIS_MODULE))
                return -ENODEV;

        /* ... */
        vhost_dev_init(...);

        return 0;
}

static void vfio_vhost_mdev_release(void *device_data)
{
        /* ... */
        module_put(THIS_MODULE);
}

static long vfio_vhost_mdev_unlocked_ioctl(void *device_data,
                                           unsigned int cmd, unsigned long arg)
{
        struct mdev_device *mdev = device_data;
        struct mdev_parent *parent = mdev->parent;

        /*
         * Use vhost ioctls.
         *
         * We will have a different parent_ops design.
         * And potentially, we can share the same parent_ops
         * with virtio_mdev.
         */
        switch (cmd) {
        case VHOST_GET_FEATURES:
                parent->ops->get_features(mdev, ...);
                break;
        /* ... */
        }

        return 0;
}

static ssize_t vfio_vhost_mdev_read(void *device_data, char __user *buf,
                                    size_t count, loff_t *ppos)
{
        /* ... */
        return 0;
}

static ssize_t vfio_vhost_mdev_write(void *device_data, const char __user *buf,
                                     size_t count, loff_t *ppos)
{
        /* ... */
        return 0;
}

static int vfio_vhost_mdev_mmap(void *device_data, struct vm_area_struct *vma)
{
        /* ... */
        return 0;
}

static const struct vfio_device_ops vfio_vhost_mdev_dev_ops = {
        .name           = "vfio-vhost-mdev",
        .open           = vfio_vhost_mdev_open,
        .release        = vfio_vhost_mdev_release,
        .ioctl          = vfio_vhost_mdev_unlocked_ioctl,
        .read           = vfio_vhost_mdev_read,
        .write          = vfio_vhost_mdev_write,
        .mmap           = vfio_vhost_mdev_mmap,
};

static int vfio_vhost_mdev_probe(struct device *dev)
{
        struct mdev_device *mdev = to_mdev_device(dev);

        /* ... */
        return vfio_add_group_dev(dev, &vfio_vhost_mdev_dev_ops, mdev);
}

static void vfio_vhost_mdev_remove(struct device *dev)
{
        /* ... */
        vfio_del_group_dev(dev);
}

static struct mdev_driver vfio_vhost_mdev_driver = {
        .name   = "vfio_vhost_mdev",
        .probe  = vfio_vhost_mdev_probe,
        .remove = vfio_vhost_mdev_remove,
};

static int __init vfio_vhost_mdev_init(void)
{
        return mdev_register_driver(&vfio_vhost_mdev_driver, THIS_MODULE);
}
module_init(vfio_vhost_mdev_init)

static void __exit vfio_vhost_mdev_exit(void)
{
        mdev_unregister_driver(&vfio_vhost_mdev_driver);
}
module_exit(vfio_vhost_mdev_exit)

> 
> So I have some questions:
> 
> 1) Compared to method 2, what's the advantage of creating a new vhost char
> device? I guess it's for keep the API compatibility?

One benefit is that we can avoid doing vhost ioctls on
VFIO device fd.

> 
> 2) For method 2, is there any easy way for user/admin to distinguish e.g
> ordinary vfio-mdev for vhost from ordinary vfio-mdev?

I think device-api could be a choice.

> I saw you introduce
> ops matching helper but it's not friendly to management.

The ops matching helper is just to check whether a given
vfio-device is based on a mdev device.

> 
> 3) A drawback of 1) and 2) is that it must follow vfio_device_ops that
> assumes the parameter comes from userspace, it prevents support kernel
> virtio drivers.
> 
> 4) So comes the idea of method 3, since it register a new vhost-mdev driver,
> we can use device specific ops instead of VFIO ones, then we can have a
> common API between vDPA parent and vhost-mdev/virtio-mdev drivers.

As the above draft shows, this requires introducing a new
VFIO device driver. I think Alex's opinion matters here.

Thanks,
Tiwei

> 
> What's your thoughts?
> 
> Thanks
> 
> 
> > 
> > RFCv3: https://patchwork.kernel.org/patch/11117785/
> > 
> > [1] https://lkml.org/lkml/2019/9/10/135
> > 
> > Tiwei Bie (3):
> >    vfio: support getting vfio device from device fd
> >    vfio: support checking vfio driver by device ops
> >    vhost: introduce mdev based hardware backend
> > 
> >   drivers/vfio/mdev/vfio_mdev.c    |   3 +-
> >   drivers/vfio/vfio.c              |  32 +++
> >   drivers/vhost/Kconfig            |   9 +
> >   drivers/vhost/Makefile           |   3 +
> >   drivers/vhost/mdev.c             | 462 +++++++++++++++++++++++++++++++
> >   drivers/vhost/vhost.c            |  39 ++-
> >   drivers/vhost/vhost.h            |   6 +
> >   include/linux/vfio.h             |  11 +
> >   include/uapi/linux/vhost.h       |  10 +
> >   include/uapi/linux/vhost_types.h |   5 +
> >   10 files changed, 573 insertions(+), 7 deletions(-)
> >   create mode 100644 drivers/vhost/mdev.c
> > 
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Reply via email to