Re: [PATCH v9 01/12] vfio: Mediated device Core driver

2016-10-27 Thread Jike Song
On 10/21/2016 01:12 AM, Alex Williamson wrote:
> On Thu, 20 Oct 2016 15:23:53 +0800
> Jike Song  wrote:
> 
>> On 10/18/2016 05:22 AM, Kirti Wankhede wrote:
>>> diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
>>> new file mode 100644
>>> index ..7db5ec164aeb
>>> --- /dev/null
>>> +++ b/drivers/vfio/mdev/mdev_core.c
>>> @@ -0,0 +1,372 @@
>>> +/*
>>> + * Mediated device Core Driver
>>> + *
>>> + * Copyright (c) 2016, NVIDIA CORPORATION. All rights reserved.
>>> + * Author: Neo Jia 
>>> + *         Kirti Wankhede 
>>> + *
>>> + * This program is free software; you can redistribute it and/or modify
>>> + * it under the terms of the GNU General Public License version 2 as
>>> + * published by the Free Software Foundation.
>>> + */
>>> +
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +
>>> +#include "mdev_private.h"
>>> +
>>> +#define DRIVER_VERSION "0.1"
>>> +#define DRIVER_AUTHOR  "NVIDIA Corporation"
>>> +#define DRIVER_DESC    "Mediated device Core Driver"
>>> +
>>> +static LIST_HEAD(parent_list);
>>> +static DEFINE_MUTEX(parent_list_lock);
>>> +static struct class_compat *mdev_bus_compat_class;
>>> +  
>>
>>> +
>>> +/*
>>> + * mdev_register_device : Register a device
>>> + * @dev: device structure representing parent device.
>>> + * @ops: Parent device operation structure to be registered.
>>> + *
>>> + * Add device to list of registered parent devices.
>>> + * Returns a negative value on error, otherwise 0.
>>> + */
>>> +int mdev_register_device(struct device *dev, const struct parent_ops *ops)
>>> +{
>>> +	int ret = 0;
>>> +	struct parent_device *parent;
>>> +
>>> +	/* check for mandatory ops */
>>> +	if (!ops || !ops->create || !ops->remove || !ops->supported_type_groups)
>>> +		return -EINVAL;
>>> +
>>> +	dev = get_device(dev);
>>> +	if (!dev)
>>> +		return -EINVAL;
>>> +
>>> +	mutex_lock(&parent_list_lock);
>>> +
>>> +	/* Check for duplicate */
>>> +	parent = __find_parent_device(dev);
>>> +	if (parent) {
>>> +		ret = -EEXIST;
>>> +		goto add_dev_err;
>>> +	}
>>> +
>>> +	parent = kzalloc(sizeof(*parent), GFP_KERNEL);
>>> +	if (!parent) {
>>> +		ret = -ENOMEM;
>>> +		goto add_dev_err;
>>> +	}
>>> +
>>> +	kref_init(&parent->ref);
>>> +
>>> +	parent->dev = dev;
>>> +	parent->ops = ops;
>>> +
>>> +	ret = parent_create_sysfs_files(parent);
>>> +	if (ret) {
>>> +		mutex_unlock(&parent_list_lock);
>>> +		mdev_put_parent(parent);
>>> +		return ret;
>>> +	}
>>> +
>>> +	ret = class_compat_create_link(mdev_bus_compat_class, dev, NULL);
>>> +	if (ret)
>>> +		dev_warn(dev, "Failed to create compatibility class link\n");
>>> +
>>> +	list_add(&parent->next, &parent_list);
>>> +	mutex_unlock(&parent_list_lock);
>>> +
>>> +	dev_info(dev, "MDEV: Registered\n");
>>> +	return 0;
>>> +
>>> +add_dev_err:
>>> +	mutex_unlock(&parent_list_lock);
>>> +	put_device(dev);
>>> +	return ret;
>>> +}
>>> +EXPORT_SYMBOL(mdev_register_device);
>>
>>> +static int __init mdev_init(void)
>>> +{
>>> +	int ret;
>>> +
>>> +	ret = mdev_bus_register();
>>> +	if (ret) {
>>> +		pr_err("Failed to register mdev bus\n");
>>> +		return ret;
>>> +	}
>>> +
>>> +	mdev_bus_compat_class = class_compat_register("mdev_bus");
>>> +	if (!mdev_bus_compat_class) {
>>> +		mdev_bus_unregister();
>>> +		return -ENOMEM;
>>> +	}
>>> +
>>> +	/*
>>> +	 * Attempt to load known vfio_mdev.  This gives us a working environment
>>> +	 * without the user needing to explicitly load vfio_mdev driver.
>>> +	 */
>>> +	request_module_nowait("vfio_mdev");
>>> +
>>> +	return ret;
>>> +}
>>> +
>>> +static void __exit mdev_exit(void)
>>> +{
>>> +	class_compat_unregister(mdev_bus_compat_class);
>>> +	mdev_bus_unregister();
>>> +}
>>> +
>>> +module_init(mdev_init)
>>> +module_exit(mdev_exit)
>>
>> Hi Kirti,
>>
>> There is a possible issue: mdev_bus_register is called from mdev_init,
>> a module_init, which is equivalent to device_initcall if built into
>> vmlinux; however, the vendor driver, say i915.ko in the Intel case, has
>> to call mdev_register_device from its module_init: at that time,
>> mdev_init may not have been called yet.
>>
>> Not sure if this issue exists with nvidia.ko. Though in most cases we
>> expect users to select mdev as a standalone module, we still shouldn't
>> break the built-in case.
>>
>>
>> Hi Alex, do you have any suggestion here?
> 
> To fully solve the problem of built-in drivers making use of the mdev
> infrastructure we'd need to make mdev itself builtin and possibly a
> subsystem that is initialized prior to device drivers.  Is that really
> necessary?  Even though i915.ko is often loaded as part of an
> initramfs, most systems still build it as a module.  I would expect
> that standard module dependencies will pull in the necessary mdev and
> vfio modules to make this work correctly.  I can't 
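
The ordering concern in this exchange can be modelled in userspace. The
following is only a rough sketch: the demo_* names, the -ENODEV result and
the simplified two-level table are illustrative assumptions, not kernel
code. It relies on two facts: a built-in module_init() runs at
device_initcall level, and calls within one level run in link order.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Two of the kernel's initcall levels; the numbers follow init.h. */
enum demo_level { DEMO_SUBSYS = 4, DEMO_DEVICE = 6 };

struct demo_initcall {
	enum demo_level level;
	int (*fn)(void);
};

static bool demo_bus_ready;	/* set once the mdev core has initialised */
static int demo_vendor_result;	/* what the vendor driver's init observed */

/* Stand-in for mdev_init(): makes the bus available. */
static int demo_mdev_init(void)
{
	demo_bus_ready = true;
	return 0;
}

/* Stand-in for a vendor module_init() that registers a parent device.
 * Failing with -ENODEV is an assumption made for this model. */
static int demo_vendor_init(void)
{
	demo_vendor_result = demo_bus_ready ? 0 : -ENODEV;
	return demo_vendor_result;
}

/* Run calls level by level, keeping array ("link") order within a level. */
static void demo_run_initcalls(const struct demo_initcall *calls, size_t n)
{
	int level;
	size_t i;

	assert(calls != NULL);
	demo_bus_ready = false;
	for (level = 0; level <= 7; level++)
		for (i = 0; i < n; i++)
			if ((int)calls[i].level == level)
				calls[i].fn();
}
```

With both calls at device level and the vendor driver linked first, the
vendor init runs before the core is ready. Moving the core to an earlier
level, which is the "subsystem initialized prior to device drivers" idea,
removes the dependency on link order.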

Re: [PATCH v9 01/12] vfio: Mediated device Core driver

2016-10-26 Thread Kirti Wankhede

>> Medisted bus driver is responsible to add/delete mediated devices to/from
> 
> Medisted -> Mediated
>

Thanks for pointing out the typo. Correcting it.


>> VFIO group when devices are bound and unbound to the driver.
>>
>> 2. Physical device driver interface
>> This interface provides vendor driver the set APIs to manage physical
>> device related work in its driver. APIs are :
>>
>> * dev_attr_groups: attributes of the parent device.
>> * mdev_attr_groups: attributes of the mediated device.
>> * supported_type_groups: attributes to define supported type. This is
>>   mandatory field.
>> * create: to allocate basic resources in driver for a mediated device.
> 
> in 'which driver'? it should be clear to remove 'in driver' here
> 
>> * remove: to free resources in driver when mediated device is destroyed.
>> * open: open callback of mediated device
>> * release: release callback of mediated device
>> * read : read emulation callback.
>> * write: write emulation callback.
>> * mmap: mmap emulation callback.
>> * ioctl: ioctl callback.
> 
> You only highlight 'mandatory field' for supported_type_groups. What
> about other fields? Are all of them optional? Please clarify and also
> stay consistent to later code comment.
> 

'create' and 'remove' are mandatory; I am updating the description here.
The remaining callbacks are optional and, unlike 'create' and 'remove',
are not checked in the mdev core driver. If a vendor driver doesn't want
to support an emulated region, it doesn't need the read/write callbacks.
Similarly, if a vendor driver doesn't want to support an mmap region, it
doesn't need the mmap callback.
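
To illustrate the split between mandatory and optional callbacks, here is
a minimal userspace sketch. struct demo_parent_ops and the demo_* helpers
are hypothetical stand-ins with simplified types; only the mandatory-ops
test mirrors the check in mdev_register_device(). Returning -EINVAL for a
missing optional callback is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-in for struct parent_ops; only a few members shown. */
struct demo_parent_ops {
	const void *supported_type_groups;	/* mandatory */
	int (*create)(void *mdev);		/* mandatory */
	int (*remove)(void *mdev);		/* mandatory */
	long (*read)(void *mdev, char *buf, size_t count);	/* optional */
	int (*mmap)(void *mdev, void *vma);			/* optional */
};

/* Same condition as the "check for mandatory ops" test in the patch. */
static int demo_check_ops(const struct demo_parent_ops *ops)
{
	if (!ops || !ops->create || !ops->remove || !ops->supported_type_groups)
		return -EINVAL;
	return 0;
}

/* Assumed behaviour: an absent optional callback is only detected when
 * it would be called, not at registration time. */
static long demo_read(const struct demo_parent_ops *ops, void *mdev,
		      char *buf, size_t count)
{
	assert(ops != NULL);
	if (!ops->read)
		return -EINVAL;
	return ops->read(mdev, buf, count);
}

/* Trivial callbacks so that a valid ops table can be built. */
static int demo_create(void *mdev) { (void)mdev; return 0; }
static int demo_remove(void *mdev) { (void)mdev; return 0; }
static const int demo_groups = 0;	/* placeholder for attribute groups */
```

A table that sets only supported_type_groups, create and remove passes
demo_check_ops(), while demo_read() on the same table reports an error
because no read callback was supplied.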

Code comments are consistent with this description.

...
>> +
>> +config VFIO_MDEV
>> +	tristate "Mediated device driver framework"
>> +	depends on VFIO
>> +	default n
>> +	help
>> +	  Provides a framework to virtualize devices which don't have SR_IOV
>> +	  capability built-in.
> 
> This statement is not accurate. A device can support SR-IOV, but in the same
> time using this mediated technology w/ SR-IOV capability disabled.
> 

If SR-IOV is supported, why would a user use this framework? SR-IOV would
give better performance.

...
>> +
>> +static struct mdev_device *__find_mdev_device(struct parent_device *parent,
>> +  uuid_le uuid)
> 
> parent_find_mdev_device?
> 

This function searches for an mdev device with the given UUID, so I think
it's consistent with what we have below for the parent,
__find_parent_device().

...

>> +static int mdev_device_remove_ops(struct mdev_device *mdev, bool force_remove)
>> +{
> 
> 
> Can you add some comment here about when force_remove may be expected
> here, which would help others understand immediately instead of walking
> through the whole patch set?
>


mdev_device_remove_ops() gets called from sysfs's 'remove' and when the
parent device is being unregistered from the mdev device framework.
- 'force_remove' is set to 'false' when called from sysfs's 'remove'.
  This indicates that if the mdev device is active, i.e. used by a VMM or
  userspace application, the vendor driver can return an error and the
  device is then not removed.
- 'force_remove' is set to 'true' when called from
  mdev_unregister_device(). This indicates that the parent device is
  being removed from the mdev device framework, so the mdev device is
  removed forcefully.

> 
>> +	struct parent_device *parent = mdev->parent;
>> +	int ret;
>> +
>> +	/*
>> +	 * Vendor driver can return error if VMM or userspace application is
>> +	 * using this mdev device.
>> +	 */
>> +	ret = parent->ops->remove(mdev);
> 
> what about passing force_remove flag to remove callback, so vendor driver
> can decide whether any force cleanup required?
>

'remove' getting called from sysfs is asynchronous, so the vendor driver
can return failure in that case if it finds that the mdev device is
being actively used.

mdev_unregister_device() is going to be called from the vendor driver
itself when the device is being unbound or the driver is being unloaded.
In this case the vendor driver can identify by itself that it's in its
own teardown path.

So I feel there is no need to pass the force_remove flag to the 'remove'
callback.



>> +int  mdev_create_sysfs_files(struct device *dev, struct mdev_type *type)
>> +{
>> +	int ret;
>> +
>> +	ret = sysfs_create_files(&dev->kobj, mdev_device_attrs);
>> +	if (ret) {
>> +		pr_err("Failed to create remove sysfs entry\n");
>> +		return ret;
>> +	}
>> +
>> +	ret = sysfs_create_link(type->devices_kobj, &dev->kobj, dev_name(dev));
>> +	if (ret) {
>> +		pr_err("Failed to create symlink in types\n");
> 
> looks wrong place...
>

No, this is correct. The above function creates a symlink in the
mdev_supported_types//devices directory.

>> +		goto device_link_failed;
>> +	}
>> +
>> +	ret = sysfs_create_link(&dev->kobj, &type->kobj, "mdev_type");
>> +	if (ret) {
>> +		pr_err("Failed to create symlink in device directory\n");
> 
> exchange with above.
> 
Again this is also 

RE: [PATCH v9 01/12] vfio: Mediated device Core driver

2016-10-26 Thread Tian, Kevin
> From: Kirti Wankhede [mailto:kwankh...@nvidia.com]
> Sent: Tuesday, October 18, 2016 5:22 AM
> 
> Design for Mediated Device Driver:
> Main purpose of this driver is to provide a common interface for mediated
> device management that can be used by different drivers of different
> devices.
> 
> This module provides a generic interface to create the device, add it to
> mediated bus, add device to IOMMU group and then add it to vfio group.
> 
> Below is the high Level block diagram, with Nvidia, Intel and IBM devices
> as example, since these are the devices which are going to actively use
> this module as of now.
> 
>      +---------------+
>      |               |
>      | +-----------+ |  mdev_register_driver() +--------------+
>      | |           | +<------------------------+ __init()     |
>      | |  mdev     | |                         |              |
>      | |  bus      | +------------------------>+              |<-> VFIO user
>      | |  driver   | |     probe()/remove()    | vfio_mdev.ko |    APIs
>      | |           | |                         |              |
>      | +-----------+ |                         +--------------+
>      |               |
>      |  MDEV CORE    |
>      |   MODULE      |
>      |   mdev.ko     |
>      | +-----------+ |  mdev_register_device() +--------------+
>      | |           | +<------------------------+              |
>      | |           | |                         |  nvidia.ko   |<-> physical
>      | |           | +------------------------>+              |    device
>      | |           | |        callback         +--------------+
>      | | Physical  | |
>      | |  device   | |  mdev_register_device() +--------------+
>      | | interface | |<------------------------+              |
>      | |           | |                         |  i915.ko     |<-> physical
>      | |           | +------------------------>+              |    device
>      | |           | |        callback         +--------------+
>      | |           | |
>      | |           | |  mdev_register_device() +--------------+
>      | |           | +<------------------------+              |
>      | |           | |                         | ccw_device.ko|<-> physical
>      | |           | +------------------------>+              |    device
>      | |           | |        callback         +--------------+
>      | +-----------+ |
>      +---------------+
> 
> Core driver provides two types of registration interfaces:
> 1. Registration interface for mediated bus driver:
> 
> /**
>   * struct mdev_driver - Mediated device's driver
>   * @name: driver name
>   * @probe: called when new device created
>   * @remove: called when device removed
>   * @driver: device driver structure
>   *
>   **/
> struct mdev_driver {
>      const char *name;
>      int  (*probe)  (struct device *dev);
>      void (*remove) (struct device *dev);
>      struct device_driver driver;
> };
> 
> Mediated bus driver for mdev device should use this interface to register
> and unregister with core driver respectively:
> 
> int  mdev_register_driver(struct mdev_driver *drv, struct module *owner);
> void mdev_unregister_driver(struct mdev_driver *drv);
> 
> Medisted bus driver is responsible to add/delete mediated devices to/from

Medisted -> Mediated

> VFIO group when devices are bound and unbound to the driver.
> 
> 2. Physical device driver interface
> This interface provides vendor driver the set APIs to manage physical
> device related work in its driver. APIs are :
> 
> * dev_attr_groups: attributes of the parent device.
> * mdev_attr_groups: attributes of the mediated device.
> * supported_type_groups: attributes to define supported type. This is
>   mandatory field.
> * create: to allocate basic resources in driver for a mediated device.

In which driver? It would be clearer to remove 'in driver' here.

> * remove: to free resources in driver when mediated device is destroyed.
> * open: open callback of mediated device
> * release: release callback of mediated device
> * read : read emulation callback.
> * write: write emulation callback.
> * mmap: mmap emulation callback.
> * ioctl: ioctl callback.

You only highlight 'mandatory field' for supported_type_groups. What
about the other fields? Are all of them optional? Please clarify, and
also stay consistent with the later code comments.

> 
> Drivers should use these interfaces to register and unregister device to
> mdev core driver respectively:
> 
> extern int  mdev_register_device(struct device *dev,
>                                  const struct parent_ops *ops);
> extern void mdev_unregister_device(struct device *dev);
> 
> There are no locks to serialize above callbacks in mdev driver and
> vfio_mdev driver. If required, vendor driver can have locks to serialize
> above APIs in their driver.
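
As a rough illustration of the point about serialization, a vendor driver
could guard its own callbacks with a private lock. The sketch below is a
userspace model: a pthread mutex stands in for a kernel mutex, and
struct demo_vendor_state with its fields and error codes is a hypothetical
example, not part of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical vendor-driver state; the lock belongs to the vendor
 * driver, since the mdev core does not serialize the callbacks. */
struct demo_vendor_state {
	pthread_mutex_t lock;
	int nr_mdevs;	/* mediated devices currently created */
	int max_mdevs;	/* capacity of the physical device */
};

/* 'create' callback body: reserve one instance under the lock. */
static int demo_vendor_create(struct demo_vendor_state *s)
{
	int ret = 0;

	assert(s != NULL);
	pthread_mutex_lock(&s->lock);
	if (s->nr_mdevs >= s->max_mdevs)
		ret = -ENOSPC;
	else
		s->nr_mdevs++;
	pthread_mutex_unlock(&s->lock);
	return ret;
}

/* 'remove' callback body: release one instance under the same lock. */
static int demo_vendor_remove(struct demo_vendor_state *s)
{
	int ret = 0;

	assert(s != NULL);
	pthread_mutex_lock(&s->lock);
	if (s->nr_mdevs == 0)
		ret = -ENODEV;
	else
		s->nr_mdevs--;
	pthread_mutex_unlock(&s->lock);
	return ret;
}
```

Because both callbacks take the same lock, concurrent create and remove
requests cannot corrupt the instance count.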
> 
> Signed-off-by: Kirti Wankhede 
> Signed-off-by: Neo Jia 
> Change-Id: I73a5084574270b14541c529461ea2f03c292d510
> ---
>  drivers/vfio/Kconfig |   1 +
>  drivers/vfio/Makefile|   1 +
>  drivers/vfio/mdev/Kconfig|  11 ++
>  

Re: [PATCH v9 01/12] vfio: Mediated device Core driver

2016-10-20 Thread Jike Song
On 10/21/2016 01:12 AM, Alex Williamson wrote:
> On Thu, 20 Oct 2016 15:23:53 +0800
> Jike Song  wrote:
> 
>> On 10/18/2016 05:22 AM, Kirti Wankhede wrote:
>>> diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
>>> new file mode 100644
>>> index ..7db5ec164aeb
>>> --- /dev/null
>>> +++ b/drivers/vfio/mdev/mdev_core.c
>>> @@ -0,0 +1,372 @@
>>> +/*
>>> + * Mediated device Core Driver
>>> + *
>>> + * Copyright (c) 2016, NVIDIA CORPORATION. All rights reserved.
>>> + * Author: Neo Jia 
>>> + *         Kirti Wankhede 
>>> + *
>>> + * This program is free software; you can redistribute it and/or modify
>>> + * it under the terms of the GNU General Public License version 2 as
>>> + * published by the Free Software Foundation.
>>> + */
>>> +
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +
>>> +#include "mdev_private.h"
>>> +
>>> +#define DRIVER_VERSION "0.1"
>>> +#define DRIVER_AUTHOR  "NVIDIA Corporation"
>>> +#define DRIVER_DESC    "Mediated device Core Driver"
>>> +
>>> +static LIST_HEAD(parent_list);
>>> +static DEFINE_MUTEX(parent_list_lock);
>>> +static struct class_compat *mdev_bus_compat_class;
>>> +  
>>
>>> +
>>> +/*
>>> + * mdev_register_device : Register a device
>>> + * @dev: device structure representing parent device.
>>> + * @ops: Parent device operation structure to be registered.
>>> + *
>>> + * Add device to list of registered parent devices.
>>> + * Returns a negative value on error, otherwise 0.
>>> + */
>>> +int mdev_register_device(struct device *dev, const struct parent_ops *ops)
>>> +{
>>> +	int ret = 0;
>>> +	struct parent_device *parent;
>>> +
>>> +	/* check for mandatory ops */
>>> +	if (!ops || !ops->create || !ops->remove || !ops->supported_type_groups)
>>> +		return -EINVAL;
>>> +
>>> +	dev = get_device(dev);
>>> +	if (!dev)
>>> +		return -EINVAL;
>>> +
>>> +	mutex_lock(&parent_list_lock);
>>> +
>>> +	/* Check for duplicate */
>>> +	parent = __find_parent_device(dev);
>>> +	if (parent) {
>>> +		ret = -EEXIST;
>>> +		goto add_dev_err;
>>> +	}
>>> +
>>> +	parent = kzalloc(sizeof(*parent), GFP_KERNEL);
>>> +	if (!parent) {
>>> +		ret = -ENOMEM;
>>> +		goto add_dev_err;
>>> +	}
>>> +
>>> +	kref_init(&parent->ref);
>>> +
>>> +	parent->dev = dev;
>>> +	parent->ops = ops;
>>> +
>>> +	ret = parent_create_sysfs_files(parent);
>>> +	if (ret) {
>>> +		mutex_unlock(&parent_list_lock);
>>> +		mdev_put_parent(parent);
>>> +		return ret;
>>> +	}
>>> +
>>> +	ret = class_compat_create_link(mdev_bus_compat_class, dev, NULL);
>>> +	if (ret)
>>> +		dev_warn(dev, "Failed to create compatibility class link\n");
>>> +
>>> +	list_add(&parent->next, &parent_list);
>>> +	mutex_unlock(&parent_list_lock);
>>> +
>>> +	dev_info(dev, "MDEV: Registered\n");
>>> +	return 0;
>>> +
>>> +add_dev_err:
>>> +	mutex_unlock(&parent_list_lock);
>>> +	put_device(dev);
>>> +	return ret;
>>> +}
>>> +EXPORT_SYMBOL(mdev_register_device);
>>
>>> +static int __init mdev_init(void)
>>> +{
>>> +	int ret;
>>> +
>>> +	ret = mdev_bus_register();
>>> +	if (ret) {
>>> +		pr_err("Failed to register mdev bus\n");
>>> +		return ret;
>>> +	}
>>> +
>>> +	mdev_bus_compat_class = class_compat_register("mdev_bus");
>>> +	if (!mdev_bus_compat_class) {
>>> +		mdev_bus_unregister();
>>> +		return -ENOMEM;
>>> +	}
>>> +
>>> +	/*
>>> +	 * Attempt to load known vfio_mdev.  This gives us a working environment
>>> +	 * without the user needing to explicitly load vfio_mdev driver.
>>> +	 */
>>> +	request_module_nowait("vfio_mdev");
>>> +
>>> +	return ret;
>>> +}
>>> +
>>> +static void __exit mdev_exit(void)
>>> +{
>>> +	class_compat_unregister(mdev_bus_compat_class);
>>> +	mdev_bus_unregister();
>>> +}
>>> +
>>> +module_init(mdev_init)
>>> +module_exit(mdev_exit)
>>
>> Hi Kirti,
>>
>> There is a possible issue: mdev_bus_register is called from mdev_init,
>> a module_init, which is equivalent to device_initcall if built into
>> vmlinux; however, the vendor driver, say i915.ko in the Intel case, has
>> to call mdev_register_device from its module_init: at that time,
>> mdev_init may not have been called yet.
>>
>> Not sure if this issue exists with nvidia.ko. Though in most cases we
>> expect users to select mdev as a standalone module, we still shouldn't
>> break the built-in case.
>>
>>
>> Hi Alex, do you have any suggestion here?
> 
> To fully solve the problem of built-in drivers making use of the mdev
> infrastructure we'd need to make mdev itself builtin and possibly a
> subsystem that is initialized prior to device drivers.  Is that really
> necessary?  Even though i915.ko is often loaded as part of an
> initramfs, most systems still build it as a module.  I would expect
> that standard module dependencies will pull in the necessary 

Re: [PATCH v9 01/12] vfio: Mediated device Core driver

2016-10-20 Thread Alex Williamson
On Thu, 20 Oct 2016 15:23:53 +0800
Jike Song  wrote:

> On 10/18/2016 05:22 AM, Kirti Wankhede wrote:
> > diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
> > new file mode 100644
> > index ..7db5ec164aeb
> > --- /dev/null
> > +++ b/drivers/vfio/mdev/mdev_core.c
> > @@ -0,0 +1,372 @@
> > +/*
> > + * Mediated device Core Driver
> > + *
> > + * Copyright (c) 2016, NVIDIA CORPORATION. All rights reserved.
> > + * Author: Neo Jia 
> > + *Kirti Wankhede 
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2 as
> > + * published by the Free Software Foundation.
> > + */
> > +
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +
> > +#include "mdev_private.h"
> > +
> > +#define DRIVER_VERSION "0.1"
> > +#define DRIVER_AUTHOR  "NVIDIA Corporation"
> > +#define DRIVER_DESC"Mediated device Core Driver"
> > +
> > +static LIST_HEAD(parent_list);
> > +static DEFINE_MUTEX(parent_list_lock);
> > +static struct class_compat *mdev_bus_compat_class;
> > +  
> 
> > +
> > +/*
> > + * mdev_register_device : Register a device
> > + * @dev: device structure representing parent device.
> > + * @ops: Parent device operation structure to be registered.
> > + *
> > + * Add device to list of registered parent devices.
> > + * Returns a negative value on error, otherwise 0.
> > + */
> > +int mdev_register_device(struct device *dev, const struct parent_ops *ops)
> > +{
> > +   int ret = 0;
> > +   struct parent_device *parent;
> > +
> > +   /* check for mandatory ops */
> > +   if (!ops || !ops->create || !ops->remove || !ops->supported_type_groups)
> > +   return -EINVAL;
> > +
> > +   dev = get_device(dev);
> > +   if (!dev)
> > +   return -EINVAL;
> > +
> > +   mutex_lock(&parent_list_lock);
> > +
> > +   /* Check for duplicate */
> > +   parent = __find_parent_device(dev);
> > +   if (parent) {
> > +   ret = -EEXIST;
> > +   goto add_dev_err;
> > +   }
> > +
> > +   parent = kzalloc(sizeof(*parent), GFP_KERNEL);
> > +   if (!parent) {
> > +   ret = -ENOMEM;
> > +   goto add_dev_err;
> > +   }
> > +
> > +   kref_init(&parent->ref);
> > +
> > +   parent->dev = dev;
> > +   parent->ops = ops;
> > +
> > +   ret = parent_create_sysfs_files(parent);
> > +   if (ret) {
> > +   mutex_unlock(&parent_list_lock);
> > +   mdev_put_parent(parent);
> > +   return ret;
> > +   }
> > +
> > +   ret = class_compat_create_link(mdev_bus_compat_class, dev, NULL);
> > +   if (ret)
> > +   dev_warn(dev, "Failed to create compatibility class link\n");
> > +
> > +   list_add(&parent->next, &parent_list);
> > +   mutex_unlock(&parent_list_lock);
> > +
> > +   dev_info(dev, "MDEV: Registered\n");
> > +   return 0;
> > +
> > +add_dev_err:
> > +   mutex_unlock(&parent_list_lock);
> > +   put_device(dev);
> > +   return ret;
> > +}
> > +EXPORT_SYMBOL(mdev_register_device);  
> 
> > +static int __init mdev_init(void)
> > +{
> > +   int ret;
> > +
> > +   ret = mdev_bus_register();
> > +   if (ret) {
> > +   pr_err("Failed to register mdev bus\n");
> > +   return ret;
> > +   }
> > +
> > +   mdev_bus_compat_class = class_compat_register("mdev_bus");
> > +   if (!mdev_bus_compat_class) {
> > +   mdev_bus_unregister();
> > +   return -ENOMEM;
> > +   }
> > +
> > +   /*
> > +* Attempt to load known vfio_mdev.  This gives us a working environment
> > +* without the user needing to explicitly load vfio_mdev driver.
> > +*/
> > +   request_module_nowait("vfio_mdev");
> > +
> > +   return ret;
> > +}
> > +
> > +static void __exit mdev_exit(void)
> > +{
> > +   class_compat_unregister(mdev_bus_compat_class);
> > +   mdev_bus_unregister();
> > +}
> > +
> > +module_init(mdev_init)
> > +module_exit(mdev_exit)  
> 
> Hi Kirti,
> 
> There is a possible issue: mdev_bus_register is called from mdev_init,
> a module_init, equal to device_initcall if builtin to vmlinux; however,
> the vendor driver, say i915.ko for intel case, have to call
> mdev_register_device from its module_init: at that time, mdev_init
> is still not called.
> 
> Not sure if this issue exists with nvidia.ko. Though in most cases we
> are expecting users select mdev as a standalone module, we still won't
> break builtin case.
> 
> 
> Hi Alex, do you have any suggestion here?

To fully solve the problem of built-in drivers making use of the mdev
infrastructure we'd need to make mdev itself builtin and possibly a
subsystem that is initialized prior to device drivers.  Is that really
necessary?  Even though i915.ko is often loaded as part of an
initramfs, most systems still build it as a module.  I would expect
that standard module dependencies will pull in the necessary mdev and
vfio modules to make this work correctly.  I can't say that I'm
prepared to make mdev be a subsystem as would be necessary for 

Re: [PATCH v9 01/12] vfio: Mediated device Core driver

2016-10-20 Thread Jike Song
On 10/18/2016 05:22 AM, Kirti Wankhede wrote:
> diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
> new file mode 100644
> index ..7db5ec164aeb
> --- /dev/null
> +++ b/drivers/vfio/mdev/mdev_core.c
> @@ -0,0 +1,372 @@
> +/*
> + * Mediated device Core Driver
> + *
> + * Copyright (c) 2016, NVIDIA CORPORATION. All rights reserved.
> + * Author: Neo Jia 
> + *  Kirti Wankhede 
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "mdev_private.h"
> +
> +#define DRIVER_VERSION   "0.1"
> +#define DRIVER_AUTHOR"NVIDIA Corporation"
> +#define DRIVER_DESC  "Mediated device Core Driver"
> +
> +static LIST_HEAD(parent_list);
> +static DEFINE_MUTEX(parent_list_lock);
> +static struct class_compat *mdev_bus_compat_class;
> +

> +
> +/*
> + * mdev_register_device : Register a device
> + * @dev: device structure representing parent device.
> + * @ops: Parent device operation structure to be registered.
> + *
> + * Add device to list of registered parent devices.
> + * Returns a negative value on error, otherwise 0.
> + */
> +int mdev_register_device(struct device *dev, const struct parent_ops *ops)
> +{
> + int ret = 0;
> + struct parent_device *parent;
> +
> + /* check for mandatory ops */
> + if (!ops || !ops->create || !ops->remove || !ops->supported_type_groups)
> + return -EINVAL;
> +
> + dev = get_device(dev);
> + if (!dev)
> + return -EINVAL;
> +
> + mutex_lock(&parent_list_lock);
> +
> + /* Check for duplicate */
> + parent = __find_parent_device(dev);
> + if (parent) {
> + ret = -EEXIST;
> + goto add_dev_err;
> + }
> +
> + parent = kzalloc(sizeof(*parent), GFP_KERNEL);
> + if (!parent) {
> + ret = -ENOMEM;
> + goto add_dev_err;
> + }
> +
> + kref_init(&parent->ref);
> +
> + parent->dev = dev;
> + parent->ops = ops;
> +
> + ret = parent_create_sysfs_files(parent);
> + if (ret) {
> + mutex_unlock(&parent_list_lock);
> + mdev_put_parent(parent);
> + return ret;
> + }
> +
> + ret = class_compat_create_link(mdev_bus_compat_class, dev, NULL);
> + if (ret)
> + dev_warn(dev, "Failed to create compatibility class link\n");
> +
> + list_add(&parent->next, &parent_list);
> + mutex_unlock(&parent_list_lock);
> +
> + dev_info(dev, "MDEV: Registered\n");
> + return 0;
> +
> +add_dev_err:
> + mutex_unlock(&parent_list_lock);
> + put_device(dev);
> + return ret;
> +}
> +EXPORT_SYMBOL(mdev_register_device);

> +static int __init mdev_init(void)
> +{
> + int ret;
> +
> + ret = mdev_bus_register();
> + if (ret) {
> + pr_err("Failed to register mdev bus\n");
> + return ret;
> + }
> +
> + mdev_bus_compat_class = class_compat_register("mdev_bus");
> + if (!mdev_bus_compat_class) {
> + mdev_bus_unregister();
> + return -ENOMEM;
> + }
> +
> + /*
> +  * Attempt to load known vfio_mdev.  This gives us a working environment
> +  * without the user needing to explicitly load vfio_mdev driver.
> +  */
> + request_module_nowait("vfio_mdev");
> +
> + return ret;
> +}
> +
> +static void __exit mdev_exit(void)
> +{
> + class_compat_unregister(mdev_bus_compat_class);
> + mdev_bus_unregister();
> +}
> +
> +module_init(mdev_init)
> +module_exit(mdev_exit)

Hi Kirti,

There is a possible issue: mdev_bus_register() is called from mdev_init(),
a module_init, which is equivalent to a device_initcall when built into
vmlinux. However, the vendor driver, say i915.ko in the Intel case, has to
call mdev_register_device() from its own module_init, and at that time
mdev_init() may not have been called yet.

I am not sure whether this issue exists with nvidia.ko. Although in most
cases we expect users to build mdev as a standalone module, we still
don't want to break the built-in case.
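
To illustrate the ordering concern, here is a small user-space model, not
kernel code. The level numbers follow the kernel's initcall numbering
(4 for subsys_initcall, 6 for device_initcall, which is what module_init
becomes for built-in code); every other name (model_mdev_init,
model_vendor_init, run_initcalls, vendor_registers_with) is hypothetical
and only stands in for the real functions. It assumes the vendor driver's
initcall happens to be linked ahead of the mdev one.

```c
#include <stddef.h>

/* User-space model only: 4 = subsys_initcall, 6 = device_initcall. */
enum { LEVEL_SUBSYS = 4, LEVEL_DEVICE = 6 };

struct initcall {
	int level;
	int (*fn)(void);
};

static int mdev_bus_ready;     /* set once the modelled mdev_init() ran */
static int vendor_registered;  /* set if the vendor init found the bus */

/* Stand-in for mdev_init(): makes the bus available. */
static int model_mdev_init(void)
{
	mdev_bus_ready = 1;
	return 0;
}

/* Stand-in for a vendor module_init that registers with mdev. */
static int model_vendor_init(void)
{
	if (!mdev_bus_ready)
		return -1;
	vendor_registered = 1;
	return 0;
}

/* Run calls level by level; within a level, keep table (link) order. */
static void run_initcalls(const struct initcall *calls, size_t n)
{
	for (int level = 0; level <= 7; level++)
		for (size_t i = 0; i < n; i++)
			if (calls[i].level == level)
				calls[i].fn();
}

/* Returns 1 if the vendor driver managed to register, 0 otherwise. */
static int vendor_registers_with(int mdev_level)
{
	const struct initcall calls[] = {
		{ LEVEL_DEVICE, model_vendor_init },  /* assumed linked first */
		{ mdev_level,   model_mdev_init },
	};

	mdev_bus_ready = 0;
	vendor_registered = 0;
	run_initcalls(calls, sizeof(calls) / sizeof(calls[0]));
	return vendor_registered;
}
```

With both calls at LEVEL_DEVICE the vendor init runs before the bus exists
and registration fails; moving the mdev call to LEVEL_SUBSYS makes it run
first regardless of link order.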


Hi Alex, do you have any suggestion here?


--
Thanks,
Jike


Re: [PATCH v9 01/12] vfio: Mediated device Core driver

2016-10-19 Thread Alex Williamson
On Thu, 20 Oct 2016 00:46:48 +0530
Kirti Wankhede  wrote:

> On 10/19/2016 4:46 AM, Alex Williamson wrote:
> > On Tue, 18 Oct 2016 02:52:01 +0530
> > Kirti Wankhede  wrote:
> >   
> ...
> >> +static struct mdev_device *__find_mdev_device(struct parent_device 
> >> *parent,
> >> +uuid_le uuid)
> >> +{
> >> +  struct device *dev;
> >> +
> >> +  dev = device_find_child(parent->dev, &uuid, _find_mdev_device);
> >> +  if (!dev)
> >> +  return NULL;
> >> +
> >> +  put_device(dev);
> >> +
> >> +  return to_mdev_device(dev);
> >> +}  
> > 
> > This function is only used by mdev_device_create() for the purpose of
> > checking whether a given uuid for a parent already exists, so the
> > returned device is not actually used.  However, at the point where
> > we're using to_mdev_device() here, we don't actually hold a reference to
> > the device, so that function call and any possible use of the returned
> > pointer by the callee is invalid.  I would either turn this into a
> > "get" function where the callee has a device reference and needs to do
> > a "put" on it or change this to a "exists" test where true/false is
> > returned and the function cannot be later mis-used to do a device
> > lookup where the reference isn't actually valid.
> >   
> 
> I'll change it to return 0 if not found and -EEXIST if found.
> 
> 
> >> +int mdev_device_create(struct kobject *kobj, struct device *dev, uuid_le 
> >> uuid)
> >> +{
> >> +  int ret;
> >> +  struct mdev_device *mdev;
> >> +  struct parent_device *parent;
> >> +  struct mdev_type *type = to_mdev_type(kobj);
> >> +
> >> +  parent = mdev_get_parent(type->parent);
> >> +  if (!parent)
> >> +  return -EINVAL;
> >> +
> >> +  /* Check for duplicate */
> >> +  mdev = __find_mdev_device(parent, uuid);
> >> +  if (mdev) {
> >> +  ret = -EEXIST;
> >> +  goto create_err;
> >> +  }  
> > 
> > We check here whether the {parent,uuid} already exists, but what
> > prevents us racing with another create call with the same uuid?  ie.
> > neither exists at this point.  Will device_register() fail if the
> > device name already exists?  If so, should we just rely on the error
> > there and skip this duplicate check?  If not, we need a mutex to avoid
> > the race.
> >  
> 
> Yes, device_register() fails if device exists already with below
> warning. Is it ok to dump such warning? I think, this should be fine,
> right? then we can remove duplicate check.
> 
> If we want to avoid such warning, we should have duplication check.

We should avoid such warnings, bugs will get filed otherwise.  Thanks
for checking.  Thanks,

Alex
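
One way to keep the duplicate check without the race is to do the lookup
and the insertion under a single lock. The following is only a user-space
sketch under that assumption: the fixed-size table, model_list_lock and
model_mdev_create() are hypothetical names, and a pthread mutex stands in
for a kernel mutex. It is not the code from the patch.

```c
#include <errno.h>
#include <pthread.h>
#include <string.h>

#define MODEL_MAX_MDEVS 8
#define MODEL_UUID_LEN  37  /* 36 characters plus the terminating NUL */

static pthread_mutex_t model_list_lock = PTHREAD_MUTEX_INITIALIZER;
static char model_uuids[MODEL_MAX_MDEVS][MODEL_UUID_LEN];
static int model_count;

/*
 * Check for a duplicate and add the new entry while holding the lock,
 * so two concurrent creators of the same uuid cannot both pass the check.
 */
static int model_mdev_create(const char *uuid)
{
	int ret = 0;

	if (strlen(uuid) >= MODEL_UUID_LEN)
		return -EINVAL;

	pthread_mutex_lock(&model_list_lock);

	for (int i = 0; i < model_count; i++) {
		if (strcmp(model_uuids[i], uuid) == 0) {
			ret = -EEXIST;
			goto out;
		}
	}

	if (model_count == MODEL_MAX_MDEVS) {
		ret = -ENOSPC;
		goto out;
	}

	strcpy(model_uuids[model_count++], uuid);
out:
	pthread_mutex_unlock(&model_list_lock);
	return ret;
}
```

Because the second creator gets -EEXIST from the check itself, the
duplicate never reaches the registration step that would print the sysfs
warning.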
 
> [  610.847958] [ cut here ]
> [  610.855377] WARNING: CPU: 15 PID: 19839 at fs/sysfs/dir.c:31
> sysfs_warn_dup+0x64/0x80
> [  610.865798] sysfs: cannot create duplicate filename
> '/devices/pci:80/:80:02.0/:83:00.0/:84:08.0/:85:00.0/83b8f4f2-509f-382f-3c1e-e6bfe0fa1234'
> [  610.885101] Modules linked in:[  610.888039]  nvidia(POE)
> vfio_iommu_type1 vfio_mdev mdev vfio nfsv4 dns_resolver nfs fscache
> sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp
> kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel
> ghash_clmulni_intel aesni_intel glue_helper lrw gf128mul ablk_helper
> cryptd nfsd auth_rpcgss nfs_acl lockd mei_me grace iTCO_wdt
> iTCO_vendor_support mei ipmi_si pcspkr ioatdma i2c_i801 lpc_ich shpchp
> i2c_smbus mfd_core ipmi_msghandler acpi_pad uinput sunrpc xfs libcrc32c
> sd_mod mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt
> fb_sys_fops ttm drm igb ahci libahci ptp libata pps_core dca
> i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod [last
> unloaded: mdev]
> [  610.963835] CPU: 15 PID: 19839 Comm: bash Tainted: P   OE
> 4.8.0-next-20161013+ #0
> [  610.973779] Hardware name: Supermicro
> SYS-2027GR-5204A-NC024/X9DRG-HF, BIOS 1.0c 02/28/2013
> [  610.983769]  c90009323ae0 813568bf c90009323b30
> 
> [  610.992867]  c90009323b20 81085511 001f1000
> 8808839ef000
> [  611.001954]  88108b30f900 88109ae368e8 88109ae580b0
> 881099cc0818
> [  611.011055] Call Trace:
> [  611.015087]  [] dump_stack+0x63/0x84
> [  611.021784]  [] __warn+0xd1/0xf0
> [  611.028115]  [] warn_slowpath_fmt+0x5f/0x80
> [  611.035379]  [] ? kernfs_path_from_node+0x50/0x60
> [  611.043148]  [] sysfs_warn_dup+0x64/0x80
> [  611.050109]  [] sysfs_create_dir_ns+0x7e/0x90
> [  611.057481]  [] kobject_add_internal+0xc1/0x340
> [  611.065018]  [] kobject_add+0x75/0xd0
> [  611.071635]  [] device_add+0x119/0x610
> [  611.078314]  [] device_register+0x1a/0x20
> [  611.085261]  [] mdev_device_create+0xdd/0x200 [mdev]
> [  611.093143]  [] create_store+0xa8/0xe0 [mdev]
> [  611.100385]  [] mdev_type_attr_store+0x1b/0x30 [mdev]
> [  611.108309]  [] sysfs_kf_write+0x3a/0x50
> [  611.115096]  [] kernfs_fop_write+0x10b/0x190
> [  611.122231]  [] __vfs_write+0x37/0x140

Re: [PATCH v9 01/12] vfio: Mediated device Core driver

2016-10-19 Thread Kirti Wankhede


On 10/19/2016 4:46 AM, Alex Williamson wrote:
> On Tue, 18 Oct 2016 02:52:01 +0530
> Kirti Wankhede  wrote:
> 
...
>> +static struct mdev_device *__find_mdev_device(struct parent_device *parent,
>> +					      uuid_le uuid)
>> +{
>> +	struct device *dev;
>> +
>> +	dev = device_find_child(parent->dev, &uuid, _find_mdev_device);
>> +	if (!dev)
>> +		return NULL;
>> +
>> +	put_device(dev);
>> +
>> +	return to_mdev_device(dev);
>> +}
> 
> This function is only used by mdev_device_create() for the purpose of
> checking whether a given uuid for a parent already exists, so the
> returned device is not actually used.  However, at the point where
> we're using to_mdev_device() here, we don't actually hold a reference to
> the device, so that function call and any possible use of the returned
> pointer by the caller is invalid.  I would either turn this into a
> "get" function where the caller has a device reference and needs to do
> a "put" on it or change this to an "exists" test where true/false is
> returned and the function cannot be later mis-used to do a device
> lookup where the reference isn't actually valid.
> 

I'll change it to return 0 if not found and -EEXIST if found.
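
As a rough user-space illustration of the "exists" style of check discussed above (not the actual kernel change): the helper below returns only a status code, so no pointer to a child escapes and a caller cannot touch a device it holds no reference on. The toy_* types and names are hypothetical stand-ins, not the kernel's struct device or uuid_le.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a 16-byte uuid. */
struct toy_uuid {
	unsigned char b[16];
};

/* Hypothetical stand-in for a child (mediated) device. */
struct toy_child {
	struct toy_uuid uuid;
};

/* Hypothetical stand-in for a parent device and its children. */
struct toy_parent {
	struct toy_child *children;
	size_t nr_children;
};

/*
 * Return -EEXIST if a child with this uuid is already present under the
 * parent, 0 otherwise.  Only a status code is returned, never a pointer.
 */
int toy_mdev_exists(const struct toy_parent *parent,
		    const struct toy_uuid *uuid)
{
	size_t i;

	for (i = 0; i < parent->nr_children; i++) {
		if (!memcmp(&parent->children[i].uuid, uuid, sizeof(*uuid)))
			return -EEXIST;
	}

	return 0;
}
```

A caller would treat a non-zero return as "duplicate, bail out" and would have no device pointer to misuse afterwards.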


>> +int mdev_device_create(struct kobject *kobj, struct device *dev, uuid_le uuid)
>> +{
>> +	int ret;
>> +	struct mdev_device *mdev;
>> +	struct parent_device *parent;
>> +	struct mdev_type *type = to_mdev_type(kobj);
>> +
>> +	parent = mdev_get_parent(type->parent);
>> +	if (!parent)
>> +		return -EINVAL;
>> +
>> +	/* Check for duplicate */
>> +	mdev = __find_mdev_device(parent, uuid);
>> +	if (mdev) {
>> +		ret = -EEXIST;
>> +		goto create_err;
>> +	}
> 
> We check here whether the {parent,uuid} already exists, but what
> prevents us racing with another create call with the same uuid?  ie.
> neither exists at this point.  Will device_register() fail if the
> device name already exists?  If so, should we just rely on the error
> there and skip this duplicate check?  If not, we need a mutex to avoid
> the race.
>

Yes, device_register() fails with the warning below if the device
already exists. Is it OK to dump such a warning? I think this should be
fine, right? Then we can remove the duplicate check.

If we want to avoid such a warning, we should keep the duplicate check.

[  610.847958] [ cut here ]
[  610.855377] WARNING: CPU: 15 PID: 19839 at fs/sysfs/dir.c:31
sysfs_warn_dup+0x64/0x80
[  610.865798] sysfs: cannot create duplicate filename
'/devices/pci:80/:80:02.0/:83:00.0/:84:08.0/:85:00.0/83b8f4f2-509f-382f-3c1e-e6bfe0fa1234'
[  610.885101] Modules linked in:[  610.888039]  nvidia(POE)
vfio_iommu_type1 vfio_mdev mdev vfio nfsv4 dns_resolver nfs fscache
sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp
kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel
ghash_clmulni_intel aesni_intel glue_helper lrw gf128mul ablk_helper
cryptd nfsd auth_rpcgss nfs_acl lockd mei_me grace iTCO_wdt
iTCO_vendor_support mei ipmi_si pcspkr ioatdma i2c_i801 lpc_ich shpchp
i2c_smbus mfd_core ipmi_msghandler acpi_pad uinput sunrpc xfs libcrc32c
sd_mod mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt
fb_sys_fops ttm drm igb ahci libahci ptp libata pps_core dca
i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod [last
unloaded: mdev]
[  610.963835] CPU: 15 PID: 19839 Comm: bash Tainted: P   OE
4.8.0-next-20161013+ #0
[  610.973779] Hardware name: Supermicro
SYS-2027GR-5204A-NC024/X9DRG-HF, BIOS 1.0c 02/28/2013
[  610.983769]  c90009323ae0 813568bf c90009323b30

[  610.992867]  c90009323b20 81085511 001f1000
8808839ef000
[  611.001954]  88108b30f900 88109ae368e8 88109ae580b0
881099cc0818
[  611.011055] Call Trace:
[  611.015087]  [] dump_stack+0x63/0x84
[  611.021784]  [] __warn+0xd1/0xf0
[  611.028115]  [] warn_slowpath_fmt+0x5f/0x80
[  611.035379]  [] ? kernfs_path_from_node+0x50/0x60
[  611.043148]  [] sysfs_warn_dup+0x64/0x80
[  611.050109]  [] sysfs_create_dir_ns+0x7e/0x90
[  611.057481]  [] kobject_add_internal+0xc1/0x340
[  611.065018]  [] kobject_add+0x75/0xd0
[  611.071635]  [] device_add+0x119/0x610
[  611.078314]  [] device_register+0x1a/0x20
[  611.085261]  [] mdev_device_create+0xdd/0x200 [mdev]
[  611.093143]  [] create_store+0xa8/0xe0 [mdev]
[  611.100385]  [] mdev_type_attr_store+0x1b/0x30 [mdev]
[  611.108309]  [] sysfs_kf_write+0x3a/0x50
[  611.115096]  [] kernfs_fop_write+0x10b/0x190
[  611.122231]  [] __vfs_write+0x37/0x140
[  611.128817]  [] ? handle_mm_fault+0x724/0xd80
[  611.135976]  [] vfs_write+0xb2/0x1b0
[  611.142354]  [] ? syscall_trace_enter+0x1d0/0x2b0
[  611.149836]  [] SyS_write+0x55/0xc0
[  611.156065]  [] do_syscall_64+0x67/0x180
[  611.162734]  [] entry_SYSCALL64_slow_path+0x25/0x25
[  611.170345] ---[ end trace b05a73599da2ba3f ]---
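
For reference, one way to keep the duplicate check and still close the race is to do the check and the insertion under the same lock. The following is only a user-space sketch of that idea using a pthread mutex; the toy_* names, the fixed-size table and the string uuids are assumptions made for illustration and are not part of the patch.

```c
#include <errno.h>
#include <pthread.h>
#include <string.h>

#define TOY_MAX_DEVS 8
#define TOY_UUID_LEN 37	/* 36 characters plus the terminating NUL */

/* Hypothetical device table, protected by toy_lock. */
static pthread_mutex_t toy_lock = PTHREAD_MUTEX_INITIALIZER;
static char toy_devs[TOY_MAX_DEVS][TOY_UUID_LEN];
static int toy_nr_devs;

/*
 * Check for a duplicate and add the new entry while holding one lock, so
 * two concurrent creates with the same uuid cannot both pass the check.
 */
int toy_mdev_create(const char *uuid)
{
	int i, ret = 0;

	if (strlen(uuid) >= TOY_UUID_LEN)
		return -EINVAL;

	pthread_mutex_lock(&toy_lock);

	for (i = 0; i < toy_nr_devs; i++) {
		if (!strcmp(toy_devs[i], uuid)) {
			ret = -EEXIST;
			goto out;
		}
	}

	if (toy_nr_devs == TOY_MAX_DEVS) {
		ret = -ENOSPC;
		goto out;
	}

	strcpy(toy_devs[toy_nr_devs++], uuid);
out:
	pthread_mutex_unlock(&toy_lock);
	return ret;
}
```

With this shape the second create of the same uuid fails quietly with -EEXIST before anything like device_register() would be reached, so no duplicate-name warning is triggered.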

Re: [PATCH v9 01/12] vfio: Mediated device Core driver

2016-10-18 Thread Alex Williamson
On Tue, 18 Oct 2016 02:52:01 +0530
Kirti Wankhede  wrote:

> Design for Mediated Device Driver:
> Main purpose of this driver is to provide a common interface for mediated
> device management that can be used by different drivers of different
> devices.
> 
> This module provides a generic interface to create the device, add it to
> mediated bus, add device to IOMMU group and then add it to vfio group.
> 
> Below is the high Level block diagram, with Nvidia, Intel and IBM devices
> as example, since these are the devices which are going to actively use
> this module as of now.
> 
>      +---------------+
>      |               |
>      | +-----------+ |  mdev_register_driver() +--------------+
>      | |           | +<------------------------+ __init()     |
>      | |  mdev     | |                         |              |
>      | |  bus      | +------------------------>+              |<-> VFIO user
>      | |  driver   | |     probe()/remove()    | vfio_mdev.ko |    APIs
>      | |           | |                         |              |
>      | +-----------+ |                         +--------------+
>      |               |
>      |  MDEV CORE    |
>      |   MODULE      |
>      |   mdev.ko     |
>      | +-----------+ |  mdev_register_device() +--------------+
>      | |           | +<------------------------+              |
>      | |           | |                         |  nvidia.ko   |<-> physical
>      | |           | +------------------------>+              |    device
>      | |           | |        callback         +--------------+
>      | | Physical  | |
>      | |  device   | |  mdev_register_device() +--------------+
>      | | interface | |<------------------------+              |
>      | |           | |                         |  i915.ko     |<-> physical
>      | |           | +------------------------>+              |    device
>      | |           | |        callback         +--------------+
>      | |           | |
>      | |           | |  mdev_register_device() +--------------+
>      | |           | +<------------------------+              |
>      | |           | |                         | ccw_device.ko|<-> physical
>      | |           | +------------------------>+              |    device
>      | |           | |        callback         +--------------+
>      | +-----------+ |
>      +---------------+
> 
> Core driver provides two types of registration interfaces:
> 1. Registration interface for mediated bus driver:
> 
> /**
>  * struct mdev_driver - Mediated device's driver
>  * @name: driver name
>  * @probe: called when new device created
>  * @remove: called when device removed
>  * @driver: device driver structure
>  *
>  **/
> struct mdev_driver {
> 	const char *name;
> 	int  (*probe)  (struct device *dev);
> 	void (*remove) (struct device *dev);
> 	struct device_driver driver;
> };
> 
> Mediated bus driver for mdev device should use this interface to register
> and unregister with core driver respectively:
> 
> int  mdev_register_driver(struct mdev_driver *drv, struct module *owner);
> void mdev_unregister_driver(struct mdev_driver *drv);
> 
> Mediated bus driver is responsible for adding/deleting mediated devices
> to/from the VFIO group when devices are bound to and unbound from the driver.
> 
> 2. Physical device driver interface
> This interface provides the vendor driver with a set of APIs to manage
> physical device related work in its driver. The APIs are:
> 
> * dev_attr_groups: attributes of the parent device.
> * mdev_attr_groups: attributes of the mediated device.
> * supported_type_groups: attributes to define supported type. This is a
>   mandatory field.
> * create: to allocate basic resources in driver for a mediated device.
> * remove: to free resources in driver when mediated device is destroyed.
> * open: open callback of mediated device
> * release: release callback of mediated device
> * read : read emulation callback.
> * write: write emulation callback.
> * mmap: mmap emulation callback.
> * ioctl: ioctl callback.
> 
> Drivers should use these interfaces to register and unregister device to
> mdev core driver respectively:
> 
> extern int  mdev_register_device(struct device *dev,
>  const struct parent_ops *ops);
> extern void mdev_unregister_device(struct device *dev);
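
To make the "mandatory field" point concrete, here is a small user-space sketch of the validation that mdev_register_device() performs on the ops table, as quoted earlier in this thread (!ops, !ops->create, !ops->remove, !ops->supported_type_groups). The toy_parent_ops layout and callback signatures below are simplified assumptions, not the real struct parent_ops.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified, hypothetical subset of the callback table described above. */
struct toy_parent_ops {
	const void *supported_type_groups;	/* mandatory */
	int (*create)(const char *uuid);	/* mandatory */
	int (*remove)(const char *uuid);	/* mandatory */
	int (*open)(const char *uuid);		/* optional */
};

/* Reject an ops table that lacks any of the mandatory members. */
int toy_check_parent_ops(const struct toy_parent_ops *ops)
{
	if (!ops || !ops->create || !ops->remove || !ops->supported_type_groups)
		return -EINVAL;

	return 0;
}

/* Trivial example callbacks a vendor driver might supply. */
int toy_create(const char *uuid)
{
	(void)uuid;
	return 0;
}

int toy_remove(const char *uuid)
{
	(void)uuid;
	return 0;
}
```

An ops table that sets supported_type_groups, create and remove passes the check even if the optional callbacks are left NULL; leaving out any of the three mandatory members yields -EINVAL.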
> 
> There are no locks in the mdev driver or the vfio_mdev driver to
> serialize the above callbacks. If required, a vendor driver can use its
> own locks to serialize the above APIs in its driver.
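
Since the core does not serialize the callbacks, a vendor driver that needs serialization would take its own lock inside each callback. The sketch below shows that pattern in user-space C with a pthread mutex standing in for a kernel mutex; the toy_vendor_* names and the counter are purely illustrative assumptions.

```c
#include <pthread.h>

/* Hypothetical per-parent vendor state, guarded by its own lock. */
struct toy_vendor_state {
	pthread_mutex_t lock;
	int nr_mdevs;
};

/* "create" callback: bump the device count under the vendor lock. */
int toy_vendor_create(struct toy_vendor_state *s)
{
	int n;

	pthread_mutex_lock(&s->lock);
	n = ++s->nr_mdevs;
	pthread_mutex_unlock(&s->lock);

	return n;
}

/* "remove" callback: drop the device count under the same lock. */
int toy_vendor_remove(struct toy_vendor_state *s)
{
	int n;

	pthread_mutex_lock(&s->lock);
	n = s->nr_mdevs > 0 ? --s->nr_mdevs : 0;
	pthread_mutex_unlock(&s->lock);

	return n;
}
```

Both callbacks return the device count after the operation, so concurrent create and remove calls always observe a consistent value.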
> 
> Signed-off-by: Kirti Wankhede 
> Signed-off-by: Neo Jia 
> Change-Id: I73a5084574270b14541c529461ea2f03c292d510
> ---
>  drivers/vfio/Kconfig |   1 +
>  drivers/vfio/Makefile|   1 +
>  drivers/vfio/mdev/Kconfig|  11 ++
>  drivers/vfio/mdev/Makefile   |   4 +
>  drivers/vfio/mdev/mdev_core.c| 372 
> +++
>  drivers/vfio/mdev/mdev_driver.c  | 128 ++
>  drivers/vfio/mdev/mdev_private.h |  41 +
>  drivers/vfio/mdev/mdev_sysfs.c   | 296 +++
>