Re: [PATCH 05/13] KVM: Update IRTE according to guest interrupt configuration changes
On 12/11/2014 04:42, Zhang, Yang Z wrote: Personally, I think this feature will be helpful to the legacy device assignment. Agree, vfio is the right solution for future feature enabling. But the old kvm without the good vfio supporting is still used largely today. The user really looking for this feature but they will not upgrade their kernel. It's easy for us to backport this feature to old kvm with the legacy device assignment, but it is impossible to backport the whole vfio. You can certainly backport these patches to distros that do not have VFIO. But upstream we should work on VFIO first. VFIO has feature parity with legacy device assignment, and adding a new feature that is not in VFIO would be a bad idea. By the way, do you have benchmark results for it? We have not been able to see any performance improvement for APICv on e.g. netperf. Paolo ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH 05/13] KVM: Update IRTE according to guest interrupt configuration changes
-Original Message- From: Paolo Bonzini [mailto:pbonz...@redhat.com] Sent: Wednesday, November 12, 2014 5:14 PM To: Zhang, Yang Z; Wu, Feng; Alex Williamson Cc: g...@kernel.org; dw...@infradead.org; j...@8bytes.org; t...@linutronix.de; mi...@redhat.com; h...@zytor.com; x...@kernel.org; k...@vger.kernel.org; iommu@lists.linux-foundation.org; linux-ker...@vger.kernel.org Subject: Re: [PATCH 05/13] KVM: Update IRTE according to guest interrupt configuration changes On 12/11/2014 04:42, Zhang, Yang Z wrote: Personally, I think this feature will be helpful to the legacy device assignment. Agree, vfio is the right solution for future feature enabling. But the old kvm without the good vfio supporting is still used largely today. The user really looking for this feature but they will not upgrade their kernel. It's easy for us to backport this feature to old kvm with the legacy device assignment, but it is impossible to backport the whole vfio. You can certainly backport these patches to distros that do not have VFIO. But upstream we should work on VFIO first. VFIO has feature parity with legacy device assignment, and adding a new feature that is not in VFIO would be a bad idea. By the way, do you have benchmark results for it? We have not been able to see any performance improvement for APICv on e.g. netperf. Do you mean benchmark results for APICv itself or VT-d Posted-Interrtups? Thanks, Feng Paolo ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 05/13] KVM: Update IRTE according to guest interrupt configuration changes
On 12/11/2014 10:19, Wu, Feng wrote: You can certainly backport these patches to distros that do not have VFIO. But upstream we should work on VFIO first. VFIO has feature parity with legacy device assignment, and adding a new feature that is not in VFIO would be a bad idea. By the way, do you have benchmark results for it? We have not been able to see any performance improvement for APICv on e.g. netperf. Do you mean benchmark results for APICv itself or VT-d Posted-Interrtups? Especially for VT-d posted interrupts---but it'd be great to know which workloads see the biggest speedup from APICv. Paolo ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v9 03/19] vfio: platform: add the VFIO PLATFORM module to Kconfig
Hello Hongbo, On Wed, Nov 12, 2014 at 10:52 AM, Hongbo Zhang hongbo.zh...@linaro.org wrote: On 28 October 2014 02:07, Antonios Motakis a.mota...@virtualopensystems.com wrote: Enable building the VFIO PLATFORM driver that allows to use Linux platform devices with VFIO. Signed-off-by: Antonios Motakis a.mota...@virtualopensystems.com --- drivers/vfio/Kconfig | 1 + drivers/vfio/Makefile | 1 + drivers/vfio/platform/Kconfig | 9 + drivers/vfio/platform/Makefile | 4 4 files changed, 15 insertions(+) create mode 100644 drivers/vfio/platform/Kconfig create mode 100644 drivers/vfio/platform/Makefile diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig index a0abe04..962fb80 100644 --- a/drivers/vfio/Kconfig +++ b/drivers/vfio/Kconfig @@ -27,3 +27,4 @@ menuconfig VFIO If you don't know what to do here, say N. source drivers/vfio/pci/Kconfig +source drivers/vfio/platform/Kconfig diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile index 0b035b1..dadf0ca 100644 --- a/drivers/vfio/Makefile +++ b/drivers/vfio/Makefile @@ -3,3 +3,4 @@ obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_type1.o obj-$(CONFIG_VFIO_IOMMU_SPAPR_TCE) += vfio_iommu_spapr_tce.o obj-$(CONFIG_VFIO_SPAPR_EEH) += vfio_spapr_eeh.o obj-$(CONFIG_VFIO_PCI) += pci/ +obj-$(CONFIG_VFIO_PLATFORM) += platform/ diff --git a/drivers/vfio/platform/Kconfig b/drivers/vfio/platform/Kconfig new file mode 100644 index 000..c51af17 --- /dev/null +++ b/drivers/vfio/platform/Kconfig @@ -0,0 +1,9 @@ +config VFIO_PLATFORM + tristate VFIO support for platform devices + depends on VFIO EVENTFD ARM Hi Antonios, Is this only for ARM? how about X86 and PowerPC? On Freescale's PowerPC platform, the IOMMU is called PAMU (Peripheral Access Management Unit), and I am trying to use this VFIO framework on it. In principle it should be working on any platform with such devices; as long as you have a VFIO IOMMU driver for the PAMU (on ARM we use VFIO PLATFORM for the device, with VFIO IOMMU TYPE1 for the IOMMU). So if you have a suitable IOMMU driver for your target, feel free to test it, and let us know of the results. + help + Support for platform devices with VFIO. This is required to make + use of platform devices present on the system using the VFIO + framework. + + If you don't know what to do here, say N. diff --git a/drivers/vfio/platform/Makefile b/drivers/vfio/platform/Makefile new file mode 100644 index 000..279862b --- /dev/null +++ b/drivers/vfio/platform/Makefile @@ -0,0 +1,4 @@ + +vfio-platform-y := vfio_platform.o vfio_platform_common.o + +obj-$(CONFIG_VFIO_PLATFORM) += vfio-platform.o -- 2.1.1 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v9 03/19] vfio: platform: add the VFIO PLATFORM module to Kconfig
On 10/27/2014 07:07 PM, Antonios Motakis wrote: Enable building the VFIO PLATFORM driver that allows to use Linux platform devices with VFIO. Signed-off-by: Antonios Motakis a.mota...@virtualopensystems.com --- drivers/vfio/Kconfig | 1 + drivers/vfio/Makefile | 1 + drivers/vfio/platform/Kconfig | 9 + drivers/vfio/platform/Makefile | 4 4 files changed, 15 insertions(+) create mode 100644 drivers/vfio/platform/Kconfig create mode 100644 drivers/vfio/platform/Makefile diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig index a0abe04..962fb80 100644 --- a/drivers/vfio/Kconfig +++ b/drivers/vfio/Kconfig @@ -27,3 +27,4 @@ menuconfig VFIO If you don't know what to do here, say N. source drivers/vfio/pci/Kconfig +source drivers/vfio/platform/Kconfig diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile index 0b035b1..dadf0ca 100644 --- a/drivers/vfio/Makefile +++ b/drivers/vfio/Makefile @@ -3,3 +3,4 @@ obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_type1.o obj-$(CONFIG_VFIO_IOMMU_SPAPR_TCE) += vfio_iommu_spapr_tce.o obj-$(CONFIG_VFIO_SPAPR_EEH) += vfio_spapr_eeh.o obj-$(CONFIG_VFIO_PCI) += pci/ +obj-$(CONFIG_VFIO_PLATFORM) += platform/ diff --git a/drivers/vfio/platform/Kconfig b/drivers/vfio/platform/Kconfig new file mode 100644 index 000..c51af17 --- /dev/null +++ b/drivers/vfio/platform/Kconfig @@ -0,0 +1,9 @@ +config VFIO_PLATFORM + tristate VFIO support for platform devices + depends on VFIO EVENTFD ARM In http://www.spinics.net/lists/kvm/msg106815.html, Joel proposed adding support for ARM64 too. I took part of his patch in irqfd next release and you may add support for ARM64 here or in another patch. What do you think? Best Regards Eric + help + Support for platform devices with VFIO. This is required to make + use of platform devices present on the system using the VFIO + framework. + + If you don't know what to do here, say N. diff --git a/drivers/vfio/platform/Makefile b/drivers/vfio/platform/Makefile new file mode 100644 index 000..279862b --- /dev/null +++ b/drivers/vfio/platform/Makefile @@ -0,0 +1,4 @@ + +vfio-platform-y := vfio_platform.o vfio_platform_common.o + +obj-$(CONFIG_VFIO_PLATFORM) += vfio-platform.o ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH v9 03/19] vfio: platform: add the VFIO PLATFORM module to Kconfig
Hi, This is not yet supported on Freescale PowerPC. I am still in process of upstreaming the FSL PAMU specific patches for same. Initial plan is to test with PCIe devices and then with Platform devices. Thanks -Bharat From: kvmarm-boun...@lists.cs.columbia.edu [mailto:kvmarm-boun...@lists.cs.columbia.edu] On Behalf Of Hongbo Zhang Sent: Wednesday, November 12, 2014 3:08 PM To: Antonios Motakis Cc: open list:VFIO DRIVER; will.dea...@arm.com; alex.william...@redhat.com; open list; iommu@lists.linux-foundation.org; t...@virtualopensystems.com; kvm...@lists.cs.columbia.edu Subject: Re: [PATCH v9 03/19] vfio: platform: add the VFIO PLATFORM module to Kconfig On 28 October 2014 02:07, Antonios Motakis a.mota...@virtualopensystems.commailto:a.mota...@virtualopensystems.com wrote: Enable building the VFIO PLATFORM driver that allows to use Linux platform devices with VFIO. Signed-off-by: Antonios Motakis a.mota...@virtualopensystems.commailto:a.mota...@virtualopensystems.com --- drivers/vfio/Kconfig | 1 + drivers/vfio/Makefile | 1 + drivers/vfio/platform/Kconfig | 9 + drivers/vfio/platform/Makefile | 4 4 files changed, 15 insertions(+) create mode 100644 drivers/vfio/platform/Kconfig create mode 100644 drivers/vfio/platform/Makefile diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig index a0abe04..962fb80 100644 --- a/drivers/vfio/Kconfig +++ b/drivers/vfio/Kconfig @@ -27,3 +27,4 @@ menuconfig VFIO If you don't know what to do here, say N. source drivers/vfio/pci/Kconfig +source drivers/vfio/platform/Kconfig diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile index 0b035b1..dadf0ca 100644 --- a/drivers/vfio/Makefile +++ b/drivers/vfio/Makefile @@ -3,3 +3,4 @@ obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_type1.o obj-$(CONFIG_VFIO_IOMMU_SPAPR_TCE) += vfio_iommu_spapr_tce.o obj-$(CONFIG_VFIO_SPAPR_EEH) += vfio_spapr_eeh.o obj-$(CONFIG_VFIO_PCI) += pci/ +obj-$(CONFIG_VFIO_PLATFORM) += platform/ diff --git a/drivers/vfio/platform/Kconfig b/drivers/vfio/platform/Kconfig new file mode 100644 index 000..c51af17 --- /dev/null +++ b/drivers/vfio/platform/Kconfig @@ -0,0 +1,9 @@ +config VFIO_PLATFORM + tristate VFIO support for platform devices + depends on VFIO EVENTFD ARM Hi Antonios, Is this only for ARM? how about X86 and PowerPC? On Freescale's PowerPC platform, the IOMMU is called PAMU (Peripheral Access Management Unit), and I am trying to use this VFIO framework on it. + help + Support for platform devices with VFIO. This is required to make + use of platform devices present on the system using the VFIO + framework. + + If you don't know what to do here, say N. diff --git a/drivers/vfio/platform/Makefile b/drivers/vfio/platform/Makefile new file mode 100644 index 000..279862b --- /dev/null +++ b/drivers/vfio/platform/Makefile @@ -0,0 +1,4 @@ + +vfio-platform-y := vfio_platform.o vfio_platform_common.o + +obj-$(CONFIG_VFIO_PLATFORM) += vfio-platform.o -- 2.1.1 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.orgmailto:majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v9 01/19] vfio/platform: initial skeleton of VFIO support for platform devices
Hi Antonios, On 10/27/2014 07:07 PM, Antonios Motakis wrote: This patch forms the common skeleton code for platform devices support with VFIO. This will include the core functionality of VFIO_PLATFORM, however binding to the device and discovering the device resources will be done with the help of a separate file where any Linux platform bus specific code will reside. This will allow us to implement support for also discovering AMBA devices and their resources, but still reuse a large part of the VFIO_PLATFORM implementation. Signed-off-by: Antonios Motakis a.mota...@virtualopensystems.com --- drivers/vfio/platform/vfio_platform_common.c | 126 ++ drivers/vfio/platform/vfio_platform_private.h | 36 2 files changed, 162 insertions(+) create mode 100644 drivers/vfio/platform/vfio_platform_common.c create mode 100644 drivers/vfio/platform/vfio_platform_private.h diff --git a/drivers/vfio/platform/vfio_platform_common.c b/drivers/vfio/platform/vfio_platform_common.c new file mode 100644 index 000..e0fdbc8 --- /dev/null +++ b/drivers/vfio/platform/vfio_platform_common.c @@ -0,0 +1,126 @@ +/* + * Copyright (C) 2013 - Virtual Open Systems + * Author: Antonios Motakis a.mota...@virtualopensystems.com + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License, version 2, as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + */ + +#include linux/device.h +#include linux/interrupt.h +#include linux/iommu.h +#include linux/module.h +#include linux/mutex.h +#include linux/notifier.h +#include linux/pm_runtime.h +#include linux/slab.h +#include linux/types.h +#include linux/uaccess.h +#include linux/vfio.h +#include linux/io.h not sure at that state all the above includes are needed. + +#include vfio_platform_private.h + +static void vfio_platform_release(void *device_data) +{ + module_put(THIS_MODULE); +} + +static int vfio_platform_open(void *device_data) +{ + if (!try_module_get(THIS_MODULE)) + return -ENODEV; + + return 0; +} + +static long vfio_platform_ioctl(void *device_data, +unsigned int cmd, unsigned long arg) a minor style comment/question that applies on all the series. Shouldn't subsequent argument lines rather be aligned with the first char after (, as done in PCI code? +{ + if (cmd == VFIO_DEVICE_GET_INFO) + return -EINVAL; + + else if (cmd == VFIO_DEVICE_GET_REGION_INFO) + return -EINVAL; + + else if (cmd == VFIO_DEVICE_GET_IRQ_INFO) + return -EINVAL; + + else if (cmd == VFIO_DEVICE_SET_IRQS) + return -EINVAL; + + else if (cmd == VFIO_DEVICE_RESET) + return -EINVAL; + + return -ENOTTY; +} + +static ssize_t vfio_platform_read(void *device_data, char __user *buf, + size_t count, loff_t *ppos) +{ + return -EINVAL; +} + +static ssize_t vfio_platform_write(void *device_data, const char __user *buf, +size_t count, loff_t *ppos) +{ + return -EINVAL; +} + +static int vfio_platform_mmap(void *device_data, struct vm_area_struct *vma) +{ + return -EINVAL; +} + +static const struct vfio_device_ops vfio_platform_ops = { + .name = vfio-platform, + .open = vfio_platform_open, + .release= vfio_platform_release, + .ioctl = vfio_platform_ioctl, + .read = vfio_platform_read, + .write = vfio_platform_write, + .mmap = vfio_platform_mmap, +}; + +int vfio_platform_probe_common(struct vfio_platform_device *vdev, +struct device *dev) +{ + struct iommu_group *group; + int ret; + + if (!vdev) + return -EINVAL; + + group = iommu_group_get(dev); + if (!group) { + pr_err(VFIO: No IOMMU group for device %s\n, vdev-name); + return -EINVAL; + } I saw the above check also is done at beginning of vfio_add_group_dev. Added value however is pr_err which is useful and PCI code does the check too. Eric + + ret = vfio_add_group_dev(dev, vfio_platform_ops, vdev); + if (ret) { + iommu_group_put(group); + return ret; + } + + return 0; +} +EXPORT_SYMBOL_GPL(vfio_platform_probe_common); + +struct vfio_platform_device *vfio_platform_remove_common(struct device *dev) +{ + struct vfio_platform_device *vdev; + + vdev = vfio_del_group_dev(dev); + if (vdev) + iommu_group_put(dev-iommu_group); +
Re: [PATCH v9 06/19] vfio/platform: return info for bound device
On 10/27/2014 07:07 PM, Antonios Motakis wrote: A VFIO userspace driver will start by opening the VFIO device that corresponds to an IOMMU group, and will use the ioctl interface to get the basic device info, such as number of memory regions and interrupts, and their properties. This patch enables the VFIO_DEVICE_GET_INFO ioctl call. Signed-off-by: Antonios Motakis a.mota...@virtualopensystems.com --- drivers/vfio/platform/vfio_platform_common.c | 23 --- 1 file changed, 20 insertions(+), 3 deletions(-) diff --git a/drivers/vfio/platform/vfio_platform_common.c b/drivers/vfio/platform/vfio_platform_common.c index e0fdbc8..cb20526 100644 --- a/drivers/vfio/platform/vfio_platform_common.c +++ b/drivers/vfio/platform/vfio_platform_common.c @@ -43,10 +43,27 @@ static int vfio_platform_open(void *device_data) static long vfio_platform_ioctl(void *device_data, unsigned int cmd, unsigned long arg) { - if (cmd == VFIO_DEVICE_GET_INFO) - return -EINVAL; + struct vfio_platform_device *vdev = device_data; + unsigned long minsz; + + if (cmd == VFIO_DEVICE_GET_INFO) { + struct vfio_device_info info; + + minsz = offsetofend(struct vfio_device_info, num_irqs); + + if (copy_from_user(info, (void __user *)arg, minsz)) + return -EFAULT; + + if (info.argsz minsz) + return -EINVAL; + + info.flags = vdev-flags; + info.num_regions = 0; + info.num_irqs = 0; Seems a bit weird to me to enable the modality but returning zeroed values. Shouldn't we put that patch after VFIO_DEVICE_GET_REGION_INFO and VFIO_DEVICE_GET_IRQ_INFO ones? Best Regards Eric + + return copy_to_user((void __user *)arg, info, minsz); - else if (cmd == VFIO_DEVICE_GET_REGION_INFO) + } else if (cmd == VFIO_DEVICE_GET_REGION_INFO) return -EINVAL; else if (cmd == VFIO_DEVICE_GET_IRQ_INFO) ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v9 07/19] vfio/platform: return info for device memory mapped IO regions
On 10/27/2014 07:07 PM, Antonios Motakis wrote: This patch enables the IOCTLs VFIO_DEVICE_GET_REGION_INFO ioctl call, which allows the user to learn about the available MMIO resources of a device. Signed-off-by: Antonios Motakis a.mota...@virtualopensystems.com --- drivers/vfio/platform/vfio_platform_common.c | 110 +- drivers/vfio/platform/vfio_platform_private.h | 22 ++ 2 files changed, 128 insertions(+), 4 deletions(-) diff --git a/drivers/vfio/platform/vfio_platform_common.c b/drivers/vfio/platform/vfio_platform_common.c index cb20526..82de752 100644 --- a/drivers/vfio/platform/vfio_platform_common.c +++ b/drivers/vfio/platform/vfio_platform_common.c @@ -27,17 +27,97 @@ #include vfio_platform_private.h +static DEFINE_MUTEX(driver_lock); + +static int vfio_platform_regions_init(struct vfio_platform_device *vdev) +{ + int cnt = 0, i; + + while (vdev-get_resource(vdev, cnt)) + cnt++; + + vdev-regions = kcalloc(cnt, sizeof(struct vfio_platform_region), + GFP_KERNEL); + if (!vdev-regions) + return -ENOMEM; + + for (i = 0; i cnt; i++) { + struct resource *res = + vdev-get_resource(vdev, i); + + if (!res) + goto err; + + vdev-regions[i].addr = res-start; + vdev-regions[i].size = resource_size(res); + vdev-regions[i].flags = 0; + + switch (resource_type(res)) { + case IORESOURCE_MEM: + vdev-regions[i].type = VFIO_PLATFORM_REGION_TYPE_MMIO; + break; + case IORESOURCE_IO: + vdev-regions[i].type = VFIO_PLATFORM_REGION_TYPE_PIO; + break; + default: + goto err; + } + } + + vdev-num_regions = cnt; + + return 0; +err: Isn't it safer to reset vdev-num_regions here? I think in a next patch you will iounmap the num_regions in vfio_platform_regions_cleanup. -- Eric + kfree(vdev-regions); + return -EINVAL; +} + +static void vfio_platform_regions_cleanup(struct vfio_platform_device *vdev) +{ + vdev-num_regions = 0; + kfree(vdev-regions); +} + static void vfio_platform_release(void *device_data) { + struct vfio_platform_device *vdev = device_data; + + mutex_lock(driver_lock); + + if (!(--vdev-refcnt)) { + vfio_platform_regions_cleanup(vdev); + } + + mutex_unlock(driver_lock); + module_put(THIS_MODULE); } static int vfio_platform_open(void *device_data) { + struct vfio_platform_device *vdev = device_data; + int ret; + if (!try_module_get(THIS_MODULE)) return -ENODEV; + mutex_lock(driver_lock); + + if (!vdev-refcnt) { + ret = vfio_platform_regions_init(vdev); + if (ret) + goto err_reg; + } + + vdev-refcnt++; + + mutex_unlock(driver_lock); return 0; + +err_reg: + mutex_unlock(driver_lock); + module_put(THIS_MODULE); + return ret; } static long vfio_platform_ioctl(void *device_data, @@ -58,15 +138,33 @@ static long vfio_platform_ioctl(void *device_data, return -EINVAL; info.flags = vdev-flags; - info.num_regions = 0; + info.num_regions = vdev-num_regions; info.num_irqs = 0; return copy_to_user((void __user *)arg, info, minsz); - } else if (cmd == VFIO_DEVICE_GET_REGION_INFO) - return -EINVAL; + } else if (cmd == VFIO_DEVICE_GET_REGION_INFO) { + struct vfio_region_info info; + + minsz = offsetofend(struct vfio_region_info, offset); + + if (copy_from_user(info, (void __user *)arg, minsz)) + return -EFAULT; - else if (cmd == VFIO_DEVICE_GET_IRQ_INFO) + if (info.argsz minsz) + return -EINVAL; + + if (info.index = vdev-num_regions) + return -EINVAL; + + /* map offset to the physical address */ + info.offset = VFIO_PLATFORM_INDEX_TO_OFFSET(info.index); + info.size = vdev-regions[info.index].size; + info.flags = vdev-regions[info.index].flags; + + return copy_to_user((void __user *)arg, info, minsz); + + } else if (cmd == VFIO_DEVICE_GET_IRQ_INFO) return -EINVAL; else if (cmd == VFIO_DEVICE_SET_IRQS) @@ -134,10 +232,14 @@ struct vfio_platform_device *vfio_platform_remove_common(struct device *dev) { struct vfio_platform_device *vdev; + mutex_lock(driver_lock); + vdev = vfio_del_group_dev(dev); if (vdev) iommu_group_put(dev-iommu_group); +
RE: [PATCH v9 03/19] vfio: platform: add the VFIO PLATFORM module to Kconfig
-Original Message- From: Hongbo Zhang [mailto:hongbo.zh...@linaro.org] Sent: Wednesday, November 12, 2014 4:09 PM To: Bhushan Bharat-R65777 Cc: Antonios Motakis; open list:VFIO DRIVER; will.dea...@arm.com; alex.william...@redhat.com; open list; iommu@lists.linux-foundation.org; t...@virtualopensystems.com; kvm...@lists.cs.columbia.edu Subject: Re: [PATCH v9 03/19] vfio: platform: add the VFIO PLATFORM module to Kconfig On 12 November 2014 18:05, bharat.bhus...@freescale.com bharat.bhus...@freescale.com wrote: Hi, This is not yet supported on Freescale PowerPC. I am still in process of upstreaming the FSL PAMU specific patches for same. Initial plan is to test with PCIe devices and then with Platform devices. I see there is already driver/iommu/fsl_pamu.c, doesn't it work? We need VFIO iommu driver for same. Could you explain briefly what is wrong? I've heard that the vfio pci works on powerpc platforms. Yes, patches are in Freescale internal git repository. But those patches are yet to be upstreamed. I have started working on same. Thanks -Bharat Thanks -Bharat From: kvmarm-boun...@lists.cs.columbia.edu [mailto:kvmarm-boun...@lists.cs.columbia.edu] On Behalf Of Hongbo Zhang Sent: Wednesday, November 12, 2014 3:08 PM To: Antonios Motakis Cc: open list:VFIO DRIVER; will.dea...@arm.com; alex.william...@redhat.com; open list; iommu@lists.linux-foundation.org; t...@virtualopensystems.com; kvm...@lists.cs.columbia.edu Subject: Re: [PATCH v9 03/19] vfio: platform: add the VFIO PLATFORM module to Kconfig On 28 October 2014 02:07, Antonios Motakis a.mota...@virtualopensystems.com wrote: Enable building the VFIO PLATFORM driver that allows to use Linux platform devices with VFIO. Signed-off-by: Antonios Motakis a.mota...@virtualopensystems.com --- drivers/vfio/Kconfig | 1 + drivers/vfio/Makefile | 1 + drivers/vfio/platform/Kconfig | 9 + drivers/vfio/platform/Makefile | 4 4 files changed, 15 insertions(+) create mode 100644 drivers/vfio/platform/Kconfig create mode 100644 drivers/vfio/platform/Makefile diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig index a0abe04..962fb80 100644 --- a/drivers/vfio/Kconfig +++ b/drivers/vfio/Kconfig @@ -27,3 +27,4 @@ menuconfig VFIO If you don't know what to do here, say N. source drivers/vfio/pci/Kconfig +source drivers/vfio/platform/Kconfig diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile index 0b035b1..dadf0ca 100644 --- a/drivers/vfio/Makefile +++ b/drivers/vfio/Makefile @@ -3,3 +3,4 @@ obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_type1.o obj-$(CONFIG_VFIO_IOMMU_SPAPR_TCE) += vfio_iommu_spapr_tce.o obj-$(CONFIG_VFIO_SPAPR_EEH) += vfio_spapr_eeh.o obj-$(CONFIG_VFIO_PCI) += pci/ +obj-$(CONFIG_VFIO_PLATFORM) += platform/ diff --git a/drivers/vfio/platform/Kconfig b/drivers/vfio/platform/Kconfig new file mode 100644 index 000..c51af17 --- /dev/null +++ b/drivers/vfio/platform/Kconfig @@ -0,0 +1,9 @@ +config VFIO_PLATFORM + tristate VFIO support for platform devices + depends on VFIO EVENTFD ARM Hi Antonios, Is this only for ARM? how about X86 and PowerPC? On Freescale's PowerPC platform, the IOMMU is called PAMU (Peripheral Access Management Unit), and I am trying to use this VFIO framework on it. + help + Support for platform devices with VFIO. This is required to make + use of platform devices present on the system using the VFIO + framework. + + If you don't know what to do here, say N. diff --git a/drivers/vfio/platform/Makefile b/drivers/vfio/platform/Makefile new file mode 100644 index 000..279862b --- /dev/null +++ b/drivers/vfio/platform/Makefile @@ -0,0 +1,4 @@ + +vfio-platform-y := vfio_platform.o vfio_platform_common.o + +obj-$(CONFIG_VFIO_PLATFORM) += vfio-platform.o -- 2.1.1 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu/amd: fix accounting of device_state
On Mon, Nov 10, 2014 at 12:21:39PM +0200, Oded Gabbay wrote: This patch fixes a bug in the accounting of the device_state. In the current code, the device_state was put (decremented) too many times, which sometimes lead to the driver getting stuck permanently in put_device_state_wait(). That happen because the device_state-count would go below zero, which is never supposed to happen. The root cause is that the device_state was decremented in put_pasid_state() and put_pasid_state_wait() but also in all the functions that call those functions. Therefore, the device_state was decremented twice in each of these code paths. The fix is to decouple the device_state accounting from the pasid_state accounting - remove the call to put_device_state() from the put_pasid_state() and the put_pasid_state_wait()) Right, there was a double drop of the reference to device state. An alternative would have been to remove the put_device_state call at the end of the amd_iommu_unbind_pasid() function. But this patch works as well and is slightly better, so: Applied, thanks. Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v6 05/12] memory: Add NVIDIA Tegra memory controller support
On Fri, Nov 07, 2014 at 05:00:56PM +0100, Thierry Reding wrote: drivers/iommu/tegra-smmu.c | 1295 -- drivers/memory/tegra/smmu.c | 716 + This new smmu.c is an IOMMU driver, why do you put it in drivers/memory and not in drivers/iommu? Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v9 12/19] vfio/platform: trigger an interrupt via eventfd
On 10/31/2014 08:36 PM, Alex Williamson wrote: On Mon, 2014-10-27 at 19:07 +0100, Antonios Motakis wrote: This patch allows to set an eventfd for a patform device's interrupt, platform device (typo) and also to trigger the interrupt eventfd from userspace for testing. Level sensitive interrupts are marked as maskable and are handled in a later patch. Edge triggered interrupts are not advertised as maskable and are implemented here using a simple and efficient IRQ handler. Signed-off-by: Antonios Motakis a.mota...@virtualopensystems.com --- drivers/vfio/platform/vfio_platform_irq.c | 93 ++- drivers/vfio/platform/vfio_platform_private.h | 2 + 2 files changed, 93 insertions(+), 2 deletions(-) diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c index 007b386..2ac8ed7 100644 --- a/drivers/vfio/platform/vfio_platform_irq.c +++ b/drivers/vfio/platform/vfio_platform_irq.c @@ -45,11 +45,91 @@ static int vfio_platform_set_irq_unmask(struct vfio_platform_device *vdev, return -EINVAL; } +static irqreturn_t vfio_irq_handler(int irq, void *dev_id) +{ +struct vfio_platform_irq *irq_ctx = dev_id; + +eventfd_signal(irq_ctx-trigger, 1); + +return IRQ_HANDLED; +} + +static int vfio_set_trigger(struct vfio_platform_device *vdev, int index, +int fd, irq_handler_t handler) +{ +struct vfio_platform_irq *irq = vdev-irqs[index]; +struct eventfd_ctx *trigger; +int ret; + +if (irq-trigger) { +free_irq(irq-hwirq, irq); +kfree(irq-name); +eventfd_ctx_put(irq-trigger); +irq-trigger = NULL; +} + +if (fd 0) /* Disable only */ +return 0; + +irq-name = kasprintf(GFP_KERNEL, vfio-irq[%d](%s), +irq-hwirq, vdev-name); +if (!irq-name) +return -ENOMEM; + +trigger = eventfd_ctx_fdget(fd); +if (IS_ERR(trigger)) { +kfree(irq-name); +return PTR_ERR(trigger); +} + +irq-trigger = trigger; + +ret = request_irq(irq-hwirq, handler, 0, irq-name, irq); +if (ret) { +kfree(irq-name); +eventfd_ctx_put(trigger); +irq-trigger = NULL; +return ret; +} + +return 0; you may simply return ret here? +} + static int vfio_platform_set_irq_trigger(struct vfio_platform_device *vdev, unsigned index, unsigned start, unsigned count, uint32_t flags, void *data) { -return -EINVAL; +struct vfio_platform_irq *irq = vdev-irqs[index]; +irq_handler_t handler; + +if (vdev-irqs[index].flags VFIO_IRQ_INFO_MASKABLE) +return -EINVAL; /* not implemented */ +else +handler = vfio_irq_handler; + +if (!count (flags VFIO_IRQ_SET_DATA_NONE)) +return vfio_set_trigger(vdev, index, -1, handler); + +if (start != 0 || count != 1) +return -EINVAL; + +if (flags VFIO_IRQ_SET_DATA_EVENTFD) { +int32_t fd = *(int32_t *)data; + +return vfio_set_trigger(vdev, index, fd, handler); +} + +if (flags VFIO_IRQ_SET_DATA_NONE) { +handler(irq-hwirq, irq); + +} else if (flags VFIO_IRQ_SET_DATA_BOOL) { +uint8_t trigger = *(uint8_t *)data; + +if (trigger) +handler(irq-hwirq, irq); +} + +return 0; } int vfio_platform_set_irqs_ioctl(struct vfio_platform_device *vdev, @@ -95,7 +175,11 @@ int vfio_platform_irq_init(struct vfio_platform_device *vdev) if (hwirq 0) goto err; -vdev-irqs[i].flags = 0; +vdev-irqs[i].flags = VFIO_IRQ_INFO_EVENTFD; + +if (irq_get_trigger_type(hwirq) IRQ_TYPE_LEVEL_MASK) +vdev-irqs[i].flags |= VFIO_IRQ_INFO_MASKABLE; This is a bit confusing because edge interrupts can support masking, but they don't require it. Level interrupts really must support masking because we need to mask them on the host and therefore the user needs to be able to unmask them (ignoring the irq prioritization thing you guys can do on arm). So this works, but I would really have expected VFIO_IRQ_INFO_AUTOMASKED here and in the above function. Shouldn't we have AUTOMASKED for level sensitive and MASKABLE for both level edge? For forwarded IRQ, may I enrich the external API with a new function enabling to turn the automasked flag off? Would that make sense? Thanks Eric + vdev-irqs[i].count = 1; vdev-irqs[i].hwirq = hwirq; } @@ -110,6 +194,11 @@ err: void vfio_platform_irq_cleanup(struct vfio_platform_device *vdev) { +int i; + +for (i = 0; i vdev-num_irqs; i++) +vfio_set_trigger(vdev, i, -1, NULL); +
Re: [PATCH v6 05/12] memory: Add NVIDIA Tegra memory controller support
On Wed, Nov 12, 2014 at 03:21:50PM +0100, Joerg Roedel wrote: On Fri, Nov 07, 2014 at 05:00:56PM +0100, Thierry Reding wrote: drivers/iommu/tegra-smmu.c | 1295 -- drivers/memory/tegra/smmu.c | 716 + This new smmu.c is an IOMMU driver, why do you put it in drivers/memory and not in drivers/iommu? The SMMU is part of a larger IP block that's also a memory controller. Having it in drivers/iommu would mean that we need to provide a mechanism to synchronize between the two drivers. They also share a number of data tables, so that would need to be shared somehow as well. By keeping both in the same directory we don't have to expose any of this publicly. Thierry pgp3AKt3IkIPL.pgp Description: PGP signature ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v6 05/12] memory: Add NVIDIA Tegra memory controller support
On Wed, Nov 12, 2014 at 03:47:16PM +0100, Thierry Reding wrote: The SMMU is part of a larger IP block that's also a memory controller. Having it in drivers/iommu would mean that we need to provide a mechanism to synchronize between the two drivers. They also share a number of data tables, so that would need to be shared somehow as well. By keeping both in the same directory we don't have to expose any of this publicly. Since these parts are in different files you export the interfaces anyway through non-static symbols and those dependencys also exist for other IOMMU drivers. So I prefer this file to live in drivers/iommu. Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v9 13/19] vfio/platform: support for level sensitive interrupts
On 10/31/2014 08:36 PM, Alex Williamson wrote: On Mon, 2014-10-27 at 19:07 +0100, Antonios Motakis wrote: Level sensitive interrupts are exposed as maskable and automasked interrupts and are masked and disabled automatically when they fire. Signed-off-by: Antonios Motakis a.mota...@virtualopensystems.com --- drivers/vfio/platform/vfio_platform_irq.c | 102 +- drivers/vfio/platform/vfio_platform_private.h | 2 + 2 files changed, 100 insertions(+), 4 deletions(-) diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c index 2ac8ed7..563abf6 100644 --- a/drivers/vfio/platform/vfio_platform_irq.c +++ b/drivers/vfio/platform/vfio_platform_irq.c @@ -31,18 +31,108 @@ #include vfio_platform_private.h +static void vfio_platform_mask(struct vfio_platform_irq *irq_ctx) +{ +unsigned long flags; + +spin_lock_irqsave(irq_ctx-lock, flags); + +if (!irq_ctx-masked) { +disable_irq(irq_ctx-hwirq); +irq_ctx-masked = true; +} + +spin_unlock_irqrestore(irq_ctx-lock, flags); +} + static int vfio_platform_set_irq_mask(struct vfio_platform_device *vdev, unsigned index, unsigned start, unsigned count, uint32_t flags, void *data) { -return -EINVAL; +if (start != 0 || count != 1) +return -EINVAL; + +if (!(vdev-irqs[index].flags VFIO_IRQ_INFO_MASKABLE)) +return -EINVAL; + +if (flags VFIO_IRQ_SET_DATA_EVENTFD) +return -EINVAL; /* not implemented yet */ + +if (flags VFIO_IRQ_SET_DATA_NONE) { +vfio_platform_mask(vdev-irqs[index]); + +} else if (flags VFIO_IRQ_SET_DATA_BOOL) { +uint8_t mask = *(uint8_t *)data; + +if (mask) +vfio_platform_mask(vdev-irqs[index]); +} + +return 0; +} + +static void vfio_platform_unmask(struct vfio_platform_irq *irq_ctx) +{ +unsigned long flags; + +spin_lock_irqsave(irq_ctx-lock, flags); + +if (irq_ctx-masked) { +enable_irq(irq_ctx-hwirq); +irq_ctx-masked = false; +} + +spin_unlock_irqrestore(irq_ctx-lock, flags); } static int vfio_platform_set_irq_unmask(struct vfio_platform_device *vdev, unsigned index, unsigned start, unsigned count, uint32_t flags, void *data) { -return -EINVAL; +if (start != 0 || count != 1) +return -EINVAL; + +if (!(vdev-irqs[index].flags VFIO_IRQ_INFO_MASKABLE)) +return -EINVAL; + +if (flags VFIO_IRQ_SET_DATA_EVENTFD) +return -EINVAL; /* not implemented yet */ + +if (flags VFIO_IRQ_SET_DATA_NONE) { +vfio_platform_unmask(vdev-irqs[index]); + +} else if (flags VFIO_IRQ_SET_DATA_BOOL) { +uint8_t unmask = *(uint8_t *)data; + +if (unmask) +vfio_platform_unmask(vdev-irqs[index]); +} + +return 0; +} + +static irqreturn_t vfio_maskable_irq_handler(int irq, void *dev_id) +{ +struct vfio_platform_irq *irq_ctx = dev_id; +unsigned long flags; +int ret = IRQ_NONE; + +spin_lock_irqsave(irq_ctx-lock, flags); + +if (!irq_ctx-masked) { +ret = IRQ_HANDLED; + +/* automask maskable interrupts */ +disable_irq_nosync(irq_ctx-hwirq); +irq_ctx-masked = true; +} + +spin_unlock_irqrestore(irq_ctx-lock, flags); + +if (ret == IRQ_HANDLED) +eventfd_signal(irq_ctx-trigger, 1); + +return ret; This is not just a maskable irq handler, but specifically a level (aka automasked) handler. So this should only be used for AUTOMASKED irqs. } Antonios, Alex, When turning forwarded IRQ modality on/off, I guess I now should the swap the handlers (free_irq, request_irq) adding some extra complexity in the driver compared to the previous implementation. Do you confirm this is the preferred way? Thanks Eric static irqreturn_t vfio_irq_handler(int irq, void *dev_id) @@ -103,7 +193,7 @@ static int vfio_platform_set_irq_trigger(struct vfio_platform_device *vdev, irq_handler_t handler; if (vdev-irqs[index].flags VFIO_IRQ_INFO_MASKABLE) -return -EINVAL; /* not implemented */ +handler = vfio_maskable_irq_handler; As noted in the previous patch, this should be a test for AUTOMASKED, not just MASKABLE. else handler = vfio_irq_handler; @@ -175,13 +265,17 @@ int vfio_platform_irq_init(struct vfio_platform_device *vdev) if (hwirq 0) goto err; +spin_lock_init(vdev-irqs[i].lock); + vdev-irqs[i].flags = VFIO_IRQ_INFO_EVENTFD; if (irq_get_trigger_type(hwirq) IRQ_TYPE_LEVEL_MASK) -
Re: [PATCH v9 13/19] vfio/platform: support for level sensitive interrupts
On 10/31/2014 08:36 PM, Alex Williamson wrote: On Mon, 2014-10-27 at 19:07 +0100, Antonios Motakis wrote: Level sensitive interrupts are exposed as maskable and automasked interrupts and are masked and disabled automatically when they fire. Signed-off-by: Antonios Motakis a.mota...@virtualopensystems.com --- drivers/vfio/platform/vfio_platform_irq.c | 102 +- drivers/vfio/platform/vfio_platform_private.h | 2 + 2 files changed, 100 insertions(+), 4 deletions(-) diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c index 2ac8ed7..563abf6 100644 --- a/drivers/vfio/platform/vfio_platform_irq.c +++ b/drivers/vfio/platform/vfio_platform_irq.c @@ -31,18 +31,108 @@ #include vfio_platform_private.h +static void vfio_platform_mask(struct vfio_platform_irq *irq_ctx) +{ +unsigned long flags; + +spin_lock_irqsave(irq_ctx-lock, flags); + +if (!irq_ctx-masked) { +disable_irq(irq_ctx-hwirq); +irq_ctx-masked = true; +} + +spin_unlock_irqrestore(irq_ctx-lock, flags); +} + static int vfio_platform_set_irq_mask(struct vfio_platform_device *vdev, unsigned index, unsigned start, unsigned count, uint32_t flags, void *data) { -return -EINVAL; +if (start != 0 || count != 1) +return -EINVAL; + +if (!(vdev-irqs[index].flags VFIO_IRQ_INFO_MASKABLE)) +return -EINVAL; + +if (flags VFIO_IRQ_SET_DATA_EVENTFD) +return -EINVAL; /* not implemented yet */ + +if (flags VFIO_IRQ_SET_DATA_NONE) { +vfio_platform_mask(vdev-irqs[index]); + +} else if (flags VFIO_IRQ_SET_DATA_BOOL) { +uint8_t mask = *(uint8_t *)data; + +if (mask) +vfio_platform_mask(vdev-irqs[index]); +} + +return 0; +} + +static void vfio_platform_unmask(struct vfio_platform_irq *irq_ctx) +{ +unsigned long flags; + +spin_lock_irqsave(irq_ctx-lock, flags); + +if (irq_ctx-masked) { +enable_irq(irq_ctx-hwirq); +irq_ctx-masked = false; +} + +spin_unlock_irqrestore(irq_ctx-lock, flags); } static int vfio_platform_set_irq_unmask(struct vfio_platform_device *vdev, unsigned index, unsigned start, unsigned count, uint32_t flags, void *data) { -return -EINVAL; +if (start != 0 || count != 1) +return -EINVAL; + +if (!(vdev-irqs[index].flags VFIO_IRQ_INFO_MASKABLE)) +return -EINVAL; + +if (flags VFIO_IRQ_SET_DATA_EVENTFD) +return -EINVAL; /* not implemented yet */ + +if (flags VFIO_IRQ_SET_DATA_NONE) { +vfio_platform_unmask(vdev-irqs[index]); + +} else if (flags VFIO_IRQ_SET_DATA_BOOL) { +uint8_t unmask = *(uint8_t *)data; + +if (unmask) +vfio_platform_unmask(vdev-irqs[index]); +} + +return 0; +} + +static irqreturn_t vfio_maskable_irq_handler(int irq, void *dev_id) +{ +struct vfio_platform_irq *irq_ctx = dev_id; +unsigned long flags; +int ret = IRQ_NONE; + +spin_lock_irqsave(irq_ctx-lock, flags); + +if (!irq_ctx-masked) { +ret = IRQ_HANDLED; + +/* automask maskable interrupts */ +disable_irq_nosync(irq_ctx-hwirq); +irq_ctx-masked = true; +} + +spin_unlock_irqrestore(irq_ctx-lock, flags); + +if (ret == IRQ_HANDLED) +eventfd_signal(irq_ctx-trigger, 1); + +return ret; This is not just a maskable irq handler, but specifically a level (aka automasked) handler. So this should only be used for AUTOMASKED irqs. } Antonios, Alex, When turning forwarded IRQ modality on/off, I guess I now should the swap the handlers (free_irq, request_irq) adding some extra complexity in the driver compared to the previous implementation. Do you confirm this is the prefered way? Thanks Eric static irqreturn_t vfio_irq_handler(int irq, void *dev_id) @@ -103,7 +193,7 @@ static int vfio_platform_set_irq_trigger(struct vfio_platform_device *vdev, irq_handler_t handler; if (vdev-irqs[index].flags VFIO_IRQ_INFO_MASKABLE) -return -EINVAL; /* not implemented */ +handler = vfio_maskable_irq_handler; As noted in the previous patch, this should be a test for AUTOMASKED, not just MASKABLE. else handler = vfio_irq_handler; @@ -175,13 +265,17 @@ int vfio_platform_irq_init(struct vfio_platform_device *vdev) if (hwirq 0) goto err; +spin_lock_init(vdev-irqs[i].lock); + vdev-irqs[i].flags = VFIO_IRQ_INFO_EVENTFD; if (irq_get_trigger_type(hwirq) IRQ_TYPE_LEVEL_MASK) -
Re: [PATCH 05/13] KVM: Update IRTE according to guest interrupt configuration changes
On Wed, 2014-11-12 at 10:14 +0100, Paolo Bonzini wrote: On 12/11/2014 04:42, Zhang, Yang Z wrote: Personally, I think this feature will be helpful to the legacy device assignment. Agree, vfio is the right solution for future feature enabling. But the old kvm without the good vfio supporting is still used largely today. The user really looking for this feature but they will not upgrade their kernel. It's easy for us to backport this feature to old kvm with the legacy device assignment, but it is impossible to backport the whole vfio. You can certainly backport these patches to distros that do not have VFIO. But upstream we should work on VFIO first. VFIO has feature parity with legacy device assignment, and adding a new feature that is not in VFIO would be a bad idea. Thanks Paolo, I agree. We should design the interfaces for VFIO since we expect legacy KVM assignment to be deprecated and eventually removed. I think that some of the platform device work for ARM's IRQ forwarding should probably be leveraged for this interface. IRQ forwarding effectively allows level triggered interrupts to be handled as edge, eliminating the mask/unmask overhead and EOI path entirely. To do this through VFIO they make use of the KVM-VFIO device to register the device and set attributes for the forwarded IRQ. This enables KVM to use the VFIO external user interfaces to acquire a VFIO device reference and access the struct device. From there it can do some IRQ manipulation on the device to reconfigure how the host handles the interrupt. Ideally we could use the same base KVM-VFIO device interface interface, perhaps with different attributes, and obviously with different architecture backing. Thanks, Alex ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v9 01/19] vfio/platform: initial skeleton of VFIO support for platform devices
On Wed, 2014-11-12 at 11:05 +0100, Eric Auger wrote: Hi Antonios, On 10/27/2014 07:07 PM, Antonios Motakis wrote: This patch forms the common skeleton code for platform devices support with VFIO. This will include the core functionality of VFIO_PLATFORM, however binding to the device and discovering the device resources will be done with the help of a separate file where any Linux platform bus specific code will reside. This will allow us to implement support for also discovering AMBA devices and their resources, but still reuse a large part of the VFIO_PLATFORM implementation. Signed-off-by: Antonios Motakis a.mota...@virtualopensystems.com --- drivers/vfio/platform/vfio_platform_common.c | 126 ++ drivers/vfio/platform/vfio_platform_private.h | 36 2 files changed, 162 insertions(+) create mode 100644 drivers/vfio/platform/vfio_platform_common.c create mode 100644 drivers/vfio/platform/vfio_platform_private.h diff --git a/drivers/vfio/platform/vfio_platform_common.c b/drivers/vfio/platform/vfio_platform_common.c new file mode 100644 index 000..e0fdbc8 --- /dev/null +++ b/drivers/vfio/platform/vfio_platform_common.c @@ -0,0 +1,126 @@ +/* + * Copyright (C) 2013 - Virtual Open Systems + * Author: Antonios Motakis a.mota...@virtualopensystems.com + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License, version 2, as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + */ + +#include linux/device.h +#include linux/interrupt.h +#include linux/iommu.h +#include linux/module.h +#include linux/mutex.h +#include linux/notifier.h +#include linux/pm_runtime.h +#include linux/slab.h +#include linux/types.h +#include linux/uaccess.h +#include linux/vfio.h +#include linux/io.h not sure at that state all the above includes are needed. + +#include vfio_platform_private.h + +static void vfio_platform_release(void *device_data) +{ + module_put(THIS_MODULE); +} + +static int vfio_platform_open(void *device_data) +{ + if (!try_module_get(THIS_MODULE)) + return -ENODEV; + + return 0; +} + +static long vfio_platform_ioctl(void *device_data, + unsigned int cmd, unsigned long arg) a minor style comment/question that applies on all the series. Shouldn't subsequent argument lines rather be aligned with the first char after (, as done in PCI code? It's also my preferred style to indent to just after the open paren on wrapped lines where possible, but I don't think there are hard rules in CodingStyle or checkpatch that enforce this, so I often let it slide. Thanks, Alex +{ + if (cmd == VFIO_DEVICE_GET_INFO) + return -EINVAL; + + else if (cmd == VFIO_DEVICE_GET_REGION_INFO) + return -EINVAL; + + else if (cmd == VFIO_DEVICE_GET_IRQ_INFO) + return -EINVAL; + + else if (cmd == VFIO_DEVICE_SET_IRQS) + return -EINVAL; + + else if (cmd == VFIO_DEVICE_RESET) + return -EINVAL; + + return -ENOTTY; +} + +static ssize_t vfio_platform_read(void *device_data, char __user *buf, + size_t count, loff_t *ppos) +{ + return -EINVAL; +} + +static ssize_t vfio_platform_write(void *device_data, const char __user *buf, + size_t count, loff_t *ppos) +{ + return -EINVAL; +} + +static int vfio_platform_mmap(void *device_data, struct vm_area_struct *vma) +{ + return -EINVAL; +} + +static const struct vfio_device_ops vfio_platform_ops = { + .name = vfio-platform, + .open = vfio_platform_open, + .release= vfio_platform_release, + .ioctl = vfio_platform_ioctl, + .read = vfio_platform_read, + .write = vfio_platform_write, + .mmap = vfio_platform_mmap, +}; + +int vfio_platform_probe_common(struct vfio_platform_device *vdev, + struct device *dev) +{ + struct iommu_group *group; + int ret; + + if (!vdev) + return -EINVAL; + + group = iommu_group_get(dev); + if (!group) { + pr_err(VFIO: No IOMMU group for device %s\n, vdev-name); + return -EINVAL; + } I saw the above check also is done at beginning of vfio_add_group_dev. Added value however is pr_err which is useful and PCI code does the check too. Eric + + ret = vfio_add_group_dev(dev, vfio_platform_ops, vdev); + if (ret) { +
Re: [PATCH 5/6] iommu/arm-smmu: support buggy implementations with invalidate-on-map
Hi Mitch, On Wed, Aug 13, 2014 at 01:51:38AM +0100, Mitchel Humpherys wrote: Add a workaround for some buggy hardware that requires a TLB invalidate operation to occur at map time. Activate the feature with the qcom,smmu-invalidate-on-map boolean DT property. I'm digging up an old thread here, but I've been working on a new page-table allocator for the SMMU and looked into implementing this workaround for you in there. When I do the TLBI on map after installing the new PTE, can I just invalidate the range mapped by that PTE, or does it need to be a full TLBI? Will ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[RFC][PATCH] iommu/arm-smmu: Huge page mapping support for ARM SMMU driver.
This patch adds huge page mapping support for the ARM SMMU Driver. Patch allows creation of 1G and 2MB page mappings. It's also possible to create contiguous huge page mappings. The loop in PMD/PUD intialization code has been removed, considering that iommu_map would work on page size granularity (exported by the arm smmu driver). The code doesn't support creation of new mapping, without unmaping an existing mapping. Code does support splitting up of huge page mappings in to smaller mappings, as a part of unmap. Signed-off-by: Varun Sethi varun.se...@freescale.com --- drivers/iommu/arm-smmu.c | 324 -- 1 file changed, 257 insertions(+), 67 deletions(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 60558f7..cf05a1f 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -81,16 +81,22 @@ #define ARM_SMMU_PTE_PAGE (((pteval_t)3) 0) #if PAGE_SIZE == SZ_4K -#define ARM_SMMU_PTE_CONT_ENTRIES 16 +#define ARM_SMMU_PGTBL_CONT_ENTRIES16 #elif PAGE_SIZE == SZ_64K -#define ARM_SMMU_PTE_CONT_ENTRIES 32 +#define ARM_SMMU_PGTBL_CONT_ENTRIES32 #else -#define ARM_SMMU_PTE_CONT_ENTRIES 1 +#define ARM_SMMU_PGTBL_CONT_ENTRIES1 #endif -#define ARM_SMMU_PTE_CONT_SIZE (PAGE_SIZE * ARM_SMMU_PTE_CONT_ENTRIES) +#define ARM_SMMU_PTE_CONT_SIZE (PAGE_SIZE * ARM_SMMU_PGTBL_CONT_ENTRIES) #define ARM_SMMU_PTE_CONT_MASK (~(ARM_SMMU_PTE_CONT_SIZE - 1)) +#define ARM_SMMU_PMD_CONT_SIZE (PMD_SIZE * ARM_SMMU_PGTBL_CONT_ENTRIES) +#define ARM_SMMU_PMD_CONT_MASK (~(ARM_SMMU_PMD_CONT_SIZE - 1)) + +#define ARM_SMMU_PUD_CONT_SIZE (PUD_SIZE * ARM_SMMU_PGTBL_CONT_ENTRIES) +#define ARM_SMMU_PUD_CONT_MASK (~(ARM_SMMU_PUD_CONT_SIZE - 1)) + /* Stage-1 PTE */ #define ARM_SMMU_PTE_AP_UNPRIV (((pteval_t)1) 6) #define ARM_SMMU_PTE_AP_RDONLY (((pteval_t)2) 6) @@ -1272,64 +1278,105 @@ static void arm_smmu_detach_dev(struct iommu_domain *domain, struct device *dev) static bool arm_smmu_pte_is_contiguous_range(unsigned long addr, unsigned long end) { - return !(addr ~ARM_SMMU_PTE_CONT_MASK) + return IS_ALIGNED(addr, ARM_SMMU_PTE_CONT_SIZE) (addr + ARM_SMMU_PTE_CONT_SIZE = end); } -static int arm_smmu_alloc_init_pte(struct arm_smmu_device *smmu, pmd_t *pmd, - unsigned long addr, unsigned long end, - unsigned long pfn, int prot, int stage) +static bool arm_smmu_pmd_is_contiguous_range(unsigned long addr, +unsigned long end) { - pte_t *pte, *start; - pteval_t pteval = ARM_SMMU_PTE_PAGE | ARM_SMMU_PTE_AF | ARM_SMMU_PTE_XN; - - if (pmd_none(*pmd)) { - /* Allocate a new set of tables */ - pgtable_t table = alloc_page(GFP_ATOMIC|__GFP_ZERO); + return IS_ALIGNED(addr, ARM_SMMU_PMD_CONT_SIZE) + (addr + ARM_SMMU_PMD_CONT_SIZE = end); +} - if (!table) - return -ENOMEM; +static bool arm_smmu_pud_is_contiguous_range(unsigned long addr, +unsigned long end) +{ + return IS_ALIGNED(addr, ARM_SMMU_PUD_CONT_SIZE) + (addr + ARM_SMMU_PUD_CONT_SIZE = end); +} - arm_smmu_flush_pgtable(smmu, page_address(table), PAGE_SIZE); - pmd_populate(NULL, pmd, table); - arm_smmu_flush_pgtable(smmu, pmd, sizeof(*pmd)); - } +static pteval_t arm_smmu_page_prot(int prot, int stage, u64 type) +{ + pteval_t pgprot = ARM_SMMU_PTE_AF | ARM_SMMU_PTE_XN | type; if (stage == 1) { - pteval |= ARM_SMMU_PTE_AP_UNPRIV | ARM_SMMU_PTE_nG; + pgprot |= ARM_SMMU_PTE_AP_UNPRIV | ARM_SMMU_PTE_nG; if (!(prot IOMMU_WRITE) (prot IOMMU_READ)) - pteval |= ARM_SMMU_PTE_AP_RDONLY; + pgprot |= ARM_SMMU_PTE_AP_RDONLY; if (prot IOMMU_CACHE) - pteval |= (MAIR_ATTR_IDX_CACHE + pgprot |= (MAIR_ATTR_IDX_CACHE ARM_SMMU_PTE_ATTRINDX_SHIFT); } else { - pteval |= ARM_SMMU_PTE_HAP_FAULT; + pgprot |= ARM_SMMU_PTE_HAP_FAULT; if (prot IOMMU_READ) - pteval |= ARM_SMMU_PTE_HAP_READ; + pgprot |= ARM_SMMU_PTE_HAP_READ; if (prot IOMMU_WRITE) - pteval |= ARM_SMMU_PTE_HAP_WRITE; + pgprot |= ARM_SMMU_PTE_HAP_WRITE; if (prot IOMMU_CACHE) - pteval |= ARM_SMMU_PTE_MEMATTR_OIWB; + pgprot |= ARM_SMMU_PTE_MEMATTR_OIWB; else - pteval |= ARM_SMMU_PTE_MEMATTR_NC; +
Re: [PATCH 5/6] iommu/arm-smmu: support buggy implementations with invalidate-on-map
On Wed, Nov 12 2014 at 10:26:43 AM, Will Deacon will.dea...@arm.com wrote: Hi Mitch, On Wed, Aug 13, 2014 at 01:51:38AM +0100, Mitchel Humpherys wrote: Add a workaround for some buggy hardware that requires a TLB invalidate operation to occur at map time. Activate the feature with the qcom,smmu-invalidate-on-map boolean DT property. I'm digging up an old thread here, but I've been working on a new page-table allocator for the SMMU and looked into implementing this workaround for you in there. When I do the TLBI on map after installing the new PTE, can I just invalidate the range mapped by that PTE, or does it need to be a full TLBI? I'm not totally sure on the history of the hardware errata but I believe it's just the range mapped by that pte. We use SMMU_CBn_TLBIVA in the our smmu driver. However, let's actually just drop this... It's looking like the targets we have that will use the arm-smmu driver thankfully won't need this workaround. Thanks for keeping this in mind though :) -Mitch -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH 05/13] KVM: Update IRTE according to guest interrupt configuration changes
Wu, Feng wrote on 2014-11-13: kvm-ow...@vger.kernel.org wrote on 2014-11-12: k...@vger.kernel.org; iommu@lists.linux-foundation.org; linux-ker...@vger.kernel.org Subject: Re: [PATCH 05/13] KVM: Update IRTE according to guest interrupt configuration changes On 12/11/2014 10:19, Wu, Feng wrote: You can certainly backport these patches to distros that do not have VFIO. But upstream we should work on VFIO first. VFIO has feature parity with legacy device assignment, and adding a new feature that is not in VFIO would be a bad idea. By the way, do you have benchmark results for it? We have not been able to see any performance improvement for APICv on e.g. netperf. Do you mean benchmark results for APICv itself or VT-d Posted-Interrtups? Especially for VT-d posted interrupts---but it'd be great to know which workloads see the biggest speedup from APICv. We have some draft performance data internally, please see the attached. For VT-d PI, I think we can get the biggest performance gain if the VCPU is running in non-root mode for most of the time (not in HLT state), since external interrupt from assigned devices will be delivered by guest directly in this case. That means we can run some cpu intensive workload in the guests. Have you check that the CPU side posted interrupt is taking effect in w/o VT-D PI case? Per my understanding, the performance gap should be so large if you use CPU side posted interrupt. This data more like the VT-d PI vs non PI(both VT-d and CPU). Thanks, Feng Paolo -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Best regards, Yang ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH 05/13] KVM: Update IRTE according to guest interrupt configuration changes
-Original Message- From: Zhang, Yang Z Sent: Thursday, November 13, 2014 9:21 AM To: Wu, Feng; Paolo Bonzini; Alex Williamson Cc: g...@kernel.org; dw...@infradead.org; j...@8bytes.org; t...@linutronix.de; mi...@redhat.com; h...@zytor.com; x...@kernel.org; k...@vger.kernel.org; iommu@lists.linux-foundation.org; linux-ker...@vger.kernel.org Subject: RE: [PATCH 05/13] KVM: Update IRTE according to guest interrupt configuration changes Wu, Feng wrote on 2014-11-13: kvm-ow...@vger.kernel.org wrote on 2014-11-12: k...@vger.kernel.org; iommu@lists.linux-foundation.org; linux-ker...@vger.kernel.org Subject: Re: [PATCH 05/13] KVM: Update IRTE according to guest interrupt configuration changes On 12/11/2014 10:19, Wu, Feng wrote: You can certainly backport these patches to distros that do not have VFIO. But upstream we should work on VFIO first. VFIO has feature parity with legacy device assignment, and adding a new feature that is not in VFIO would be a bad idea. By the way, do you have benchmark results for it? We have not been able to see any performance improvement for APICv on e.g. netperf. Do you mean benchmark results for APICv itself or VT-d Posted-Interrtups? Especially for VT-d posted interrupts---but it'd be great to know which workloads see the biggest speedup from APICv. We have some draft performance data internally, please see the attached. For VT-d PI, I think we can get the biggest performance gain if the VCPU is running in non-root mode for most of the time (not in HLT state), since external interrupt from assigned devices will be delivered by guest directly in this case. That means we can run some cpu intensive workload in the guests. Have you check that the CPU side posted interrupt is taking effect in w/o VT-D PI case? Per my understanding, the performance gap should be so large if you use CPU side posted interrupt. This data more like the VT-d PI vs non PI(both VT-d and CPU). Yes, this data is VT-d PI vs Non VT-d PI. The CPU side APICv mechanism (including CPU side Posted-Interrtups) is enabled. Thanks, Feng Thanks, Feng Paolo -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Best regards, Yang ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH 05/13] KVM: Update IRTE according to guest interrupt configuration changes
Wu, Feng wrote on 2014-11-13: Zhang, Yang Z wrote on 2014-11-13: k...@vger.kernel.org; iommu@lists.linux-foundation.org; linux-ker...@vger.kernel.org Subject: RE: [PATCH 05/13] KVM: Update IRTE according to guest interrupt configuration changes Wu, Feng wrote on 2014-11-13: kvm-ow...@vger.kernel.org wrote on 2014-11-12: k...@vger.kernel.org; iommu@lists.linux-foundation.org; linux-ker...@vger.kernel.org Subject: Re: [PATCH 05/13] KVM: Update IRTE according to guest interrupt configuration changes On 12/11/2014 10:19, Wu, Feng wrote: You can certainly backport these patches to distros that do not have VFIO. But upstream we should work on VFIO first. VFIO has feature parity with legacy device assignment, and adding a new feature that is not in VFIO would be a bad idea. By the way, do you have benchmark results for it? We have not been able to see any performance improvement for APICv on e.g. netperf. Do you mean benchmark results for APICv itself or VT-d Posted-Interrtups? Especially for VT-d posted interrupts---but it'd be great to know which workloads see the biggest speedup from APICv. We have some draft performance data internally, please see the attached. For VT-d PI, I think we can get the biggest performance gain if the VCPU is running in non-root mode for most of the time (not in HLT state), since external interrupt from assigned devices will be delivered by guest directly in this case. That means we can run some cpu intensive workload in the guests. Have you check that the CPU side posted interrupt is taking effect in w/o VT-D PI case? Per my understanding, the performance gap should be so large if you use CPU side posted interrupt. This data more like the VT-d PI vs non PI(both VT-d and CPU). Yes, this data is VT-d PI vs Non VT-d PI. The CPU side APICv mechanism (including CPU side Posted-Interrtups) is enabled. From the CPU utilization data, it seems the environment of APICv is not reasonable to me. with current APICv, the interrupt should not deliver to the PCPU where vcpu is running. Otherwise, it will force the vcpu vmexit and the CPU side posted interrupt cannot take effect. Do you set the interrupt affinity manually? Thanks, Feng Thanks, Feng Paolo -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Best regards, Yang Best regards, Yang ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu