Re: [PATCH 0/5] iommu/vt-d: Fix crash dump failure caused by legacy DMA/IO
On 10/22/14 at 10:22am, Li, Zhen-Hua wrote: Hi Baoquan, I tested it on 3.17, it does not have these faults. There are little differences between this version and Bill's last version. I will test it on 3.18.0-rc1+ on my system and let you know the result. And could you send me the result of lspci -vvv on your system? I have pasted them here. [~]$ lspci -vvv 00:00.0 Host bridge: Intel Corporation Xeon E5/Core i7 DMI2 (rev 07) Subsystem: Hewlett-Packard Company Device 1589 Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast TAbort- TAbort- MAbort- SERR- PERR- INTx- Interrupt: pin A routed to IRQ 0 Capabilities: access denied 00:01.0 PCI bridge: Intel Corporation Xeon E5/Core i7 IIO PCI Express Root Port 1a (rev 07) (prog-if 00 [Normal decode]) Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast TAbort- TAbort- MAbort- SERR- PERR- INTx- Latency: 0, Cache Line Size: 64 bytes Bus: primary=00, secondary=03, subordinate=03, sec-latency=0 I/O behind bridge: f000-0fff Memory behind bridge: fff0-000f Prefetchable memory behind bridge: fff0-000f Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast TAbort- TAbort- MAbort- SERR- PERR- BridgeCtl: Parity+ SERR+ NoISA- VGA- MAbort- Reset- FastB2B- PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn- Capabilities: access denied Kernel driver in use: pcieport 00:02.0 PCI bridge: Intel Corporation Xeon E5/Core i7 IIO PCI Express Root Port 2a (rev 07) (prog-if 00 [Normal decode]) Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast TAbort- TAbort- MAbort- SERR- PERR- INTx- Latency: 0, Cache Line Size: 64 bytes Bus: primary=00, secondary=05, subordinate=05, sec-latency=0 I/O behind bridge: d000-dfff Memory behind bridge: d600-d70f Prefetchable memory behind bridge: d800-ddff Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast TAbort- TAbort- MAbort+ SERR- PERR- BridgeCtl: Parity+ SERR+ NoISA- VGA+ MAbort- Reset- FastB2B- PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn- Capabilities: access denied Kernel driver in use: pcieport 00:03.0 PCI bridge: Intel Corporation Xeon E5/Core i7 IIO PCI Express Root Port 3a in PCI Express Mode (rev 07) (prog-if 00 [Normal decode]) Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast TAbort- TAbort- MAbort- SERR- PERR- INTx- Latency: 0, Cache Line Size: 64 bytes Bus: primary=00, secondary=04, subordinate=04, sec-latency=0 I/O behind bridge: f000-0fff Memory behind bridge: fff0-000f Prefetchable memory behind bridge: fff0-000f Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast TAbort- TAbort- MAbort- SERR- PERR- BridgeCtl: Parity+ SERR+ NoISA- VGA- MAbort- Reset- FastB2B- PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn- Capabilities: access denied Kernel driver in use: pcieport 00:05.0 System peripheral: Intel Corporation Xeon E5/Core i7 Address Map, VTd_Misc, System Management (rev 07) Subsystem: Hewlett-Packard Company Device 1589 Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast TAbort- TAbort- MAbort- SERR- PERR- INTx- Capabilities: access denied 00:05.2 System peripheral: Intel Corporation Xeon E5/Core i7 Control Status and Global Errors (rev 07) Subsystem: Hewlett-Packard Company Device 1589 Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast TAbort- TAbort- MAbort- SERR- PERR- INTx- Capabilities: access denied 00:05.4 PIC: Intel Corporation Xeon E5/Core i7 I/O APIC (rev 07) (prog-if 20 [IO(X)-APIC]) Subsystem: Intel Corporation Xeon E5/Core i7 I/O APIC Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast TAbort- TAbort- MAbort- SERR- PERR- INTx- Latency: 0, Cache Line Size: 64 bytes Region 0: Memory at d7346000 (32-bit, non-prefetchable) [size=4K] Capabilities: access denied 00:11.0 PCI bridge: Intel Corporation
Re: [PATCH 0/8] iommu/vt-d: Fix crash dump failure caused by legacy DMA/IO
Hi Bjorn, On Tue, Oct 21, 2014 at 08:16:46PM -0600, Bjorn Helgaas wrote: I was looking at Zhen-Hua's recent patches, trying to figure out if I need to do anything with them. Resetting devices in the old kernel seems like a non-starter. Resetting devices in the new kernel, ..., well, maybe. It seems ugly, and it seems like the sort of problem that IOMMUs are designed to solve. Actually resetting the devices in the kdump kernel would be one of the better solutions for this problem. When we have a generic way to stop all in-flight DMA from the PCI endpoints we could safely disable and then re-enable the IOMMU. On Wed, Jul 2, 2014 at 7:32 AM, Joerg Roedel j...@8bytes.org wrote: That is a solution to prevent the in-flight DMA failures. But what happens when there is some in-flight DMA to a disk to write some inodes or a new superblock. Then this scratch address-space may cause filesystem corruption at worst. This in-flight DMA is from a device programmed by the old kernel, and it would be reading data from the old kernel's buffers. I think you're suggesting that we might want that DMA read to complete so the device can update filesystem metadata? Well, it is not about updating filesystem metadata. In the kdump kernel we have these options: 1) Disable the IOMMU. Problem here is, that DMA is now untranslated, so that any in-flight DMA might read or write from a random location in memory, corrupting the kdump or even the new kexec kernel memory. So this is a non-starter. 2) Re-program the IOMMU to block all DMA. This is safer as it does not corrupt any data in memory. But some devices react very poorly on a master abort from the IOMMU, so bad that the driver in the kdump kernel fails to initialize that device. In this case taking dump itself might fail (and often does, according to reports) 3) To prevent master aborts like in option (2), David suggested to map the whole DMA address space to a scratch page. This also prevents system memory corruption and the master aborts. But the problem is, that in-flight DMA will now read all zeros. This can corrupt disk and network data, at worst it nulls out the superblocks of your filesystem and you lose all data. So this is not an option too. What we currently do in the VT-d driver is a mixture of (1) and (2). The VT-d driver disables the IOMMU hardware (opening a race window for memory data corruption), re-initializes it to reject any ongoing DMA (which causes master aborts for in-flight DMA) and re-enables the IOMMU hardware. But this strategy fails in heavy IO environments quite often and we look into ways to solve the problem, or at least improve the current situation as good as we can. I talked to David about this at LPC and we came up with basically this procedure: 1. If the VT-d driver finds the IOMMU enabled, it reuses its root-context table. This way the IOMMU must not be disabled and re-enabled, eliminating the race we have now. Other data structures like the context-entries are copied over from the old kernel. We basically keep all mappings from the old kernel, allowing any in-flight DMA to succeed. No memory or disk data corruption. The issue here is, that we modify data from the old kernel which is about to be dumped. But there are ways to work around that. 2. When a device driver issues the first dma_map command for a device, we assign a new and empty page-table, thus removing all mappings from the old kernel for the device. Rationale is, that at this point the device driver should have reset the device to a point where all in-flight DMA is canceled. This approach goes into the same direction as Bill Sumners patch-set, which Li took over. But it goes not as far as keeping the old mappings while the kdump kernel is still working with the devices (which might introduce new issues and corner cases). So with this in mind I would prefer initially taking over the page-tables from the old kernel before the device drivers re-initialize the devices. This makes the dump kernel more dependent on data from the old kernel, which we obviously want to avoid when possible. Sure, but this is not really possible here (unless we have a generic and reliable way to reset all PCI endpoint devices and cancel all in-flight DMA before we disable the IOMMU in the kdump kernel). Otherwise we always risk data corruption somewhere, in system memory or on disk. I didn't find the previous discussion where pointing every virtual bus address at the same physical scratch page was proposed. Why was that better than programming the IOMMU to reject every DMA? As I said, the problem is that this
Re: [PATCH 12/17] iommu/omap: Integrate omap-iommu-debug into omap-iommu
On Tue, Oct 21, 2014 at 04:28:27PM -0500, Suman Anna wrote: I am looking to refresh this series for 3.19, and this is the only patch that may need some changes. Please let me know what your preference is, and I can rework this patch if needed. Either way, the plan is to not have an OMAP IOMMU debugfs module, but only to have an option not to build the debugfs portions. Can I apply this series without this patch and you send me an updated version of only this patch? If yes, please let me know, otherwise please rebase the series on v3.18-rc1 and resend. Thanks, Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 02/18] vfio: platform: probe to devices on the platform bus
On Tue, Oct 21, 2014 at 6:37 PM, Eric Auger eric.au...@linaro.org wrote: On 10/21/2014 06:17 PM, Alex Williamson wrote: On Mon, 2014-10-13 at 15:10 +0200, Antonios Motakis wrote: Driver to bind to Linux platform devices, and callbacks to discover their resources to be used by the main VFIO PLATFORM code. Signed-off-by: Antonios Motakis a.mota...@virtualopensystems.com --- drivers/vfio/platform/vfio_platform.c | 107 ++ include/uapi/linux/vfio.h | 1 + 2 files changed, 108 insertions(+) create mode 100644 drivers/vfio/platform/vfio_platform.c diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c new file mode 100644 index 000..baeb7da --- /dev/null +++ b/drivers/vfio/platform/vfio_platform.c @@ -0,0 +1,107 @@ +/* + * Copyright (C) 2013 - Virtual Open Systems + * Author: Antonios Motakis a.mota...@virtualopensystems.com + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License, version 2, as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + */ + +#include linux/device.h +#include linux/eventfd.h +#include linux/interrupt.h +#include linux/iommu.h +#include linux/module.h +#include linux/mutex.h +#include linux/notifier.h +#include linux/pm_runtime.h +#include linux/slab.h +#include linux/types.h +#include linux/uaccess.h +#include linux/vfio.h +#include linux/io.h +#include linux/platform_device.h +#include linux/irq.h + +#include vfio_platform_private.h + +#define DRIVER_VERSION 0.8 +#define DRIVER_AUTHOR Antonios Motakis a.mota...@virtualopensystems.com +#define DRIVER_DESC VFIO for platform devices - User Level meta-driver + +/* probing devices from the linux platform bus */ + +static struct resource *get_platform_resource(struct vfio_platform_device *vdev, +int num) +{ +struct platform_device *dev = (struct platform_device *) vdev-opaque; +int i; + +for (i = 0; i dev-num_resources; i++) { +struct resource *r = dev-resource[i]; + +if (resource_type(r) (IORESOURCE_MEM|IORESOURCE_IO)) { +num--; + +if (!num) +return r; Has this been tested? What happens when we call this with num = 0? Yep. I confirm I enter that case with my xgmac where the IORESOURCE_MEM is the 1st resource. I Just ended to the same cause ;-) Right, num-- should be after the check, or we miss one region. Best Regards Eric +} +} +return NULL; +} + +static int get_platform_irq(struct vfio_platform_device *vdev, int i) +{ +struct platform_device *pdev = (struct platform_device *) vdev-opaque; + +return platform_get_irq(pdev, i); +} + + +static int vfio_platform_probe(struct platform_device *pdev) +{ +struct vfio_platform_device *vdev; +int ret; + +vdev = kzalloc(sizeof(*vdev), GFP_KERNEL); +if (!vdev) +return -ENOMEM; + +vdev-opaque = (void *) pdev; +vdev-name = pdev-name; +vdev-flags = VFIO_DEVICE_FLAGS_PLATFORM; +vdev-get_resource = get_platform_resource; +vdev-get_irq = get_platform_irq; + +ret = vfio_platform_probe_common(vdev, pdev-dev); +if (ret) +kfree(vdev); + +return ret; +} + +static int vfio_platform_remove(struct platform_device *pdev) +{ +return vfio_platform_remove_common(pdev-dev); +} + +static struct platform_driver vfio_platform_driver = { +.probe = vfio_platform_probe, +.remove = vfio_platform_remove, +.driver = { +.name = vfio-platform, +.owner = THIS_MODULE, +}, +}; + +module_platform_driver(vfio_platform_driver); + +MODULE_VERSION(DRIVER_VERSION); +MODULE_LICENSE(GPL v2); +MODULE_AUTHOR(DRIVER_AUTHOR); +MODULE_DESCRIPTION(DRIVER_DESC); diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h index 111b5e8..aca6d3e 100644 --- a/include/uapi/linux/vfio.h +++ b/include/uapi/linux/vfio.h @@ -158,6 +158,7 @@ struct vfio_device_info { __u32 flags; #define VFIO_DEVICE_FLAGS_RESET (1 0)/* Device supports reset */ #define VFIO_DEVICE_FLAGS_PCI (1 1)/* vfio-pci device */ +#define VFIO_DEVICE_FLAGS_PLATFORM (1 2) /* vfio-platform device */ __u32 num_regions;/* Max region index + 1 */ __u32 num_irqs; /* Max IRQ index + 1 */ }; -- Antonios Motakis Virtual Open Systems ___ iommu mailing list iommu@lists.linux-foundation.org
Re: [PATCH v8 07/18] vfio/platform: return info for device memory mapped IO regions
On Tue, Oct 21, 2014 at 6:34 PM, Alex Williamson alex.william...@redhat.com wrote: On Mon, 2014-10-13 at 15:10 +0200, Antonios Motakis wrote: This patch enables the IOCTLs VFIO_DEVICE_GET_REGION_INFO ioctl call, which allows the user to learn about the available MMIO resources of a device. Signed-off-by: Antonios Motakis a.mota...@virtualopensystems.com --- drivers/vfio/platform/vfio_platform_common.c | 93 +-- drivers/vfio/platform/vfio_platform_private.h | 22 +++ 2 files changed, 111 insertions(+), 4 deletions(-) diff --git a/drivers/vfio/platform/vfio_platform_common.c b/drivers/vfio/platform/vfio_platform_common.c index 1e4073f..8a7e474 100644 --- a/drivers/vfio/platform/vfio_platform_common.c +++ b/drivers/vfio/platform/vfio_platform_common.c @@ -27,17 +27,84 @@ #include vfio_platform_private.h +static int vfio_platform_regions_init(struct vfio_platform_device *vdev) +{ + int cnt = 0, i; + + while (vdev-get_resource(vdev, cnt)) + cnt++; + + vdev-regions = kcalloc(cnt, sizeof(struct vfio_platform_region), + GFP_KERNEL); + if (!vdev-regions) + return -ENOMEM; + + for (i = 0; i cnt; i++) { + struct resource *res = + vdev-get_resource(vdev, i); + + if (!res) + goto err; + + vdev-regions[i].addr = res-start; + vdev-regions[i].size = resource_size(res); + vdev-regions[i].flags = 0; + + switch (resource_type(res)) { + case IORESOURCE_MEM: + vdev-regions[i].type = VFIO_PLATFORM_REGION_TYPE_MMIO; + break; + case IORESOURCE_IO: + vdev-regions[i].type = VFIO_PLATFORM_REGION_TYPE_PIO; + break; Ok, we have support for PIO in platform now (thanks!), does the user know what type of region they're dealing with? Do they care? For PCI the user tests the PCI BAR in config space to determine which type it is. I'm guessing that platform would do something similar against the device tree or ACPI, right? Maybe is worthwhile to add an explicit flag in VFIO_DEVICE_GET_REGION_INFO for PIO regions. For platform devices I don't know if we can always rely on DT or ACPI info to be available. For VFIO PCI the BAR is always implemented, and while I have proposed an RFC to return DT information, I don't think we can assume how a device is described in the host, whether DT, ACPI, or dark magic. + default: + goto err; + } + } + + vdev-num_regions = cnt; + + return 0; +err: + kfree(vdev-regions); + return -EINVAL; +} + +static void vfio_platform_regions_cleanup(struct vfio_platform_device *vdev) +{ + vdev-num_regions = 0; + kfree(vdev-regions); +} + static void vfio_platform_release(void *device_data) { + struct vfio_platform_device *vdev = device_data; + + if (atomic_dec_and_test(vdev-refcnt)) + vfio_platform_regions_cleanup(vdev); + module_put(THIS_MODULE); } static int vfio_platform_open(void *device_data) { + struct vfio_platform_device *vdev = device_data; + int ret; + if (!try_module_get(THIS_MODULE)) return -ENODEV; + if (atomic_inc_return(vdev-refcnt) == 1) { + ret = vfio_platform_regions_init(vdev); + if (ret) + goto err_reg; + } + return 0; + +err_reg: + module_put(THIS_MODULE); + return ret; Note that if vfio_platform_regions_init() fails then your refcnt is wrong. We switched to a mutex in vfio-pci to avoid this. See 61d792562b53. Ack. } static long vfio_platform_ioctl(void *device_data, @@ -58,15 +125,33 @@ static long vfio_platform_ioctl(void *device_data, return -EINVAL; info.flags = vdev-flags; - info.num_regions = 0; + info.num_regions = vdev-num_regions; info.num_irqs = 0; return copy_to_user((void __user *)arg, info, minsz); - } else if (cmd == VFIO_DEVICE_GET_REGION_INFO) - return -EINVAL; + } else if (cmd == VFIO_DEVICE_GET_REGION_INFO) { + struct vfio_region_info info; + + minsz = offsetofend(struct vfio_region_info, offset); + + if (copy_from_user(info, (void __user *)arg, minsz)) + return -EFAULT; + + if (info.argsz minsz) + return -EINVAL; + + if (info.index = vdev-num_regions) + return -EINVAL; + + /* map offset to the physical address */ + info.offset = VFIO_PLATFORM_INDEX_TO_OFFSET(info.index); + info.size = vdev-regions[info.index].size; + info.flags =
Re: [PATCH v8 09/18] vfio/platform: support MMAP of MMIO regions
On Tue, Oct 21, 2014 at 6:51 PM, Alex Williamson alex.william...@redhat.com wrote: On Mon, 2014-10-13 at 15:10 +0200, Antonios Motakis wrote: Allow to memory map the MMIO regions of the device so userspace can directly access them. PIO regions are not being handled at this point. Signed-off-by: Antonios Motakis a.mota...@virtualopensystems.com --- drivers/vfio/platform/vfio_platform_common.c | 57 1 file changed, 57 insertions(+) diff --git a/drivers/vfio/platform/vfio_platform_common.c b/drivers/vfio/platform/vfio_platform_common.c index ac74710..4db7187 100644 --- a/drivers/vfio/platform/vfio_platform_common.c +++ b/drivers/vfio/platform/vfio_platform_common.c @@ -57,6 +57,16 @@ static int vfio_platform_regions_init(struct vfio_platform_device *vdev) if (!(res-flags IORESOURCE_READONLY)) vdev-regions[i].flags |= VFIO_REGION_INFO_FLAG_WRITE; + + /* + * Only regions addressed with PAGE granularity may be + * MMAPed securely. + */ + if (!(vdev-regions[i].addr ~PAGE_MASK) + !(vdev-regions[i].size ~PAGE_MASK)) + vdev-regions[i].flags |= + VFIO_REGION_INFO_FLAG_MMAP; + Should this be included in the above !readonly test? I don't see that we're doing anything below that would prevent writes to the mmap for a readonly resource. I suspect that just like PCI, it's not all that useful to provide mmap support for read-only regions. They're not typically performance paths. Indeed. Alternatively we could just MMAP the region as read only as well. Even if typically they are not performance critical areas, maybe it doesn't hurt to allow MMAP when possible. break; case IORESOURCE_IO: vdev-regions[i].type = VFIO_PLATFORM_REGION_TYPE_PIO; @@ -325,8 +335,55 @@ static ssize_t vfio_platform_write(void *device_data, const char __user *buf, return -EINVAL; } +static int vfio_platform_mmap_mmio(struct vfio_platform_region region, +struct vm_area_struct *vma) +{ + u64 req_len, pgoff, req_start; + + req_len = vma-vm_end - vma-vm_start; + pgoff = vma-vm_pgoff + ((1U (VFIO_PLATFORM_OFFSET_SHIFT - PAGE_SHIFT)) - 1); + req_start = pgoff PAGE_SHIFT; + + if (region.size PAGE_SIZE || req_start + req_len region.size) + return -EINVAL; + + vma-vm_page_prot = pgprot_noncached(vma-vm_page_prot); + vma-vm_pgoff = (region.addr PAGE_SHIFT) + pgoff; + + return remap_pfn_range(vma, vma-vm_start, vma-vm_pgoff, +req_len, vma-vm_page_prot); +} + static int vfio_platform_mmap(void *device_data, struct vm_area_struct *vma) { + struct vfio_platform_device *vdev = device_data; + unsigned int index; + + index = vma-vm_pgoff (VFIO_PLATFORM_OFFSET_SHIFT - PAGE_SHIFT); + + if (vma-vm_end vma-vm_start) + return -EINVAL; + if ((vma-vm_flags VM_SHARED) == 0) + return -EINVAL; + if (index = vdev-num_regions) + return -EINVAL; + if (vma-vm_start ~PAGE_MASK) + return -EINVAL; + if (vma-vm_end ~PAGE_MASK) + return -EINVAL; + + if (!(vdev-regions[index].flags VFIO_REGION_INFO_FLAG_MMAP)) + return -EINVAL; + + vma-vm_private_data = vdev; + + if (vdev-regions[index].type VFIO_PLATFORM_REGION_TYPE_MMIO) + return vfio_platform_mmap_mmio(vdev-regions[index], vma); + + else if (vdev-regions[index].type VFIO_PLATFORM_REGION_TYPE_PIO) + return -EINVAL; /* not implemented */ + return -EINVAL; } -- Antonios Motakis Virtual Open Systems ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 13/18] vfio/platform: support for maskable and automasked interrupts
On Tue, Oct 21, 2014 at 7:47 PM, Alex Williamson alex.william...@redhat.com wrote: On Mon, 2014-10-13 at 15:10 +0200, Antonios Motakis wrote: Adds support to mask interrupts, and also for automasked interrupts. Level sensitive interrupts are exposed as automasked interrupts and are masked and disabled automatically when they fire. Signed-off-by: Antonios Motakis a.mota...@virtualopensystems.com --- drivers/vfio/platform/vfio_platform_irq.c | 94 +-- drivers/vfio/platform/vfio_platform_private.h | 2 + 2 files changed, 91 insertions(+), 5 deletions(-) diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c index 4359b9c..7620a17 100644 --- a/drivers/vfio/platform/vfio_platform_irq.c +++ b/drivers/vfio/platform/vfio_platform_irq.c @@ -31,27 +31,103 @@ #include vfio_platform_private.h +static void vfio_platform_mask(struct vfio_platform_irq *irq_ctx) +{ + unsigned long flags; + + spin_lock_irqsave(irq_ctx-lock, flags); + + if (!irq_ctx-masked) { + disable_irq(irq_ctx-hwirq); + irq_ctx-masked = true; + } + + spin_unlock_irqrestore(irq_ctx-lock, flags); +} + static int vfio_platform_set_irq_mask(struct vfio_platform_device *vdev, unsigned index, unsigned start, unsigned count, uint32_t flags, void *data) { - return -EINVAL; + if (start != 0 || count != 1) + return -EINVAL; + + if (flags VFIO_IRQ_SET_DATA_EVENTFD) + return -EINVAL; /* not implemented yet */ + + if (flags VFIO_IRQ_SET_DATA_NONE) { + vfio_platform_mask(vdev-irqs[index]); + + } else if (flags VFIO_IRQ_SET_DATA_BOOL) { + uint8_t mask = *(uint8_t *)data; + + if (mask) + vfio_platform_mask(vdev-irqs[index]); + } + + return 0; +} + +static void vfio_platform_unmask(struct vfio_platform_irq *irq_ctx) +{ + unsigned long flags; + + spin_lock_irqsave(irq_ctx-lock, flags); + + if (irq_ctx-masked) { + enable_irq(irq_ctx-hwirq); + irq_ctx-masked = false; + } + + spin_unlock_irqrestore(irq_ctx-lock, flags); } static int vfio_platform_set_irq_unmask(struct vfio_platform_device *vdev, unsigned index, unsigned start, unsigned count, uint32_t flags, void *data) { - return -EINVAL; + if (start != 0 || count != 1) + return -EINVAL; + + if (flags VFIO_IRQ_SET_DATA_EVENTFD) + return -EINVAL; /* not implemented yet */ + + if (flags VFIO_IRQ_SET_DATA_NONE) { + vfio_platform_unmask(vdev-irqs[index]); + + } else if (flags VFIO_IRQ_SET_DATA_BOOL) { + uint8_t unmask = *(uint8_t *)data; + + if (unmask) + vfio_platform_unmask(vdev-irqs[index]); + } + + return 0; } static irqreturn_t vfio_irq_handler(int irq, void *dev_id) { struct vfio_platform_irq *irq_ctx = dev_id; + unsigned long flags; + int ret = IRQ_NONE; - eventfd_signal(irq_ctx-trigger, 1); + spin_lock_irqsave(irq_ctx-lock, flags); + + if (!irq_ctx-masked) { + ret = IRQ_HANDLED; + + if (irq_ctx-flags VFIO_IRQ_INFO_AUTOMASKED) { + disable_irq_nosync(irq_ctx-hwirq); + irq_ctx-masked = true; + } + } - return IRQ_HANDLED; + spin_unlock_irqrestore(irq_ctx-lock, flags); + + if (ret == IRQ_HANDLED) + eventfd_signal(irq_ctx-trigger, 1); + + return ret; } If you actually have edge interrupts, you're unnecessarily penalizing them with the spinlock here. You could do like vfio-pci and only advertise level interrupts as maskable then use separate edge vs level handlers. Ok, I shall implement separate handlers then. static int vfio_set_trigger(struct vfio_platform_device *vdev, @@ -169,9 +245,17 @@ int vfio_platform_irq_init(struct vfio_platform_device *vdev) if (hwirq 0) goto err; - vdev-irqs[i].flags = VFIO_IRQ_INFO_EVENTFD; + spin_lock_init(vdev-irqs[i].lock); + + vdev-irqs[i].flags = VFIO_IRQ_INFO_EVENTFD + | VFIO_IRQ_INFO_MASKABLE; + + if (irq_get_trigger_type(hwirq) IRQ_TYPE_LEVEL_MASK) + vdev-irqs[i].flags |= VFIO_IRQ_INFO_AUTOMASKED; + vdev-irqs[i].count = 1; vdev-irqs[i].hwirq = hwirq; + vdev-irqs[i].masked = false; } vdev-num_irqs = cnt; diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h index 47af6e0..65e80e7 100644 ---
Re: [PATCH v8 14/18] vfio: move eventfd support code for VFIO_PCI to a separate file
On Tue, Oct 21, 2014 at 7:55 PM, Alex Williamson alex.william...@redhat.com wrote: On Mon, 2014-10-13 at 15:10 +0200, Antonios Motakis wrote: The virqfd functionality that is used by VFIO_PCI to implement interrupt masking and unmasking via an eventfd, is generic enough and can be reused by another driver. Move it to a separate file in order to allow the code to be shared. Also properly export virqfd_enable and virqfd_disable in the process. Let's be friendly to the global namespace and add vfio_ prefixes to these as we export them. Ack. -- Antonios Motakis Virtual Open Systems ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 3/5] vfio: type1: replace domain wide protection flags with supported capabilities
On Wed, Oct 22, 2014 at 11:08 AM, Eric Auger eric.au...@linaro.org wrote: On 10/13/2014 03:09 PM, Antonios Motakis wrote: VFIO_IOMMU_TYPE1 keeps track for each domain it knows a list of protection flags it always applies to all mappings in the domain. This is used for domains that support IOMMU_CAP_CACHE_COHERENCY. Refactor this slightly, by keeping track instead that a given domain supports the capability, and applying the IOMMU_CACHE protection flag when doing the actual DMA mappings. This will allow us to reuse the behavior for IOMMU_CAP_NOEXEC, which we also want to keep track of, but without applying it to all domains that support it unless the user explicitly requests it. Signed-off-by: Antonios Motakis a.mota...@virtualopensystems.com --- drivers/vfio/vfio_iommu_type1.c | 25 + 1 file changed, 17 insertions(+), 8 deletions(-) diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c index 562f686..62a8b4d 100644 --- a/drivers/vfio/vfio_iommu_type1.c +++ b/drivers/vfio/vfio_iommu_type1.c @@ -64,7 +64,7 @@ struct vfio_domain { struct iommu_domain *domain; struct list_headnext; struct list_headgroup_list; - int prot; /* IOMMU_CACHE */ + int caps; }; struct vfio_dma { @@ -485,7 +485,7 @@ static int map_try_harder(struct vfio_domain *domain, dma_addr_t iova, for (i = 0; i npage; i++, pfn++, iova += PAGE_SIZE) { ret = iommu_map(domain-domain, iova, (phys_addr_t)pfn PAGE_SHIFT, - PAGE_SIZE, prot | domain-prot); + PAGE_SIZE, prot); if (ret) break; } @@ -503,11 +503,16 @@ static int vfio_iommu_map(struct vfio_iommu *iommu, dma_addr_t iova, int ret; list_for_each_entry(d, iommu-domain_list, next) { + int dprot = prot; + + if (d-caps | IOMMU_CAP_CACHE_COHERENCY) should be + dprot |= IOMMU_CACHE; + ret = iommu_map(d-domain, iova, (phys_addr_t)pfn PAGE_SHIFT, - npage PAGE_SHIFT, prot | d-prot); + npage PAGE_SHIFT, dprot); if (ret) { if (ret != -EBUSY || - map_try_harder(d, iova, pfn, npage, prot)) + map_try_harder(d, iova, pfn, npage, dprot)) goto unwind; } } @@ -620,6 +625,10 @@ static int vfio_iommu_replay(struct vfio_iommu *iommu, struct vfio_domain *d; struct rb_node *n; int ret; + int dprot = 0; + + if (domain-caps | IOMMU_CAP_CACHE_COHERENCY) same to be fixed here. With the 3 | corrections and num-- fix in get_platform_resource, v8 is functional with Calxeda xgmac QEMU VFIO device. Ack. Best Regards Eric + dprot |= IOMMU_CACHE; /* Arbitrarily pick the first domain in the list for lookups */ d = list_first_entry(iommu-domain_list, struct vfio_domain, next); @@ -653,7 +662,7 @@ static int vfio_iommu_replay(struct vfio_iommu *iommu, size += PAGE_SIZE; ret = iommu_map(domain-domain, iova, phys, - size, dma-prot | domain-prot); + size, dma-prot | dprot); if (ret) return ret; @@ -721,7 +730,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data, } if (iommu_capable(bus, IOMMU_CAP_CACHE_COHERENCY)) - domain-prot |= IOMMU_CACHE; + domain-caps |= IOMMU_CAP_CACHE_COHERENCY; /* * Try to match an existing compatible domain. We don't want to @@ -732,7 +741,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data, */ list_for_each_entry(d, iommu-domain_list, next) { if (d-domain-ops == domain-domain-ops - d-prot == domain-prot) { + d-caps == domain-caps) { iommu_detach_group(domain-domain, iommu_group); if (!iommu_attach_group(d-domain, iommu_group)) { list_add(group-next, d-group_list); @@ -865,7 +874,7 @@ static int vfio_domains_have_iommu_cache(struct vfio_iommu *iommu) mutex_lock(iommu-lock); list_for_each_entry(domain, iommu-domain_list, next) { - if (!(domain-prot IOMMU_CACHE)) { + if (!(domain-caps IOMMU_CAP_CACHE_COHERENCY)) { ret = 0; break; } -- Antonios Motakis Virtual Open Systems ___ iommu mailing list iommu@lists.linux-foundation.org
Re: [PATCH v2 11/18] iommu: exynos: remove useless device_add/remove callbacks
Le mardi 16 septembre 2014 à 13:54 +0200, Marek Szyprowski a écrit : The driver doesn't need to do anything important in device add/remove callbacks, because initialization will be done from device-tree specific callbacks added later. IOMMU groups created by current code were never used. Signed-off-by: Marek Szyprowski m.szyprow...@samsung.com The exyons iommu init fails if those are removed, that is it never reach init_done: 1. exynos_iommu_setup 2. - exynos_iommu_init 3. ---bus_set_iommu 4. -- add_iommu_group that is (4) add_iommu_group returns ENODEV to bus_set_iommu, the latter doing so to exynos_iommu_init. Which thus error out before the init_done is set to true. Mind restoring device_add/remove is not that easy. Either I did not and modified add_iommu_group to return 0 if add_device was not defined. This works. Otherwise I did revert this patch but had to also move the iommu_init from iommu.c before of_iommu_init in arch/arm/kernel/setup.c (switching iommu_init to global namespace in the process). This to avoid use of not yet initialized mutex iommu_group_mutex and crash that follow suit. This mutex is used in iommu_group_add_device called by exynos_iommu_add_device. The logs of the first issue, ie exynos_iommo_init and bus_set_iommu returns ENODEV [0.602816] exynos_iommu_init: Failed to register exynos-iommu driver. [0.603358] platform 1362.sysmmu: device is not dma coherent [0.603384] platform 1362.sysmmu: device is not behind an iommu [0.605243] exynos-sysmmu 1362.sysmmu: genpd_dev_pm_attach() failed to find PM domain: -2 [0.606399] exynos_iommu_init: Failed to register exynos-iommu driver. [0.607263] platform 1363.sysmmu: device is not dma coherent [0.607290] platform 1363.sysmmu: device is not behind an iommu [0.609111] exynos-sysmmu 1362.sysmmu: genpd_dev_pm_attach() failed to find PM domain: -2 [0.609464] exynos-sysmmu 1362.sysmmu: Unbalanced pm_runtime_enable! [0.609868] exynos-sysmmu 1363.sysmmu: genpd_dev_pm_attach() failed to find PM domain: -2 (...) [0.946506] platform 1080.g2d: device is not dma coherent [0.946553] platform 1080.g2d: device is not behind an iommu (...) Mind the power domain failure comes from me not finding a way to apply the power domain registration this early yet . Best regards, Alban Broweys --- drivers/iommu/exynos-iommu.c | 28 1 file changed, 28 deletions(-) diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c index b271348a4ec1..1b3f00726cd4 100644 --- a/drivers/iommu/exynos-iommu.c +++ b/drivers/iommu/exynos-iommu.c @@ -1055,32 +1055,6 @@ static phys_addr_t exynos_iommu_iova_to_phys(struct iommu_domain *domain, return phys; } -static int exynos_iommu_add_device(struct device *dev) -{ - struct iommu_group *group; - int ret; - - group = iommu_group_get(dev); - - if (!group) { - group = iommu_group_alloc(); - if (IS_ERR(group)) { - dev_err(dev, Failed to allocate IOMMU group\n); - return PTR_ERR(group); - } - } - - ret = iommu_group_add_device(group, dev); - iommu_group_put(group); - - return ret; -} - -static void exynos_iommu_remove_device(struct device *dev) -{ - iommu_group_remove_device(dev); -} - static const struct iommu_ops exynos_iommu_ops = { .domain_init = exynos_iommu_domain_init, .domain_destroy = exynos_iommu_domain_destroy, @@ -1089,8 +1063,6 @@ static const struct iommu_ops exynos_iommu_ops = { .map = exynos_iommu_map, .unmap = exynos_iommu_unmap, .iova_to_phys = exynos_iommu_iova_to_phys, - .add_device = exynos_iommu_add_device, - .remove_device = exynos_iommu_remove_device, .pgsize_bitmap = SECT_SIZE | LPAGE_SIZE | SPAGE_SIZE, }; ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH linux-next] iommu: add iommu for s390 platform
Add a basic iommu for the s390 platform. The code is pretty simple since on s390 each PCI device has its own virtual io address space starting at the same vio address. For this a domain could hold only one pci device. Also there is no relation between pci devices so each device belongs to a separate iommu group. Signed-off-by: Frank Blaschka blasc...@linux.vnet.ibm.com --- MAINTAINERS | 8 ++ arch/s390/include/asm/pci.h | 3 + arch/s390/pci/pci_dma.c | 19 - drivers/iommu/Kconfig | 9 +++ drivers/iommu/Makefile | 1 + drivers/iommu/s390-iommu.c | 179 6 files changed, 217 insertions(+), 2 deletions(-) create mode 100644 drivers/iommu/s390-iommu.c diff --git a/MAINTAINERS b/MAINTAINERS index bc69ca4..a3ba11b 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -7935,6 +7935,14 @@ F: drivers/s390/net/*iucv* F: include/net/iucv/ F: net/iucv/ +S390 IOMMU +M: Frank Blaschka blasc...@linux.vnet.ibm.com +M: linux...@de.ibm.com +L: linux-s...@vger.kernel.org +W: http://www.ibm.com/developerworks/linux/linux390/ +S: Supported +F: drivers/iommu/s390-iommu.c + S3C24XX SD/MMC Driver M: Ben Dooks ben-li...@fluff.org L: linux-arm-ker...@lists.infradead.org (moderated for non-subscribers) diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h index c030900..6790d0d 100644 --- a/arch/s390/include/asm/pci.h +++ b/arch/s390/include/asm/pci.h @@ -177,6 +177,9 @@ struct zpci_dev *get_zdev_by_fid(u32); /* DMA */ int zpci_dma_init(void); void zpci_dma_exit(void); +int dma_update_trans(struct zpci_dev *zdev, unsigned long pa, +dma_addr_t dma_addr, size_t size, int flags); +void dma_purge_rto_entries(struct zpci_dev *zdev); /* FMB */ int zpci_fmb_enable_device(struct zpci_dev *); diff --git a/arch/s390/pci/pci_dma.c b/arch/s390/pci/pci_dma.c index 4cbb29a..a4db33e 100644 --- a/arch/s390/pci/pci_dma.c +++ b/arch/s390/pci/pci_dma.c @@ -139,8 +139,8 @@ static void dma_update_cpu_trans(struct zpci_dev *zdev, void *page_addr, entry_clr_protected(entry); } -static int dma_update_trans(struct zpci_dev *zdev, unsigned long pa, - dma_addr_t dma_addr, size_t size, int flags) +int dma_update_trans(struct zpci_dev *zdev, unsigned long pa, +dma_addr_t dma_addr, size_t size, int flags) { unsigned int nr_pages = PAGE_ALIGN(size) PAGE_SHIFT; u8 *page_addr = (u8 *) (pa PAGE_MASK); @@ -210,6 +210,21 @@ static void dma_cleanup_tables(struct zpci_dev *zdev) zdev-dma_table = NULL; } +void dma_purge_rto_entries(struct zpci_dev *zdev) +{ + unsigned long *table; + int rtx; + + if (!zdev || !zdev-dma_table) + return; + table = zdev-dma_table; + for (rtx = 0; rtx ZPCI_TABLE_ENTRIES; rtx++) + if (reg_entry_isvalid(table[rtx])) { + dma_free_seg_table(table[rtx]); + invalidate_table_entry(table[rtx]); + } +} + static unsigned long __dma_alloc_iommu(struct zpci_dev *zdev, unsigned long start, int size) { diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index dd51122..545e3fd 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -302,4 +302,13 @@ config ARM_SMMU Say Y here if your SoC includes an IOMMU device implementing the ARM SMMU architecture. +config S390_IOMMU +bool s390 IOMMU Support +depends on S390 PCI +select IOMMU_API +help + Support for the IBM s/390 IOMMU + + If unsure, say N here. + endif # IOMMU_SUPPORT diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile index 16edef7..1278aad 100644 --- a/drivers/iommu/Makefile +++ b/drivers/iommu/Makefile @@ -19,3 +19,4 @@ obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o obj-$(CONFIG_SHMOBILE_IOMMU) += shmobile-iommu.o obj-$(CONFIG_SHMOBILE_IPMMU) += shmobile-ipmmu.o obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o +obj-$(CONFIG_S390_IOMMU) += s390-iommu.o diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c new file mode 100644 index 000..f9f048d --- /dev/null +++ b/drivers/iommu/s390-iommu.c @@ -0,0 +1,179 @@ +#include linux/io.h +#include linux/interrupt.h +#include linux/platform_device.h +#include linux/slab.h +#include linux/pm_runtime.h +#include linux/clk.h +#include linux/err.h +#include linux/mm.h +#include linux/iommu.h +#include linux/errno.h +#include linux/list.h +#include linux/memblock.h +#include linux/export.h +#include linux/pci.h +#include linux/sizes.h +#include asm/pci_dma.h + +#define S390_IOMMU_PGSIZES SZ_4K + +struct s390_domain { + struct zpci_dev *zdev; +}; + +static int s390_iommu_domain_init(struct iommu_domain *domain) +{ + struct s390_domain *priv; + + priv = kzalloc(sizeof(*priv), GFP_KERNEL); +
Re: [PATCH v5 1/3] iommu/rockchip: rk3288 iommu driver
On Tue, Oct 14, 2014 at 04:02:40PM +0800, Daniel Kurtz wrote: +static void rk_iommu_detach_device(struct iommu_domain *domain, +struct device *dev) +{ + struct rk_iommu *iommu = dev_get_drvdata(dev-archdata.iommu); + struct rk_iommu_domain *rk_domain = domain-priv; + unsigned long flags; + + /* Allow 'virtual devices' (eg drm) to detach from domain */ + if (!iommu) + return; + + iommu-domain = NULL; I guess this line is a left-over? Setting iommu-domain to NULL here before you disabled the IOMMU interrupt is racy. To be fully secure, you should make sure that no interrupt handler is still running after you disabled the IOMMU irq and before setting iommu-domain = NULL. Other than that the code looks good. Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 0/3] Rockchip IOMMU driver and devicetree bindings
On Wed, Oct 01, 2014 at 06:20:40PM +0800, Daniel Kurtz wrote: Add a driver and devicetree bindings for the IOMMU found in Rockchip RK3288 SoCs. Daniel Kurtz (3): iommu/rockchip: rk3288 iommu driver dt-bindings: iommu: Add documentation for rockchip iommu ARM: dts: rk3288: add VOP iommu nodes .../devicetree/bindings/iommu/rockchip,iommu.txt | 26 + arch/arm/boot/dts/rk3288.dtsi | 18 + drivers/iommu/Kconfig | 11 + drivers/iommu/Makefile | 1 + drivers/iommu/rockchip-iommu.c | 924 + 5 files changed, 980 insertions(+) create mode 100644 Documentation/devicetree/bindings/iommu/rockchip,iommu.txt create mode 100644 drivers/iommu/rockchip-iommu.c Please make sure you also get Acks for the DT bindings. With that and the issue I mentioned in the other email fixed, I will apply these patches? Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 0/3] iommu: replace IOMMU_EXEC with IOMMU_EXEC and update ARM SMMU driver
On Mon, Oct 20, 2014 at 07:42:01PM +0100, Will Deacon wrote: On Mon, Oct 20, 2014 at 04:39:15PM +0100, Will Deacon wrote: On Mon, Oct 13, 2014 at 02:06:15PM +0100, Antonios Motakis wrote: This patch series applies to Joerg Roedel's iommu/next branch, commit 09b5269a. It replaces the IOMMU_EXEC flag used by the ARM SMMU driver to IOMMU_NOEXEC. This is more enforceable, since the lack of the flag on hardware that doesn't support it implies that the target memory will be executable. Looks good to me; I'll take this via the arm-smmu tree and send it to Joerg along with anything else that gets queued for 3.19. The 0-day builder spotted a new warning from this series: drivers/iommu/amd_iommu.c: In function 'amd_iommu_capable': drivers/iommu/amd_iommu.c:3409:2: warning: enumeration value 'IOMMU_CAP_NOEXEC' not handled in switch [-Wswitch] switch (cap) { ^ I fixed it with the patch below, but I'd appreciate you and Joerg taking a look too. Cheers, Will ---8 diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c index 505a9adac2d5..3d78a8fb5a6a 100644 --- a/drivers/iommu/amd_iommu.c +++ b/drivers/iommu/amd_iommu.c @@ -3411,6 +3411,8 @@ static bool amd_iommu_capable(enum iommu_cap cap) return true; case IOMMU_CAP_INTR_REMAP: return (irq_remapping_enabled == 1); + case IOMMU_CAP_NOEXEC: + return false; } Looks good to me. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH linux-next] iommu: add iommu for s390 platform
On Wed, Oct 22, 2014 at 04:17:29PM +0200, Joerg Roedel wrote: Hi Frank, On Tue, Oct 21, 2014 at 01:57:25PM +0200, Frank Blaschka wrote: Add a basic iommu for the s390 platform. The code is pretty simple since on s390 each PCI device has its own virtual io address space starting at the same vio address. Are there any limitations on IOVA address space for the devices or can be really any system physical address mapped starting from 0 to 2^64? Hi Joerg, Basically there are no limitations. Depending on the s390 maschine generation a device starts its IOVA at a specific address (announced by the HW). But as I already told each device starts at the same address. I think this prevents having multiple devices on the same IOMMU domain. For this a domain could hold only one pci device. This bothers me, as it is not compatible with the IOMMU-API. I looked a little bit into how the mappings are created, and it seems there is a per-device dma_table. yes, you are absolutely right. There is a per-device dma_table. There is no general IOMMU device but each pci device has its own IOMMU translation capability. Is there any reason a dma_table can't be per IOMMU domain and assigned to multiple devices at the same time? Is there a possibility the IOMMU domain can support e.g. something like VIOA 0x1 - pci device 1 VIOA 0x1 - pci device 2 Otherwise the code looks quite simple and straight forward. Thx for your review and help Frank Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 07/18] vfio/platform: return info for device memory mapped IO regions
On Wed, 2014-10-22 at 15:54 +0200, Antonios Motakis wrote: On Tue, Oct 21, 2014 at 6:34 PM, Alex Williamson alex.william...@redhat.com wrote: On Mon, 2014-10-13 at 15:10 +0200, Antonios Motakis wrote: This patch enables the IOCTLs VFIO_DEVICE_GET_REGION_INFO ioctl call, which allows the user to learn about the available MMIO resources of a device. Signed-off-by: Antonios Motakis a.mota...@virtualopensystems.com --- drivers/vfio/platform/vfio_platform_common.c | 93 +-- drivers/vfio/platform/vfio_platform_private.h | 22 +++ 2 files changed, 111 insertions(+), 4 deletions(-) diff --git a/drivers/vfio/platform/vfio_platform_common.c b/drivers/vfio/platform/vfio_platform_common.c index 1e4073f..8a7e474 100644 --- a/drivers/vfio/platform/vfio_platform_common.c +++ b/drivers/vfio/platform/vfio_platform_common.c @@ -27,17 +27,84 @@ #include vfio_platform_private.h +static int vfio_platform_regions_init(struct vfio_platform_device *vdev) +{ + int cnt = 0, i; + + while (vdev-get_resource(vdev, cnt)) + cnt++; + + vdev-regions = kcalloc(cnt, sizeof(struct vfio_platform_region), + GFP_KERNEL); + if (!vdev-regions) + return -ENOMEM; + + for (i = 0; i cnt; i++) { + struct resource *res = + vdev-get_resource(vdev, i); + + if (!res) + goto err; + + vdev-regions[i].addr = res-start; + vdev-regions[i].size = resource_size(res); + vdev-regions[i].flags = 0; + + switch (resource_type(res)) { + case IORESOURCE_MEM: + vdev-regions[i].type = VFIO_PLATFORM_REGION_TYPE_MMIO; + break; + case IORESOURCE_IO: + vdev-regions[i].type = VFIO_PLATFORM_REGION_TYPE_PIO; + break; Ok, we have support for PIO in platform now (thanks!), does the user know what type of region they're dealing with? Do they care? For PCI the user tests the PCI BAR in config space to determine which type it is. I'm guessing that platform would do something similar against the device tree or ACPI, right? Maybe is worthwhile to add an explicit flag in VFIO_DEVICE_GET_REGION_INFO for PIO regions. For platform devices I don't know if we can always rely on DT or ACPI info to be available. For VFIO PCI the BAR is always implemented, and while I have proposed an RFC to return DT information, I don't think we can assume how a device is described in the host, whether DT, ACPI, or dark magic. Is this already handled by the fact that vfio-platform is not meant to be a generic meta driver to the same extent as vfio-pci? There is no self-describing config space on platform devices like there is on PCI, so the user will need to know in advance somehow what the device is and what resources/irqs it uses. We do need to make sure though that we provide a userspace ABI that allows them to match VFIO indexes to the device in a predictable way. It's not fully clear to me how that works. Thanks, Alex ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 09/18] vfio/platform: support MMAP of MMIO regions
On Wed, 2014-10-22 at 15:55 +0200, Antonios Motakis wrote: On Tue, Oct 21, 2014 at 6:51 PM, Alex Williamson alex.william...@redhat.com wrote: On Mon, 2014-10-13 at 15:10 +0200, Antonios Motakis wrote: Allow to memory map the MMIO regions of the device so userspace can directly access them. PIO regions are not being handled at this point. Signed-off-by: Antonios Motakis a.mota...@virtualopensystems.com --- drivers/vfio/platform/vfio_platform_common.c | 57 1 file changed, 57 insertions(+) diff --git a/drivers/vfio/platform/vfio_platform_common.c b/drivers/vfio/platform/vfio_platform_common.c index ac74710..4db7187 100644 --- a/drivers/vfio/platform/vfio_platform_common.c +++ b/drivers/vfio/platform/vfio_platform_common.c @@ -57,6 +57,16 @@ static int vfio_platform_regions_init(struct vfio_platform_device *vdev) if (!(res-flags IORESOURCE_READONLY)) vdev-regions[i].flags |= VFIO_REGION_INFO_FLAG_WRITE; + + /* + * Only regions addressed with PAGE granularity may be + * MMAPed securely. + */ + if (!(vdev-regions[i].addr ~PAGE_MASK) + !(vdev-regions[i].size ~PAGE_MASK)) + vdev-regions[i].flags |= + VFIO_REGION_INFO_FLAG_MMAP; + Should this be included in the above !readonly test? I don't see that we're doing anything below that would prevent writes to the mmap for a readonly resource. I suspect that just like PCI, it's not all that useful to provide mmap support for read-only regions. They're not typically performance paths. Indeed. Alternatively we could just MMAP the region as read only as well. Even if typically they are not performance critical areas, maybe it doesn't hurt to allow MMAP when possible. Sure, we could allow a read-only mmap. On PCI the only read-only region is the PCI option ROM. The PCI spec indicates that devices may use the same address decoders for the ROM as for BARs, so the ROM and BAR may not be simultaneously accessible. QEMU also caches the ROM once the guest does read it, so it's never been an issue to allow mmap of that resource. I would probably only bother to provide read-only mmap if you know vfio-platform devices are going to make more significant use of read-only regions. Thanks, Alex ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 1/1] pci: fix dmar fault for kdump kernel
On 10/21/2014 07:47 PM, Bjorn Helgaas wrote: [+cc Joerg, Eric, Tom, David, iommu list] On Wed, Oct 15, 2014 at 2:14 AM, Takao Indoh indou.ta...@jp.fujitsu.com wrote: (2014/10/14 18:34), Li, ZhenHua wrote: I tested on the latest stable version 3.17, it works well. On 10/10/2014 03:13 PM, Li, Zhen-Hua wrote: On a HP system with Intel vt-d supported and many PCI devices on it, when kernel crashed and the kdump kernel boots with intel_iommu=on, there may be some unexpected DMA requests on this adapter, which will cause DMA Remapping faults like: dmar: DRHD: handling fault status reg 102 dmar: DMAR:[DMA Read] Request device [41:00.0] fault addr fff81000 DMAR:[fault reason 01] Present bit in root entry is clear This bug may happen on *any* PCI device. Analysis for this bug: The present bit is set in this function: static struct context_entry * device_to_context_entry( struct intel_iommu *iommu, u8 bus, u8 devfn) { .. set_root_present(root); .. } Calling tree: device driver intel_alloc_coherent __intel_map_single domain_context_mapping domain_context_mapping_one device_to_context_entry This means, the present bit in root entry will not be set until the device driver is loaded. But in the kdump kernel, hardware devices are not aware that control has transferred to the second kernel, and those drivers must initialize again. Consequently there may be unexpected DMA requests from devices activity initiated in the first kernel leading to the DMA Remapping errors in the second kernel. To fix this DMAR fault, we need to reset the bus that this device on. Reset the device itself does not work. You have not explained why the DMAR faults are a problem. The fault is just an indication that the IOMMU prevented a DMA from completing. If the DMA is an artifact of the crashed kernel, we probably don't *want* it to complete, so taking a DMAR fault seems like exactly the right thing. If the problem is that we're being flooded with messages, it's easy enough to just tone down the printks. As I recall what we have seen in the past with the network controllers is that they get stuck in a state where they can no longer perform any DMA due to the fact that some of the transactions have returned errors from the IOMMU being reset. The only way out is to perform a PCIe reset on the part after the IOMMU has been enabled which doesn't occur automatically unless AER or EEH is enabled in the system. One thought would be to take a look at the IOMMU reset code. Is there any way to go through and make sure that all of the PCI devices that make use of the IOMMU have the bus mastering disabled prior to the IOMMU being reset? For example could we suspend all of the parts in order to force them to hold off any transactions, and then resume them after the IOMMU has been reset? If we could do at least that much that would prevent the errors and should allow for a graceful reset. Thanks, Alex ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 1/1] pci: fix dmar fault for kdump kernel
On Wed, Oct 22, 2014 at 10:54 AM, Alexander Duyck alexander.du...@gmail.com wrote: On 10/21/2014 07:47 PM, Bjorn Helgaas wrote: [+cc Joerg, Eric, Tom, David, iommu list] On Wed, Oct 15, 2014 at 2:14 AM, Takao Indoh indou.ta...@jp.fujitsu.com wrote: (2014/10/14 18:34), Li, ZhenHua wrote: I tested on the latest stable version 3.17, it works well. On 10/10/2014 03:13 PM, Li, Zhen-Hua wrote: To fix this DMAR fault, we need to reset the bus that this device on. Reset the device itself does not work. You have not explained why the DMAR faults are a problem. The fault is just an indication that the IOMMU prevented a DMA from completing. If the DMA is an artifact of the crashed kernel, we probably don't *want* it to complete, so taking a DMAR fault seems like exactly the right thing. If the problem is that we're being flooded with messages, it's easy enough to just tone down the printks. As I recall what we have seen in the past with the network controllers is that they get stuck in a state where they can no longer perform any DMA due to the fact that some of the transactions have returned errors from the IOMMU being reset. The only way out is to perform a PCIe reset on the part after the IOMMU has been enabled which doesn't occur automatically unless AER or EEH is enabled in the system. OK, now we're talking about a real issue, the sort of thing that should be in the changelog for a change like this. I'm uneasy about the strategy of it hurts when an IOMMU fault occurs, therefore we need to avoid all IOMMU faults. Isn't the whole *point* of an IOMMU to generate faults? It seems like we need to be able to handle faults gracefully. If having AER or EEH enabled in the kdump kernel is part of what's required to recover, I don't see a problem with requiring that. Don't we have to be able to recover from IOMMU faults for the device pass-through case anyway? If a NIC is passed through to a malicious guest, I assume the guest can cause IOMMU faults. I assume we handle this today by resetting the NIC when the guest exits. One thought would be to take a look at the IOMMU reset code. Is there any way to go through and make sure that all of the PCI devices that make use of the IOMMU have the bus mastering disabled prior to the IOMMU being reset? For example could we suspend all of the parts in order to force them to hold off any transactions, and then resume them after the IOMMU has been reset? If we could do at least that much that would prevent the errors and should allow for a graceful reset. Thanks, Alex ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 07/18] vfio/platform: return info for device memory mapped IO regions
On Wed, Oct 22, 2014 at 6:46 PM, Alex Williamson alex.william...@redhat.com wrote: On Wed, 2014-10-22 at 15:54 +0200, Antonios Motakis wrote: On Tue, Oct 21, 2014 at 6:34 PM, Alex Williamson alex.william...@redhat.com wrote: On Mon, 2014-10-13 at 15:10 +0200, Antonios Motakis wrote: This patch enables the IOCTLs VFIO_DEVICE_GET_REGION_INFO ioctl call, which allows the user to learn about the available MMIO resources of a device. Signed-off-by: Antonios Motakis a.mota...@virtualopensystems.com --- drivers/vfio/platform/vfio_platform_common.c | 93 +-- drivers/vfio/platform/vfio_platform_private.h | 22 +++ 2 files changed, 111 insertions(+), 4 deletions(-) diff --git a/drivers/vfio/platform/vfio_platform_common.c b/drivers/vfio/platform/vfio_platform_common.c index 1e4073f..8a7e474 100644 --- a/drivers/vfio/platform/vfio_platform_common.c +++ b/drivers/vfio/platform/vfio_platform_common.c @@ -27,17 +27,84 @@ #include vfio_platform_private.h +static int vfio_platform_regions_init(struct vfio_platform_device *vdev) +{ + int cnt = 0, i; + + while (vdev-get_resource(vdev, cnt)) + cnt++; + + vdev-regions = kcalloc(cnt, sizeof(struct vfio_platform_region), + GFP_KERNEL); + if (!vdev-regions) + return -ENOMEM; + + for (i = 0; i cnt; i++) { + struct resource *res = + vdev-get_resource(vdev, i); + + if (!res) + goto err; + + vdev-regions[i].addr = res-start; + vdev-regions[i].size = resource_size(res); + vdev-regions[i].flags = 0; + + switch (resource_type(res)) { + case IORESOURCE_MEM: + vdev-regions[i].type = VFIO_PLATFORM_REGION_TYPE_MMIO; + break; + case IORESOURCE_IO: + vdev-regions[i].type = VFIO_PLATFORM_REGION_TYPE_PIO; + break; Ok, we have support for PIO in platform now (thanks!), does the user know what type of region they're dealing with? Do they care? For PCI the user tests the PCI BAR in config space to determine which type it is. I'm guessing that platform would do something similar against the device tree or ACPI, right? Maybe is worthwhile to add an explicit flag in VFIO_DEVICE_GET_REGION_INFO for PIO regions. For platform devices I don't know if we can always rely on DT or ACPI info to be available. For VFIO PCI the BAR is always implemented, and while I have proposed an RFC to return DT information, I don't think we can assume how a device is described in the host, whether DT, ACPI, or dark magic. Is this already handled by the fact that vfio-platform is not meant to be a generic meta driver to the same extent as vfio-pci? There is no self-describing config space on platform devices like there is on PCI, so the user will need to know in advance somehow what the device is and what resources/irqs it uses. We do need to make sure though that we provide a userspace ABI that allows them to match VFIO indexes to the device in a predictable way. It's not fully clear to me how that works. Thanks, Yeah, it is a fact of life that the user needs to know what regions he is accessing, just from their ordering. Even with the extension for accessing device tree data, the user still needs to know what each region is and what info he is looking for from the device tree. In that respect we could delegate the responsibility to the user to just know what kind of device he is accessing and what kind of regions it features (and in what order). Alex -- Antonios Motakis Virtual Open Systems ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 1/1] IOMMU-MSM: Deletion of unnecessary checks before the function call clk_disable
If you are convinced that dropping the null tests is a good idea, then you can submit the patch that makes the change to the relevant maintainers and mailing lists. From af73fb59d5d4b2c289fb236d0752522b6b38 Mon Sep 17 00:00:00 2001 From: Markus Elfring elfr...@users.sourceforge.net Date: Wed, 22 Oct 2014 19:39:21 +0200 Subject: [PATCH] IOMMU-MSM: Deletion of unnecessary checks before the function call clk_disable A semantic patch approach was proposed with the subject [PATCH with Coccinelle?] Deletion of unnecessary checks before specific function calls on 2014-03-05. https://lkml.org/lkml/2014/3/5/344 http://article.gmane.org/gmane.comp.version-control.coccinelle/3513/ This patch pattern application was repeated with the help of the software Coccinelle 1.0.0-rc22 on the source files for Linux 3.17.1. An extract of the automatically generated update suggestions is shown here. It was determined that the affected source code places call functions which perform input parameter validation already. It is therefore not needed that a similar safety check is repeated at the call site. Signed-off-by: Markus Elfring elfr...@users.sourceforge.net --- drivers/iommu/msm_iommu.c | 3 +-- drivers/iommu/msm_iommu_dev.c | 6 ++ 2 files changed, 3 insertions(+), 6 deletions(-) diff --git a/drivers/iommu/msm_iommu.c b/drivers/iommu/msm_iommu.c index 6e3dcc28..3e4d888 100644 --- a/drivers/iommu/msm_iommu.c +++ b/drivers/iommu/msm_iommu.c @@ -73,8 +73,7 @@ fail: static void __disable_clocks(struct msm_iommu_drvdata *drvdata) { - if (drvdata-clk) - clk_disable(drvdata-clk); + clk_disable(drvdata-clk); clk_disable(drvdata-pclk); } diff --git a/drivers/iommu/msm_iommu_dev.c b/drivers/iommu/msm_iommu_dev.c index 61def7cb..9574d21 100644 --- a/drivers/iommu/msm_iommu_dev.c +++ b/drivers/iommu/msm_iommu_dev.c @@ -224,8 +224,7 @@ static int msm_iommu_probe(struct platform_device *pdev) platform_set_drvdata(pdev, drvdata); - if (iommu_clk) - clk_disable(iommu_clk); + clk_disable(iommu_clk); clk_disable(iommu_pclk); @@ -323,8 +322,7 @@ static int msm_iommu_ctx_probe(struct platform_device *pdev) SET_NSCFG(drvdata-base, mid, 3); } - if (drvdata-clk) - clk_disable(drvdata-clk); + clk_disable(drvdata-clk); clk_disable(drvdata-pclk); dev_info(pdev-dev, context %s using bank %d\n, c-name, c-num); -- 2.1.2 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 0/8] iommu/vt-d: Fix crash dump failure caused by legacy DMA/IO
On Wed, Oct 22, 2014 at 7:21 AM, Joerg Roedel j...@8bytes.org wrote: Hi Bjorn, On Tue, Oct 21, 2014 at 08:16:46PM -0600, Bjorn Helgaas wrote: I was looking at Zhen-Hua's recent patches, trying to figure out if I need to do anything with them. Resetting devices in the old kernel seems like a non-starter. Resetting devices in the new kernel, ..., well, maybe. It seems ugly, and it seems like the sort of problem that IOMMUs are designed to solve. Actually resetting the devices in the kdump kernel would be one of the better solutions for this problem. When we have a generic way to stop all in-flight DMA from the PCI endpoints we could safely disable and then re-enable the IOMMU. On Wed, Jul 2, 2014 at 7:32 AM, Joerg Roedel j...@8bytes.org wrote: That is a solution to prevent the in-flight DMA failures. But what happens when there is some in-flight DMA to a disk to write some inodes or a new superblock. Then this scratch address-space may cause filesystem corruption at worst. This in-flight DMA is from a device programmed by the old kernel, and it would be reading data from the old kernel's buffers. I think you're suggesting that we might want that DMA read to complete so the device can update filesystem metadata? Well, it is not about updating filesystem metadata. In the kdump kernel we have these options: 1) Disable the IOMMU. Problem here is, that DMA is now untranslated, so that any in-flight DMA might read or write from a random location in memory, corrupting the kdump or even the new kexec kernel memory. So this is a non-starter. Agreed (at least if the IOMMU was enabled in the crashed kernel). 2) Re-program the IOMMU to block all DMA. This is safer as it does not corrupt any data in memory. But some devices react very poorly on a master abort from the IOMMU, so bad that the driver in the kdump kernel fails to initialize that device. In this case taking dump itself might fail (and often does, according to reports) Sounds like an option, even though broken devices work poorly. 3) To prevent master aborts like in option (2), David suggested to map the whole DMA address space to a scratch page. This also prevents system memory corruption and the master aborts. But the problem is, that in-flight DMA will now read all zeros. This can corrupt disk and network data, at worst it nulls out the superblocks of your filesystem and you lose all data. So this is not an option too. Ah, yes, I see your point now. This allows corrupted data, e.g., all zeroes, to be written to disk or network after the kernel crash. I agree; this doesn't sound like a good option. And the proposal below is a 4th option (leave IOMMU enabled, reusing crashed kernel's mappings until drivers make new mappings). What we currently do in the VT-d driver is a mixture of (1) and (2). The VT-d driver disables the IOMMU hardware (opening a race window for memory data corruption), re-initializes it to reject any ongoing DMA (which causes master aborts for in-flight DMA) and re-enables the IOMMU hardware. But this strategy fails in heavy IO environments quite often and we look into ways to solve the problem, or at least improve the current situation as good as we can. I talked to David about this at LPC and we came up with basically this procedure: 1. If the VT-d driver finds the IOMMU enabled, it reuses its root-context table. This way the IOMMU must not be disabled and re-enabled, eliminating the race we have now. Other data structures like the context-entries are copied over from the old kernel. We basically keep all mappings from the old kernel, allowing any in-flight DMA to succeed. No memory or disk data corruption. If the crashed kernel had corrupted memory, couldn't an in-flight DMA read that corrupted data from memory and write it to disk? I guess you could argue that this is merely a race, and the in-flight DMA could just as easily have happened before the kernel crash, so there's always a window and the only question is whether it closes when the IOMMU driver starts up or when the device driver starts up. The issue here is, that we modify data from the old kernel which is about to be dumped. But there are ways to work around that. 2. When a device driver issues the first dma_map command for a device, we assign a new and empty page-table, thus removing all mappings from the old kernel for the device. Rationale is, that at this point the device driver should have reset the device to a point where all in-flight DMA is canceled. This approach goes into the same direction as Bill Sumners patch-set, which Li
Re: [PATCH 12/17] iommu/omap: Integrate omap-iommu-debug into omap-iommu
Hi Joerg, On 10/22/2014 08:29 AM, Joerg Roedel wrote: On Tue, Oct 21, 2014 at 04:28:27PM -0500, Suman Anna wrote: I am looking to refresh this series for 3.19, and this is the only patch that may need some changes. Please let me know what your preference is, and I can rework this patch if needed. Either way, the plan is to not have an OMAP IOMMU debugfs module, but only to have an option not to build the debugfs portions. Can I apply this series without this patch and you send me an updated version of only this patch? If yes, please let me know, otherwise please rebase the series on v3.18-rc1 and resend. Some of the patches following this patch don't apply without this patch, so this patch alone cannot be dropped. The preceding patches (Patches 1 through 11) are not affected though. I will rebase and resend the entire series. regards Suman ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC PATCH v2 0/4] vfio: platform: return device tree info for a platform device node
On Thu, 2014-10-16 at 17:54 +0200, Antonios Motakis wrote: This RFC's intention is to show what an interface to access device node properties for the VFIO platform driver can look like. If a device tree node corresponding to a platform device bound by VFIO PLATFORM or VFIO AMBA is available, this patch series will allow the user to query the properties associated with this device node. This can be useful for userspace drivers to automatically query parameters related to the device. An API to return data from a device's device tree has been proposed before on these lists. The API proposed here is slightly different. Properties to parse from the device tree are not indexed by a numerical id. The host system doesn't guarantee any specific ordering for the available properties, or that those will remain the same; while this does not happen in practice, there is nothing from the host changing the device nodes during operation. For this reason in this RFC properties are accessed by property name. The type of the property accessed must also be known by the user. Properties types implemented in this RFC: - VFIO_DEVTREE_TYPE_STRINGS (strings sepparated by the null character) - VFIO_DEVTREE_TYPE_U32 - VFIO_DEVTREE_TYPE_U16 - VFIO_DEVTREE_TYPE_U8 These can all be access by the ioctl VFIO_DEVICE_GET_DEVTREE_INFO. A new ioctl was preferred instead of shoehorning the functionality in VFIO_DEVICE_GET_INFO. The structure exchanged with userspace looks like this: /** * VFIO_DEVICE_GET_DEVTREE_INFO - _IOR(VFIO_TYPE, VFIO_BASE + 16, *struct vfio_devtree_info) * * Retrieve information from the device's device tree, if available. * Caller will initialize data[] with a single string with the requested * devicetree property name, and type depending on whether a array of strings * or an array of u32 values is expected. On success, data[] will be extended * with the requested information, either as an array of u32, or with a list * of strings sepparated by the NULL terminating character. * Return: 0 on success, -errno on failure. */ struct vfio_devtree_info { __u32 argsz; __u32 type; #define VFIO_DEVTREE_PROP_LIST0 #define VFIO_DEVTREE_TYPE_STRINGS 1 #define VFIO_DEVTREE_TYPE_U8 2 #define VFIO_DEVTREE_TYPE_U16 3 #define VFIO_DEVTREE_TYPE_U32 4 __u32 length; __u8data[]; }; #define VFIO_DEVICE_GET_DEVTREE_INFO _IO(VFIO_TYPE, VFIO_BASE + 17) Seems pretty reasonable. We should add a flags variable to vfio_devtree_info as well. General question, in order to get a VFIO device file descriptor, the user already needs to know the name of the device. Presumably they'll get this from sysfs or otherwise just know it in advance. Is that redundant information to what we're providing here? What other information already exists in sysfs for device tree devices that we're duplicating here for the sake of creating a VFIO mechanism? If it's not already exposed in sysfs, should it be? Thanks, Alex The length of the property will be reported in length, so the user can reallocate the structure if the data does not fit the first time the call is used. Specifically for QEMU, reading the compatible property of the device tree node could be of use to find out what device is being assigned to the guest and handle appropriately a wider range of devices in the future, and to generate an appropriate device tree for the guest. TODOs: - 64 bit values support - We can consider to rebase this on the patch series by Rafael J. Wysocki: [PATCH v4 00/13] Add ACPI _DSD and unified device properties support This would make 64 bit support more straightforward as it already includes the APIs we need for 64 bit OF values. Changes since v1: - Updated for VFIO platform patch series v8: VFIO AMBA devices now supported in addition to VFIO PLATFORM devices - Refactored and cleaned up the code Antonios Motakis (4): vfio: platform: add device tree info API and skeleton vfio: platform: devtree: return available property names vfio: platform: devtree: access property as a list of strings vfio: platform: devtree: return arrays of u32, u16, or u8 data drivers/vfio/platform/Makefile| 3 +- drivers/vfio/platform/devtree.c | 199 ++ drivers/vfio/platform/vfio_platform_common.c | 39 + drivers/vfio/platform/vfio_platform_private.h | 6 + include/uapi/linux/vfio.h | 26 5 files changed, 272 insertions(+), 1 deletion(-) create mode 100644 drivers/vfio/platform/devtree.c ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 08/17] iommu/omap: Simplify omap2_iommu_fault_isr()
The function omap2_iommu_fault_isr() does an unnecessary recomputation of the return value. The logic relies on setting the same bit fields as the MMU fault error status bits, so simplify this function and remove the unneeded macros. These macros were originally exported to notify MMU faults to users prior to the IOMMU framework adaptation, but are now redundant. Signed-off-by: Suman Anna s-a...@ti.com Acked-by: Laurent Pinchart laurent.pinch...@ideasonboard.com --- drivers/iommu/omap-iommu2.c | 20 +--- 1 file changed, 1 insertion(+), 19 deletions(-) diff --git a/drivers/iommu/omap-iommu2.c b/drivers/iommu/omap-iommu2.c index 372141b..ce2fff3 100644 --- a/drivers/iommu/omap-iommu2.c +++ b/drivers/iommu/omap-iommu2.c @@ -53,13 +53,6 @@ ((pgsz) == MMU_CAM_PGSZ_64K) ? 0x :\ ((pgsz) == MMU_CAM_PGSZ_4K) ? 0xf000 : 0) -/* IOMMU errors */ -#define OMAP_IOMMU_ERR_TLB_MISS(1 0) -#define OMAP_IOMMU_ERR_TRANS_FAULT (1 1) -#define OMAP_IOMMU_ERR_EMU_MISS(1 2) -#define OMAP_IOMMU_ERR_TBLWALK_FAULT (1 3) -#define OMAP_IOMMU_ERR_MULTIHIT_FAULT (1 4) - static void __iommu_set_twl(struct omap_iommu *obj, bool on) { u32 l = iommu_read_reg(obj, MMU_CNTL); @@ -122,7 +115,6 @@ static void omap2_iommu_set_twl(struct omap_iommu *obj, bool on) static u32 omap2_iommu_fault_isr(struct omap_iommu *obj, u32 *ra) { u32 stat, da; - u32 errs = 0; stat = iommu_read_reg(obj, MMU_IRQSTATUS); stat = MMU_IRQ_MASK; @@ -134,19 +126,9 @@ static u32 omap2_iommu_fault_isr(struct omap_iommu *obj, u32 *ra) da = iommu_read_reg(obj, MMU_FAULT_AD); *ra = da; - if (stat MMU_IRQ_TLBMISS) - errs |= OMAP_IOMMU_ERR_TLB_MISS; - if (stat MMU_IRQ_TRANSLATIONFAULT) - errs |= OMAP_IOMMU_ERR_TRANS_FAULT; - if (stat MMU_IRQ_EMUMISS) - errs |= OMAP_IOMMU_ERR_EMU_MISS; - if (stat MMU_IRQ_TABLEWALKFAULT) - errs |= OMAP_IOMMU_ERR_TBLWALK_FAULT; - if (stat MMU_IRQ_MULTIHITFAULT) - errs |= OMAP_IOMMU_ERR_MULTIHIT_FAULT; iommu_write_reg(obj, stat, MMU_IRQSTATUS); - return errs; + return stat; } static void omap2_tlb_read_cr(struct omap_iommu *obj, struct cr_regs *cr) -- 2.1.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 10/17] iommu/omap: Fix the permissions on nr_tlb_entries
The permissions on the debugfs entry nr_tlb_entries should have been octal, not decimal, so fix it. Signed-off-by: Suman Anna s-a...@ti.com Acked-by: Laurent Pinchart laurent.pinch...@ideasonboard.com --- drivers/iommu/omap-iommu-debug.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/omap-iommu-debug.c b/drivers/iommu/omap-iommu-debug.c index 0fb92aa..a520438 100644 --- a/drivers/iommu/omap-iommu-debug.c +++ b/drivers/iommu/omap-iommu-debug.c @@ -256,7 +256,7 @@ static int iommu_debug_register(struct device *dev, void *data) goto nomem; parent = d; - d = debugfs_create_u8(nr_tlb_entries, 400, parent, + d = debugfs_create_u8(nr_tlb_entries, 0400, parent, (u8 *)obj-nr_tlb_entries); if (!d) goto nomem; -- 2.1.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 05/17] iommu/omap: Remove ver debugfs entry
The debugfs entry 'ver' to read the OMAP IOMMU version is not much useful for developers, so it has been removed. The same can be deduced from the register dump, provided by the debugfs entry 'regs', REVISION register. This also allows us to remove the omap_iommu_arch_revision() which is currently returning a fixed value. Signed-off-by: Suman Anna s-a...@ti.com Acked-by: Laurent Pinchart laurent.pinch...@ideasonboard.com --- drivers/iommu/omap-iommu-debug.c | 13 - 1 file changed, 13 deletions(-) diff --git a/drivers/iommu/omap-iommu-debug.c b/drivers/iommu/omap-iommu-debug.c index 531658d..0fb92aa 100644 --- a/drivers/iommu/omap-iommu-debug.c +++ b/drivers/iommu/omap-iommu-debug.c @@ -30,17 +30,6 @@ static DEFINE_MUTEX(iommu_debug_lock); static struct dentry *iommu_debug_root; -static ssize_t debug_read_ver(struct file *file, char __user *userbuf, - size_t count, loff_t *ppos) -{ - u32 ver = omap_iommu_arch_version(); - char buf[MAXCOLUMN], *p = buf; - - p += sprintf(p, H/W version: %d.%d\n, (ver 4) 0xf , ver 0xf); - - return simple_read_from_buffer(userbuf, count, ppos, buf, p - buf); -} - static ssize_t debug_read_regs(struct file *file, char __user *userbuf, size_t count, loff_t *ppos) { @@ -228,7 +217,6 @@ static ssize_t debug_read_pagetable(struct file *file, char __user *userbuf, .llseek = generic_file_llseek, \ }; -DEBUG_FOPS_RO(ver); DEBUG_FOPS_RO(regs); DEBUG_FOPS_RO(tlb); DEBUG_FOPS(pagetable); @@ -273,7 +261,6 @@ static int iommu_debug_register(struct device *dev, void *data) if (!d) goto nomem; - DEBUG_ADD_FILE_RO(ver); DEBUG_ADD_FILE_RO(regs); DEBUG_ADD_FILE_RO(tlb); DEBUG_ADD_FILE(pagetable); -- 2.1.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 11/17] iommu/omap: Make pagetable debugfs entry read-only
Remove the writeability on the 'pagetable' debugfs entry, so that the mapping/unmapping into an OMAP IOMMU is only limited to actual client devices/drivers at kernel-level. Signed-off-by: Suman Anna s-a...@ti.com Acked-by: Laurent Pinchart laurent.pinch...@ideasonboard.com --- drivers/iommu/omap-iommu-debug.c | 48 ++-- 1 file changed, 2 insertions(+), 46 deletions(-) diff --git a/drivers/iommu/omap-iommu-debug.c b/drivers/iommu/omap-iommu-debug.c index a520438..28de657 100644 --- a/drivers/iommu/omap-iommu-debug.c +++ b/drivers/iommu/omap-iommu-debug.c @@ -24,8 +24,6 @@ #include omap-iopgtable.h #include omap-iommu.h -#define MAXCOLUMN 100 /* for short messages */ - static DEFINE_MUTEX(iommu_debug_lock); static struct dentry *iommu_debug_root; @@ -82,39 +80,6 @@ static ssize_t debug_read_tlb(struct file *file, char __user *userbuf, return bytes; } -static ssize_t debug_write_pagetable(struct file *file, -const char __user *userbuf, size_t count, loff_t *ppos) -{ - struct iotlb_entry e; - struct cr_regs cr; - int err; - struct device *dev = file-private_data; - struct omap_iommu *obj = dev_to_omap_iommu(dev); - char buf[MAXCOLUMN], *p = buf; - - count = min(count, sizeof(buf)); - - mutex_lock(iommu_debug_lock); - if (copy_from_user(p, userbuf, count)) { - mutex_unlock(iommu_debug_lock); - return -EFAULT; - } - - sscanf(p, %x %x, cr.cam, cr.ram); - if (!cr.cam || !cr.ram) { - mutex_unlock(iommu_debug_lock); - return -EINVAL; - } - - omap_iotlb_cr_to_e(cr, e); - err = omap_iopgtable_store_entry(obj, e); - if (err) - dev_err(obj-dev, %s: fail to store cr\n, __func__); - - mutex_unlock(iommu_debug_lock); - return count; -} - #define dump_ioptable_entry_one(lv, da, val) \ ({ \ int __err = 0; \ @@ -202,14 +167,6 @@ static ssize_t debug_read_pagetable(struct file *file, char __user *userbuf, return bytes; } -#define DEBUG_FOPS(name) \ - static const struct file_operations debug_##name##_fops = { \ - .open = simple_open,\ - .read = debug_read_##name, \ - .write = debug_write_##name,\ - .llseek = generic_file_llseek, \ - }; - #define DEBUG_FOPS_RO(name)\ static const struct file_operations debug_##name##_fops = { \ .open = simple_open,\ @@ -219,7 +176,7 @@ static ssize_t debug_read_pagetable(struct file *file, char __user *userbuf, DEBUG_FOPS_RO(regs); DEBUG_FOPS_RO(tlb); -DEBUG_FOPS(pagetable); +DEBUG_FOPS_RO(pagetable); #define __DEBUG_ADD_FILE(attr, mode) \ { \ @@ -230,7 +187,6 @@ DEBUG_FOPS(pagetable); return -ENOMEM; \ } -#define DEBUG_ADD_FILE(name) __DEBUG_ADD_FILE(name, 0600) #define DEBUG_ADD_FILE_RO(name) __DEBUG_ADD_FILE(name, 0400) static int iommu_debug_register(struct device *dev, void *data) @@ -263,7 +219,7 @@ static int iommu_debug_register(struct device *dev, void *data) DEBUG_ADD_FILE_RO(regs); DEBUG_ADD_FILE_RO(tlb); - DEBUG_ADD_FILE(pagetable); + DEBUG_ADD_FILE_RO(pagetable); return 0; -- 2.1.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 03/17] iommu/omap: Remove duplicate declarations
The omap_iommu_save_ctx() and omap_iommu_restore_ctx() declarations are defined in include/linux/omap-iommu.h and do not belong in the internal drivers/iommu/omap-iommu.h header, so remove them. Signed-off-by: Suman Anna s-a...@ti.com Acked-by: Laurent Pinchart laurent.pinch...@ideasonboard.com --- drivers/iommu/omap-iommu.h | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/iommu/omap-iommu.h b/drivers/iommu/omap-iommu.h index 18a0f3a..4fc51c8 100644 --- a/drivers/iommu/omap-iommu.h +++ b/drivers/iommu/omap-iommu.h @@ -197,9 +197,6 @@ extern void omap_iotlb_cr_to_e(struct cr_regs *cr, struct iotlb_entry *e); extern int omap_iopgtable_store_entry(struct omap_iommu *obj, struct iotlb_entry *e); -extern void omap_iommu_save_ctx(struct device *dev); -extern void omap_iommu_restore_ctx(struct device *dev); - extern int omap_foreach_iommu_device(void *data, int (*fn)(struct device *, void *)); -- 2.1.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 12/17] iommu/omap: Integrate omap-iommu-debug into omap-iommu
The debugfs support for OMAP IOMMU is currently implemented as a module, warranting certain OMAP-specific IOMMU API to be exported. The OMAP IOMMU, when enabled, can only be built-in into the kernel, so integrate the OMAP IOMMU debug module into the OMAP IOMMU driver. This helps in eliminating the need to export most of the current OMAP IOMMU API. The following are the main changes: - The debugfs directory and entry creation logic is reversed, the calls are invoked by the OMAP IOMMU driver now. - The current iffy circular logic of adding IOMMU archdata to the IOMMU devices itself to get a pointer to the omap_iommu object in the debugfs support code is replaced by directly using the omap_iommu structure while creating the debugfs entries. - The debugfs root directory is renamed from the generic name iommu to a specific name omap_iommu. - Unneeded headers have also been cleaned up while at this. - There will no longer be a omap-iommu-debug.ko module after this patch. - The OMAP_IOMMU_DEBUG Kconfig option is converted to boolean only, the OMAP IOMMU debugfs support is built alongside the OMAP IOMMU driver only when this option is enabled. Signed-off-by: Suman Anna s-a...@ti.com --- drivers/iommu/Kconfig| 12 ++--- drivers/iommu/omap-iommu-debug.c | 100 +++ drivers/iommu/omap-iommu.c | 11 - drivers/iommu/omap-iommu.h | 15 ++ 4 files changed, 58 insertions(+), 80 deletions(-) diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index dd51122..1d54996 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -144,13 +144,13 @@ config OMAP_IOMMU select IOMMU_API config OMAP_IOMMU_DEBUG - tristate Export OMAP IOMMU internals in DebugFS - depends on OMAP_IOMMU DEBUG_FS - help - Select this to see extensive information about - the internal state of OMAP IOMMU in debugfs. + bool Export OMAP IOMMU internals in DebugFS + depends on OMAP_IOMMU DEBUG_FS + ---help--- + Select this to see extensive information about + the internal state of OMAP IOMMU in debugfs. - Say N unless you know you need this. + Say N unless you know you need this. config TEGRA_IOMMU_GART bool Tegra GART IOMMU Support diff --git a/drivers/iommu/omap-iommu-debug.c b/drivers/iommu/omap-iommu-debug.c index 28de657..4813d3a 100644 --- a/drivers/iommu/omap-iommu-debug.c +++ b/drivers/iommu/omap-iommu-debug.c @@ -10,15 +10,11 @@ * published by the Free Software Foundation. */ -#include linux/module.h #include linux/err.h -#include linux/clk.h #include linux/io.h #include linux/slab.h #include linux/uaccess.h -#include linux/platform_device.h #include linux/debugfs.h -#include linux/omap-iommu.h #include linux/platform_data/iommu-omap.h #include omap-iopgtable.h @@ -31,8 +27,7 @@ static struct dentry *iommu_debug_root; static ssize_t debug_read_regs(struct file *file, char __user *userbuf, size_t count, loff_t *ppos) { - struct device *dev = file-private_data; - struct omap_iommu *obj = dev_to_omap_iommu(dev); + struct omap_iommu *obj = file-private_data; char *p, *buf; ssize_t bytes; @@ -55,8 +50,7 @@ static ssize_t debug_read_regs(struct file *file, char __user *userbuf, static ssize_t debug_read_tlb(struct file *file, char __user *userbuf, size_t count, loff_t *ppos) { - struct device *dev = file-private_data; - struct omap_iommu *obj = dev_to_omap_iommu(dev); + struct omap_iommu *obj = file-private_data; char *p, *buf; ssize_t bytes, rest; @@ -141,8 +135,7 @@ out: static ssize_t debug_read_pagetable(struct file *file, char __user *userbuf, size_t count, loff_t *ppos) { - struct device *dev = file-private_data; - struct omap_iommu *obj = dev_to_omap_iommu(dev); + struct omap_iommu *obj = file-private_data; char *p, *buf; size_t bytes; @@ -181,93 +174,56 @@ DEBUG_FOPS_RO(pagetable); #define __DEBUG_ADD_FILE(attr, mode) \ { \ struct dentry *dent;\ - dent = debugfs_create_file(#attr, mode, parent, \ - dev, debug_##attr##_fops); \ + dent = debugfs_create_file(#attr, mode, obj-debug_dir, \ + obj, debug_##attr##_fops); \ if (!dent) \ - return -ENOMEM; \ + goto err; \ } #define DEBUG_ADD_FILE_RO(name) __DEBUG_ADD_FILE(name, 0400) -static int
[PATCH v2 17/17] iommu/omap: Switch pagetable debugfs entry to use seq_file
The debugfs entry 'pagetable' that shows the page table entry (PTE) data currently outputs only data that can be fit into a page. Switch the entry to use the seq_file interface so that it can show all the valid page table entries. The patch also corrected the output for L2 entries, and prints the proper L2 PTE instead of the previous L1 page descriptor pointer. Signed-off-by: Suman Anna s-a...@ti.com Acked-by: Laurent Pinchart laurent.pinch...@ideasonboard.com --- drivers/iommu/omap-iommu-debug.c | 81 ++-- 1 file changed, 28 insertions(+), 53 deletions(-) diff --git a/drivers/iommu/omap-iommu-debug.c b/drivers/iommu/omap-iommu-debug.c index 41b09a1..f3d20a2 100644 --- a/drivers/iommu/omap-iommu-debug.c +++ b/drivers/iommu/omap-iommu-debug.c @@ -85,95 +85,70 @@ static ssize_t debug_read_tlb(struct file *file, char __user *userbuf, return bytes; } -#define dump_ioptable_entry_one(lv, da, val) \ - ({ \ - int __err = 0; \ - ssize_t bytes; \ - const int maxcol = 22; \ - const char *str = %d: %08x %08x\n;\ - bytes = snprintf(p, maxcol, str, lv, da, val); \ - p += bytes; \ - len -= bytes; \ - if (len maxcol) \ - __err = -ENOMEM;\ - __err; \ - }) - -static ssize_t dump_ioptable(struct omap_iommu *obj, char *buf, ssize_t len) +static void dump_ioptable(struct seq_file *s) { - int i; - u32 *iopgd; - char *p = buf; + int i, j; + u32 da; + u32 *iopgd, *iopte; + struct omap_iommu *obj = s-private; spin_lock(obj-page_table_lock); iopgd = iopgd_offset(obj, 0); for (i = 0; i PTRS_PER_IOPGD; i++, iopgd++) { - int j, err; - u32 *iopte; - u32 da; - if (!*iopgd) continue; if (!(*iopgd IOPGD_TABLE)) { da = i IOPGD_SHIFT; - - err = dump_ioptable_entry_one(1, da, *iopgd); - if (err) - goto out; + seq_printf(s, 1: 0x%08x 0x%08x\n, da, *iopgd); continue; } iopte = iopte_offset(iopgd, 0); - for (j = 0; j PTRS_PER_IOPTE; j++, iopte++) { if (!*iopte) continue; da = (i IOPGD_SHIFT) + (j IOPTE_SHIFT); - err = dump_ioptable_entry_one(2, da, *iopgd); - if (err) - goto out; + seq_printf(s, 2: 0x%08x 0x%08x\n, da, *iopte); } } -out: - spin_unlock(obj-page_table_lock); - return p - buf; + spin_unlock(obj-page_table_lock); } -static ssize_t debug_read_pagetable(struct file *file, char __user *userbuf, - size_t count, loff_t *ppos) +static int debug_read_pagetable(struct seq_file *s, void *data) { - struct omap_iommu *obj = file-private_data; - char *p, *buf; - size_t bytes; + struct omap_iommu *obj = s-private; if (is_omap_iommu_detached(obj)) return -EPERM; - buf = (char *)__get_free_page(GFP_KERNEL); - if (!buf) - return -ENOMEM; - p = buf; - - p += sprintf(p, L: %8s %8s\n, da:, pa:); - p += sprintf(p, -\n); - mutex_lock(iommu_debug_lock); - bytes = PAGE_SIZE - (p - buf); - p += dump_ioptable(obj, p, bytes); - - bytes = simple_read_from_buffer(userbuf, count, ppos, buf, p - buf); + seq_printf(s, L: %8s %8s\n, da:, pte:); + seq_puts(s, --\n); + dump_ioptable(s); mutex_unlock(iommu_debug_lock); - free_page((unsigned long)buf); - return bytes; + return 0; } +#define DEBUG_SEQ_FOPS_RO(name) \ + static int debug_open_##name(struct inode *inode, struct file *file) \ + { \ + return single_open(file, debug_read_##name, inode-i_private); \ + } \ + \ + static const struct file_operations debug_##name##_fops = {\ + .open
[PATCH v2 14/17] iommu/omap: Do not export unneeded functions
The following functions were exported previously for usage by the OMAP IOMMU debug module: omap_iommu_dump_ctx() omap_dump_tlb_entries() omap_iopgtable_store_entry() These functions need not be exported anymore as the OMAP IOMMU debugfs code is integrated with the OMAP IOMMU driver, and there won't be external users for these functions. So, remove the EXPORT_SYMBOL_GPL on these. The omap_iopgtable_store_entry() is also made internal only, after making the 'pagetable' debugfs entry read-only. Signed-off-by: Suman Anna s-a...@ti.com Acked-by: Laurent Pinchart laurent.pinch...@ideasonboard.com --- drivers/iommu/omap-iommu.c | 6 ++ drivers/iommu/omap-iommu.h | 3 --- 2 files changed, 2 insertions(+), 7 deletions(-) diff --git a/drivers/iommu/omap-iommu.c b/drivers/iommu/omap-iommu.c index 9171112..3dcaef0 100644 --- a/drivers/iommu/omap-iommu.c +++ b/drivers/iommu/omap-iommu.c @@ -511,7 +511,6 @@ ssize_t omap_iommu_dump_ctx(struct omap_iommu *obj, char *buf, ssize_t bytes) return bytes; } -EXPORT_SYMBOL_GPL(omap_iommu_dump_ctx); static int __dump_tlb_entries(struct omap_iommu *obj, struct cr_regs *crs, int num) @@ -579,7 +578,6 @@ size_t omap_dump_tlb_entries(struct omap_iommu *obj, char *buf, ssize_t bytes) return p - buf; } -EXPORT_SYMBOL_GPL(omap_dump_tlb_entries); #endif /* CONFIG_OMAP_IOMMU_DEBUG */ @@ -764,7 +762,8 @@ iopgtable_store_entry_core(struct omap_iommu *obj, struct iotlb_entry *e) * @obj: target iommu * @e: an iommu tlb entry info **/ -int omap_iopgtable_store_entry(struct omap_iommu *obj, struct iotlb_entry *e) +static int +omap_iopgtable_store_entry(struct omap_iommu *obj, struct iotlb_entry *e) { int err; @@ -774,7 +773,6 @@ int omap_iopgtable_store_entry(struct omap_iommu *obj, struct iotlb_entry *e) prefetch_iotlb_entry(obj, e); return err; } -EXPORT_SYMBOL_GPL(omap_iopgtable_store_entry); /** * iopgtable_lookup_entry - Lookup an iommu pte entry diff --git a/drivers/iommu/omap-iommu.h b/drivers/iommu/omap-iommu.h index b18cecc..d736630 100644 --- a/drivers/iommu/omap-iommu.h +++ b/drivers/iommu/omap-iommu.h @@ -190,9 +190,6 @@ static inline struct omap_iommu *dev_to_omap_iommu(struct device *dev) /* * global functions */ -extern int -omap_iopgtable_store_entry(struct omap_iommu *obj, struct iotlb_entry *e); - #ifdef CONFIG_OMAP_IOMMU_DEBUG extern ssize_t omap_iommu_dump_ctx(struct omap_iommu *obj, char *buf, ssize_t len); -- 2.1.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 15/17] iommu/omap: Reset the domain field upon detaching
The .domain field in omap_iommu struct is set properly when the OMAP IOMMU device is attached to, but is never reset properly on detach. Reset this properly so that the OMAP IOMMU debugfs logic can depend on this field before allowing the debugfs operations. Signed-off-by: Suman Anna s-a...@ti.com Acked-by: Laurent Pinchart laurent.pinch...@ideasonboard.com --- drivers/iommu/omap-iommu.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/iommu/omap-iommu.c b/drivers/iommu/omap-iommu.c index 3dcaef0..2ba3219 100644 --- a/drivers/iommu/omap-iommu.c +++ b/drivers/iommu/omap-iommu.c @@ -1206,6 +1206,7 @@ static void _omap_iommu_detach_dev(struct omap_iommu_domain *omap_domain, omap_domain-iommu_dev = arch_data-iommu_dev = NULL; omap_domain-dev = NULL; + oiommu-domain = NULL; } static void omap_iommu_detach_dev(struct iommu_domain *domain, -- 2.1.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 01/17] iommu/omap: Remove refcount field from omap_iommu object
The refcount field in omap_iommu object is primarily used to check if an IOMMU device has already been enabled, but this is already implicit in the omap_iommu_attach_dev() which ensures that only a single device can attach to an IOMMU. This field is redundant, and so has been cleaned up. Signed-off-by: Suman Anna s-a...@ti.com Acked-by: Laurent Pinchart laurent.pinch...@ideasonboard.com --- drivers/iommu/omap-iommu.c | 15 +++ drivers/iommu/omap-iommu.h | 1 - 2 files changed, 3 insertions(+), 13 deletions(-) diff --git a/drivers/iommu/omap-iommu.c b/drivers/iommu/omap-iommu.c index 3627887..ea04e4d 100644 --- a/drivers/iommu/omap-iommu.c +++ b/drivers/iommu/omap-iommu.c @@ -819,8 +819,9 @@ static irqreturn_t iommu_fault_handler(int irq, void *data) u32 *iopgd, *iopte; struct omap_iommu *obj = data; struct iommu_domain *domain = obj-domain; + struct omap_iommu_domain *omap_domain = domain-priv; - if (!obj-refcount) + if (!omap_domain-iommu_dev) return IRQ_NONE; errs = iommu_report_fault(obj, da); @@ -880,13 +881,6 @@ static struct omap_iommu *omap_iommu_attach(const char *name, u32 *iopgd) spin_lock(obj-iommu_lock); - /* an iommu device can only be attached once */ - if (++obj-refcount 1) { - dev_err(dev, %s: already attached!\n, obj-name); - err = -EBUSY; - goto err_enable; - } - obj-iopgd = iopgd; err = iommu_enable(obj); if (err) @@ -899,7 +893,6 @@ static struct omap_iommu *omap_iommu_attach(const char *name, u32 *iopgd) return obj; err_enable: - obj-refcount--; spin_unlock(obj-iommu_lock); return ERR_PTR(err); } @@ -915,9 +908,7 @@ static void omap_iommu_detach(struct omap_iommu *obj) spin_lock(obj-iommu_lock); - if (--obj-refcount == 0) - iommu_disable(obj); - + iommu_disable(obj); obj-iopgd = NULL; spin_unlock(obj-iommu_lock); diff --git a/drivers/iommu/omap-iommu.h b/drivers/iommu/omap-iommu.h index 4f1b68c..5c14000 100644 --- a/drivers/iommu/omap-iommu.h +++ b/drivers/iommu/omap-iommu.h @@ -33,7 +33,6 @@ struct omap_iommu { void*isr_priv; struct iommu_domain *domain; - unsigned intrefcount; spinlock_t iommu_lock; /* global for this whole object */ /* -- 2.1.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 04/17] iommu/omap: Remove conditional definition of dev_to_omap_iommu()
The dev_to_omap_iommu() is local to the OMAP IOMMU modules, and need not be defined conditionally. The CONFIG_IOMMU_API dependency check was added in the past to fix a compilation issue back when the header resided in the arch/arm layers, and is no longer needed. While at this, fix the header against double inclusion as well. Signed-off-by: Suman Anna s-a...@ti.com Acked-by: Laurent Pinchart laurent.pinch...@ideasonboard.com --- drivers/iommu/omap-iommu.h | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/omap-iommu.h b/drivers/iommu/omap-iommu.h index 4fc51c8..d7c5132 100644 --- a/drivers/iommu/omap-iommu.h +++ b/drivers/iommu/omap-iommu.h @@ -10,6 +10,9 @@ * published by the Free Software Foundation. */ +#ifndef _OMAP_IOMMU_H +#define _OMAP_IOMMU_H + #if defined(CONFIG_ARCH_OMAP1) #error iommu for this processor not implemented yet #endif @@ -92,7 +95,6 @@ struct iommu_functions { ssize_t (*dump_ctx)(struct omap_iommu *obj, char *buf, ssize_t len); }; -#ifdef CONFIG_IOMMU_API /** * dev_to_omap_iommu() - retrieves an omap iommu object from a user device * @dev: iommu client device @@ -103,7 +105,6 @@ static inline struct omap_iommu *dev_to_omap_iommu(struct device *dev) return arch_data-iommu_dev; } -#endif /* * MMU Register offsets @@ -220,3 +221,5 @@ static inline void iommu_write_reg(struct omap_iommu *obj, u32 val, size_t offs) { __raw_writel(val, obj-regbase + offs); } + +#endif /* _OMAP_IOMMU_H */ -- 2.1.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 07/17] iommu/omap: Remove bogus version check in context save/restore
The omap2_iommu_save_ctx() and omap2_iommu_restore_ctx() performs a sanity version check against a fixed value that is correct only for OMAP2/OMAP3 IOMMUs. This fixed check does not scale for all OMAP2+ IOMMUs and is not absolutely required, so it has been removed. Signed-off-by: Suman Anna s-a...@ti.com Acked-by: Laurent Pinchart laurent.pinch...@ideasonboard.com --- drivers/iommu/omap-iommu2.c | 6 -- 1 file changed, 6 deletions(-) diff --git a/drivers/iommu/omap-iommu2.c b/drivers/iommu/omap-iommu2.c index 2f6a9f7..372141b 100644 --- a/drivers/iommu/omap-iommu2.c +++ b/drivers/iommu/omap-iommu2.c @@ -26,8 +26,6 @@ /* * omap2 architecture specific register bit definitions */ -#define IOMMU_ARCH_VERSION 0x0011 - /* IRQSTATUS IRQENABLE */ #define MMU_IRQ_MULTIHITFAULT (1 4) #define MMU_IRQ_TABLEWALKFAULT (1 3) @@ -268,8 +266,6 @@ static void omap2_iommu_save_ctx(struct omap_iommu *obj) p[i] = iommu_read_reg(obj, i * sizeof(u32)); dev_dbg(obj-dev, %s\t[%02d] %08x\n, __func__, i, p[i]); } - - BUG_ON(p[0] != IOMMU_ARCH_VERSION); } static void omap2_iommu_restore_ctx(struct omap_iommu *obj) @@ -281,8 +277,6 @@ static void omap2_iommu_restore_ctx(struct omap_iommu *obj) iommu_write_reg(obj, p[i], i * sizeof(u32)); dev_dbg(obj-dev, %s\t[%02d] %08x\n, __func__, i, p[i]); } - - BUG_ON(p[0] != IOMMU_ARCH_VERSION); } static void omap2_cr_to_e(struct cr_regs *cr, struct iotlb_entry *e) -- 2.1.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 06/17] iommu/omap: Remove omap_iommu_arch_version() and version field
The function omap_iommu_arch_version() is not used anymore, and is not required either, so remove it. The .version field in struct iommu_functions that this function uses is also removed, as it is not really an ops to retrieve a version and there won't be any usage for this field either. Signed-off-by: Suman Anna s-a...@ti.com Acked-by: Laurent Pinchart laurent.pinch...@ideasonboard.com --- drivers/iommu/omap-iommu.c | 9 - drivers/iommu/omap-iommu.h | 4 drivers/iommu/omap-iommu2.c | 2 -- 3 files changed, 15 deletions(-) diff --git a/drivers/iommu/omap-iommu.c b/drivers/iommu/omap-iommu.c index ea04e4d..f9efa6b 100644 --- a/drivers/iommu/omap-iommu.c +++ b/drivers/iommu/omap-iommu.c @@ -138,15 +138,6 @@ void omap_iommu_restore_ctx(struct device *dev) } EXPORT_SYMBOL_GPL(omap_iommu_restore_ctx); -/** - * omap_iommu_arch_version - Return running iommu arch version - **/ -u32 omap_iommu_arch_version(void) -{ - return arch_iommu-version; -} -EXPORT_SYMBOL_GPL(omap_iommu_arch_version); - static int iommu_enable(struct omap_iommu *obj) { int err; diff --git a/drivers/iommu/omap-iommu.h b/drivers/iommu/omap-iommu.h index d7c5132..45fe67d 100644 --- a/drivers/iommu/omap-iommu.h +++ b/drivers/iommu/omap-iommu.h @@ -70,8 +70,6 @@ struct cr_regs { /* architecture specific functions */ struct iommu_functions { - unsigned long version; - int (*enable)(struct omap_iommu *obj); void (*disable)(struct omap_iommu *obj); void (*set_twl)(struct omap_iommu *obj, bool on); @@ -191,8 +189,6 @@ static inline struct omap_iommu *dev_to_omap_iommu(struct device *dev) /* * global functions */ -extern u32 omap_iommu_arch_version(void); - extern void omap_iotlb_cr_to_e(struct cr_regs *cr, struct iotlb_entry *e); extern int diff --git a/drivers/iommu/omap-iommu2.c b/drivers/iommu/omap-iommu2.c index 5e1ea3b..2f6a9f7 100644 --- a/drivers/iommu/omap-iommu2.c +++ b/drivers/iommu/omap-iommu2.c @@ -297,8 +297,6 @@ static void omap2_cr_to_e(struct cr_regs *cr, struct iotlb_entry *e) } static const struct iommu_functions omap2_iommu_ops = { - .version= IOMMU_ARCH_VERSION, - .enable = omap2_iommu_enable, .disable= omap2_iommu_disable, .set_twl= omap2_iommu_set_twl, -- 2.1.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 02/17] iommu/omap: Remove unused isr_priv field from omap_iommu
The isr_priv field is a left-over from before the IOMMU API adaptation, this was used to store the callback data. This is no longer relevant, so remove it. Signed-off-by: Suman Anna s-a...@ti.com Acked-by: Laurent Pinchart laurent.pinch...@ideasonboard.com --- drivers/iommu/omap-iommu.h | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/iommu/omap-iommu.h b/drivers/iommu/omap-iommu.h index 5c14000..18a0f3a 100644 --- a/drivers/iommu/omap-iommu.h +++ b/drivers/iommu/omap-iommu.h @@ -30,7 +30,6 @@ struct omap_iommu { const char *name; void __iomem*regbase; struct device *dev; - void*isr_priv; struct iommu_domain *domain; spinlock_t iommu_lock; /* global for this whole object */ -- 2.1.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 13/17] iommu/omap: Remove couple of unused exported functions
The exported functions omap_foreach_iommu_device() and omap_iotlb_cr_to_e() have been deleted, as they are no longer needed. The function omap_foreach_iommu_device() is not required after the consolidation of the OMAP IOMMU debug module, and the function omap_iotlb_cr_to_e() is not required after making the debugfs entry 'pagetable' read-only. Signed-off-by: Suman Anna s-a...@ti.com Acked-by: Laurent Pinchart laurent.pinch...@ideasonboard.com --- drivers/iommu/omap-iommu.c | 21 - drivers/iommu/omap-iommu.h | 5 - 2 files changed, 26 deletions(-) diff --git a/drivers/iommu/omap-iommu.c b/drivers/iommu/omap-iommu.c index b92b6fc..9171112 100644 --- a/drivers/iommu/omap-iommu.c +++ b/drivers/iommu/omap-iommu.c @@ -203,20 +203,6 @@ static void iommu_disable(struct omap_iommu *obj) /* * TLB operations */ -void omap_iotlb_cr_to_e(struct cr_regs *cr, struct iotlb_entry *e) -{ - BUG_ON(!cr || !e); - - e-da = cr-cam MMU_CAM_VATAG_MASK; - e-pa = cr-ram MMU_RAM_PADDR_MASK; - e-valid= cr-cam MMU_CAM_V; - e-pgsz = cr-cam MMU_CAM_PGSZ_MASK; - e-endian = cr-ram MMU_RAM_ENDIAN_MASK; - e-elsz = cr-ram MMU_RAM_ELSZ_MASK; - e-mixed= cr-ram MMU_RAM_MIXED; -} -EXPORT_SYMBOL_GPL(omap_iotlb_cr_to_e); - static inline int iotlb_cr_valid(struct cr_regs *cr) { if (!cr) @@ -595,13 +581,6 @@ size_t omap_dump_tlb_entries(struct omap_iommu *obj, char *buf, ssize_t bytes) } EXPORT_SYMBOL_GPL(omap_dump_tlb_entries); -int omap_foreach_iommu_device(void *data, int (*fn)(struct device *, void *)) -{ - return driver_for_each_device(omap_iommu_driver.driver, - NULL, data, fn); -} -EXPORT_SYMBOL_GPL(omap_foreach_iommu_device); - #endif /* CONFIG_OMAP_IOMMU_DEBUG */ /* diff --git a/drivers/iommu/omap-iommu.h b/drivers/iommu/omap-iommu.h index 4783779..b18cecc 100644 --- a/drivers/iommu/omap-iommu.h +++ b/drivers/iommu/omap-iommu.h @@ -190,14 +190,9 @@ static inline struct omap_iommu *dev_to_omap_iommu(struct device *dev) /* * global functions */ -extern void omap_iotlb_cr_to_e(struct cr_regs *cr, struct iotlb_entry *e); - extern int omap_iopgtable_store_entry(struct omap_iommu *obj, struct iotlb_entry *e); -extern int omap_foreach_iommu_device(void *data, - int (*fn)(struct device *, void *)); - #ifdef CONFIG_OMAP_IOMMU_DEBUG extern ssize_t omap_iommu_dump_ctx(struct omap_iommu *obj, char *buf, ssize_t len); -- 2.1.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 16/17] iommu/omap: Fix bus error on debugfs access of unattached IOMMU
Any debugfs access on an OMAP IOMMU that is not enabled (done during attach) results in a bus error due to access of registers without the clock or the reset enabled for the respective IOMMU. So, add a check to make sure the IOMMU is enabled/attached by a client device. This gracefully prints a Operation not permitted trace when the corresponding IOMMU is not enabled. Signed-off-by: Suman Anna s-a...@ti.com Acked-by: Laurent Pinchart laurent.pinch...@ideasonboard.com --- drivers/iommu/omap-iommu-debug.c | 14 ++ 1 file changed, 14 insertions(+) diff --git a/drivers/iommu/omap-iommu-debug.c b/drivers/iommu/omap-iommu-debug.c index 4813d3a..41b09a1 100644 --- a/drivers/iommu/omap-iommu-debug.c +++ b/drivers/iommu/omap-iommu-debug.c @@ -24,6 +24,11 @@ static DEFINE_MUTEX(iommu_debug_lock); static struct dentry *iommu_debug_root; +static inline bool is_omap_iommu_detached(struct omap_iommu *obj) +{ + return !obj-domain; +} + static ssize_t debug_read_regs(struct file *file, char __user *userbuf, size_t count, loff_t *ppos) { @@ -31,6 +36,9 @@ static ssize_t debug_read_regs(struct file *file, char __user *userbuf, char *p, *buf; ssize_t bytes; + if (is_omap_iommu_detached(obj)) + return -EPERM; + buf = kmalloc(count, GFP_KERNEL); if (!buf) return -ENOMEM; @@ -54,6 +62,9 @@ static ssize_t debug_read_tlb(struct file *file, char __user *userbuf, char *p, *buf; ssize_t bytes, rest; + if (is_omap_iommu_detached(obj)) + return -EPERM; + buf = kmalloc(count, GFP_KERNEL); if (!buf) return -ENOMEM; @@ -139,6 +150,9 @@ static ssize_t debug_read_pagetable(struct file *file, char __user *userbuf, char *p, *buf; size_t bytes; + if (is_omap_iommu_detached(obj)) + return -EPERM; + buf = (char *)__get_free_page(GFP_KERNEL); if (!buf) return -ENOMEM; -- 2.1.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 09/17] iommu/omap: Consolidate OMAP IOMMU modules
The OMAP IOMMU driver was originally designed as modules, and split into a core module and a thin arch-specific module through the OMAP arch-specific struct iommu_functions, to scale for both OMAP1 and OMAP2+ IOMMU variants. The driver can only be built for OMAP2+ platforms currently, and also can only be built-in after the adaptation to generic IOMMU API. The OMAP1 variant was never added and will most probably be never added (the code for the only potential user, its parent, DSP processor has already been cleaned up). So, consolidate the OMAP2 specific omap-iommu2 module into the core OMAP IOMMU driver - this eliminates the arch-specific ops structure and simplifies the driver into a single module that only implements the generic IOMMU API's iommu_ops. The following are the main changes: - omap-iommu2 module is completely eliminated, with the common definitions moved to the internal omap-iommu.h, and the ops implementations moved into omap-iommu.c - OMAP arch-specific struct iommu_functions is also eliminated, with the ops implementations directly absorbed into the calling functions - iotlb_alloc_cr() is no longer inlined and defined only when PREFETCH_IOTLB is defined - iotlb_dump_cr() is similarly defined only when CONFIG_OMAP_IOMMU_DEBUG is defined - Elimination of the OMAP IOMMU exported functions to register the arch ops, omap_install_iommu_arch() omap_uninstall_iommu_arch() - Any stale comments about OMAP1 are also cleaned up Signed-off-by: Suman Anna s-a...@ti.com Acked-by: Laurent Pinchart laurent.pinch...@ideasonboard.com --- drivers/iommu/Makefile | 1 - drivers/iommu/omap-iommu.c | 263 ++--- drivers/iommu/omap-iommu.h | 61 + drivers/iommu/omap-iommu2.c | 311 4 files changed, 217 insertions(+), 419 deletions(-) delete mode 100644 drivers/iommu/omap-iommu2.c diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile index 16edef7..18fa446 100644 --- a/drivers/iommu/Makefile +++ b/drivers/iommu/Makefile @@ -11,7 +11,6 @@ obj-$(CONFIG_INTEL_IOMMU) += iova.o intel-iommu.o obj-$(CONFIG_IPMMU_VMSA) += ipmmu-vmsa.o obj-$(CONFIG_IRQ_REMAP) += intel_irq_remapping.o irq_remapping.o obj-$(CONFIG_OMAP_IOMMU) += omap-iommu.o -obj-$(CONFIG_OMAP_IOMMU) += omap-iommu2.o obj-$(CONFIG_OMAP_IOMMU_DEBUG) += omap-iommu-debug.o obj-$(CONFIG_TEGRA_IOMMU_GART) += tegra-gart.o obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o diff --git a/drivers/iommu/omap-iommu.c b/drivers/iommu/omap-iommu.c index f9efa6b..91262fa 100644 --- a/drivers/iommu/omap-iommu.c +++ b/drivers/iommu/omap-iommu.c @@ -76,53 +76,23 @@ struct iotlb_lock { short vict; }; -/* accommodate the difference between omap1 and omap2/3 */ -static const struct iommu_functions *arch_iommu; - static struct platform_driver omap_iommu_driver; static struct kmem_cache *iopte_cachep; /** - * omap_install_iommu_arch - Install archtecure specific iommu functions - * @ops: a pointer to architecture specific iommu functions - * - * There are several kind of iommu algorithm(tlb, pagetable) among - * omap series. This interface installs such an iommu algorighm. - **/ -int omap_install_iommu_arch(const struct iommu_functions *ops) -{ - if (arch_iommu) - return -EBUSY; - - arch_iommu = ops; - return 0; -} -EXPORT_SYMBOL_GPL(omap_install_iommu_arch); - -/** - * omap_uninstall_iommu_arch - Uninstall archtecure specific iommu functions - * @ops: a pointer to architecture specific iommu functions - * - * This interface uninstalls the iommu algorighm installed previously. - **/ -void omap_uninstall_iommu_arch(const struct iommu_functions *ops) -{ - if (arch_iommu != ops) - pr_err(%s: not your arch\n, __func__); - - arch_iommu = NULL; -} -EXPORT_SYMBOL_GPL(omap_uninstall_iommu_arch); - -/** * omap_iommu_save_ctx - Save registers for pm off-mode support * @dev: client device **/ void omap_iommu_save_ctx(struct device *dev) { struct omap_iommu *obj = dev_to_omap_iommu(dev); + u32 *p = obj-ctx; + int i; - arch_iommu-save_ctx(obj); + for (i = 0; i (MMU_REG_SIZE / sizeof(u32)); i++) { + p[i] = iommu_read_reg(obj, i * sizeof(u32)); + dev_dbg(obj-dev, %s\t[%02d] %08x\n, __func__, i, p[i]); + } } EXPORT_SYMBOL_GPL(omap_iommu_save_ctx); @@ -133,20 +103,75 @@ EXPORT_SYMBOL_GPL(omap_iommu_save_ctx); void omap_iommu_restore_ctx(struct device *dev) { struct omap_iommu *obj = dev_to_omap_iommu(dev); + u32 *p = obj-ctx; + int i; - arch_iommu-restore_ctx(obj); + for (i = 0; i (MMU_REG_SIZE / sizeof(u32)); i++) { + iommu_write_reg(obj, p[i], i * sizeof(u32)); + dev_dbg(obj-dev, %s\t[%02d] %08x\n, __func__, i, p[i]); + } } EXPORT_SYMBOL_GPL(omap_iommu_restore_ctx); +static void __iommu_set_twl(struct omap_iommu *obj, bool on) +{ +
Re: [PATCH v2 12/17] iommu/omap: Integrate omap-iommu-debug into omap-iommu
Hi Suman, Thank you for the patch. On Wednesday 22 October 2014 17:22:30 Suman Anna wrote: The debugfs support for OMAP IOMMU is currently implemented as a module, warranting certain OMAP-specific IOMMU API to be exported. The OMAP IOMMU, when enabled, can only be built-in into the kernel, so integrate the OMAP IOMMU debug module into the OMAP IOMMU driver. This helps in eliminating the need to export most of the current OMAP IOMMU API. The following are the main changes: - The debugfs directory and entry creation logic is reversed, the calls are invoked by the OMAP IOMMU driver now. - The current iffy circular logic of adding IOMMU archdata to the IOMMU devices itself to get a pointer to the omap_iommu object in the debugfs support code is replaced by directly using the omap_iommu structure while creating the debugfs entries. - The debugfs root directory is renamed from the generic name iommu to a specific name omap_iommu. - Unneeded headers have also been cleaned up while at this. - There will no longer be a omap-iommu-debug.ko module after this patch. - The OMAP_IOMMU_DEBUG Kconfig option is converted to boolean only, the OMAP IOMMU debugfs support is built alongside the OMAP IOMMU driver only when this option is enabled. Signed-off-by: Suman Anna s-a...@ti.com Acked-by: Laurent Pinchart laurent.pinch...@ideasonboard.com --- drivers/iommu/Kconfig| 12 ++--- drivers/iommu/omap-iommu-debug.c | 100 ++- drivers/iommu/omap-iommu.c | 11 - drivers/iommu/omap-iommu.h | 15 ++ 4 files changed, 58 insertions(+), 80 deletions(-) diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index dd51122..1d54996 100644 --- a/drivers/iommu/Kconfig +++ b/drivers/iommu/Kconfig @@ -144,13 +144,13 @@ config OMAP_IOMMU select IOMMU_API config OMAP_IOMMU_DEBUG - tristate Export OMAP IOMMU internals in DebugFS - depends on OMAP_IOMMU DEBUG_FS - help - Select this to see extensive information about - the internal state of OMAP IOMMU in debugfs. + bool Export OMAP IOMMU internals in DebugFS + depends on OMAP_IOMMU DEBUG_FS + ---help--- + Select this to see extensive information about + the internal state of OMAP IOMMU in debugfs. - Say N unless you know you need this. + Say N unless you know you need this. config TEGRA_IOMMU_GART bool Tegra GART IOMMU Support diff --git a/drivers/iommu/omap-iommu-debug.c b/drivers/iommu/omap-iommu-debug.c index 28de657..4813d3a 100644 --- a/drivers/iommu/omap-iommu-debug.c +++ b/drivers/iommu/omap-iommu-debug.c @@ -10,15 +10,11 @@ * published by the Free Software Foundation. */ -#include linux/module.h #include linux/err.h -#include linux/clk.h #include linux/io.h #include linux/slab.h #include linux/uaccess.h -#include linux/platform_device.h #include linux/debugfs.h -#include linux/omap-iommu.h #include linux/platform_data/iommu-omap.h #include omap-iopgtable.h @@ -31,8 +27,7 @@ static struct dentry *iommu_debug_root; static ssize_t debug_read_regs(struct file *file, char __user *userbuf, size_t count, loff_t *ppos) { - struct device *dev = file-private_data; - struct omap_iommu *obj = dev_to_omap_iommu(dev); + struct omap_iommu *obj = file-private_data; char *p, *buf; ssize_t bytes; @@ -55,8 +50,7 @@ static ssize_t debug_read_regs(struct file *file, char __user *userbuf, static ssize_t debug_read_tlb(struct file *file, char __user *userbuf, size_t count, loff_t *ppos) { - struct device *dev = file-private_data; - struct omap_iommu *obj = dev_to_omap_iommu(dev); + struct omap_iommu *obj = file-private_data; char *p, *buf; ssize_t bytes, rest; @@ -141,8 +135,7 @@ out: static ssize_t debug_read_pagetable(struct file *file, char __user *userbuf, size_t count, loff_t *ppos) { - struct device *dev = file-private_data; - struct omap_iommu *obj = dev_to_omap_iommu(dev); + struct omap_iommu *obj = file-private_data; char *p, *buf; size_t bytes; @@ -181,93 +174,56 @@ DEBUG_FOPS_RO(pagetable); #define __DEBUG_ADD_FILE(attr, mode) \ { \ struct dentry *dent;\ - dent = debugfs_create_file(#attr, mode, parent, \ -dev, debug_##attr##_fops); \ + dent = debugfs_create_file(#attr, mode, obj-debug_dir, \ +obj, debug_##attr##_fops); \ if (!dent) \ - return -ENOMEM; \ + goto err;
Re: [PATCH v3 02/27] x86/xen/MSI: Eliminate arch_msix_mask_irq() and arch_msi_mask_irq()
On 2014/10/23 12:25, Bjorn Helgaas wrote: On Wed, Oct 15, 2014 at 11:06:50AM +0800, Yijing Wang wrote: Commit 0e4ccb1505a9 (PCI: Add x86_msi.msi_mask_irq() and msix_mask_irq()) introduced two __weak arch functions arch_msix_mask_irq() and arch_msi_mask_irq() to work around a bug when running xen in x86. These two functions made msi code more complex. This patch reverts the commit and introduces a global flag to control msi mask action to avoid the bug. The patch is also preparation for using MSI chip framework instead of weak arch MSI functions in all platforms. Keep default_msi_mask_irq() and default_msix_mask_irq() in linux/msi.h to make s390 MSI code compile happy, they will be removed in the later patch. This patch is basically a revert of 0e4ccb1505a9 intermingled with the addition of the new pci_msi_ignore_mask technique. Can you please split this into two patches: - Add the pci_msi_ignore_mask stuff (alongside the existing way) - Revert 0e4ccb1505a9 I think that will make it easier to see what's going on. OK, I will do it, thanks. Thanks! Yijing. Signed-off-by: Yijing Wang wangyij...@huawei.com CC: David Vrabel david.vra...@citrix.com CC: Konrad Rzeszutek Wilk konrad.w...@oracle.com --- Hi David and Konrad, I dropped the Acked-by and tested-by, because this version has a lot changes compared to last. So, I guess you may want to check it again. --- arch/x86/include/asm/x86_init.h |3 --- arch/x86/kernel/x86_init.c | 10 -- arch/x86/pci/xen.c | 14 ++ drivers/pci/msi.c | 29 - include/linux/msi.h |8 ++-- 5 files changed, 20 insertions(+), 44 deletions(-) diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h index e45e4da..f58a9c7 100644 --- a/arch/x86/include/asm/x86_init.h +++ b/arch/x86/include/asm/x86_init.h @@ -172,7 +172,6 @@ struct x86_platform_ops { struct pci_dev; struct msi_msg; -struct msi_desc; struct x86_msi_ops { int (*setup_msi_irqs)(struct pci_dev *dev, int nvec, int type); @@ -183,8 +182,6 @@ struct x86_msi_ops { void (*teardown_msi_irqs)(struct pci_dev *dev); void (*restore_msi_irqs)(struct pci_dev *dev); int (*setup_hpet_msi)(unsigned int irq, unsigned int id); -u32 (*msi_mask_irq)(struct msi_desc *desc, u32 mask, u32 flag); -u32 (*msix_mask_irq)(struct msi_desc *desc, u32 flag); }; struct IO_APIC_route_entry; diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c index e48b674..234b072 100644 --- a/arch/x86/kernel/x86_init.c +++ b/arch/x86/kernel/x86_init.c @@ -116,8 +116,6 @@ struct x86_msi_ops x86_msi = { .teardown_msi_irqs = default_teardown_msi_irqs, .restore_msi_irqs = default_restore_msi_irqs, .setup_hpet_msi = default_setup_hpet_msi, -.msi_mask_irq = default_msi_mask_irq, -.msix_mask_irq = default_msix_mask_irq, }; /* MSI arch specific hooks */ @@ -140,14 +138,6 @@ void arch_restore_msi_irqs(struct pci_dev *dev) { x86_msi.restore_msi_irqs(dev); } -u32 arch_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag) -{ -return x86_msi.msi_mask_irq(desc, mask, flag); -} -u32 arch_msix_mask_irq(struct msi_desc *desc, u32 flag) -{ -return x86_msi.msix_mask_irq(desc, flag); -} #endif struct x86_io_apic_ops x86_io_apic_ops = { diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c index 093f5f4..e5b8b78 100644 --- a/arch/x86/pci/xen.c +++ b/arch/x86/pci/xen.c @@ -394,14 +394,6 @@ static void xen_teardown_msi_irq(unsigned int irq) { xen_destroy_irq(irq); } -static u32 xen_nop_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag) -{ -return 0; -} -static u32 xen_nop_msix_mask_irq(struct msi_desc *desc, u32 flag) -{ -return 0; -} #endif int __init pci_xen_init(void) @@ -425,8 +417,7 @@ int __init pci_xen_init(void) x86_msi.setup_msi_irqs = xen_setup_msi_irqs; x86_msi.teardown_msi_irq = xen_teardown_msi_irq; x86_msi.teardown_msi_irqs = xen_teardown_msi_irqs; -x86_msi.msi_mask_irq = xen_nop_msi_mask_irq; -x86_msi.msix_mask_irq = xen_nop_msix_mask_irq; +pci_msi_ignore_mask = 1; #endif return 0; } @@ -506,8 +497,7 @@ int __init pci_xen_initial_domain(void) x86_msi.setup_msi_irqs = xen_initdom_setup_msi_irqs; x86_msi.teardown_msi_irq = xen_teardown_msi_irq; x86_msi.restore_msi_irqs = xen_initdom_restore_msi_irqs; -x86_msi.msi_mask_irq = xen_nop_msi_mask_irq; -x86_msi.msix_mask_irq = xen_nop_msix_mask_irq; +pci_msi_ignore_mask = 1; #endif xen_setup_acpi_sci(); __acpi_register_gsi = acpi_register_gsi_xen; diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index ecb92a5..22e413c 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -23,6 +23,7 @@ #include pci.h static int
Re: [PATCH v3 05/27] PCI: tegra: Save msi chip in pci_sys_data
On Wed, Oct 15, 2014 at 11:06:53AM +0800, Yijing Wang wrote: Save msi chip in pci_sys_data instead of assign msi chip to every pci bus in .add_bus(). Signed-off-by: Yijing Wang wangyij...@huawei.com --- drivers/pci/host/pci-tegra.c | 13 +++-- 1 files changed, 3 insertions(+), 10 deletions(-) diff --git a/drivers/pci/host/pci-tegra.c b/drivers/pci/host/pci-tegra.c index 3d43874..5af0525 100644 --- a/drivers/pci/host/pci-tegra.c +++ b/drivers/pci/host/pci-tegra.c @@ -694,15 +694,6 @@ static int tegra_pcie_map_irq(const struct pci_dev *pdev, u8 slot, u8 pin) return irq; } -static void tegra_pcie_add_bus(struct pci_bus *bus) -{ - if (IS_ENABLED(CONFIG_PCI_MSI)) { - struct tegra_pcie *pcie = sys_to_pcie(bus-sysdata); - - bus-msi = pcie-msi.chip; - } -} - static struct pci_bus *tegra_pcie_scan_bus(int nr, struct pci_sys_data *sys) { struct tegra_pcie *pcie = sys_to_pcie(sys); @@ -1881,11 +1872,13 @@ static int tegra_pcie_enable(struct tegra_pcie *pcie) memset(hw, 0, sizeof(hw)); +#ifdef CONFIG_PCI_MSI + hw.msi_chip = pcie-msi.chip; +#endif Why did you use #ifdef CONFIG_PCI_MSI instead of the IS_ENABLED(CONFIG_PCI_MSI) used previously? It's true that CONFIG_PCI_MSI will never be a tristate symbol, so we don't really *need* the extra smarts of IS_ENABLED(), but I'm fairly sympathetic to James' argument [1] that we should just use IS_ENABLED() all the time because it's simpler overall. If you want to change the #ifdef to IS_ENABLED(), that should be a separate patch from your msi_chip change, and we can debate the merits of that by itself. [1] http://lkml.iu.edu//hypermail/linux/kernel/1204.3/00081.html hw.nr_controllers = 1; hw.private_data = (void **)pcie; hw.setup = tegra_pcie_setup; hw.map_irq = tegra_pcie_map_irq; - hw.add_bus = tegra_pcie_add_bus; hw.scan = tegra_pcie_scan_bus; hw.ops = tegra_pcie_ops; -- 1.7.1 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 04/27] arm/MSI: Save MSI chip in pci_sys_data
On Wed, Oct 15, 2014 at 11:06:52AM +0800, Yijing Wang wrote: Saving msi chip in pci_sys_data can make pci bus and devices don't need to know msi chip detail, it also make pci enumeration code be decoupled from msi chip. In fact, all pci devices under the same pci hostbridge share same msi chip. So msi chip should be seen as one of resources or attributes to be initialized in pci host bridge driver. Currently, pci hostbridge drivers create pci_host_bridge in pci_create_root_bus(), and pass arch specific pci sysdata to core pci scan functions. So pci arch sysdata is good place to save msi chip. Signed-off-by: Yijing Wang wangyij...@huawei.com --- arch/arm/include/asm/mach/pci.h |6 ++ arch/arm/include/asm/pci.h |9 + arch/arm/kernel/bios32.c|3 +++ drivers/pci/msi.c |6 ++ include/linux/pci.h |9 + 5 files changed, 33 insertions(+), 0 deletions(-) diff --git a/arch/arm/include/asm/mach/pci.h b/arch/arm/include/asm/mach/pci.h index 7fc4278..59b0d87 100644 --- a/arch/arm/include/asm/mach/pci.h +++ b/arch/arm/include/asm/mach/pci.h @@ -22,6 +22,9 @@ struct hw_pci { #ifdef CONFIG_PCI_DOMAINS int domain; #endif +#ifdef CONFIG_PCI_MSI + struct msi_chip *msi_chip; +#endif struct pci_ops *ops; int nr_controllers; void**private_data; @@ -47,6 +50,9 @@ struct pci_sys_data { #ifdef CONFIG_PCI_DOMAINS int domain; #endif +#ifdef CONFIG_PCI_MSI + struct msi_chip *msi_chip; +#endif struct list_head node; int busnr; /* primary bus number */ u64 mem_offset; /* bus-cpu memory mapping offset */ diff --git a/arch/arm/include/asm/pci.h b/arch/arm/include/asm/pci.h index 7e95d85..b562c09 100644 --- a/arch/arm/include/asm/pci.h +++ b/arch/arm/include/asm/pci.h @@ -31,6 +31,15 @@ static inline int pci_proc_domain(struct pci_bus *bus) } #endif /* CONFIG_PCI_DOMAINS */ +#ifdef CONFIG_PCI_MSI +static inline struct msi_chip *pci_msi_chip(struct pci_bus *bus) +{ + struct pci_sys_data *root = bus-sysdata; + + return root-msi_chip; +} +#endif + /* * The PCI address space does equal the physical memory address space. * The networking and block device layers use this boolean for bounce diff --git a/arch/arm/kernel/bios32.c b/arch/arm/kernel/bios32.c index 17a26c1..a19038d 100644 --- a/arch/arm/kernel/bios32.c +++ b/arch/arm/kernel/bios32.c @@ -471,6 +471,9 @@ static void pcibios_init_hw(struct device *parent, struct hw_pci *hw, #ifdef CONFIG_PCI_DOMAINS sys-domain = hw-domain; #endif +#ifdef CONFIG_PCI_MSI + sys-msi_chip = hw-msi_chip; +#endif sys-busnr = busnr; sys-swizzle = hw-swizzle; sys-map_irq = hw-map_irq; diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index 22e413c..f11108c 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -35,6 +35,9 @@ int __weak arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc) struct msi_chip *chip = dev-bus-msi; int err; + if (!chip) + chip = pci_msi_chip(dev-bus); + if (!chip || !chip-setup_irq) return -EINVAL; @@ -50,6 +53,9 @@ void __weak arch_teardown_msi_irq(unsigned int irq) struct msi_desc *entry = irq_get_msi_desc(irq); struct msi_chip *chip = entry-dev-bus-msi; + if (!chip) + chip = pci_msi_chip(entry-dev-bus); + if (!chip || !chip-teardown_irq) return; diff --git a/include/linux/pci.h b/include/linux/pci.h index 9cd2721..7a48b40 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -1433,6 +1433,15 @@ static inline int pci_get_new_domain_nr(void) { return -ENOSYS; } #include asm/pci.h +/* Just avoid compile error, will be clean up later */ +#ifdef CONFIG_PCI_MSI + +#ifndef pci_msi_chip +#define pci_msi_chip(bus)NULL +#endif +#endif I don't like the mixture of ARM changes and PCI core changes in the same patch. Can you split this into a core patch that does something like this: struct msi_chip * __weak pcibios_msi_controller(struct pci_bus *bus) { return NULL; } struct msi_chip *pci_msi_controller(struct pci_bus *bus) { msi_chip *controller = bus-msi; if (controller) return controller; return pcibios_msi_controller(bus); } followed by an ARM patch that puts the msi_chip pointer in struct hw_pci and implements pcibios_msi_controller()? I know you're trying to *remove* weak functions, and this adds one, but this section of the series is more about getting rid of the ARM pcibios_add_bus() because all it was used for was setting the bus-msi pointer. Eventually we might have a way to stash an MSI controller pointer in the generic pci_host_bridge struct, and
Re: [PATCH v3 09/27] arm/PCI: Clean unused pcibios_add_bus() and pcibios_remove_bus()
On Wed, Oct 15, 2014 at 11:06:57AM +0800, Yijing Wang wrote: MSI chip will be saved in pci_sys_data, now we can clean up pcibios_add_bus() and pcibios_remove_bus() in arm, and use pci_find_msi_chip() to get msi chip in core MSI code. Signed-off-by: Yijing Wang wangyij...@huawei.com --- arch/arm/include/asm/mach/pci.h |4 arch/arm/kernel/bios32.c| 16 drivers/pci/msi.c | 11 +++ 3 files changed, 3 insertions(+), 28 deletions(-) diff --git a/arch/arm/include/asm/mach/pci.h b/arch/arm/include/asm/mach/pci.h index 59b0d87..230b2c9 100644 --- a/arch/arm/include/asm/mach/pci.h +++ b/arch/arm/include/asm/mach/pci.h @@ -39,8 +39,6 @@ struct hw_pci { resource_size_t start, resource_size_t size, resource_size_t align); - void(*add_bus)(struct pci_bus *bus); - void(*remove_bus)(struct pci_bus *bus); }; /* @@ -71,8 +69,6 @@ struct pci_sys_data { resource_size_t start, resource_size_t size, resource_size_t align); - void(*add_bus)(struct pci_bus *bus); - void(*remove_bus)(struct pci_bus *bus); void*private_data; /* platform controller private data */ }; diff --git a/arch/arm/kernel/bios32.c b/arch/arm/kernel/bios32.c index a19038d..b1b872e 100644 --- a/arch/arm/kernel/bios32.c +++ b/arch/arm/kernel/bios32.c @@ -360,20 +360,6 @@ void pcibios_fixup_bus(struct pci_bus *bus) } EXPORT_SYMBOL(pcibios_fixup_bus); -void pcibios_add_bus(struct pci_bus *bus) -{ - struct pci_sys_data *sys = bus-sysdata; - if (sys-add_bus) - sys-add_bus(bus); -} - -void pcibios_remove_bus(struct pci_bus *bus) -{ - struct pci_sys_data *sys = bus-sysdata; - if (sys-remove_bus) - sys-remove_bus(bus); -} - /* * Swizzle the device pin each time we cross a bridge. If a platform does * not provide a swizzle function, we perform the standard PCI swizzling. @@ -478,8 +464,6 @@ static void pcibios_init_hw(struct device *parent, struct hw_pci *hw, sys-swizzle = hw-swizzle; sys-map_irq = hw-map_irq; sys-align_resource = hw-align_resource; - sys-add_bus = hw-add_bus; - sys-remove_bus = hw-remove_bus; INIT_LIST_HEAD(sys-resources); if (hw-private_data) What do the core changes below have to do with the ARM changes above? They should be a separate patch unless they can't be separated. diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index f11108c..56e54ad 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -32,12 +32,10 @@ int pci_msi_ignore_mask; int __weak arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc) { - struct msi_chip *chip = dev-bus-msi; + struct msi_chip *chip; int err; - if (!chip) - chip = pci_msi_chip(dev-bus); - + chip = pci_msi_chip(dev-bus); if (!chip || !chip-setup_irq) return -EINVAL; @@ -51,10 +49,7 @@ int __weak arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc) void __weak arch_teardown_msi_irq(unsigned int irq) { struct msi_desc *entry = irq_get_msi_desc(irq); - struct msi_chip *chip = entry-dev-bus-msi; - - if (!chip) - chip = pci_msi_chip(entry-dev-bus); + struct msi_chip *chip = pci_msi_chip(entry-dev-bus); if (!chip || !chip-teardown_irq) return; -- 1.7.1 -- To unsubscribe from this list: send the line unsubscribe linux-pci in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 10/27] PCI/MSI: Remove useless bus-msi assignment
On Wed, Oct 15, 2014 at 11:06:58AM +0800, Yijing Wang wrote: Now msi chip is saved in pci_sys_data in arm, we could clean the bus-msi assignment in pci core. Signed-off-by: Yijing Wang wangyij...@huawei.com CC: Thierry Reding thierry.red...@gmail.com CC: Thomas Petazzoni thomas.petazz...@free-electrons.com --- drivers/pci/probe.c |1 - 1 files changed, 0 insertions(+), 1 deletions(-) diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index efa48dc..98bf4c3 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -682,7 +682,6 @@ static struct pci_bus *pci_alloc_child_bus(struct pci_bus *parent, child-parent = parent; child-ops = parent-ops; - child-msi = parent-msi; This needs an explanation of why ARM was the only arch to depend on this. child-sysdata = parent-sysdata; child-bus_flags = parent-bus_flags; -- 1.7.1 -- To unsubscribe from this list: send the line unsubscribe linux-pci in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 00/27] Use MSI chip framework to configure MSI/MSI-X in all platforms
On Wed, Oct 15, 2014 at 11:06:48AM +0800, Yijing Wang wrote: Now there are a lot of weak arch functions in MSI code. Thierry Reding Introduced MSI chip framework to configure MSI/MSI-X in arm, that's a better solution than overriding lots of existing weak arch functionsin. This series use MSI chip framework to refactor MSI code across all platforms to eliminate weak arch functions. Then all MSI irqs will be managed in a unified framework. Because this series changed a lot of ARCH MSI code, so tests in the related platforms are warmly welcomed! And you may access it at: https://github.com/YijingWang/msi-chip-v3.git master v2-v3: 1. For patch x86/xen/MSI: Eliminate..., introduce a new global flag pci_msi_ignore_mask to control the msi mask instead of replacing the irqchip-mask with nop function, the latter method has problem pointed out by Konrad Rzeszutek Wilk. 2. Save msi chip in arch pci sysdata instead of associating msi chip to pci bus. Because pci devices under same host share the same msi chip, so I think associate msi chip to pci host/pci sysdata is better than to bother every pci bus/devices. A better solution suggested by Liviu is to rip out pci_host_bridge from pci_create_root_bus(), then we can save some pci host common attributes like domain_nr, msi_chip, resources, into the generic pci_host_bridge. Because this changes to pci host bridge is also a large series, so I think we should go step by step, I will try to post it in another series later. 4. Clean up arm pcibios_add_bus() and pcibios_remove_bus() which were used to associate msi chip to pci bus. v1-v2: Add a patch to make s390 MSI code build happy between patch x86/xen/MSI: E.. and s390/MSI: Use MSI... Fix several typo problems found by Lucas. RFC-v1: Updated [patch 4/21] x86/xen/MSI: Eliminate..., export msi_chip instead of #ifdef to fix MSI bug in xen running in x86. Rename arch_get_match_msi_chip() to arch_find_msi_chip(). Drop use struct device as the msi_chip argument, we will do that later in another patchset. Yijing Wang (27): This is a lot of stuff, and it's not all related, so putting it all together in one series makes it hard for me to deal with it. MSI: Remove the redundant irq_set_chip_data() This doesn't seem related to eliminating weak functions. x86/xen/MSI: Eliminate arch_msix_mask_irq() and arch_msi_mask_irq() s390/MSI: Use __msi_mask_irq() instead of default_msi_mask_irq() These two seem to go together. arm/MSI: Save MSI chip in pci_sys_data PCI: tegra: Save msi chip in pci_sys_data PCI: designware: Save msi chip in pci_sys_data PCI: rcar: Save msi chip in pci_sys_data PCI: mvebu: Save msi chip in pci_sys_data arm/PCI: Clean unused pcibios_add_bus() and pcibios_remove_bus() PCI/MSI: Remove useless bus-msi assignment These seem to go together (it'd be nice if they were all capitalized the same). I don't like the msi_chip name because the chip part doesn't convey much information, other than telling us that apparently some sort of semiconductor integrated circuit is involved. I sort of assumed that much :) Something like msi_controller would be more descriptive since we're talking about an interrupt controller that accepts writes from a device and turns them into CPU interrupts, e.g., a LAPIC. PCI/MSI: Refactor struct msi_chip to make it become more common x86/MSI: Use MSI chip framework to configure MSI/MSI-X irq x86/xen/MSI: Use MSI chip framework to configure MSI/MSI-X irq Irq_remapping/MSI: Use MSI chip framework to configure MSI/MSI-X irq x86/MSI: Remove unused MSI weak arch functions Mips/MSI: Save msi chip in pci sysdata MIPS/Octeon/MSI: Use MSI chip framework to configure MSI/MSI-X irq MIPS/Xlp: Remove the dead function destroy_irq() to fix build error MIPS/Xlp/MSI: Use MSI chip framework to configure MSI/MSI-X irq MIPS/Xlr/MSI: Use MSI chip framework to configure MSI/MSI-X irq Powerpc/MSI: Use MSI chip framework to configure MSI/MSI-X irq s390/MSI: Use MSI chip framework to configure MSI/MSI-X irq arm/iop13xx/MSI: Use MSI chip framework to configure MSI/MSI-X irq IA64/MSI: Use MSI chip framework to configure MSI/MSI-X irq Sparc/MSI: Use MSI chip framework to configure MSI/MSI-X irq tile/MSI: Use MSI chip framework to configure MSI/MSI-X irq PCI/MSI: Clean up unused MSI arch functions arch/arm/include/asm/mach/pci.h | 10 +- arch/arm/include/asm/pci.h |9 ++ arch/arm/kernel/bios32.c| 19 +--- arch/arm/mach-iop13xx/include/mach/pci.h|4 + arch/arm/mach-iop13xx/iq81340mc.c |3 + arch/arm/mach-iop13xx/iq81340sc.c |5 +- arch/arm/mach-iop13xx/msi.c | 11 ++- arch/ia64/include/asm/pci.h | 10 ++ arch/ia64/kernel/msi_ia64.c | 14 ++- arch/ia64/pci/pci.c