Re: [PATCH 0/5] iommu/vt-d: Fix crash dump failure caused by legacy DMA/IO

2014-10-22 Thread Baoquan He
On 10/22/14 at 10:22am, Li, Zhen-Hua wrote:
 
 Hi Baoquan,
 I tested it on 3.17, it does not have these faults. There are little 
 differences between this version and Bill's last version.
 
 I will test it on 3.18.0-rc1+ on my system and let you know the result.
 
 And could you send me the result of lspci -vvv  on your system?

I have pasted them here.

[~]$ lspci -vvv
00:00.0 Host bridge: Intel Corporation Xeon E5/Core i7 DMI2 (rev 07)
Subsystem: Hewlett-Packard Company Device 1589
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast TAbort-
TAbort- MAbort- SERR- PERR- INTx-
Interrupt: pin A routed to IRQ 0
Capabilities: access denied

00:01.0 PCI bridge: Intel Corporation Xeon E5/Core i7 IIO PCI Express
Root Port 1a (rev 07) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr+ Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast TAbort-
TAbort- MAbort- SERR- PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Bus: primary=00, secondary=03, subordinate=03, sec-latency=0
I/O behind bridge: f000-0fff
Memory behind bridge: fff0-000f
Prefetchable memory behind bridge:
fff0-000f
Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast TAbort-
TAbort- MAbort- SERR- PERR-
BridgeCtl: Parity+ SERR+ NoISA- VGA- MAbort- Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: access denied
Kernel driver in use: pcieport

00:02.0 PCI bridge: Intel Corporation Xeon E5/Core i7 IIO PCI Express
Root Port 2a (rev 07) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr+ Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast TAbort-
TAbort- MAbort- SERR- PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Bus: primary=00, secondary=05, subordinate=05, sec-latency=0
I/O behind bridge: d000-dfff
Memory behind bridge: d600-d70f
Prefetchable memory behind bridge:
d800-ddff
Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast TAbort-
TAbort- MAbort+ SERR- PERR-
BridgeCtl: Parity+ SERR+ NoISA- VGA+ MAbort- Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: access denied
Kernel driver in use: pcieport

00:03.0 PCI bridge: Intel Corporation Xeon E5/Core i7 IIO PCI Express
Root Port 3a in PCI Express Mode (rev 07) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr+ Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast TAbort-
TAbort- MAbort- SERR- PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Bus: primary=00, secondary=04, subordinate=04, sec-latency=0
I/O behind bridge: f000-0fff
Memory behind bridge: fff0-000f
Prefetchable memory behind bridge:
fff0-000f
Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast TAbort-
TAbort- MAbort- SERR- PERR-
BridgeCtl: Parity+ SERR+ NoISA- VGA- MAbort- Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: access denied
Kernel driver in use: pcieport

00:05.0 System peripheral: Intel Corporation Xeon E5/Core i7 Address
Map, VTd_Misc, System Management (rev 07)
Subsystem: Hewlett-Packard Company Device 1589
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast TAbort-
TAbort- MAbort- SERR- PERR- INTx-
Capabilities: access denied

00:05.2 System peripheral: Intel Corporation Xeon E5/Core i7 Control
Status and Global Errors (rev 07)
Subsystem: Hewlett-Packard Company Device 1589
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast TAbort-
TAbort- MAbort- SERR- PERR- INTx-
Capabilities: access denied

00:05.4 PIC: Intel Corporation Xeon E5/Core i7 I/O APIC (rev 07)
(prog-if 20 [IO(X)-APIC])
Subsystem: Intel Corporation Xeon E5/Core i7 I/O APIC
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast TAbort-
TAbort- MAbort- SERR- PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Region 0: Memory at d7346000 (32-bit, non-prefetchable)
[size=4K]
Capabilities: access denied

00:11.0 PCI bridge: Intel Corporation 

Re: [PATCH 0/8] iommu/vt-d: Fix crash dump failure caused by legacy DMA/IO

2014-10-22 Thread Joerg Roedel
Hi Bjorn,

On Tue, Oct 21, 2014 at 08:16:46PM -0600, Bjorn Helgaas wrote:
 I was looking at Zhen-Hua's recent patches, trying to figure out if I
 need to do anything with them.  Resetting devices in the old kernel
 seems like a non-starter.  Resetting devices in the new kernel, ...,
 well, maybe.  It seems ugly, and it seems like the sort of problem
 that IOMMUs are designed to solve.

Actually resetting the devices in the kdump kernel would be one of the
better solutions for this problem. When we have a generic way to stop
all in-flight DMA from the PCI endpoints we could safely disable and
then re-enable the IOMMU.

 On Wed, Jul 2, 2014 at 7:32 AM, Joerg Roedel j...@8bytes.org wrote:
  That is a solution to prevent the in-flight DMA failures. But what
  happens when there is some in-flight DMA to a disk to write some inodes
  or a new superblock. Then this scratch address-space may cause
  filesystem corruption at worst.
 
 This in-flight DMA is from a device programmed by the old kernel, and
 it would be reading data from the old kernel's buffers.  I think
 you're suggesting that we might want that DMA read to complete so the
 device can update filesystem metadata?

Well, it is not about updating filesystem metadata. In the kdump kernel
we have these options:

1) Disable the IOMMU. Problem here is, that DMA is now
   untranslated, so that any in-flight DMA might read or write
   from a random location in memory, corrupting the kdump or
   even the new kexec kernel memory. So this is a non-starter.

2) Re-program the IOMMU to block all DMA. This is safer as it
   does not corrupt any data in memory. But some devices react
   very poorly on a master abort from the IOMMU, so bad that the
   driver in the kdump kernel fails to initialize that device.
   In this case taking dump itself might fail (and often does,
   according to reports)

3) To prevent master aborts like in option (2), David suggested
   to map the whole DMA address space to a scratch page. This
   also prevents system memory corruption and the master aborts.
   But the problem is, that in-flight DMA will now read all
   zeros. This can corrupt disk and network data, at worst it
   nulls out the superblocks of your filesystem and you lose all
   data. So this is not an option too.

What we currently do in the VT-d driver is a mixture of (1) and (2). The
VT-d driver disables the IOMMU hardware (opening a race window for
memory data corruption), re-initializes it to reject any ongoing DMA
(which causes master aborts for in-flight DMA) and re-enables the IOMMU
hardware.

But this strategy fails in heavy IO environments quite often and we look
into ways to solve the problem, or at least improve the current
situation as good as we can.

I talked to David about this at LPC and we came up with basically this
procedure:

1. If the VT-d driver finds the IOMMU enabled, it reuses its
   root-context table. This way the IOMMU must not be disabled
   and re-enabled, eliminating the race we have now.
   Other data structures like the context-entries are copied
   over from the old kernel.  We basically keep all mappings
   from the old kernel, allowing any in-flight DMA to succeed.
   No memory or disk data corruption.
   The issue here is, that we modify data from the old kernel
   which is about to be dumped. But there are ways to work
   around that.

2. When a device driver issues the first dma_map command for a
   device, we assign a new and empty page-table, thus removing
   all mappings from the old kernel for the device.
   Rationale is, that at this point the device driver should
   have reset the device to a point where all in-flight DMA is
   canceled.

This approach goes into the same direction as Bill Sumners patch-set,
which Li took over. But it goes not as far as keeping the old mappings
while the kdump kernel is still working with the devices (which might
introduce new issues and corner cases).

  So with this in mind I would prefer initially taking over the
  page-tables from the old kernel before the device drivers re-initialize
  the devices.
 
 This makes the dump kernel more dependent on data from the old kernel,
 which we obviously want to avoid when possible.

Sure, but this is not really possible here (unless we have a generic and
reliable way to reset all PCI endpoint devices and cancel all in-flight
DMA before we disable the IOMMU in the kdump kernel).
Otherwise we always risk data corruption somewhere, in system memory or
on disk.

 I didn't find the previous discussion where pointing every virtual bus
 address at the same physical scratch page was proposed.  Why was that
 better than programming the IOMMU to reject every DMA?

As I said, the problem is that this 

Re: [PATCH 12/17] iommu/omap: Integrate omap-iommu-debug into omap-iommu

2014-10-22 Thread Joerg Roedel
On Tue, Oct 21, 2014 at 04:28:27PM -0500, Suman Anna wrote:
 I am looking to refresh this series for 3.19, and this is the only patch
 that may need some changes. Please let me know what your preference is,
 and I can rework this patch if needed. Either way,
 the plan is to not have an OMAP IOMMU debugfs module, but only to
 have an option not to build the debugfs portions.

Can I apply this series without this patch and you send me an updated
version of only this patch? If yes, please let me know, otherwise please
rebase the series on v3.18-rc1 and resend.

Thanks,

Joerg

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v8 02/18] vfio: platform: probe to devices on the platform bus

2014-10-22 Thread Antonios Motakis
On Tue, Oct 21, 2014 at 6:37 PM, Eric Auger eric.au...@linaro.org wrote:
 On 10/21/2014 06:17 PM, Alex Williamson wrote:
 On Mon, 2014-10-13 at 15:10 +0200, Antonios Motakis wrote:
 Driver to bind to Linux platform devices, and callbacks to discover their
 resources to be used by the main VFIO PLATFORM code.

 Signed-off-by: Antonios Motakis a.mota...@virtualopensystems.com
 ---
  drivers/vfio/platform/vfio_platform.c | 107 
 ++
  include/uapi/linux/vfio.h |   1 +
  2 files changed, 108 insertions(+)
  create mode 100644 drivers/vfio/platform/vfio_platform.c

 diff --git a/drivers/vfio/platform/vfio_platform.c 
 b/drivers/vfio/platform/vfio_platform.c
 new file mode 100644
 index 000..baeb7da
 --- /dev/null
 +++ b/drivers/vfio/platform/vfio_platform.c
 @@ -0,0 +1,107 @@
 +/*
 + * Copyright (C) 2013 - Virtual Open Systems
 + * Author: Antonios Motakis a.mota...@virtualopensystems.com
 + *
 + * This program is free software; you can redistribute it and/or modify
 + * it under the terms of the GNU General Public License, version 2, as
 + * published by the Free Software Foundation.
 + *
 + * This program is distributed in the hope that it will be useful,
 + * but WITHOUT ANY WARRANTY; without even the implied warranty of
 + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 + * GNU General Public License for more details.
 + */
 +
 +#include linux/device.h
 +#include linux/eventfd.h
 +#include linux/interrupt.h
 +#include linux/iommu.h
 +#include linux/module.h
 +#include linux/mutex.h
 +#include linux/notifier.h
 +#include linux/pm_runtime.h
 +#include linux/slab.h
 +#include linux/types.h
 +#include linux/uaccess.h
 +#include linux/vfio.h
 +#include linux/io.h
 +#include linux/platform_device.h
 +#include linux/irq.h
 +
 +#include vfio_platform_private.h
 +
 +#define DRIVER_VERSION  0.8
 +#define DRIVER_AUTHOR   Antonios Motakis 
 a.mota...@virtualopensystems.com
 +#define DRIVER_DESC VFIO for platform devices - User Level 
 meta-driver
 +
 +/* probing devices from the linux platform bus */
 +
 +static struct resource *get_platform_resource(struct vfio_platform_device 
 *vdev,
 +int num)
 +{
 +struct platform_device *dev = (struct platform_device *) vdev-opaque;
 +int i;
 +
 +for (i = 0; i  dev-num_resources; i++) {
 +struct resource *r = dev-resource[i];
 +
 +if (resource_type(r)  (IORESOURCE_MEM|IORESOURCE_IO)) {
 +num--;
 +
 +if (!num)
 +return r;

 Has this been tested?  What happens when we call this with num = 0?
 Yep. I confirm I enter that case with my xgmac where the IORESOURCE_MEM
 is the 1st resource. I Just ended to the same cause ;-)

Right, num-- should be after the check, or we miss one region.


 Best Regards

 Eric

 +}
 +}
 +return NULL;
 +}
 +
 +static int get_platform_irq(struct vfio_platform_device *vdev, int i)
 +{
 +struct platform_device *pdev = (struct platform_device *) vdev-opaque;
 +
 +return platform_get_irq(pdev, i);
 +}
 +
 +
 +static int vfio_platform_probe(struct platform_device *pdev)
 +{
 +struct vfio_platform_device *vdev;
 +int ret;
 +
 +vdev = kzalloc(sizeof(*vdev), GFP_KERNEL);
 +if (!vdev)
 +return -ENOMEM;
 +
 +vdev-opaque = (void *) pdev;
 +vdev-name = pdev-name;
 +vdev-flags = VFIO_DEVICE_FLAGS_PLATFORM;
 +vdev-get_resource = get_platform_resource;
 +vdev-get_irq = get_platform_irq;
 +
 +ret = vfio_platform_probe_common(vdev, pdev-dev);
 +if (ret)
 +kfree(vdev);
 +
 +return ret;
 +}
 +
 +static int vfio_platform_remove(struct platform_device *pdev)
 +{
 +return vfio_platform_remove_common(pdev-dev);
 +}
 +
 +static struct platform_driver vfio_platform_driver = {
 +.probe  = vfio_platform_probe,
 +.remove = vfio_platform_remove,
 +.driver = {
 +.name   = vfio-platform,
 +.owner  = THIS_MODULE,
 +},
 +};
 +
 +module_platform_driver(vfio_platform_driver);
 +
 +MODULE_VERSION(DRIVER_VERSION);
 +MODULE_LICENSE(GPL v2);
 +MODULE_AUTHOR(DRIVER_AUTHOR);
 +MODULE_DESCRIPTION(DRIVER_DESC);
 diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
 index 111b5e8..aca6d3e 100644
 --- a/include/uapi/linux/vfio.h
 +++ b/include/uapi/linux/vfio.h
 @@ -158,6 +158,7 @@ struct vfio_device_info {
  __u32   flags;
  #define VFIO_DEVICE_FLAGS_RESET (1  0)/* Device supports 
 reset */
  #define VFIO_DEVICE_FLAGS_PCI   (1  1)/* vfio-pci device */
 +#define VFIO_DEVICE_FLAGS_PLATFORM (1  2) /* vfio-platform device */
  __u32   num_regions;/* Max region index + 1 */
  __u32   num_irqs;   /* Max IRQ index + 1 */
  };







-- 
Antonios Motakis
Virtual Open Systems
___
iommu mailing list
iommu@lists.linux-foundation.org

Re: [PATCH v8 07/18] vfio/platform: return info for device memory mapped IO regions

2014-10-22 Thread Antonios Motakis
On Tue, Oct 21, 2014 at 6:34 PM, Alex Williamson
alex.william...@redhat.com wrote:
 On Mon, 2014-10-13 at 15:10 +0200, Antonios Motakis wrote:
 This patch enables the IOCTLs VFIO_DEVICE_GET_REGION_INFO ioctl call,
 which allows the user to learn about the available MMIO resources of
 a device.

 Signed-off-by: Antonios Motakis a.mota...@virtualopensystems.com
 ---
  drivers/vfio/platform/vfio_platform_common.c  | 93 
 +--
  drivers/vfio/platform/vfio_platform_private.h | 22 +++
  2 files changed, 111 insertions(+), 4 deletions(-)

 diff --git a/drivers/vfio/platform/vfio_platform_common.c 
 b/drivers/vfio/platform/vfio_platform_common.c
 index 1e4073f..8a7e474 100644
 --- a/drivers/vfio/platform/vfio_platform_common.c
 +++ b/drivers/vfio/platform/vfio_platform_common.c
 @@ -27,17 +27,84 @@

  #include vfio_platform_private.h

 +static int vfio_platform_regions_init(struct vfio_platform_device *vdev)
 +{
 + int cnt = 0, i;
 +
 + while (vdev-get_resource(vdev, cnt))
 + cnt++;
 +
 + vdev-regions = kcalloc(cnt, sizeof(struct vfio_platform_region),
 + GFP_KERNEL);
 + if (!vdev-regions)
 + return -ENOMEM;
 +
 + for (i = 0; i  cnt;  i++) {
 + struct resource *res =
 + vdev-get_resource(vdev, i);
 +
 + if (!res)
 + goto err;
 +
 + vdev-regions[i].addr = res-start;
 + vdev-regions[i].size = resource_size(res);
 + vdev-regions[i].flags = 0;
 +
 + switch (resource_type(res)) {
 + case IORESOURCE_MEM:
 + vdev-regions[i].type = VFIO_PLATFORM_REGION_TYPE_MMIO;
 + break;
 + case IORESOURCE_IO:
 + vdev-regions[i].type = VFIO_PLATFORM_REGION_TYPE_PIO;
 + break;

 Ok, we have support for PIO in platform now (thanks!), does the user
 know what type of region they're dealing with?  Do they care?  For PCI
 the user tests the PCI BAR in config space to determine which type it
 is.  I'm guessing that platform would do something similar against the
 device tree or ACPI, right?

Maybe is worthwhile to add an explicit flag in
VFIO_DEVICE_GET_REGION_INFO for PIO regions. For platform devices I
don't know if we can always rely on DT or ACPI info to be available.
For VFIO PCI the BAR is always implemented, and while I have proposed
an RFC to return DT information, I don't think we can assume how a
device is described in the host, whether DT, ACPI, or dark magic.


 + default:
 + goto err;
 + }
 + }
 +
 + vdev-num_regions = cnt;
 +
 + return 0;
 +err:
 + kfree(vdev-regions);
 + return -EINVAL;
 +}
 +
 +static void vfio_platform_regions_cleanup(struct vfio_platform_device *vdev)
 +{
 + vdev-num_regions = 0;
 + kfree(vdev-regions);
 +}
 +
  static void vfio_platform_release(void *device_data)
  {
 + struct vfio_platform_device *vdev = device_data;
 +
 + if (atomic_dec_and_test(vdev-refcnt))
 + vfio_platform_regions_cleanup(vdev);
 +
   module_put(THIS_MODULE);
  }

  static int vfio_platform_open(void *device_data)
  {
 + struct vfio_platform_device *vdev = device_data;
 + int ret;
 +
   if (!try_module_get(THIS_MODULE))
   return -ENODEV;

 + if (atomic_inc_return(vdev-refcnt) == 1) {
 + ret = vfio_platform_regions_init(vdev);
 + if (ret)
 + goto err_reg;
 + }
 +
   return 0;
 +
 +err_reg:
 + module_put(THIS_MODULE);
 + return ret;

 Note that if vfio_platform_regions_init() fails then your refcnt is
 wrong.  We switched to a mutex in vfio-pci to avoid this.  See
 61d792562b53.

Ack.


  }

  static long vfio_platform_ioctl(void *device_data,
 @@ -58,15 +125,33 @@ static long vfio_platform_ioctl(void *device_data,
   return -EINVAL;

   info.flags = vdev-flags;
 - info.num_regions = 0;
 + info.num_regions = vdev-num_regions;
   info.num_irqs = 0;

   return copy_to_user((void __user *)arg, info, minsz);

 - } else if (cmd == VFIO_DEVICE_GET_REGION_INFO)
 - return -EINVAL;
 + } else if (cmd == VFIO_DEVICE_GET_REGION_INFO) {
 + struct vfio_region_info info;
 +
 + minsz = offsetofend(struct vfio_region_info, offset);
 +
 + if (copy_from_user(info, (void __user *)arg, minsz))
 + return -EFAULT;
 +
 + if (info.argsz  minsz)
 + return -EINVAL;
 +
 + if (info.index = vdev-num_regions)
 + return -EINVAL;
 +
 + /* map offset to the physical address  */
 + info.offset = VFIO_PLATFORM_INDEX_TO_OFFSET(info.index);
 + info.size = vdev-regions[info.index].size;
 + info.flags = 

Re: [PATCH v8 09/18] vfio/platform: support MMAP of MMIO regions

2014-10-22 Thread Antonios Motakis
On Tue, Oct 21, 2014 at 6:51 PM, Alex Williamson
alex.william...@redhat.com wrote:
 On Mon, 2014-10-13 at 15:10 +0200, Antonios Motakis wrote:
 Allow to memory map the MMIO regions of the device so userspace can
 directly access them. PIO regions are not being handled at this point.

 Signed-off-by: Antonios Motakis a.mota...@virtualopensystems.com
 ---
  drivers/vfio/platform/vfio_platform_common.c | 57 
 
  1 file changed, 57 insertions(+)

 diff --git a/drivers/vfio/platform/vfio_platform_common.c 
 b/drivers/vfio/platform/vfio_platform_common.c
 index ac74710..4db7187 100644
 --- a/drivers/vfio/platform/vfio_platform_common.c
 +++ b/drivers/vfio/platform/vfio_platform_common.c
 @@ -57,6 +57,16 @@ static int vfio_platform_regions_init(struct 
 vfio_platform_device *vdev)
   if (!(res-flags  IORESOURCE_READONLY))
   vdev-regions[i].flags |=
   VFIO_REGION_INFO_FLAG_WRITE;
 +
 + /*
 +  * Only regions addressed with PAGE granularity may be
 +  * MMAPed securely.
 +  */
 + if (!(vdev-regions[i].addr  ~PAGE_MASK) 
 + !(vdev-regions[i].size  ~PAGE_MASK))
 + vdev-regions[i].flags |=
 + VFIO_REGION_INFO_FLAG_MMAP;
 +

 Should this be included in the above !readonly test?  I don't see that
 we're doing anything below that would prevent writes to the mmap for a
 readonly resource.  I suspect that just like PCI, it's not all that
 useful to provide mmap support for read-only regions.  They're not
 typically performance paths.

Indeed. Alternatively we could just MMAP the region as read only as
well. Even if typically they are not performance critical areas, maybe
it doesn't hurt to allow MMAP when possible.


   break;
   case IORESOURCE_IO:
   vdev-regions[i].type = VFIO_PLATFORM_REGION_TYPE_PIO;
 @@ -325,8 +335,55 @@ static ssize_t vfio_platform_write(void *device_data, 
 const char __user *buf,
   return -EINVAL;
  }

 +static int vfio_platform_mmap_mmio(struct vfio_platform_region region,
 +struct vm_area_struct *vma)
 +{
 + u64 req_len, pgoff, req_start;
 +
 + req_len = vma-vm_end - vma-vm_start;
 + pgoff = vma-vm_pgoff 
 + ((1U  (VFIO_PLATFORM_OFFSET_SHIFT - PAGE_SHIFT)) - 1);
 + req_start = pgoff  PAGE_SHIFT;
 +
 + if (region.size  PAGE_SIZE || req_start + req_len  region.size)
 + return -EINVAL;
 +
 + vma-vm_page_prot = pgprot_noncached(vma-vm_page_prot);
 + vma-vm_pgoff = (region.addr  PAGE_SHIFT) + pgoff;
 +
 + return remap_pfn_range(vma, vma-vm_start, vma-vm_pgoff,
 +req_len, vma-vm_page_prot);
 +}
 +
  static int vfio_platform_mmap(void *device_data, struct vm_area_struct *vma)
  {
 + struct vfio_platform_device *vdev = device_data;
 + unsigned int index;
 +
 + index = vma-vm_pgoff  (VFIO_PLATFORM_OFFSET_SHIFT - PAGE_SHIFT);
 +
 + if (vma-vm_end  vma-vm_start)
 + return -EINVAL;
 + if ((vma-vm_flags  VM_SHARED) == 0)
 + return -EINVAL;
 + if (index = vdev-num_regions)
 + return -EINVAL;
 + if (vma-vm_start  ~PAGE_MASK)
 + return -EINVAL;
 + if (vma-vm_end  ~PAGE_MASK)
 + return -EINVAL;
 +
 + if (!(vdev-regions[index].flags  VFIO_REGION_INFO_FLAG_MMAP))
 + return -EINVAL;
 +
 + vma-vm_private_data = vdev;
 +
 + if (vdev-regions[index].type  VFIO_PLATFORM_REGION_TYPE_MMIO)
 + return vfio_platform_mmap_mmio(vdev-regions[index], vma);
 +
 + else if (vdev-regions[index].type  VFIO_PLATFORM_REGION_TYPE_PIO)
 + return -EINVAL; /* not implemented */
 +
   return -EINVAL;
  }







-- 
Antonios Motakis
Virtual Open Systems
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v8 13/18] vfio/platform: support for maskable and automasked interrupts

2014-10-22 Thread Antonios Motakis
On Tue, Oct 21, 2014 at 7:47 PM, Alex Williamson
alex.william...@redhat.com wrote:
 On Mon, 2014-10-13 at 15:10 +0200, Antonios Motakis wrote:
 Adds support to mask interrupts, and also for automasked interrupts.
 Level sensitive interrupts are exposed as automasked interrupts and
 are masked and disabled automatically when they fire.

 Signed-off-by: Antonios Motakis a.mota...@virtualopensystems.com
 ---
  drivers/vfio/platform/vfio_platform_irq.c | 94 
 +--
  drivers/vfio/platform/vfio_platform_private.h |  2 +
  2 files changed, 91 insertions(+), 5 deletions(-)

 diff --git a/drivers/vfio/platform/vfio_platform_irq.c 
 b/drivers/vfio/platform/vfio_platform_irq.c
 index 4359b9c..7620a17 100644
 --- a/drivers/vfio/platform/vfio_platform_irq.c
 +++ b/drivers/vfio/platform/vfio_platform_irq.c
 @@ -31,27 +31,103 @@

  #include vfio_platform_private.h

 +static void vfio_platform_mask(struct vfio_platform_irq *irq_ctx)
 +{
 + unsigned long flags;
 +
 + spin_lock_irqsave(irq_ctx-lock, flags);
 +
 + if (!irq_ctx-masked) {
 + disable_irq(irq_ctx-hwirq);
 + irq_ctx-masked = true;
 + }
 +
 + spin_unlock_irqrestore(irq_ctx-lock, flags);
 +}
 +
  static int vfio_platform_set_irq_mask(struct vfio_platform_device *vdev,
   unsigned index, unsigned start,
   unsigned count, uint32_t flags, void *data)
  {
 - return -EINVAL;
 + if (start != 0 || count != 1)
 + return -EINVAL;
 +
 + if (flags  VFIO_IRQ_SET_DATA_EVENTFD)
 + return -EINVAL; /* not implemented yet */
 +
 + if (flags  VFIO_IRQ_SET_DATA_NONE) {
 + vfio_platform_mask(vdev-irqs[index]);
 +
 + } else if (flags  VFIO_IRQ_SET_DATA_BOOL) {
 + uint8_t mask = *(uint8_t *)data;
 +
 + if (mask)
 + vfio_platform_mask(vdev-irqs[index]);
 + }
 +
 + return 0;
 +}
 +
 +static void vfio_platform_unmask(struct vfio_platform_irq *irq_ctx)
 +{
 + unsigned long flags;
 +
 + spin_lock_irqsave(irq_ctx-lock, flags);
 +
 + if (irq_ctx-masked) {
 + enable_irq(irq_ctx-hwirq);
 + irq_ctx-masked = false;
 + }
 +
 + spin_unlock_irqrestore(irq_ctx-lock, flags);
  }

  static int vfio_platform_set_irq_unmask(struct vfio_platform_device *vdev,
   unsigned index, unsigned start,
   unsigned count, uint32_t flags, void *data)
  {
 - return -EINVAL;
 + if (start != 0 || count != 1)
 + return -EINVAL;
 +
 + if (flags  VFIO_IRQ_SET_DATA_EVENTFD)
 + return -EINVAL; /* not implemented yet */
 +
 + if (flags  VFIO_IRQ_SET_DATA_NONE) {
 + vfio_platform_unmask(vdev-irqs[index]);
 +
 + } else if (flags  VFIO_IRQ_SET_DATA_BOOL) {
 + uint8_t unmask = *(uint8_t *)data;
 +
 + if (unmask)
 + vfio_platform_unmask(vdev-irqs[index]);
 + }
 +
 + return 0;
  }

  static irqreturn_t vfio_irq_handler(int irq, void *dev_id)
  {
   struct vfio_platform_irq *irq_ctx = dev_id;
 + unsigned long flags;
 + int ret = IRQ_NONE;

 - eventfd_signal(irq_ctx-trigger, 1);
 + spin_lock_irqsave(irq_ctx-lock, flags);
 +
 + if (!irq_ctx-masked) {
 + ret = IRQ_HANDLED;
 +
 + if (irq_ctx-flags  VFIO_IRQ_INFO_AUTOMASKED) {
 + disable_irq_nosync(irq_ctx-hwirq);
 + irq_ctx-masked = true;
 + }
 + }

 - return IRQ_HANDLED;
 + spin_unlock_irqrestore(irq_ctx-lock, flags);
 +
 + if (ret == IRQ_HANDLED)
 + eventfd_signal(irq_ctx-trigger, 1);
 +
 + return ret;
  }

 If you actually have edge interrupts, you're unnecessarily penalizing
 them with the spinlock here.  You could do like vfio-pci and only
 advertise level interrupts as maskable then use separate edge vs level
 handlers.


Ok, I shall implement separate handlers then.


  static int vfio_set_trigger(struct vfio_platform_device *vdev,
 @@ -169,9 +245,17 @@ int vfio_platform_irq_init(struct vfio_platform_device 
 *vdev)
   if (hwirq  0)
   goto err;

 - vdev-irqs[i].flags = VFIO_IRQ_INFO_EVENTFD;
 + spin_lock_init(vdev-irqs[i].lock);
 +
 + vdev-irqs[i].flags = VFIO_IRQ_INFO_EVENTFD
 + | VFIO_IRQ_INFO_MASKABLE;
 +
 + if (irq_get_trigger_type(hwirq)  IRQ_TYPE_LEVEL_MASK)
 + vdev-irqs[i].flags |= VFIO_IRQ_INFO_AUTOMASKED;
 +
   vdev-irqs[i].count = 1;
   vdev-irqs[i].hwirq = hwirq;
 + vdev-irqs[i].masked = false;
   }

   vdev-num_irqs = cnt;
 diff --git a/drivers/vfio/platform/vfio_platform_private.h 
 b/drivers/vfio/platform/vfio_platform_private.h
 index 47af6e0..65e80e7 100644
 --- 

Re: [PATCH v8 14/18] vfio: move eventfd support code for VFIO_PCI to a separate file

2014-10-22 Thread Antonios Motakis
On Tue, Oct 21, 2014 at 7:55 PM, Alex Williamson
alex.william...@redhat.com wrote:
 On Mon, 2014-10-13 at 15:10 +0200, Antonios Motakis wrote:
 The virqfd functionality that is used by VFIO_PCI to implement interrupt
 masking and unmasking via an eventfd, is generic enough and can be reused
 by another driver. Move it to a separate file in order to allow the code
 to be shared.

 Also properly export virqfd_enable and virqfd_disable in the process.

 Let's be friendly to the global namespace and add vfio_ prefixes to
 these as we export them.


Ack.



-- 
Antonios Motakis
Virtual Open Systems
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 3/5] vfio: type1: replace domain wide protection flags with supported capabilities

2014-10-22 Thread Antonios Motakis
On Wed, Oct 22, 2014 at 11:08 AM, Eric Auger eric.au...@linaro.org wrote:
 On 10/13/2014 03:09 PM, Antonios Motakis wrote:
 VFIO_IOMMU_TYPE1 keeps track for each domain it knows a list of protection
 flags it always applies to all mappings in the domain. This is used for
 domains that support IOMMU_CAP_CACHE_COHERENCY.

 Refactor this slightly, by keeping track instead that a given domain
 supports the capability, and applying the IOMMU_CACHE protection flag when
 doing the actual DMA mappings.

 This will allow us to reuse the behavior for IOMMU_CAP_NOEXEC, which we
 also want to keep track of, but without applying it to all domains that
 support it unless the user explicitly requests it.

 Signed-off-by: Antonios Motakis a.mota...@virtualopensystems.com
 ---
  drivers/vfio/vfio_iommu_type1.c | 25 +
  1 file changed, 17 insertions(+), 8 deletions(-)

 diff --git a/drivers/vfio/vfio_iommu_type1.c 
 b/drivers/vfio/vfio_iommu_type1.c
 index 562f686..62a8b4d 100644
 --- a/drivers/vfio/vfio_iommu_type1.c
 +++ b/drivers/vfio/vfio_iommu_type1.c
 @@ -64,7 +64,7 @@ struct vfio_domain {
   struct iommu_domain *domain;
   struct list_headnext;
   struct list_headgroup_list;
 - int prot;   /* IOMMU_CACHE */
 + int caps;
  };

  struct vfio_dma {
 @@ -485,7 +485,7 @@ static int map_try_harder(struct vfio_domain *domain, 
 dma_addr_t iova,
   for (i = 0; i  npage; i++, pfn++, iova += PAGE_SIZE) {
   ret = iommu_map(domain-domain, iova,
   (phys_addr_t)pfn  PAGE_SHIFT,
 - PAGE_SIZE, prot | domain-prot);
 + PAGE_SIZE, prot);
   if (ret)
   break;
   }
 @@ -503,11 +503,16 @@ static int vfio_iommu_map(struct vfio_iommu *iommu, 
 dma_addr_t iova,
   int ret;

   list_for_each_entry(d, iommu-domain_list, next) {
 + int dprot = prot;
 +
 + if (d-caps | IOMMU_CAP_CACHE_COHERENCY)
 should be 
 + dprot |= IOMMU_CACHE;
 +
   ret = iommu_map(d-domain, iova, (phys_addr_t)pfn  
 PAGE_SHIFT,
 - npage  PAGE_SHIFT, prot | d-prot);
 + npage  PAGE_SHIFT, dprot);
   if (ret) {
   if (ret != -EBUSY ||
 - map_try_harder(d, iova, pfn, npage, prot))
 + map_try_harder(d, iova, pfn, npage, dprot))
   goto unwind;
   }
   }
 @@ -620,6 +625,10 @@ static int vfio_iommu_replay(struct vfio_iommu *iommu,
   struct vfio_domain *d;
   struct rb_node *n;
   int ret;
 + int dprot = 0;
 +
 + if (domain-caps | IOMMU_CAP_CACHE_COHERENCY)
 same to be fixed here.

 With the 3 | corrections and num-- fix in get_platform_resource, v8
 is functional with Calxeda xgmac QEMU VFIO device.


Ack.

 Best Regards

 Eric
 + dprot |= IOMMU_CACHE;

   /* Arbitrarily pick the first domain in the list for lookups */
   d = list_first_entry(iommu-domain_list, struct vfio_domain, next);
 @@ -653,7 +662,7 @@ static int vfio_iommu_replay(struct vfio_iommu *iommu,
   size += PAGE_SIZE;

   ret = iommu_map(domain-domain, iova, phys,
 - size, dma-prot | domain-prot);
 + size, dma-prot | dprot);
   if (ret)
   return ret;

 @@ -721,7 +730,7 @@ static int vfio_iommu_type1_attach_group(void 
 *iommu_data,
   }

   if (iommu_capable(bus, IOMMU_CAP_CACHE_COHERENCY))
 - domain-prot |= IOMMU_CACHE;
 + domain-caps |= IOMMU_CAP_CACHE_COHERENCY;

   /*
* Try to match an existing compatible domain.  We don't want to
 @@ -732,7 +741,7 @@ static int vfio_iommu_type1_attach_group(void 
 *iommu_data,
*/
   list_for_each_entry(d, iommu-domain_list, next) {
   if (d-domain-ops == domain-domain-ops 
 - d-prot == domain-prot) {
 + d-caps == domain-caps) {
   iommu_detach_group(domain-domain, iommu_group);
   if (!iommu_attach_group(d-domain, iommu_group)) {
   list_add(group-next, d-group_list);
 @@ -865,7 +874,7 @@ static int vfio_domains_have_iommu_cache(struct 
 vfio_iommu *iommu)

   mutex_lock(iommu-lock);
   list_for_each_entry(domain, iommu-domain_list, next) {
 - if (!(domain-prot  IOMMU_CACHE)) {
 + if (!(domain-caps  IOMMU_CAP_CACHE_COHERENCY)) {
   ret = 0;
   break;
   }





-- 
Antonios Motakis
Virtual Open Systems
___
iommu mailing list
iommu@lists.linux-foundation.org

Re: [PATCH v2 11/18] iommu: exynos: remove useless device_add/remove callbacks

2014-10-22 Thread Alban Browaeys
Le mardi 16 septembre 2014 à 13:54 +0200, Marek Szyprowski a écrit :
 The driver doesn't need to do anything important in device add/remove
 callbacks, because initialization will be done from device-tree specific
 callbacks added later. IOMMU groups created by current code were never
 used.
 
 Signed-off-by: Marek Szyprowski m.szyprow...@samsung.com


The exyons iommu init fails if those are removed, that is it never reach
init_done:
1. exynos_iommu_setup
2. - exynos_iommu_init
3. ---bus_set_iommu
4. -- add_iommu_group

that is (4) add_iommu_group returns ENODEV to bus_set_iommu, the latter
doing so to exynos_iommu_init. Which thus error out before the init_done
is set to true.

Mind restoring device_add/remove  is not that easy. Either I did not and
modified add_iommu_group to return 0 if add_device was not defined. This
works.
Otherwise I did revert this patch but had to also move the iommu_init
from iommu.c before of_iommu_init in arch/arm/kernel/setup.c (switching
iommu_init to global namespace in the process). This to avoid use of not
yet initialized mutex  iommu_group_mutex and crash that follow suit.
This mutex is used in iommu_group_add_device called by
exynos_iommu_add_device.

The logs of the first issue, ie exynos_iommo_init and bus_set_iommu
returns ENODEV

[0.602816] exynos_iommu_init: Failed to register exynos-iommu
driver.
[0.603358] platform 1362.sysmmu: device is not dma coherent
[0.603384] platform 1362.sysmmu: device is not behind an iommu
[0.605243] exynos-sysmmu 1362.sysmmu: genpd_dev_pm_attach()
failed to find PM domain: -2
[0.606399] exynos_iommu_init: Failed to register exynos-iommu
driver.
[0.607263] platform 1363.sysmmu: device is not dma coherent
[0.607290] platform 1363.sysmmu: device is not behind an iommu
[0.609111] exynos-sysmmu 1362.sysmmu: genpd_dev_pm_attach()
failed to find PM domain: -2
[0.609464] exynos-sysmmu 1362.sysmmu: Unbalanced
pm_runtime_enable!
[0.609868] exynos-sysmmu 1363.sysmmu: genpd_dev_pm_attach()
failed to find PM domain: -2
(...)
[0.946506] platform 1080.g2d: device is not dma coherent
[0.946553] platform 1080.g2d: device is not behind an iommu
(...)



Mind the power domain failure comes from me not finding a way to apply
the power domain registration this early yet .


Best regards,
Alban Broweys


 ---
  drivers/iommu/exynos-iommu.c | 28 
  1 file changed, 28 deletions(-)
 
 diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c
 index b271348a4ec1..1b3f00726cd4 100644
 --- a/drivers/iommu/exynos-iommu.c
 +++ b/drivers/iommu/exynos-iommu.c
 @@ -1055,32 +1055,6 @@ static phys_addr_t exynos_iommu_iova_to_phys(struct 
 iommu_domain *domain,
   return phys;
  }
  
 -static int exynos_iommu_add_device(struct device *dev)
 -{
 - struct iommu_group *group;
 - int ret;
 -
 - group = iommu_group_get(dev);
 -
 - if (!group) {
 - group = iommu_group_alloc();
 - if (IS_ERR(group)) {
 - dev_err(dev, Failed to allocate IOMMU group\n);
 - return PTR_ERR(group);
 - }
 - }
 -
 - ret = iommu_group_add_device(group, dev);
 - iommu_group_put(group);
 -
 - return ret;
 -}
 -
 -static void exynos_iommu_remove_device(struct device *dev)
 -{
 - iommu_group_remove_device(dev);
 -}
 -
  static const struct iommu_ops exynos_iommu_ops = {
   .domain_init = exynos_iommu_domain_init,
   .domain_destroy = exynos_iommu_domain_destroy,
 @@ -1089,8 +1063,6 @@ static const struct iommu_ops exynos_iommu_ops = {
   .map = exynos_iommu_map,
   .unmap = exynos_iommu_unmap,
   .iova_to_phys = exynos_iommu_iova_to_phys,
 - .add_device = exynos_iommu_add_device,
 - .remove_device = exynos_iommu_remove_device,
   .pgsize_bitmap = SECT_SIZE | LPAGE_SIZE | SPAGE_SIZE,
  };
  


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH linux-next] iommu: add iommu for s390 platform

2014-10-22 Thread Frank Blaschka
Add a basic iommu for the s390 platform. The code is pretty simple
since on s390 each PCI device has its own virtual io address space
starting at the same vio address. For this a domain could hold only
one pci device. Also there is no relation between pci devices so each
device belongs to a separate iommu group.

Signed-off-by: Frank Blaschka blasc...@linux.vnet.ibm.com
---
 MAINTAINERS |   8 ++
 arch/s390/include/asm/pci.h |   3 +
 arch/s390/pci/pci_dma.c |  19 -
 drivers/iommu/Kconfig   |   9 +++
 drivers/iommu/Makefile  |   1 +
 drivers/iommu/s390-iommu.c  | 179 
 6 files changed, 217 insertions(+), 2 deletions(-)
 create mode 100644 drivers/iommu/s390-iommu.c

diff --git a/MAINTAINERS b/MAINTAINERS
index bc69ca4..a3ba11b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -7935,6 +7935,14 @@ F:   drivers/s390/net/*iucv*
 F: include/net/iucv/
 F: net/iucv/
 
+S390 IOMMU
+M:  Frank Blaschka blasc...@linux.vnet.ibm.com
+M:  linux...@de.ibm.com
+L:  linux-s...@vger.kernel.org
+W:  http://www.ibm.com/developerworks/linux/linux390/
+S:  Supported
+F:  drivers/iommu/s390-iommu.c
+
 S3C24XX SD/MMC Driver
 M: Ben Dooks ben-li...@fluff.org
 L: linux-arm-ker...@lists.infradead.org (moderated for non-subscribers)
diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
index c030900..6790d0d 100644
--- a/arch/s390/include/asm/pci.h
+++ b/arch/s390/include/asm/pci.h
@@ -177,6 +177,9 @@ struct zpci_dev *get_zdev_by_fid(u32);
 /* DMA */
 int zpci_dma_init(void);
 void zpci_dma_exit(void);
+int dma_update_trans(struct zpci_dev *zdev, unsigned long pa,
+dma_addr_t dma_addr, size_t size, int flags);
+void dma_purge_rto_entries(struct zpci_dev *zdev);
 
 /* FMB */
 int zpci_fmb_enable_device(struct zpci_dev *);
diff --git a/arch/s390/pci/pci_dma.c b/arch/s390/pci/pci_dma.c
index 4cbb29a..a4db33e 100644
--- a/arch/s390/pci/pci_dma.c
+++ b/arch/s390/pci/pci_dma.c
@@ -139,8 +139,8 @@ static void dma_update_cpu_trans(struct zpci_dev *zdev, 
void *page_addr,
entry_clr_protected(entry);
 }
 
-static int dma_update_trans(struct zpci_dev *zdev, unsigned long pa,
-   dma_addr_t dma_addr, size_t size, int flags)
+int dma_update_trans(struct zpci_dev *zdev, unsigned long pa,
+dma_addr_t dma_addr, size_t size, int flags)
 {
unsigned int nr_pages = PAGE_ALIGN(size)  PAGE_SHIFT;
u8 *page_addr = (u8 *) (pa  PAGE_MASK);
@@ -210,6 +210,21 @@ static void dma_cleanup_tables(struct zpci_dev *zdev)
zdev-dma_table = NULL;
 }
 
+void dma_purge_rto_entries(struct zpci_dev *zdev)
+{
+   unsigned long *table;
+   int rtx;
+
+   if (!zdev || !zdev-dma_table)
+   return;
+   table = zdev-dma_table;
+   for (rtx = 0; rtx  ZPCI_TABLE_ENTRIES; rtx++)
+   if (reg_entry_isvalid(table[rtx])) {
+   dma_free_seg_table(table[rtx]);
+   invalidate_table_entry(table[rtx]);
+   }
+}
+
 static unsigned long __dma_alloc_iommu(struct zpci_dev *zdev,
   unsigned long start, int size)
 {
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index dd51122..545e3fd 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -302,4 +302,13 @@ config ARM_SMMU
  Say Y here if your SoC includes an IOMMU device implementing
  the ARM SMMU architecture.
 
+config S390_IOMMU
+bool s390 IOMMU Support
+depends on S390  PCI
+select IOMMU_API
+help
+  Support for the IBM s/390 IOMMU
+
+  If unsure, say N here.
+
 endif # IOMMU_SUPPORT
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 16edef7..1278aad 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -19,3 +19,4 @@ obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
 obj-$(CONFIG_SHMOBILE_IOMMU) += shmobile-iommu.o
 obj-$(CONFIG_SHMOBILE_IPMMU) += shmobile-ipmmu.o
 obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
+obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
new file mode 100644
index 000..f9f048d
--- /dev/null
+++ b/drivers/iommu/s390-iommu.c
@@ -0,0 +1,179 @@
+#include linux/io.h
+#include linux/interrupt.h
+#include linux/platform_device.h
+#include linux/slab.h
+#include linux/pm_runtime.h
+#include linux/clk.h
+#include linux/err.h
+#include linux/mm.h
+#include linux/iommu.h
+#include linux/errno.h
+#include linux/list.h
+#include linux/memblock.h
+#include linux/export.h
+#include linux/pci.h
+#include linux/sizes.h
+#include asm/pci_dma.h
+
+#define S390_IOMMU_PGSIZES SZ_4K
+
+struct s390_domain {
+   struct zpci_dev *zdev;
+};
+
+static int s390_iommu_domain_init(struct iommu_domain *domain)
+{
+   struct s390_domain *priv;
+
+   priv = kzalloc(sizeof(*priv), GFP_KERNEL);
+   

Re: [PATCH v5 1/3] iommu/rockchip: rk3288 iommu driver

2014-10-22 Thread Joerg Roedel
On Tue, Oct 14, 2014 at 04:02:40PM +0800, Daniel Kurtz wrote:
 +static void rk_iommu_detach_device(struct iommu_domain *domain,
 +struct device *dev)
 +{
 + struct rk_iommu *iommu = dev_get_drvdata(dev-archdata.iommu);
 + struct rk_iommu_domain *rk_domain = domain-priv;
 + unsigned long flags;
 +
 + /* Allow 'virtual devices' (eg drm) to detach from domain */
 + if (!iommu)
 + return;
 +
 + iommu-domain = NULL;

I guess this line is a left-over? Setting iommu-domain to NULL here
before you disabled the IOMMU interrupt is racy. To be fully secure, you
should make sure that no interrupt handler is still running after you
disabled the IOMMU irq and before setting iommu-domain = NULL.

Other than that the code looks good.


Joerg

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 0/3] Rockchip IOMMU driver and devicetree bindings

2014-10-22 Thread Joerg Roedel
On Wed, Oct 01, 2014 at 06:20:40PM +0800, Daniel Kurtz wrote:
 Add a driver and devicetree bindings for the IOMMU found in Rockchip RK3288
 SoCs.
 
 Daniel Kurtz (3):
   iommu/rockchip: rk3288 iommu driver
   dt-bindings: iommu: Add documentation for rockchip iommu
   ARM: dts: rk3288: add VOP iommu nodes
 
  .../devicetree/bindings/iommu/rockchip,iommu.txt   |  26 +
  arch/arm/boot/dts/rk3288.dtsi  |  18 +
  drivers/iommu/Kconfig  |  11 +
  drivers/iommu/Makefile |   1 +
  drivers/iommu/rockchip-iommu.c | 924 
 +
  5 files changed, 980 insertions(+)
  create mode 100644 Documentation/devicetree/bindings/iommu/rockchip,iommu.txt
  create mode 100644 drivers/iommu/rockchip-iommu.c

Please make sure you also get Acks for the DT bindings. With that and
the issue I mentioned in the other email fixed, I will apply these
patches?


Joerg

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 0/3] iommu: replace IOMMU_EXEC with IOMMU_EXEC and update ARM SMMU driver

2014-10-22 Thread jroe...@suse.de
On Mon, Oct 20, 2014 at 07:42:01PM +0100, Will Deacon wrote:
 On Mon, Oct 20, 2014 at 04:39:15PM +0100, Will Deacon wrote:
  On Mon, Oct 13, 2014 at 02:06:15PM +0100, Antonios Motakis wrote:
   This patch series applies to Joerg Roedel's iommu/next branch, commit 
   09b5269a.
   It replaces the IOMMU_EXEC flag used by the ARM SMMU driver to 
   IOMMU_NOEXEC.
   This is more enforceable, since the lack of the flag on hardware that 
   doesn't
   support it implies that the target memory will be executable.
  
  Looks good to me; I'll take this via the arm-smmu tree and send it to Joerg
  along with anything else that gets queued for 3.19.
 
 The 0-day builder spotted a new warning from this series:
 
drivers/iommu/amd_iommu.c: In function 'amd_iommu_capable':
  drivers/iommu/amd_iommu.c:3409:2: warning: enumeration value 
  'IOMMU_CAP_NOEXEC' not handled in switch [-Wswitch]
  switch (cap) {
  ^
 
 I fixed it with the patch below, but I'd appreciate you and Joerg taking
 a look too.
 
 Cheers,
 
 Will
 
 ---8
 
 diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
 index 505a9adac2d5..3d78a8fb5a6a 100644
 --- a/drivers/iommu/amd_iommu.c
 +++ b/drivers/iommu/amd_iommu.c
 @@ -3411,6 +3411,8 @@ static bool amd_iommu_capable(enum iommu_cap cap)
 return true;
 case IOMMU_CAP_INTR_REMAP:
 return (irq_remapping_enabled == 1);
 +   case IOMMU_CAP_NOEXEC:
 +   return false;
 }

Looks good to me.

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH linux-next] iommu: add iommu for s390 platform

2014-10-22 Thread Frank Blaschka
On Wed, Oct 22, 2014 at 04:17:29PM +0200, Joerg Roedel wrote:
 Hi Frank,
 
 On Tue, Oct 21, 2014 at 01:57:25PM +0200, Frank Blaschka wrote:
  Add a basic iommu for the s390 platform. The code is pretty simple
  since on s390 each PCI device has its own virtual io address space
  starting at the same vio address.
 
 Are there any limitations on IOVA address space for the devices or can
 be really any system physical address mapped starting from 0 to 2^64?


Hi Joerg,

Basically there are no limitations. Depending on the s390 maschine
generation a device starts its IOVA at a specific address (announced by
the HW). But as I already told each device starts at the same address.
I think this prevents having multiple devices on the same IOMMU domain.
 
  For this a domain could hold only one pci device.
 
 This bothers me, as it is not compatible with the IOMMU-API. I looked a
 little bit into how the mappings are created, and it seems there is a
 per-device dma_table.


yes, you are absolutely right. There is a per-device dma_table.
There is no general IOMMU device but each pci device has its own IOMMU
translation capability.
 
 Is there any reason a dma_table can't be per IOMMU domain and assigned
 to multiple devices at the same time?

Is there a possibility the IOMMU domain can support e.g. something like

VIOA 0x1 - pci device 1
VIOA 0x1 - pci device 2

 
 Otherwise the code looks quite simple and straight forward.


Thx for your review and help 

Frank
 
 
   Joerg
 
 

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v8 07/18] vfio/platform: return info for device memory mapped IO regions

2014-10-22 Thread Alex Williamson
On Wed, 2014-10-22 at 15:54 +0200, Antonios Motakis wrote:
 On Tue, Oct 21, 2014 at 6:34 PM, Alex Williamson
 alex.william...@redhat.com wrote:
  On Mon, 2014-10-13 at 15:10 +0200, Antonios Motakis wrote:
  This patch enables the IOCTLs VFIO_DEVICE_GET_REGION_INFO ioctl call,
  which allows the user to learn about the available MMIO resources of
  a device.
 
  Signed-off-by: Antonios Motakis a.mota...@virtualopensystems.com
  ---
   drivers/vfio/platform/vfio_platform_common.c  | 93 
  +--
   drivers/vfio/platform/vfio_platform_private.h | 22 +++
   2 files changed, 111 insertions(+), 4 deletions(-)
 
  diff --git a/drivers/vfio/platform/vfio_platform_common.c 
  b/drivers/vfio/platform/vfio_platform_common.c
  index 1e4073f..8a7e474 100644
  --- a/drivers/vfio/platform/vfio_platform_common.c
  +++ b/drivers/vfio/platform/vfio_platform_common.c
  @@ -27,17 +27,84 @@
 
   #include vfio_platform_private.h
 
  +static int vfio_platform_regions_init(struct vfio_platform_device *vdev)
  +{
  + int cnt = 0, i;
  +
  + while (vdev-get_resource(vdev, cnt))
  + cnt++;
  +
  + vdev-regions = kcalloc(cnt, sizeof(struct vfio_platform_region),
  + GFP_KERNEL);
  + if (!vdev-regions)
  + return -ENOMEM;
  +
  + for (i = 0; i  cnt;  i++) {
  + struct resource *res =
  + vdev-get_resource(vdev, i);
  +
  + if (!res)
  + goto err;
  +
  + vdev-regions[i].addr = res-start;
  + vdev-regions[i].size = resource_size(res);
  + vdev-regions[i].flags = 0;
  +
  + switch (resource_type(res)) {
  + case IORESOURCE_MEM:
  + vdev-regions[i].type = 
  VFIO_PLATFORM_REGION_TYPE_MMIO;
  + break;
  + case IORESOURCE_IO:
  + vdev-regions[i].type = 
  VFIO_PLATFORM_REGION_TYPE_PIO;
  + break;
 
  Ok, we have support for PIO in platform now (thanks!), does the user
  know what type of region they're dealing with?  Do they care?  For PCI
  the user tests the PCI BAR in config space to determine which type it
  is.  I'm guessing that platform would do something similar against the
  device tree or ACPI, right?
 
 Maybe is worthwhile to add an explicit flag in
 VFIO_DEVICE_GET_REGION_INFO for PIO regions. For platform devices I
 don't know if we can always rely on DT or ACPI info to be available.
 For VFIO PCI the BAR is always implemented, and while I have proposed
 an RFC to return DT information, I don't think we can assume how a
 device is described in the host, whether DT, ACPI, or dark magic.

Is this already handled by the fact that vfio-platform is not meant to
be a generic meta driver to the same extent as vfio-pci?  There is no
self-describing config space on platform devices like there is on PCI,
so the user will need to know in advance somehow what the device is and
what resources/irqs it uses.  We do need to make sure though that we
provide a userspace ABI that allows them to match VFIO indexes to the
device in a predictable way.  It's not fully clear to me how that works.
Thanks,

Alex

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v8 09/18] vfio/platform: support MMAP of MMIO regions

2014-10-22 Thread Alex Williamson
On Wed, 2014-10-22 at 15:55 +0200, Antonios Motakis wrote:
 On Tue, Oct 21, 2014 at 6:51 PM, Alex Williamson
 alex.william...@redhat.com wrote:
  On Mon, 2014-10-13 at 15:10 +0200, Antonios Motakis wrote:
  Allow to memory map the MMIO regions of the device so userspace can
  directly access them. PIO regions are not being handled at this point.
 
  Signed-off-by: Antonios Motakis a.mota...@virtualopensystems.com
  ---
   drivers/vfio/platform/vfio_platform_common.c | 57 
  
   1 file changed, 57 insertions(+)
 
  diff --git a/drivers/vfio/platform/vfio_platform_common.c 
  b/drivers/vfio/platform/vfio_platform_common.c
  index ac74710..4db7187 100644
  --- a/drivers/vfio/platform/vfio_platform_common.c
  +++ b/drivers/vfio/platform/vfio_platform_common.c
  @@ -57,6 +57,16 @@ static int vfio_platform_regions_init(struct 
  vfio_platform_device *vdev)
if (!(res-flags  IORESOURCE_READONLY))
vdev-regions[i].flags |=
VFIO_REGION_INFO_FLAG_WRITE;
  +
  + /*
  +  * Only regions addressed with PAGE granularity may 
  be
  +  * MMAPed securely.
  +  */
  + if (!(vdev-regions[i].addr  ~PAGE_MASK) 
  + !(vdev-regions[i].size  
  ~PAGE_MASK))
  + vdev-regions[i].flags |=
  + VFIO_REGION_INFO_FLAG_MMAP;
  +
 
  Should this be included in the above !readonly test?  I don't see that
  we're doing anything below that would prevent writes to the mmap for a
  readonly resource.  I suspect that just like PCI, it's not all that
  useful to provide mmap support for read-only regions.  They're not
  typically performance paths.
 
 Indeed. Alternatively we could just MMAP the region as read only as
 well. Even if typically they are not performance critical areas, maybe
 it doesn't hurt to allow MMAP when possible.

Sure, we could allow a read-only mmap.  On PCI the only read-only region
is the PCI option ROM.  The PCI spec indicates that devices may use the
same address decoders for the ROM as for BARs, so the ROM and BAR may
not be simultaneously accessible.  QEMU also caches the ROM once the
guest does read it, so it's never been an issue to allow mmap of that
resource.  I would probably only bother to provide read-only mmap if you
know vfio-platform devices are going to make more significant use of
read-only regions.  Thanks,

Alex

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/1] pci: fix dmar fault for kdump kernel

2014-10-22 Thread Alexander Duyck
On 10/21/2014 07:47 PM, Bjorn Helgaas wrote:
 [+cc Joerg, Eric, Tom, David, iommu list]

 On Wed, Oct 15, 2014 at 2:14 AM, Takao Indoh indou.ta...@jp.fujitsu.com 
 wrote:
 (2014/10/14 18:34), Li, ZhenHua wrote:
 I tested on the latest stable version 3.17, it works well.

 On 10/10/2014 03:13 PM, Li, Zhen-Hua wrote:
 On a HP system with Intel vt-d supported and many PCI devices on it,
 when kernel crashed and the kdump kernel boots with intel_iommu=on,
 there may be some unexpected DMA requests on this adapter, which will
 cause DMA Remapping faults like:
  dmar: DRHD: handling fault status reg 102
  dmar: DMAR:[DMA Read] Request device [41:00.0] fault addr fff81000
  DMAR:[fault reason 01] Present bit in root entry is clear

 This bug may happen on *any* PCI device.
 Analysis for this bug:

 The present bit is set in this function:

 static struct context_entry * device_to_context_entry(
  struct intel_iommu *iommu, u8 bus, u8 devfn)
 {
  ..
  set_root_present(root);
  ..
 }

 Calling tree:
  device driver
  intel_alloc_coherent
  __intel_map_single
  domain_context_mapping
  domain_context_mapping_one
  device_to_context_entry

 This means, the present bit in root entry will not be set until the device
 driver is loaded.

 But in the kdump kernel, hardware devices are not aware that control has
 transferred to the second kernel, and those drivers must initialize again.
 Consequently there may be unexpected DMA requests from devices activity
 initiated in the first kernel leading to the DMA Remapping errors in the
 second kernel.

 To fix this DMAR fault, we need to reset the bus that this device on. Reset
 the device itself does not work.
 You have not explained why the DMAR faults are a problem.  The fault
 is just an indication that the IOMMU prevented a DMA from completing.
 If the DMA is an artifact of the crashed kernel, we probably don't
 *want* it to complete, so taking a DMAR fault seems like exactly the
 right thing.

 If the problem is that we're being flooded with messages, it's easy
 enough to just tone down the printks.

As I recall what we have seen in the past with the network controllers
is that they get stuck in a state where they can no longer perform any
DMA due to the fact that some of the transactions have returned errors
from the IOMMU being reset.  The only way out is to perform a PCIe reset
on the part after the IOMMU has been enabled which doesn't occur
automatically unless AER or EEH is enabled in the system.

One thought would be to take a look at the IOMMU reset code.  Is there
any way to go through and make sure that all of the PCI devices that
make use of the IOMMU have the bus mastering disabled prior to the IOMMU
being reset?  For example could we suspend all of the parts in order to
force them to hold off any transactions, and then resume them after the
IOMMU has been reset?  If we could do at least that much that would
prevent the errors and should allow for a graceful reset.

Thanks,

Alex


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/1] pci: fix dmar fault for kdump kernel

2014-10-22 Thread Bjorn Helgaas
On Wed, Oct 22, 2014 at 10:54 AM, Alexander Duyck
alexander.du...@gmail.com wrote:
 On 10/21/2014 07:47 PM, Bjorn Helgaas wrote:
 [+cc Joerg, Eric, Tom, David, iommu list]

 On Wed, Oct 15, 2014 at 2:14 AM, Takao Indoh indou.ta...@jp.fujitsu.com 
 wrote:
 (2014/10/14 18:34), Li, ZhenHua wrote:
 I tested on the latest stable version 3.17, it works well.

 On 10/10/2014 03:13 PM, Li, Zhen-Hua wrote:

 To fix this DMAR fault, we need to reset the bus that this device on. 
 Reset
 the device itself does not work.
 You have not explained why the DMAR faults are a problem.  The fault
 is just an indication that the IOMMU prevented a DMA from completing.
 If the DMA is an artifact of the crashed kernel, we probably don't
 *want* it to complete, so taking a DMAR fault seems like exactly the
 right thing.

 If the problem is that we're being flooded with messages, it's easy
 enough to just tone down the printks.

 As I recall what we have seen in the past with the network controllers
 is that they get stuck in a state where they can no longer perform any
 DMA due to the fact that some of the transactions have returned errors
 from the IOMMU being reset.  The only way out is to perform a PCIe reset
 on the part after the IOMMU has been enabled which doesn't occur
 automatically unless AER or EEH is enabled in the system.

OK, now we're talking about a real issue, the sort of thing that
should be in the changelog for a change like this.

I'm uneasy about the strategy of it hurts when an IOMMU fault occurs,
therefore we need to avoid all IOMMU faults.  Isn't the whole *point*
of an IOMMU to generate faults?  It seems like we need to be able to
handle faults gracefully.

If having AER or EEH enabled in the kdump kernel is part of what's
required to recover, I don't see a problem with requiring that.

Don't we have to be able to recover from IOMMU faults for the device
pass-through case anyway?  If a NIC is passed through to a malicious
guest, I assume the guest can cause IOMMU faults.  I assume we handle
this today by resetting the NIC when the guest exits.

 One thought would be to take a look at the IOMMU reset code.  Is there
 any way to go through and make sure that all of the PCI devices that
 make use of the IOMMU have the bus mastering disabled prior to the IOMMU
 being reset?  For example could we suspend all of the parts in order to
 force them to hold off any transactions, and then resume them after the
 IOMMU has been reset?  If we could do at least that much that would
 prevent the errors and should allow for a graceful reset.

 Thanks,

 Alex


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v8 07/18] vfio/platform: return info for device memory mapped IO regions

2014-10-22 Thread Antonios Motakis
On Wed, Oct 22, 2014 at 6:46 PM, Alex Williamson
alex.william...@redhat.com wrote:
 On Wed, 2014-10-22 at 15:54 +0200, Antonios Motakis wrote:
 On Tue, Oct 21, 2014 at 6:34 PM, Alex Williamson
 alex.william...@redhat.com wrote:
  On Mon, 2014-10-13 at 15:10 +0200, Antonios Motakis wrote:
  This patch enables the IOCTLs VFIO_DEVICE_GET_REGION_INFO ioctl call,
  which allows the user to learn about the available MMIO resources of
  a device.
 
  Signed-off-by: Antonios Motakis a.mota...@virtualopensystems.com
  ---
   drivers/vfio/platform/vfio_platform_common.c  | 93 
  +--
   drivers/vfio/platform/vfio_platform_private.h | 22 +++
   2 files changed, 111 insertions(+), 4 deletions(-)
 
  diff --git a/drivers/vfio/platform/vfio_platform_common.c 
  b/drivers/vfio/platform/vfio_platform_common.c
  index 1e4073f..8a7e474 100644
  --- a/drivers/vfio/platform/vfio_platform_common.c
  +++ b/drivers/vfio/platform/vfio_platform_common.c
  @@ -27,17 +27,84 @@
 
   #include vfio_platform_private.h
 
  +static int vfio_platform_regions_init(struct vfio_platform_device *vdev)
  +{
  + int cnt = 0, i;
  +
  + while (vdev-get_resource(vdev, cnt))
  + cnt++;
  +
  + vdev-regions = kcalloc(cnt, sizeof(struct vfio_platform_region),
  + GFP_KERNEL);
  + if (!vdev-regions)
  + return -ENOMEM;
  +
  + for (i = 0; i  cnt;  i++) {
  + struct resource *res =
  + vdev-get_resource(vdev, i);
  +
  + if (!res)
  + goto err;
  +
  + vdev-regions[i].addr = res-start;
  + vdev-regions[i].size = resource_size(res);
  + vdev-regions[i].flags = 0;
  +
  + switch (resource_type(res)) {
  + case IORESOURCE_MEM:
  + vdev-regions[i].type = 
  VFIO_PLATFORM_REGION_TYPE_MMIO;
  + break;
  + case IORESOURCE_IO:
  + vdev-regions[i].type = 
  VFIO_PLATFORM_REGION_TYPE_PIO;
  + break;
 
  Ok, we have support for PIO in platform now (thanks!), does the user
  know what type of region they're dealing with?  Do they care?  For PCI
  the user tests the PCI BAR in config space to determine which type it
  is.  I'm guessing that platform would do something similar against the
  device tree or ACPI, right?

 Maybe is worthwhile to add an explicit flag in
 VFIO_DEVICE_GET_REGION_INFO for PIO regions. For platform devices I
 don't know if we can always rely on DT or ACPI info to be available.
 For VFIO PCI the BAR is always implemented, and while I have proposed
 an RFC to return DT information, I don't think we can assume how a
 device is described in the host, whether DT, ACPI, or dark magic.

 Is this already handled by the fact that vfio-platform is not meant to
 be a generic meta driver to the same extent as vfio-pci?  There is no
 self-describing config space on platform devices like there is on PCI,
 so the user will need to know in advance somehow what the device is and
 what resources/irqs it uses.  We do need to make sure though that we
 provide a userspace ABI that allows them to match VFIO indexes to the
 device in a predictable way.  It's not fully clear to me how that works.
 Thanks,

Yeah, it is a fact of life that the user needs to know what regions he
is accessing, just from their ordering. Even with the extension for
accessing device tree data, the user still needs to know what each
region is and what info he is looking for from the device tree. In
that respect we could delegate the responsibility to the user to just
know what kind of device he is accessing and what kind of regions it
features (and in what order).


 Alex




-- 
Antonios Motakis
Virtual Open Systems
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/1] IOMMU-MSM: Deletion of unnecessary checks before the function call clk_disable

2014-10-22 Thread SF Markus Elfring
 If you are convinced that dropping the null tests is a good idea, then you 
 can submit the patch that makes the change to the relevant maintainers and 
 mailing lists.

From af73fb59d5d4b2c289fb236d0752522b6b38 Mon Sep 17 00:00:00 2001
From: Markus Elfring elfr...@users.sourceforge.net
Date: Wed, 22 Oct 2014 19:39:21 +0200
Subject: [PATCH] IOMMU-MSM: Deletion of unnecessary checks before the function
 call clk_disable

A semantic patch approach was proposed with the subject [PATCH with
Coccinelle?] Deletion of unnecessary checks before specific function calls
on 2014-03-05.
https://lkml.org/lkml/2014/3/5/344
http://article.gmane.org/gmane.comp.version-control.coccinelle/3513/

This patch pattern application was repeated with the help of the software
Coccinelle 1.0.0-rc22 on the source files for Linux 3.17.1. An extract
of the automatically generated update suggestions is shown here.

It was determined that the affected source code places call functions
which perform input parameter validation already. It is therefore not
needed that a similar safety check is repeated at the call site.

Signed-off-by: Markus Elfring elfr...@users.sourceforge.net
---
 drivers/iommu/msm_iommu.c | 3 +--
 drivers/iommu/msm_iommu_dev.c | 6 ++
 2 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/msm_iommu.c b/drivers/iommu/msm_iommu.c
index 6e3dcc28..3e4d888 100644
--- a/drivers/iommu/msm_iommu.c
+++ b/drivers/iommu/msm_iommu.c
@@ -73,8 +73,7 @@ fail:

 static void __disable_clocks(struct msm_iommu_drvdata *drvdata)
 {
-   if (drvdata-clk)
-   clk_disable(drvdata-clk);
+   clk_disable(drvdata-clk);
clk_disable(drvdata-pclk);
 }

diff --git a/drivers/iommu/msm_iommu_dev.c b/drivers/iommu/msm_iommu_dev.c
index 61def7cb..9574d21 100644
--- a/drivers/iommu/msm_iommu_dev.c
+++ b/drivers/iommu/msm_iommu_dev.c
@@ -224,8 +224,7 @@ static int msm_iommu_probe(struct platform_device *pdev)

platform_set_drvdata(pdev, drvdata);

-   if (iommu_clk)
-   clk_disable(iommu_clk);
+   clk_disable(iommu_clk);

clk_disable(iommu_pclk);

@@ -323,8 +322,7 @@ static int msm_iommu_ctx_probe(struct platform_device *pdev)
SET_NSCFG(drvdata-base, mid, 3);
}

-   if (drvdata-clk)
-   clk_disable(drvdata-clk);
+   clk_disable(drvdata-clk);
clk_disable(drvdata-pclk);

dev_info(pdev-dev, context %s using bank %d\n, c-name, c-num);
-- 
2.1.2


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 0/8] iommu/vt-d: Fix crash dump failure caused by legacy DMA/IO

2014-10-22 Thread Bjorn Helgaas
On Wed, Oct 22, 2014 at 7:21 AM, Joerg Roedel j...@8bytes.org wrote:
 Hi Bjorn,

 On Tue, Oct 21, 2014 at 08:16:46PM -0600, Bjorn Helgaas wrote:
 I was looking at Zhen-Hua's recent patches, trying to figure out if I
 need to do anything with them.  Resetting devices in the old kernel
 seems like a non-starter.  Resetting devices in the new kernel, ...,
 well, maybe.  It seems ugly, and it seems like the sort of problem
 that IOMMUs are designed to solve.

 Actually resetting the devices in the kdump kernel would be one of the
 better solutions for this problem. When we have a generic way to stop
 all in-flight DMA from the PCI endpoints we could safely disable and
 then re-enable the IOMMU.

 On Wed, Jul 2, 2014 at 7:32 AM, Joerg Roedel j...@8bytes.org wrote:
  That is a solution to prevent the in-flight DMA failures. But what
  happens when there is some in-flight DMA to a disk to write some inodes
  or a new superblock. Then this scratch address-space may cause
  filesystem corruption at worst.

 This in-flight DMA is from a device programmed by the old kernel, and
 it would be reading data from the old kernel's buffers.  I think
 you're suggesting that we might want that DMA read to complete so the
 device can update filesystem metadata?

 Well, it is not about updating filesystem metadata. In the kdump kernel
 we have these options:

 1) Disable the IOMMU. Problem here is, that DMA is now
untranslated, so that any in-flight DMA might read or write
from a random location in memory, corrupting the kdump or
even the new kexec kernel memory. So this is a non-starter.

Agreed (at least if the IOMMU was enabled in the crashed kernel).

 2) Re-program the IOMMU to block all DMA. This is safer as it
does not corrupt any data in memory. But some devices react
very poorly on a master abort from the IOMMU, so bad that the
driver in the kdump kernel fails to initialize that device.
In this case taking dump itself might fail (and often does,
according to reports)

Sounds like an option, even though broken devices work poorly.

 3) To prevent master aborts like in option (2), David suggested
to map the whole DMA address space to a scratch page. This
also prevents system memory corruption and the master aborts.
But the problem is, that in-flight DMA will now read all
zeros. This can corrupt disk and network data, at worst it
nulls out the superblocks of your filesystem and you lose all
data. So this is not an option too.

Ah, yes, I see your point now.  This allows corrupted data, e.g., all
zeroes, to be written to disk or network after the kernel crash.  I
agree; this doesn't sound like a good option.

And the proposal below is a 4th option (leave IOMMU enabled, reusing
crashed kernel's mappings until drivers make new mappings).

 What we currently do in the VT-d driver is a mixture of (1) and (2). The
 VT-d driver disables the IOMMU hardware (opening a race window for
 memory data corruption), re-initializes it to reject any ongoing DMA
 (which causes master aborts for in-flight DMA) and re-enables the IOMMU
 hardware.

 But this strategy fails in heavy IO environments quite often and we look
 into ways to solve the problem, or at least improve the current
 situation as good as we can.

 I talked to David about this at LPC and we came up with basically this
 procedure:

 1. If the VT-d driver finds the IOMMU enabled, it reuses its
root-context table. This way the IOMMU must not be disabled
and re-enabled, eliminating the race we have now.
Other data structures like the context-entries are copied
over from the old kernel.  We basically keep all mappings
from the old kernel, allowing any in-flight DMA to succeed.
No memory or disk data corruption.

If the crashed kernel had corrupted memory, couldn't an in-flight DMA
read that corrupted data from memory and write it to disk?

I guess you could argue that this is merely a race, and the in-flight
DMA could just as easily have happened before the kernel crash, so
there's always a window and the only question is whether it closes
when the IOMMU driver starts up or when the device driver starts up.

The issue here is, that we modify data from the old kernel
which is about to be dumped. But there are ways to work
around that.

 2. When a device driver issues the first dma_map command for a
device, we assign a new and empty page-table, thus removing
all mappings from the old kernel for the device.
Rationale is, that at this point the device driver should
have reset the device to a point where all in-flight DMA is
canceled.

 This approach goes into the same direction as Bill Sumners patch-set,
 which Li 

Re: [PATCH 12/17] iommu/omap: Integrate omap-iommu-debug into omap-iommu

2014-10-22 Thread Suman Anna
Hi Joerg,

On 10/22/2014 08:29 AM, Joerg Roedel wrote:
 On Tue, Oct 21, 2014 at 04:28:27PM -0500, Suman Anna wrote:
 I am looking to refresh this series for 3.19, and this is the only patch
 that may need some changes. Please let me know what your preference is,
 and I can rework this patch if needed. Either way,
 the plan is to not have an OMAP IOMMU debugfs module, but only to
 have an option not to build the debugfs portions.
 
 Can I apply this series without this patch and you send me an updated
 version of only this patch? If yes, please let me know, otherwise please
 rebase the series on v3.18-rc1 and resend.

Some of the patches following this patch don't apply without this patch,
so this patch alone cannot be dropped. The preceding patches (Patches 1
through 11) are not affected though. I will rebase and resend the entire
series.

regards
Suman


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v2 0/4] vfio: platform: return device tree info for a platform device node

2014-10-22 Thread Alex Williamson
On Thu, 2014-10-16 at 17:54 +0200, Antonios Motakis wrote:
 This RFC's intention is to show what an interface to access device node
 properties for the VFIO platform driver can look like.
 
 If a device tree node corresponding to a platform device bound by VFIO 
 PLATFORM
 or VFIO AMBA is available, this patch series will allow the user to query the
 properties associated with this device node. This can be useful for userspace
 drivers to automatically query parameters related to the device.
 
 An API to return data from a device's device tree has been proposed before on
 these lists. The API proposed here is slightly different.
 
 Properties to parse from the device tree are not indexed by a numerical id.
 The host system doesn't guarantee any specific ordering for the available
 properties, or that those will remain the same; while this does not happen in
 practice, there is nothing from the host changing the device nodes during
 operation. For this reason in this RFC properties are accessed by property 
 name.
 
 The type of the property accessed must also be known by the user. Properties
 types implemented in this RFC:
 - VFIO_DEVTREE_TYPE_STRINGS (strings sepparated by the null character)
 - VFIO_DEVTREE_TYPE_U32
 - VFIO_DEVTREE_TYPE_U16
 - VFIO_DEVTREE_TYPE_U8
 
 These can all be access by the ioctl VFIO_DEVICE_GET_DEVTREE_INFO. A new ioctl
 was preferred instead of shoehorning the functionality in 
 VFIO_DEVICE_GET_INFO.
 The structure exchanged with userspace looks like this:
 
 /**
  * VFIO_DEVICE_GET_DEVTREE_INFO - _IOR(VFIO_TYPE, VFIO_BASE + 16,
  *struct vfio_devtree_info)
  *
  * Retrieve information from the device's device tree, if available.
  * Caller will initialize data[] with a single string with the requested
  * devicetree property name, and type depending on whether a array of strings
  * or an array of u32 values is expected. On success, data[] will be extended
  * with the requested information, either as an array of u32, or with a list
  * of strings sepparated by the NULL terminating character.
  * Return: 0 on success, -errno on failure.
  */
 struct vfio_devtree_info {
   __u32   argsz;
   __u32   type;
 #define VFIO_DEVTREE_PROP_LIST0
 #define VFIO_DEVTREE_TYPE_STRINGS 1
 #define VFIO_DEVTREE_TYPE_U8  2
 #define VFIO_DEVTREE_TYPE_U16 3
 #define VFIO_DEVTREE_TYPE_U32 4
   __u32   length;
   __u8data[];
 };
 #define VFIO_DEVICE_GET_DEVTREE_INFO  _IO(VFIO_TYPE, VFIO_BASE + 17)

Seems pretty reasonable.  We should add a flags variable to
vfio_devtree_info as well.  General question, in order to get a VFIO
device file descriptor, the user already needs to know the name of the
device.  Presumably they'll get this from sysfs or otherwise just know
it in advance.  Is that redundant information to what we're providing
here?  What other information already exists in sysfs for device tree
devices that we're duplicating here for the sake of creating a VFIO
mechanism?  If it's not already exposed in sysfs, should it be?  Thanks,

Alex

 The length of the property will be reported in length, so the user can 
 reallocate
 the structure if the data does not fit the first time the call is used.
 
 Specifically for QEMU, reading the compatible property of the device tree
 node could be of use to find out what device is being assigned to the guest 
 and
 handle appropriately a wider range of devices in the future, and to generate 
 an
 appropriate device tree for the guest.
 
 TODOs:
  - 64 bit values support
  - We can consider to rebase this on the patch series by Rafael J. Wysocki:
[PATCH v4 00/13] Add ACPI _DSD and unified device properties support
This would make 64 bit support more straightforward as it already includes
the APIs we need for 64 bit OF values.
 
 Changes since v1:
  - Updated for VFIO platform patch series v8:
VFIO AMBA devices now supported in addition to VFIO PLATFORM devices
  - Refactored and cleaned up the code
 
 Antonios Motakis (4):
   vfio: platform: add device tree info API and skeleton
   vfio: platform: devtree: return available property names
   vfio: platform: devtree: access property as a list of strings
   vfio: platform: devtree: return arrays of u32, u16, or u8 data
 
  drivers/vfio/platform/Makefile|   3 +-
  drivers/vfio/platform/devtree.c   | 199 
 ++
  drivers/vfio/platform/vfio_platform_common.c  |  39 +
  drivers/vfio/platform/vfio_platform_private.h |   6 +
  include/uapi/linux/vfio.h |  26 
  5 files changed, 272 insertions(+), 1 deletion(-)
  create mode 100644 drivers/vfio/platform/devtree.c
 



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 08/17] iommu/omap: Simplify omap2_iommu_fault_isr()

2014-10-22 Thread Suman Anna
The function omap2_iommu_fault_isr() does an unnecessary
recomputation of the return value. The logic relies on
setting the same bit fields as the MMU fault error status
bits, so simplify this function and remove the unneeded
macros. These macros were originally exported to notify
MMU faults to users prior to the IOMMU framework adaptation,
but are now redundant.

Signed-off-by: Suman Anna s-a...@ti.com
Acked-by: Laurent Pinchart laurent.pinch...@ideasonboard.com
---
 drivers/iommu/omap-iommu2.c | 20 +---
 1 file changed, 1 insertion(+), 19 deletions(-)

diff --git a/drivers/iommu/omap-iommu2.c b/drivers/iommu/omap-iommu2.c
index 372141b..ce2fff3 100644
--- a/drivers/iommu/omap-iommu2.c
+++ b/drivers/iommu/omap-iommu2.c
@@ -53,13 +53,6 @@
 ((pgsz) == MMU_CAM_PGSZ_64K) ? 0x :\
 ((pgsz) == MMU_CAM_PGSZ_4K)  ? 0xf000 : 0)
 
-/* IOMMU errors */
-#define OMAP_IOMMU_ERR_TLB_MISS(1  0)
-#define OMAP_IOMMU_ERR_TRANS_FAULT (1  1)
-#define OMAP_IOMMU_ERR_EMU_MISS(1  2)
-#define OMAP_IOMMU_ERR_TBLWALK_FAULT   (1  3)
-#define OMAP_IOMMU_ERR_MULTIHIT_FAULT  (1  4)
-
 static void __iommu_set_twl(struct omap_iommu *obj, bool on)
 {
u32 l = iommu_read_reg(obj, MMU_CNTL);
@@ -122,7 +115,6 @@ static void omap2_iommu_set_twl(struct omap_iommu *obj, 
bool on)
 static u32 omap2_iommu_fault_isr(struct omap_iommu *obj, u32 *ra)
 {
u32 stat, da;
-   u32 errs = 0;
 
stat = iommu_read_reg(obj, MMU_IRQSTATUS);
stat = MMU_IRQ_MASK;
@@ -134,19 +126,9 @@ static u32 omap2_iommu_fault_isr(struct omap_iommu *obj, 
u32 *ra)
da = iommu_read_reg(obj, MMU_FAULT_AD);
*ra = da;
 
-   if (stat  MMU_IRQ_TLBMISS)
-   errs |= OMAP_IOMMU_ERR_TLB_MISS;
-   if (stat  MMU_IRQ_TRANSLATIONFAULT)
-   errs |= OMAP_IOMMU_ERR_TRANS_FAULT;
-   if (stat  MMU_IRQ_EMUMISS)
-   errs |= OMAP_IOMMU_ERR_EMU_MISS;
-   if (stat  MMU_IRQ_TABLEWALKFAULT)
-   errs |= OMAP_IOMMU_ERR_TBLWALK_FAULT;
-   if (stat  MMU_IRQ_MULTIHITFAULT)
-   errs |= OMAP_IOMMU_ERR_MULTIHIT_FAULT;
iommu_write_reg(obj, stat, MMU_IRQSTATUS);
 
-   return errs;
+   return stat;
 }
 
 static void omap2_tlb_read_cr(struct omap_iommu *obj, struct cr_regs *cr)
-- 
2.1.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 10/17] iommu/omap: Fix the permissions on nr_tlb_entries

2014-10-22 Thread Suman Anna
The permissions on the debugfs entry nr_tlb_entries should
have been octal, not decimal, so fix it.

Signed-off-by: Suman Anna s-a...@ti.com
Acked-by: Laurent Pinchart laurent.pinch...@ideasonboard.com
---
 drivers/iommu/omap-iommu-debug.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/omap-iommu-debug.c b/drivers/iommu/omap-iommu-debug.c
index 0fb92aa..a520438 100644
--- a/drivers/iommu/omap-iommu-debug.c
+++ b/drivers/iommu/omap-iommu-debug.c
@@ -256,7 +256,7 @@ static int iommu_debug_register(struct device *dev, void 
*data)
goto nomem;
parent = d;
 
-   d = debugfs_create_u8(nr_tlb_entries, 400, parent,
+   d = debugfs_create_u8(nr_tlb_entries, 0400, parent,
  (u8 *)obj-nr_tlb_entries);
if (!d)
goto nomem;
-- 
2.1.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 05/17] iommu/omap: Remove ver debugfs entry

2014-10-22 Thread Suman Anna
The debugfs entry 'ver' to read the OMAP IOMMU version is
not much useful for developers, so it has been removed. The
same can be deduced from the register dump, provided by the
debugfs entry 'regs', REVISION register. This also allows us
to remove the omap_iommu_arch_revision() which is currently
returning a fixed value.

Signed-off-by: Suman Anna s-a...@ti.com
Acked-by: Laurent Pinchart laurent.pinch...@ideasonboard.com
---
 drivers/iommu/omap-iommu-debug.c | 13 -
 1 file changed, 13 deletions(-)

diff --git a/drivers/iommu/omap-iommu-debug.c b/drivers/iommu/omap-iommu-debug.c
index 531658d..0fb92aa 100644
--- a/drivers/iommu/omap-iommu-debug.c
+++ b/drivers/iommu/omap-iommu-debug.c
@@ -30,17 +30,6 @@ static DEFINE_MUTEX(iommu_debug_lock);
 
 static struct dentry *iommu_debug_root;
 
-static ssize_t debug_read_ver(struct file *file, char __user *userbuf,
- size_t count, loff_t *ppos)
-{
-   u32 ver = omap_iommu_arch_version();
-   char buf[MAXCOLUMN], *p = buf;
-
-   p += sprintf(p, H/W version: %d.%d\n, (ver  4)  0xf , ver  0xf);
-
-   return simple_read_from_buffer(userbuf, count, ppos, buf, p - buf);
-}
-
 static ssize_t debug_read_regs(struct file *file, char __user *userbuf,
   size_t count, loff_t *ppos)
 {
@@ -228,7 +217,6 @@ static ssize_t debug_read_pagetable(struct file *file, char 
__user *userbuf,
.llseek = generic_file_llseek,  \
};
 
-DEBUG_FOPS_RO(ver);
 DEBUG_FOPS_RO(regs);
 DEBUG_FOPS_RO(tlb);
 DEBUG_FOPS(pagetable);
@@ -273,7 +261,6 @@ static int iommu_debug_register(struct device *dev, void 
*data)
if (!d)
goto nomem;
 
-   DEBUG_ADD_FILE_RO(ver);
DEBUG_ADD_FILE_RO(regs);
DEBUG_ADD_FILE_RO(tlb);
DEBUG_ADD_FILE(pagetable);
-- 
2.1.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 11/17] iommu/omap: Make pagetable debugfs entry read-only

2014-10-22 Thread Suman Anna
Remove the writeability on the 'pagetable' debugfs entry,
so that the mapping/unmapping into an OMAP IOMMU is only
limited to actual client devices/drivers at kernel-level.

Signed-off-by: Suman Anna s-a...@ti.com
Acked-by: Laurent Pinchart laurent.pinch...@ideasonboard.com
---
 drivers/iommu/omap-iommu-debug.c | 48 ++--
 1 file changed, 2 insertions(+), 46 deletions(-)

diff --git a/drivers/iommu/omap-iommu-debug.c b/drivers/iommu/omap-iommu-debug.c
index a520438..28de657 100644
--- a/drivers/iommu/omap-iommu-debug.c
+++ b/drivers/iommu/omap-iommu-debug.c
@@ -24,8 +24,6 @@
 #include omap-iopgtable.h
 #include omap-iommu.h
 
-#define MAXCOLUMN 100 /* for short messages */
-
 static DEFINE_MUTEX(iommu_debug_lock);
 
 static struct dentry *iommu_debug_root;
@@ -82,39 +80,6 @@ static ssize_t debug_read_tlb(struct file *file, char __user 
*userbuf,
return bytes;
 }
 
-static ssize_t debug_write_pagetable(struct file *file,
-const char __user *userbuf, size_t count, loff_t *ppos)
-{
-   struct iotlb_entry e;
-   struct cr_regs cr;
-   int err;
-   struct device *dev = file-private_data;
-   struct omap_iommu *obj = dev_to_omap_iommu(dev);
-   char buf[MAXCOLUMN], *p = buf;
-
-   count = min(count, sizeof(buf));
-
-   mutex_lock(iommu_debug_lock);
-   if (copy_from_user(p, userbuf, count)) {
-   mutex_unlock(iommu_debug_lock);
-   return -EFAULT;
-   }
-
-   sscanf(p, %x %x, cr.cam, cr.ram);
-   if (!cr.cam || !cr.ram) {
-   mutex_unlock(iommu_debug_lock);
-   return -EINVAL;
-   }
-
-   omap_iotlb_cr_to_e(cr, e);
-   err = omap_iopgtable_store_entry(obj, e);
-   if (err)
-   dev_err(obj-dev, %s: fail to store cr\n, __func__);
-
-   mutex_unlock(iommu_debug_lock);
-   return count;
-}
-
 #define dump_ioptable_entry_one(lv, da, val)   \
({  \
int __err = 0;  \
@@ -202,14 +167,6 @@ static ssize_t debug_read_pagetable(struct file *file, 
char __user *userbuf,
return bytes;
 }
 
-#define DEBUG_FOPS(name)   \
-   static const struct file_operations debug_##name##_fops = { \
-   .open = simple_open,\
-   .read = debug_read_##name,  \
-   .write = debug_write_##name,\
-   .llseek = generic_file_llseek,  \
-   };
-
 #define DEBUG_FOPS_RO(name)\
static const struct file_operations debug_##name##_fops = { \
.open = simple_open,\
@@ -219,7 +176,7 @@ static ssize_t debug_read_pagetable(struct file *file, char 
__user *userbuf,
 
 DEBUG_FOPS_RO(regs);
 DEBUG_FOPS_RO(tlb);
-DEBUG_FOPS(pagetable);
+DEBUG_FOPS_RO(pagetable);
 
 #define __DEBUG_ADD_FILE(attr, mode)   \
{   \
@@ -230,7 +187,6 @@ DEBUG_FOPS(pagetable);
return -ENOMEM; \
}
 
-#define DEBUG_ADD_FILE(name) __DEBUG_ADD_FILE(name, 0600)
 #define DEBUG_ADD_FILE_RO(name) __DEBUG_ADD_FILE(name, 0400)
 
 static int iommu_debug_register(struct device *dev, void *data)
@@ -263,7 +219,7 @@ static int iommu_debug_register(struct device *dev, void 
*data)
 
DEBUG_ADD_FILE_RO(regs);
DEBUG_ADD_FILE_RO(tlb);
-   DEBUG_ADD_FILE(pagetable);
+   DEBUG_ADD_FILE_RO(pagetable);
 
return 0;
 
-- 
2.1.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 03/17] iommu/omap: Remove duplicate declarations

2014-10-22 Thread Suman Anna
The omap_iommu_save_ctx() and omap_iommu_restore_ctx() declarations
are defined in include/linux/omap-iommu.h and do not belong in the
internal drivers/iommu/omap-iommu.h header, so remove them.

Signed-off-by: Suman Anna s-a...@ti.com
Acked-by: Laurent Pinchart laurent.pinch...@ideasonboard.com
---
 drivers/iommu/omap-iommu.h | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/iommu/omap-iommu.h b/drivers/iommu/omap-iommu.h
index 18a0f3a..4fc51c8 100644
--- a/drivers/iommu/omap-iommu.h
+++ b/drivers/iommu/omap-iommu.h
@@ -197,9 +197,6 @@ extern void omap_iotlb_cr_to_e(struct cr_regs *cr, struct 
iotlb_entry *e);
 extern int
 omap_iopgtable_store_entry(struct omap_iommu *obj, struct iotlb_entry *e);
 
-extern void omap_iommu_save_ctx(struct device *dev);
-extern void omap_iommu_restore_ctx(struct device *dev);
-
 extern int omap_foreach_iommu_device(void *data,
int (*fn)(struct device *, void *));
 
-- 
2.1.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 12/17] iommu/omap: Integrate omap-iommu-debug into omap-iommu

2014-10-22 Thread Suman Anna
The debugfs support for OMAP IOMMU is currently implemented
as a module, warranting certain OMAP-specific IOMMU API to
be exported. The OMAP IOMMU, when enabled, can only be built-in
into the kernel, so integrate the OMAP IOMMU debug module
into the OMAP IOMMU driver. This helps in eliminating the
need to export most of the current OMAP IOMMU API.

The following are the main changes:
- The debugfs directory and entry creation logic is reversed,
  the calls are invoked by the OMAP IOMMU driver now.
- The current iffy circular logic of adding IOMMU archdata
  to the IOMMU devices itself to get a pointer to the omap_iommu
  object in the debugfs support code is replaced by directly
  using the omap_iommu structure while creating the debugfs
  entries.
- The debugfs root directory is renamed from the generic name
  iommu to a specific name omap_iommu.
- Unneeded headers have also been cleaned up while at this.
- There will no longer be a omap-iommu-debug.ko module after
  this patch.
- The OMAP_IOMMU_DEBUG Kconfig option is converted to boolean
  only, the OMAP IOMMU debugfs support is built alongside the
  OMAP IOMMU driver only when this option is enabled.

Signed-off-by: Suman Anna s-a...@ti.com
---
 drivers/iommu/Kconfig|  12 ++---
 drivers/iommu/omap-iommu-debug.c | 100 +++
 drivers/iommu/omap-iommu.c   |  11 -
 drivers/iommu/omap-iommu.h   |  15 ++
 4 files changed, 58 insertions(+), 80 deletions(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index dd51122..1d54996 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -144,13 +144,13 @@ config OMAP_IOMMU
select IOMMU_API
 
 config OMAP_IOMMU_DEBUG
-   tristate Export OMAP IOMMU internals in DebugFS
-   depends on OMAP_IOMMU  DEBUG_FS
-   help
- Select this to see extensive information about
- the internal state of OMAP IOMMU in debugfs.
+   bool Export OMAP IOMMU internals in DebugFS
+   depends on OMAP_IOMMU  DEBUG_FS
+   ---help---
+ Select this to see extensive information about
+ the internal state of OMAP IOMMU in debugfs.
 
- Say N unless you know you need this.
+ Say N unless you know you need this.
 
 config TEGRA_IOMMU_GART
bool Tegra GART IOMMU Support
diff --git a/drivers/iommu/omap-iommu-debug.c b/drivers/iommu/omap-iommu-debug.c
index 28de657..4813d3a 100644
--- a/drivers/iommu/omap-iommu-debug.c
+++ b/drivers/iommu/omap-iommu-debug.c
@@ -10,15 +10,11 @@
  * published by the Free Software Foundation.
  */
 
-#include linux/module.h
 #include linux/err.h
-#include linux/clk.h
 #include linux/io.h
 #include linux/slab.h
 #include linux/uaccess.h
-#include linux/platform_device.h
 #include linux/debugfs.h
-#include linux/omap-iommu.h
 #include linux/platform_data/iommu-omap.h
 
 #include omap-iopgtable.h
@@ -31,8 +27,7 @@ static struct dentry *iommu_debug_root;
 static ssize_t debug_read_regs(struct file *file, char __user *userbuf,
   size_t count, loff_t *ppos)
 {
-   struct device *dev = file-private_data;
-   struct omap_iommu *obj = dev_to_omap_iommu(dev);
+   struct omap_iommu *obj = file-private_data;
char *p, *buf;
ssize_t bytes;
 
@@ -55,8 +50,7 @@ static ssize_t debug_read_regs(struct file *file, char __user 
*userbuf,
 static ssize_t debug_read_tlb(struct file *file, char __user *userbuf,
  size_t count, loff_t *ppos)
 {
-   struct device *dev = file-private_data;
-   struct omap_iommu *obj = dev_to_omap_iommu(dev);
+   struct omap_iommu *obj = file-private_data;
char *p, *buf;
ssize_t bytes, rest;
 
@@ -141,8 +135,7 @@ out:
 static ssize_t debug_read_pagetable(struct file *file, char __user *userbuf,
size_t count, loff_t *ppos)
 {
-   struct device *dev = file-private_data;
-   struct omap_iommu *obj = dev_to_omap_iommu(dev);
+   struct omap_iommu *obj = file-private_data;
char *p, *buf;
size_t bytes;
 
@@ -181,93 +174,56 @@ DEBUG_FOPS_RO(pagetable);
 #define __DEBUG_ADD_FILE(attr, mode)   \
{   \
struct dentry *dent;\
-   dent = debugfs_create_file(#attr, mode, parent, \
-  dev, debug_##attr##_fops);  \
+   dent = debugfs_create_file(#attr, mode, obj-debug_dir, \
+  obj, debug_##attr##_fops);  \
if (!dent)  \
-   return -ENOMEM; \
+   goto err;   \
}
 
 #define DEBUG_ADD_FILE_RO(name) __DEBUG_ADD_FILE(name, 0400)
 
-static int 

[PATCH v2 17/17] iommu/omap: Switch pagetable debugfs entry to use seq_file

2014-10-22 Thread Suman Anna
The debugfs entry 'pagetable' that shows the page table entry
(PTE) data currently outputs only data that can be fit into a
page. Switch the entry to use the seq_file interface so that
it can show all the valid page table entries.

The patch also corrected the output for L2 entries, and prints
the proper L2 PTE instead of the previous L1 page descriptor
pointer.

Signed-off-by: Suman Anna s-a...@ti.com
Acked-by: Laurent Pinchart laurent.pinch...@ideasonboard.com
---
 drivers/iommu/omap-iommu-debug.c | 81 ++--
 1 file changed, 28 insertions(+), 53 deletions(-)

diff --git a/drivers/iommu/omap-iommu-debug.c b/drivers/iommu/omap-iommu-debug.c
index 41b09a1..f3d20a2 100644
--- a/drivers/iommu/omap-iommu-debug.c
+++ b/drivers/iommu/omap-iommu-debug.c
@@ -85,95 +85,70 @@ static ssize_t debug_read_tlb(struct file *file, char 
__user *userbuf,
return bytes;
 }
 
-#define dump_ioptable_entry_one(lv, da, val)   \
-   ({  \
-   int __err = 0;  \
-   ssize_t bytes;  \
-   const int maxcol = 22;  \
-   const char *str = %d: %08x %08x\n;\
-   bytes = snprintf(p, maxcol, str, lv, da, val);  \
-   p += bytes; \
-   len -= bytes;   \
-   if (len  maxcol)   \
-   __err = -ENOMEM;\
-   __err;  \
-   })
-
-static ssize_t dump_ioptable(struct omap_iommu *obj, char *buf, ssize_t len)
+static void dump_ioptable(struct seq_file *s)
 {
-   int i;
-   u32 *iopgd;
-   char *p = buf;
+   int i, j;
+   u32 da;
+   u32 *iopgd, *iopte;
+   struct omap_iommu *obj = s-private;
 
spin_lock(obj-page_table_lock);
 
iopgd = iopgd_offset(obj, 0);
for (i = 0; i  PTRS_PER_IOPGD; i++, iopgd++) {
-   int j, err;
-   u32 *iopte;
-   u32 da;
-
if (!*iopgd)
continue;
 
if (!(*iopgd  IOPGD_TABLE)) {
da = i  IOPGD_SHIFT;
-
-   err = dump_ioptable_entry_one(1, da, *iopgd);
-   if (err)
-   goto out;
+   seq_printf(s, 1: 0x%08x 0x%08x\n, da, *iopgd);
continue;
}
 
iopte = iopte_offset(iopgd, 0);
-
for (j = 0; j  PTRS_PER_IOPTE; j++, iopte++) {
if (!*iopte)
continue;
 
da = (i  IOPGD_SHIFT) + (j  IOPTE_SHIFT);
-   err = dump_ioptable_entry_one(2, da, *iopgd);
-   if (err)
-   goto out;
+   seq_printf(s, 2: 0x%08x 0x%08x\n, da, *iopte);
}
}
-out:
-   spin_unlock(obj-page_table_lock);
 
-   return p - buf;
+   spin_unlock(obj-page_table_lock);
 }
 
-static ssize_t debug_read_pagetable(struct file *file, char __user *userbuf,
-   size_t count, loff_t *ppos)
+static int debug_read_pagetable(struct seq_file *s, void *data)
 {
-   struct omap_iommu *obj = file-private_data;
-   char *p, *buf;
-   size_t bytes;
+   struct omap_iommu *obj = s-private;
 
if (is_omap_iommu_detached(obj))
return -EPERM;
 
-   buf = (char *)__get_free_page(GFP_KERNEL);
-   if (!buf)
-   return -ENOMEM;
-   p = buf;
-
-   p += sprintf(p, L: %8s %8s\n, da:, pa:);
-   p += sprintf(p, -\n);
-
mutex_lock(iommu_debug_lock);
 
-   bytes = PAGE_SIZE - (p - buf);
-   p += dump_ioptable(obj, p, bytes);
-
-   bytes = simple_read_from_buffer(userbuf, count, ppos, buf, p - buf);
+   seq_printf(s, L: %8s %8s\n, da:, pte:);
+   seq_puts(s, --\n);
+   dump_ioptable(s);
 
mutex_unlock(iommu_debug_lock);
-   free_page((unsigned long)buf);
 
-   return bytes;
+   return 0;
 }
 
+#define DEBUG_SEQ_FOPS_RO(name)
   \
+   static int debug_open_##name(struct inode *inode, struct file *file)   \
+   {  \
+   return single_open(file, debug_read_##name, inode-i_private); \
+   }  \
+  \
+   static const struct file_operations debug_##name##_fops = {\
+   .open

[PATCH v2 14/17] iommu/omap: Do not export unneeded functions

2014-10-22 Thread Suman Anna
The following functions were exported previously for usage by
the OMAP IOMMU debug module:
omap_iommu_dump_ctx()
omap_dump_tlb_entries()
omap_iopgtable_store_entry()

These functions need not be exported anymore as the OMAP IOMMU
debugfs code is integrated with the OMAP IOMMU driver, and
there won't be external users for these functions. So, remove
the EXPORT_SYMBOL_GPL on these. The omap_iopgtable_store_entry()
is also made internal only, after making the 'pagetable' debugfs
entry read-only.

Signed-off-by: Suman Anna s-a...@ti.com
Acked-by: Laurent Pinchart laurent.pinch...@ideasonboard.com
---
 drivers/iommu/omap-iommu.c | 6 ++
 drivers/iommu/omap-iommu.h | 3 ---
 2 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/omap-iommu.c b/drivers/iommu/omap-iommu.c
index 9171112..3dcaef0 100644
--- a/drivers/iommu/omap-iommu.c
+++ b/drivers/iommu/omap-iommu.c
@@ -511,7 +511,6 @@ ssize_t omap_iommu_dump_ctx(struct omap_iommu *obj, char 
*buf, ssize_t bytes)
 
return bytes;
 }
-EXPORT_SYMBOL_GPL(omap_iommu_dump_ctx);
 
 static int
 __dump_tlb_entries(struct omap_iommu *obj, struct cr_regs *crs, int num)
@@ -579,7 +578,6 @@ size_t omap_dump_tlb_entries(struct omap_iommu *obj, char 
*buf, ssize_t bytes)
 
return p - buf;
 }
-EXPORT_SYMBOL_GPL(omap_dump_tlb_entries);
 
 #endif /* CONFIG_OMAP_IOMMU_DEBUG */
 
@@ -764,7 +762,8 @@ iopgtable_store_entry_core(struct omap_iommu *obj, struct 
iotlb_entry *e)
  * @obj:   target iommu
  * @e: an iommu tlb entry info
  **/
-int omap_iopgtable_store_entry(struct omap_iommu *obj, struct iotlb_entry *e)
+static int
+omap_iopgtable_store_entry(struct omap_iommu *obj, struct iotlb_entry *e)
 {
int err;
 
@@ -774,7 +773,6 @@ int omap_iopgtable_store_entry(struct omap_iommu *obj, 
struct iotlb_entry *e)
prefetch_iotlb_entry(obj, e);
return err;
 }
-EXPORT_SYMBOL_GPL(omap_iopgtable_store_entry);
 
 /**
  * iopgtable_lookup_entry - Lookup an iommu pte entry
diff --git a/drivers/iommu/omap-iommu.h b/drivers/iommu/omap-iommu.h
index b18cecc..d736630 100644
--- a/drivers/iommu/omap-iommu.h
+++ b/drivers/iommu/omap-iommu.h
@@ -190,9 +190,6 @@ static inline struct omap_iommu *dev_to_omap_iommu(struct 
device *dev)
 /*
  * global functions
  */
-extern int
-omap_iopgtable_store_entry(struct omap_iommu *obj, struct iotlb_entry *e);
-
 #ifdef CONFIG_OMAP_IOMMU_DEBUG
 extern ssize_t
 omap_iommu_dump_ctx(struct omap_iommu *obj, char *buf, ssize_t len);
-- 
2.1.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 15/17] iommu/omap: Reset the domain field upon detaching

2014-10-22 Thread Suman Anna
The .domain field in omap_iommu struct is set properly when the
OMAP IOMMU device is attached to, but is never reset properly
on detach. Reset this properly so that the OMAP IOMMU debugfs
logic can depend on this field before allowing the debugfs
operations.

Signed-off-by: Suman Anna s-a...@ti.com
Acked-by: Laurent Pinchart laurent.pinch...@ideasonboard.com
---
 drivers/iommu/omap-iommu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/iommu/omap-iommu.c b/drivers/iommu/omap-iommu.c
index 3dcaef0..2ba3219 100644
--- a/drivers/iommu/omap-iommu.c
+++ b/drivers/iommu/omap-iommu.c
@@ -1206,6 +1206,7 @@ static void _omap_iommu_detach_dev(struct 
omap_iommu_domain *omap_domain,
 
omap_domain-iommu_dev = arch_data-iommu_dev = NULL;
omap_domain-dev = NULL;
+   oiommu-domain = NULL;
 }
 
 static void omap_iommu_detach_dev(struct iommu_domain *domain,
-- 
2.1.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 01/17] iommu/omap: Remove refcount field from omap_iommu object

2014-10-22 Thread Suman Anna
The refcount field in omap_iommu object is primarily used to check
if an IOMMU device has already been enabled, but this is already
implicit in the omap_iommu_attach_dev() which ensures that only
a single device can attach to an IOMMU. This field is redundant,
and so has been cleaned up.

Signed-off-by: Suman Anna s-a...@ti.com
Acked-by: Laurent Pinchart laurent.pinch...@ideasonboard.com
---
 drivers/iommu/omap-iommu.c | 15 +++
 drivers/iommu/omap-iommu.h |  1 -
 2 files changed, 3 insertions(+), 13 deletions(-)

diff --git a/drivers/iommu/omap-iommu.c b/drivers/iommu/omap-iommu.c
index 3627887..ea04e4d 100644
--- a/drivers/iommu/omap-iommu.c
+++ b/drivers/iommu/omap-iommu.c
@@ -819,8 +819,9 @@ static irqreturn_t iommu_fault_handler(int irq, void *data)
u32 *iopgd, *iopte;
struct omap_iommu *obj = data;
struct iommu_domain *domain = obj-domain;
+   struct omap_iommu_domain *omap_domain = domain-priv;
 
-   if (!obj-refcount)
+   if (!omap_domain-iommu_dev)
return IRQ_NONE;
 
errs = iommu_report_fault(obj, da);
@@ -880,13 +881,6 @@ static struct omap_iommu *omap_iommu_attach(const char 
*name, u32 *iopgd)
 
spin_lock(obj-iommu_lock);
 
-   /* an iommu device can only be attached once */
-   if (++obj-refcount  1) {
-   dev_err(dev, %s: already attached!\n, obj-name);
-   err = -EBUSY;
-   goto err_enable;
-   }
-
obj-iopgd = iopgd;
err = iommu_enable(obj);
if (err)
@@ -899,7 +893,6 @@ static struct omap_iommu *omap_iommu_attach(const char 
*name, u32 *iopgd)
return obj;
 
 err_enable:
-   obj-refcount--;
spin_unlock(obj-iommu_lock);
return ERR_PTR(err);
 }
@@ -915,9 +908,7 @@ static void omap_iommu_detach(struct omap_iommu *obj)
 
spin_lock(obj-iommu_lock);
 
-   if (--obj-refcount == 0)
-   iommu_disable(obj);
-
+   iommu_disable(obj);
obj-iopgd = NULL;
 
spin_unlock(obj-iommu_lock);
diff --git a/drivers/iommu/omap-iommu.h b/drivers/iommu/omap-iommu.h
index 4f1b68c..5c14000 100644
--- a/drivers/iommu/omap-iommu.h
+++ b/drivers/iommu/omap-iommu.h
@@ -33,7 +33,6 @@ struct omap_iommu {
void*isr_priv;
struct iommu_domain *domain;
 
-   unsigned intrefcount;
spinlock_t  iommu_lock; /* global for this whole object */
 
/*
-- 
2.1.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 04/17] iommu/omap: Remove conditional definition of dev_to_omap_iommu()

2014-10-22 Thread Suman Anna
The dev_to_omap_iommu() is local to the OMAP IOMMU modules, and
need not be defined conditionally. The CONFIG_IOMMU_API dependency
check was added in the past to fix a compilation issue back when
the header resided in the arch/arm layers, and is no longer
needed.

While at this, fix the header against double inclusion as well.

Signed-off-by: Suman Anna s-a...@ti.com
Acked-by: Laurent Pinchart laurent.pinch...@ideasonboard.com
---
 drivers/iommu/omap-iommu.h | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/omap-iommu.h b/drivers/iommu/omap-iommu.h
index 4fc51c8..d7c5132 100644
--- a/drivers/iommu/omap-iommu.h
+++ b/drivers/iommu/omap-iommu.h
@@ -10,6 +10,9 @@
  * published by the Free Software Foundation.
  */
 
+#ifndef _OMAP_IOMMU_H
+#define _OMAP_IOMMU_H
+
 #if defined(CONFIG_ARCH_OMAP1)
 #error iommu for this processor not implemented yet
 #endif
@@ -92,7 +95,6 @@ struct iommu_functions {
ssize_t (*dump_ctx)(struct omap_iommu *obj, char *buf, ssize_t len);
 };
 
-#ifdef CONFIG_IOMMU_API
 /**
  * dev_to_omap_iommu() - retrieves an omap iommu object from a user device
  * @dev: iommu client device
@@ -103,7 +105,6 @@ static inline struct omap_iommu *dev_to_omap_iommu(struct 
device *dev)
 
return arch_data-iommu_dev;
 }
-#endif
 
 /*
  * MMU Register offsets
@@ -220,3 +221,5 @@ static inline void iommu_write_reg(struct omap_iommu *obj, 
u32 val, size_t offs)
 {
__raw_writel(val, obj-regbase + offs);
 }
+
+#endif /* _OMAP_IOMMU_H */
-- 
2.1.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 07/17] iommu/omap: Remove bogus version check in context save/restore

2014-10-22 Thread Suman Anna
The omap2_iommu_save_ctx() and omap2_iommu_restore_ctx()
performs a sanity version check against a fixed value
that is correct only for OMAP2/OMAP3 IOMMUs. This fixed check
does not scale for all OMAP2+ IOMMUs and is not absolutely
required, so it has been removed.

Signed-off-by: Suman Anna s-a...@ti.com
Acked-by: Laurent Pinchart laurent.pinch...@ideasonboard.com
---
 drivers/iommu/omap-iommu2.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/drivers/iommu/omap-iommu2.c b/drivers/iommu/omap-iommu2.c
index 2f6a9f7..372141b 100644
--- a/drivers/iommu/omap-iommu2.c
+++ b/drivers/iommu/omap-iommu2.c
@@ -26,8 +26,6 @@
 /*
  * omap2 architecture specific register bit definitions
  */
-#define IOMMU_ARCH_VERSION 0x0011
-
 /* IRQSTATUS  IRQENABLE */
 #define MMU_IRQ_MULTIHITFAULT  (1  4)
 #define MMU_IRQ_TABLEWALKFAULT (1  3)
@@ -268,8 +266,6 @@ static void omap2_iommu_save_ctx(struct omap_iommu *obj)
p[i] = iommu_read_reg(obj, i * sizeof(u32));
dev_dbg(obj-dev, %s\t[%02d] %08x\n, __func__, i, p[i]);
}
-
-   BUG_ON(p[0] != IOMMU_ARCH_VERSION);
 }
 
 static void omap2_iommu_restore_ctx(struct omap_iommu *obj)
@@ -281,8 +277,6 @@ static void omap2_iommu_restore_ctx(struct omap_iommu *obj)
iommu_write_reg(obj, p[i], i * sizeof(u32));
dev_dbg(obj-dev, %s\t[%02d] %08x\n, __func__, i, p[i]);
}
-
-   BUG_ON(p[0] != IOMMU_ARCH_VERSION);
 }
 
 static void omap2_cr_to_e(struct cr_regs *cr, struct iotlb_entry *e)
-- 
2.1.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 06/17] iommu/omap: Remove omap_iommu_arch_version() and version field

2014-10-22 Thread Suman Anna
The function omap_iommu_arch_version() is not used anymore,
and is not required either, so remove it. The .version field
in struct iommu_functions that this function uses is also
removed, as it is not really an ops to retrieve a version and
there won't be any usage for this field either.

Signed-off-by: Suman Anna s-a...@ti.com
Acked-by: Laurent Pinchart laurent.pinch...@ideasonboard.com
---
 drivers/iommu/omap-iommu.c  | 9 -
 drivers/iommu/omap-iommu.h  | 4 
 drivers/iommu/omap-iommu2.c | 2 --
 3 files changed, 15 deletions(-)

diff --git a/drivers/iommu/omap-iommu.c b/drivers/iommu/omap-iommu.c
index ea04e4d..f9efa6b 100644
--- a/drivers/iommu/omap-iommu.c
+++ b/drivers/iommu/omap-iommu.c
@@ -138,15 +138,6 @@ void omap_iommu_restore_ctx(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(omap_iommu_restore_ctx);
 
-/**
- * omap_iommu_arch_version - Return running iommu arch version
- **/
-u32 omap_iommu_arch_version(void)
-{
-   return arch_iommu-version;
-}
-EXPORT_SYMBOL_GPL(omap_iommu_arch_version);
-
 static int iommu_enable(struct omap_iommu *obj)
 {
int err;
diff --git a/drivers/iommu/omap-iommu.h b/drivers/iommu/omap-iommu.h
index d7c5132..45fe67d 100644
--- a/drivers/iommu/omap-iommu.h
+++ b/drivers/iommu/omap-iommu.h
@@ -70,8 +70,6 @@ struct cr_regs {
 
 /* architecture specific functions */
 struct iommu_functions {
-   unsigned long   version;
-
int (*enable)(struct omap_iommu *obj);
void (*disable)(struct omap_iommu *obj);
void (*set_twl)(struct omap_iommu *obj, bool on);
@@ -191,8 +189,6 @@ static inline struct omap_iommu *dev_to_omap_iommu(struct 
device *dev)
 /*
  * global functions
  */
-extern u32 omap_iommu_arch_version(void);
-
 extern void omap_iotlb_cr_to_e(struct cr_regs *cr, struct iotlb_entry *e);
 
 extern int
diff --git a/drivers/iommu/omap-iommu2.c b/drivers/iommu/omap-iommu2.c
index 5e1ea3b..2f6a9f7 100644
--- a/drivers/iommu/omap-iommu2.c
+++ b/drivers/iommu/omap-iommu2.c
@@ -297,8 +297,6 @@ static void omap2_cr_to_e(struct cr_regs *cr, struct 
iotlb_entry *e)
 }
 
 static const struct iommu_functions omap2_iommu_ops = {
-   .version= IOMMU_ARCH_VERSION,
-
.enable = omap2_iommu_enable,
.disable= omap2_iommu_disable,
.set_twl= omap2_iommu_set_twl,
-- 
2.1.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 02/17] iommu/omap: Remove unused isr_priv field from omap_iommu

2014-10-22 Thread Suman Anna
The isr_priv field is a left-over from before the IOMMU API
adaptation, this was used to store the callback data. This is
no longer relevant, so remove it.

Signed-off-by: Suman Anna s-a...@ti.com
Acked-by: Laurent Pinchart laurent.pinch...@ideasonboard.com
---
 drivers/iommu/omap-iommu.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/iommu/omap-iommu.h b/drivers/iommu/omap-iommu.h
index 5c14000..18a0f3a 100644
--- a/drivers/iommu/omap-iommu.h
+++ b/drivers/iommu/omap-iommu.h
@@ -30,7 +30,6 @@ struct omap_iommu {
const char  *name;
void __iomem*regbase;
struct device   *dev;
-   void*isr_priv;
struct iommu_domain *domain;
 
spinlock_t  iommu_lock; /* global for this whole object */
-- 
2.1.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 13/17] iommu/omap: Remove couple of unused exported functions

2014-10-22 Thread Suman Anna
The exported functions omap_foreach_iommu_device() and
omap_iotlb_cr_to_e() have been deleted, as they are no
longer needed.

The function omap_foreach_iommu_device() is not required
after the consolidation of the OMAP IOMMU debug module,
and the function omap_iotlb_cr_to_e() is not required
after making the debugfs entry 'pagetable' read-only.

Signed-off-by: Suman Anna s-a...@ti.com
Acked-by: Laurent Pinchart laurent.pinch...@ideasonboard.com
---
 drivers/iommu/omap-iommu.c | 21 -
 drivers/iommu/omap-iommu.h |  5 -
 2 files changed, 26 deletions(-)

diff --git a/drivers/iommu/omap-iommu.c b/drivers/iommu/omap-iommu.c
index b92b6fc..9171112 100644
--- a/drivers/iommu/omap-iommu.c
+++ b/drivers/iommu/omap-iommu.c
@@ -203,20 +203,6 @@ static void iommu_disable(struct omap_iommu *obj)
 /*
  * TLB operations
  */
-void omap_iotlb_cr_to_e(struct cr_regs *cr, struct iotlb_entry *e)
-{
-   BUG_ON(!cr || !e);
-
-   e-da   = cr-cam  MMU_CAM_VATAG_MASK;
-   e-pa   = cr-ram  MMU_RAM_PADDR_MASK;
-   e-valid= cr-cam  MMU_CAM_V;
-   e-pgsz = cr-cam  MMU_CAM_PGSZ_MASK;
-   e-endian   = cr-ram  MMU_RAM_ENDIAN_MASK;
-   e-elsz = cr-ram  MMU_RAM_ELSZ_MASK;
-   e-mixed= cr-ram  MMU_RAM_MIXED;
-}
-EXPORT_SYMBOL_GPL(omap_iotlb_cr_to_e);
-
 static inline int iotlb_cr_valid(struct cr_regs *cr)
 {
if (!cr)
@@ -595,13 +581,6 @@ size_t omap_dump_tlb_entries(struct omap_iommu *obj, char 
*buf, ssize_t bytes)
 }
 EXPORT_SYMBOL_GPL(omap_dump_tlb_entries);
 
-int omap_foreach_iommu_device(void *data, int (*fn)(struct device *, void *))
-{
-   return driver_for_each_device(omap_iommu_driver.driver,
- NULL, data, fn);
-}
-EXPORT_SYMBOL_GPL(omap_foreach_iommu_device);
-
 #endif /* CONFIG_OMAP_IOMMU_DEBUG */
 
 /*
diff --git a/drivers/iommu/omap-iommu.h b/drivers/iommu/omap-iommu.h
index 4783779..b18cecc 100644
--- a/drivers/iommu/omap-iommu.h
+++ b/drivers/iommu/omap-iommu.h
@@ -190,14 +190,9 @@ static inline struct omap_iommu *dev_to_omap_iommu(struct 
device *dev)
 /*
  * global functions
  */
-extern void omap_iotlb_cr_to_e(struct cr_regs *cr, struct iotlb_entry *e);
-
 extern int
 omap_iopgtable_store_entry(struct omap_iommu *obj, struct iotlb_entry *e);
 
-extern int omap_foreach_iommu_device(void *data,
-   int (*fn)(struct device *, void *));
-
 #ifdef CONFIG_OMAP_IOMMU_DEBUG
 extern ssize_t
 omap_iommu_dump_ctx(struct omap_iommu *obj, char *buf, ssize_t len);
-- 
2.1.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 16/17] iommu/omap: Fix bus error on debugfs access of unattached IOMMU

2014-10-22 Thread Suman Anna
Any debugfs access on an OMAP IOMMU that is not enabled (done during
attach) results in a bus error due to access of registers without
the clock or the reset enabled for the respective IOMMU. So, add a
check to make sure the IOMMU is enabled/attached by a client device.
This gracefully prints a Operation not permitted trace when the
corresponding IOMMU is not enabled.

Signed-off-by: Suman Anna s-a...@ti.com
Acked-by: Laurent Pinchart laurent.pinch...@ideasonboard.com
---
 drivers/iommu/omap-iommu-debug.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/drivers/iommu/omap-iommu-debug.c b/drivers/iommu/omap-iommu-debug.c
index 4813d3a..41b09a1 100644
--- a/drivers/iommu/omap-iommu-debug.c
+++ b/drivers/iommu/omap-iommu-debug.c
@@ -24,6 +24,11 @@ static DEFINE_MUTEX(iommu_debug_lock);
 
 static struct dentry *iommu_debug_root;
 
+static inline bool is_omap_iommu_detached(struct omap_iommu *obj)
+{
+   return !obj-domain;
+}
+
 static ssize_t debug_read_regs(struct file *file, char __user *userbuf,
   size_t count, loff_t *ppos)
 {
@@ -31,6 +36,9 @@ static ssize_t debug_read_regs(struct file *file, char __user 
*userbuf,
char *p, *buf;
ssize_t bytes;
 
+   if (is_omap_iommu_detached(obj))
+   return -EPERM;
+
buf = kmalloc(count, GFP_KERNEL);
if (!buf)
return -ENOMEM;
@@ -54,6 +62,9 @@ static ssize_t debug_read_tlb(struct file *file, char __user 
*userbuf,
char *p, *buf;
ssize_t bytes, rest;
 
+   if (is_omap_iommu_detached(obj))
+   return -EPERM;
+
buf = kmalloc(count, GFP_KERNEL);
if (!buf)
return -ENOMEM;
@@ -139,6 +150,9 @@ static ssize_t debug_read_pagetable(struct file *file, char 
__user *userbuf,
char *p, *buf;
size_t bytes;
 
+   if (is_omap_iommu_detached(obj))
+   return -EPERM;
+
buf = (char *)__get_free_page(GFP_KERNEL);
if (!buf)
return -ENOMEM;
-- 
2.1.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 09/17] iommu/omap: Consolidate OMAP IOMMU modules

2014-10-22 Thread Suman Anna
The OMAP IOMMU driver was originally designed as modules, and split
into a core module and a thin arch-specific module through the OMAP
arch-specific struct iommu_functions, to scale for both OMAP1 and
OMAP2+ IOMMU variants. The driver can only be built for OMAP2+
platforms currently, and also can only be built-in after the
adaptation to generic IOMMU API. The OMAP1 variant was never added
and will most probably be never added (the code for the only potential
user, its parent, DSP processor has already been cleaned up). So,
consolidate the OMAP2 specific omap-iommu2 module into the core OMAP
IOMMU driver - this eliminates the arch-specific ops structure and
simplifies the driver into a single module that only implements the
generic IOMMU API's iommu_ops.

The following are the main changes:
- omap-iommu2 module is completely eliminated, with the common
  definitions moved to the internal omap-iommu.h, and the ops
  implementations moved into omap-iommu.c
- OMAP arch-specific struct iommu_functions is also eliminated,
  with the ops implementations directly absorbed into the calling
  functions
- iotlb_alloc_cr() is no longer inlined and defined only when
  PREFETCH_IOTLB is defined
- iotlb_dump_cr() is similarly defined only when CONFIG_OMAP_IOMMU_DEBUG
  is defined
- Elimination of the OMAP IOMMU exported functions to register the
  arch ops, omap_install_iommu_arch()  omap_uninstall_iommu_arch()
- Any stale comments about OMAP1 are also cleaned up

Signed-off-by: Suman Anna s-a...@ti.com
Acked-by: Laurent Pinchart laurent.pinch...@ideasonboard.com
---
 drivers/iommu/Makefile  |   1 -
 drivers/iommu/omap-iommu.c  | 263 ++---
 drivers/iommu/omap-iommu.h  |  61 +
 drivers/iommu/omap-iommu2.c | 311 
 4 files changed, 217 insertions(+), 419 deletions(-)
 delete mode 100644 drivers/iommu/omap-iommu2.c

diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 16edef7..18fa446 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -11,7 +11,6 @@ obj-$(CONFIG_INTEL_IOMMU) += iova.o intel-iommu.o
 obj-$(CONFIG_IPMMU_VMSA) += ipmmu-vmsa.o
 obj-$(CONFIG_IRQ_REMAP) += intel_irq_remapping.o irq_remapping.o
 obj-$(CONFIG_OMAP_IOMMU) += omap-iommu.o
-obj-$(CONFIG_OMAP_IOMMU) += omap-iommu2.o
 obj-$(CONFIG_OMAP_IOMMU_DEBUG) += omap-iommu-debug.o
 obj-$(CONFIG_TEGRA_IOMMU_GART) += tegra-gart.o
 obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
diff --git a/drivers/iommu/omap-iommu.c b/drivers/iommu/omap-iommu.c
index f9efa6b..91262fa 100644
--- a/drivers/iommu/omap-iommu.c
+++ b/drivers/iommu/omap-iommu.c
@@ -76,53 +76,23 @@ struct iotlb_lock {
short vict;
 };
 
-/* accommodate the difference between omap1 and omap2/3 */
-static const struct iommu_functions *arch_iommu;
-
 static struct platform_driver omap_iommu_driver;
 static struct kmem_cache *iopte_cachep;
 
 /**
- * omap_install_iommu_arch - Install archtecure specific iommu functions
- * @ops:   a pointer to architecture specific iommu functions
- *
- * There are several kind of iommu algorithm(tlb, pagetable) among
- * omap series. This interface installs such an iommu algorighm.
- **/
-int omap_install_iommu_arch(const struct iommu_functions *ops)
-{
-   if (arch_iommu)
-   return -EBUSY;
-
-   arch_iommu = ops;
-   return 0;
-}
-EXPORT_SYMBOL_GPL(omap_install_iommu_arch);
-
-/**
- * omap_uninstall_iommu_arch - Uninstall archtecure specific iommu functions
- * @ops:   a pointer to architecture specific iommu functions
- *
- * This interface uninstalls the iommu algorighm installed previously.
- **/
-void omap_uninstall_iommu_arch(const struct iommu_functions *ops)
-{
-   if (arch_iommu != ops)
-   pr_err(%s: not your arch\n, __func__);
-
-   arch_iommu = NULL;
-}
-EXPORT_SYMBOL_GPL(omap_uninstall_iommu_arch);
-
-/**
  * omap_iommu_save_ctx - Save registers for pm off-mode support
  * @dev:   client device
  **/
 void omap_iommu_save_ctx(struct device *dev)
 {
struct omap_iommu *obj = dev_to_omap_iommu(dev);
+   u32 *p = obj-ctx;
+   int i;
 
-   arch_iommu-save_ctx(obj);
+   for (i = 0; i  (MMU_REG_SIZE / sizeof(u32)); i++) {
+   p[i] = iommu_read_reg(obj, i * sizeof(u32));
+   dev_dbg(obj-dev, %s\t[%02d] %08x\n, __func__, i, p[i]);
+   }
 }
 EXPORT_SYMBOL_GPL(omap_iommu_save_ctx);
 
@@ -133,20 +103,75 @@ EXPORT_SYMBOL_GPL(omap_iommu_save_ctx);
 void omap_iommu_restore_ctx(struct device *dev)
 {
struct omap_iommu *obj = dev_to_omap_iommu(dev);
+   u32 *p = obj-ctx;
+   int i;
 
-   arch_iommu-restore_ctx(obj);
+   for (i = 0; i  (MMU_REG_SIZE / sizeof(u32)); i++) {
+   iommu_write_reg(obj, p[i], i * sizeof(u32));
+   dev_dbg(obj-dev, %s\t[%02d] %08x\n, __func__, i, p[i]);
+   }
 }
 EXPORT_SYMBOL_GPL(omap_iommu_restore_ctx);
 
+static void __iommu_set_twl(struct omap_iommu *obj, bool on)
+{
+ 

Re: [PATCH v2 12/17] iommu/omap: Integrate omap-iommu-debug into omap-iommu

2014-10-22 Thread Laurent Pinchart
Hi Suman,

Thank you for the patch.

On Wednesday 22 October 2014 17:22:30 Suman Anna wrote:
 The debugfs support for OMAP IOMMU is currently implemented
 as a module, warranting certain OMAP-specific IOMMU API to
 be exported. The OMAP IOMMU, when enabled, can only be built-in
 into the kernel, so integrate the OMAP IOMMU debug module
 into the OMAP IOMMU driver. This helps in eliminating the
 need to export most of the current OMAP IOMMU API.
 
 The following are the main changes:
 - The debugfs directory and entry creation logic is reversed,
   the calls are invoked by the OMAP IOMMU driver now.
 - The current iffy circular logic of adding IOMMU archdata
   to the IOMMU devices itself to get a pointer to the omap_iommu
   object in the debugfs support code is replaced by directly
   using the omap_iommu structure while creating the debugfs
   entries.
 - The debugfs root directory is renamed from the generic name
   iommu to a specific name omap_iommu.
 - Unneeded headers have also been cleaned up while at this.
 - There will no longer be a omap-iommu-debug.ko module after
   this patch.
 - The OMAP_IOMMU_DEBUG Kconfig option is converted to boolean
   only, the OMAP IOMMU debugfs support is built alongside the
   OMAP IOMMU driver only when this option is enabled.
 
 Signed-off-by: Suman Anna s-a...@ti.com

Acked-by: Laurent Pinchart laurent.pinch...@ideasonboard.com

 ---
  drivers/iommu/Kconfig|  12 ++---
  drivers/iommu/omap-iommu-debug.c | 100  ++-
  drivers/iommu/omap-iommu.c   |  11 -
  drivers/iommu/omap-iommu.h   |  15 ++
  4 files changed, 58 insertions(+), 80 deletions(-)
 
 diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
 index dd51122..1d54996 100644
 --- a/drivers/iommu/Kconfig
 +++ b/drivers/iommu/Kconfig
 @@ -144,13 +144,13 @@ config OMAP_IOMMU
   select IOMMU_API
 
  config OMAP_IOMMU_DEBUG
 -   tristate Export OMAP IOMMU internals in DebugFS
 -   depends on OMAP_IOMMU  DEBUG_FS
 -   help
 - Select this to see extensive information about
 - the internal state of OMAP IOMMU in debugfs.
 + bool Export OMAP IOMMU internals in DebugFS
 + depends on OMAP_IOMMU  DEBUG_FS
 + ---help---
 +   Select this to see extensive information about
 +   the internal state of OMAP IOMMU in debugfs.
 
 - Say N unless you know you need this.
 +   Say N unless you know you need this.
 
  config TEGRA_IOMMU_GART
   bool Tegra GART IOMMU Support
 diff --git a/drivers/iommu/omap-iommu-debug.c
 b/drivers/iommu/omap-iommu-debug.c index 28de657..4813d3a 100644
 --- a/drivers/iommu/omap-iommu-debug.c
 +++ b/drivers/iommu/omap-iommu-debug.c
 @@ -10,15 +10,11 @@
   * published by the Free Software Foundation.
   */
 
 -#include linux/module.h
  #include linux/err.h
 -#include linux/clk.h
  #include linux/io.h
  #include linux/slab.h
  #include linux/uaccess.h
 -#include linux/platform_device.h
  #include linux/debugfs.h
 -#include linux/omap-iommu.h
  #include linux/platform_data/iommu-omap.h
 
  #include omap-iopgtable.h
 @@ -31,8 +27,7 @@ static struct dentry *iommu_debug_root;
  static ssize_t debug_read_regs(struct file *file, char __user *userbuf,
  size_t count, loff_t *ppos)
  {
 - struct device *dev = file-private_data;
 - struct omap_iommu *obj = dev_to_omap_iommu(dev);
 + struct omap_iommu *obj = file-private_data;
   char *p, *buf;
   ssize_t bytes;
 
 @@ -55,8 +50,7 @@ static ssize_t debug_read_regs(struct file *file, char
 __user *userbuf, static ssize_t debug_read_tlb(struct file *file, char
 __user *userbuf, size_t count, loff_t *ppos)
  {
 - struct device *dev = file-private_data;
 - struct omap_iommu *obj = dev_to_omap_iommu(dev);
 + struct omap_iommu *obj = file-private_data;
   char *p, *buf;
   ssize_t bytes, rest;
 
 @@ -141,8 +135,7 @@ out:
  static ssize_t debug_read_pagetable(struct file *file, char __user
 *userbuf, size_t count, loff_t *ppos)
  {
 - struct device *dev = file-private_data;
 - struct omap_iommu *obj = dev_to_omap_iommu(dev);
 + struct omap_iommu *obj = file-private_data;
   char *p, *buf;
   size_t bytes;
 
 @@ -181,93 +174,56 @@ DEBUG_FOPS_RO(pagetable);
  #define __DEBUG_ADD_FILE(attr, mode) \
   {   \
   struct dentry *dent;\
 - dent = debugfs_create_file(#attr, mode, parent, \
 -dev, debug_##attr##_fops);  \
 + dent = debugfs_create_file(#attr, mode, obj-debug_dir, \
 +obj, debug_##attr##_fops);  \
   if (!dent)  \
 - return -ENOMEM; \
 + goto err;   

Re: [PATCH v3 02/27] x86/xen/MSI: Eliminate arch_msix_mask_irq() and arch_msi_mask_irq()

2014-10-22 Thread Yijing Wang
On 2014/10/23 12:25, Bjorn Helgaas wrote:
 On Wed, Oct 15, 2014 at 11:06:50AM +0800, Yijing Wang wrote:
 Commit 0e4ccb1505a9 (PCI: Add x86_msi.msi_mask_irq() and msix_mask_irq())
 introduced two __weak arch functions arch_msix_mask_irq() and
 arch_msi_mask_irq() to work around a bug when running xen in x86.
 These two functions made msi code more complex. This patch reverts
 the commit and introduces a global flag to control msi mask action to
 avoid the bug. The patch is also preparation for using MSI chip framework
 instead of weak arch MSI functions in all platforms. Keep 
 default_msi_mask_irq()
 and default_msix_mask_irq() in linux/msi.h to make s390 MSI code compile
 happy, they will be removed in the later patch.
 
 This patch is basically a revert of 0e4ccb1505a9 intermingled with the
 addition of the new pci_msi_ignore_mask technique.
 
 Can you please split this into two patches:
 
   - Add the pci_msi_ignore_mask stuff (alongside the existing way)
   - Revert 0e4ccb1505a9
 
 I think that will make it easier to see what's going on.

OK, I will do it, thanks.

Thanks!
Yijing.

 
 Signed-off-by: Yijing Wang wangyij...@huawei.com
 CC: David Vrabel david.vra...@citrix.com
 CC: Konrad Rzeszutek Wilk konrad.w...@oracle.com
 ---
 Hi David and Konrad,
I dropped the Acked-by and tested-by, because this version has a
 lot changes compared to last. So, I guess you may want to check it again.
 ---
  arch/x86/include/asm/x86_init.h |3 ---
  arch/x86/kernel/x86_init.c  |   10 --
  arch/x86/pci/xen.c  |   14 ++
  drivers/pci/msi.c   |   29 -
  include/linux/msi.h |8 ++--
  5 files changed, 20 insertions(+), 44 deletions(-)

 diff --git a/arch/x86/include/asm/x86_init.h 
 b/arch/x86/include/asm/x86_init.h
 index e45e4da..f58a9c7 100644
 --- a/arch/x86/include/asm/x86_init.h
 +++ b/arch/x86/include/asm/x86_init.h
 @@ -172,7 +172,6 @@ struct x86_platform_ops {
  
  struct pci_dev;
  struct msi_msg;
 -struct msi_desc;
  
  struct x86_msi_ops {
  int (*setup_msi_irqs)(struct pci_dev *dev, int nvec, int type);
 @@ -183,8 +182,6 @@ struct x86_msi_ops {
  void (*teardown_msi_irqs)(struct pci_dev *dev);
  void (*restore_msi_irqs)(struct pci_dev *dev);
  int  (*setup_hpet_msi)(unsigned int irq, unsigned int id);
 -u32 (*msi_mask_irq)(struct msi_desc *desc, u32 mask, u32 flag);
 -u32 (*msix_mask_irq)(struct msi_desc *desc, u32 flag);
  };
  
  struct IO_APIC_route_entry;
 diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
 index e48b674..234b072 100644
 --- a/arch/x86/kernel/x86_init.c
 +++ b/arch/x86/kernel/x86_init.c
 @@ -116,8 +116,6 @@ struct x86_msi_ops x86_msi = {
  .teardown_msi_irqs  = default_teardown_msi_irqs,
  .restore_msi_irqs   = default_restore_msi_irqs,
  .setup_hpet_msi = default_setup_hpet_msi,
 -.msi_mask_irq   = default_msi_mask_irq,
 -.msix_mask_irq  = default_msix_mask_irq,
  };
  
  /* MSI arch specific hooks */
 @@ -140,14 +138,6 @@ void arch_restore_msi_irqs(struct pci_dev *dev)
  {
  x86_msi.restore_msi_irqs(dev);
  }
 -u32 arch_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
 -{
 -return x86_msi.msi_mask_irq(desc, mask, flag);
 -}
 -u32 arch_msix_mask_irq(struct msi_desc *desc, u32 flag)
 -{
 -return x86_msi.msix_mask_irq(desc, flag);
 -}
  #endif
  
  struct x86_io_apic_ops x86_io_apic_ops = {
 diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
 index 093f5f4..e5b8b78 100644
 --- a/arch/x86/pci/xen.c
 +++ b/arch/x86/pci/xen.c
 @@ -394,14 +394,6 @@ static void xen_teardown_msi_irq(unsigned int irq)
  {
  xen_destroy_irq(irq);
  }
 -static u32 xen_nop_msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
 -{
 -return 0;
 -}
 -static u32 xen_nop_msix_mask_irq(struct msi_desc *desc, u32 flag)
 -{
 -return 0;
 -}
  #endif
  
  int __init pci_xen_init(void)
 @@ -425,8 +417,7 @@ int __init pci_xen_init(void)
  x86_msi.setup_msi_irqs = xen_setup_msi_irqs;
  x86_msi.teardown_msi_irq = xen_teardown_msi_irq;
  x86_msi.teardown_msi_irqs = xen_teardown_msi_irqs;
 -x86_msi.msi_mask_irq = xen_nop_msi_mask_irq;
 -x86_msi.msix_mask_irq = xen_nop_msix_mask_irq;
 +pci_msi_ignore_mask = 1;
  #endif
  return 0;
  }
 @@ -506,8 +497,7 @@ int __init pci_xen_initial_domain(void)
  x86_msi.setup_msi_irqs = xen_initdom_setup_msi_irqs;
  x86_msi.teardown_msi_irq = xen_teardown_msi_irq;
  x86_msi.restore_msi_irqs = xen_initdom_restore_msi_irqs;
 -x86_msi.msi_mask_irq = xen_nop_msi_mask_irq;
 -x86_msi.msix_mask_irq = xen_nop_msix_mask_irq;
 +pci_msi_ignore_mask = 1;
  #endif
  xen_setup_acpi_sci();
  __acpi_register_gsi = acpi_register_gsi_xen;
 diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
 index ecb92a5..22e413c 100644
 --- a/drivers/pci/msi.c
 +++ b/drivers/pci/msi.c
 @@ -23,6 +23,7 @@
  #include pci.h
  
  static int 

Re: [PATCH v3 05/27] PCI: tegra: Save msi chip in pci_sys_data

2014-10-22 Thread Bjorn Helgaas
On Wed, Oct 15, 2014 at 11:06:53AM +0800, Yijing Wang wrote:
 Save msi chip in pci_sys_data instead of assign
 msi chip to every pci bus in .add_bus().
 
 Signed-off-by: Yijing Wang wangyij...@huawei.com
 ---
  drivers/pci/host/pci-tegra.c |   13 +++--
  1 files changed, 3 insertions(+), 10 deletions(-)
 
 diff --git a/drivers/pci/host/pci-tegra.c b/drivers/pci/host/pci-tegra.c
 index 3d43874..5af0525 100644
 --- a/drivers/pci/host/pci-tegra.c
 +++ b/drivers/pci/host/pci-tegra.c
 @@ -694,15 +694,6 @@ static int tegra_pcie_map_irq(const struct pci_dev 
 *pdev, u8 slot, u8 pin)
   return irq;
  }
  
 -static void tegra_pcie_add_bus(struct pci_bus *bus)
 -{
 - if (IS_ENABLED(CONFIG_PCI_MSI)) {
 - struct tegra_pcie *pcie = sys_to_pcie(bus-sysdata);
 -
 - bus-msi = pcie-msi.chip;
 - }
 -}
 -
  static struct pci_bus *tegra_pcie_scan_bus(int nr, struct pci_sys_data *sys)
  {
   struct tegra_pcie *pcie = sys_to_pcie(sys);
 @@ -1881,11 +1872,13 @@ static int tegra_pcie_enable(struct tegra_pcie *pcie)
  
   memset(hw, 0, sizeof(hw));
  
 +#ifdef CONFIG_PCI_MSI
 + hw.msi_chip = pcie-msi.chip;
 +#endif

Why did you use #ifdef CONFIG_PCI_MSI instead of the
IS_ENABLED(CONFIG_PCI_MSI) used previously?

It's true that CONFIG_PCI_MSI will never be a tristate symbol, so we don't
really *need* the extra smarts of IS_ENABLED(), but I'm fairly sympathetic
to James' argument [1] that we should just use IS_ENABLED() all the time
because it's simpler overall.

If you want to change the #ifdef to IS_ENABLED(), that should be a separate
patch from your msi_chip change, and we can debate the merits of that by
itself.

[1] http://lkml.iu.edu//hypermail/linux/kernel/1204.3/00081.html

   hw.nr_controllers = 1;
   hw.private_data = (void **)pcie;
   hw.setup = tegra_pcie_setup;
   hw.map_irq = tegra_pcie_map_irq;
 - hw.add_bus = tegra_pcie_add_bus;
   hw.scan = tegra_pcie_scan_bus;
   hw.ops = tegra_pcie_ops;
  
 -- 
 1.7.1
 
 --
 To unsubscribe from this list: send the line unsubscribe linux-kernel in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 Please read the FAQ at  http://www.tux.org/lkml/
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 04/27] arm/MSI: Save MSI chip in pci_sys_data

2014-10-22 Thread Bjorn Helgaas
On Wed, Oct 15, 2014 at 11:06:52AM +0800, Yijing Wang wrote:
 Saving msi chip in pci_sys_data can make pci bus and
 devices don't need to know msi chip detail, it also
 make pci enumeration code be decoupled from msi chip.
 In fact, all pci devices under the same pci hostbridge
 share same msi chip. So msi chip should be seen as one
 of resources or attributes to be initialized in pci host
 bridge driver. Currently, pci hostbridge drivers create
 pci_host_bridge in pci_create_root_bus(), and pass arch
 specific pci sysdata to core pci scan functions. So pci
 arch sysdata is good place to save msi chip.
 
 Signed-off-by: Yijing Wang wangyij...@huawei.com
 ---
  arch/arm/include/asm/mach/pci.h |6 ++
  arch/arm/include/asm/pci.h  |9 +
  arch/arm/kernel/bios32.c|3 +++
  drivers/pci/msi.c   |6 ++
  include/linux/pci.h |9 +
  5 files changed, 33 insertions(+), 0 deletions(-)
 
 diff --git a/arch/arm/include/asm/mach/pci.h b/arch/arm/include/asm/mach/pci.h
 index 7fc4278..59b0d87 100644
 --- a/arch/arm/include/asm/mach/pci.h
 +++ b/arch/arm/include/asm/mach/pci.h
 @@ -22,6 +22,9 @@ struct hw_pci {
  #ifdef CONFIG_PCI_DOMAINS
   int domain;
  #endif
 +#ifdef CONFIG_PCI_MSI
 + struct msi_chip *msi_chip;
 +#endif
   struct pci_ops  *ops;
   int nr_controllers;
   void**private_data;
 @@ -47,6 +50,9 @@ struct pci_sys_data {
  #ifdef CONFIG_PCI_DOMAINS
   int domain;
  #endif
 +#ifdef CONFIG_PCI_MSI
 + struct msi_chip *msi_chip;
 +#endif
   struct list_head node;
   int busnr;  /* primary bus number   
 */
   u64 mem_offset; /* bus-cpu memory mapping offset   
 */
 diff --git a/arch/arm/include/asm/pci.h b/arch/arm/include/asm/pci.h
 index 7e95d85..b562c09 100644
 --- a/arch/arm/include/asm/pci.h
 +++ b/arch/arm/include/asm/pci.h
 @@ -31,6 +31,15 @@ static inline int pci_proc_domain(struct pci_bus *bus)
  }
  #endif /* CONFIG_PCI_DOMAINS */
  
 +#ifdef CONFIG_PCI_MSI
 +static inline struct msi_chip *pci_msi_chip(struct pci_bus *bus)
 +{
 + struct pci_sys_data *root = bus-sysdata;
 +
 + return root-msi_chip;
 +}
 +#endif
 +
  /*
   * The PCI address space does equal the physical memory address space.
   * The networking and block device layers use this boolean for bounce
 diff --git a/arch/arm/kernel/bios32.c b/arch/arm/kernel/bios32.c
 index 17a26c1..a19038d 100644
 --- a/arch/arm/kernel/bios32.c
 +++ b/arch/arm/kernel/bios32.c
 @@ -471,6 +471,9 @@ static void pcibios_init_hw(struct device *parent, struct 
 hw_pci *hw,
  #ifdef CONFIG_PCI_DOMAINS
   sys-domain  = hw-domain;
  #endif
 +#ifdef CONFIG_PCI_MSI
 + sys-msi_chip = hw-msi_chip;
 +#endif
   sys-busnr   = busnr;
   sys-swizzle = hw-swizzle;
   sys-map_irq = hw-map_irq;
 diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
 index 22e413c..f11108c 100644
 --- a/drivers/pci/msi.c
 +++ b/drivers/pci/msi.c
 @@ -35,6 +35,9 @@ int __weak arch_setup_msi_irq(struct pci_dev *dev, struct 
 msi_desc *desc)
   struct msi_chip *chip = dev-bus-msi;
   int err;
  
 + if (!chip)
 + chip = pci_msi_chip(dev-bus);
 +
   if (!chip || !chip-setup_irq)
   return -EINVAL;
  
 @@ -50,6 +53,9 @@ void __weak arch_teardown_msi_irq(unsigned int irq)
   struct msi_desc *entry = irq_get_msi_desc(irq);
   struct msi_chip *chip = entry-dev-bus-msi;
  
 + if (!chip)
 + chip = pci_msi_chip(entry-dev-bus);
 +
   if (!chip || !chip-teardown_irq)
   return;
  
 diff --git a/include/linux/pci.h b/include/linux/pci.h
 index 9cd2721..7a48b40 100644
 --- a/include/linux/pci.h
 +++ b/include/linux/pci.h
 @@ -1433,6 +1433,15 @@ static inline int pci_get_new_domain_nr(void) { return 
 -ENOSYS; }
  
  #include asm/pci.h
  
 +/* Just avoid compile error, will be clean up later */
 +#ifdef CONFIG_PCI_MSI
 +
 +#ifndef pci_msi_chip
 +#define pci_msi_chip(bus)NULL
 +#endif
 +#endif

I don't like the mixture of ARM changes and PCI core changes in the same
patch.  Can you split this into a core patch that does something like this:

  struct msi_chip * __weak pcibios_msi_controller(struct pci_bus *bus)
  {
return NULL;
  }

  struct msi_chip *pci_msi_controller(struct pci_bus *bus)
  {
msi_chip *controller = bus-msi;

if (controller)
  return controller;
return pcibios_msi_controller(bus);
  }

followed by an ARM patch that puts the msi_chip pointer in struct hw_pci
and implements pcibios_msi_controller()?

I know you're trying to *remove* weak functions, and this adds one, but
this section of the series is more about getting rid of the ARM
pcibios_add_bus() because all it was used for was setting the bus-msi
pointer.

Eventually we might have a way to stash an MSI controller pointer in the
generic pci_host_bridge struct, and 

Re: [PATCH v3 09/27] arm/PCI: Clean unused pcibios_add_bus() and pcibios_remove_bus()

2014-10-22 Thread Bjorn Helgaas
On Wed, Oct 15, 2014 at 11:06:57AM +0800, Yijing Wang wrote:
 MSI chip will be saved in pci_sys_data, now we can
 clean up pcibios_add_bus() and pcibios_remove_bus()
 in arm, and use pci_find_msi_chip() to get msi chip
 in core MSI code.
 
 Signed-off-by: Yijing Wang wangyij...@huawei.com
 ---
  arch/arm/include/asm/mach/pci.h |4 
  arch/arm/kernel/bios32.c|   16 
  drivers/pci/msi.c   |   11 +++
  3 files changed, 3 insertions(+), 28 deletions(-)
 
 diff --git a/arch/arm/include/asm/mach/pci.h b/arch/arm/include/asm/mach/pci.h
 index 59b0d87..230b2c9 100644
 --- a/arch/arm/include/asm/mach/pci.h
 +++ b/arch/arm/include/asm/mach/pci.h
 @@ -39,8 +39,6 @@ struct hw_pci {
 resource_size_t start,
 resource_size_t size,
 resource_size_t align);
 - void(*add_bus)(struct pci_bus *bus);
 - void(*remove_bus)(struct pci_bus *bus);
  };
  
  /*
 @@ -71,8 +69,6 @@ struct pci_sys_data {
 resource_size_t start,
 resource_size_t size,
 resource_size_t align);
 - void(*add_bus)(struct pci_bus *bus);
 - void(*remove_bus)(struct pci_bus *bus);
   void*private_data;  /* platform controller private data 
 */
  };
  
 diff --git a/arch/arm/kernel/bios32.c b/arch/arm/kernel/bios32.c
 index a19038d..b1b872e 100644
 --- a/arch/arm/kernel/bios32.c
 +++ b/arch/arm/kernel/bios32.c
 @@ -360,20 +360,6 @@ void pcibios_fixup_bus(struct pci_bus *bus)
  }
  EXPORT_SYMBOL(pcibios_fixup_bus);
  
 -void pcibios_add_bus(struct pci_bus *bus)
 -{
 - struct pci_sys_data *sys = bus-sysdata;
 - if (sys-add_bus)
 - sys-add_bus(bus);
 -}
 -
 -void pcibios_remove_bus(struct pci_bus *bus)
 -{
 - struct pci_sys_data *sys = bus-sysdata;
 - if (sys-remove_bus)
 - sys-remove_bus(bus);
 -}
 -
  /*
   * Swizzle the device pin each time we cross a bridge.  If a platform does
   * not provide a swizzle function, we perform the standard PCI swizzling.
 @@ -478,8 +464,6 @@ static void pcibios_init_hw(struct device *parent, struct 
 hw_pci *hw,
   sys-swizzle = hw-swizzle;
   sys-map_irq = hw-map_irq;
   sys-align_resource = hw-align_resource;
 - sys-add_bus = hw-add_bus;
 - sys-remove_bus = hw-remove_bus;
   INIT_LIST_HEAD(sys-resources);
  
   if (hw-private_data)

What do the core changes below have to do with the ARM changes above?
They should be a separate patch unless they can't be separated.

 diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
 index f11108c..56e54ad 100644
 --- a/drivers/pci/msi.c
 +++ b/drivers/pci/msi.c
 @@ -32,12 +32,10 @@ int pci_msi_ignore_mask;
  
  int __weak arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
  {
 - struct msi_chip *chip = dev-bus-msi;
 + struct msi_chip *chip;
   int err;
  
 - if (!chip)
 - chip = pci_msi_chip(dev-bus);
 -
 + chip = pci_msi_chip(dev-bus);
   if (!chip || !chip-setup_irq)
   return -EINVAL;
  
 @@ -51,10 +49,7 @@ int __weak arch_setup_msi_irq(struct pci_dev *dev, struct 
 msi_desc *desc)
  void __weak arch_teardown_msi_irq(unsigned int irq)
  {
   struct msi_desc *entry = irq_get_msi_desc(irq);
 - struct msi_chip *chip = entry-dev-bus-msi;
 -
 - if (!chip)
 - chip = pci_msi_chip(entry-dev-bus);
 + struct msi_chip *chip = pci_msi_chip(entry-dev-bus);
  
   if (!chip || !chip-teardown_irq)
   return;
 -- 
 1.7.1
 
 --
 To unsubscribe from this list: send the line unsubscribe linux-pci in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 10/27] PCI/MSI: Remove useless bus-msi assignment

2014-10-22 Thread Bjorn Helgaas
On Wed, Oct 15, 2014 at 11:06:58AM +0800, Yijing Wang wrote:
 Now msi chip is saved in pci_sys_data in arm,
 we could clean the bus-msi assignment in
 pci core.
 
 Signed-off-by: Yijing Wang wangyij...@huawei.com
 CC: Thierry Reding thierry.red...@gmail.com
 CC: Thomas Petazzoni thomas.petazz...@free-electrons.com
 ---
  drivers/pci/probe.c |1 -
  1 files changed, 0 insertions(+), 1 deletions(-)
 
 diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
 index efa48dc..98bf4c3 100644
 --- a/drivers/pci/probe.c
 +++ b/drivers/pci/probe.c
 @@ -682,7 +682,6 @@ static struct pci_bus *pci_alloc_child_bus(struct pci_bus 
 *parent,
  
   child-parent = parent;
   child-ops = parent-ops;
 - child-msi = parent-msi;

This needs an explanation of why ARM was the only arch to depend on this.

   child-sysdata = parent-sysdata;
   child-bus_flags = parent-bus_flags;
  
 -- 
 1.7.1
 
 --
 To unsubscribe from this list: send the line unsubscribe linux-pci in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 00/27] Use MSI chip framework to configure MSI/MSI-X in all platforms

2014-10-22 Thread Bjorn Helgaas
On Wed, Oct 15, 2014 at 11:06:48AM +0800, Yijing Wang wrote:
 Now there are a lot of weak arch functions in MSI code.
 Thierry Reding Introduced MSI chip framework to configure MSI/MSI-X in arm,
 that's a better solution than overriding lots of existing weak arch 
 functionsin. 
 This series use MSI chip framework to refactor MSI code across all platforms 
 to eliminate weak arch functions. Then all MSI irqs will be managed in a 
 unified framework. Because this series changed a lot of ARCH MSI code, 
 so tests in the related platforms are warmly welcomed!
 
 And you may access it at:
 https://github.com/YijingWang/msi-chip-v3.git master
 
 v2-v3:
 1. For patch x86/xen/MSI: Eliminate..., introduce a new global flag 
 pci_msi_ignore_mask
 to control the msi mask instead of replacing the irqchip-mask with nop 
 function,
 the latter method has problem pointed out by Konrad Rzeszutek Wilk.
 2. Save msi chip in arch pci sysdata instead of associating msi chip to pci 
 bus.
 Because pci devices under same host share the same msi chip, so I think 
 associate
 msi chip to pci host/pci sysdata is better than to bother every pci 
 bus/devices.
 A better solution suggested by Liviu is to rip out pci_host_bridge from 
 pci_create_root_bus(), 
 then we can save some pci host common attributes like domain_nr, msi_chip, 
 resources,
 into the generic pci_host_bridge. Because this changes to pci host bridge is 
 also 
 a large series, so I think we should go step by step, I will try to post it 
 in another
 series later.
 4. Clean up arm pcibios_add_bus() and pcibios_remove_bus() which were used to 
 associate
 msi chip to pci bus.
 
 v1-v2:
 Add a patch to make s390 MSI code build happy between patch x86/xen/MSI: E..
 and s390/MSI: Use MSI... Fix several typo problems found by Lucas.
 
 RFC-v1: 
 Updated [patch 4/21] x86/xen/MSI: Eliminate..., export msi_chip instead
 of #ifdef to fix MSI bug in xen running in x86. 
 Rename arch_get_match_msi_chip() to arch_find_msi_chip().
 Drop use struct device as the msi_chip argument, we will do that
 later in another patchset.
 
 Yijing Wang (27):

This is a lot of stuff, and it's not all related, so putting it all
together in one series makes it hard for me to deal with it.

   MSI: Remove the redundant irq_set_chip_data()

This doesn't seem related to eliminating weak functions.

   x86/xen/MSI: Eliminate arch_msix_mask_irq() and arch_msi_mask_irq()
   s390/MSI: Use __msi_mask_irq() instead of default_msi_mask_irq()

These two seem to go together.

   arm/MSI: Save MSI chip in pci_sys_data
   PCI: tegra: Save msi chip in pci_sys_data
   PCI: designware: Save msi chip in pci_sys_data
   PCI: rcar: Save msi chip in pci_sys_data
   PCI: mvebu: Save msi chip in pci_sys_data
   arm/PCI: Clean unused pcibios_add_bus() and pcibios_remove_bus()
   PCI/MSI: Remove useless bus-msi assignment

These seem to go together (it'd be nice if they were all capitalized the
same).

I don't like the msi_chip name because the chip part doesn't convey
much information, other than telling us that apparently some sort of
semiconductor integrated circuit is involved.  I sort of assumed that
much :)

Something like msi_controller would be more descriptive since we're
talking about an interrupt controller that accepts writes from a device and
turns them into CPU interrupts, e.g., a LAPIC.

   PCI/MSI: Refactor struct msi_chip to make it become more common
   x86/MSI: Use MSI chip framework to configure MSI/MSI-X irq
   x86/xen/MSI: Use MSI chip framework to configure MSI/MSI-X irq
   Irq_remapping/MSI: Use MSI chip framework to configure MSI/MSI-X irq
   x86/MSI: Remove unused MSI weak arch functions
   Mips/MSI: Save msi chip in pci sysdata
   MIPS/Octeon/MSI: Use MSI chip framework to configure MSI/MSI-X irq
   MIPS/Xlp: Remove the dead function destroy_irq() to fix build error
   MIPS/Xlp/MSI: Use MSI chip framework to configure MSI/MSI-X irq
   MIPS/Xlr/MSI: Use MSI chip framework to configure MSI/MSI-X irq
   Powerpc/MSI: Use MSI chip framework to configure MSI/MSI-X irq
   s390/MSI: Use MSI chip framework to configure MSI/MSI-X irq
   arm/iop13xx/MSI: Use MSI chip framework to configure MSI/MSI-X irq
   IA64/MSI: Use MSI chip framework to configure MSI/MSI-X irq
   Sparc/MSI: Use MSI chip framework to configure MSI/MSI-X irq
   tile/MSI: Use MSI chip framework to configure MSI/MSI-X irq
   PCI/MSI: Clean up unused MSI arch functions
 
  arch/arm/include/asm/mach/pci.h |   10 +-
  arch/arm/include/asm/pci.h  |9 ++
  arch/arm/kernel/bios32.c|   19 +---
  arch/arm/mach-iop13xx/include/mach/pci.h|4 +
  arch/arm/mach-iop13xx/iq81340mc.c   |3 +
  arch/arm/mach-iop13xx/iq81340sc.c   |5 +-
  arch/arm/mach-iop13xx/msi.c |   11 ++-
  arch/ia64/include/asm/pci.h |   10 ++
  arch/ia64/kernel/msi_ia64.c |   14 ++-
  arch/ia64/pci/pci.c