Re: [PATCH v2 2/4] iommu/arm-smmu: Invoke DT probe from arm_smmu_of_setup()

2016-02-18 Thread Anup Patel
On Thu, Feb 18, 2016 at 5:30 PM, Sricharan  wrote:
> Hi,
>
>> -Original Message-
>> From: linux-arm-kernel [mailto:linux-arm-kernel-
>> boun...@lists.infradead.org] On Behalf Of Anup Patel
>> Sent: Monday, February 08, 2016 10:48 AM
>> To: Catalin Marinas; Joerg Roedel; Will Deacon; Robin Murphy; Sricharan R;
>> Linux IOMMU; Linux ARM Kernel
>> Cc: Mark Rutland; Device Tree; Scott Branden; Pawel Moll; Ian Campbell;
> Ray
>> Jui; Linux Kernel; Vikram Prakash; Rob Herring; BCM Kernel Feedback; Kumar
>> Gala; Anup Patel
>> Subject: [PATCH v2 2/4] iommu/arm-smmu: Invoke DT probe from
>> arm_smmu_of_setup()
>>
>> The SMMUv1/SMMUv2 driver is initialized very early using the
>> IOMMU_OF_DECLARE() but the actual platform device is probed via normal
>> DT probing.
>>
>> This patch uses of_platform_device_create() from arm_smmu_of_setup() to
>> ensure that SMMU platform device is probed immediately.
>>
>> Signed-off-by: Anup Patel 
>> ---
>>  drivers/iommu/arm-smmu.c | 6 ++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>> index 2c8f871..02cd67d 100644
>> --- a/drivers/iommu/arm-smmu.c
>> +++ b/drivers/iommu/arm-smmu.c
>> @@ -40,6 +40,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  #include 
>>  #include 
>>  #include 
>> @@ -1956,10 +1957,15 @@ static int __init arm_smmu_init(void)
>>
>>  static int __init arm_smmu_of_setup(struct device_node *np)
>> {
>> + struct platform_device *pdev;
>>
>>   if (!init_done)
>>   arm_smmu_init();
>>
>> + pdev = of_platform_device_create(np, NULL, NULL);
>> + if (IS_ERR(pdev))
>> + return PTR_ERR(pdev);
>> +
>>   of_iommu_set_ops(np, &arm_smmu_ops);
>
>  A question here is: there was a probe deferral series [1] to take care
>  of deferred probing of devices behind IOMMUs. With that, this sort of
>  early device probing during setup should not be required, and it also
>  clears other dependencies of IOMMUs on clocks, etc., if any.
>  My intention was to check what's the right way to do this? (or) point
>  me to any updates on the probe deferral series that I missed?
>
> [1]  http://lkml.iu.edu/hypermail/linux/kernel/1505.3/03280.html

I think probe deferral is a better way of handling this. I will drop this
patch.

Thanks,
Anup
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: RFC: extend IOMMU attributes

2016-02-18 Thread Stuart Yoder


> -Original Message-
> From: Will Deacon [mailto:will.dea...@arm.com]
> Sent: Thursday, February 18, 2016 10:22 AM
> To: Stuart Yoder 
> Cc: j...@8bytes.org; robin.mur...@arm.com; iommu@lists.linux-foundation.org; 
> linux-arm-
> ker...@lists.infradead.org; Varun Sethi ; Bharat Bhushan
> ; Nipun Gupta ; Peter Newton
> 
> Subject: Re: RFC: extend IOMMU attributes
> 
> Hi Stuart,
> 
> On Thu, Feb 18, 2016 at 04:16:26PM +, Stuart Yoder wrote:
> > We are implementing support for some specialized NXP SoC network
> > devices and have the desire to extend the mapping attributes beyond
> > those currently in iommu.h. (I see there is a recent proposal to
> > add an IOMMU_MMIO flag)
> >
> > What we have right now in Linux is a least-common-denominator set of
> > attributes, while for the ARM SMMU there is a much richer set of
> > attributes that seem useful to support. Specifically, we have one
> > SoC device we're dealing with right now that is the target of DMA
> > that functionally requires a "cacheable, non-shareable" attribute
> > in its SMMU mapping.
> >
> > In addition, there are many other attributes such as r/w allocate
> > hints, transient hints, write-back/write-thru, etc in the SMMU.
> >
> > We wanted to see what your thinking is with respect to the
> > direction the Linux IOMMU layer will head over the longer term with
> > respect to attributes.
> >
> > Is there anything problematic with continuing to grow the
> > attributes in iommu.h?...e.g.:
> >
> >  #define IOMMU_READ            (1 << 0)
> >  #define IOMMU_WRITE           (1 << 1)
> > -#define IOMMU_CACHE           (1 << 2) /* DMA cache coherency */
> > +#define IOMMU_CACHE_COHERENT  (1 << 2) /* cacheable and coherent */
> >  #define IOMMU_NOEXEC          (1 << 3)
> >  #define IOMMU_MMIO            (1 << 4) /* e.g. things like MSI doorbells */
> > +#define IOMMU_CACHEABLE       (1 << 5) /* cacheable, not coherent */
> 
> What does that even mean?

It means allocate in the cache, but don't propagate coherency.
In our case we have a device using an A57 cluster ACP port
to do stashing.  

        cluster
  +---------------+
  |  A57     A57  |                      +------+
  |   ..ACP..<----+----------------------| SMMU |<... device
  |  +------+     |    non-coherent      +------+
  |  |  L2  |     |   write w/allocate
  |  +------+     |
  +-------.-------+
          .
  +-------v-------+
  |    CCN-504    |
  +-------.-------+
          .
  +-------v-------+
  |    device     |
  |  (descriptor  |
  |    rings)     |
  +---------------+

The device in question is doing a cacheable write targeting some
device descriptor rings.  The purpose of this transaction is to stash
the descriptor ring data in the L2 cache so the driver on the
A57s can get at it with low latency.  It has to be non-coherent
because of stuff related to how things are hooked up to CCN-504
(can take more details offline if you want).  Because it is non
coherent, the driver obviously has to be affined to the cluster.
It's a lot of hoops to jump through, but the performance benefit
is worth it.

> > +#define IOMMU_CACHE_ALLOCATE  (1 << 6) /* hint to allocate in the cache */
> >
> > Also, are we willing to let somewhat architecture specific flags
> > onto that list?  For, example the ARM 'transient' hint.
> 
> If we're going to support fine-grained attribute control, I think it needs
> to be done in a page-table specific manner. That is, io-pgtable-arm could
> provide those attribute controls which feature in the ARMv8 architecture...

So there would be some new API for users like vfio?

> ...but that brings me onto my next question: Who on Earth is actually
> going to provide these attributes to the IOMMU API?
> 
> There seems to be a missing piece.

Currently it would be vfio, because the driver is in user space.
There are some details to be worked out regarding how vfio would
determine and provide the attributes.  This is a device on the
fsl-mc bus (see drivers/staging/fsl-mc/README.txt) and the
vfio-fsl-mc bus layer should be able to determine what attributes
to use.

Once the kernel dma map API is hooked up to the SMMU, I think there
could be similar requirements down the road for kernel drivers.
If a driver is  intelligent enough to use some of the fine grained
attributes, how can that be supported?

Thanks,
Stuart


[PATCH V2] iommu/arm-smmu-v2: Workaround for ThunderX errata#27704

2016-02-18 Thread tchalamarla
From: Tirumalesh Chalamarla 

Due to Erratum #27704, the CN88xx SMMUv2 supports only shared ASID and VMID
namespaces; specifically, within a given node SMMU0 and SMMU1 share them,
as do SMMU2 and SMMU3.

This patch addresses these issues by supplying the ASID and VMID
base from the device tree.

Changes from V1:
- rebased on top of the 16-bit VMID patch
- removed redundant options from the DT
- instead of a transform, the DT now supplies the starting ASID/VMID

Signed-off-by: Akula Geethasowjanya 
Signed-off-by: Tirumalesh Chalamarla 
---
 .../devicetree/bindings/iommu/arm,smmu.txt |  8 +
 drivers/iommu/arm-smmu.c   | 37 +++---
 2 files changed, 34 insertions(+), 11 deletions(-)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
index bb7e569..80b8484 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
@@ -57,6 +57,14 @@ conditions.
 
 - smmu-enable-vmid16 : Enable 16 bit VMID, if allowed.
 
+- asid-base :  Buggy SMMUv2 implementations which don't satisfy the
+   ASID namespace needs use this field to specify the starting
+   ASID for the SMMU.
+
+- vmid-base :  Buggy SMMUv2 implementations which don't satisfy the
+   VMID namespace needs use this field to specify the starting
+   VMID for the SMMU.
+
 Example:
 
 smmu {
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 003c442..dc46b9a 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -320,6 +320,9 @@ struct arm_smmu_device {
unsigned long   ipa_size;
unsigned long   pa_size;
 
+   u32 asid_base;
+   u32 vmid_base;
+
u32 num_global_irqs;
u32 num_context_irqs;
unsigned int*irqs;
@@ -335,8 +338,8 @@ struct arm_smmu_cfg {
 };
 #define INVALID_IRPTNDX        0xff
 
-#define ARM_SMMU_CB_ASID(cfg)  ((cfg)->cbndx)
-#define ARM_SMMU_CB_VMID(cfg)  ((cfg)->cbndx + 1)
+#define ARM_SMMU_CB_ASID(smmu, cfg)    ((u16)((smmu)->asid_base + (cfg)->cbndx))
+#define ARM_SMMU_CB_VMID(smmu, cfg)    ((u16)((smmu)->vmid_base + (cfg)->cbndx))
 
 enum arm_smmu_domain_stage {
ARM_SMMU_DOMAIN_S1 = 0,
@@ -576,11 +579,11 @@ static void arm_smmu_tlb_inv_context(void *cookie)
 
if (stage1) {
base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cfg->cbndx);
-   writel_relaxed(ARM_SMMU_CB_ASID(cfg),
+   writel_relaxed(ARM_SMMU_CB_ASID(smmu, cfg),
   base + ARM_SMMU_CB_S1_TLBIASID);
} else {
base = ARM_SMMU_GR0(smmu);
-   writel_relaxed(ARM_SMMU_CB_VMID(cfg),
+   writel_relaxed(ARM_SMMU_CB_VMID(smmu, cfg),
   base + ARM_SMMU_GR0_TLBIVMID);
}
 
@@ -602,7 +605,7 @@ static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size,
 
if (!IS_ENABLED(CONFIG_64BIT) || smmu->version == ARM_SMMU_V1) {
iova &= ~12UL;
-   iova |= ARM_SMMU_CB_ASID(cfg);
+   iova |= ARM_SMMU_CB_ASID(smmu, cfg);
do {
writel_relaxed(iova, reg);
iova += granule;
@@ -610,7 +613,7 @@ static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size,
 #ifdef CONFIG_64BIT
} else {
iova >>= 12;
-   iova |= (u64)ARM_SMMU_CB_ASID(cfg) << 48;
+   iova |= (u64)ARM_SMMU_CB_ASID(smmu, cfg) << 48;
do {
writeq_relaxed(iova, reg);
iova += granule >> 12;
@@ -630,7 +633,7 @@ static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size,
 #endif
} else {
reg = ARM_SMMU_GR0(smmu) + ARM_SMMU_GR0_TLBIVMID;
-   writel_relaxed(ARM_SMMU_CB_VMID(cfg), reg);
+   writel_relaxed(ARM_SMMU_CB_VMID(smmu, cfg), reg);
}
 }
 
@@ -744,7 +747,7 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
 #endif
/* if 16bit VMID required set VMID in CBA2R */
if (smmu->options & ARM_SMMU_OPT_ENABLE_VMID16)
-   reg |= ARM_SMMU_CB_VMID(cfg) << CBA2R_VMID_SHIFT;
+   reg |= ARM_SMMU_CB_VMID(smmu, cfg) << CBA2R_VMID_SHIFT;
 
writel_relaxed(reg, gr1_base + ARM_SMMU_GR1_CBA2R(cfg->cbndx));
}
@@ -763,7 +766,7 @@ static void 

[PATCH] iommu/arm-smmu-v2: Add support for 16 bit VMID

2016-02-18 Thread tchalamarla
From: Tirumalesh Chalamarla 

The ARM SMMUv2 supports up to a 16-bit VMID. This patch enables the
16-bit VMID when requested from the device tree.

Signed-off-by: Tirumalesh Chalamarla 
---
 .../devicetree/bindings/iommu/arm,smmu.txt  |  2 ++
 drivers/iommu/arm-smmu.c| 21 -
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
index 7180745..bb7e569 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
@@ -55,6 +55,8 @@ conditions.
   aliases of secure registers have to be used during
   SMMU configuration.
 
+- smmu-enable-vmid16 : Enable 16 bit VMID, if allowed.
+
 Example:
 
 smmu {
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 59ee4b8..003c442 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -93,6 +93,7 @@
 #define sCR0_VMIDPNE   (1 << 11)
 #define sCR0_PTM   (1 << 12)
 #define sCR0_FB(1 << 13)
+#define sCR0_VMID16EN  (1 << 31)
 #define sCR0_BSU_SHIFT 14
 #define sCR0_BSU_MASK  0x3
 
@@ -140,6 +141,7 @@
 #define ID2_PTFS_4K(1 << 12)
 #define ID2_PTFS_16K   (1 << 13)
 #define ID2_PTFS_64K   (1 << 14)
+#define ID2_VMID16 (1 << 15)
 
 /* Global TLB invalidation */
 #define ARM_SMMU_GR0_TLBIVMID  0x64
@@ -189,6 +191,8 @@
 #define ARM_SMMU_GR1_CBA2R(n)  (0x800 + ((n) << 2))
 #define CBA2R_RW64_32BIT   (0 << 0)
 #define CBA2R_RW64_64BIT   (1 << 0)
+#define CBA2R_VMID_SHIFT   16
+#define CBA2R_VMID_MASK    0xffff
 
 /* Translation context bank */
 #define ARM_SMMU_CB_BASE(smmu) ((smmu)->base + ((smmu)->size >> 1))
@@ -300,6 +304,7 @@ struct arm_smmu_device {
u32 features;
 
 #define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 << 0)
+#define ARM_SMMU_OPT_ENABLE_VMID16 (1 << 1)
u32 options;
enum arm_smmu_arch_version  version;
 
@@ -361,6 +366,7 @@ struct arm_smmu_option_prop {
 
 static struct arm_smmu_option_prop arm_smmu_options[] = {
{ ARM_SMMU_OPT_SECURE_CFG_ACCESS, "calxeda,smmu-secure-config-access" },
+   { ARM_SMMU_OPT_ENABLE_VMID16, "smmu-enable-vmid16" },
{ 0, NULL},
 };
 
@@ -736,6 +742,10 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
 #else
reg = CBA2R_RW64_32BIT;
 #endif
+   /* if 16bit VMID required set VMID in CBA2R */
+   if (smmu->options & ARM_SMMU_OPT_ENABLE_VMID16)
+   reg |= ARM_SMMU_CB_VMID(cfg) << CBA2R_VMID_SHIFT;
+
writel_relaxed(reg, gr1_base + ARM_SMMU_GR1_CBA2R(cfg->cbndx));
}
 
@@ -751,7 +761,8 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
if (stage1) {
reg |= (CBAR_S1_BPSHCFG_NSH << CBAR_S1_BPSHCFG_SHIFT) |
(CBAR_S1_MEMATTR_WB << CBAR_S1_MEMATTR_SHIFT);
-   } else {
+   } else if (!(smmu->options & ARM_SMMU_OPT_ENABLE_VMID16)) {
+   /* 16-bit VMID is not enabled, so set the 8-bit VMID here */
reg |= ARM_SMMU_CB_VMID(cfg) << CBAR_VMID_SHIFT;
}
writel_relaxed(reg, gr1_base + ARM_SMMU_GR1_CBAR(cfg->cbndx));
@@ -1508,6 +1519,14 @@ static void arm_smmu_device_reset(struct arm_smmu_device *smmu)
/* Don't upgrade barriers */
reg &= ~(sCR0_BSU_MASK << sCR0_BSU_SHIFT);
 
+   /* See if 16bit VMID is required */
+   if (smmu->options & ARM_SMMU_OPT_ENABLE_VMID16) {
+   u32 id = readl_relaxed(gr0_base + ARM_SMMU_GR0_ID2);
+   /* Check to see if HW accepts */
+   if (id & ID2_VMID16)
+   reg |= (sCR0_VMID16EN);
+   }
+
/* Push the button */
__arm_smmu_tlb_sync(smmu);
writel(reg, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0);
-- 
2.1.0



RFC: extend IOMMU attributes

2016-02-18 Thread Stuart Yoder
Hi Joerg and Will,

We are implementing support for some specialized NXP SoC network
devices and have the desire to extend the mapping attributes beyond
those currently in iommu.h. (I see there is a recent proposal to
add an IOMMU_MMIO flag)

What we have right now in Linux is a least-common-denominator set of
attributes, while for the ARM SMMU there is a much richer set of
attributes that seem useful to support. Specifically, we have one
SoC device we're dealing with right now that is the target of DMA
that functionally requires a "cacheable, non-shareable" attribute
in its SMMU mapping.

In addition, there are many other attributes such as r/w allocate
hints, transient hints, write-back/write-thru, etc in the SMMU.

We wanted to see what your thinking is with respect to the
direction the Linux IOMMU layer will head over the longer term with
respect to attributes.

Is there anything problematic with continuing to grow the
attributes in iommu.h?...e.g.:

 #define IOMMU_READ            (1 << 0)
 #define IOMMU_WRITE           (1 << 1)
-#define IOMMU_CACHE           (1 << 2) /* DMA cache coherency */
+#define IOMMU_CACHE_COHERENT  (1 << 2) /* cacheable and coherent */
 #define IOMMU_NOEXEC          (1 << 3)
 #define IOMMU_MMIO            (1 << 4) /* e.g. things like MSI doorbells */
+#define IOMMU_CACHEABLE       (1 << 5) /* cacheable, not coherent */
+#define IOMMU_CACHE_ALLOCATE  (1 << 6) /* hint to allocate in the cache */

Also, are we willing to let somewhat architecture specific flags
onto that list?  For, example the ARM 'transient' hint.

Another potential direction would be to provide some approach
where IOMMU PTE attributes could be somehow derived from a
corresponding MMU PTE in an arch-specific manner and the IOMMU
API itself would not have to explicitly define all possible
attributes.

Thoughts?

Thanks,
Stuart


[GIT PULL] iommu/io-pgtable: Updates for 4.6

2016-02-18 Thread Will Deacon
Hi Joerg,

Here are the io-pgtable updates from Robin for 4.6.

The bulk of this is adding support for the ARM v7s page table format,
which will be used by the Mediatek IOMMU driver from Yong Wu. Subsequent
work may also enable this format for the ARM SMMU.

Cheers,

Will

--->8

The following changes since commit 18558cae0272f8fd9647e69d3fec1565a7949865:

  Linux 4.5-rc4 (2016-02-14 13:05:20 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git for-joerg/io-pgtable

for you to fetch changes up to c8bff3a68f33b94e53f65aa9a4c0e9be34b6b514:

  MAINTAINERS: update ARM SMMU entry (2016-02-17 14:43:40 +)


Robin Murphy (4):
  iommu/io-pgtable: Add ARMv7 short descriptor support
  iommu/io-pgtable: Add helper functions for TLB ops
  iommu/io-pgtable: Avoid redundant TLB syncs
  iommu/io-pgtable: Rationalise quirk handling

Will Deacon (1):
  MAINTAINERS: update ARM SMMU entry

 MAINTAINERS|   2 +
 drivers/iommu/Kconfig  |  19 +
 drivers/iommu/Makefile |   1 +
 drivers/iommu/io-pgtable-arm-v7s.c | 846 +
 drivers/iommu/io-pgtable-arm.c |  34 +-
 drivers/iommu/io-pgtable.c |   5 +-
 drivers/iommu/io-pgtable.h |  53 ++-
 7 files changed, 943 insertions(+), 17 deletions(-)
 create mode 100644 drivers/iommu/io-pgtable-arm-v7s.c


Re: [RFC v3 07/15] iommu: iommu_get/put_single_reserved

2016-02-18 Thread Eric Auger
Hi Marc,
On 02/18/2016 05:51 PM, Marc Zyngier wrote:
> On 18/02/16 16:42, Eric Auger wrote:
>> Hello,
>> On 02/18/2016 12:06 PM, Marc Zyngier wrote:
>>> On Fri, 12 Feb 2016 08:13:09 +
>>> Eric Auger  wrote:
>>>
 This patch introduces iommu_get/put_single_reserved.

 iommu_get_single_reserved allows allocating a new reserved iova page
 and mapping it onto the physical page that contains a given physical address.
 It returns the iova that is mapped onto the provided physical address.
 Hence the physical address passed in argument does not need to be aligned.

 In case a mapping already exists between both pages, the IOVA mapped
 to the PA is directly returned.

 Each time an iova is successfully returned a binding ref count is
 incremented.

 iommu_put_single_reserved decrements the ref count, and when it reaches
 zero the mapping is destroyed and the iova is released.
>>>
>>> I wonder if there is a requirement for the caller to find out about the
>>> size of the mapping, or to impose a given size... MSIs clearly do not
>>> have that requirement (this is always a 32bit value), but since
>>> allocations usually pair address and size, I thought I'd ask...
>> Yes. Currently this only makes sure the host PA is mapped and returns
>> the corresponding IOVA. It is part of the discussion we need to have on
>> the API, besides the question of which API it should belong to.
> 
> One of the issues I have with the API at the moment is that there is no
> control on the page size. Imagine you have allocated a 4kB IOVA window
> for your MSI, but your IOMMU can only map 64kB (not unreasonable to
> imagine on arm64). What happens then?
The code checks the IOVA window size is aligned with the IOMMU page size
so I think that case is handled at iova domain creation
(arm_smmu_alloc_reserved_iova_domain).
> 
> Somehow, userspace should be told about it, one way or another.
I agree on that point. Userspace should be provided with the
information about the requested iova pool size and alignment. This is
missing in the current RFC series.

Best Regards

Eric
> 
> Thanks,
> 
>   M.
> 



Re: [RFC v3 15/15] irqchip/gicv2m/v3-its-pci-msi: IOMMU map the MSI frame when needed

2016-02-18 Thread Eric Auger
Hi Marc,
On 02/18/2016 04:47 PM, Marc Zyngier wrote:
> On 18/02/16 15:33, Eric Auger wrote:
>> Hi Marc,
>> On 02/18/2016 12:33 PM, Marc Zyngier wrote:
>>> On Fri, 12 Feb 2016 08:13:17 +
>>> Eric Auger  wrote:
>>>
 In case the msi_desc references a device attached to an iommu
 domain, the msi address needs to be mapped in the IOMMU. Else any
 MSI write transaction will cause a fault.

 gic_set_msi_addr detects that case and allocates an iova bound
 to the physical address page comprising the MSI frame. This iova
 then is used as the msi_msg address. Unset operation decrements the
 reference on the binding.

 The functions are called in the irq_write_msi_msg ops implementation.
 At that time we can recognize whether the msi is set up or torn down
 looking at the msi_msg content. Indeed msi_domain_deactivate zeroes all
 the fields.

 Signed-off-by: Eric Auger 

 ---

 v2 -> v3:
 - protect iova/addr manipulation with CONFIG_ARCH_DMA_ADDR_T_64BIT and
   CONFIG_PHYS_ADDR_T_64BIT
 - only expose gic_pci_msi_domain_write_msg in case CONFIG_IOMMU_API &
   CONFIG_PCI_MSI_IRQ_DOMAIN are set.
 - gic_set/unset_msi_addr duly become static
 ---
  drivers/irqchip/irq-gic-common.c | 69 
 
  drivers/irqchip/irq-gic-common.h |  5 +++
  drivers/irqchip/irq-gic-v2m.c|  7 +++-
  drivers/irqchip/irq-gic-v3-its-pci-msi.c |  5 +++
  4 files changed, 85 insertions(+), 1 deletion(-)

 diff --git a/drivers/irqchip/irq-gic-common.c b/drivers/irqchip/irq-gic-common.c
 index f174ce0..46cd06c 100644
 --- a/drivers/irqchip/irq-gic-common.c
 +++ b/drivers/irqchip/irq-gic-common.c
 @@ -18,6 +18,8 @@
  #include 
  #include 
  #include 
 +#include 
 +#include 
  
  #include "irq-gic-common.h"
  
 @@ -121,3 +123,70 @@ void gic_cpu_config(void __iomem *base, void (*sync_access)(void))
if (sync_access)
sync_access();
  }
 +
 +#if defined(CONFIG_IOMMU_API) && defined(CONFIG_PCI_MSI_IRQ_DOMAIN)
 +static int gic_set_msi_addr(struct irq_data *data, struct msi_msg *msg)
 +{
 +  struct msi_desc *desc = irq_data_get_msi_desc(data);
 +  struct device *dev = msi_desc_to_dev(desc);
 +  struct iommu_domain *d;
 +  phys_addr_t addr;
 +  dma_addr_t iova;
 +  int ret;
 +
 +  d = iommu_get_domain_for_dev(dev);
 +  if (!d)
 +  return 0;
 +#ifdef CONFIG_PHYS_ADDR_T_64BIT
 +  addr = ((phys_addr_t)(msg->address_hi) << 32) | msg->address_lo;
 +#else
 +  addr = msg->address_lo;
 +#endif
 +
 +  ret = iommu_get_single_reserved(d, addr, IOMMU_WRITE, &iova);
 +
 +  if (!ret) {
 +  msg->address_lo = lower_32_bits(iova);
 +  msg->address_hi = upper_32_bits(iova);
 +  }
 +  return ret;
 +}
 +
 +
 +static void gic_unset_msi_addr(struct irq_data *data)
 +{
 +  struct msi_desc *desc = irq_data_get_msi_desc(data);
 +  struct device *dev;
 +  struct iommu_domain *d;
 +  dma_addr_t iova;
 +
 +#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
 +  iova = ((dma_addr_t)(desc->msg.address_hi) << 32) |
 +  desc->msg.address_lo;
 +#else
 +  iova = desc->msg.address_lo;
 +#endif
 +
 +  dev = msi_desc_to_dev(desc);
 +  if (!dev)
 +  return;
 +
 +  d = iommu_get_domain_for_dev(dev);
 +  if (!d)
 +  return;
 +
 +  iommu_put_single_reserved(d, iova);
 +}
 +
 +void gic_pci_msi_domain_write_msg(struct irq_data *irq_data,
 +struct msi_msg *msg)
 +{
 +  if (!msg->address_hi && !msg->address_lo && !msg->data)
 +  gic_unset_msi_addr(irq_data); /* deactivate */
 +  else
 +  gic_set_msi_addr(irq_data, msg); /* activate, set_affinity */
 +
 +  pci_msi_domain_write_msg(irq_data, msg);
 +}
>>>
>>> So by doing that, you are specializing this infrastructure to PCI.
>>> If you hijacked irq_compose_msi_msg() instead, you'd have both
>>> platform and PCI MSI for the same price.
>>>
>>> I can see a potential problem with the teardown of an MSI (I don't
>>> think the compose method is called on teardown), but I think this could
>>> be easily addressed.
>> Yes effectively this is the reason why I moved it from
>> irq_compose_msi_msg (my original implementation) to irq_write_msi_msg. I
>> noticed I had no way to detect the teardown whereas the
>> msi_domain_deactivate also calls irq_write_msi_msg which is quite
>> practical ;-) To be honest I need to further look at the non PCI
>> implementation.
> 
> Another thing to consider is that MSI controllers may use different
> doorbells for different CPU affinities.

OK thanks for pointing this out.

This is 

Re: [PATCH v8 3/5] memory: mediatek: Add SMI driver

2016-02-18 Thread Philipp Zabel
Hi Joerg,

Am Dienstag, den 16.02.2016, 17:26 +0100 schrieb Joerg Roedel:
> On Tue, Feb 16, 2016 at 04:20:00PM +, Will Deacon wrote:
> > I'm more than happy to bake a branch containing all the page table stuff,
> > but Yong's stuff depends on it so I'll work with whatever is easiest for
> > you.
> 
> Okay, to the best is if you would collect all the page table patches and
> send me a pull-request. I pull that into my tree then and put Yong's
> stuff on-top.

could you provide a branch that I may then pull in, stack the
mediatek-drm driver patches (that depend on the SMI driver) on top and
have them merged via drm-next?

regards
Philipp



Re: [RFC v3 07/15] iommu: iommu_get/put_single_reserved

2016-02-18 Thread Eric Auger
Hello,
On 02/18/2016 12:06 PM, Marc Zyngier wrote:
> On Fri, 12 Feb 2016 08:13:09 +
> Eric Auger  wrote:
> 
>> This patch introduces iommu_get/put_single_reserved.
>>
>> iommu_get_single_reserved allows allocating a new reserved iova page
>> and mapping it onto the physical page that contains a given physical address.
>> It returns the iova that is mapped onto the provided physical address.
>> Hence the physical address passed in argument does not need to be aligned.
>>
>> In case a mapping already exists between both pages, the IOVA mapped
>> to the PA is directly returned.
>>
>> Each time an iova is successfully returned a binding ref count is
>> incremented.
>>
>> iommu_put_single_reserved decrements the ref count, and when it reaches
>> zero the mapping is destroyed and the iova is released.
> 
> I wonder if there is a requirement for the caller to find out about the
> size of the mapping, or to impose a given size... MSIs clearly do not
> have that requirement (this is always a 32bit value), but since
> allocations usually pair address and size, I thought I'd ask...
Yes. Currently this only makes sure the host PA is mapped and returns
the corresponding IOVA. It is part of the discussion we need to have on
the API, besides the question of which API it should belong to.

Thanks

Eric
> 
> Thanks,
> 
>   M.
> 



Re: RFC: extend IOMMU attributes

2016-02-18 Thread Will Deacon
Hi Stuart,

On Thu, Feb 18, 2016 at 04:16:26PM +, Stuart Yoder wrote:
> We are implementing support for some specialized NXP SoC network
> devices and have the desire to extend the mapping attributes beyond
> those currently in iommu.h. (I see there is a recent proposal to
> add an IOMMU_MMIO flag)
> 
> What we have right now in Linux is a least-common-denominator set of
> attributes, while for the ARM SMMU there is a much richer set of
> attributes that seem useful to support. Specifically, we have one
> SoC device we're dealing with right now that is the target of DMA
> that functionally requires a "cacheable, non-shareable" attribute
> in its SMMU mapping.
> 
> In addition, there are many other attributes such as r/w allocate
> hints, transient hints, write-back/write-thru, etc in the SMMU.
> 
> We wanted to see what your thinking is with respect to the
> direction the Linux IOMMU layer will head over the longer term with
> respect to attributes.
> 
> Is there anything problematic with continuing to grow the
> attributes in iommu.h?...e.g.:
> 
>  #define IOMMU_READ            (1 << 0)
>  #define IOMMU_WRITE           (1 << 1)
> -#define IOMMU_CACHE           (1 << 2) /* DMA cache coherency */
> +#define IOMMU_CACHE_COHERENT  (1 << 2) /* cacheable and coherent */
>  #define IOMMU_NOEXEC          (1 << 3)
>  #define IOMMU_MMIO            (1 << 4) /* e.g. things like MSI doorbells */
> +#define IOMMU_CACHEABLE       (1 << 5) /* cacheable, not coherent */

What does that even mean?

> +#define IOMMU_CACHE_ALLOCATE  (1 << 6) /* hint to allocate in the cache */
> 
> Also, are we willing to let somewhat architecture specific flags
> onto that list?  For, example the ARM 'transient' hint.

If we're going to support fine-grained attribute control, I think it needs
to be done in a page-table specific manner. That is, io-pgtable-arm could
provide those attribute controls which feature in the ARMv8 architecture...

...but that brings me onto my next question: Who on Earth is actually
going to provide these attributes to the IOMMU API?

There seems to be a missing piece.

Will


Re: [RFC v3 05/15] iommu/arm-smmu: implement alloc/free_reserved_iova_domain

2016-02-18 Thread Alex Williamson
On Thu, 18 Feb 2016 11:09:17 +
Robin Murphy  wrote:

> Hi Eric,
> 
> On 12/02/16 08:13, Eric Auger wrote:
> > Implement alloc/free_reserved_iova_domain for arm-smmu. we use
> > the iova allocator (iova.c). The iova_domain is attached to the
> > arm_smmu_domain struct. A mutex is introduced to protect it.  
> 
> The IOMMU API currently leaves IOVA management entirely up to the caller 
> - VFIO is already managing its own IOVA space, so what warrants this 
> being pushed all the way down to the IOMMU driver? All I see here is 
> abstract code with no hardware-specific details that'll have to be 
> copy-pasted into other IOMMU drivers (e.g. SMMUv3), which strongly 
> suggests it's the wrong place to do it.
> 
> As I understand the problem, VFIO has a generic "configure an IOMMU to 
> point at an MSI doorbell" step to do in the process of attaching a 
> device, which hasn't needed implementing yet due to VT-d's 
> IOMMU_CAP_I_AM_ALSO_ACTUALLY_THE_MSI_CONTROLLER_IN_DISGUISE flag, which 
> most of us have managed to misinterpret so far. AFAICS all the IOMMU 
> driver should need to know about this is an iommu_map() call (which will 
> want a slight extension[1] to make things behave properly). We should be 
> fixing the abstraction to be less x86-centric, not hacking up all the 
> ARM drivers to emulate x86 hardware behaviour in software.

The gap I see, which the I_AM_ALSO_ACTUALLY_THE_MSI... solution
transparently fixes, is that there's no connection between
pci_enable_msi{x}_range and the IOMMU API.  If I want to allow a device
managed by an IOMMU API domain to perform MSI, I need to go scrape the
MSI vectors out of the device, setup a translation into my IOVA space,
and re-write those vectors.  Not to mention that as an end user, I
have no idea what might be sharing the page where those vectors are
targeted and what I might be allowing the user DMA access to.  MSI
setup is necessarily making use of the IOVA space of the device, so
there's clearly an opportunity to interact with the IOMMU API to manage
that IOVA usage.  x86 has an implicit range of IOVA space for MSI, this
makes an explicit range, reserved by the IOMMU API user for this
purpose.  At the vfio level, I just want to be able to call the PCI
MSI/X setup routines and have them automatically program vectors that
make use of IOVA space that I've already marked reserved for this
purpose.  I don't see how that's x86-centric other than x86 has already
managed to make this transparent and spoiled users into expecting
working IOVAs on the device after using standard MSI vector setup
callbacks.  That's the goal I'm looking for.  Thanks,

Alex


Re: [RFC v3 15/15] irqchip/gicv2m/v3-its-pci-msi: IOMMU map the MSI frame when needed

2016-02-18 Thread Eric Auger
Hi Marc,
On 02/18/2016 12:33 PM, Marc Zyngier wrote:
> On Fri, 12 Feb 2016 08:13:17 +
> Eric Auger  wrote:
> 
>> In case the msi_desc references a device attached to an iommu
>> domain, the msi address needs to be mapped in the IOMMU. Else any
>> MSI write transaction will cause a fault.
>>
>> gic_set_msi_addr detects that case and allocates an iova bound
>> to the physical address page comprising the MSI frame. This iova
>> then is used as the msi_msg address. Unset operation decrements the
>> reference on the binding.
>>
>> The functions are called in the irq_write_msi_msg ops implementation.
>> At that time we can recognize whether the MSI is being set up or torn down by
>> looking at the msi_msg content. Indeed msi_domain_deactivate zeroes all
>> the fields.
>>
>> Signed-off-by: Eric Auger 
>>
>> ---
>>
>> v2 -> v3:
>> - protect iova/addr manipulation with CONFIG_ARCH_DMA_ADDR_T_64BIT and
>>   CONFIG_PHYS_ADDR_T_64BIT
>> - only expose gic_pci_msi_domain_write_msg in case CONFIG_IOMMU_API &
>>   CONFIG_PCI_MSI_IRQ_DOMAIN are set.
>> - gic_set/unset_msi_addr duly become static
>> ---
>>  drivers/irqchip/irq-gic-common.c | 69 
>> 
>>  drivers/irqchip/irq-gic-common.h |  5 +++
>>  drivers/irqchip/irq-gic-v2m.c|  7 +++-
>>  drivers/irqchip/irq-gic-v3-its-pci-msi.c |  5 +++
>>  4 files changed, 85 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/irqchip/irq-gic-common.c 
>> b/drivers/irqchip/irq-gic-common.c
>> index f174ce0..46cd06c 100644
>> --- a/drivers/irqchip/irq-gic-common.c
>> +++ b/drivers/irqchip/irq-gic-common.c
>> @@ -18,6 +18,8 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>> +#include 
>>  
>>  #include "irq-gic-common.h"
>>  
>> @@ -121,3 +123,70 @@ void gic_cpu_config(void __iomem *base, void 
>> (*sync_access)(void))
>>  if (sync_access)
>>  sync_access();
>>  }
>> +
>> +#if defined(CONFIG_IOMMU_API) && defined(CONFIG_PCI_MSI_IRQ_DOMAIN)
>> +static int gic_set_msi_addr(struct irq_data *data, struct msi_msg *msg)
>> +{
>> +struct msi_desc *desc = irq_data_get_msi_desc(data);
>> +struct device *dev = msi_desc_to_dev(desc);
>> +struct iommu_domain *d;
>> +phys_addr_t addr;
>> +dma_addr_t iova;
>> +int ret;
>> +
>> +d = iommu_get_domain_for_dev(dev);
>> +if (!d)
>> +return 0;
>> +#ifdef CONFIG_PHYS_ADDR_T_64BIT
>> +addr = ((phys_addr_t)(msg->address_hi) << 32) | msg->address_lo;
>> +#else
>> +addr = msg->address_lo;
>> +#endif
>> +
>> +ret = iommu_get_single_reserved(d, addr, IOMMU_WRITE, &iova);
>> +
>> +if (!ret) {
>> +msg->address_lo = lower_32_bits(iova);
>> +msg->address_hi = upper_32_bits(iova);
>> +}
>> +return ret;
>> +}
>> +
>> +
>> +static void gic_unset_msi_addr(struct irq_data *data)
>> +{
>> +struct msi_desc *desc = irq_data_get_msi_desc(data);
>> +struct device *dev;
>> +struct iommu_domain *d;
>> +dma_addr_t iova;
>> +
>> +#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
>> +iova = ((dma_addr_t)(desc->msg.address_hi) << 32) |
>> +desc->msg.address_lo;
>> +#else
>> +iova = desc->msg.address_lo;
>> +#endif
>> +
>> +dev = msi_desc_to_dev(desc);
>> +if (!dev)
>> +return;
>> +
>> +d = iommu_get_domain_for_dev(dev);
>> +if (!d)
>> +return;
>> +
>> +iommu_put_single_reserved(d, iova);
>> +}
>> +
>> +void gic_pci_msi_domain_write_msg(struct irq_data *irq_data,
>> +  struct msi_msg *msg)
>> +{
>> +if (!msg->address_hi && !msg->address_lo && !msg->data)
>> +gic_unset_msi_addr(irq_data); /* deactivate */
>> +else
>> +gic_set_msi_addr(irq_data, msg); /* activate, set_affinity */
>> +
>> +pci_msi_domain_write_msg(irq_data, msg);
>> +}
> 
> So by doing that, you are specializing this infrastructure to PCI.
> If you hijacked irq_compose_msi_msg() instead, you'd have both
> platform and PCI MSI for the same price.
> 
> I can see a potential problem with the teardown of an MSI (I don't
> think the compose method is called on teardown), but I think this could
> be easily addressed.
Yes, effectively this is the reason why I moved it from
irq_compose_msi_msg (my original implementation) to irq_write_msi_msg. I
noticed I had no way to detect the teardown, whereas
msi_domain_deactivate also calls irq_write_msi_msg, which is quite
practical ;-) To be honest, I need to look further at the non-PCI
implementation.
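The zero-message convention discussed above can be sketched in isolation (a simplified stand-in for the kernel's struct msi_msg, not the actual GIC code):

```c
#include <stdint.h>
#include <stdbool.h>
#include <assert.h>

/* Simplified stand-in for the kernel's struct msi_msg. */
struct msi_msg {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
};

/*
 * msi_domain_deactivate() writes an all-zero message, so a fully zeroed
 * msg is the only signal irq_write_msi_msg() gets that this is teardown.
 */
static bool msi_msg_is_teardown(const struct msi_msg *msg)
{
	return !msg->address_hi && !msg->address_lo && !msg->data;
}

/* Compose the 64-bit doorbell address the same way gic_set_msi_addr does. */
static uint64_t msi_msg_addr(const struct msi_msg *msg)
{
	return ((uint64_t)msg->address_hi << 32) | msg->address_lo;
}
```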


> 
>> +#endif
>> +
>> diff --git a/drivers/irqchip/irq-gic-common.h 
>> b/drivers/irqchip/irq-gic-common.h
>> index fff697d..98681fd 100644
>> --- a/drivers/irqchip/irq-gic-common.h
>> +++ b/drivers/irqchip/irq-gic-common.h
>> @@ -35,4 +35,9 @@ void gic_cpu_config(void __iomem *base, void 
>> (*sync_access)(void));
>>  void gic_enable_quirks(u32 iidr, const struct gic_quirk *quirks,
>>  void *data);
>>  
>> +#if 

Re: [RFC v3 02/15] vfio: expose MSI mapping requirement through VFIO_IOMMU_GET_INFO

2016-02-18 Thread Eric Auger
Hi Marc,
On 02/18/2016 10:34 AM, Marc Zyngier wrote:
> On Fri, 12 Feb 2016 08:13:04 +
> Eric Auger  wrote:
> 
>> This patch allows the user-space to retrieve whether msi write
>> transaction addresses must be mapped. This is returned through the
>> VFIO_IOMMU_GET_INFO API and its new flag: VFIO_IOMMU_INFO_REQUIRE_MSI_MAP.
>>
>> Signed-off-by: Bharat Bhushan 
>> Signed-off-by: Eric Auger 
>>
>> ---
>>
>> RFC v1 -> v1:
>> - derived from
>>   [RFC PATCH 3/6] vfio: Extend iommu-info to return MSIs automap state
>> - renamed allow_msi_reconfig into require_msi_mapping
>> - fixed VFIO_IOMMU_GET_INFO
>> ---
>>  drivers/vfio/vfio_iommu_type1.c | 26 ++
>>  include/uapi/linux/vfio.h   |  1 +
>>  2 files changed, 27 insertions(+)
>>
>> diff --git a/drivers/vfio/vfio_iommu_type1.c 
>> b/drivers/vfio/vfio_iommu_type1.c
>> index 6f1ea3d..c5b57e1 100644
>> --- a/drivers/vfio/vfio_iommu_type1.c
>> +++ b/drivers/vfio/vfio_iommu_type1.c
>> @@ -255,6 +255,29 @@ static int vaddr_get_pfn(unsigned long vaddr, int prot, 
>> unsigned long *pfn)
>>  }
>>  
>>  /*
>> + * vfio_domains_require_msi_mapping: indicates whether MSI write transaction
>> + * addresses must be mapped
>> + *
>> + * returns true if it does
>> + */
>> +static bool vfio_domains_require_msi_mapping(struct vfio_iommu *iommu)
>> +{
>> +struct vfio_domain *d;
>> +bool ret;
>> +
>> +mutex_lock(&iommu->lock);
>> +/* All domains have same require_msi_map property, pick first */
>> +d = list_first_entry(&iommu->domain_list, struct vfio_domain, next);
>> +if (iommu_domain_get_attr(d->domain, DOMAIN_ATTR_MSI_MAPPING, NULL) < 0)
>> +ret = false;
>> +else
>> +ret = true;
> 
> nit: this could be simplified as:
> 
> ret = (iommu_domain_get_attr(d->domain, DOMAIN_ATTR_MSI_MAPPING, NULL) == 0);
sure ;-)
> 
>> +mutex_unlock(&iommu->lock);
>> +
>> +return ret;
>> +}
>> +
>> +/*
>>   * Attempt to pin pages.  We really don't want to track all the pfns and
>>   * the iommu can only map chunks of consecutive pfns anyway, so get the
>>   * first page and all consecutive pages with the same locking.
>> @@ -997,6 +1020,9 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
>>  
>>  info.flags = VFIO_IOMMU_INFO_PGSIZES;
>>  
>> +if (vfio_domains_require_msi_mapping(iommu))
>> +info.flags |= VFIO_IOMMU_INFO_REQUIRE_MSI_MAP;
>> +
>>  info.iova_pgsizes = vfio_pgsize_bitmap(iommu);
>>  
>>  return copy_to_user((void __user *)arg, &info, minsz);
>> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
>> index 7d7a4c6..43e183b 100644
>> --- a/include/uapi/linux/vfio.h
>> +++ b/include/uapi/linux/vfio.h
>> @@ -400,6 +400,7 @@ struct vfio_iommu_type1_info {
>>  __u32   argsz;
>>  __u32   flags;
>>  #define VFIO_IOMMU_INFO_PGSIZES (1 << 0)/* supported page sizes info */
>> +#define VFIO_IOMMU_INFO_REQUIRE_MSI_MAP (1 << 1)/* MSI must be mapped */
>>  __u64   iova_pgsizes;   /* Bitmap of supported page sizes */
>>  };
>>  
> 
> 
> FWIW:
> 
> Acked-by: Marc Zyngier 
thanks

Eric
> 
>   M.
> 
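For reference, a sketch of how a user-space consumer might test the new flag (the struct layout and flag values mirror the patch; the helper itself is hypothetical):

```c
#include <stdint.h>
#include <stdbool.h>
#include <assert.h>

/* Mirrors the uapi additions from the patch above. */
struct vfio_iommu_type1_info {
	uint32_t argsz;
	uint32_t flags;
#define VFIO_IOMMU_INFO_PGSIZES         (1 << 0)
#define VFIO_IOMMU_INFO_REQUIRE_MSI_MAP (1 << 1)
	uint64_t iova_pgsizes;
};

/*
 * After VFIO_IOMMU_GET_INFO, user space checks whether it must reserve
 * IOVA space for MSI doorbell mappings before enabling MSIs.
 */
static bool require_msi_mapping(const struct vfio_iommu_type1_info *info)
{
	return info->flags & VFIO_IOMMU_INFO_REQUIRE_MSI_MAP;
}
```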



Re: [RFC v3 05/15] iommu/arm-smmu: implement alloc/free_reserved_iova_domain

2016-02-18 Thread Eric Auger
Hi Robin,
On 02/18/2016 12:09 PM, Robin Murphy wrote:
> Hi Eric,
> 
> On 12/02/16 08:13, Eric Auger wrote:
>> Implement alloc/free_reserved_iova_domain for arm-smmu. We use
>> the iova allocator (iova.c). The iova_domain is attached to the
>> arm_smmu_domain struct. A mutex is introduced to protect it.
> 
> The IOMMU API currently leaves IOVA management entirely up to the caller
I agree

> - VFIO is already managing its own IOVA space, so what warrants this
> being pushed all the way down to the IOMMU driver?
In practice, with the upstreamed code, VFIO uses IOVA = GPA, provided by
user space (corresponding to RAM regions), and does not allocate IOVAs
itself. The IOVA is passed through the VFIO_IOMMU_MAP_DMA ioctl.

With this series we propose that user space provides a pool of unused
IOVAs that can be used to map host physical addresses (the MSI frame
address). So effectively someone needs an IOVA allocator to allocate
within that window. This can be VFIO or the IOMMU driver, but in both
cases it is a new capability introduced in that component.

In the first version of this series
(https://lkml.org/lkml/2016/1/26/371) I put this IOVA allocation in
vfio_iommu_type1. The vfio-pci driver then used this VFIO-internal
API to overwrite the physical address written into the PCI device by the
MSI controller.

However I was advised by Alex to move things at a lower level
(http://www.spinics.net/lists/kvm/msg126809.html), IOMMU core or irq
remapping driver; also the MSI controller should directly program the
IOVA address in the PCI device.

On ARM, irq remapping is somewhat abstracted in the ITS driver. We also
need that functionality in GICv2M, so I eventually chose to put it in
the IOMMU driver. Since iova.c is not compiled by everyone, and since
that functionality is needed only for a restricted set of architectures
(ARM/ARM64 & PowerPC), I chose to implement this in arch-specific code,
for the time being in arm-smmu.c.

This allows the MSI controller to interact with the IOMMU API to bind
its MSI address. It may be feasible to have the MSI controller interact
with the VFIO external user API instead, but would that look any better?

Assuming we can agree on the relevance of adding that functionality at
the IOMMU API level, maybe we can create a separate .c file to share
code between arm-smmu.c and arm-smmu-v3.c? Or I could even dare to add
this to the generic IOMMU code. What is your opinion?

> All I see here is
> abstract code with no hardware-specific details that'll have to be
> copy-pasted into other IOMMU drivers (e.g. SMMUv3), which strongly
> suggests it's the wrong place to do it.
> 
> As I understand the problem, VFIO has a generic "configure an IOMMU to
> point at an MSI doorbell" step to do in the process of attaching a
> device, which hasn't needed implementing yet due to VT-d's
> IOMMU_CAP_I_AM_ALSO_ACTUALLY_THE_MSI_CONTROLLER_IN_DISGUISE flag, which
> most of us have managed to misinterpret so far.

Maybe I misunderstood the above comment, but I would say it is the
contrary: i.e. up to now, VFIO did not need to care about that issue
since MSI addresses were not mapped in the IOMMU on x86. Now they need
to be, so we need to extend an existing API, be it the VFIO external
user API or the IOMMU API. But please correct me if I misunderstood you.

Also, I found it more practical to have an all-in-one API doing both
the IOVA allocation and the binding (dma_map_single-like). The user of
the API then does not have to care about the IOMMU page size.
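That all-in-one shape — allocate an IOVA and bind it to a physical page in one call, hiding the page granularity from the caller — can be sketched as follows (names hypothetical, loosely modelled on the iommu_get_single_reserved call from the series):

```c
#include <stdint.h>
#include <assert.h>

#define IOMMU_PAGE_SIZE 4096u
#define IOMMU_PAGE_MASK (~(uint64_t)(IOMMU_PAGE_SIZE - 1))

static uint64_t next_free_iova = 0x10000000;   /* toy allocator state */

/*
 * Hypothetical all-in-one helper: the caller passes any physical address
 * (e.g. an MSI doorbell) and gets back an IOVA with the same page offset.
 * The page-granular alignment is handled here, not by the caller.
 */
static uint64_t iommu_get_single_reserved_sketch(uint64_t phys_addr)
{
	uint64_t offset = phys_addr & ~IOMMU_PAGE_MASK; /* offset in page */
	uint64_t iova_page = next_free_iova;

	next_free_iova += IOMMU_PAGE_SIZE;
	/*
	 * A real implementation would also do the binding here, i.e.
	 * map iova_page to (phys_addr & IOMMU_PAGE_MASK) for one page.
	 */
	return iova_page | offset;
}
```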

Thanks for your time and looking forward to reading from you!

Best Regards

Eric

> AFAICS all the IOMMU
> driver should need to know about this is an iommu_map() call (which will
> want a slight extension[1] to make things behave properly). We should be
> fixing the abstraction to be less x86-centric, not hacking up all the
> ARM drivers to emulate x86 hardware behaviour in software.
> 
> Robin.
> 
> [1]:http://article.gmane.org/gmane.linux.kernel.cross-arch/30833
> 
>> Signed-off-by: Eric Auger 
>>
>> ---
>> v2 -> v3:
>> - select IOMMU_IOVA when ARM_SMMU or ARM_SMMU_V3 is set
>>
>> v1 -> v2:
>> - formerly implemented in vfio_iommu_type1
>> ---
>>   drivers/iommu/Kconfig|  2 ++
>>   drivers/iommu/arm-smmu.c | 87
>> +++-
>>   2 files changed, 74 insertions(+), 15 deletions(-)
>>
>> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
>> index a1e75cb..1106528 100644
>> --- a/drivers/iommu/Kconfig
>> +++ b/drivers/iommu/Kconfig
>> @@ -289,6 +289,7 @@ config ARM_SMMU
>>   bool "ARM Ltd. System MMU (SMMU) Support"
>>   depends on (ARM64 || ARM) && MMU
>>   select IOMMU_API
>> +select IOMMU_IOVA
>>   select IOMMU_IO_PGTABLE_LPAE
>>   select ARM_DMA_USE_IOMMU if ARM
>>   help
>> @@ -302,6 +303,7 @@ config ARM_SMMU_V3
>>   bool "ARM Ltd. System MMU Version 3 (SMMUv3) Support"
>>   depends on ARM64 && PCI
>>   select IOMMU_API
>> +select IOMMU_IOVA
>>   select 

[PATCH v2 10/13] iommu: exynos: update device tree documentation

2016-02-18 Thread Marek Szyprowski
The Exynos SYSMMU bindings documentation was merged before the generic
IOMMU bindings were introduced. This patch updates the documentation to
match the current state.

Signed-off-by: Marek Szyprowski 
---
 .../devicetree/bindings/iommu/samsung,sysmmu.txt  | 19 ---
 1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/Documentation/devicetree/bindings/iommu/samsung,sysmmu.txt 
b/Documentation/devicetree/bindings/iommu/samsung,sysmmu.txt
index bc620fe32a70..f61ca25ca136 100644
--- a/Documentation/devicetree/bindings/iommu/samsung,sysmmu.txt
+++ b/Documentation/devicetree/bindings/iommu/samsung,sysmmu.txt
@@ -23,28 +23,23 @@ MMUs.
   for window 1, 2 and 3.
 * M2M Scalers and G2D in Exynos5420 has one System MMU on the read channel and
   the other System MMU on the write channel.
-The drivers must consider how to handle those System MMUs. One of the idea is
-to implement child devices or sub-devices which are the client devices of the
-System MMU.
 
-Note:
-The current DT binding for the Exynos System MMU is incomplete.
-The following properties can be removed or changed, if found incompatible with
-the "Generic IOMMU Binding" support for attaching devices to the IOMMU.
+For information on assigning System MMU controller to its peripheral devices,
+see generic IOMMU bindings.
 
 Required properties:
 - compatible: Should be "samsung,exynos-sysmmu"
 - reg: A tuple of base address and size of System MMU registers.
+- #iommu-cells: Should be <0>.
 - interrupt-parent: The phandle of the interrupt controller of System MMU
 - interrupts: An interrupt specifier for interrupt signal of System MMU,
  according to the format defined by a particular interrupt
  controller.
 - clock-names: Should be "sysmmu" if the System MMU is needed to gate its 
clock.
   Optional "master" if the clock to the System MMU is gated by
-  another gate clock other than "sysmmu".
-  Exynos4 SoCs, there needs no "master" clock.
-  Exynos5 SoCs, some System MMUs must have "master" clocks.
-- clocks: Required if the System MMU is needed to gate its clock.
+  another gate clock other than "sysmmu" (usually main gate clock
+  of peripheral device this SYSMMU belongs to).
+- clocks: Phandles for respective clocks described by clock-names.
 - power-domains: Required if the System MMU is needed to gate its power.
  Please refer to the following document:
  Documentation/devicetree/bindings/power/pd-samsung.txt
@@ -57,6 +52,7 @@ Examples:
power-domains = <_gsc>;
clocks = < CLK_GSCL0>;
clock-names = "gscl";
+   iommus = <&sysmmu_gsc0>;
};
 
sysmmu_gsc0: sysmmu@13E8 {
@@ -67,4 +63,5 @@ Examples:
clock-names = "sysmmu", "master";
clocks = < CLK_SMMU_GSCL0>, < CLK_GSCL0>;
power-domains = <_gsc>;
+   #iommu-cells = <0>;
};
-- 
1.9.2



[PATCH v2 02/13] iommu: exynos: add support for IOMMU_DOMAIN_DMA domain type

2016-02-18 Thread Marek Szyprowski
This patch adds support for the DMA domain type. Such domains have a
DMA cookie prepared and can be used by the generic DMA-IOMMU glue layer.

Signed-off-by: Marek Szyprowski 
---
 drivers/iommu/exynos-iommu.c | 19 +++
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c
index 4fc079073c86..595e0da55db4 100644
--- a/drivers/iommu/exynos-iommu.c
+++ b/drivers/iommu/exynos-iommu.c
@@ -25,9 +25,9 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
-#include 
 #include 
 
 typedef u32 sysmmu_iova_t;
@@ -662,16 +662,21 @@ static struct iommu_domain 
*exynos_iommu_domain_alloc(unsigned type)
struct exynos_iommu_domain *domain;
int i;
 
-   if (type != IOMMU_DOMAIN_UNMANAGED)
-   return NULL;
 
domain = kzalloc(sizeof(*domain), GFP_KERNEL);
if (!domain)
return NULL;
 
+   if (type == IOMMU_DOMAIN_DMA) {
+   if (iommu_get_dma_cookie(&domain->domain) != 0)
+   goto err_pgtable;
+   } else if (type != IOMMU_DOMAIN_UNMANAGED) {
+   goto err_pgtable;
+   }
+
domain->pgtable = (sysmmu_pte_t *)__get_free_pages(GFP_KERNEL, 2);
if (!domain->pgtable)
-   goto err_pgtable;
+   goto err_dma_cookie;
 
domain->lv2entcnt = (short *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 
1);
if (!domain->lv2entcnt)
@@ -703,6 +708,9 @@ static struct iommu_domain 
*exynos_iommu_domain_alloc(unsigned type)
 
 err_counter:
free_pages((unsigned long)domain->pgtable, 2);
+err_dma_cookie:
+   if (type == IOMMU_DOMAIN_DMA)
+   iommu_put_dma_cookie(&domain->domain);
 err_pgtable:
kfree(domain);
return NULL;
@@ -727,6 +735,9 @@ static void exynos_iommu_domain_free(struct iommu_domain 
*iommu_domain)
 
spin_unlock_irqrestore(>lock, flags);
 
+   if (iommu_domain->type == IOMMU_DOMAIN_DMA)
+   iommu_put_dma_cookie(iommu_domain);
+
for (i = 0; i < NUM_LV1ENTRIES; i++)
if (lv1ent_page(domain->pgtable + i))
kmem_cache_free(lv2table_kmem_cache,
-- 
1.9.2



[PATCH v2 11/13] iommu: exynos: add support for v5 SYSMMU

2016-02-18 Thread Marek Szyprowski
This patch adds support for v5 of the SYSMMU controller, found in Samsung
Exynos 5433 SoCs. The main differences in v5 are support for a 36-bit
physical address space and some changes in register layout and core clock
handling. This patch also adds support for the ARM64 architecture, which
is used by Exynos 5433 SoCs.

Signed-off-by: Marek Szyprowski 
---
 .../devicetree/bindings/iommu/samsung,sysmmu.txt   |   5 +-
 drivers/iommu/Kconfig  |   2 +-
 drivers/iommu/exynos-iommu.c   | 187 +++--
 3 files changed, 143 insertions(+), 51 deletions(-)

diff --git a/Documentation/devicetree/bindings/iommu/samsung,sysmmu.txt 
b/Documentation/devicetree/bindings/iommu/samsung,sysmmu.txt
index f61ca25ca136..85f068805dd8 100644
--- a/Documentation/devicetree/bindings/iommu/samsung,sysmmu.txt
+++ b/Documentation/devicetree/bindings/iommu/samsung,sysmmu.txt
@@ -35,9 +35,10 @@ Required properties:
 - interrupts: An interrupt specifier for interrupt signal of System MMU,
  according to the format defined by a particular interrupt
  controller.
-- clock-names: Should be "sysmmu" if the System MMU is needed to gate its 
clock.
+- clock-names: Should be "sysmmu" or a pair of "aclk" and "pclk" to gate
+  SYSMMU core clocks.
   Optional "master" if the clock to the System MMU is gated by
-  another gate clock other than "sysmmu" (usually main gate clock
+  another gate clock other core  (usually main gate clock
   of peripheral device this SYSMMU belongs to).
 - clocks: Phandles for respective clocks described by clock-names.
 - power-domains: Required if the System MMU is needed to gate its power.
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index a1e75cba18e0..1674de1cfed0 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -243,7 +243,7 @@ config TEGRA_IOMMU_SMMU
 
 config EXYNOS_IOMMU
bool "Exynos IOMMU Support"
-   depends on ARCH_EXYNOS && ARM && MMU
+   depends on ARCH_EXYNOS && MMU
select IOMMU_API
select ARM_DMA_USE_IOMMU
help
diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c
index 8e289e2a05fb..9c8ce951158d 100644
--- a/drivers/iommu/exynos-iommu.c
+++ b/drivers/iommu/exynos-iommu.c
@@ -1,6 +1,5 @@
-/* linux/drivers/iommu/exynos_iommu.c
- *
- * Copyright (c) 2011 Samsung Electronics Co., Ltd.
+/*
+ * Copyright (c) 2011,2016 Samsung Electronics Co., Ltd.
  * http://www.samsung.com
  *
  * This program is free software; you can redistribute it and/or modify
@@ -55,17 +54,25 @@ typedef u32 sysmmu_pte_t;
 #define lv2ent_small(pent) ((*(pent) & 2) == 2)
 #define lv2ent_large(pent) ((*(pent) & 3) == 1)
 
-static u32 sysmmu_page_offset(sysmmu_iova_t iova, u32 size)
-{
-   return iova & (size - 1);
-}
-
-#define section_phys(sent) (*(sent) & SECT_MASK)
-#define section_offs(iova) sysmmu_page_offset((iova), SECT_SIZE)
-#define lpage_phys(pent) (*(pent) & LPAGE_MASK)
-#define lpage_offs(iova) sysmmu_page_offset((iova), LPAGE_SIZE)
-#define spage_phys(pent) (*(pent) & SPAGE_MASK)
-#define spage_offs(iova) sysmmu_page_offset((iova), SPAGE_SIZE)
+/*
+ * v1.x - v3.x SYSMMU supports 32bit physical and 32bit virtual address spaces
+ * v5.0 introduced support for 36bit physical address space by shifting
+ * all page entry values by 4 bits.
+ * All SYSMMU controllers in the system support the address spaces of the same
+ * size, so PG_ENT_SHIFT can be initialized on first SYSMMU probe to proper
+ * value (0 or 4).
+ */
+static short PG_ENT_SHIFT = -1;
+#define SYSMMU_PG_ENT_SHIFT 0
+#define SYSMMU_V5_PG_ENT_SHIFT 4
+
+#define sect_to_phys(ent) (((phys_addr_t) ent) << PG_ENT_SHIFT)
+#define section_phys(sent) (sect_to_phys(*(sent)) & SECT_MASK)
+#define section_offs(iova) (iova & (SECT_SIZE - 1))
+#define lpage_phys(pent) (sect_to_phys(*(pent)) & LPAGE_MASK)
+#define lpage_offs(iova) (iova & (LPAGE_SIZE - 1))
+#define spage_phys(pent) (sect_to_phys(*(pent)) & SPAGE_MASK)
+#define spage_offs(iova) (iova & (SPAGE_SIZE - 1))
 
 #define NUM_LV1ENTRIES 4096
 #define NUM_LV2ENTRIES (SECT_SIZE / SPAGE_SIZE)
@@ -84,13 +91,12 @@ static u32 lv2ent_offset(sysmmu_iova_t iova)
 #define LV2TABLE_SIZE (NUM_LV2ENTRIES * sizeof(sysmmu_pte_t))
 
 #define SPAGES_PER_LPAGE (LPAGE_SIZE / SPAGE_SIZE)
+#define lv2table_base(sent) (sect_to_phys(*(sent) & 0xFFC0))
 
-#define lv2table_base(sent) (*(sent) & 0xFC00)
-
-#define mk_lv1ent_sect(pa) ((pa) | 2)
-#define mk_lv1ent_page(pa) ((pa) | 1)
-#define mk_lv2ent_lpage(pa) ((pa) | 1)
-#define mk_lv2ent_spage(pa) ((pa) | 2)
+#define mk_lv1ent_sect(pa) ((pa >> PG_ENT_SHIFT) | 2)
+#define mk_lv1ent_page(pa) ((pa >> PG_ENT_SHIFT) | 1)
+#define mk_lv2ent_lpage(pa) ((pa >> PG_ENT_SHIFT) | 1)
+#define mk_lv2ent_spage(pa) ((pa >> PG_ENT_SHIFT) | 2)
 
 #define CTRL_ENABLE0x5
 #define CTRL_BLOCK 0x7
@@ -98,14 +104,23 @@ static u32 

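The 4-bit entry shift described in the comment above can be checked in isolation (a sketch assuming the usual 1 MiB SECT_SIZE; not the driver code):

```c
#include <stdint.h>
#include <assert.h>

#define PG_ENT_SHIFT 4                      /* SYSMMU_V5_PG_ENT_SHIFT */
#define SECT_SIZE    (1u << 20)             /* 1 MiB section (assumed) */
#define SECT_MASK    (~(uint64_t)(SECT_SIZE - 1))

/* Store a 36-bit section base in a 32-bit entry by shifting right 4 bits. */
static uint32_t mk_lv1ent_sect(uint64_t pa)
{
	return (uint32_t)(pa >> PG_ENT_SHIFT) | 2;   /* type bits: section */
}

/* Recover the physical section base: shift back left and mask. */
static uint64_t section_phys(uint32_t sent)
{
	return (((uint64_t)sent) << PG_ENT_SHIFT) & SECT_MASK;
}
```

With the shift, a section base above 4 GiB (bit 35 set) round-trips through a 32-bit entry.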
[PATCH v2 13/13] iommu: exynos: support multiple attach_device calls

2016-02-18 Thread Marek Szyprowski
The IOMMU core calls the attach_device callback without detaching the
device from its previous domain. This patch adds support for such
unbalanced calls.

Signed-off-by: Marek Szyprowski 
---
 drivers/iommu/exynos-iommu.c | 72 
 1 file changed, 40 insertions(+), 32 deletions(-)

diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c
index 9c8ce951158d..b0665042bf29 100644
--- a/drivers/iommu/exynos-iommu.c
+++ b/drivers/iommu/exynos-iommu.c
@@ -201,6 +201,7 @@ static const struct sysmmu_fault_info sysmmu_v5_faults[] = {
 */
 struct exynos_iommu_owner {
struct list_head controllers;   /* list of sysmmu_drvdata.owner_node */
+   struct iommu_domain *domain;/* domain this device is attached */
 };
 
 /*
@@ -825,6 +826,41 @@ static void exynos_iommu_domain_free(struct iommu_domain 
*iommu_domain)
kfree(domain);
 }
 
+static void exynos_iommu_detach_device(struct iommu_domain *iommu_domain,
+   struct device *dev)
+{
+   struct exynos_iommu_owner *owner = dev->archdata.iommu;
+   struct exynos_iommu_domain *domain = to_exynos_domain(iommu_domain);
+   phys_addr_t pagetable = virt_to_phys(domain->pgtable);
+   struct sysmmu_drvdata *data, *next;
+   unsigned long flags;
+   bool found = false;
+
+   if (!has_sysmmu(dev) || owner->domain != iommu_domain)
+   return;
+
+   spin_lock_irqsave(&domain->lock, flags);
+   list_for_each_entry_safe(data, next, &domain->clients, domain_node) {
+   if (data->master == dev) {
+   if (__sysmmu_disable(data)) {
+   data->master = NULL;
+   list_del_init(>domain_node);
+   }
+   pm_runtime_put(data->sysmmu);
+   found = true;
+   }
+   }
+   spin_unlock_irqrestore(&domain->lock, flags);
+
+   owner->domain = NULL;
+
+   if (found)
+   dev_dbg(dev, "%s: Detached IOMMU with pgtable %pa\n",
+   __func__, &pagetable);
+   else
+   dev_err(dev, "%s: No IOMMU is attached\n", __func__);
+}
+
 static int exynos_iommu_attach_device(struct iommu_domain *iommu_domain,
   struct device *dev)
 {
@@ -838,6 +874,9 @@ static int exynos_iommu_attach_device(struct iommu_domain 
*iommu_domain,
if (!has_sysmmu(dev))
return -ENODEV;
 
+   if (owner->domain)
+   exynos_iommu_detach_device(owner->domain, dev);
+
list_for_each_entry(data, &owner->controllers, owner_node) {
pm_runtime_get_sync(data->sysmmu);
ret = __sysmmu_enable(data, pagetable, domain);
@@ -856,44 +895,13 @@ static int exynos_iommu_attach_device(struct iommu_domain 
*iommu_domain,
return ret;
}
 
+   owner->domain = iommu_domain;
dev_dbg(dev, "%s: Attached IOMMU with pgtable %pa %s\n",
__func__, &pagetable, (ret == 0) ? "" : ", again");
 
return ret;
 }
 
-static void exynos_iommu_detach_device(struct iommu_domain *iommu_domain,
-   struct device *dev)
-{
-   struct exynos_iommu_domain *domain = to_exynos_domain(iommu_domain);
-   phys_addr_t pagetable = virt_to_phys(domain->pgtable);
-   struct sysmmu_drvdata *data, *next;
-   unsigned long flags;
-   bool found = false;
-
-   if (!has_sysmmu(dev))
-   return;
-
-   spin_lock_irqsave(&domain->lock, flags);
-   list_for_each_entry_safe(data, next, &domain->clients, domain_node) {
-   if (data->master == dev) {
-   if (__sysmmu_disable(data)) {
-   data->master = NULL;
-   list_del_init(>domain_node);
-   }
-   pm_runtime_put(data->sysmmu);
-   found = true;
-   }
-   }
-   spin_unlock_irqrestore(&domain->lock, flags);
-
-   if (found)
-   dev_dbg(dev, "%s: Detached IOMMU with pgtable %pa\n",
-   __func__, &pagetable);
-   else
-   dev_err(dev, "%s: No IOMMU is attached\n", __func__);
-}
-
 static sysmmu_pte_t *alloc_lv2entry(struct exynos_iommu_domain *domain,
sysmmu_pte_t *sent, sysmmu_iova_t iova, short *pgcounter)
 {
-- 
1.9.2
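The owner->domain bookkeeping in this patch boils down to the following pattern (a stripped-down sketch with toy structs, not the driver code):

```c
#include <stddef.h>
#include <assert.h>

/* Toy model of the attach/detach bookkeeping from the patch above. */
struct domain { int attached_count; };
struct owner  { struct domain *domain; };

static void detach(struct owner *o, struct domain *d)
{
	if (o->domain != d)
		return;            /* ignore detach from a foreign domain */
	d->attached_count--;
	o->domain = NULL;
}

/*
 * The IOMMU core may call attach without a prior detach, so attach must
 * tear down any previous binding itself to keep the counts balanced.
 */
static void attach(struct owner *o, struct domain *d)
{
	if (o->domain)
		detach(o, o->domain);
	d->attached_count++;
	o->domain = d;
}
```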



[PATCH v2 06/13] iommu: exynos: refactor fault handling code

2016-02-18 Thread Marek Szyprowski
This patch provides a new implementation of the page fault handling code.
The new implementation is ready for future extensions. No functional
changes have been made.

Signed-off-by: Marek Szyprowski 
---
 drivers/iommu/exynos-iommu.c | 109 ---
 1 file changed, 41 insertions(+), 68 deletions(-)

diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c
index 4275222cead3..3a577a473f3c 100644
--- a/drivers/iommu/exynos-iommu.c
+++ b/drivers/iommu/exynos-iommu.c
@@ -148,40 +148,25 @@ static sysmmu_pte_t *page_entry(sysmmu_pte_t *sent, 
sysmmu_iova_t iova)
lv2table_base(sent)) + lv2ent_offset(iova);
 }
 
-enum exynos_sysmmu_inttype {
-   SYSMMU_PAGEFAULT,
-   SYSMMU_AR_MULTIHIT,
-   SYSMMU_AW_MULTIHIT,
-   SYSMMU_BUSERROR,
-   SYSMMU_AR_SECURITY,
-   SYSMMU_AR_ACCESS,
-   SYSMMU_AW_SECURITY,
-   SYSMMU_AW_PROTECTION, /* 7 */
-   SYSMMU_FAULT_UNKNOWN,
-   SYSMMU_FAULTS_NUM
-};
-
-static unsigned short fault_reg_offset[SYSMMU_FAULTS_NUM] = {
-   REG_PAGE_FAULT_ADDR,
-   REG_AR_FAULT_ADDR,
-   REG_AW_FAULT_ADDR,
-   REG_DEFAULT_SLAVE_ADDR,
-   REG_AR_FAULT_ADDR,
-   REG_AR_FAULT_ADDR,
-   REG_AW_FAULT_ADDR,
-   REG_AW_FAULT_ADDR
+/*
+ * IOMMU fault information register
+ */
+struct sysmmu_fault_info {
+   unsigned int bit;   /* bit number in STATUS register */
+   unsigned short addr_reg; /* register to read VA fault address */
+   const char *name;   /* human readable fault name */
+   unsigned int type;  /* fault type for report_iommu_fault */
 };
 
-static char *sysmmu_fault_name[SYSMMU_FAULTS_NUM] = {
-   "PAGE FAULT",
-   "AR MULTI-HIT FAULT",
-   "AW MULTI-HIT FAULT",
-   "BUS ERROR",
-   "AR SECURITY PROTECTION FAULT",
-   "AR ACCESS PROTECTION FAULT",
-   "AW SECURITY PROTECTION FAULT",
-   "AW ACCESS PROTECTION FAULT",
-   "UNKNOWN FAULT"
+static const struct sysmmu_fault_info sysmmu_faults[] = {
+   { 0, REG_PAGE_FAULT_ADDR, "PAGE", IOMMU_FAULT_READ },
+   { 1, REG_AR_FAULT_ADDR, "AR MULTI-HIT", IOMMU_FAULT_READ },
+   { 2, REG_AW_FAULT_ADDR, "AW MULTI-HIT", IOMMU_FAULT_WRITE },
+   { 3, REG_DEFAULT_SLAVE_ADDR, "BUS ERROR", IOMMU_FAULT_READ },
+   { 4, REG_AR_FAULT_ADDR, "AR SECURITY PROTECTION", IOMMU_FAULT_READ },
+   { 5, REG_AR_FAULT_ADDR, "AR ACCESS PROTECTION", IOMMU_FAULT_READ },
+   { 6, REG_AW_FAULT_ADDR, "AW SECURITY PROTECTION", IOMMU_FAULT_WRITE },
+   { 7, REG_AW_FAULT_ADDR, "AW ACCESS PROTECTION", IOMMU_FAULT_WRITE },
 };
 
 /*
@@ -299,24 +284,19 @@ static void __sysmmu_set_ptbase(struct sysmmu_drvdata 
*data, phys_addr_t pgd)
__sysmmu_tlb_invalidate(data);
 }
 
-static void show_fault_information(const char *name,
-   enum exynos_sysmmu_inttype itype,
-   phys_addr_t pgtable_base, sysmmu_iova_t fault_addr)
+static void show_fault_information(struct sysmmu_drvdata *data,
+  const struct sysmmu_fault_info *finfo,
+  sysmmu_iova_t fault_addr)
 {
sysmmu_pte_t *ent;
 
-   if ((itype >= SYSMMU_FAULTS_NUM) || (itype < SYSMMU_PAGEFAULT))
-   itype = SYSMMU_FAULT_UNKNOWN;
-
-   pr_err("%s occurred at %#x by %s(Page table base: %pa)\n",
-   sysmmu_fault_name[itype], fault_addr, &pgtable_base);
-
-   ent = section_entry(phys_to_virt(pgtable_base), fault_addr);
-   pr_err("\tLv1 entry: %#x\n", *ent);
-
+   dev_err(data->sysmmu, "%s FAULT occurred at %#x (page table base: 
%pa)\n",
+   finfo->name, fault_addr, &data->pgtable);
+   ent = section_entry(phys_to_virt(data->pgtable), fault_addr);
+   dev_err(data->sysmmu, "\tLv1 entry: %#x\n", *ent);
if (lv1ent_page(ent)) {
ent = page_entry(ent, fault_addr);
-   pr_err("\t Lv2 entry: %#x\n", *ent);
+   dev_err(data->sysmmu, "\t Lv2 entry: %#x\n", *ent);
}
 }
 
@@ -324,8 +304,10 @@ static irqreturn_t exynos_sysmmu_irq(int irq, void *dev_id)
 {
/* SYSMMU is in blocked state when interrupt occurred. */
struct sysmmu_drvdata *data = dev_id;
-   enum exynos_sysmmu_inttype itype;
-   sysmmu_iova_t addr = -1;
+   const struct sysmmu_fault_info *finfo = sysmmu_faults;
+   int i, n = ARRAY_SIZE(sysmmu_faults);
+   unsigned int itype;
+   sysmmu_iova_t fault_addr = -1;
int ret = -ENOSYS;
 
WARN_ON(!is_sysmmu_active(data));
@@ -334,29 +316,20 @@ static irqreturn_t exynos_sysmmu_irq(int irq, void 
*dev_id)
 
clk_enable(data->clk_master);
 
-   itype = (enum exynos_sysmmu_inttype)
-   __ffs(__raw_readl(data->sfrbase + REG_INT_STATUS));
-   if (WARN_ON(!((itype >= 0) && (itype < SYSMMU_FAULT_UNKNOWN
-   itype = SYSMMU_FAULT_UNKNOWN;
-   else
-   addr = 

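The table-driven decode introduced by this patch amounts to: find the lowest set bit in the STATUS register, then look it up in the fault table. A freestanding sketch (registers simulated as plain integers):

```c
#include <stddef.h>
#include <string.h>
#include <assert.h>

/* Mirrors the sysmmu_fault_info table idea from the patch. */
struct fault_info {
	unsigned int bit;     /* bit number in the STATUS register */
	const char *name;     /* human readable fault name */
};

static const struct fault_info faults[] = {
	{ 0, "PAGE" },
	{ 1, "AR MULTI-HIT" },
	{ 2, "AW MULTI-HIT" },
	{ 3, "BUS ERROR" },
};

/* Decode: lowest pending bit wins; unknown bits return NULL. */
static const struct fault_info *decode_fault(unsigned int status)
{
	unsigned int itype;
	size_t i;

	if (!status)
		return NULL;
	itype = __builtin_ctz(status);   /* like the kernel's __ffs() */
	for (i = 0; i < sizeof(faults) / sizeof(faults[0]); i++)
		if (faults[i].bit == itype)
			return &faults[i];
	return NULL;
}
```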
[PATCH v2 09/13] iommu: exynos: add support for SYSMMU controller with bogus version reg

2016-02-18 Thread Marek Szyprowski
SYSMMU on some SoCs reports bogus values in the VERSION register. Force
the hardware version to 1.0 for such controllers. This patch also moves
reading the version register to the driver's probe() function.

Signed-off-by: Marek Szyprowski 
---
 drivers/iommu/exynos-iommu.c | 24 +++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c
index e42a76cc9674..8e289e2a05fb 100644
--- a/drivers/iommu/exynos-iommu.c
+++ b/drivers/iommu/exynos-iommu.c
@@ -284,6 +284,28 @@ static void __sysmmu_set_ptbase(struct sysmmu_drvdata 
*data, phys_addr_t pgd)
__sysmmu_tlb_invalidate(data);
 }
 
+static void __sysmmu_get_version(struct sysmmu_drvdata *data)
+{
+   u32 ver;
+
+   clk_enable(data->clk_master);
+   clk_enable(data->clk);
+
+   ver = __raw_readl(data->sfrbase + REG_MMU_VERSION);
+
+   /* controllers on some SoCs don't report proper version */
+   if (ver == 0x8001u)
+   data->version = MAKE_MMU_VER(1, 0);
+   else
+   data->version = MMU_RAW_VER(ver);
+
+   dev_dbg(data->sysmmu, "hardware version: %d.%d\n",
+   MMU_MAJ_VER(data->version), MMU_MIN_VER(data->version));
+
+   clk_disable(data->clk);
+   clk_disable(data->clk_master);
+}
+
 static void show_fault_information(struct sysmmu_drvdata *data,
   const struct sysmmu_fault_info *finfo,
   sysmmu_iova_t fault_addr)
@@ -385,7 +407,6 @@ static void __sysmmu_init_config(struct sysmmu_drvdata 
*data)
 {
unsigned int cfg;
 
-   data->version = MMU_RAW_VER(__raw_readl(data->sfrbase + REG_MMU_VERSION));
if (data->version <= MAKE_MMU_VER(3, 1))
cfg = CFG_LRU | CFG_QOS(15);
else if (data->version <= MAKE_MMU_VER(3, 2))
@@ -551,6 +572,7 @@ static int __init exynos_sysmmu_probe(struct 
platform_device *pdev)
 
platform_set_drvdata(pdev, data);
 
+   __sysmmu_get_version(data);
pm_runtime_enable(dev);
 
return 0;
-- 
1.9.2

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 04/13] iommu: exynos: simplify master clock operations

2016-02-18 Thread Marek Szyprowski
All clock API functions can be called on a NULL clock, so simplify the
code by dropping the checks for master clock presence.

Signed-off-by: Marek Szyprowski 
---
 drivers/iommu/exynos-iommu.c | 32 
 1 file changed, 12 insertions(+), 20 deletions(-)

diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c
index 8c8a7f7968d1..bf6b826b1d8b 100644
--- a/drivers/iommu/exynos-iommu.c
+++ b/drivers/iommu/exynos-iommu.c
@@ -333,8 +333,7 @@ static irqreturn_t exynos_sysmmu_irq(int irq, void *dev_id)
 
spin_lock(&data->lock);
 
-   if (!IS_ERR(data->clk_master))
-   clk_enable(data->clk_master);
+   clk_enable(data->clk_master);
 
itype = (enum exynos_sysmmu_inttype)
__ffs(__raw_readl(data->sfrbase + REG_INT_STATUS));
@@ -366,8 +365,7 @@ static irqreturn_t exynos_sysmmu_irq(int irq, void *dev_id)
 
sysmmu_unblock(data->sfrbase);
 
-   if (!IS_ERR(data->clk_master))
-   clk_disable(data->clk_master);
+   clk_disable(data->clk_master);
 
spin_unlock(&data->lock);
 
@@ -376,15 +374,13 @@ static irqreturn_t exynos_sysmmu_irq(int irq, void 
*dev_id)
 
 static void __sysmmu_disable_nocount(struct sysmmu_drvdata *data)
 {
-   if (!IS_ERR(data->clk_master))
-   clk_enable(data->clk_master);
+   clk_enable(data->clk_master);
 
__raw_writel(CTRL_DISABLE, data->sfrbase + REG_MMU_CTRL);
__raw_writel(0, data->sfrbase + REG_MMU_CFG);
 
clk_disable(data->clk);
-   if (!IS_ERR(data->clk_master))
-   clk_disable(data->clk_master);
+   clk_disable(data->clk_master);
 }
 
 static bool __sysmmu_disable(struct sysmmu_drvdata *data)
@@ -437,8 +433,7 @@ static void __sysmmu_init_config(struct sysmmu_drvdata 
*data)
 
 static void __sysmmu_enable_nocount(struct sysmmu_drvdata *data)
 {
-   if (!IS_ERR(data->clk_master))
-   clk_enable(data->clk_master);
+   clk_enable(data->clk_master);
clk_enable(data->clk);
 
__raw_writel(CTRL_BLOCK, data->sfrbase + REG_MMU_CTRL);
@@ -449,8 +444,7 @@ static void __sysmmu_enable_nocount(struct sysmmu_drvdata 
*data)
 
__raw_writel(CTRL_ENABLE, data->sfrbase + REG_MMU_CTRL);
 
-   if (!IS_ERR(data->clk_master))
-   clk_disable(data->clk_master);
+   clk_disable(data->clk_master);
 }
 
 static int __sysmmu_enable(struct sysmmu_drvdata *data, phys_addr_t pgtable,
@@ -493,16 +487,14 @@ static void sysmmu_tlb_invalidate_flpdcache(struct 
sysmmu_drvdata *data,
 {
unsigned long flags;
 
-   if (!IS_ERR(data->clk_master))
-   clk_enable(data->clk_master);
+   clk_enable(data->clk_master);
 
spin_lock_irqsave(&data->lock, flags);
if (is_sysmmu_active(data))
__sysmmu_tlb_invalidate_flpdcache(data, iova);
spin_unlock_irqrestore(&data->lock, flags);
 
-   if (!IS_ERR(data->clk_master))
-   clk_disable(data->clk_master);
+   clk_disable(data->clk_master);
 }
 
 static void sysmmu_tlb_invalidate_entry(struct sysmmu_drvdata *data,
@@ -514,8 +506,7 @@ static void sysmmu_tlb_invalidate_entry(struct 
sysmmu_drvdata *data,
if (is_sysmmu_active(data)) {
unsigned int num_inv = 1;
 
-   if (!IS_ERR(data->clk_master))
-   clk_enable(data->clk_master);
+   clk_enable(data->clk_master);
 
/*
 * L2TLB invalidation required
@@ -535,8 +526,7 @@ static void sysmmu_tlb_invalidate_entry(struct 
sysmmu_drvdata *data,
data->sfrbase, iova, num_inv);
sysmmu_unblock(data->sfrbase);
}
-   if (!IS_ERR(data->clk_master))
-   clk_disable(data->clk_master);
+   clk_disable(data->clk_master);
} else {
dev_dbg(data->master,
"disabled. Skipping TLB invalidation @ %#x\n", iova);
@@ -593,6 +583,8 @@ static int __init exynos_sysmmu_probe(struct 
platform_device *pdev)
dev_err(dev, "Failed to prepare master's clk\n");
return ret;
}
+   } else {
+   data->clk_master = NULL;
}
 
data->sysmmu = dev;
-- 
1.9.2



[PATCH v2 07/13] iommu: exynos: refactor init config code

2016-02-18 Thread Marek Szyprowski
This patch rewrites the sysmmu_init_config() function to make it easier
to read and understand.

Signed-off-by: Marek Szyprowski 
---
 drivers/iommu/exynos-iommu.c | 25 +
 1 file changed, 9 insertions(+), 16 deletions(-)

diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c
index 3a577a473f3c..15787a177a16 100644
--- a/drivers/iommu/exynos-iommu.c
+++ b/drivers/iommu/exynos-iommu.c
@@ -383,24 +383,17 @@ static bool __sysmmu_disable(struct sysmmu_drvdata *data)
 
 static void __sysmmu_init_config(struct sysmmu_drvdata *data)
 {
-   unsigned int cfg = CFG_LRU | CFG_QOS(15);
-   unsigned int ver;
-
-   ver = MMU_RAW_VER(__raw_readl(data->sfrbase + REG_MMU_VERSION));
-   if (MMU_MAJ_VER(ver) == 3) {
-   if (MMU_MIN_VER(ver) >= 2) {
-   cfg |= CFG_FLPDCACHE;
-   if (MMU_MIN_VER(ver) == 3) {
-   cfg |= CFG_ACGEN;
-   cfg &= ~CFG_LRU;
-   } else {
-   cfg |= CFG_SYSSEL;
-   }
-   }
-   }
+   unsigned int cfg;
+
+   data->version = MMU_RAW_VER(__raw_readl(data->sfrbase + REG_MMU_VERSION));
+   if (data->version <= MAKE_MMU_VER(3, 1))
+   cfg = CFG_LRU | CFG_QOS(15);
+   else if (data->version <= MAKE_MMU_VER(3, 2))
+   cfg = CFG_LRU | CFG_QOS(15) | CFG_FLPDCACHE | CFG_SYSSEL;
+   else
+   cfg = CFG_QOS(15) | CFG_FLPDCACHE | CFG_ACGEN;
 
__raw_writel(cfg, data->sfrbase + REG_MMU_CFG);
-   data->version = ver;
 }
 
 static void __sysmmu_enable_nocount(struct sysmmu_drvdata *data)
-- 
1.9.2



[PATCH v2 08/13] iommu: exynos: unify code for flpd cache invalidation

2016-02-18 Thread Marek Szyprowski
This patch simplifies the code for handling of flpdcache invalidation.

Signed-off-by: Marek Szyprowski 
---
 drivers/iommu/exynos-iommu.c | 13 -
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c
index 15787a177a16..e42a76cc9674 100644
--- a/drivers/iommu/exynos-iommu.c
+++ b/drivers/iommu/exynos-iommu.c
@@ -440,13 +440,6 @@ static int __sysmmu_enable(struct sysmmu_drvdata *data, 
phys_addr_t pgtable,
return ret;
 }
 
-static void __sysmmu_tlb_invalidate_flpdcache(struct sysmmu_drvdata *data,
- sysmmu_iova_t iova)
-{
-   if (data->version == MAKE_MMU_VER(3, 3))
-   __raw_writel(iova | 0x1, data->sfrbase + REG_MMU_FLUSH_ENTRY);
-}
-
 static void sysmmu_tlb_invalidate_flpdcache(struct sysmmu_drvdata *data,
sysmmu_iova_t iova)
 {
@@ -455,8 +448,10 @@ static void sysmmu_tlb_invalidate_flpdcache(struct 
sysmmu_drvdata *data,
clk_enable(data->clk_master);
 
spin_lock_irqsave(&data->lock, flags);
-   if (is_sysmmu_active(data))
-   __sysmmu_tlb_invalidate_flpdcache(data, iova);
+   if (is_sysmmu_active(data)) {
+   if (data->version >= MAKE_MMU_VER(3, 3))
+   __sysmmu_tlb_invalidate_entry(data, iova, 1);
+   }
spin_unlock_irqrestore(&data->lock, flags);
 
clk_disable(data->clk_master);
-- 
1.9.2



[PATCH v2 01/13] iommu: exynos: rework iommu group initialization

2016-02-18 Thread Marek Szyprowski
This patch replaces custom code in the add_device implementation with an
iommu_group_get_for_dev() call and provides the needed callback.

Signed-off-by: Marek Szyprowski 
---
 drivers/iommu/exynos-iommu.c | 27 ---
 1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c
index 97c41b8ab5d9..4fc079073c86 100644
--- a/drivers/iommu/exynos-iommu.c
+++ b/drivers/iommu/exynos-iommu.c
@@ -1114,28 +1114,32 @@ static phys_addr_t exynos_iommu_iova_to_phys(struct 
iommu_domain *iommu_domain,
return phys;
 }
 
+static struct iommu_group *get_device_iommu_group(struct device *dev)
+{
+   struct iommu_group *group;
+
+   group = iommu_group_get(dev);
+   if (!group)
+   group = iommu_group_alloc();
+
+   return group;
+}
+
 static int exynos_iommu_add_device(struct device *dev)
 {
struct iommu_group *group;
-   int ret;
 
if (!has_sysmmu(dev))
return -ENODEV;
 
-   group = iommu_group_get(dev);
+   group = iommu_group_get_for_dev(dev);
 
-   if (!group) {
-   group = iommu_group_alloc();
-   if (IS_ERR(group)) {
-   dev_err(dev, "Failed to allocate IOMMU group\n");
-   return PTR_ERR(group);
-   }
-   }
+   if (IS_ERR(group))
+   return PTR_ERR(group);
 
-   ret = iommu_group_add_device(group, dev);
iommu_group_put(group);
 
-   return ret;
+   return 0;
 }
 
 static void exynos_iommu_remove_device(struct device *dev)
@@ -1182,6 +1186,7 @@ static struct iommu_ops exynos_iommu_ops = {
.unmap = exynos_iommu_unmap,
.map_sg = default_iommu_map_sg,
.iova_to_phys = exynos_iommu_iova_to_phys,
+   .device_group = get_device_iommu_group,
.add_device = exynos_iommu_add_device,
.remove_device = exynos_iommu_remove_device,
.pgsize_bitmap = SECT_SIZE | LPAGE_SIZE | SPAGE_SIZE,
-- 
1.9.2



[PATCH v2 05/13] iommu: exynos: refactor code (no direct register access)

2016-02-18 Thread Marek Szyprowski
This patch changes some internal functions to take the state of the
SYSMMU device instead of only its registers. This will make the code
ready for future extensions.

Signed-off-by: Marek Szyprowski 
---
 drivers/iommu/exynos-iommu.c | 40 +++-
 1 file changed, 19 insertions(+), 21 deletions(-)

diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c
index bf6b826b1d8b..4275222cead3 100644
--- a/drivers/iommu/exynos-iommu.c
+++ b/drivers/iommu/exynos-iommu.c
@@ -254,50 +254,49 @@ static bool is_sysmmu_active(struct sysmmu_drvdata *data)
return data->activations > 0;
 }
 
-static void sysmmu_unblock(void __iomem *sfrbase)
+static void sysmmu_unblock(struct sysmmu_drvdata *data)
 {
-   __raw_writel(CTRL_ENABLE, sfrbase + REG_MMU_CTRL);
+   __raw_writel(CTRL_ENABLE, data->sfrbase + REG_MMU_CTRL);
 }
 
-static bool sysmmu_block(void __iomem *sfrbase)
+static bool sysmmu_block(struct sysmmu_drvdata *data)
 {
int i = 120;
 
-   __raw_writel(CTRL_BLOCK, sfrbase + REG_MMU_CTRL);
-   while ((i > 0) && !(__raw_readl(sfrbase + REG_MMU_STATUS) & 1))
+   __raw_writel(CTRL_BLOCK, data->sfrbase + REG_MMU_CTRL);
+   while ((i > 0) && !(__raw_readl(data->sfrbase + REG_MMU_STATUS) & 1))
--i;
 
-   if (!(__raw_readl(sfrbase + REG_MMU_STATUS) & 1)) {
-   sysmmu_unblock(sfrbase);
+   if (!(__raw_readl(data->sfrbase + REG_MMU_STATUS) & 1)) {
+   sysmmu_unblock(data);
return false;
}
 
return true;
 }
 
-static void __sysmmu_tlb_invalidate(void __iomem *sfrbase)
+static void __sysmmu_tlb_invalidate(struct sysmmu_drvdata *data)
 {
-   __raw_writel(0x1, sfrbase + REG_MMU_FLUSH);
+   __raw_writel(0x1, data->sfrbase + REG_MMU_FLUSH);
 }
 
-static void __sysmmu_tlb_invalidate_entry(void __iomem *sfrbase,
+static void __sysmmu_tlb_invalidate_entry(struct sysmmu_drvdata *data,
sysmmu_iova_t iova, unsigned int num_inv)
 {
unsigned int i;
 
for (i = 0; i < num_inv; i++) {
__raw_writel((iova & SPAGE_MASK) | 1,
-   sfrbase + REG_MMU_FLUSH_ENTRY);
+   data->sfrbase + REG_MMU_FLUSH_ENTRY);
iova += SPAGE_SIZE;
}
 }
 
-static void __sysmmu_set_ptbase(void __iomem *sfrbase,
-  phys_addr_t pgd)
+static void __sysmmu_set_ptbase(struct sysmmu_drvdata *data, phys_addr_t pgd)
 {
-   __raw_writel(pgd, sfrbase + REG_PT_BASE_ADDR);
+   __raw_writel(pgd, data->sfrbase + REG_PT_BASE_ADDR);
 
-   __sysmmu_tlb_invalidate(sfrbase);
+   __sysmmu_tlb_invalidate(data);
 }
 
 static void show_fault_information(const char *name,
@@ -363,7 +362,7 @@ static irqreturn_t exynos_sysmmu_irq(int irq, void *dev_id)
 
__raw_writel(1 << itype, data->sfrbase + REG_INT_CLEAR);
 
-   sysmmu_unblock(data->sfrbase);
+   sysmmu_unblock(data);
 
clk_disable(data->clk_master);
 
@@ -440,7 +439,7 @@ static void __sysmmu_enable_nocount(struct sysmmu_drvdata 
*data)
 
__sysmmu_init_config(data);
 
-   __sysmmu_set_ptbase(data->sfrbase, data->pgtable);
+   __sysmmu_set_ptbase(data, data->pgtable);
 
__raw_writel(CTRL_ENABLE, data->sfrbase + REG_MMU_CTRL);
 
@@ -521,10 +520,9 @@ static void sysmmu_tlb_invalidate_entry(struct 
sysmmu_drvdata *data,
if (MMU_MAJ_VER(data->version) == 2)
num_inv = min_t(unsigned int, size / PAGE_SIZE, 64);
 
-   if (sysmmu_block(data->sfrbase)) {
-   __sysmmu_tlb_invalidate_entry(
-   data->sfrbase, iova, num_inv);
-   sysmmu_unblock(data->sfrbase);
+   if (sysmmu_block(data)) {
+   __sysmmu_tlb_invalidate_entry(data, iova, num_inv);
+   sysmmu_unblock(data);
}
clk_disable(data->clk_master);
} else {
-- 
1.9.2

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 00/13] SYSMMU driver update and support for Exynos 5433

2016-02-18 Thread Marek Szyprowski
Hello,

This patchset updates Exynos SYSMMU (IOMMU) driver to make use of the
new features in the IOMMU core (support for IOMMU_DOMAIN_DMA) and adds
support for SYSMMU v5 controllers, which are available in Samsung Exynos
5433 SoCs. The driver has been also updated to compile and work on ARM64
architecture.

Best regards
Marek Szyprowski
Samsung R&D Institute Poland

Changelog:
v2:
- added support for multiple calls of device_attach (without detach),
  needed for default domain handling in iommu core (patch no 13), more
  information in the following thread:
  https://lists.linaro.org/pipermail/linaro-mm-sig/2016-February/004625.html
- fixed support for SYSMMU controllers with bogus version register value
  (patch no 9)

v1: http://www.spinics.net/lists/arm-kernel/msg483531.html
- initial version

Patch summary:

Marek Szyprowski (13):
  iommu: exynos: rework iommu group initialization
  iommu: exynos: add support for IOMMU_DOMAIN_DMA domain type
  iommu: exynos: remove ARM-specific cache flush interface
  iommu: exynos: simplify master clock operations
  iommu: exynos: refactor code (no direct register access)
  iommu: exynos: refactor fault handling code
  iommu: exynos: refactor init config code
  iommu: exynos: unify code for flpd cache invalidation
  iommu: exynos: add support for SYSMMU controller with bogus version
reg
  iommu: exynos: update device tree documentation
  iommu: exynos: add support for v5 SYSMMU
  iommu: exynos: add Maintainers entry for Exynos SYSMMU driver
  iommu: exynos: support multiple attach_device calls

 .../devicetree/bindings/iommu/samsung,sysmmu.txt   |  22 +-
 MAINTAINERS|   6 +
 drivers/iommu/Kconfig  |   2 +-
 drivers/iommu/exynos-iommu.c   | 598 -
 4 files changed, 372 insertions(+), 256 deletions(-)

-- 
1.9.2



[PATCH v2 03/13] iommu: exynos: remove ARM-specific cache flush interface

2016-02-18 Thread Marek Szyprowski
This patch replaces custom ARM-specific code for performing CPU cache flush
operations with generic code based on DMA-mapping. Domain managing code
is independent of particular SYSMMU device, so the first registered SYSMMU
device is used for DMA-mapping calls. This simplification works fine
because all SYSMMU controllers are in the same address space (where
DMA address equals physical address) and the DMA-mapping calls are done
mainly to flush CPU cache to make changes visible to SYSMMU controllers.

Signed-off-by: Marek Szyprowski 
---
 drivers/iommu/exynos-iommu.c | 74 +---
 1 file changed, 49 insertions(+), 25 deletions(-)

diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c
index 595e0da55db4..8c8a7f7968d1 100644
--- a/drivers/iommu/exynos-iommu.c
+++ b/drivers/iommu/exynos-iommu.c
@@ -27,9 +27,6 @@
 #include 
 #include 
 
-#include 
-#include 
-
 typedef u32 sysmmu_iova_t;
 typedef u32 sysmmu_pte_t;
 
@@ -83,6 +80,7 @@ static u32 lv2ent_offset(sysmmu_iova_t iova)
return (iova >> SPAGE_ORDER) & (NUM_LV2ENTRIES - 1);
 }
 
+#define LV1TABLE_SIZE (NUM_LV1ENTRIES * sizeof(sysmmu_pte_t))
 #define LV2TABLE_SIZE (NUM_LV2ENTRIES * sizeof(sysmmu_pte_t))
 
 #define SPAGES_PER_LPAGE (LPAGE_SIZE / SPAGE_SIZE)
@@ -134,6 +132,7 @@ static u32 lv2ent_offset(sysmmu_iova_t iova)
 
 #define has_sysmmu(dev)(dev->archdata.iommu != NULL)
 
+static struct device *dma_dev;
 static struct kmem_cache *lv2table_kmem_cache;
 static sysmmu_pte_t *zero_lv2_table;
 #define ZERO_LV2LINK mk_lv1ent_page(virt_to_phys(zero_lv2_table))
static struct platform_driver exynos_sysmmu_driver __refdata = {
}
 };
 
-static inline void pgtable_flush(void *vastart, void *vaend)
+static inline void update_pte(sysmmu_pte_t *ent, sysmmu_pte_t val)
 {
-   dmac_flush_range(vastart, vaend);
-   outer_flush_range(virt_to_phys(vastart),
-   virt_to_phys(vaend));
+   dma_sync_single_for_cpu(dma_dev, virt_to_phys(ent), sizeof(*ent),
+   DMA_TO_DEVICE);
+   *ent = val;
+   dma_sync_single_for_device(dma_dev, virt_to_phys(ent), sizeof(*ent),
+  DMA_TO_DEVICE);
 }
 
 static struct iommu_domain *exynos_iommu_domain_alloc(unsigned type)
 {
struct exynos_iommu_domain *domain;
+   dma_addr_t handle;
int i;
 
 
@@ -694,7 +696,10 @@ static struct iommu_domain 
*exynos_iommu_domain_alloc(unsigned type)
domain->pgtable[i + 7] = ZERO_LV2LINK;
}
 
-   pgtable_flush(domain->pgtable, domain->pgtable + NUM_LV1ENTRIES);
+   handle = dma_map_single(dma_dev, domain->pgtable, LV1TABLE_SIZE,
+   DMA_TO_DEVICE);
+   /* For mapping page table entries we rely on dma == phys */
+   BUG_ON(handle != virt_to_phys(domain->pgtable));
 
spin_lock_init(&domain->lock);
spin_lock_init(&domain->pgtablelock);
@@ -738,10 +743,18 @@ static void exynos_iommu_domain_free(struct iommu_domain 
*iommu_domain)
if (iommu_domain->type == IOMMU_DOMAIN_DMA)
iommu_put_dma_cookie(iommu_domain);
 
+   dma_unmap_single(dma_dev, virt_to_phys(domain->pgtable), LV1TABLE_SIZE,
+DMA_TO_DEVICE);
+
for (i = 0; i < NUM_LV1ENTRIES; i++)
-   if (lv1ent_page(domain->pgtable + i))
+   if (lv1ent_page(domain->pgtable + i)) {
+   phys_addr_t base = lv2table_base(domain->pgtable + i);
+
+   dma_unmap_single(dma_dev, base, LV2TABLE_SIZE,
+DMA_TO_DEVICE);
kmem_cache_free(lv2table_kmem_cache,
phys_to_virt(lv2table_base(domain->pgtable + i)));
+   phys_to_virt(base));
+   }
 
free_pages((unsigned long)domain->pgtable, 2);
free_pages((unsigned long)domain->lv2entcnt, 1);
@@ -834,11 +847,10 @@ static sysmmu_pte_t *alloc_lv2entry(struct 
exynos_iommu_domain *domain,
if (!pent)
return ERR_PTR(-ENOMEM);
 
-   *sent = mk_lv1ent_page(virt_to_phys(pent));
+   update_pte(sent, mk_lv1ent_page(virt_to_phys(pent)));
kmemleak_ignore(pent);
*pgcounter = NUM_LV2ENTRIES;
-   pgtable_flush(pent, pent + NUM_LV2ENTRIES);
-   pgtable_flush(sent, sent + 1);
+   dma_map_single(dma_dev, pent, LV2TABLE_SIZE, DMA_TO_DEVICE);
 
/*
 * If pre-fetched SLPD is a faulty SLPD in zero_l2_table,
@@ -891,9 +903,7 @@ static int lv1set_section(struct exynos_iommu_domain 
*domain,
*pgcnt = 0;
}
 
-   *sent = mk_lv1ent_sect(paddr);
-
-   pgtable_flush(sent, sent + 1);
+   update_pte(sent, mk_lv1ent_sect(paddr));
 
spin_lock(&domain->lock);
if 

Re: [RFC PATCH 1/2] PCI: Add PCI device flag PCI_DEV_FLAGS_BRIDGE_SKIP_ALIAS

2016-02-18 Thread Jayachandran Chandrashekaran Nair
On Wed, Feb 17, 2016 at 8:58 PM, Alex Williamson
 wrote:
> On Wed, 17 Feb 2016 17:15:09 +0530
> Jayachandran Chandrashekaran Nair
>  wrote:
>
>> On Wed, Feb 17, 2016 at 3:55 AM, Alex Williamson
>>  wrote:
>> > On Wed, 17 Feb 2016 02:38:24 +0530
>> > Jayachandran Chandrashekaran Nair
>> >  wrote:
>> >
>> >> On Tue, Feb 16, 2016 at 1:09 AM, Alex Williamson
>> >>  wrote:
>> >> > On Mon, 15 Feb 2016 12:20:23 -0600
>> >> > Bjorn Helgaas  wrote:
>> >> >
>> >> >> [+cc Alex, iommu list]
>> >> >>
>> >> >> On Mon, Feb 15, 2016 at 03:35:00AM +0530, Jayachandran C wrote:
>> >> >> > Add a new flag PCI_DEV_FLAGS_BRIDGE_SKIP_ALIAS to indicate bridges
>> >> >> > that should not be considered during DMA alias search. This is
>> >> >> > to support hardware (in this case Broadcom Vulcan PCIe subsystem)
>> >> >> > that has internal bridges which have either missing or wrong PCIe
>> >> >> > capabilities.
>> >> >
>> >> > I figured this would come at some point, the right answer is of course
>> >> > to follow the PCIe spec and implement the required PCIe capability in
>> >> > the hardware.
>> >>
>> >> There are  PCIe controllers on the chip that follows the spec, the issue 
>> >> is
>> >> how it is integrated to the northbridge (equivalent) on the chip, I have
>> >> tried to explain it below.
>> >>
>> >> >> This needs more explanation, like what exactly is wrong with this
>> >> >> device?  A missing PCIe capability might cause other problems.
>> >> >>
>> >> >> What problem does this fix?  Without these patches, do we merely add
>> >> >> aliases that are unnecessary?  Do we crash because something goes
>> >> >> wrong in the pci_pcie_type() switch because of the incorrect
>> >> >> capability?
>> >>
>> >> Here's how (for example) how the PCI enumeration of a 2 node Vulcan
>> >> processor will look like:
>> >>
>> >>
>> >> [0] +-0.0.0--[1]---+--1.a.0[2]-2.0.0---[3]3.0.0
>> >> |  +--1.a.1[4]-4.0.0---[5]5.0.0
>> >> |  .
>> >> |  ... etc...
>> >> |
>> >> +-0.0.1--[10]--+-10.a.0[11]---11.0.0---[12]---12.0.0
>> >>+-10.a.1[13]---13.0.0---[14]---14.0.0
>> >>.
>> >>... etc...
>> >
>> > So we have:
>> >
>> > "Glue Bridge"---"Glue Bridges"---Root Ports---Endpoints
>> >   (no pcie)  (pci/x-pcie)
>>
>> The top level is one glue bridge per chip in a multi-chip board,
>> but otherwise this is accurate.
>>
>> >>
>> >> The devices 0.0.x and x.a.x are glue bridges that are there to
>> >> bring the real PCIe controllers (pcie cap type 4) 2.0.0, 4.0.0,
>> >> 11.0.0, 13.0.0 under a single PCI hierarchy. The x.a.x bridges
>> >> have a PCIe capability (type 8) and 0.0.x does not have any pcie
>> >> capability.
>> >>
>> >> In case of Vulcan, both the GICv3 ITS driver code and the SMMUv3
>> >> driver code does dma alias walk to find the device id to use
>> >> in ITS and SMMU. In both cases it will ignore the x.0.0 bridges
>> >> since they are type root port, but will continue on up and end
>> >> up with incorrect device id.
>> >>
>> >> The flag I have added is to make the pci_for_each_dma_alias()
>> >> ignore the last 2 levels of glue/internal bridges.
>> >
>> > Per the PCIe spec, I'm not even sure what you have is a valid
>> > hierarchy, root ports sit on root complexes, not behind a PCI-to-PCIe
>> > bridge.  So really you're pretending that the downstream "glue bridge"
>> > is your host bridge and therefore root complex, but the PCI topology
>> > would clearly dictate that the top-most bus is conventional PCI and
>> > therefore everything is an alias of everything else and the DMA alias
>> > code is doing exactly what it should.
>>
>> Yes, I am not arguing that there is any issue in the current code. I
>> am trying to figure out the correct way to implement the quirk. We
>> have to support the hardware we have, not the hardware we want to
>> have :)
>>
>> > Also note that the caller of pci_for_each_dma_alias() owns the callback
>> > function triggered for each alias, that function could choose to prune
>> > out these "glue bridges" itself if that's the appropriate thing to do.
>>
>> I had implemented this first, but moved to the current approach because
>> it is needed in multiple places. The problem is: "On vulcan, while
>> going up the pcie bus heirarchy for finding aliases, skip the glue bridges",
>> So the logical place to put the fix is in pci_for_each_dma_alias()
>>
>> > Where do the SMMU and ITS actually reside in the above diagram?  You
>> > can use the callout function to stop the traversal anywhere you wish,
>> > it's free to return a -errno to stop or positive number, which the
>> > caller can interpret as error or non-failure stop condition.
>>
>> The SMMU (for translation requests) and ITS (for MSI 

Re: [PATCH V4 6/6] perf/amd/iommu: Clean up print messages pr_debug

2016-02-18 Thread Borislav Petkov
On Thu, Feb 11, 2016 at 04:15:27PM +0700, Suravee Suthikulpanit wrote:
> From: Suravee Suthikulpanit 
> 
> This patch declares pr_fmt for perf/amd_iommu, removes unnecessary
> pr_debug, and clean up error messages.

For the next version, move all the cleanups first so that the later
patches are simpler and easier to review.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.


Re: [PATCH V4 5/6] perf/amd/iommu: Enable support for multiple IOMMUs

2016-02-18 Thread Borislav Petkov
On Thu, Feb 11, 2016 at 04:15:26PM +0700, Suravee Suthikulpanit wrote:
> The current amd_iommu_pc_get_set_reg_val() does not support muli-IOMMU

multi

> system. This patch replace amd_iommu_pc_get_set_reg_val() with

You don't need to say in the commit message what this patch does - I
think most of us can see it - but concentrate on the why. So if you catch
yourself starting a commit message with "This patch does... " then that
description is most of the time redundant and needless.

> amd_iommu_pc_set_reg_val() and amd_iommu_pc_[set|get]_cnt_vals().
> 
> Also, the current struct hw_perf_event.prev_count can only store the
> previous counter value only from one IOMMU. So, this patch introduces
> a new data structure, perf_amd_iommu.prev_cnts, to track previous value
> of each performance counter of each bank of each IOMMU.
> 
> Last, this patch introduce perf_iommu_cnts to temporary hold counter
> values from each IOMMU for internal use when manages counters among
> multiple IOMMUs.
> 
> Note that this implementation makes an assumption that the counters
> on all IOMMUs will be programmed the same way (i.e with the same events).

Why?

Is that some hw restriction or what's going on?

This needs to at least be documented why or even completely lifted. I'd
prefer the latter.

> 
> Signed-off-by: Suravee Suthikulpanit 
> ---
>  arch/x86/events/amd/iommu.c   | 146 
> +++---
>  arch/x86/include/asm/perf/amd/iommu.h |   7 +-
>  drivers/iommu/amd_iommu_init.c| 107 ++---
>  3 files changed, 201 insertions(+), 59 deletions(-)
> 
> diff --git a/arch/x86/events/amd/iommu.c b/arch/x86/events/amd/iommu.c
> index 812eff2..947ae8a 100644
> --- a/arch/x86/events/amd/iommu.c
> +++ b/arch/x86/events/amd/iommu.c
> @@ -42,6 +42,13 @@ struct perf_amd_iommu {
>   u64 cntr_assign_mask;
>   raw_spinlock_t lock;
>   const struct attribute_group *attr_groups[4];
> +
> + /*
> +  * This is a 3D array used to store the previous count values
> +  * from each performance counter of each bank of each IOMMU.
> +  * I.E. size of array = (num iommus * num banks * num counters)
> +  */
> + local64_t *prev_cnts;
>  };
>  
>  #define format_group attr_groups[0]
> @@ -121,6 +128,12 @@ static struct amd_iommu_event_desc 
> amd_iommu_v2_event_descs[] = {
>   { /* end: all zeroes */ },
>  };
>  
> +/*
> + * This is an array used to temporary hold the current values
> + * read from a particular perf counter from each IOMMU.
> + */
> +static u64 *perf_iommu_cnts;

Is this going to be accessed in parallel on different CPUs? If so, you
need synchronization for it.

> +
>  /*-
>   * sysfs cpumask attributes
>   *-*/
> @@ -255,44 +268,42 @@ static void perf_iommu_enable_event(struct perf_event 
> *ev)
>   u64 reg = 0ULL;
>  
>   reg = csource;
> - amd_iommu_pc_get_set_reg_val(devid,
> - _GET_BANK(ev), _GET_CNTR(ev) ,
> -  IOMMU_PC_COUNTER_SRC_REG, &reg, true);
> + amd_iommu_pc_set_reg(devid, _GET_BANK(ev), _GET_CNTR(ev),
> +  IOMMU_PC_COUNTER_SRC_REG, &reg);
>  
> - reg = 0ULL | devid | (_GET_DEVID_MASK(ev) << 32);
> + reg = devid | (_GET_DEVID_MASK(ev) << 32);
>   if (reg)
> - reg |= (1UL << 31);
> - amd_iommu_pc_get_set_reg_val(devid,
> - _GET_BANK(ev), _GET_CNTR(ev) ,
> -  IOMMU_PC_DEVID_MATCH_REG, &reg, true);
> + reg |= BIT(31);
> + amd_iommu_pc_set_reg(devid, _GET_BANK(ev), _GET_CNTR(ev),
> +  IOMMU_PC_DEVID_MATCH_REG, &reg);
>  
> - reg = 0ULL | _GET_PASID(ev) | (_GET_PASID_MASK(ev) << 32);
> + reg = _GET_PASID(ev) | (_GET_PASID_MASK(ev) << 32);
>   if (reg)
> - reg |= (1UL << 31);
> - amd_iommu_pc_get_set_reg_val(devid,
> - _GET_BANK(ev), _GET_CNTR(ev) ,
> -  IOMMU_PC_PASID_MATCH_REG, &reg, true);
> + reg |= BIT(31);
> + amd_iommu_pc_set_reg(devid, _GET_BANK(ev), _GET_CNTR(ev),
> +  IOMMU_PC_PASID_MATCH_REG, &reg);
>  
> - reg = 0ULL | _GET_DOMID(ev) | (_GET_DOMID_MASK(ev) << 32);
> + reg = _GET_DOMID(ev) | (_GET_DOMID_MASK(ev) << 32);
>   if (reg)
> - reg |= (1UL << 31);
> - amd_iommu_pc_get_set_reg_val(devid,
> - _GET_BANK(ev), _GET_CNTR(ev) ,
> -  IOMMU_PC_DOMID_MATCH_REG, &reg, true);
> + reg |= BIT(31);
> + amd_iommu_pc_set_reg(devid, _GET_BANK(ev), _GET_CNTR(ev),
> +  IOMMU_PC_DOMID_MATCH_REG, &reg);
>  }
>  

This is a cleanup and belongs in a separate patch. Each patch should
contain one logical change: patch 1 cleans up things, patch2 cleans up
things even more, ... patch N adds functionality, patch N + 1 adds more
functionality ... and so on.

>  

Re: [PATCH V4 5/6] perf/amd/iommu: Enable support for multiple IOMMUs

2016-02-18 Thread Peter Zijlstra
On Thu, Feb 11, 2016 at 04:15:26PM +0700, Suravee Suthikulpanit wrote:
>  static void perf_iommu_read(struct perf_event *event)
>  {
> + int i;
>   u64 delta = 0ULL;
> struct hw_perf_event *hwc = &event->hw;
> + struct perf_amd_iommu *perf_iommu = container_of(event->pmu,
> +  struct perf_amd_iommu,
> +  pmu);
>  
> + if (amd_iommu_pc_get_counters(_GET_BANK(event), _GET_CNTR(event),
> +   amd_iommu_get_num_iommus(),
> +   perf_iommu_cnts))
>   return;
>  
> + /*
> +  * Now we re-aggregating event counts and prev-counts
> +  * from all IOMMUs
> +  */
> + local64_set(&hwc->prev_count, 0);
> +
> + for (i = 0; i < amd_iommu_get_num_iommus(); i++) {
> + int indx = get_iommu_bnk_cnt_evt_idx(perf_iommu, i,
> +  _GET_BANK(event),
> +  _GET_CNTR(event));
> + u64 prev_raw_count = local64_read(&perf_iommu->prev_cnts[indx]);
> +
> + /* IOMMU pc counter register is only 48 bits */
> + perf_iommu_cnts[i] &= GENMASK_ULL(48, 0);
> +
> + /*
> +  * Since we do not enable counter overflow interrupts,
> +  * we do not have to worry about prev_count changing on us
> +  */
> + local64_set(&perf_iommu->prev_cnts[indx], perf_iommu_cnts[i]);
> + local64_add(prev_raw_count, &hwc->prev_count);
> +
> + /* Handle 48-bit counter overflow */
> + delta = (perf_iommu_cnts[i] << COUNTER_SHIFT) - (prev_raw_count << COUNTER_SHIFT);
> + delta >>= COUNTER_SHIFT;
> + local64_add(delta, &event->count);
> + }
>  }

So I really don't have time to review new muck while I'm hunting perf
core fail, but Boris made me look at this.

This is crazy, if you have multiple IOMMUs then create an event per
IOMMU, do _NOT_ fold them all into a single event.

In any case, the reason Boris asked me to look at this is that your
overflow handling is broken, you want delta to be s64. Otherwise:

delta >>= COUNTER_SHIFT;

ends up as a SHR and you lose the MSB sign bits.


RE: [PATCH v2 2/4] iommu/arm-smmu: Invoke DT probe from arm_smmu_of_setup()

2016-02-18 Thread Sricharan
Hi,

> -Original Message-
> From: linux-arm-kernel [mailto:linux-arm-kernel-
> boun...@lists.infradead.org] On Behalf Of Anup Patel
> Sent: Monday, February 08, 2016 10:48 AM
> To: Catalin Marinas; Joerg Roedel; Will Deacon; Robin Murphy; Sricharan R;
> Linux IOMMU; Linux ARM Kernel
> Cc: Mark Rutland; Device Tree; Scott Branden; Pawel Moll; Ian Campbell;
Ray
> Jui; Linux Kernel; Vikram Prakash; Rob Herring; BCM Kernel Feedback; Kumar
> Gala; Anup Patel
> Subject: [PATCH v2 2/4] iommu/arm-smmu: Invoke DT probe from
> arm_smmu_of_setup()
> 
> The SMMUv1/SMMUv2 driver is initialized very early using the
> IOMMU_OF_DECLARE() but the actual platform device is probed via normal
> DT probing.
> 
> This patch uses of_platform_device_create() from arm_smmu_of_setup() to
> ensure that SMMU platform device is probed immediately.
> 
> Signed-off-by: Anup Patel 
> ---
>  drivers/iommu/arm-smmu.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index
> 2c8f871..02cd67d 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -40,6 +40,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -1956,10 +1957,15 @@ static int __init arm_smmu_init(void)
> 
>  static int __init arm_smmu_of_setup(struct device_node *np)
>  {
> + struct platform_device *pdev;
> 
>   if (!init_done)
>   arm_smmu_init();
> 
> + pdev = of_platform_device_create(np, NULL, NULL);
> + if (IS_ERR(pdev))
> + return PTR_ERR(pdev);
> +
>   of_iommu_set_ops(np, &arm_smmu_ops);

 A question here: there was a probe deferral series [1] to take care of
 deferred probing of devices behind IOMMUs. With that, this sort of early
 device probing during setup should not be required, and it also clears
 other dependencies of IOMMUs on clocks, etc., if any.
 My intention was to check what the right way to do this is, or to be
 pointed to any updates on the probe deferral series that I missed.

[1]  http://lkml.iu.edu/hypermail/linux/kernel/1505.3/03280.html
 
Regards,
 Sricharan



Re: [PATCH V4 4/6] perf/amd/iommu: Introduce get_iommu_bnk_cnt_evt_idx

2016-02-18 Thread Borislav Petkov
On Thu, Feb 11, 2016 at 04:15:25PM +0700, Suravee Suthikulpanit wrote:
> Introduce a helper function to calculate bit-index for assigning
> performance counter assignment.
> 
> Signed-off-by: Suravee Suthikulpanit 
> ---
>  arch/x86/events/amd/iommu.c | 19 ++-
>  1 file changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/x86/events/amd/iommu.c b/arch/x86/events/amd/iommu.c
> index debf22d..812eff2 100644
> --- a/arch/x86/events/amd/iommu.c
> +++ b/arch/x86/events/amd/iommu.c
> @@ -145,18 +145,27 @@ static struct attribute_group amd_iommu_cpumask_group = {
>  
>  /*-*/
>  
> +static inline
> +int get_iommu_bnk_cnt_evt_idx(struct perf_amd_iommu *perf_iommu,
> +   int iommu_index, int bank_index,
> +   int cntr_index)

Align args at the opening brace.

> +{
> + int cntrs_per_iommu = perf_iommu->max_banks * perf_iommu->max_counters;
> + int index = (perf_iommu->max_counters * bank_index) + cntr_index;
> +
> + return (cntrs_per_iommu * iommu_index) + index;
> +}
> +
>  static int get_next_avail_iommu_bnk_cntr(struct perf_amd_iommu *perf_iommu)
>  {
>   unsigned long flags;
>   int shift, bank, cntr, retval;
> - int max_banks = perf_iommu->max_banks;
> - int max_cntrs = perf_iommu->max_counters;
>  
>   raw_spin_lock_irqsave(&perf_iommu->lock, flags);
>  
> - for (bank = 0, shift = 0; bank < max_banks; bank++) {
> - for (cntr = 0; cntr < max_cntrs; cntr++) {
> - shift = bank + (bank*3) + cntr;
> + for (bank = 0, shift = 0; bank < perf_iommu->max_banks; bank++) {

Please init that shift variable above in the function entry - it
confuses unnecessarily here as if it had anything to do with the loop
stride.

> + for (cntr = 0; cntr < perf_iommu->max_counters; cntr++) {
> + shift = get_iommu_bnk_cnt_evt_idx(perf_iommu, 0, bank, cntr);

\n here

>   if (perf_iommu->cntr_assign_mask & (1ULL << shift)) {
>   continue;
>   } else {
> -- 
> 2.5.0
> 
> 

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.


Re: [RFC v3 15/15] irqchip/gicv2m/v3-its-pci-msi: IOMMU map the MSI frame when needed

2016-02-18 Thread Marc Zyngier
On Fri, 12 Feb 2016 08:13:17 +
Eric Auger  wrote:

> In case the msi_desc references a device attached to an iommu
> domain, the msi address needs to be mapped in the IOMMU. Else any
> MSI write transaction will cause a fault.
> 
> gic_set_msi_addr detects that case and allocates an iova bound
> to the physical address page comprising the MSI frame. This iova
> then is used as the msi_msg address. Unset operation decrements the
> reference on the binding.
> 
> The functions are called in the irq_write_msi_msg ops implementation.
> At that time we can tell whether the MSI is being set up or torn down
> by looking at the msi_msg content: msi_domain_deactivate zeroes all
> the fields.
> 
> Signed-off-by: Eric Auger 
> 
> ---
> 
> v2 -> v3:
> - protect iova/addr manipulation with CONFIG_ARCH_DMA_ADDR_T_64BIT and
>   CONFIG_PHYS_ADDR_T_64BIT
> - only expose gic_pci_msi_domain_write_msg in case CONFIG_IOMMU_API &
>   CONFIG_PCI_MSI_IRQ_DOMAIN are set.
> - gic_set/unset_msi_addr duly become static
> ---
>  drivers/irqchip/irq-gic-common.c | 69 
> 
>  drivers/irqchip/irq-gic-common.h |  5 +++
>  drivers/irqchip/irq-gic-v2m.c|  7 +++-
>  drivers/irqchip/irq-gic-v3-its-pci-msi.c |  5 +++
>  4 files changed, 85 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/irqchip/irq-gic-common.c b/drivers/irqchip/irq-gic-common.c
> index f174ce0..46cd06c 100644
> --- a/drivers/irqchip/irq-gic-common.c
> +++ b/drivers/irqchip/irq-gic-common.c
> @@ -18,6 +18,8 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
>  
>  #include "irq-gic-common.h"
>  
> @@ -121,3 +123,70 @@ void gic_cpu_config(void __iomem *base, void (*sync_access)(void))
>   if (sync_access)
>   sync_access();
>  }
> +
> +#if defined(CONFIG_IOMMU_API) && defined(CONFIG_PCI_MSI_IRQ_DOMAIN)
> +static int gic_set_msi_addr(struct irq_data *data, struct msi_msg *msg)
> +{
> + struct msi_desc *desc = irq_data_get_msi_desc(data);
> + struct device *dev = msi_desc_to_dev(desc);
> + struct iommu_domain *d;
> + phys_addr_t addr;
> + dma_addr_t iova;
> + int ret;
> +
> + d = iommu_get_domain_for_dev(dev);
> + if (!d)
> + return 0;
> +#ifdef CONFIG_PHYS_ADDR_T_64BIT
> + addr = ((phys_addr_t)(msg->address_hi) << 32) | msg->address_lo;
> +#else
> + addr = msg->address_lo;
> +#endif
> +
> + ret = iommu_get_single_reserved(d, addr, IOMMU_WRITE, &iova);
> +
> + if (!ret) {
> + msg->address_lo = lower_32_bits(iova);
> + msg->address_hi = upper_32_bits(iova);
> + }
> + return ret;
> +}
> +
> +
> +static void gic_unset_msi_addr(struct irq_data *data)
> +{
> + struct msi_desc *desc = irq_data_get_msi_desc(data);
> + struct device *dev;
> + struct iommu_domain *d;
> + dma_addr_t iova;
> +
> +#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
> + iova = ((dma_addr_t)(desc->msg.address_hi) << 32) |
> + desc->msg.address_lo;
> +#else
> + iova = desc->msg.address_lo;
> +#endif
> +
> + dev = msi_desc_to_dev(desc);
> + if (!dev)
> + return;
> +
> + d = iommu_get_domain_for_dev(dev);
> + if (!d)
> + return;
> +
> + iommu_put_single_reserved(d, iova);
> +}
> +
> +void gic_pci_msi_domain_write_msg(struct irq_data *irq_data,
> +   struct msi_msg *msg)
> +{
> + if (!msg->address_hi && !msg->address_lo && !msg->data)
> + gic_unset_msi_addr(irq_data); /* deactivate */
> + else
> + gic_set_msi_addr(irq_data, msg); /* activate, set_affinity */
> +
> + pci_msi_domain_write_msg(irq_data, msg);
> +}

So by doing that, you are specializing this infrastructure to PCI.
If you hijacked irq_compose_msi_msg() instead, you'd have both
platform and PCI MSI for the same price.

I can see a potential problem with the teardown of an MSI (I don't
think the compose method is called on teardown), but I think this could
be easily addressed.

> +#endif
> +
> diff --git a/drivers/irqchip/irq-gic-common.h b/drivers/irqchip/irq-gic-common.h
> index fff697d..98681fd 100644
> --- a/drivers/irqchip/irq-gic-common.h
> +++ b/drivers/irqchip/irq-gic-common.h
> @@ -35,4 +35,9 @@ void gic_cpu_config(void __iomem *base, void (*sync_access)(void));
>  void gic_enable_quirks(u32 iidr, const struct gic_quirk *quirks,
>   void *data);
>  
> +#if defined(CONFIG_PCI_MSI_IRQ_DOMAIN) && defined(CONFIG_IOMMU_API)
> +void gic_pci_msi_domain_write_msg(struct irq_data *irq_data,
> +   struct msi_msg *msg);
> +#endif
> +
>  #endif /* _IRQ_GIC_COMMON_H */
> diff --git a/drivers/irqchip/irq-gic-v2m.c b/drivers/irqchip/irq-gic-v2m.c
> index c779f83..692d809 100644
> --- a/drivers/irqchip/irq-gic-v2m.c
> +++ b/drivers/irqchip/irq-gic-v2m.c
> @@ -24,6 +24,7 @@
>  #include 
>  #include 
>  #include 
> +#include "irq-gic-common.h"
>  
>  

Re: [PATCH V4 2/6] perf/amd/iommu: Modify functions to query max banks and counters

2016-02-18 Thread Borislav Petkov
On Thu, Feb 11, 2016 at 04:15:23PM +0700, Suravee Suthikulpanit wrote:
> Currently, amd_iommu_pc_get_max_[banks|counters]() require devid,
> which should not be the case.

Why?

Commit message could use an explanation.

> Also, these don't properly support
> multi-IOMMU system.
> 
> Current and future AMD systems with IOMMU that support perf counter

"with an IOMMU that supports performance counters"

> would likely contain homogeneous IOMMUs where multiple IOMMUs are

What are homogeneous IOMMUs?

> availalbe. So, this patch modifies these function to iterate all IOMMU

Please integrate a spellchecker in your patch creation workflow:

s/availalbe/available/

> to check the max banks and counters reported by the hardware.
> 
> Reviewed-by: Joerg Roedel 
> Signed-off-by: Suravee Suthikulpanit 
> ---
>  arch/x86/events/amd/iommu.c   | 17 +++--
>  arch/x86/include/asm/perf/amd/iommu.h |  7 ++-
>  drivers/iommu/amd_iommu_init.c| 20 
>  3 files changed, 21 insertions(+), 23 deletions(-)
> 
> diff --git a/arch/x86/events/amd/iommu.c b/arch/x86/events/amd/iommu.c
> index 2f96072..debf22d 100644
> --- a/arch/x86/events/amd/iommu.c
> +++ b/arch/x86/events/amd/iommu.c
> @@ -232,14 +232,6 @@ static int perf_iommu_event_init(struct perf_event *event)
>   return -EINVAL;
>   }
>  
> - /* integrate with iommu base devid (), assume one iommu */
> - perf_iommu->max_banks =
> - amd_iommu_pc_get_max_banks(IOMMU_BASE_DEVID);
> - perf_iommu->max_counters =
> - amd_iommu_pc_get_max_counters(IOMMU_BASE_DEVID);
> - if ((perf_iommu->max_banks == 0) || (perf_iommu->max_counters == 0))
> - return -EINVAL;
> -
>   /* update the hw_perf_event struct with the iommu config data */
>   hwc->config = config;
>   hwc->extra_reg.config = config1;
> @@ -450,6 +442,11 @@ static __init int _init_perf_amd_iommu(

Btw, that _init_perf_amd_iommu is unnecessarily split from
amd_iommu_pc_init(). You should merge those two. But that's another
patch. In that same cleanup patch you could do

s/perf_iommu/pi/g

or so because that perf_iommu local var is unnecesarily long and impairs
readability.

>   if (_init_events_attrs(perf_iommu) != 0)
>   pr_err("perf: amd_iommu: Only support raw events.\n");
>  
> + perf_iommu->max_banks = amd_iommu_pc_get_max_banks();
> + perf_iommu->max_counters = amd_iommu_pc_get_max_counters();
> + if ((perf_iommu->max_banks == 0) || (perf_iommu->max_counters == 0))

Simplify:

if (!perf_iommu->max_banks ||
!perf_iommu->max_counters)

> + return -EINVAL;
> +
>   /* Init null attributes */
>   perf_iommu->null_group = NULL;
>   perf_iommu->pmu.attr_groups = perf_iommu->attr_groups;
> @@ -460,8 +457,8 @@ static __init int _init_perf_amd_iommu(
>   amd_iommu_pc_exit();
>   } else {
>   pr_info("perf: amd_iommu: Detected. (%d banks, %d counters/bank)\n",
> - amd_iommu_pc_get_max_banks(IOMMU_BASE_DEVID),
> - amd_iommu_pc_get_max_counters(IOMMU_BASE_DEVID));
> + amd_iommu_pc_get_max_banks(),
> + amd_iommu_pc_get_max_counters());
>   }
>  
>   return ret;
> diff --git a/arch/x86/include/asm/perf/amd/iommu.h b/arch/x86/include/asm/perf/amd/iommu.h
> index 72f64b7..97e1be5 100644
> --- a/arch/x86/include/asm/perf/amd/iommu.h
> +++ b/arch/x86/include/asm/perf/amd/iommu.h
> @@ -24,15 +24,12 @@
>  #define PC_MAX_SPEC_BNKS 64
>  #define PC_MAX_SPEC_CNTRS16
>  
> -/* iommu pc reg masks*/
> -#define IOMMU_BASE_DEVID 0x
> -
>  /* amd_iommu_init.c external support functions */
>  bool amd_iommu_pc_supported(void);
>  
> -u8 amd_iommu_pc_get_max_banks(u16 devid);
> +u8 amd_iommu_pc_get_max_banks(void);
>  
> -u8 amd_iommu_pc_get_max_counters(u16 devid);
> +u8 amd_iommu_pc_get_max_counters(void);
>  
>  int amd_iommu_pc_set_reg_val(u16 devid, u8 bank, u8 cntr, u8 fxn, u64 *value);
>  
> diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
> index d30f4b2..a62b5ce 100644
> --- a/drivers/iommu/amd_iommu_init.c
> +++ b/drivers/iommu/amd_iommu_init.c
> @@ -2251,15 +2251,17 @@ EXPORT_SYMBOL(amd_iommu_v2_supported);
>   *
>   
> /
>  
> -u8 amd_iommu_pc_get_max_banks(u16 devid)
> +u8 amd_iommu_pc_get_max_banks(void)
>  {
>   struct amd_iommu *iommu;
>   u8 ret = 0;
>  
> - /* locate the iommu governing the devid */
> - iommu = amd_iommu_rlookup_table[devid];
> - if (iommu)
> + for_each_iommu(iommu) {
> + if (!iommu->max_banks ||
> + (ret && (iommu->max_banks != ret)))

What is that supposed to do?

Check that the max_banks of 

Re: [RFC v3 05/15] iommu/arm-smmu: implement alloc/free_reserved_iova_domain

2016-02-18 Thread Robin Murphy

Hi Eric,

On 12/02/16 08:13, Eric Auger wrote:

Implement alloc/free_reserved_iova_domain for arm-smmu. we use
the iova allocator (iova.c). The iova_domain is attached to the
arm_smmu_domain struct. A mutex is introduced to protect it.


The IOMMU API currently leaves IOVA management entirely up to the caller 
- VFIO is already managing its own IOVA space, so what warrants this 
being pushed all the way down to the IOMMU driver? All I see here is 
abstract code with no hardware-specific details that'll have to be 
copy-pasted into other IOMMU drivers (e.g. SMMUv3), which strongly 
suggests it's the wrong place to do it.


As I understand the problem, VFIO has a generic "configure an IOMMU to 
point at an MSI doorbell" step to do in the process of attaching a 
device, which hasn't needed implementing yet due to VT-d's 
IOMMU_CAP_I_AM_ALSO_ACTUALLY_THE_MSI_CONTROLLER_IN_DISGUISE flag, which 
most of us have managed to misinterpret so far. AFAICS all the IOMMU 
driver should need to know about this is an iommu_map() call (which will 
want a slight extension[1] to make things behave properly). We should be 
fixing the abstraction to be less x86-centric, not hacking up all the 
ARM drivers to emulate x86 hardware behaviour in software.


Robin.

[1]:http://article.gmane.org/gmane.linux.kernel.cross-arch/30833


Signed-off-by: Eric Auger 

---
v2 -> v3:
- select IOMMU_IOVA when ARM_SMMU or ARM_SMMU_V3 is set

v1 -> v2:
- formerly implemented in vfio_iommu_type1
---
  drivers/iommu/Kconfig|  2 ++
  drivers/iommu/arm-smmu.c | 87 +++-
  2 files changed, 74 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index a1e75cb..1106528 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -289,6 +289,7 @@ config ARM_SMMU
bool "ARM Ltd. System MMU (SMMU) Support"
depends on (ARM64 || ARM) && MMU
select IOMMU_API
+   select IOMMU_IOVA
select IOMMU_IO_PGTABLE_LPAE
select ARM_DMA_USE_IOMMU if ARM
help
@@ -302,6 +303,7 @@ config ARM_SMMU_V3
bool "ARM Ltd. System MMU Version 3 (SMMUv3) Support"
depends on ARM64 && PCI
select IOMMU_API
+   select IOMMU_IOVA
select IOMMU_IO_PGTABLE_LPAE
select GENERIC_MSI_IRQ_DOMAIN
help
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index c8b7e71..f42341d 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -42,6 +42,7 @@
  #include 
  #include 
  #include 
+#include 

  #include 

@@ -347,6 +348,9 @@ struct arm_smmu_domain {
enum arm_smmu_domain_stage  stage;
struct mutexinit_mutex; /* Protects smmu pointer */
struct iommu_domain domain;
+   struct iova_domain  *reserved_iova_domain;
+   /* protects reserved domain manipulation */
+   struct mutexreserved_mutex;
  };

  static struct iommu_ops arm_smmu_ops;
@@ -975,6 +979,7 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
return NULL;

	mutex_init(&smmu_domain->init_mutex);
+   mutex_init(&smmu_domain->reserved_mutex);
	spin_lock_init(&smmu_domain->pgtbl_lock);

	return &smmu_domain->domain;
@@ -1446,22 +1451,74 @@ out_unlock:
return ret;
  }

+static int arm_smmu_alloc_reserved_iova_domain(struct iommu_domain *domain,
+  dma_addr_t iova, size_t size,
+  unsigned long order)
+{
+   unsigned long granule, mask;
+   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+   int ret = 0;
+
+   granule = 1UL << order;
+   mask = granule - 1;
+   if (iova & mask || (!size) || (size & mask))
+   return -EINVAL;
+
+   if (smmu_domain->reserved_iova_domain)
+   return -EEXIST;
+
+   mutex_lock(&smmu_domain->reserved_mutex);
+
+   smmu_domain->reserved_iova_domain =
+   kzalloc(sizeof(struct iova_domain), GFP_KERNEL);
+   if (!smmu_domain->reserved_iova_domain) {
+   ret = -ENOMEM;
+   goto unlock;
+   }
+
+   init_iova_domain(smmu_domain->reserved_iova_domain,
+granule, iova >> order, (iova + size - 1) >> order);
+
+unlock:
+   mutex_unlock(_domain->reserved_mutex);
+   return ret;
+}
+
+static void arm_smmu_free_reserved_iova_domain(struct iommu_domain *domain)
+{
+   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+   struct iova_domain *iovad = smmu_domain->reserved_iova_domain;
+
+   if (!iovad)
+   return;
+
+   mutex_lock(&smmu_domain->reserved_mutex);
+
+   put_iova_domain(iovad);
+   kfree(iovad);
+
+   mutex_unlock(&smmu_domain->reserved_mutex);
+}
+
  static struct iommu_ops arm_smmu_ops = {
-   .capable= arm_smmu_capable,
-   

Re: [RFC v3 07/15] iommu: iommu_get/put_single_reserved

2016-02-18 Thread Marc Zyngier
On Fri, 12 Feb 2016 08:13:09 +
Eric Auger  wrote:

> This patch introduces iommu_get/put_single_reserved.
> 
> iommu_get_single_reserved allocates a new reserved iova page and maps
> it onto the physical page that contains a given physical address.
> It returns the iova that is mapped onto the provided physical address.
> Hence the physical address passed in argument does not need to be aligned.
> 
> In case a mapping already exists between both pages, the IOVA mapped
> to the PA is directly returned.
> 
> Each time an iova is successfully returned a binding ref count is
> incremented.
> 
> iommu_put_single_reserved decrements the ref count; when it reaches
> zero, the mapping is destroyed and the iova is released.

I wonder if there is a requirement for the caller to find out about the
size of the mapping, or to impose a given size... MSIs clearly do not
have that requirement (this is always a 32bit value), but since
allocations usually pair address and size, I thought I'd ask...

Thanks,

M.
-- 
Jazz is not dead. It just smells funny.


Re: [RFC v3 02/15] vfio: expose MSI mapping requirement through VFIO_IOMMU_GET_INFO

2016-02-18 Thread Marc Zyngier
On Fri, 12 Feb 2016 08:13:04 +
Eric Auger  wrote:

> This patch allows user space to retrieve whether MSI write
> transaction addresses must be mapped. This is returned through the
> VFIO_IOMMU_GET_INFO API and its new flag: VFIO_IOMMU_INFO_REQUIRE_MSI_MAP.
> 
> Signed-off-by: Bharat Bhushan 
> Signed-off-by: Eric Auger 
> 
> ---
> 
> RFC v1 -> v1:
> - derived from
>   [RFC PATCH 3/6] vfio: Extend iommu-info to return MSIs automap state
> - renamed allow_msi_reconfig into require_msi_mapping
> - fixed VFIO_IOMMU_GET_INFO
> ---
>  drivers/vfio/vfio_iommu_type1.c | 26 ++
>  include/uapi/linux/vfio.h   |  1 +
>  2 files changed, 27 insertions(+)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 6f1ea3d..c5b57e1 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -255,6 +255,29 @@ static int vaddr_get_pfn(unsigned long vaddr, int prot, unsigned long *pfn)
>  }
>  
>  /*
> + * vfio_domains_require_msi_mapping: indicates whether MSI write transaction
> + * addresses must be mapped
> + *
> + * returns true if it does
> + */
> +static bool vfio_domains_require_msi_mapping(struct vfio_iommu *iommu)
> +{
> + struct vfio_domain *d;
> + bool ret;
> +
> + mutex_lock(&iommu->lock);
> + /* All domains have same require_msi_map property, pick first */
> + d = list_first_entry(&iommu->domain_list, struct vfio_domain, next);
> + if (iommu_domain_get_attr(d->domain, DOMAIN_ATTR_MSI_MAPPING, NULL) < 0)
> + ret = false;
> + else
> + ret = true;

nit: this could be simplified as:

ret = (iommu_domain_get_attr(d->domain, DOMAIN_ATTR_MSI_MAPPING, NULL) == 0);

> + mutex_unlock(&iommu->lock);
> +
> + return ret;
> +}
> +
> +/*
>   * Attempt to pin pages.  We really don't want to track all the pfns and
>   * the iommu can only map chunks of consecutive pfns anyway, so get the
>   * first page and all consecutive pages with the same locking.
> @@ -997,6 +1020,9 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
>  
>   info.flags = VFIO_IOMMU_INFO_PGSIZES;
>  
> + if (vfio_domains_require_msi_mapping(iommu))
> + info.flags |= VFIO_IOMMU_INFO_REQUIRE_MSI_MAP;
> +
>   info.iova_pgsizes = vfio_pgsize_bitmap(iommu);
>  
>   return copy_to_user((void __user *)arg, &info, minsz);
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index 7d7a4c6..43e183b 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -400,6 +400,7 @@ struct vfio_iommu_type1_info {
>   __u32   argsz;
>   __u32   flags;
>  #define VFIO_IOMMU_INFO_PGSIZES (1 << 0) /* supported page sizes info */
> +#define VFIO_IOMMU_INFO_REQUIRE_MSI_MAP (1 << 1) /* MSI must be mapped */
>   __u64   iova_pgsizes;   /* Bitmap of supported page sizes */
>  };
>  


FWIW:

Acked-by: Marc Zyngier 

M.
-- 
Jazz is not dead. It just smells funny.