[PATCH V2] iommu: arm-smmu: drop devm_free_irq when driver detach

2016-07-12 Thread Peng Fan
There is no need to call devm_free_irq when the driver detaches:
devres_release_all, which is called after 'drv->remove', will
release all managed resources.
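
As background, a minimal sketch of the devres pattern in play
(hypothetical foo_* driver, not the arm-smmu code): an IRQ requested
with devm_request_irq() is a managed resource, released by the driver
core via devres_release_all() once remove() has returned.

	static irqreturn_t foo_handler(int irq, void *data)
	{
		return IRQ_HANDLED;
	}

	static int foo_probe(struct platform_device *pdev)
	{
		int irq = platform_get_irq(pdev, 0);

		if (irq < 0)
			return irq;

		/* Managed: devres_release_all() frees this after remove() */
		return devm_request_irq(&pdev->dev, irq, foo_handler, 0,
					dev_name(&pdev->dev), NULL);
	}

	static int foo_remove(struct platform_device *pdev)
	{
		/* No devm_free_irq() here; the devres core releases the IRQ */
		return 0;
	}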

Signed-off-by: Peng Fan 
Reviewed-by: Robin Murphy 
Cc: Will Deacon 
---

V2:
 Fix compile warning. Add Robin's Reviewed-by tag.

 drivers/iommu/arm-smmu.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 860652e..b7ef1d8 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -2018,7 +2018,6 @@ out_put_masters:
 
 static int arm_smmu_device_remove(struct platform_device *pdev)
 {
-   int i;
	struct device *dev = &pdev->dev;
struct arm_smmu_device *curr, *smmu = NULL;
struct rb_node *node;
@@ -2045,9 +2044,6 @@ static int arm_smmu_device_remove(struct platform_device *pdev)
if (!bitmap_empty(smmu->context_map, ARM_SMMU_MAX_CBS))
dev_err(dev, "removing device with active domains!\n");
 
-   for (i = 0; i < smmu->num_global_irqs; ++i)
-   devm_free_irq(smmu->dev, smmu->irqs[i], smmu);
-
/* Turn the thing off */
writel(sCR0_CLIENTPD, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0);
return 0;
-- 
2.6.2



[PATCH v2] iommu/arm-smmu-v3: limit use of 2-level stream tables

2016-07-12 Thread Nate Watterson
In the current arm-smmu-v3 driver, all SMMUs that support 2-level
stream tables are forced to use them. This is suboptimal for SMMUs
that support fewer stream ID bits than would fill a single
second-level table. This patch limits the use of 2-level tables to
SMMUs that both support the feature and whose first-level table can
possibly contain more than a single entry.
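
As a worked example (assuming STRTAB_SPLIT == 8, the value in the
driver at this time): one second-level table resolves 2^8 = 256 stream
IDs, so an SMMU reporting sid_bits == 7 would need exactly one L1
descriptor pointing at one 256-entry L2 table, which is pure overhead
next to a 128-entry linear table. A sketch of the resulting decision:

	/* 2-level only pays off if more than one L1 descriptor is possible */
	if (smmu->sid_bits <= STRTAB_SPLIT)
		smmu->features &= ~ARM_SMMU_FEAT_2_LVL_STRTAB;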

Signed-off-by: Nate Watterson 
---
 drivers/iommu/arm-smmu-v3.c | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 5f6b3bc..f27b8dc 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -2033,17 +2033,9 @@ static int arm_smmu_init_strtab_2lvl(struct arm_smmu_device *smmu)
u32 size, l1size;
	struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
 
-   /*
-* If we can resolve everything with a single L2 table, then we
-* just need a single L1 descriptor. Otherwise, calculate the L1
-* size, capped to the SIDSIZE.
-*/
-   if (smmu->sid_bits < STRTAB_SPLIT) {
-   size = 0;
-   } else {
-   size = STRTAB_L1_SZ_SHIFT - (ilog2(STRTAB_L1_DESC_DWORDS) + 3);
-   size = min(size, smmu->sid_bits - STRTAB_SPLIT);
-   }
+   /* Calculate the L1 size, capped to the SIDSIZE. */
+   size = STRTAB_L1_SZ_SHIFT - (ilog2(STRTAB_L1_DESC_DWORDS) + 3);
+   size = min(size, smmu->sid_bits - STRTAB_SPLIT);
cfg->num_l1_ents = 1 << size;
 
size += STRTAB_SPLIT;
@@ -2531,6 +2523,13 @@ static int arm_smmu_device_probe(struct arm_smmu_device *smmu)
smmu->ssid_bits = reg >> IDR1_SSID_SHIFT & IDR1_SSID_MASK;
smmu->sid_bits = reg >> IDR1_SID_SHIFT & IDR1_SID_MASK;
 
+   /*
+* If the SMMU supports fewer bits than would fill a single L2 stream
+* table, use a linear table instead.
+*/
+   if (smmu->sid_bits <= STRTAB_SPLIT)
+   smmu->features &= ~ARM_SMMU_FEAT_2_LVL_STRTAB;
+
/* IDR5 */
reg = readl_relaxed(smmu->base + ARM_SMMU_IDR5);
 
-- 
Qualcomm Datacenter Technologies, Inc. on behalf of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux
Foundation Collaborative Project.



Re: [PATCH 16/20 v2] iommu/amd: Optimize map_sg and unmap_sg

2016-07-12 Thread Robin Murphy
On 12/07/16 14:30, Joerg Roedel wrote:
> On Tue, Jul 12, 2016 at 12:33:43PM +0100, Robin Murphy wrote:
>>> +   for_each_sg(sglist, s, nelems, i)
>>> +   npages += iommu_num_pages(sg_phys(s), s->length, PAGE_SIZE);
>>
>> This fails to account for the segment boundary mask[1]. Given a typical
>> sglist from the block layer where the boundary mask is 64K, the first
>> segment is 8k long, and subsequent segments are 64K long, those
>> subsequent segments will end up with misaligned addresses which certain
>> hardware may object to.
> 
> Yeah, right. It doesn't matter much on x86, as the smallest
> boundary-mask I have seen is 4G, but to be correct it should be
> accounted for. How does the attached patch look?

The boundary masks for block devices are tricky to track down through so
many layers of indirection in the common frameworks, but there are a lot
of 64K ones there. After some more impromptu digging into the subject
I've finally satisfied my curiosity - it seems this restriction stems
from the ATA DMA PRD table format, so it could perhaps still be a real
concern for anyone using some crusty old PCI IDE card in their modern
system.
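
Concretely, the constraint is that no DMA segment may cross a
(mask + 1)-aligned boundary. A minimal sketch of the check
(illustrative, not driver code), assuming a 64K mask:

	/* addr = 0xe000, len = 0x4000 ends at 0x11fff: past the 64K line */
	static bool crosses_seg_boundary(dma_addr_t addr, size_t len,
					 unsigned long mask)
	{
		return (addr & ~mask) != ((addr + len - 1) & ~mask);
	}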

>>
>>> +   address = dma_ops_alloc_iova(dev, dma_dom, npages, dma_mask);
>>
>> Since a typical dma_map_sg() call is likely to involve >128K worth of
>> data, I wonder if it's worth going directly to a slow-path IOVA
>> allocation...
> 
> Well, the allocator is the bottle-neck, so I try not to call it for
> every sg-element. The global locks have been removed, but more
> allocations/deallocations also mean that the per-cpu free-lists fill up
> quicker and that we have to flush the IOTLBs more often, which costs
> performance.

Indeed, I wasn't suggesting making more than one call, just that
alloc_iova_fast() is quite likely to have to fall back to alloc_iova()
here, so there may be some mileage in going directly to the latter, with
the benefit of then being able to rely on find_iova() later (since you
know for sure you allocated out of the tree rather than the caches). My
hunch is that dma_map_sg() tends to be called for bulk data transfer
(block devices, DRM, etc.) so is probably a less contended path compared
to the network layer hammering dma_map_single().
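
For reference, a hedged sketch of that pairing with the 4.8-era
<linux/iova.h> calls and this series' dma_dom naming (an assumption of
mine, not a tested patch):

	/* Skip the per-CPU caches and allocate from the rbtree directly... */
	struct iova *iova = alloc_iova(&dma_dom->iovad, npages,
				       dma_mask >> PAGE_SHIFT, true);

	/* ...so the unmap side can always locate the allocation again: */
	struct iova *found = find_iova(&dma_dom->iovad, start_pfn);
	if (found)
		__free_iova(&dma_dom->iovad, found);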

> 
>> [1]:http://article.gmane.org/gmane.linux.kernel.iommu/10553 - almost the
>> 1-year anniversary of you making much the same comment to me :D
> 
> Touché ;-)
> 
> Here is the updated patch:
> 
> From 88e1cc6c8e854a2bf55f972ddc5082a44760abe2 Mon Sep 17 00:00:00 2001
> From: Joerg Roedel 
> Date: Wed, 6 Jul 2016 17:20:54 +0200
> Subject: [PATCH 1/5] iommu/amd: Optimize map_sg and unmap_sg
> 
> Optimize these functions so that they need only one call
> into the address allocator. This also saves a couple of
> io-tlb flushes in the unmap_sg path.
> 
> Signed-off-by: Joerg Roedel 
> ---
>  drivers/iommu/amd_iommu.c | 112 +++---
>  1 file changed, 86 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index 2cd382e..203c50c 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -2395,50 +2395,110 @@ static void unmap_page(struct device *dev, dma_addr_t dma_addr, size_t size,
>   __unmap_single(domain->priv, dma_addr, size, dir);
>  }
>  
> +static int sg_num_pages(struct device *dev,
> + struct scatterlist *sglist,
> + int nelems)
> +{
> + unsigned long mask, boundary_size;
> + struct scatterlist *s;
> + int i, npages = 0;
> +
> + mask  = dma_get_seg_boundary(dev);
> + boundary_size = mask + 1 ? ALIGN(mask + 1, PAGE_SIZE) >> PAGE_SHIFT :
> +1UL << (BITS_PER_LONG - PAGE_SHIFT);

(mask >> PAGE_SHIFT) + 1 ?
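
To spell out that suggestion with a worked check (my arithmetic,
assuming 4K pages):

	/* mask = 0xffff, PAGE_SHIFT = 12:
	 *   ALIGN(mask + 1, PAGE_SIZE) >> PAGE_SHIFT = 0x10000 >> 12 = 16
	 *   (mask >> PAGE_SHIFT) + 1                 = 0xf + 1       = 16
	 * For mask = ~0UL, mask + 1 wraps to 0 and forces the fallback
	 * arm, while (mask >> PAGE_SHIFT) + 1 stays well-defined. */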

> +
> + for_each_sg(sglist, s, nelems, i) {
> + int p, n;
> +
> + s->dma_address = npages << PAGE_SHIFT;
> + p = npages % boundary_size;
> + n = iommu_num_pages(sg_phys(s), s->length, PAGE_SIZE);
> + if (p + n > boundary_size)
> + npages += boundary_size - p;
> + npages += n;
> + }
> +
> + return npages;
> +}

Otherwise, this seems OK to me (as far as you value the judgement of
someone who took at least 3 tries to come up with a correct
implementation themselves...)
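
To make the padding concrete, a worked trace of sg_num_pages() above
on the sglist from [1] (64K boundary mask, so boundary_size = 16
pages; segments of 8K, 64K, 64K):

	/* i=0: p = 0, n = 2  -> no pad,            npages = 2
	 * i=1: p = 2, n = 16 -> pad 14 to realign, npages = 32
	 * i=2: p = 0, n = 16 -> no pad,            npages = 48
	 * 34 pages of data reserve 48 IOVA pages, and no 64K segment
	 * straddles a 64K boundary. */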

> +
>  /*
>   * The exported map_sg function for dma_ops (handles scatter-gather
>   * lists).
>   */
>  static int map_sg(struct device *dev, struct scatterlist *sglist,
> -   int nelems, enum dma_data_direction dir,
> +   int nelems, enum dma_data_direction direction,
> struct dma_attrs *attrs)
>  {
> + int mapped_pages = 0, npages = 0, prot = 0, i;
>   struct protection_domain *domain;
> - int i;
> + struct dma_ops_domain *dma_dom;
>   struct scatterlist *s;
> - phys_addr_t paddr;
> - int mapped_elems = 0;
> + unsigned 

Re: [PATCH v5 24/44] iommu: intel: dma-mapping: Use unsigned long for dma_attrs

2016-07-12 Thread Joerg Roedel
On Thu, Jun 30, 2016 at 10:25:51AM +0200, Krzysztof Kozlowski wrote:
> Split out subsystem-specific changes for easier review. This will be
> squashed into the main commit.
> 
> Signed-off-by: Krzysztof Kozlowski 
> ---
>  drivers/iommu/intel-iommu.c | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)

Acked-by: Joerg Roedel 


Re: [PATCH v5 17/44] iommu: dma-mapping: Use unsigned long for dma_attrs

2016-07-12 Thread Joerg Roedel
On Thu, Jun 30, 2016 at 10:25:44AM +0200, Krzysztof Kozlowski wrote:
> Split out subsystem-specific changes for easier review. This will be
> squashed into the main commit.
> 
> Signed-off-by: Krzysztof Kozlowski 
> ---
>  drivers/iommu/amd_iommu.c | 12 ++--
>  drivers/iommu/dma-iommu.c |  6 +++---
>  include/linux/dma-iommu.h |  6 +++---
>  3 files changed, 12 insertions(+), 12 deletions(-)

Acked-by: Joerg Roedel 



Re: [PATCH v5 00/44] dma-mapping: Use unsigned long for dma_attrs

2016-07-12 Thread Krzysztof Kozlowski
On 07/12/2016 02:16 PM, Daniel Vetter wrote:
> On Thu, Jun 30, 2016 at 10:23:39AM +0200, Krzysztof Kozlowski wrote:
>> Hi,
>>
>>
>> This is the fifth approach to replacing struct dma_attrs with unsigned
>> long.
>>
>> The main patch (1/44) doing the change is split into many subpatches
>> for easier review (2-42).  They should be squashed together when
>> applying.
> 
> For all the drm driver patches:
> 
> Acked-by: Daniel Vetter 
> 
> Should I pull these in through drm-misc, or do you prefer to merge them
> through a special topic branch (with everything else) instead on your own?
> -Daniel

Thanks. I saw today that Andrew Morton applied the patchset so I think
he will handle it.

Best regards,
Krzysztof



Re: [PATCH 07/20] iommu/amd: Remove special mapping code for dma_ops path

2016-07-12 Thread Joerg Roedel
Hi Robin,

On Tue, Jul 12, 2016 at 12:42:59PM +0100, Robin Murphy wrote:
> Ah, that's the angle I was missing, yes. So if, say, someone is mapping
> a page at IOVA 0x while someone else is mapping a page at 0x1000,
> there could still be a race between both callers writing the non-leaf
> PTEs, but it's benign since they'd be writing identical entries anyway.
> Seems reasonable to me (I assume in a similar map vs. unmap race, the
> unmapper would just be removing the leaf entry, rather than bothering to
> check for empty tables and tear down intermediate levels, so the same
> still applies).

The non-leaf PTE setup code checks for races with cmpxchg, so we are on
the safe side there. Two threads would write different entries there
because they are allocating different sub-pages, but as I said, this is
caught by the cmpxchg. At the PTE level this problem does not exist.
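
A hedged sketch of that lock-free install pattern (generic shape, not
the amd_iommu code; pte_to_table() is a hypothetical helper):

	/* Install a next-level table only if the slot is still empty; the
	 * loser of the race frees its allocation and reuses the winner's. */
	static u64 *install_level(u64 *slot, u64 *new_table, u64 new_pte)
	{
		u64 old = cmpxchg64(slot, 0ULL, new_pte);

		if (old != 0) {
			free_page((unsigned long)new_table);
			return pte_to_table(old);  /* hypothetical helper */
		}
		return new_table;
	}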


Joerg



Re: [PATCH 07/20] iommu/amd: Remove special mapping code for dma_ops path

2016-07-12 Thread Robin Murphy
On 12/07/16 12:08, Joerg Roedel wrote:
> Hi Robin,
> 
> On Tue, Jul 12, 2016 at 11:55:39AM +0100, Robin Murphy wrote:
>>> start = address;
>>> for (i = 0; i < pages; ++i) {
>>> -   ret = dma_ops_domain_map(dma_dom, start, paddr, dir);
>>> -   if (ret == DMA_ERROR_CODE)
>>> +   ret = iommu_map_page(&dma_dom->domain, start, paddr,
>>> +PAGE_SIZE, prot, GFP_ATOMIC);
>>
>> I see that amd_iommu_map/unmap() takes a lock around calling
>> iommu_map/unmap_page(), but we don't appear to do that here. That seems
>> to suggest that either one is unsafe or the other is unnecessary.
> 
> At this point no locking is required, because in this code path we know
> that we own the memory range and that nobody else is mapping that range.
> 
> In the IOMMU-API path we can't make that assumption, so locking is
> required there. Both code-path use different types of domains, so there
> is also no chance that a domain is used in both code-paths (except when
> a dma-ops domain is set up).

Ah, that's the angle I was missing, yes. So if, say, someone is mapping
a page at IOVA 0x while someone else is mapping a page at 0x1000,
there could still be a race between both callers writing the non-leaf
PTEs, but it's benign since they'd be writing identical entries anyway.
Seems reasonable to me (I assume in a similar map vs. unmap race, the
unmapper would just be removing the leaf entry, rather than bothering to
check for empty tables and tear down intermediate levels, so the same
still applies).

Robin.

> 
> 
>   Joerg
> 



Re: [PATCH 16/20] iommu/amd: Optimize map_sg and unmap_sg

2016-07-12 Thread Robin Murphy
On 08/07/16 12:45, Joerg Roedel wrote:
> From: Joerg Roedel 
> 
> Optimize these functions so that they need only one call
> into the address allocator. This also saves a couple of
> io-tlb flushes in the unmap_sg path.
> 
> Signed-off-by: Joerg Roedel 
> ---
>  drivers/iommu/amd_iommu.c | 77 ---
>  1 file changed, 52 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index 78b278b..e5f8e7f 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -2409,45 +2409,70 @@ static void unmap_page(struct device *dev, dma_addr_t dma_addr, size_t size,
>   * lists).
>   */
>  static int map_sg(struct device *dev, struct scatterlist *sglist,
> -   int nelems, enum dma_data_direction dir,
> +   int nelems, enum dma_data_direction direction,
> struct dma_attrs *attrs)
>  {
> + int mapped_pages = 0, npages = 0, prot = 0, i;
> + unsigned long start_addr, address;
>   struct protection_domain *domain;
> - int i;
> + struct dma_ops_domain *dma_dom;
>   struct scatterlist *s;
> - phys_addr_t paddr;
> - int mapped_elems = 0;
>   u64 dma_mask;
>  
>   domain = get_domain(dev);
>   if (IS_ERR(domain))
>   return 0;
>  
> + dma_dom  = domain->priv;
>   dma_mask = *dev->dma_mask;
>  
> + for_each_sg(sglist, s, nelems, i)
> + npages += iommu_num_pages(sg_phys(s), s->length, PAGE_SIZE);

This fails to account for the segment boundary mask[1]. Given a typical
sglist from the block layer where the boundary mask is 64K, the first
segment is 8k long, and subsequent segments are 64K long, those
subsequent segments will end up with misaligned addresses which certain
hardware may object to.

> + address = dma_ops_alloc_iova(dev, dma_dom, npages, dma_mask);

Since a typical dma_map_sg() call is likely to involve >128K worth of
data, I wonder if it's worth going directly to a slow-path IOVA
allocation...

> + if (address == DMA_ERROR_CODE)
> + goto out_err;
> +
> + start_addr = address;
> + prot   = dir2prot(direction);
> +
>   for_each_sg(sglist, s, nelems, i) {
> - paddr = sg_phys(s);
> + int j, pages = iommu_num_pages(sg_phys(s), s->length, PAGE_SIZE);
> +
> + for (j = 0; j < pages; ++j) {
> + unsigned long bus_addr, phys_addr;
> + int ret;
>  
> - s->dma_address = __map_single(dev, domain->priv,
> -   paddr, s->length, dir, dma_mask);
> + bus_addr  = address + (j << PAGE_SHIFT);
> + phys_addr = (sg_phys(s) & PAGE_MASK) + (j << PAGE_SHIFT);
> + ret = iommu_map_page(domain, bus_addr, phys_addr, PAGE_SIZE, prot, GFP_ATOMIC);
> + if (ret)
> + goto out_unmap;
> +
> + mapped_pages += 1;
> + }
>  
> - if (s->dma_address) {
> - s->dma_length = s->length;
> - mapped_elems++;
> - } else
> - goto unmap;
> + s->dma_address = address + s->offset;
> + s->dma_length  = s->length;
> + address += pages << PAGE_SHIFT;
>   }
>  
> - return mapped_elems;
> + return nelems;
>  
> -unmap:
> - for_each_sg(sglist, s, mapped_elems, i) {
> - if (s->dma_address)
> - __unmap_single(domain->priv, s->dma_address,
> -s->dma_length, dir);
> - s->dma_address = s->dma_length = 0;
> +
> +out_unmap:
> + pr_err("%s: IOMMU mapping error in map_sg (io-pages: %d)\n",
> +dev_name(dev), npages);
> +
> + for (i = 0; i < mapped_pages; ++i) {
> + iommu_unmap_page(domain,
> +  start_addr + (i << PAGE_SHIFT),
> +  PAGE_SIZE);
>   }
>  
> + free_iova_fast(&dma_dom->iovad, start_addr, npages);
> +
> +out_err:
>   return 0;
>  }
>  
> @@ -2460,18 +2485,20 @@ static void unmap_sg(struct device *dev, struct scatterlist *sglist,
>struct dma_attrs *attrs)
>  {
>   struct protection_domain *domain;
> + unsigned long startaddr;
>   struct scatterlist *s;
> - int i;
> + int i,npages = 0;
>  
>   domain = get_domain(dev);
>   if (IS_ERR(domain))
>   return;
>  
> - for_each_sg(sglist, s, nelems, i) {
> - __unmap_single(domain->priv, s->dma_address,
> -s->dma_length, dir);
> - s->dma_address = s->dma_length = 0;
> - }
> + for_each_sg(sglist, s, nelems, i)
> + npages += iommu_num_pages(sg_phys(s), s->length, PAGE_SIZE);

...which would also then allow this to be further 

Re: [GIT PULL] iommu/arm-smmu: Updates for 4.8

2016-07-12 Thread Joerg Roedel
On Fri, Jul 08, 2016 at 04:59:46PM +0100, Will Deacon wrote:
> The following changes since commit af8c34ce6ae32addda3788d54a7e340cad22516b:
> 
>   Linux 4.7-rc2 (2016-06-05 14:31:26 -0700)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git 
> for-joerg/arm-smmu/updates

Pulled, thanks Will.



Re: [PATCH 07/20] iommu/amd: Remove special mapping code for dma_ops path

2016-07-12 Thread Joerg Roedel
Hi Robin,

On Tue, Jul 12, 2016 at 11:55:39AM +0100, Robin Murphy wrote:
> > start = address;
> > for (i = 0; i < pages; ++i) {
> > -   ret = dma_ops_domain_map(dma_dom, start, paddr, dir);
> > -   if (ret == DMA_ERROR_CODE)
> > +   ret = iommu_map_page(&dma_dom->domain, start, paddr,
> > +PAGE_SIZE, prot, GFP_ATOMIC);
> 
> I see that amd_iommu_map/unmap() takes a lock around calling
> iommu_map/unmap_page(), but we don't appear to do that here. That seems
> to suggest that either one is unsafe or the other is unnecessary.

At this point no locking is required, because in this code path we know
that we own the memory range and that nobody else is mapping that range.

In the IOMMU-API path we can't make that assumption, so locking is
required there. Both code-path use different types of domains, so there
is also no chance that a domain is used in both code-paths (except when
a dma-ops domain is set up).
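
A hedged sketch of that asymmetry (shapes simplified from this series,
not verbatim driver code):

	/* IOMMU-API path: callers may race on the same ranges, so serialize. */
	static int api_map(struct protection_domain *domain, unsigned long iova,
			   phys_addr_t paddr, size_t size, int prot)
	{
		int ret;

		mutex_lock(&domain->api_lock);
		ret = iommu_map_page(domain, iova, paddr, size, prot, GFP_KERNEL);
		mutex_unlock(&domain->api_lock);
		return ret;
	}

	/* dma-ops path: the IOVA allocator handed this range out exclusively,
	 * so the leaf-PTE write needs no lock. */
	static int dma_ops_map(struct protection_domain *domain,
			       unsigned long iova, phys_addr_t paddr, int prot)
	{
		return iommu_map_page(domain, iova, paddr, PAGE_SIZE, prot,
				      GFP_ATOMIC);
	}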


Joerg



Re: [PATCH 00/20] iommu/amd: Use generic IOVA allocator

2016-07-12 Thread Joerg Roedel
Hey Vincent,

On Tue, Jul 12, 2016 at 05:03:08PM +0800, Wan Zongshun wrote:
> Currently, those patches can not work at my eCarrizo board.
> When I merged your patches, boot failed, and no any info print to me.
> I set iommu=pt, it also does not work; set iommu=soft, boot ok.
> 
> When I removed those patches, kernel boot ok.
> 
> This eCarrizo board uart doesnot work, so I can not get useful
> information from kernel by uart console, I also set 'text' in boot
> option, but still cannot print any log.

Thanks for your testing. The issue turned out to be an older bug, which
just got uncovered by these patches. I updated the branch at

git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux.git 
amd-iommu-iova

This branch boots now on my Kaveri and Carrizo system. Can you please
give it a test too?

Thanks,

Joerg



Re: [PATCH] iommu: arm-smmu: drop devm_free_irq when driver detach

2016-07-12 Thread Robin Murphy
On 12/07/16 02:33, Peng Fan wrote:
> There is no need to call devm_free_irq when the driver detaches:
> devres_release_all, which is called after 'drv->remove', will
> release all managed resources.
> 
> Signed-off-by: Peng Fan 
> Cc: Will Deacon 
> Cc: Robin Murphy 

drivers/iommu/arm-smmu.c: In function 'arm_smmu_device_remove':
drivers/iommu/arm-smmu.c:2021:6: warning: unused variable 'i'
[-Wunused-variable]
  int i;
  ^

With that addressed:

Reviewed-by: Robin Murphy 

> ---
>  drivers/iommu/arm-smmu.c | 3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 860652e..5837391c 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -2045,9 +2045,6 @@ static int arm_smmu_device_remove(struct platform_device *pdev)
>   if (!bitmap_empty(smmu->context_map, ARM_SMMU_MAX_CBS))
>   dev_err(dev, "removing device with active domains!\n");
>  
> - for (i = 0; i < smmu->num_global_irqs; ++i)
> - devm_free_irq(smmu->dev, smmu->irqs[i], smmu);
> -
>   /* Turn the thing off */
>   writel(sCR0_CLIENTPD, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0);
>   return 0;
> 



Re: [PATCH] iommu/arm-smmu-v3: limit use of 2-level stream tables

2016-07-12 Thread Robin Murphy
On 11/07/16 19:00, Nate Watterson wrote:
> In the current arm-smmu-v3 driver, all SMMUs that support 2-level
> stream tables are forced to use them. This is suboptimal for SMMUs
> that support fewer stream ID bits than would fill a single
> second-level table. This patch limits the use of 2-level tables to
> SMMUs that both support the feature and whose first-level table can
> possibly contain more than a single entry.

Makes sense to me, in principle.

> Signed-off-by: Nate Watterson 
> ---
>  drivers/iommu/arm-smmu-v3.c | 11 +++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index 5f6b3bc..742254c 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -2531,6 +2531,17 @@ static int arm_smmu_device_probe(struct arm_smmu_device *smmu)
>   smmu->ssid_bits = reg >> IDR1_SSID_SHIFT & IDR1_SSID_MASK;
>   smmu->sid_bits = reg >> IDR1_SID_SHIFT & IDR1_SID_MASK;
>  
> + /*
> +  * If the SMMU supports fewer bits than would fill a single L2 stream
> +  * table, use a linear table instead.
> +  */
> + if (smmu->sid_bits <= STRTAB_SPLIT &&
> + smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB) {
> + smmu->features &= ~ARM_SMMU_FEAT_2_LVL_STRTAB;
> + dev_info(smmu->dev, "SIDSIZE (%d) <= STRTAB_SPLIT (%d) : disabling 2-level stream tables\n",
> +  smmu->sid_bits, STRTAB_SPLIT);

There's no useful reason to squawk about this; it's just noise.

Whatever old version of the spec I have here would appear to agree: "In
all cases, aside from the lookup of the STE itself, the choice of Stream
Table format is irrelevant to any other SMMU operation."

> + }
> +
>   /* IDR5 */
>   reg = readl_relaxed(smmu->base + ARM_SMMU_IDR5);

I think this now leaves some of the logic in arm_smmu_init_strtab_2lvl()
redundant, so it would probably be worth tidying that up at the same time.

Robin.