Re: IOVA allocation dependency between firmware buffer and remaining buffers

2020-09-28 Thread Marek Szyprowski
Hi Robin,

On 24.09.2020 13:06, Robin Murphy wrote:
> On 2020-09-24 11:47, Marek Szyprowski wrote:
>> On 24.09.2020 12:40, Robin Murphy wrote:
>>> On 2020-09-24 11:16, Thierry Reding wrote:
>>>> On Thu, Sep 24, 2020 at 10:46:46AM +0200, Marek Szyprowski wrote:
>>>>> On 24.09.2020 10:28, Joerg Roedel wrote:
>>>>>> On Wed, Sep 23, 2020 at 08:48:26AM +0200, Marek Szyprowski wrote:
>>>>>>> It allows remapping a given buffer at a specific IOVA address,
>>>>>>> although it doesn't guarantee that those specific addresses won't
>>>>>>> later be used by the IOVA allocator. It would probably make sense
>>>>>>> to add an API to the generic IOMMU-DMA framework to mark a given
>>>>>>> IOVA range as reserved/unused in order to protect it.
>>>>>> There is an API for that: the IOMMU driver can return IOVA reserved
>>>>>> regions per device, and the IOMMU core code will take care of mapping
>>>>>> these regions and reserving them in the IOVA allocator, so that the
>>>>>> DMA-IOMMU code will not use them for allocations.
>>>>>>
>>>>>> Have a look at iommu_ops->get_resv_regions() and
>>>>>> iommu_ops->put_resv_regions().
>>>>>
>>>>> I know about the reserved regions IOMMU API, but the main problem here,
>>>>> in the case of Exynos, is that those reserved regions won't be created
>>>>> by the IOMMU driver but by the IOMMU client device. It is just a result
>>>>> of how the media drivers manage their IOVA space: they simply have to
>>>>> load firmware at an IOVA address lower than any address of the buffers
>>>>> they use.
>>>>
>>>> I've been working on adding a way to automatically add direct mappings
>>>> using reserved-memory regions parsed from device tree, see:
>>>>
>>>> https://lore.kernel.org/lkml/2020090413.691933-1-thierry.red...@gmail.com/
>>>>
>>>> Perhaps this can be of use? With that you should be able to add a
>>>> reserved-memory region somewhere in the lower range that you need for
>>>> firmware images and have that automatically added as a direct mapping
>>>> so that it won't be reused later on for dynamic allocations.
>>>
>>> It can't easily be a *direct* mapping though - if the driver has to
>>> use the DMA masks to ensure that everything stays within the
>>> addressable range, then (as far as I'm aware) there's no physical
>>> memory that low down to equal the DMA addresses.
>>>
>>> TBH I'm not convinced that this is a sufficiently common concern to
>>> justify new APIs, or even to try to make overly generic. I think just
>>> implementing a new DMA attribute to say "please allocate/map this
>>> particular request at the lowest DMA address possible" would be good
>>> enough. Such a thing could also serve PCI drivers that actually care
>>> about SAC/DAC to give us more of a chance of removing the "try a
>>> 32-bit mask first" trick from everyone's hotpath...
>>
>> Hmm, I like the idea of such a DMA attribute! It should make things really
>> simple, especially in the drivers. Thanks for the great idea! I will try
>> to implement it then, instead of the workarounds I've proposed in the
>> s5p-mfc/exynos4-is drivers.
>
> Right, I think it's fair to draw a line and say that anyone who wants 
> a *specific* address needs to manage their own IOMMU domain.
>
> In the backend I suspect it's going to be cleanest to implement a 
> dedicated iova_alloc_low() (or similar) function in the IOVA API that 
> sidesteps all of the existing allocation paths and goes straight to 
> the rbtree.

Just for the record - I've implemented this approach here: 
https://lore.kernel.org/lkml/20200925141218.13550-1-m.szyprow...@samsung.com/
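
For context, the driver-side usage such an attribute enables would be roughly
the sketch below (DMA_ATTR_LOW_ADDRESS is the name proposed in that series;
treat the snippet as illustrative rather than final API):

	/* Allocate the firmware buffer first, asking for the lowest IOVA. */
	fw_virt = dma_alloc_attrs(dev, fw_size, &fw_dma, GFP_KERNEL,
				  DMA_ATTR_LOW_ADDRESS);
	if (!fw_virt)
		return -ENOMEM;

	/* With a top-down IOVA allocator, all later buffers then end up
	 * at addresses above the firmware buffer. */
	buf_virt = dma_alloc_attrs(dev, buf_size, &buf_dma, GFP_KERNEL, 0);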

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland



[PATCH] iommu/arm-smmu-v3: Add rmb after reading event queue prod_reg

2020-09-28 Thread Zhou Wang
In arm_smmu_evtq_thread(), the event queue is read via the consumer
pointer, which has no address dependency on the producer pointer, so the
prod_reg read (MMIO) and the event queue reads (Normal memory) can be
reordered. The loads from the event queue can therefore be done before
the load of prod_reg, and a wrong event entry value may be observed.

Add an rmb to make sure a correct event queue entry value is read.

Signed-off-by: Zhou Wang 
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index c192544..93c9077 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -819,6 +819,9 @@ static int queue_sync_prod_in(struct arm_smmu_queue *q)
int ret = 0;
u32 prod = readl_relaxed(q->prod_reg);
 
+   /* Ensure that reading event queue is after reading prod_reg */
+   rmb();
+
if (Q_OVF(prod) != Q_OVF(q->llq.prod))
ret = -EOVERFLOW;
 
-- 
2.8.1
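
For illustration, the hazard the patch addresses follows the classic
message-passing pattern below (a simplified sketch with illustrative field
names, not the driver code itself):

	u32 prod = readl_relaxed(q->prod_reg);	/* 1: load producer index (MMIO) */

	rmb();					/* order 1 before 2 */

	while (q->cons != prod) {
		process_event(&q->base[q->cons % q->nents]);	/* 2: load entry */
		q->cons++;
	}

Without the barrier, the entry loads in step 2 may be satisfied before the
prod_reg load in step 1, i.e. with stale data from before the producer
published the slot.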



Re: [PATCHv5 5/6] iommu: arm-smmu-impl: Use table to list QCOM implementations

2020-09-28 Thread Sai Prakash Ranjan

On 2020-09-23 20:54, Robin Murphy wrote:

On 2020-09-22 07:18, Sai Prakash Ranjan wrote:

Use a table and of_match_node() to match QCOM implementations
instead of multiple of_device_is_compatible() calls for each
QCOM SMMU implementation.

Signed-off-by: Sai Prakash Ranjan 
---
  drivers/iommu/arm/arm-smmu/arm-smmu-impl.c | 12 
  1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c

index d199b4bff15d..ce78295cfa78 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
@@ -9,6 +9,13 @@
#include "arm-smmu.h"
+static const struct of_device_id __maybe_unused qcom_smmu_impl_of_match[] = {

+   { .compatible = "qcom,sc7180-smmu-500" },
+   { .compatible = "qcom,sdm845-smmu-500" },
+   { .compatible = "qcom,sm8150-smmu-500" },
+   { .compatible = "qcom,sm8250-smmu-500" },
+   { }
+};


Can you push the table itself into arm-smmu-qcom? That way you'll be
free to add new SoCs willy-nilly without any possibility of
conflicting with anything else.

Bonus points if you can fold in the Adreno variant and keep everything
together ;)
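
The suggested shape would be roughly the following sketch, with the table
private to arm-smmu-qcom.c (helper names such as qcom_smmu_create() are
illustrative):

	/* arm-smmu-qcom.c */
	static const struct of_device_id qcom_smmu_impl_of_match[] = {
		{ .compatible = "qcom,sc7180-smmu-500" },
		{ .compatible = "qcom,sdm845-smmu-500" },
		{ .compatible = "qcom,sm8150-smmu-500" },
		{ .compatible = "qcom,sm8250-smmu-500" },
		{ } /* new SoCs go here without conflicting with arm-smmu-impl.c */
	};

	struct arm_smmu_device *qcom_smmu_impl_init(struct arm_smmu_device *smmu)
	{
		if (of_match_node(qcom_smmu_impl_of_match, smmu->dev->of_node))
			return qcom_smmu_create(smmu, &qcom_smmu_impl);

		return smmu;
	}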



Sure I can get bonus points :)

Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH 05/18] ARM/dma-mapping: Switch to iommu_dma_ops

2020-09-28 Thread Marek Szyprowski
Hi Robin,

On 20.08.2020 17:08, Robin Murphy wrote:
> With the IOMMU ops now looking much the same shape as iommu_dma_ops,
> switch them out in favour of the iommu-dma library, currently enhanced
> with temporary workarounds that allow it to also sit underneath the
> arch-specific API. With that in place, we can now start converting the
> remaining IOMMU drivers and consumers to work with IOMMU API default
> domains instead.
>
> Signed-off-by: Robin Murphy 

I've played a bit longer with this and found that reading the kernel
virtual address of the buffers allocated via dma_alloc_attrs() from the
dma-iommu ops returns garbage from time to time. It took me a while to
debug this...

Your conversion misses adding arch_dma_prep_coherent() to arch/arm, so
the buffers are cleared by the mm allocator, but the caches are NOT
flushed for the newly allocated buffers.

This fixes the issue:

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index fec3e59215b8..8b60bcc5b14f 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -2,6 +2,7 @@
  config ARM
     bool
     default y
+   select ARCH_HAS_DMA_PREP_COHERENT
     select ARCH_32BIT_OFF_T
     select ARCH_HAS_BINFMT_FLAT
     select ARCH_HAS_DEBUG_VIRTUAL if MMU
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index ff6c4962161a..6954681b73da 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -266,6 +266,20 @@ static void __dma_clear_buffer(struct page *page, size_t size, int coherent_flag
     }
  }

+void arch_dma_prep_coherent(struct page *page, size_t size)
+{
+
+	if (PageHighMem(page)) {
+		phys_addr_t base = __pfn_to_phys(page_to_pfn(page));
+		phys_addr_t end = base + size;
+		outer_flush_range(base, end);
+	} else {
+		void *ptr = page_address(page);
+		dmac_flush_range(ptr, ptr + size);
+		outer_flush_range(__pa(ptr), __pa(ptr) + size);
+	}
+}
+
  /*
   * Allocate a DMA buffer for 'dev' of size 'size' using the
   * specified gfp mask.  Note that 'size' must be page aligned.

I also wonder if it would be better to use per-arch __dma_clear_buffer() 
instead of setting __GFP_ZERO unconditionally in dma-iommu.c. This 
should be faster on ARM with highmem...

 > ...

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


[PATCH v3 1/2] iommu/iova: Retry from last rb tree node if iova search fails

2020-09-28 Thread vjitta
From: Vijayanand Jitta 

Whenever a new iova alloc request comes in, the iova is always searched
from the cached node and the nodes which are previous to the cached
node. So, even if there is free iova space available in the nodes which
are next to the cached node, iova allocation can still fail because of
this approach.

Consider the following sequence of iova allocs and frees on a
1GB iova space:

1) alloc - 500MB
2) alloc - 12MB
3) alloc - 499MB
4) free  - 12MB which was allocated in step 2
5) alloc - 13MB

After the above sequence we will have 12MB of free iova space, and the
cached node will be pointing to the iova pfn of the last alloc of 13MB,
which will be the lowest iova pfn of that iova space. Now if we get an
alloc request of 2MB, we just search from the cached node and then look
for lower iova pfns for free iova, and as there aren't any, the iova
alloc fails even though there is 12MB of free iova space.

To avoid such iova search failures, do a retry from the last rb tree
node when the iova search fails; this will search the entire tree and
get an iova if it is available.

Signed-off-by: Vijayanand Jitta 
---
 drivers/iommu/iova.c | 23 +--
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 30d969a..acc8bee 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -184,8 +184,9 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
struct rb_node *curr, *prev;
struct iova *curr_iova;
unsigned long flags;
-   unsigned long new_pfn;
+   unsigned long new_pfn, low_pfn_new;
unsigned long align_mask = ~0UL;
+   unsigned long high_pfn = limit_pfn, low_pfn = iovad->start_pfn;
 
if (size_aligned)
align_mask <<= fls_long(size - 1);
@@ -198,15 +199,25 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
 
curr = __get_cached_rbnode(iovad, limit_pfn);
curr_iova = rb_entry(curr, struct iova, node);
+   low_pfn_new = curr_iova->pfn_hi + 1;
+
+retry:
do {
-   limit_pfn = min(limit_pfn, curr_iova->pfn_lo);
-   new_pfn = (limit_pfn - size) & align_mask;
+   high_pfn = min(high_pfn, curr_iova->pfn_lo);
+   new_pfn = (high_pfn - size) & align_mask;
prev = curr;
curr = rb_prev(curr);
curr_iova = rb_entry(curr, struct iova, node);
-   } while (curr && new_pfn <= curr_iova->pfn_hi);
-
-   if (limit_pfn < size || new_pfn < iovad->start_pfn) {
+   } while (curr && new_pfn <= curr_iova->pfn_hi && new_pfn >= low_pfn);
+
+   if (high_pfn < size || new_pfn < low_pfn) {
+   if (low_pfn == iovad->start_pfn && low_pfn_new < limit_pfn) {
+   high_pfn = limit_pfn;
+   low_pfn = low_pfn_new;
+   curr = &iovad->anchor.node;
+   curr_iova = rb_entry(curr, struct iova, node);
+   goto retry;
+   }
iovad->max32_alloc_size = size;
goto iova32_full;
}
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of 
Code Aurora Forum, hosted by The Linux Foundation
2.7.4



[PATCH v3 2/2] iommu/iova: Free global iova rcache on iova alloc failure

2020-09-28 Thread vjitta
From: Vijayanand Jitta 

Whenever an iova alloc request fails, we free the iova
ranges present in the percpu iova rcaches and then retry,
but the global iova rcache is not freed; as a result we
could still see iova alloc failures even after the retry,
as the global rcache is holding the iovas, which can cause
fragmentation. So, free the global iova rcache as well and
then go for the retry.

Signed-off-by: Vijayanand Jitta 
---
 drivers/iommu/iova.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index acc8bee..b4f04fd 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -442,6 +442,7 @@ alloc_iova_fast(struct iova_domain *iovad, unsigned long size,
flush_rcache = false;
for_each_online_cpu(cpu)
free_cpu_cached_iovas(cpu, iovad);
+   free_global_cached_iovas(iovad);
goto retry;
}
 
@@ -1057,5 +1058,26 @@ void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain *iovad)
}
 }
 
+/*
+ * free all the IOVA ranges of global cache
+ */
+static void free_global_cached_iovas(struct iova_domain *iovad)
+{
+   struct iova_rcache *rcache;
+   unsigned long flags;
+   int i, j;
+
+   for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) {
+   rcache = &iovad->rcaches[i];
+   spin_lock_irqsave(&rcache->lock, flags);
+   for (j = 0; j < rcache->depot_size; ++j) {
+   iova_magazine_free_pfns(rcache->depot[j], iovad);
+   iova_magazine_free(rcache->depot[j]);
+   rcache->depot[j] = NULL;
+   }
+   rcache->depot_size = 0;
+   spin_unlock_irqrestore(&rcache->lock, flags);
+   }
+}
 MODULE_AUTHOR("Anil S Keshavamurthy ");
 MODULE_LICENSE("GPL");
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of 
Code Aurora Forum, hosted by The Linux Foundation
2.7.4



Re: [PATCHv5 4/6] drm/msm/a6xx: Add support for using system cache(LLC)

2020-09-28 Thread Sai Prakash Ranjan

Hi Jordan,

On 2020-09-23 20:33, Jordan Crouse wrote:

On Tue, Sep 22, 2020 at 11:48:17AM +0530, Sai Prakash Ranjan wrote:

From: Sharat Masetty 

The last level system cache can be partitioned into 32 different
slices, of which the GPU has two slices preallocated. One slice is
used for caching GPU buffers and the other slice is used for
caching the GPU SMMU pagetables. This talks to the core system
cache driver to acquire the slice handles, configure the SCIDs for
those slices, and activate and deactivate the slices upon GPU power
collapse and restore.

Some support from the IOMMU driver is also needed to set the right
TCR attributes so that the system cache can be used. The GPU then
has the ability to override a few cacheability parameters, which it
uses to change write-allocate to write-no-allocate, as the GPU
hardware does not benefit much from it.

DOMAIN_ATTR_SYS_CACHE is another domain-level attribute used by the
IOMMU driver to set the right attributes to cache the hardware
pagetables in the system cache.

Signed-off-by: Sharat Masetty 
[saiprakash.ranjan: fix to set attr before device attach to iommu and 
rebase]

Signed-off-by: Sai Prakash Ranjan 
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 83 +

 drivers/gpu/drm/msm/adreno/a6xx_gpu.h   |  4 ++
 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 17 +
 3 files changed, 104 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c

index 8915882e..151190ff62f7 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -8,7 +8,9 @@
 #include "a6xx_gpu.h"
 #include "a6xx_gmu.xml.h"

+#include 
 #include 
+#include 

 #define GPU_PAS_ID 13

@@ -1022,6 +1024,79 @@ static irqreturn_t a6xx_irq(struct msm_gpu *gpu)

return IRQ_HANDLED;
 }

+static void a6xx_llc_rmw(struct a6xx_gpu *a6xx_gpu, u32 reg, u32 mask, u32 or)

+{
+   return msm_rmw(a6xx_gpu->llc_mmio + (reg << 2), mask, or);
+}
+
+static void a6xx_llc_write(struct a6xx_gpu *a6xx_gpu, u32 reg, u32 value)

+{
+   return msm_writel(value, a6xx_gpu->llc_mmio + (reg << 2));
+}
+
+static void a6xx_llc_deactivate(struct a6xx_gpu *a6xx_gpu)
+{
+   llcc_slice_deactivate(a6xx_gpu->llc_slice);
+   llcc_slice_deactivate(a6xx_gpu->htw_llc_slice);
+}
+
+static void a6xx_llc_activate(struct a6xx_gpu *a6xx_gpu)
+{
+   u32 cntl1_regval = 0;
+
+   if (IS_ERR(a6xx_gpu->llc_mmio))
+   return;
+
+   if (!llcc_slice_activate(a6xx_gpu->llc_slice)) {
+   u32 gpu_scid = llcc_get_slice_id(a6xx_gpu->llc_slice);
+
+   gpu_scid &= 0x1f;
+		cntl1_regval = (gpu_scid << 0) | (gpu_scid << 5) | (gpu_scid << 10) |

+  (gpu_scid << 15) | (gpu_scid << 20);
+   }
+
+   if (!llcc_slice_activate(a6xx_gpu->htw_llc_slice)) {
+   u32 gpuhtw_scid = llcc_get_slice_id(a6xx_gpu->htw_llc_slice);
+
+   gpuhtw_scid &= 0x1f;
+   cntl1_regval |= FIELD_PREP(GENMASK(29, 25), gpuhtw_scid);
+   }
+
+   if (cntl1_regval) {
+   /*
+* Program the slice IDs for the various GPU blocks and GPU MMU
+* pagetables
+*/
+		a6xx_llc_write(a6xx_gpu, REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_1, cntl1_regval);

+
+   /*
+* Program cacheability overrides to not allocate cache lines on
+* a write miss
+*/
+		a6xx_llc_rmw(a6xx_gpu, REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_0, 0xF, 0x03);

+   }
+}


This code has been around long enough that it pre-dates a650. On a650 and
other MMU-500 targets the htw_llc is configured by the firmware and the
llc_slice is configured in a different register.

I don't think we need to pause everything and add support for the MMU-500
path, but we do need a way to disallow LLCC on affected targets until such
time that we can get it fixed up.



Thanks for taking a close look. Does something like below look OK, or is
something else needed here?

+ /* Until we get LLCC support for A650 */
+ if (!(info && info->revn == 650))
+ a6xx_llc_slices_init(pdev, a6xx_gpu);
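
An equivalent guard could reuse the existing adreno_is_a650() helper, along
these lines (a sketch, assuming an adreno_gpu pointer is in scope at the
init site):

	/* Skip LLCC init on A650 until the MMU-500 path is supported. */
	if (!adreno_is_a650(adreno_gpu))
		a6xx_llc_slices_init(pdev, a6xx_gpu);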

Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH v4 0/7] Convert the intel iommu driver to the dma-iommu api

2020-09-28 Thread Tvrtko Ursulin



On 27/09/2020 07:34, Lu Baolu wrote:

Hi,

The previous post of this series can be found here.

https://lore.kernel.org/linux-iommu/20200912032200.11489-1-baolu...@linux.intel.com/

This version introduces a new patch [4/7] to fix an issue reported here.

https://lore.kernel.org/linux-iommu/51a1baec-48d1-c0ac-181b-1fba92aa4...@linux.intel.com/

There aren't any other changes.

Please help to test and review.

Best regards,
baolu

Lu Baolu (3):
   iommu: Add quirk for Intel graphic devices in map_sg


Since I do have patches to fix i915 to handle this, do we want to
coordinate the two and avoid having to add this quirk and then remove it
later? Or do you want to go with the staged approach?


Regards,

Tvrtko


   iommu/vt-d: Update domain geometry in iommu_ops.at(de)tach_dev
   iommu/vt-d: Cleanup after converting to dma-iommu ops

Tom Murphy (4):
   iommu: Handle freelists when using deferred flushing in iommu drivers
   iommu: Add iommu_dma_free_cpu_cached_iovas()
   iommu: Allow the dma-iommu api to use bounce buffers
   iommu/vt-d: Convert intel iommu driver to the iommu ops

  .../admin-guide/kernel-parameters.txt |   5 -
  drivers/iommu/dma-iommu.c | 228 -
  drivers/iommu/intel/Kconfig   |   1 +
  drivers/iommu/intel/iommu.c   | 901 +++---
  include/linux/dma-iommu.h |   8 +
  include/linux/iommu.h |   1 +
  6 files changed, 336 insertions(+), 808 deletions(-)


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [patch V2 34/46] PCI/MSI: Make arch_.*_msi_irq[s] fallbacks selectable

2020-09-28 Thread Thomas Gleixner
On Sat, Sep 26 2020 at 14:38, Vasily Gorbik wrote:
> On Fri, Sep 25, 2020 at 09:54:52AM -0400, Qian Cai wrote:
> Yes, as well as on mips and sparc, which also don't select FORCE_PCI.
> This seems to work for s390:
>
> diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
> index b0b7acf07eb8..41136fbe909b 100644
> --- a/arch/s390/Kconfig
> +++ b/arch/s390/Kconfig
> @@ -192,3 +192,3 @@ config S390
> select PCI_MSI  if PCI
> -   select PCI_MSI_ARCH_FALLBACKS
> +   select PCI_MSI_ARCH_FALLBACKS   if PCI
> select SET_FS

lemme fix that for all of them ...


Re: [PATCH v2 2/2] iommu/iova: Free global iova rcache on iova alloc failure

2020-09-28 Thread Vijayanand Jitta


On 9/18/2020 8:11 PM, Robin Murphy wrote:
> On 2020-08-20 13:49, vji...@codeaurora.org wrote:
>> From: Vijayanand Jitta 
>>
>> Whenever an iova alloc request fails, we free the iova
>> ranges present in the percpu iova rcaches and then retry,
>> but the global iova rcache is not freed; as a result we
>> could still see iova alloc failures even after the retry,
>> as the global rcache is holding the iovas, which can cause
>> fragmentation. So, free the global iova rcache as well and
>> then go for the retry.
>>
>> Signed-off-by: Vijayanand Jitta 
>> ---
>>   drivers/iommu/iova.c | 23 +++
>>   include/linux/iova.h |  6 ++
>>   2 files changed, 29 insertions(+)
>>
>> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
>> index 4e77116..5836c87 100644
>> --- a/drivers/iommu/iova.c
>> +++ b/drivers/iommu/iova.c
>> @@ -442,6 +442,7 @@ struct iova *find_iova(struct iova_domain *iovad,
>> unsigned long pfn)
>>   flush_rcache = false;
>>   for_each_online_cpu(cpu)
>>   free_cpu_cached_iovas(cpu, iovad);
>> +    free_global_cached_iovas(iovad);
>>   goto retry;
>>   }
>>   @@ -1055,5 +1056,27 @@ void free_cpu_cached_iovas(unsigned int cpu,
>> struct iova_domain *iovad)
>>   }
>>   }
>>   +/*
>> + * free all the IOVA ranges of global cache
>> + */
>> +void free_global_cached_iovas(struct iova_domain *iovad)
> 
> As John pointed out last time, this should be static and the header
> changes dropped.
> 
> (TBH we should probably register our own hotplug notifier instance for a
> flush queue, so that external code has no need to poke at the per-CPU
> caches either)
> 
> Robin.
> 
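For reference, that notifier idea would look roughly like the sketch below
(the hotplug state and the hlist member are hypothetical names, not an
existing API):

	/* Flush-queue-owned callback; external code never touches the
	 * per-CPU caches directly. */
	static int iova_cpuhp_dead(unsigned int cpu, struct hlist_node *node)
	{
		struct iova_domain *iovad;

		iovad = hlist_entry_safe(node, struct iova_domain, cpuhp_dead);
		free_cpu_cached_iovas(cpu, iovad);
		return 0;
	}

	/* Registered once when the flush queue is initialised. */
	cpuhp_state_add_instance_nocalls(CPUHP_IOMMU_IOVA_DEAD,
					 &iovad->cpuhp_dead);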

Right, I have made it static and dropped header changes in v3.
can you please review that.

Thanks,
Vijay
>> +{
>> +    struct iova_rcache *rcache;
>> +    unsigned long flags;
>> +    int i, j;
>> +
>> +    for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) {
>> +    rcache = &iovad->rcaches[i];
>> +    spin_lock_irqsave(&rcache->lock, flags);
>> +    for (j = 0; j < rcache->depot_size; ++j) {
>> +    iova_magazine_free_pfns(rcache->depot[j], iovad);
>> +    iova_magazine_free(rcache->depot[j]);
>> +    rcache->depot[j] = NULL;
>> +    }
>> +    rcache->depot_size = 0;
>> +    spin_unlock_irqrestore(&rcache->lock, flags);
>> +    }
>> +}
>> +
>>   MODULE_AUTHOR("Anil S Keshavamurthy ");
>>   MODULE_LICENSE("GPL");
>> diff --git a/include/linux/iova.h b/include/linux/iova.h
>> index a0637ab..a905726 100644
>> --- a/include/linux/iova.h
>> +++ b/include/linux/iova.h
>> @@ -163,6 +163,7 @@ int init_iova_flush_queue(struct iova_domain *iovad,
>>   struct iova *split_and_remove_iova(struct iova_domain *iovad,
>>   struct iova *iova, unsigned long pfn_lo, unsigned long pfn_hi);
>>   void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain
>> *iovad);
>> +void free_global_cached_iovas(struct iova_domain *iovad);
>>   #else
>>   static inline int iova_cache_get(void)
>>   {
>> @@ -270,6 +271,11 @@ static inline void free_cpu_cached_iovas(unsigned
>> int cpu,
>>    struct iova_domain *iovad)
>>   {
>>   }
>> +
>> +static inline void free_global_cached_iovas(struct iova_domain *iovad)
>> +{
>> +}
>> +
>>   #endif
>>     #endif
>>

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member of Code Aurora Forum, hosted by The Linux Foundation

Re: [PATCHv5 4/6] drm/msm/a6xx: Add support for using system cache(LLC)

2020-09-28 Thread Jordan Crouse
On Mon, Sep 28, 2020 at 05:56:55PM +0530, Sai Prakash Ranjan wrote:
> Hi Jordan,
> 
> On 2020-09-23 20:33, Jordan Crouse wrote:
> >On Tue, Sep 22, 2020 at 11:48:17AM +0530, Sai Prakash Ranjan wrote:
> >>From: Sharat Masetty 
> >>
> >>The last level system cache can be partitioned to 32 different
> >>slices of which GPU has two slices preallocated. One slice is
> >>used for caching GPU buffers and the other slice is used for
> >>caching the GPU SMMU pagetables. This talks to the core system
> >>cache driver to acquire the slice handles, configure the SCID's
> >>to those slices and activates and deactivates the slices upon
> >>GPU power collapse and restore.
> >>
> >>Some support from the IOMMU driver is also needed to make use
> >>of the system cache to set the right TCR attributes. GPU then
> >>has the ability to override a few cacheability parameters which
> >>it does to override write-allocate to write-no-allocate as the
> >>GPU hardware does not benefit much from it.
> >>
> >>DOMAIN_ATTR_SYS_CACHE is another domain level attribute used by the
> >>IOMMU driver to set the right attributes to cache the hardware
> >>pagetables into the system cache.
> >>
> >>Signed-off-by: Sharat Masetty 
> >>[saiprakash.ranjan: fix to set attr before device attach to iommu and
> >>rebase]
> >>Signed-off-by: Sai Prakash Ranjan 
> >>---
> >> drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 83 +
> >> drivers/gpu/drm/msm/adreno/a6xx_gpu.h   |  4 ++
> >> drivers/gpu/drm/msm/adreno/adreno_gpu.c | 17 +
> >> 3 files changed, 104 insertions(+)
> >>
> >>diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> >>b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> >>index 8915882e..151190ff62f7 100644
> >>--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> >>+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> >>@@ -8,7 +8,9 @@
> >> #include "a6xx_gpu.h"
> >> #include "a6xx_gmu.xml.h"
> >>
> >>+#include 
> >> #include 
> >>+#include 
> >>
> >> #define GPU_PAS_ID 13
> >>
> >>@@ -1022,6 +1024,79 @@ static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
> >>return IRQ_HANDLED;
> >> }
> >>
> >>+static void a6xx_llc_rmw(struct a6xx_gpu *a6xx_gpu, u32 reg, u32 mask,
> >>u32 or)
> >>+{
> >>+   return msm_rmw(a6xx_gpu->llc_mmio + (reg << 2), mask, or);
> >>+}
> >>+
> >>+static void a6xx_llc_write(struct a6xx_gpu *a6xx_gpu, u32 reg, u32
> >>value)
> >>+{
> >>+   return msm_writel(value, a6xx_gpu->llc_mmio + (reg << 2));
> >>+}
> >>+
> >>+static void a6xx_llc_deactivate(struct a6xx_gpu *a6xx_gpu)
> >>+{
> >>+   llcc_slice_deactivate(a6xx_gpu->llc_slice);
> >>+   llcc_slice_deactivate(a6xx_gpu->htw_llc_slice);
> >>+}
> >>+
> >>+static void a6xx_llc_activate(struct a6xx_gpu *a6xx_gpu)
> >>+{
> >>+   u32 cntl1_regval = 0;
> >>+
> >>+   if (IS_ERR(a6xx_gpu->llc_mmio))
> >>+   return;
> >>+
> >>+   if (!llcc_slice_activate(a6xx_gpu->llc_slice)) {
> >>+   u32 gpu_scid = llcc_get_slice_id(a6xx_gpu->llc_slice);
> >>+
> >>+   gpu_scid &= 0x1f;
> >>+   cntl1_regval = (gpu_scid << 0) | (gpu_scid << 5) | (gpu_scid << 
> >>10) |
> >>+  (gpu_scid << 15) | (gpu_scid << 20);
> >>+   }
> >>+
> >>+   if (!llcc_slice_activate(a6xx_gpu->htw_llc_slice)) {
> >>+   u32 gpuhtw_scid = llcc_get_slice_id(a6xx_gpu->htw_llc_slice);
> >>+
> >>+   gpuhtw_scid &= 0x1f;
> >>+   cntl1_regval |= FIELD_PREP(GENMASK(29, 25), gpuhtw_scid);
> >>+   }
> >>+
> >>+   if (cntl1_regval) {
> >>+   /*
> >>+* Program the slice IDs for the various GPU blocks and GPU MMU
> >>+* pagetables
> >>+*/
> >>+   a6xx_llc_write(a6xx_gpu, REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_1,
> >>cntl1_regval);
> >>+
> >>+   /*
> >>+* Program cacheability overrides to not allocate cache lines on
> >>+* a write miss
> >>+*/
> >>+   a6xx_llc_rmw(a6xx_gpu, REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_0, 
> >>0xF,
> >>0x03);
> >>+   }
> >>+}
> >
> >This code has been around long enough that it pre-dates a650. On a650 and
> >other
> >MMU-500 targets the htw_llc is configured by the firmware and the
> >llc_slice is
> >configured in a different register.
> >
> >I don't think we need to pause everything and add support for the MMU-500
> >path,
> >but we do need a way to disallow LLCC on affected targets until such time
> >that
> >we can get it fixed up.
> >
> 
> Thanks for taking a close look, does something like below look ok or
> something
> else is needed here?
> 
> + /* Till the time we get in LLCC support for A650 */
> + if (!(info && info->revn == 650))
> + a6xx_llc_slices_init(pdev, a6xx_gpu);

It doesn't look like Rob picked this up for 5.10, so we have some time to do it
right.  Would you like me to give you an add-on patch for mmu-500 targets?

Jordan

> Thanks,
> Sai
> 
> -- 
> QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
> of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH v2 1/3] iommu/io-pgtable-arm: Support coherency for Mali LPAE

2020-09-28 Thread Will Deacon
On Tue, Sep 22, 2020 at 03:16:48PM +0100, Robin Murphy wrote:
> Midgard GPUs have ACE-Lite master interfaces which allow systems to
> integrate them in an I/O-coherent manner. It seems that from the GPU's
> viewpoint, the rest of the system is its outer shareable domain, and so
> even when snoop signals are wired up, they are only emitted for outer
> shareable accesses. As such, setting the TTBR_SHARE_OUTER bit does
> indeed get coherent pagetable walks working nicely for the coherent
> T620 in the Arm Juno SoC.
> 
> Reviewed-by: Steven Price 
> Tested-by: Neil Armstrong 
> Signed-off-by: Robin Murphy 
> ---
>  drivers/iommu/io-pgtable-arm.c | 11 ++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index dc7bcf858b6d..b4072a18e45d 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -440,7 +440,13 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
>   << ARM_LPAE_PTE_ATTRINDX_SHIFT);
>   }
>  
> - if (prot & IOMMU_CACHE)
> + /*
> +  * Also Mali has its own notions of shareability wherein its Inner
> +  * domain covers the cores within the GPU, and its Outer domain is
> +  * "outside the GPU" (i.e. either the Inner or System domain in CPU
> +  * terms, depending on coherency).
> +  */
> + if (prot & IOMMU_CACHE && data->iop.fmt != ARM_MALI_LPAE)
>   pte |= ARM_LPAE_PTE_SH_IS;
>   else
>   pte |= ARM_LPAE_PTE_SH_OS;
> @@ -1049,6 +1055,9 @@ arm_mali_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
>   cfg->arm_mali_lpae_cfg.transtab = virt_to_phys(data->pgd) |
> ARM_MALI_LPAE_TTBR_READ_INNER |
> ARM_MALI_LPAE_TTBR_ADRMODE_TABLE;
> + if (cfg->coherent_walk)
> + cfg->arm_mali_lpae_cfg.transtab |= ARM_MALI_LPAE_TTBR_SHARE_OUTER;
> +

Acked-by: Will Deacon 

I'm assuming I'm not the right person to merge this, and it needs to go
alongside the other patches in this series.

Will


Re: [PATCHv5 4/6] drm/msm/a6xx: Add support for using system cache(LLC)

2020-09-28 Thread Sai Prakash Ranjan

On 2020-09-28 21:41, Jordan Crouse wrote:

On Mon, Sep 28, 2020 at 05:56:55PM +0530, Sai Prakash Ranjan wrote:

Hi Jordan,

On 2020-09-23 20:33, Jordan Crouse wrote:
>On Tue, Sep 22, 2020 at 11:48:17AM +0530, Sai Prakash Ranjan wrote:
>>From: Sharat Masetty 
>>
>>The last level system cache can be partitioned to 32 different
>>slices of which GPU has two slices preallocated. One slice is
>>used for caching GPU buffers and the other slice is used for
>>caching the GPU SMMU pagetables. This talks to the core system
>>cache driver to acquire the slice handles, configure the SCID's
>>to those slices and activates and deactivates the slices upon
>>GPU power collapse and restore.
>>
>>Some support from the IOMMU driver is also needed to make use
>>of the system cache to set the right TCR attributes. GPU then
>>has the ability to override a few cacheability parameters which
>>it does to override write-allocate to write-no-allocate as the
>>GPU hardware does not benefit much from it.
>>
>>DOMAIN_ATTR_SYS_CACHE is another domain level attribute used by the
>>IOMMU driver to set the right attributes to cache the hardware
>>pagetables into the system cache.
>>
>>Signed-off-by: Sharat Masetty 
>>[saiprakash.ranjan: fix to set attr before device attach to iommu and
>>rebase]
>>Signed-off-by: Sai Prakash Ranjan 
>>---
>> drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 83 +
>> drivers/gpu/drm/msm/adreno/a6xx_gpu.h   |  4 ++
>> drivers/gpu/drm/msm/adreno/adreno_gpu.c | 17 +
>> 3 files changed, 104 insertions(+)
>>
>>diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>index 8915882e..151190ff62f7 100644
>>--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>@@ -8,7 +8,9 @@
>> #include "a6xx_gpu.h"
>> #include "a6xx_gmu.xml.h"
>>
>>+#include 
>> #include 
>>+#include 
>>
>> #define GPU_PAS_ID 13
>>
>>@@ -1022,6 +1024,79 @@ static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
>>return IRQ_HANDLED;
>> }
>>
>>+static void a6xx_llc_rmw(struct a6xx_gpu *a6xx_gpu, u32 reg, u32 mask,
>>u32 or)
>>+{
>>+   return msm_rmw(a6xx_gpu->llc_mmio + (reg << 2), mask, or);
>>+}
>>+
>>+static void a6xx_llc_write(struct a6xx_gpu *a6xx_gpu, u32 reg, u32
>>value)
>>+{
>>+   return msm_writel(value, a6xx_gpu->llc_mmio + (reg << 2));
>>+}
>>+
>>+static void a6xx_llc_deactivate(struct a6xx_gpu *a6xx_gpu)
>>+{
>>+   llcc_slice_deactivate(a6xx_gpu->llc_slice);
>>+   llcc_slice_deactivate(a6xx_gpu->htw_llc_slice);
>>+}
>>+
>>+static void a6xx_llc_activate(struct a6xx_gpu *a6xx_gpu)
>>+{
>>+   u32 cntl1_regval = 0;
>>+
>>+   if (IS_ERR(a6xx_gpu->llc_mmio))
>>+   return;
>>+
>>+   if (!llcc_slice_activate(a6xx_gpu->llc_slice)) {
>>+   u32 gpu_scid = llcc_get_slice_id(a6xx_gpu->llc_slice);
>>+
>>+   gpu_scid &= 0x1f;
>>+   cntl1_regval = (gpu_scid << 0) | (gpu_scid << 5) | (gpu_scid 
<< 10) |
>>+  (gpu_scid << 15) | (gpu_scid << 20);
>>+   }
>>+
>>+   if (!llcc_slice_activate(a6xx_gpu->htw_llc_slice)) {
>>+   u32 gpuhtw_scid = llcc_get_slice_id(a6xx_gpu->htw_llc_slice);
>>+
>>+   gpuhtw_scid &= 0x1f;
>>+   cntl1_regval |= FIELD_PREP(GENMASK(29, 25), gpuhtw_scid);
>>+   }
>>+
>>+   if (cntl1_regval) {
>>+   /*
>>+* Program the slice IDs for the various GPU blocks and GPU 
MMU
>>+* pagetables
>>+*/
>>+   a6xx_llc_write(a6xx_gpu, REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_1,
>>cntl1_regval);
>>+
>>+   /*
>>+* Program cacheability overrides to not allocate cache lines 
on
>>+* a write miss
>>+*/
>>+   a6xx_llc_rmw(a6xx_gpu, REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_0, 
0xF,
>>0x03);
>>+   }
>>+}
>
>This code has been around long enough that it pre-dates a650. On a650 and
>other
>MMU-500 targets the htw_llc is configured by the firmware and the
>llc_slice is
>configured in a different register.
>
>I don't think we need to pause everything and add support for the MMU-500
>path,
>but we do need a way to disallow LLCC on affected targets until such time
>that
>we can get it fixed up.
>

Thanks for taking a close look, does something like below look ok or
something
else is needed here?

+ /* Till the time we get in LLCC support for A650 */
+ if (!(info && info->revn == 650))
+ a6xx_llc_slices_init(pdev, a6xx_gpu);


It doesn't look like Rob picked this up for 5.10, so we have some time
to do it right. Would you like me to give you an add-on patch for
mmu-500 targets?




Yes, that will be great.

Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation

Re: [PATCH 1/5] iommu/tegra-smmu: Unwrap tegra_smmu_group_get

2020-09-28 Thread Thierry Reding
On Sat, Sep 26, 2020 at 01:07:15AM -0700, Nicolin Chen wrote:
> tegra_smmu_group_get() was added to group devices in different
> SWGROUPs, and it'd return a NULL group pointer upon a mismatch at
> tegra_smmu_find_group(), so for most clients/devices it would very
> likely mismatch and need a fallback to generic_device_group().
> 
> But now tegra_smmu_group_get() handles devices in the same SWGROUP too,
> which means that it would allocate a group for every new SWGROUP or
> directly return an existing one upon matching a SWGROUP, i.e. any
> device will go through this function.
> 
> So the possibility of having a NULL group pointer in device_group()
> is upon failure of either devm_kzalloc() or iommu_group_alloc().
> In either case, calling generic_device_group() no longer makes
> sense. Especially in the case where devm_kzalloc() fails, it would
> cause a problem if a fallback to generic_device_group() succeeded,
> because that does not create a group->list for other devices to match.
> 
> This patch simply unwraps the function to clean it up.
> 
> Signed-off-by: Nicolin Chen 
> ---
>  drivers/iommu/tegra-smmu.c | 19 ---
>  1 file changed, 4 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
> index 0becdbfea306..6335285dc373 100644
> --- a/drivers/iommu/tegra-smmu.c
> +++ b/drivers/iommu/tegra-smmu.c
> @@ -903,11 +903,13 @@ static void tegra_smmu_group_release(void *iommu_data)
>   mutex_unlock(&smmu->lock);
>  }
>  
> -static struct iommu_group *tegra_smmu_group_get(struct tegra_smmu *smmu,
> - unsigned int swgroup)
> +static struct iommu_group *tegra_smmu_device_group(struct device *dev)
>  {
> + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> + struct tegra_smmu *smmu = dev_iommu_priv_get(dev);
>   const struct tegra_smmu_group_soc *soc;
>   struct tegra_smmu_group *group;
> + int swgroup = fwspec->ids[0];

This should be unsigned int to match the type of swgroup elsewhere.
Also, it might not be worth having an extra local variable for this
since it's only used once.

Thierry



Re: [PATCH 3/5] iommu/tegra-smmu: Use iommu_fwspec in .probe_/.attach_device()

2020-09-28 Thread Thierry Reding
On Sat, Sep 26, 2020 at 05:48:17PM +0300, Dmitry Osipenko wrote:
> 26.09.2020 11:07, Nicolin Chen пишет:
> ...
> > +   /* NULL smmu pointer means that SMMU driver is not probed yet */
> > +   if (unlikely(!smmu))
> > +   return ERR_PTR(-EPROBE_DEFER);
> 
> Hello, Nicolin!
> 
> Please don't pollute the code with likely/unlikely. This is not
> performance-critical code.
> 
> ...
> > -static struct platform_driver tegra_mc_driver = {
> > +struct platform_driver tegra_mc_driver = {
> > .driver = {
> > .name = "tegra-mc",
> > .of_match_table = tegra_mc_of_match,
> > diff --git a/include/soc/tegra/mc.h b/include/soc/tegra/mc.h
> > index 1238e35653d1..49a4cf64c4b9 100644
> > --- a/include/soc/tegra/mc.h
> > +++ b/include/soc/tegra/mc.h
> > @@ -184,4 +184,6 @@ struct tegra_mc {
> >  int tegra_mc_write_emem_configuration(struct tegra_mc *mc, unsigned long 
> > rate);
> >  unsigned int tegra_mc_get_emem_device_count(struct tegra_mc *mc);
> >  
> > +extern struct platform_driver tegra_mc_driver;
> 
> No global variables, please. See the following for an example:
> 
> https://elixir.bootlin.com/linux/v5.9-rc6/source/drivers/devfreq/tegra20-devfreq.c#L100
> 
> tegra_get_memory_controller() is now needed by multiple Tegra drivers;
> I think it would be good to add it to the MC driver and then make it
> globally available to all drivers by making use of
> of_find_matching_node_and_match().
> 
> diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
> index e1db209fd2ea..ed1bd6d00aaf 100644
> --- a/drivers/memory/tegra/mc.c
> +++ b/drivers/memory/tegra/mc.c
> @@ -43,6 +43,29 @@ static const struct of_device_id tegra_mc_of_match[] = {
>  };
>  MODULE_DEVICE_TABLE(of, tegra_mc_of_match);
> 
> +struct tegra_mc *tegra_get_memory_controller(void)
> +{
> + struct platform_device *pdev;
> + struct device_node *np;
> + struct tegra_mc *mc;
> +
> + np = of_find_matching_node_and_match(NULL, tegra_mc_of_match, NULL);
> + if (!np)
> + return ERR_PTR(-ENOENT);
> +
> + pdev = of_find_device_by_node(np);
> + of_node_put(np);
> + if (!pdev)
> + return ERR_PTR(-ENODEV);
> +
> + mc = platform_get_drvdata(pdev);
> + if (!mc)
> + return ERR_PTR(-EPROBE_DEFER);
> +
> + return mc;
> +}
> +EXPORT_SYMBOL_GPL(tegra_get_memory_controller);

We already have tegra_smmu_find(), which should be enough for this
particular use-case.

Thierry



Re: [PATCH 4/5] iommu/tegra-smmu: Add PCI support

2020-09-28 Thread Thierry Reding
On Sat, Sep 26, 2020 at 01:07:18AM -0700, Nicolin Chen wrote:
> This patch simply adds support for PCI devices.
> 
> Signed-off-by: Nicolin Chen 
> ---
>  drivers/iommu/tegra-smmu.c | 17 -
>  1 file changed, 16 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
> index 97a7185b4578..9dbc5d7183cc 100644
> --- a/drivers/iommu/tegra-smmu.c
> +++ b/drivers/iommu/tegra-smmu.c
> @@ -15,6 +15,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -935,6 +936,7 @@ static struct iommu_group *tegra_smmu_device_group(struct device *dev)
>   const struct tegra_smmu_group_soc *soc;
>   struct tegra_smmu_group *group;
>   int swgroup = fwspec->ids[0];
> + bool pci = dev_is_pci(dev);
>   struct iommu_group *grp;
>  
>   /* Find group_soc associating with swgroup */
> @@ -961,7 +963,7 @@ static struct iommu_group *tegra_smmu_device_group(struct device *dev)
>   group->smmu = smmu;
>   group->soc = soc;
>  
> - group->group = iommu_group_alloc();
> + group->group = pci ? pci_device_group(dev) : iommu_group_alloc();
>   if (IS_ERR(group->group)) {
>   devm_kfree(smmu->dev, group);
>   mutex_unlock(&smmu->lock);
> @@ -1180,6 +1182,19 @@ struct tegra_smmu *tegra_smmu_probe(struct device *dev,
>   return ERR_PTR(err);
>   }
>  
> +#ifdef CONFIG_PCI
> + if (!iommu_present(&pci_bus_type)) {
> + pci_request_acs();
> + err = bus_set_iommu(&pci_bus_type, &tegra_smmu_ops);
> + if (err < 0) {
> + bus_set_iommu(&pci_bus_type, NULL);
> + iommu_device_unregister(&smmu->iommu);
> + iommu_device_sysfs_remove(&smmu->iommu);
> + return ERR_PTR(err);

It might be worth factoring out the cleanup code now that there are
multiple failures from which we may need to clean up.
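
Something along these lines, perhaps (sketch only):

	err = iommu_device_register(&smmu->iommu);
	if (err)
		goto remove_sysfs;

	err = bus_set_iommu(&pci_bus_type, &tegra_smmu_ops);
	if (err < 0)
		goto unregister;

	return smmu;

unregister:
	iommu_device_unregister(&smmu->iommu);
remove_sysfs:
	iommu_device_sysfs_remove(&smmu->iommu);
	return ERR_PTR(err);

so each failure point jumps to the right label instead of repeating the
unwinding inline.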

Also, it'd be great if somehow we could do this without the #ifdef,
but I guess since we're using the pci_bus_type global variable directly,
there isn't much we can do here?

Thierry



Re: [PATCH 3/5] iommu/tegra-smmu: Use iommu_fwspec in .probe_/.attach_device()

2020-09-28 Thread Thierry Reding
On Sat, Sep 26, 2020 at 01:07:17AM -0700, Nicolin Chen wrote:
> The tegra_smmu_probe_device() function searches in DT for the iommu
> phandle to get the "smmu" pointer. This works for most SMMU clients
> that exist in the DTB. But a PCI device will not be added to the iommu,
> since it doesn't have a DT node.
> 
> Fortunately, for a client with a DT node, tegra_smmu_probe_device()
> calls tegra_smmu_of_xlate() via tegra_smmu_configure(), while for a
> PCI device, of_pci_iommu_init() in the IOMMU core calls .of_xlate()
> as well, even before running tegra_smmu_probe_device(). And in both
> cases, tegra_smmu_of_xlate() prepares a valid iommu_fwspec pointer
> that allows us to get the mc->smmu pointer via dev_get_drvdata() by
> calling driver_find_device_by_fwnode().
> 
> So this patch uses iommu_fwspec in .probe_device() and related code
> for a client that does not exist in the DTB, especially a PCI one.
> 
> Signed-off-by: Nicolin Chen 
> ---
>  drivers/iommu/tegra-smmu.c | 89 +++---
>  drivers/memory/tegra/mc.c  |  2 +-
>  include/soc/tegra/mc.h |  2 +
>  3 files changed, 56 insertions(+), 37 deletions(-)
> 
> diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
> index b10e02073610..97a7185b4578 100644
> --- a/drivers/iommu/tegra-smmu.c
> +++ b/drivers/iommu/tegra-smmu.c
> @@ -13,6 +13,7 @@
>  #include 
>  #include 
>  #include 
> +#include 

Why is this needed? I don't see any of the symbols declared in that file
used here.

>  #include 
>  
>  #include 
> @@ -61,6 +62,8 @@ struct tegra_smmu_as {
>   u32 attr;
>  };
>  
> +static const struct iommu_ops tegra_smmu_ops;
> +
>  static struct tegra_smmu_as *to_smmu_as(struct iommu_domain *dom)
>  {
>   return container_of(dom, struct tegra_smmu_as, domain);
> @@ -484,60 +487,49 @@ static void tegra_smmu_as_unprepare(struct tegra_smmu *smmu,
>  static int tegra_smmu_attach_dev(struct iommu_domain *domain,
>struct device *dev)
>  {
> + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
>   struct tegra_smmu *smmu = dev_iommu_priv_get(dev);
>   struct tegra_smmu_as *as = to_smmu_as(domain);
> - struct device_node *np = dev->of_node;
> - struct of_phandle_args args;
> - unsigned int index = 0;
> - int err = 0;
> -
> - while (!of_parse_phandle_with_args(np, "iommus", "#iommu-cells", index,
> -    &args)) {
> - unsigned int swgroup = args.args[0];
> -
> - if (args.np != smmu->dev->of_node) {
> - of_node_put(args.np);
> - continue;
> - }
> + int index, err = 0;
>  
> - of_node_put(args.np);
> + if (!fwspec || fwspec->ops != &tegra_smmu_ops)
> + return -ENOENT;
>  
> + for (index = 0; index < fwspec->num_ids; index++) {
>   err = tegra_smmu_as_prepare(smmu, as);
> - if (err < 0)
> - return err;
> + if (err)
> + goto err_disable;

I'd personally drop the err_ prefix here because it's pretty obvious
that we're going to do this as a result of an error happening.

>  
> - tegra_smmu_enable(smmu, swgroup, as->id);
> - index++;
> + tegra_smmu_enable(smmu, fwspec->ids[index], as->id);
>   }
>  
>   if (index == 0)
>   return -ENODEV;
>  
>   return 0;
> +
> +err_disable:
> + for (index--; index >= 0; index--) {
> + tegra_smmu_disable(smmu, fwspec->ids[index], as->id);
> + tegra_smmu_as_unprepare(smmu, as);
> + }

I think a more idiomatic version of doing this would be:

while (index--) {
...
}
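
Applied to the unwind above, that would give (sketch):

	while (index--) {
		tegra_smmu_disable(smmu, fwspec->ids[index], as->id);
		tegra_smmu_as_unprepare(smmu, as);
	}

	return err;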

> +
> + return err;
>  }
>  
>  static void tegra_smmu_detach_dev(struct iommu_domain *domain, struct device *dev)
>  {
> + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
>   struct tegra_smmu_as *as = to_smmu_as(domain);
> - struct device_node *np = dev->of_node;
>   struct tegra_smmu *smmu = as->smmu;
> - struct of_phandle_args args;
>   unsigned int index = 0;
>  
> - while (!of_parse_phandle_with_args(np, "iommus", "#iommu-cells", index,
> -    &args)) {
> - unsigned int swgroup = args.args[0];
> -
> - if (args.np != smmu->dev->of_node) {
> - of_node_put(args.np);
> - continue;
> - }
> -
> - of_node_put(args.np);
> + if (!fwspec || fwspec->ops != &tegra_smmu_ops)
> + return;
>  
> - tegra_smmu_disable(smmu, swgroup, as->id);
> + for (index = 0; index < fwspec->num_ids; index++) {
> + tegra_smmu_disable(smmu, fwspec->ids[index], as->id);
>   tegra_smmu_as_unprepare(smmu, as);
> - index++;
>   }
>  }
>  
> @@ -845,10 +837,25 @@ static int tegra_smmu_configure(struct tegra_smmu *smmu, struct 

Re: [PATCH 1/5] iommu/tegra-smmu: Unwrap tegra_smmu_group_get

2020-09-28 Thread Nicolin Chen
Hi Thierry,

Thanks for the review.

On Mon, Sep 28, 2020 at 09:13:56AM +0200, Thierry Reding wrote:
> > -static struct iommu_group *tegra_smmu_group_get(struct tegra_smmu *smmu,
> > -   unsigned int swgroup)
> > +static struct iommu_group *tegra_smmu_device_group(struct device *dev)
> >  {
> > +   struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> > +   struct tegra_smmu *smmu = dev_iommu_priv_get(dev);
> > const struct tegra_smmu_group_soc *soc;
> > struct tegra_smmu_group *group;
> > +   int swgroup = fwspec->ids[0];
> 
> This should be unsigned int to match the type of swgroup elsewhere.
> Also, it might not be worth having an extra local variable for this
> since it's only used once.

Will change to unsigned int. There are actually a few places using
it in this function, previously tegra_smmu_group_get().


Re: [PATCH v10 00/13] iommu: Shared Virtual Addressing for SMMUv3 (PT sharing part)

2020-09-28 Thread Jean-Philippe Brucker
Hi Will,

On Fri, Sep 18, 2020 at 12:18:40PM +0200, Jean-Philippe Brucker wrote:
> This is version 10 of the page table sharing support for Arm SMMUv3.
> Patch 1 still needs an Ack from mm maintainers. However patches 4-11 do
> not depend on it, and could get merged for v5.10 regardless.

Are you OK with taking patches 4-11 for v5.10?

The rest depends on patch 1 which hasn't been acked yet. It's
uncontroversial and I'm sure it will eventually make it. In case it
doesn't, we'll keep track of mm->pasid within the IOMMU subsystem instead.

Thanks,
Jean



Re: [PATCH v12 0/6] IOMMU user API enhancement

2020-09-28 Thread Jacob Pan
Hi Joerg,

Just wondering if you will be able to take this for v5.10? There haven't
been any material changes since we last discussed this at LPC. We have
VFIO and other vSVA patches depending on it.

Thanks!

Jacob 

On Fri, 25 Sep 2020 09:32:41 -0700, Jacob Pan 
wrote:

> IOMMU user API header was introduced to support nested DMA translation and
> related fault handling. The current UAPI data structures consist of three
> areas that cover the interactions between host kernel and guest:
>  - fault handling
>  - cache invalidation
>  - bind guest page tables, i.e. guest PASID
> 
> Future extensions are likely to support more architectures and vIOMMU
> features.
> 
> In the previous discussion, using a user-filled data size and feature
> flags was made the preferred approach over a unified version number.
> https://lkml.org/lkml/2020/1/29/45
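> 
> Concretely, the pattern looks like this (an illustrative struct, not the
> exact UAPI definitions from this series):
> 
> 	struct iommu_example_uapi {
> 		__u32 argsz;	/* user fills in the sizeof() it was built with */
> 		__u32 flags;	/* per-feature flags instead of a version number */
> 		__u64 data;	/* fields validated against argsz and flags */
> 	};
> 
> The kernel reads argsz first and only trusts as many bytes as both sides
> agree on, so old binaries keep working on newer kernels and vice versa.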
> 
> In addition to introducing an argsz field to the data structures, this
> patchset also tries to document the UAPI design, usage, and extension
> rules. VT-d driver changes to utilize the new argsz field are included;
> VFIO usage is to follow.
> 
> This set is available at:
> https://github.com/jacobpan/linux.git vsva_v5.9_uapi_v12
> 
> Thanks,
> 
> Jacob
> 
> 
> Changelog:
> v12
>   - Removed a redundant check in cache invalidate API
> v11
>   - Use #define instead of enum in PASID data format, squashed
> change into "iommu/uapi: Handle data and argsz filled by users"
>   - Remove alloc/free from documentation per Yi's comment. IOMMU
> UAPI does not perform IOASID alloc/free.
> v10
>   - Documentation grammar fixes based on Randy's review
> v9
>   - Directly pass PASID value to iommu_sva_unbind_gpasid() without
> the superfluous data in struct iommu_gpasid_bind_data.
> v8
>   - Rebased to v5.9-rc2
>   - Addressed review comments from Eric Auger
> 1. added a check for the unused vendor flags
> 2. commit message improvements
> v7
>   - Added PASID data format enum for range checking
>   - Tidy up based on reviews from Alex W.
>   - Removed doc section for vIOMMU fault handling
> v6
>   - Renamed all UAPI functions with iommu_uapi_ prefix
>   - Replaced argsz maxsz checking with flag specific size checks
>   - Documentation improvements based on suggestions by Eric Auger
> Replaced example code with a pointer to the actual code
>   - Added more checks for illegal flags combinations
>   - Added doc file to MAINTAINERS
> v5
>   - Addjusted paddings in UAPI data to be 8 byte aligned
>   - Do not clobber argsz in IOMMU core before passing on to vendor
> driver
>   - Removed pr_warn_ for invalid UAPI data check, just return
> -EINVAL
>   - Clarified VFIO responsibility in UAPI data handling
>   - Use iommu_uapi prefix to differentiate APIs has in-kernel caller
>   - Added comment for unchecked flags of invalidation granularity
>   - Added example in doc to show vendor data checking
> 
> v4
>   - Added checks of UAPI data for reserved fields, version, and
> flags.
>   - Removed version check from vendor driver (vt-d)
>   - Relaxed argsz check to match the UAPI struct size instead of
> variable union size
>   - Updated documentation
> 
> v3:
>   - Rewrote backward compatibility rule to support existing code
> re-compiled with newer kernel UAPI header that runs on older
> kernel. Based on review comment from Alex W.
> https://lore.kernel.org/linux-iommu/20200611094741.6d118...@w520.home/
>   - Take user pointer directly in UAPI functions. Perform argsz
> check and copy_from_user() in IOMMU driver. Eliminate the need for
> VFIO or other upper layer to parse IOMMU data.
>   - Create wrapper function for in-kernel users of UAPI functions
> v2:
>   - Removed unified API version and helper
>   - Introduced argsz for each UAPI data
>   - Introduced UAPI doc
> 
> 
> Jacob Pan (6):
>   docs: IOMMU user API
>   iommu/uapi: Add argsz for user filled data
>   iommu/uapi: Use named union for user data
>   iommu/uapi: Rename uapi functions
>   iommu/uapi: Handle data and argsz filled by users
>   iommu/vt-d: Check UAPI data processed by IOMMU core
> 
>  Documentation/userspace-api/iommu.rst | 209 ++
>  MAINTAINERS                           |   1 +
>  drivers/iommu/intel/iommu.c           |  25 ++--
>  drivers/iommu/intel/svm.c             |  13 ++-
>  drivers/iommu/iommu.c                 | 196 +--
>  include/linux/iommu.h                 |  35 --
>  include/uapi/linux/iommu.h            |  18 ++-
>  7 files changed, 456 insertions(+), 41 deletions(-)
>  create mode 100644 Documentation/userspace-api/iommu.rst
> 


Thanks,

Jacob


Re: [PATCH v10 00/13] iommu: Shared Virtual Addressing for SMMUv3 (PT sharing part)

2020-09-28 Thread Will Deacon
Hi Jean-Philippe,

On Mon, Sep 28, 2020 at 06:47:31PM +0200, Jean-Philippe Brucker wrote:
> On Fri, Sep 18, 2020 at 12:18:40PM +0200, Jean-Philippe Brucker wrote:
> > This is version 10 of the page table sharing support for Arm SMMUv3.
> > Patch 1 still needs an Ack from mm maintainers. However patches 4-11 do
> > not depend on it, and could get merged for v5.10 regardless.
> 
> Are you OK with taking patches 4-11 for v5.10?
> 
> The rest depends on patch 1 which hasn't been acked yet. It's
> uncontroversial and I'm sure it will eventually make it. In case it
> doesn't, we'll keep track of mm->pasid within the IOMMU subsystem instead.

I was off most of last week, but I plan to see how much of this I can queue
tonight. Stay tuned...

Will


[PATCH v3 10/14] iommu/ioasid: Introduce notification APIs

2020-09-28 Thread Jacob Pan
Relations among IOASID users largely follow a publisher-subscriber
pattern. E.g. to support guest SVA on Intel Scalable I/O Virtualization
(SIOV) enabled platforms, VFIO, IOMMU, device drivers and KVM are all
users of IOASIDs. When a state change occurs, VFIO publishes the change
event, which then needs to be processed by the other users/subscribers.

This patch introduces two types of notifications: global and per
ioasid_set. The latter is intended for users who only need to handle
events related to the IOASIDs of a given set.
For more information, refer to the kernel documentation at
Documentation/ioasid.rst.

Signed-off-by: Liu Yi L 
Signed-off-by: Wu Hao 
Signed-off-by: Jacob Pan 
---
 drivers/iommu/ioasid.c | 141 +
 include/linux/ioasid.h |  57 +++-
 2 files changed, 197 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/ioasid.c b/drivers/iommu/ioasid.c
index 378fef8f23d9..894b17c06ead 100644
--- a/drivers/iommu/ioasid.c
+++ b/drivers/iommu/ioasid.c
@@ -10,12 +10,35 @@
 #include 
 #include 
 
+/*
+ * An IOASID can have multiple consumers where each consumer may have
+ * hardware contexts associated with the IOASID.
+ * When a status change occurs, like on IOASID deallocation, notifier chains
+ * are used to keep the consumers in sync.
+ * This is a publisher-subscriber pattern where publisher can change the
+ * state of each IOASID, e.g. alloc/free, bind IOASID to a device and mm.
+ * On the other hand, subscribers get notified for the state change and
+ * keep local states in sync.
+ */
+static ATOMIC_NOTIFIER_HEAD(ioasid_notifier);
+/* List to hold pending notification block registrations */
+static LIST_HEAD(ioasid_nb_pending_list);
+static DEFINE_SPINLOCK(ioasid_nb_lock);
+
 /* Default to PCIe standard 20 bit PASID */
 #define PCI_PASID_MAX 0x10
 static ioasid_t ioasid_capacity = PCI_PASID_MAX;
 static ioasid_t ioasid_capacity_avail = PCI_PASID_MAX;
 static DEFINE_XARRAY_ALLOC(ioasid_sets);
 
+struct ioasid_set_nb {
+   struct list_headlist;
+   struct notifier_block   *nb;
+   void*token;
+   struct ioasid_set   *set;
+   boolactive;
+};
+
 enum ioasid_state {
IOASID_STATE_INACTIVE,
IOASID_STATE_ACTIVE,
@@ -365,6 +388,42 @@ void ioasid_detach_data(ioasid_t ioasid)
 }
 EXPORT_SYMBOL_GPL(ioasid_detach_data);
 
+/**
+ * ioasid_notify - Send notification on a given IOASID for status change.
+ *
+ * @data:  The IOASID data for which the notification will be sent
+ * @cmd:   Notification event sent by IOASID external users, can be
+ * IOASID_BIND or IOASID_UNBIND.
+ *
+ * @flags: Special instructions, e.g. notify within a set or global by
+ * IOASID_NOTIFY_FLAG_SET or IOASID_NOTIFY_FLAG_ALL flags
+ * Caller must hold the ioasid_allocator_lock and a reference to the IOASID.
+ */
+static int ioasid_notify(struct ioasid_data *data,
+enum ioasid_notify_val cmd, unsigned int flags)
+{
+   struct ioasid_nb_args args = { 0 };
+   int ret = 0;
+
+   /* IOASID_FREE/ALLOC are internal events emitted by IOASID core only */
+   if (cmd <= IOASID_NOTIFY_FREE)
+   return -EINVAL;
+
+   if (flags & ~(IOASID_NOTIFY_FLAG_ALL | IOASID_NOTIFY_FLAG_SET))
+   return -EINVAL;
+
+   args.id = data->id;
+   args.set = data->set;
+   args.pdata = data->private;
+   args.spid = data->spid;
+   if (flags & IOASID_NOTIFY_FLAG_ALL)
+   ret = atomic_notifier_call_chain(&ioasid_notifier, cmd, &args);
+   if (flags & IOASID_NOTIFY_FLAG_SET)
+   ret = atomic_notifier_call_chain(&data->set->nh, cmd, &args);
+
+   return ret;
+}
+
 static ioasid_t ioasid_find_by_spid_locked(struct ioasid_set *set, ioasid_t spid)
 {
ioasid_t ioasid = INVALID_IOASID;
@@ -417,6 +476,7 @@ int ioasid_attach_spid(ioasid_t ioasid, ioasid_t spid)
goto done_unlock;
}
data->spid = spid;
+   ioasid_notify(data, IOASID_NOTIFY_BIND, IOASID_NOTIFY_FLAG_SET);
 
 done_unlock:
	spin_unlock(&ioasid_allocator_lock);
@@ -436,6 +496,7 @@ void ioasid_detach_spid(ioasid_t ioasid)
goto done_unlock;
}
data->spid = INVALID_IOASID;
+   ioasid_notify(data, IOASID_NOTIFY_UNBIND, IOASID_NOTIFY_FLAG_SET);
 
 done_unlock:
	spin_unlock(&ioasid_allocator_lock);
@@ -469,6 +530,28 @@ static inline bool ioasid_set_is_valid(struct ioasid_set *set)
 	return xa_load(&ioasid_sets, set->id) == set;
 }
 
+static void ioasid_add_pending_nb(struct ioasid_set *set)
+{
+   struct ioasid_set_nb *curr;
+
+   if (set->type != IOASID_SET_TYPE_MM)
+   return;
+
+   /*
+* Check if there are any pending nb requests for the given token, if so
+* add them to the notifier chain.
+*/
+   spin_lock(&ioasid_nb_lock);
+   list_for_each_entry(curr, &ioasid_nb_pending_list, list) {
+   if (curr->token == set->token &&

[PATCH v3 01/14] docs: Document IO Address Space ID (IOASID) APIs

2020-09-28 Thread Jacob Pan
IOASID is used to identify address spaces that can be targeted by device
DMA. It is a system-wide resource that is essential to its many users.
This document is an attempt to help developers from all vendors navigate
the APIs. At this time, ARM SMMU and Intel’s Scalable IO Virtualization
(SIOV) enabled platforms are the primary users of IOASID. Examples of
how SIOV components interact with the IOASID APIs are provided, since many
of the APIs are driven by SIOV requirements.

Cc: Jonathan Corbet 
Cc: linux-...@vger.kernel.org
Cc: Randy Dunlap 
Signed-off-by: Liu Yi L 
Signed-off-by: Wu Hao 
Signed-off-by: Jacob Pan 
---
 Documentation/driver-api/ioasid.rst | 648 
 1 file changed, 648 insertions(+)
 create mode 100644 Documentation/driver-api/ioasid.rst

diff --git a/Documentation/driver-api/ioasid.rst 
b/Documentation/driver-api/ioasid.rst
new file mode 100644
index ..7f8e702997ab
--- /dev/null
+++ b/Documentation/driver-api/ioasid.rst
@@ -0,0 +1,648 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. ioasid:
+
+===================
+IO Address Space ID
+===================
+
+IOASID is a generic name for PCIe Process Address ID (PASID) or ARM
+SMMU SubstreamID. An IOASID identifies an address space that DMA
+requests can target.
+
+The primary use cases for IOASID are Shared Virtual Address (SVA) and
+multiple IOVA spaces per device. However, the requirements for IOASID
+management can vary among hardware architectures.
+
+For bare-metal IOVA, IOASID #0 is used for DMA requests without
+PASID, even though some architectures, such as VT-d, also offer the
+flexibility of using any PASID for DMA requests without PASID.
+PASID #0 is reserved and not allocated from any ioasid_set.
+
+Multiple IOVA spaces per device are mapped to auxiliary domains which
+can be used for mediated device assignment with and without a virtual
+IOMMU (vIOMMU). An IOASID is allocated for each auxiliary domain as default
+PASID. Without vIOMMU, default IOASID is used for DMA map/unmap
+APIs. With vIOMMU, default IOASID is used for guest IOVA where DMA
+request with PASID is required for the device. The reason is that
+there is only one PASID #0 per device, e.g. VT-d, RID_PASID is per PCI
+device.
+
+This document covers the generic features supported by IOASID
+APIs. Vendor-specific use cases are also illustrated with Intel's VT-d
+based platforms as the first example.
+
+.. contents:: :local:
+
+Glossary
+========
+PASID - Process Address Space ID
+
+IOASID - IO Address Space ID (generic term for PCIe PASID and
+SubstreamID in SMMU)
+
+SVA/SVM - Shared Virtual Addressing/Memory
+
+ENQCMD - Intel X86 ISA for efficient workqueue submission [1]
+!!!TODO: Link to Spec at the bottom
+
+DSA - Intel Data Streaming Accelerator [2]
+
+VDCM - Virtual Device Composition Module [3]
+
+SIOV - Intel Scalable IO Virtualization
+
+
+Key Concepts
+============
+
+IOASID Set
+----------
+An IOASID set is a group of IOASIDs allocated from the system-wide
+IOASID pool. Refer to IOASID set APIs for more details.
+
+IOASID set is particularly useful for guest SVA where each guest could
+have its own IOASID set for security and efficiency reasons.
+
+IOASID Set Private ID (SPID)
+----------------------------
+Each IOASID set has a private namespace of SPIDs. An SPID maps to a
+single system-wide IOASID. Conversely, each IOASID may be associated
+with an alias ID that is local to the IOASID set, named SPID.
+SPIDs can be used as guest IOASIDs, where each guest allocates from
+its own pool and maps the IDs to host physical IOASIDs. SPIDs are
+particularly useful for supporting live migration, where decoupling
+guest and host physical resources is necessary.
+
+For example, two VMs can both allocate guest PASID/SPID #101 but map to
+different host PASIDs #201 and #202 respectively as shown in the
+diagram below.
+::
+
+ .------------------.    .------------------.
+ |       VM 1       |    |       VM 2       |
+ |                  |    |                  |
+ |------------------|    |------------------|
+ | GPASID/SPID 101  |    | GPASID/SPID 101  |
+ '------------------'    '------------------'    Guest
+ __________|_____________________|______________
+           |                     |        Host
+           v                     v
+ .------------------.    .------------------.
+ | Host IOASID 201  |    | Host IOASID 202  |
+ '------------------'    '------------------'
+ |   IOASID set 1   |    |   IOASID set 2   |
+ '------------------'    '------------------'
+
+A guest PASID is treated as an IOASID set private ID (SPID) within an
+IOASID set; mappings between guest and host IOASIDs are stored in the
+set for inquiry.
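+
+A hypothetical host-side flow, with illustrative names::
+
+    /* illustrative helper, not part of the API */
+    static ioasid_t map_guest_pasid(struct ioasid_set *vm_set,
+                                    ioasid_t guest_pasid,
+                                    ioasid_t min, ioasid_t max)
+    {
+            ioasid_t host_pasid = ioasid_alloc(vm_set, min, max, NULL);
+
+            if (host_pasid == INVALID_IOASID)
+                    return INVALID_IOASID;
+            /* record the guest PASID as the set-private alias */
+            if (ioasid_attach_spid(host_pasid, guest_pasid))
+                    return INVALID_IOASID;
+            return host_pasid;
+    }
+
+Later lookups resolve the guest PASID back to the host PASID within the
+set via ioasid_find_by_spid(), which takes a reference on success.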
+
+IOASID APIs
+===========
+To get the IOASID APIs, users must #include <linux/ioasid.h>. These APIs
+serve the following functionalities:
+
+  - IOASID allocation/free
+  - Group management in the form of ioasid_set
+  - Private data storage and lookup
+  - Reference counting
+  - Event notification 

[PATCH v3 06/14] iommu/ioasid: Introduce API to adjust the quota of an ioasid_set

2020-09-28 Thread Jacob Pan
Each ioasid_set is given a quota during allocation. As system
administrators balance resources among VMs, we shall support the
adjustment of quota at runtime. The new quota cannot be less than the
outstanding IOASIDs already allocated within the set. The extra quota
will be returned to the system-wide IOASID pool if the new quota is
smaller than the existing one.
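
For illustration, a management-path caller might look like this sketch
(the function name, vm_set and new_quota are assumptions):

	/* illustrative caller */
	static int vm_resize_pasid_quota(struct ioasid_set *vm_set, int new_quota)
	{
		int ret = ioasid_adjust_set(vm_set, new_quota);

		if (ret == -EINVAL)	/* below outstanding IOASIDs, unchanged */
			return ret;
		if (ret == -ENOSPC)	/* system pool cannot back the increase */
			return ret;
		return 0;
	}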

Signed-off-by: Jacob Pan 
---
 drivers/iommu/ioasid.c | 47 +++
 include/linux/ioasid.h |  6 ++
 2 files changed, 53 insertions(+)

diff --git a/drivers/iommu/ioasid.c b/drivers/iommu/ioasid.c
index 61e25c2375ab..cf8c7d34e2de 100644
--- a/drivers/iommu/ioasid.c
+++ b/drivers/iommu/ioasid.c
@@ -654,6 +654,53 @@ void ioasid_set_put(struct ioasid_set *set)
 EXPORT_SYMBOL_GPL(ioasid_set_put);
 
 /**
+ * ioasid_adjust_set - Adjust the quota of an IOASID set
+ * @set:   IOASID set to be adjusted
+ * @quota: Quota allowed in this set
+ *
+ * Return 0 on success. If the new quota is smaller than the number of
+ * IOASIDs already allocated, -EINVAL will be returned. No change will be
+ * made to the existing quota.
+ */
+int ioasid_adjust_set(struct ioasid_set *set, int quota)
+{
+   int ret = 0;
+
+   if (quota <= 0)
+   return -EINVAL;
+
+   spin_lock(&ioasid_allocator_lock);
+   if (set->nr_ioasids > quota) {
+   pr_err("New quota %d is smaller than outstanding IOASIDs %d\n",
+   quota, set->nr_ioasids);
+   ret = -EINVAL;
+   goto done_unlock;
+   }
+
+   if ((quota > set->quota) &&
+   (quota - set->quota > ioasid_capacity_avail)) {
+   ret = -ENOSPC;
+   goto done_unlock;
+   }
+
+   /* Return the delta back to system pool */
+   ioasid_capacity_avail += set->quota - quota;
+
+   /*
+* May have a policy to prevent giving all available IOASIDs
+* to one set. But we don't enforce here, it should be in the
+* upper layers.
+*/
+   set->quota = quota;
+
+done_unlock:
+   spin_unlock(&ioasid_allocator_lock);
+
+   return ret;
+}
+EXPORT_SYMBOL_GPL(ioasid_adjust_set);
+
+/**
  * ioasid_find - Find IOASID data
  * @set: the IOASID set
  * @ioasid: the IOASID to find
diff --git a/include/linux/ioasid.h b/include/linux/ioasid.h
index 1ae213b660f0..0a5e82148eb9 100644
--- a/include/linux/ioasid.h
+++ b/include/linux/ioasid.h
@@ -62,6 +62,7 @@ struct ioasid_allocator_ops {
 void ioasid_install_capacity(ioasid_t total);
 ioasid_t ioasid_get_capacity(void);
 struct ioasid_set *ioasid_set_alloc(void *token, ioasid_t quota, int type);
+int ioasid_adjust_set(struct ioasid_set *set, int quota);
 void ioasid_set_get(struct ioasid_set *set);
 void ioasid_set_put(struct ioasid_set *set);
 
@@ -99,6 +100,11 @@ static inline struct ioasid_set *ioasid_set_alloc(void *token, ioasid_t quota, int type)
return ERR_PTR(-ENOTSUPP);
 }
 
+static inline int ioasid_adjust_set(struct ioasid_set *set, int quota)
+{
+   return -ENOTSUPP;
+}
+
 static inline void ioasid_set_put(struct ioasid_set *set)
 {
 }
-- 
2.7.4



[PATCH v3 09/14] iommu/ioasid: Introduce ioasid_set private ID

2020-09-28 Thread Jacob Pan
When an IOASID set is used for guest SVA, each VM will acquire its
ioasid_set for IOASID allocations. IOASIDs within the VM must have
host/physical IOASID backing, and the mapping between guest and host
IOASIDs can be non-identical. The IOASID set private ID (SPID) is
introduced in this patch to be used as the guest IOASID. However, the
concept of an ioasid_set-specific namespace is generic, hence the name
SPID.

As the SPID namespace is within the IOASID set, the IOASID core can
provide lookup services in both directions. An SPID may not be available
when its IOASID is allocated; the mapping between SPID and IOASID is
usually established when a guest page table is bound to a host PASID.

Signed-off-by: Jacob Pan 
---
 drivers/iommu/ioasid.c | 102 +
 include/linux/ioasid.h |  19 +
 2 files changed, 121 insertions(+)

diff --git a/drivers/iommu/ioasid.c b/drivers/iommu/ioasid.c
index 828cc44b1b1c..378fef8f23d9 100644
--- a/drivers/iommu/ioasid.c
+++ b/drivers/iommu/ioasid.c
@@ -26,6 +26,7 @@ enum ioasid_state {
  * struct ioasid_data - Meta data about ioasid
  *
  * @id:    Unique ID
+ * @spid:  Private ID unique within a set
  * @users: Number of active users
  * @state: Track state of the IOASID
  * @set:   ioasid_set the IOASID belongs to
@@ -34,6 +35,7 @@ enum ioasid_state {
  */
 struct ioasid_data {
ioasid_t id;
+   ioasid_t spid;
refcount_t users;
enum ioasid_state state;
struct ioasid_set *set;
@@ -363,6 +365,105 @@ void ioasid_detach_data(ioasid_t ioasid)
 }
 EXPORT_SYMBOL_GPL(ioasid_detach_data);
 
+static ioasid_t ioasid_find_by_spid_locked(struct ioasid_set *set, ioasid_t spid)
+{
+   ioasid_t ioasid = INVALID_IOASID;
+   struct ioasid_data *entry;
+   unsigned long index;
+
+   if (!xa_load(&ioasid_sets, set->id)) {
+   pr_warn("Invalid set\n");
+   goto done;
+   }
+
+   xa_for_each(&set->xa, index, entry) {
+   if (spid == entry->spid) {
+   refcount_inc(&entry->users);
+   ioasid = index;
+   }
+   }
+done:
+   return ioasid;
+}
+
+/**
+ * ioasid_attach_spid - Attach ioasid_set private ID to an IOASID
+ *
+ * @ioasid: the system-wide IOASID to attach
+ * @spid:   the ioasid_set private ID of @ioasid
+ *
+ * After attaching the SPID, future lookups can be done via ioasid_find_by_spid().
+ */
+int ioasid_attach_spid(ioasid_t ioasid, ioasid_t spid)
+{
+   struct ioasid_data *data;
+   int ret = 0;
+
+   if (spid == INVALID_IOASID)
+   return -EINVAL;
+
+   spin_lock(&ioasid_allocator_lock);
+   data = xa_load(&active_allocator->xa, ioasid);
+
+   if (!data) {
+   pr_err("No IOASID entry %d to attach SPID %d\n",
+   ioasid, spid);
+   ret = -ENOENT;
+   goto done_unlock;
+   }
+   /* Check if SPID is unique within the set */
+   if (ioasid_find_by_spid_locked(data->set, spid) != INVALID_IOASID) {
+   ret = -EINVAL;
+   goto done_unlock;
+   }
+   data->spid = spid;
+
+done_unlock:
+   spin_unlock(&ioasid_allocator_lock);
+   return ret;
+}
+EXPORT_SYMBOL_GPL(ioasid_attach_spid);
+
+void ioasid_detach_spid(ioasid_t ioasid)
+{
+   struct ioasid_data *data;
+
+   spin_lock(&ioasid_allocator_lock);
+   data = xa_load(&active_allocator->xa, ioasid);
+
+   if (!data || data->spid == INVALID_IOASID) {
+   pr_err("Invalid IOASID entry %d to detach\n", ioasid);
+   goto done_unlock;
+   }
+   data->spid = INVALID_IOASID;
+
+done_unlock:
+   spin_unlock(&ioasid_allocator_lock);
+}
+EXPORT_SYMBOL_GPL(ioasid_detach_spid);
+
+/**
+ * ioasid_find_by_spid - Find the system-wide IOASID by a set private ID and
+ * its set.
+ *
+ * @set:   the ioasid_set to search within
+ * @spid:  the set private ID
+ *
+ * Given a set private ID and its IOASID set, find the system-wide IOASID. Take
+ * a reference upon finding the matching IOASID. Return INVALID_IOASID if the
+ * IOASID is not found in the set or the set is not valid.
+ */
+ioasid_t ioasid_find_by_spid(struct ioasid_set *set, ioasid_t spid)
+{
+   ioasid_t ioasid;
+
+   spin_lock(&ioasid_allocator_lock);
+   ioasid = ioasid_find_by_spid_locked(set, spid);
+   spin_unlock(&ioasid_allocator_lock);
+   return ioasid;
+}
+EXPORT_SYMBOL_GPL(ioasid_find_by_spid);
+
 static inline bool ioasid_set_is_valid(struct ioasid_set *set)
 {
	return xa_load(&ioasid_sets, set->id) == set;
@@ -529,6 +630,7 @@ ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min, ioasid_t max,
goto exit_free;
}
data->id = id;
+   data->spid = INVALID_IOASID;
data->state = IOASID_STATE_ACTIVE;
	refcount_set(&data->users, 1);
 
diff --git a/include/linux/ioasid.h b/include/linux/ioasid.h
index 16d421357173..2dfe85e6cb7e 100644
--- a/include/linux/ioasid.h
+++ b/include/linux/ioasid.h
@@ -79,6 

[PATCH v3 12/14] iommu/vt-d: Remove mm reference for guest SVA

2020-09-28 Thread Jacob Pan
Now that the IOASID core keeps track of the IOASID to mm_struct
ownership in the form of an ioasid_set with the IOASID_SET_TYPE_MM token
type, there is no need to keep the same mapping in VT-d driver-specific
data. Native SVM usage is not affected by the change.

Signed-off-by: Jacob Pan 
---
 drivers/iommu/intel/svm.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 2e764e283469..39a09a93300e 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -338,12 +338,6 @@ int intel_svm_bind_gpasid(struct iommu_domain *domain, struct device *dev,
ret = -ENOMEM;
goto out;
}
-   /* REVISIT: upper layer/VFIO can track host process that bind
-* the PASID. ioasid_set = mm might be sufficient for vfio to
-* check pasid VMM ownership. We can drop the following line
-* once VFIO and IOASID set check is in place.
-*/
-   svm->mm = get_task_mm(current);
svm->pasid = data->hpasid;
if (data->flags & IOMMU_SVA_GPASID_VAL) {
svm->gpasid = data->gpasid;
@@ -351,7 +345,6 @@ int intel_svm_bind_gpasid(struct iommu_domain *domain, struct device *dev,
}
ioasid_attach_data(data->hpasid, svm);
	INIT_LIST_HEAD_RCU(&svm->devs);
-   mmput(svm->mm);
}
sdev = kzalloc(sizeof(*sdev), GFP_KERNEL);
if (!sdev) {
-- 
2.7.4



[PATCH v3 02/14] iommu/ioasid: Rename ioasid_set_data()

2020-09-28 Thread Jacob Pan
Rename ioasid_set_data() to ioasid_attach_data() to avoid confusion with
struct ioasid_set. ioasid_set is a group of IOASIDs that share a common
token.

Reviewed-by: Jean-Philippe Brucker 
Signed-off-by: Jacob Pan 
---
 drivers/iommu/intel/svm.c | 6 +++---
 drivers/iommu/ioasid.c| 6 +++---
 include/linux/ioasid.h| 4 ++--
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 0cb9a15f1112..2c5645f0737a 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -346,7 +346,7 @@ int intel_svm_bind_gpasid(struct iommu_domain *domain, struct device *dev,
svm->gpasid = data->gpasid;
svm->flags |= SVM_FLAG_GUEST_PASID;
}
-   ioasid_set_data(data->hpasid, svm);
+   ioasid_attach_data(data->hpasid, svm);
	INIT_LIST_HEAD_RCU(&svm->devs);
mmput(svm->mm);
}
@@ -398,7 +398,7 @@ int intel_svm_bind_gpasid(struct iommu_domain *domain, struct device *dev,
	list_add_rcu(&sdev->list, &svm->devs);
  out:
	if (!IS_ERR_OR_NULL(svm) && list_empty(&svm->devs)) {
-   ioasid_set_data(data->hpasid, NULL);
+   ioasid_attach_data(data->hpasid, NULL);
kfree(svm);
}
 
@@ -441,7 +441,7 @@ int intel_svm_unbind_gpasid(struct device *dev, int pasid)
 * the unbind, IOMMU driver will get notified
 * and perform cleanup.
 */
-   ioasid_set_data(pasid, NULL);
+   ioasid_attach_data(pasid, NULL);
kfree(svm);
}
}
diff --git a/drivers/iommu/ioasid.c b/drivers/iommu/ioasid.c
index 0f8dd377aada..5f63af07acd5 100644
--- a/drivers/iommu/ioasid.c
+++ b/drivers/iommu/ioasid.c
@@ -258,14 +258,14 @@ void ioasid_unregister_allocator(struct ioasid_allocator_ops *ops)
 EXPORT_SYMBOL_GPL(ioasid_unregister_allocator);
 
 /**
- * ioasid_set_data - Set private data for an allocated ioasid
+ * ioasid_attach_data - Set private data for an allocated ioasid
  * @ioasid: the ID to set data
  * @data:   the private data
  *
  * For IOASID that is already allocated, private data can be set
  * via this API. Future lookup can be done via ioasid_find.
  */
-int ioasid_set_data(ioasid_t ioasid, void *data)
+int ioasid_attach_data(ioasid_t ioasid, void *data)
 {
struct ioasid_data *ioasid_data;
int ret = 0;
@@ -287,7 +287,7 @@ int ioasid_set_data(ioasid_t ioasid, void *data)
 
return ret;
 }
-EXPORT_SYMBOL_GPL(ioasid_set_data);
+EXPORT_SYMBOL_GPL(ioasid_attach_data);
 
 /**
  * ioasid_alloc - Allocate an IOASID
diff --git a/include/linux/ioasid.h b/include/linux/ioasid.h
index 6f000d7a0ddc..9c44947a68c8 100644
--- a/include/linux/ioasid.h
+++ b/include/linux/ioasid.h
@@ -39,7 +39,7 @@ void *ioasid_find(struct ioasid_set *set, ioasid_t ioasid,
  bool (*getter)(void *));
 int ioasid_register_allocator(struct ioasid_allocator_ops *allocator);
 void ioasid_unregister_allocator(struct ioasid_allocator_ops *allocator);
-int ioasid_set_data(ioasid_t ioasid, void *data);
+int ioasid_attach_data(ioasid_t ioasid, void *data);
 
 #else /* !CONFIG_IOASID */
 static inline ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min,
@@ -67,7 +67,7 @@ static inline void ioasid_unregister_allocator(struct ioasid_allocator_ops *allocator)
 {
 }
 
-static inline int ioasid_set_data(ioasid_t ioasid, void *data)
+static inline int ioasid_attach_data(ioasid_t ioasid, void *data)
 {
return -ENOTSUPP;
 }
-- 
2.7.4



[PATCH v3 03/14] iommu/ioasid: Add a separate function for detach data

2020-09-28 Thread Jacob Pan
IOASID private data can be cleared by ioasid_attach_data() with a NULL
data pointer. A common use case is for a caller to free the data
afterward. ioasid_attach_data() calls synchronize_rcu() before
returning, so that the data can be freed safely once there are no
outstanding readers.
However, since synchronize_rcu() may sleep, ioasid_attach_data() cannot
be used under spinlocks.

This patch adds ioasid_detach_data() as a separate API, where
synchronize_rcu() is called only in this case. ioasid_attach_data() can
then be used under spinlocks. In addition, this change makes the API
symmetrical.
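
To illustrate the intended calling convention after this change (the
lock name, 'pasid' and 'priv' are assumptions):

	spin_lock(&my_lock);		/* my_lock is a caller-side lock (assumption) */
	ret = ioasid_attach_data(pasid, priv);	/* no synchronize_rcu(), lock-safe */
	spin_unlock(&my_lock);

	ioasid_detach_data(pasid);	/* may sleep: synchronize_rcu() inside */
	kfree(priv);			/* no outstanding readers remain */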

Signed-off-by: Jacob Pan 
---
 drivers/iommu/intel/svm.c |  4 ++--
 drivers/iommu/ioasid.c| 54 ++-
 include/linux/ioasid.h|  5 -
 3 files changed, 50 insertions(+), 13 deletions(-)

diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 2c5645f0737a..06a16bee7b65 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -398,7 +398,7 @@ int intel_svm_bind_gpasid(struct iommu_domain *domain, struct device *dev,
	list_add_rcu(&sdev->list, &svm->devs);
  out:
	if (!IS_ERR_OR_NULL(svm) && list_empty(&svm->devs)) {
-   ioasid_attach_data(data->hpasid, NULL);
+   ioasid_detach_data(data->hpasid);
kfree(svm);
}
 
@@ -441,7 +441,7 @@ int intel_svm_unbind_gpasid(struct device *dev, int pasid)
 * the unbind, IOMMU driver will get notified
 * and perform cleanup.
 */
-   ioasid_attach_data(pasid, NULL);
+   ioasid_detach_data(pasid);
kfree(svm);
}
}
diff --git a/drivers/iommu/ioasid.c b/drivers/iommu/ioasid.c
index 5f63af07acd5..6cfbdfb492e0 100644
--- a/drivers/iommu/ioasid.c
+++ b/drivers/iommu/ioasid.c
@@ -272,24 +272,58 @@ int ioasid_attach_data(ioasid_t ioasid, void *data)
 
	spin_lock(&ioasid_allocator_lock);
	ioasid_data = xa_load(&active_allocator->xa, ioasid);
-   if (ioasid_data)
-   rcu_assign_pointer(ioasid_data->private, data);
-   else
+
+   if (!ioasid_data) {
ret = -ENOENT;
-   spin_unlock(&ioasid_allocator_lock);
+   goto done_unlock;
+   }
 
-   /*
-* Wait for readers to stop accessing the old private data, so the
-* caller can free it.
-*/
-   if (!ret)
-   synchronize_rcu();
+   if (ioasid_data->private) {
+   ret = -EBUSY;
+   goto done_unlock;
+   }
+   rcu_assign_pointer(ioasid_data->private, data);
+
+done_unlock:
+   spin_unlock(&ioasid_allocator_lock);
 
return ret;
 }
 EXPORT_SYMBOL_GPL(ioasid_attach_data);
 
 /**
+ * ioasid_detach_data - Clear the private data of an ioasid
+ *
+ * @ioasid: the IOASID whose private data to clear
+ */
+void ioasid_detach_data(ioasid_t ioasid)
+{
+   struct ioasid_data *ioasid_data;
+
+   spin_lock(&ioasid_allocator_lock);
+   ioasid_data = xa_load(&active_allocator->xa, ioasid);
+
+   if (!ioasid_data) {
+   pr_warn("IOASID %u not found to detach data from\n", ioasid);
+   goto done_unlock;
+   }
+
+   if (ioasid_data->private) {
+   rcu_assign_pointer(ioasid_data->private, NULL);
+   goto done_unlock;
+   }
+
+done_unlock:
+   spin_unlock(&ioasid_allocator_lock);
+   /*
+* Wait for readers to stop accessing the old private data,
+* so the caller can free it.
+*/
+   synchronize_rcu();
+}
+EXPORT_SYMBOL_GPL(ioasid_detach_data);
+
+/**
  * ioasid_alloc - Allocate an IOASID
  * @set: the IOASID set
  * @min: the minimum ID (inclusive)
diff --git a/include/linux/ioasid.h b/include/linux/ioasid.h
index 9c44947a68c8..c7f649fa970a 100644
--- a/include/linux/ioasid.h
+++ b/include/linux/ioasid.h
@@ -40,7 +40,7 @@ void *ioasid_find(struct ioasid_set *set, ioasid_t ioasid,
 int ioasid_register_allocator(struct ioasid_allocator_ops *allocator);
 void ioasid_unregister_allocator(struct ioasid_allocator_ops *allocator);
 int ioasid_attach_data(ioasid_t ioasid, void *data);
-
+void ioasid_detach_data(ioasid_t ioasid);
 #else /* !CONFIG_IOASID */
 static inline ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min,
ioasid_t max, void *private)
@@ -72,5 +72,8 @@ static inline int ioasid_attach_data(ioasid_t ioasid, void *data)
return -ENOTSUPP;
 }
 
+static inline void ioasid_detach_data(ioasid_t ioasid)
+{
+}
 #endif /* CONFIG_IOASID */
 #endif /* __LINUX_IOASID_H */
-- 
2.7.4



[PATCH v3 05/14] iommu/ioasid: Redefine IOASID set and allocation APIs

2020-09-28 Thread Jacob Pan
ioasid_set was introduced as an arbitrary token that is shared by a
group of IOASIDs. For example, two IOASIDs allocated via the same
ioasid_set pointer belong to the same set.

For guest SVA usages, system-wide IOASID resources need to be
partitioned such that each VM can have its own quota and be managed
separately. ioasid_set is the perfect candidate for meeting such
requirements. This patch redefines and extends ioasid_set with the
following new fields:
- Quota
- Reference count
- Storage of its namespace
- The token is now stored in the ioasid_set with types

Basic ioasid_set level APIs are introduced that wire up this new data.
Existing users of IOASID APIs are converted where a host IOASID set is
allocated for bare-metal usage.
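
A minimal usage sketch of the redefined APIs (the function name, token,
quota and ID range are assumptions):

	/* illustrative only; error handling trimmed */
	static ioasid_t example_alloc(struct mm_struct *mm, ioasid_t quota,
				      ioasid_t min, ioasid_t max)
	{
		struct ioasid_set *set;
		ioasid_t id;

		set = ioasid_set_alloc(mm, quota, IOASID_SET_TYPE_MM);
		if (IS_ERR_OR_NULL(set))
			return INVALID_IOASID;

		id = ioasid_alloc(set, min, max, NULL);
		if (id == INVALID_IOASID)
			ioasid_set_put(set);
		return id;
	}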

Signed-off-by: Liu Yi L 
Signed-off-by: Jacob Pan 
---
 drivers/iommu/intel/iommu.c |  26 +++--
 drivers/iommu/intel/pasid.h |   1 +
 drivers/iommu/intel/svm.c   |  25 ++--
 drivers/iommu/ioasid.c  | 277 
 include/linux/ioasid.h  |  53 +++--
 5 files changed, 333 insertions(+), 49 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index e7bcb299e51e..872391890323 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -104,6 +104,9 @@
  */
 #define INTEL_IOMMU_PGSIZES	(~0xFFFUL)
 
+/* PASIDs used by host SVM */
+struct ioasid_set *host_pasid_set;
+
 static inline int agaw_to_level(int agaw)
 {
return agaw + 2;
@@ -3147,8 +3150,8 @@ static void intel_vcmd_ioasid_free(ioasid_t ioasid, void *data)
 * Sanity check the ioasid owner is done at upper layer, e.g. VFIO
 * We can only free the PASID when all the devices are unbound.
 */
-   if (ioasid_find(NULL, ioasid, NULL)) {
-   pr_alert("Cannot free active IOASID %d\n", ioasid);
+   if (IS_ERR(ioasid_find(host_pasid_set, ioasid, NULL))) {
+   pr_err("IOASID %d to be freed but not in system set\n", ioasid);
return;
}
vcmd_free_pasid(iommu, ioasid);
@@ -3333,8 +3336,17 @@
goto free_iommu;
 
/* PASID is needed for scalable mode irrespective to SVM */
-   if (intel_iommu_sm)
+   if (intel_iommu_sm) {
ioasid_install_capacity(intel_pasid_max_id);
+   /* We should not run out of IOASIDs at boot */
+   host_pasid_set = ioasid_set_alloc(NULL, PID_MAX_DEFAULT,
+ IOASID_SET_TYPE_NULL);
+   if (IS_ERR_OR_NULL(host_pasid_set)) {
+   pr_err("Failed to allocate host PASID set %lu\n",
+   PTR_ERR(host_pasid_set));
+   intel_iommu_sm = 0;
+   }
+   }
 
/*
 * for each drhd
@@ -3381,7 +3393,7 @@ static int __init init_dmars(void)
disable_dmar_iommu(iommu);
free_dmar_iommu(iommu);
}
-
+   ioasid_set_put(host_pasid_set);
kfree(g_iommus);
 
 error:
@@ -5163,7 +5175,7 @@ static void auxiliary_unlink_device(struct dmar_domain *domain,
domain->auxd_refcnt--;
 
if (!domain->auxd_refcnt && domain->default_pasid > 0)
-   ioasid_free(domain->default_pasid);
+   ioasid_free(host_pasid_set, domain->default_pasid);
 }
 
 static int aux_domain_add_dev(struct dmar_domain *domain,
@@ -5181,7 +5193,7 @@ static int aux_domain_add_dev(struct dmar_domain *domain,
int pasid;
 
/* No private data needed for the default pasid */
-   pasid = ioasid_alloc(NULL, PASID_MIN,
+   pasid = ioasid_alloc(host_pasid_set, PASID_MIN,
 pci_max_pasids(to_pci_dev(dev)) - 1,
 NULL);
if (pasid == INVALID_IOASID) {
@@ -5224,7 +5236,7 @@ static int aux_domain_add_dev(struct dmar_domain *domain,
	spin_unlock(&iommu->lock);
	spin_unlock_irqrestore(&device_domain_lock, flags);
if (!domain->auxd_refcnt && domain->default_pasid > 0)
-   ioasid_free(domain->default_pasid);
+   ioasid_free(host_pasid_set, domain->default_pasid);
 
return ret;
 }
diff --git a/drivers/iommu/intel/pasid.h b/drivers/iommu/intel/pasid.h
index c9850766c3a9..ccdc23446015 100644
--- a/drivers/iommu/intel/pasid.h
+++ b/drivers/iommu/intel/pasid.h
@@ -99,6 +99,7 @@ static inline bool pasid_pte_is_present(struct pasid_entry *pte)
 }
 
 extern u32 intel_pasid_max_id;
+extern struct ioasid_set *host_pasid_set;
 int intel_pasid_alloc_id(void *ptr, int start, int end, gfp_t gfp);
 void intel_pasid_free_id(int pasid);
 void *intel_pasid_lookup_id(int pasid);
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 06a16bee7b65..2e764e283469 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -228,7 +228,9 @@ static LIST_HEAD(global_svm_list);
   

[PATCH v3 07/14] iommu/ioasid: Add an iterator API for ioasid_set

2020-09-28 Thread Jacob Pan
Users of an ioasid_set may not keep track of all the IOASIDs allocated
under the set. When collective actions are needed for each IOASID, it
is useful to iterate over all the IOASIDs within the set. For example,
when the ioasid_set is freed, the user might perform the same cleanup
operation on each IOASID.

This patch adds an API to iterate over all the IOASIDs within the set.
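
For example, a teardown path might look like this sketch (the callback
and context are assumptions; per the kernel-doc below, the caller holds
a set reference and handles its own locking):

	/* illustrative callback */
	static void cleanup_one(ioasid_t id, void *data)
	{
		/* e.g. flush device contexts associated with @id */
	}

	static void my_set_teardown(struct ioasid_set *set, void *ctx)
	{
		ioasid_set_for_each_ioasid(set, cleanup_one, ctx);
	}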

Signed-off-by: Jacob Pan 
---
 drivers/iommu/ioasid.c | 17 +
 include/linux/ioasid.h |  9 +
 2 files changed, 26 insertions(+)

diff --git a/drivers/iommu/ioasid.c b/drivers/iommu/ioasid.c
index cf8c7d34e2de..9628e78b2ab4 100644
--- a/drivers/iommu/ioasid.c
+++ b/drivers/iommu/ioasid.c
@@ -701,6 +701,23 @@ int ioasid_adjust_set(struct ioasid_set *set, int quota)
 EXPORT_SYMBOL_GPL(ioasid_adjust_set);
 
 /**
+ * ioasid_set_for_each_ioasid - Iterate over all the IOASIDs within the set
+ *
+ * Caller must hold a reference to the set and handle its own locking.
+ */
+void ioasid_set_for_each_ioasid(struct ioasid_set *set,
+   void (*fn)(ioasid_t id, void *data),
+   void *data)
+{
+   struct ioasid_data *entry;
+   unsigned long index;
+
+   xa_for_each(&set->xa, index, entry)
+   fn(index, data);
+}
+EXPORT_SYMBOL_GPL(ioasid_set_for_each_ioasid);
+
+/**
  * ioasid_find - Find IOASID data
  * @set: the IOASID set
  * @ioasid: the IOASID to find
diff --git a/include/linux/ioasid.h b/include/linux/ioasid.h
index 0a5e82148eb9..aab58bc26714 100644
--- a/include/linux/ioasid.h
+++ b/include/linux/ioasid.h
@@ -75,6 +75,9 @@ int ioasid_register_allocator(struct ioasid_allocator_ops *allocator);
 void ioasid_unregister_allocator(struct ioasid_allocator_ops *allocator);
 int ioasid_attach_data(ioasid_t ioasid, void *data);
 void ioasid_detach_data(ioasid_t ioasid);
+void ioasid_set_for_each_ioasid(struct ioasid_set *sdata,
+   void (*fn)(ioasid_t id, void *data),
+   void *data);
 #else /* !CONFIG_IOASID */
 static inline void ioasid_install_capacity(ioasid_t total)
 {
@@ -131,5 +134,11 @@ static inline int ioasid_attach_data(ioasid_t ioasid, void *data)
 static inline void ioasid_detach_data(ioasid_t ioasid)
 {
 }
+
+static inline void ioasid_set_for_each_ioasid(struct ioasid_set *sdata,
+ void (*fn)(ioasid_t id, void *data),
+ void *data)
+{
+}
 #endif /* CONFIG_IOASID */
 #endif /* __LINUX_IOASID_H */
-- 
2.7.4



[PATCH v3 14/14] iommu/vt-d: Store guest PASID during bind

2020-09-28 Thread Jacob Pan
IOASID core maintains the guest-host mapping in the form of SPID and
IOASID. This patch assigns the guest PASID (if valid) as SPID while
binding guest page table with a host PASID. This mapping will be used
for lookup and notifications.

Signed-off-by: Jacob Pan 
---
 drivers/iommu/intel/svm.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 8f886718df83..e18f8b5af9ba 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -98,6 +98,7 @@ static inline bool intel_svm_capable(struct intel_iommu *iommu)
 static inline void intel_svm_drop_pasid(ioasid_t pasid)
 {
ioasid_detach_data(pasid);
+   ioasid_detach_spid(pasid);
ioasid_put(NULL, pasid);
 }
 
@@ -425,6 +426,7 @@ int intel_svm_bind_gpasid(struct iommu_domain *domain, struct device *dev,
if (data->flags & IOMMU_SVA_GPASID_VAL) {
svm->gpasid = data->gpasid;
svm->flags |= SVM_FLAG_GUEST_PASID;
+   ioasid_attach_spid(data->hpasid, data->gpasid);
}
ioasid_attach_data(data->hpasid, svm);
ioasid_get(NULL, svm->pasid);
-- 
2.7.4



[PATCH v3 00/14] IOASID extensions for guest SVA

2020-09-28 Thread Jacob Pan
IOASID was introduced in v5.5 as a generic kernel allocator service for
both PCIe Process Address Space ID (PASID) and ARM SMMU's Sub Stream
ID. In addition to basic ID allocation, ioasid_set was defined as a
token that is shared by a group of IOASIDs. This set token can be used
for permission checking, but it lacks some features needed to address the
following requirements of guest Shared Virtual Address (SVA):
- Manage IOASIDs by group, group ownership, quota, etc.
- State synchronization among IOASID users
- Non-identity guest-host IOASID mapping
- Lifecycle management across many users

This patchset introduces the following extensions as solutions to the
problems above.
- Redefine and extend IOASID set such that IOASIDs can be managed by groups.
- Add notifications for IOASID state synchronization
- Add reference counting for life cycle alignment among users
- Support ioasid_set private IDs, which can be used as guest IOASIDs
Please refer to Documentation/ioasid.rst in the enclosed patch 1/14 for
more details.

This patchset only includes the VT-d driver as a user of some of the new
APIs. VFIO and KVM patches are coming up to fully utilize the APIs
introduced here.

You can find this series at:
https://github.com/jacobpan/linux.git ioasid_v3
(VFIO and KVM patches will be available at this branch when published.)

This work is a result of collaboration with many people:
Liu, Yi L 
Wu Hao 
Ashok Raj 
Kevin Tian 

Thanks,

Jacob

Changelog:

V3:
- Use consistent ioasid_set_ prefix for ioasid_set level APIs
- Make SPID and private detach/attach APIs symmetric
- Use the same ioasid_put semantics as Jean-Phillippe IOASID reference patch
- Take away the public ioasid_notify() function, notifications are now emitted
  by IOASID core as a result of certain IOASID APIs
- Partition into finer incremental patches
- Miscellaneous cleanup, locking, exception handling fixes based on v2 reviews

V2:
- Redesigned ioasid_set APIs, removed set ID
- Added set private ID (SPID) for guest PASID usage.
- Add per ioasid_set notification and priority support.
- Back to use spinlocks and atomic notifications.
- Added async work in VT-d driver to perform teardown outside atomic context

Jacob Pan (14):
  docs: Document IO Address Space ID (IOASID) APIs
  iommu/ioasid: Rename ioasid_set_data()
  iommu/ioasid: Add a separate function for detach data
  iommu/ioasid: Support setting system-wide capacity
  iommu/ioasid: Redefine IOASID set and allocation APIs
  iommu/ioasid: Introduce API to adjust the quota of an ioasid_set
  iommu/ioasid: Add an iterator API for ioasid_set
  iommu/ioasid: Add reference counting functions
  iommu/ioasid: Introduce ioasid_set private ID
  iommu/ioasid: Introduce notification APIs
  iommu/ioasid: Support mm type ioasid_set notifications
  iommu/vt-d: Remove mm reference for guest SVA
  iommu/vt-d: Listen to IOASID notifications
  iommu/vt-d: Store guest PASID during bind

 Documentation/driver-api/ioasid.rst | 648 ++
 drivers/iommu/intel/iommu.c |  29 +-
 drivers/iommu/intel/pasid.h |   1 +
 drivers/iommu/intel/svm.c   | 132 +-
 drivers/iommu/ioasid.c  | 890 ++--
 include/linux/intel-iommu.h |   2 +
 include/linux/ioasid.h  | 197 +++-
 7 files changed, 1830 insertions(+), 69 deletions(-)
 create mode 100644 Documentation/driver-api/ioasid.rst

-- 
2.7.4



[PATCH v3 11/14] iommu/ioasid: Support mm type ioasid_set notifications

2020-09-28 Thread Jacob Pan
As a system-wide resource, IOASID is often shared by multiple kernel
subsystems that are independent of each other. However, at the
ioasid_set level, these kernel subsystems must communicate with each
other for ownership checking, event notifications, etc. For example, on
Intel Scalable IO Virtualization (SIOV) enabled platforms, KVM and VFIO
instances under the same process/guest must be aware of a shared IOASID
set.
The IOASID_SET_TYPE_MM token type was introduced to explicitly mark an
IOASID set that belongs to a process, which therefore uses the mm_struct
pointer as its token. Users within the same process can then identify
one another based on this token.

This patch introduces MM token specific event registration APIs. Event
subscribers such as KVM instances can register an IOASID event handler
without knowledge of the ioasid_set. Event handlers are registered
based on the mm_struct pointer as a token. In case subscribers
register a handler *prior* to the creation of the ioasid_set, the
handler's notification block is stored in a pending list within the
IOASID core. Once the ioasid_set of the MM token is created, the
notification block is registered by the IOASID core.
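
A KVM-style subscriber might then do something like this sketch (the
kvm->ioasid_nb field and the helper are assumptions):

	static int kvm_ioasid_event(struct notifier_block *nb,
				    unsigned long cmd, void *data);

	static int kvm_subscribe_ioasid(struct kvm *kvm)
	{
		/* kvm->ioasid_nb is a hypothetical field */
		kvm->ioasid_nb.notifier_call = kvm_ioasid_event;
		/* queued on the pending list if the mm's set doesn't exist yet */
		return ioasid_register_notifier_mm(kvm->mm, &kvm->ioasid_nb);
	}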

Signed-off-by: Liu Yi L 
Signed-off-by: Wu Hao 
Signed-off-by: Jacob Pan 
---
 drivers/iommu/ioasid.c | 117 +
 include/linux/ioasid.h |  15 +++
 2 files changed, 132 insertions(+)

diff --git a/drivers/iommu/ioasid.c b/drivers/iommu/ioasid.c
index 894b17c06ead..d5faeb559a43 100644
--- a/drivers/iommu/ioasid.c
+++ b/drivers/iommu/ioasid.c
@@ -889,6 +889,29 @@ void ioasid_set_put(struct ioasid_set *set)
 }
 EXPORT_SYMBOL_GPL(ioasid_set_put);
 
+/*
+ * ioasid_find_mm_set - Retrieve IOASID set with mm token
+ * Take a reference of the set if found.
+ */
+static struct ioasid_set *ioasid_find_mm_set(struct mm_struct *token)
+{
+   struct ioasid_set *set;
+   unsigned long index;
+
+   spin_lock(&ioasid_allocator_lock);
+
+   xa_for_each(&ioasid_sets, index, set) {
+   if (set->type == IOASID_SET_TYPE_MM && set->token == token) {
+   refcount_inc(&set->ref);
+   goto exit_unlock;
+   }
+   }
+   set = NULL;
+exit_unlock:
+   spin_unlock(&ioasid_allocator_lock);
+   return set;
+}
+
 /**
  * ioasid_adjust_set - Adjust the quota of an IOASID set
  * @set:   IOASID set to be adjusted
@@ -1121,6 +1144,100 @@ void ioasid_unregister_notifier(struct ioasid_set *set,
 }
 EXPORT_SYMBOL_GPL(ioasid_unregister_notifier);
 
+/**
+ * ioasid_register_notifier_mm - Register a notifier block on the IOASID set
+ *   created by the mm_struct pointer as the token
+ *
+ * @mm: the mm_struct token of the ioasid_set
+ * @nb: notifier block to be registered on the ioasid_set
+ *
+ * This is a variant of ioasid_register_notifier() where the caller intends to
+ * listen to IOASID events belonging to the ioasid_set created under the same
+ * process. The caller is not aware of the ioasid_set and does not need to
+ * hold a reference to it.
+ */
+int ioasid_register_notifier_mm(struct mm_struct *mm, struct notifier_block *nb)
+{
+   struct ioasid_set_nb *curr;
+   struct ioasid_set *set;
+   int ret = 0;
+
+   if (!mm)
+   return -EINVAL;
+
+   spin_lock(&ioasid_nb_lock);
+
+   /* Check for duplicates, nb is unique per set */
+   list_for_each_entry(curr, &ioasid_nb_pending_list, list) {
+   if (curr->token == mm && curr->nb == nb) {
+   ret = -EBUSY;
+   goto exit_unlock;
+   }
+   }
+
+   /* Check if the token has an existing set */
+   set = ioasid_find_mm_set(mm);
+
+   curr = kzalloc(sizeof(*curr), GFP_ATOMIC);
+   if (!curr) {
+   ret = -ENOMEM;
+   goto exit_unlock;
+   }
+   curr->token = mm;
+   curr->nb = nb;
+
+   if (!set) {
+   /* Add to the pending list as inactive until the set is created */
+   curr->active = false;
+   } else {
+   /* REVISIT: Only register empty set for now. Can add an option
+* in the future to playback existing PASIDs.
+*/
+   if (set->nr_ioasids) {
+   pr_warn("IOASID set %d not empty\n", set->id);
+   kfree(curr);
+   ioasid_set_put(set);
+   ret = -EBUSY;
+   goto exit_unlock;
+   }
+   curr->active = true;
+   curr->set = set;
+
+   /* Set already created, add to the notifier chain */
+   atomic_notifier_chain_register(&set->nh, nb);
+   /*
+* Do not hold a reference; if the set gets destroyed, the nb
+* entry will be marked inactive.
+*/
+   ioasid_set_put(set);
+   }
+
+   list_add(&curr->list, &ioasid_nb_pending_list);
+
+exit_unlock:
+   spin_unlock(&ioasid_nb_lock);
+   return ret;
+}

Re: [RFC 0/3] iommu: Reserved regions for IOVAs beyond dma_mask and iommu aperture

2020-09-28 Thread Alex Williamson
On Mon, 28 Sep 2020 21:50:34 +0200
Eric Auger  wrote:

> VFIO currently exposes the usable IOVA regions through the
> VFIO_IOMMU_GET_INFO ioctl / VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE
> capability. However it fails to take into account the dma_mask
> of the devices within the container. The top limit currently is
> defined by the iommu aperture.

I think that dma_mask is traditionally a DMA API interface for a device
driver to indicate to the DMA layer which mappings are accessible to the
device.  On the other hand, vfio makes use of the IOMMU API where the
driver is in userspace.  That userspace driver has full control of the
IOVA range of the device, therefore dma_mask is mostly irrelevant to
vfio.  I think the issue you're trying to tackle is that the IORT code
is making use of the dma_mask to try to describe a DMA address
limitation imposed by the PCI root bus, living between the endpoint
device and the IOMMU.  Therefore, if the IORT code is exposing a
topology or system imposed device limitation, this seems much more akin
to something like an MSI reserved range, where it's not necessarily the
device or the IOMMU with the limitation, but something that sits
between them.

> So, for instance, if the IOMMU supports up to 48 bits, it may give
> the impression the max IOVA is 48b while a device may have a
> dma_mask of 42b. So this API cannot really be used to compute
> the max usable IOVA.
> 
> This patch removes the IOVA regions beyond the dma_mask.

Rather it adds a reserved region accounting for the range above the
device's dma_mask.  I don't think the IOMMU API should be consuming
dma_mask like this though.  For example, what happens in
pci_dma_configure() when there are no OF or ACPI DMA restrictions?  It
appears to me that the dma_mask from whatever previous driver had the
device carries over to the new driver.  That's generally ok for the DMA
API because a driver is required to set the device's DMA mask.  It
doesn't make sense however to blindly consume that dma_mask and export
it via an IOMMU API.  For example I would expect to see different
results depending on whether a host driver has been bound to a device.
It seems the correct IOMMU API approach would be for the IORT code to
specifically register reserved ranges for the device.
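
For illustration, such a registration could build on the existing
reserved-region helpers; the function name and limit argument here are
hypothetical:

	/* hypothetical helper for firmware (e.g. IORT) code */
	static void iort_add_dma_limit_resv(u64 limit, struct list_head *list)
	{
		struct iommu_resv_region *region;

		/* mark everything above the bus limit as unusable IOVA */
		region = iommu_alloc_resv_region(limit + 1, ULLONG_MAX - limit,
						 0, IOMMU_RESV_RESERVED);
		if (region)
			list_add_tail(&region->list, list);
	}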

> As we start to expose this reserved region in the sysfs file
> /sys/kernel/iommu_groups/<group-id>/reserved_regions, we also need to
> handle the IOVA range beyond the IOMMU aperture to handle the case
> where the dma_mask would have a higher number of bits than the iommu
> max input address.

Why?  The IOMMU geometry already describes this and vfio combines both
the IOMMU geometry and the device reserved regions when generating the
IOVA ranges?  Who is going to consume this information?  Additionally
it appears that reserved regions will report different information
depending on whether a device is attached to a domain.

> This is a change to the ABI as this reserved region was not yet
> exposed in sysfs /sys/kernel/iommu_groups/<group-id>/reserved_regions or
> through the VFIO ioctl. At VFIO level we increment the version of
> the VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE capability to advertise
> that change.

Is this really an ABI change?  The original entry for reserved regions
includes:

  Not necessarily all reserved regions are listed. This is typically
  used to output direct-mapped, MSI, non mappable regions.

I imagine the intention here was non-mappable relative to the IOMMU,
but non-mappable to the device is essentially what we're including
here.

I'm also concerned about bumping the vfio interface version for the
IOVA range.  We're not changing the interface, we're modifying the
result, and even then only for a fraction of users.  How many users are
potentially broken by that change?  Are we going to bump the version
for everyone any time the result changes on any platform?  Thanks,

Alex



Re: [PATCH] iommu/arm-smmu-v3: Add rmb after reading event queue prod_reg

2020-09-28 Thread Will Deacon
On Mon, 28 Sep 2020 16:32:02 +0800, Zhou Wang wrote:
> In arm_smmu_evtq_thread, the event queue is read starting from the
> consumer pointer, which has no address dependency on the producer
> pointer; the prod_reg (MMIO) and event queue memory (Normal memory)
> accesses can be reordered. So the load from the event queue can be done
> before the load of prod_reg, and a wrong event entry value may be read.
> 
> Add an rmb to make sure the correct event queue entry value is read.

Applied to will (for-joerg/arm-smmu/updates), thanks!

[1/1] iommu/arm-smmu-v3: Add rmb after reading event queue prod_reg
  https://git.kernel.org/will/c/a76a3f2c 

(please note that I changed the patch to use readl() instead of an rmb()
in conjunction with the _relaxed() accessor, and then adjusted the cons
side to match in terms of DMB vs DSB).
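
For illustration, the resulting pattern is roughly the following (the
queue field names are hypothetical):

	/* readl() orders the MMIO read before subsequent loads, so the
	 * entry read below cannot see a stale queue slot. With
	 * readl_relaxed(), an explicit rmb() would be needed in between.
	 * q->prod_reg, q->base, q->cons, q->nents are hypothetical fields. */
	u32 prod = readl(q->prod_reg);

	while (q->cons != prod) {
		u64 evt = q->base[q->cons & (q->nents - 1)];

		/* ... process evt ... */
		q->cons++;
	}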

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev


Re: [PATCH v10 01/13] mm: Define pasid in mm

2020-09-28 Thread Fenghua Yu
Hi, Will and Jean,

On Mon, Sep 28, 2020 at 11:22:51PM +0100, Will Deacon wrote:
> On Fri, Sep 18, 2020 at 12:18:41PM +0200, Jean-Philippe Brucker wrote:
> > From: Fenghua Yu 
> > 
> > PASID is shared by all threads in a process. So the logical place to keep
> > track of it is in the "mm". Both ARM and X86 need to use the PASID in the
> > "mm".
> > 
> > Suggested-by: Christoph Hellwig 
> > Signed-off-by: Fenghua Yu 
> > Reviewed-by: Tony Luck 
> > ---
> > https://lore.kernel.org/linux-iommu/1600187413-163670-8-git-send-email-fenghua...@intel.com/
> > ---
> >  include/linux/mm_types.h | 4 
> >  1 file changed, 4 insertions(+)
> 
> Acked-by: Will Deacon 

FYI. This patch is in x86 maintainers tree tip:x86/pasid now as part of
the x86 PASID MSR series.

Thanks.

-Fenghua


Re: [PATCH] iommu/qcom: add missing put_device() call in qcom_iommu_of_xlate()

2020-09-28 Thread Will Deacon
On Mon, Sep 21, 2020 at 09:45:57PM +0100, Will Deacon wrote:
> On Tue, Sep 22, 2020 at 03:13:53AM +0800, kernel test robot wrote:
> > Thank you for the patch! Perhaps something to improve:
> > 
> > [auto build test WARNING on iommu/next]
> > [also build test WARNING on linus/master v5.9-rc6 next-20200921]
> > [cannot apply to robclark/msm-next]
> > [If your patch is applied to the wrong git tree, kindly drop us a note.
> > And when submitting patch, we suggest to use '--base' as documented in
> > https://git-scm.com/docs/git-format-patch]
> > 
> > url:
> > https://github.com/0day-ci/linux/commits/Yu-Kuai/iommu-qcom-add-missing-put_device-call-in-qcom_iommu_of_xlate/20200918-091341
> > base:   https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git next
> > config: arm64-randconfig-r023-20200920 (attached as .config)
> > compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project 
> > 4e8c028158b56d9c2142a62464e8e0686bde3584)
> > reproduce (this is a W=1 build):
> > wget 
> > https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
> > ~/bin/make.cross
> > chmod +x ~/bin/make.cross
> > # install arm64 cross compiling tool for clang build
> > # apt-get install binutils-aarch64-linux-gnu
> > # save the attached .config to linux build tree
> > COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross 
> > ARCH=arm64 
> > 
> > If you fix the issue, kindly add following tag as appropriate
> > Reported-by: kernel test robot 
> > 
> > All warnings (new ones prefixed by >>):
> > 
> > >> drivers/iommu/arm/arm-smmu/qcom_iommu.c:601:4: warning: misleading 
> > >> indentation; statement is not part of the previous 'if' 
> > >> [-Wmisleading-indentation]
> >return -EINVAL;
> >^
> >drivers/iommu/arm/arm-smmu/qcom_iommu.c:599:3: note: previous statement 
> > is here
> >if (WARN_ON(qcom_iommu != dev_iommu_priv_get(dev)))
> 
> Oh, this looks like a nasty bug. Seems we're missing some braces.

Yu Kuai: please could you send a v2 of this?

Will


[RFC 3/3] vfio/type1: Increase the version of VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE

2020-09-28 Thread Eric Auger
Now that the IOVA regions beyond the dma_mask and the vfio aperture are
removed from the usable IOVA ranges, the API can reliably be used to
compute the max IOVA. Let's advertise this by using a new version
for the capability.

Signed-off-by: Eric Auger 
---
 drivers/vfio/vfio_iommu_type1.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 5fbf0c1f7433..af4596123954 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -2541,7 +2541,7 @@ static int vfio_iommu_iova_add_cap(struct vfio_info_cap *caps,
struct vfio_iommu_type1_info_cap_iova_range *iova_cap;
 
header = vfio_info_cap_add(caps, size,
-  VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE, 1);
+  VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE, 2);
if (IS_ERR(header))
return PTR_ERR(header);
 
-- 
2.21.3



[RFC 0/3] iommu: Reserved regions for IOVAs beyond dma_mask and iommu aperture

2020-09-28 Thread Eric Auger
VFIO currently exposes the usable IOVA regions through the
VFIO_IOMMU_GET_INFO ioctl / VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE
capability. However it fails to take into account the dma_mask
of the devices within the container. The top limit currently is
defined by the iommu aperture.

So, for instance, if the IOMMU supports up to 48 bits, it may give
the impression the max IOVA is 48b while a device may have a
dma_mask of 42b. So this API cannot really be used to compute
the max usable IOVA.

This patch removes the IOVA regions beyond the dma_mask.
As we start to expose this reserved region in the sysfs file
/sys/kernel/iommu_groups/<group-id>/reserved_regions, we also need to
handle the IOVA range beyond the IOMMU aperture to handle the case
where the dma_mask would have a higher number of bits than the iommu
max input address.

This is a change to the ABI as this reserved region was not yet
exposed in sysfs /sys/kernel/iommu_groups/<group-id>/reserved_regions or
through the VFIO ioctl. At VFIO level we increment the version of
the VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE capability to advertise
that change.

Best Regards

Eric

This series can be found at:
https://github.com/eauger/qemu/tree/dma_mask_rfc

Eric Auger (3):
  iommu: Fix merging in iommu_insert_resv_region
  iommu: Account for dma_mask and iommu aperture in IOVA reserved
regions
  vfio/type1: Increase the version of
VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE

 .../ABI/testing/sysfs-kernel-iommu_groups |  7 
 drivers/iommu/iommu.c | 41 ++-
 drivers/vfio/vfio_iommu_type1.c   |  2 +-
 3 files changed, 48 insertions(+), 2 deletions(-)

-- 
2.21.3



[RFC 2/3] iommu: Account for dma_mask and iommu aperture in IOVA reserved regions

2020-09-28 Thread Eric Auger
VFIO currently exposes the usable IOVA regions through the
VFIO_IOMMU_GET_INFO ioctl. However it fails to take into account
the dma_mask of the devices within the container. The top limit
currently is defined by the iommu aperture.

So, for instance, if the IOMMU supports up to 48 bits, it may give
the impression the max IOVA is 48b while a device may have a
dma_mask of 42b. So this API cannot really be used to compute
the max usable IOVA.

This patch removes the IOVA regions beyond the dma_mask. As we
start to expose this reserved region in the sysfs file
/sys/kernel/iommu_groups/<group-id>/reserved_regions, we also need to
expose the IOVA range beyond the IOMMU aperture to handle the case
where the dma_mask would have a higher number of bits than the iommu
max input address. Those out-of-reach regions get the
IOMMU_RESV_RESERVED type.

This is a change to the ABI as this reserved region was not yet
exposed in sysfs /sys/kernel/iommu_groups/<group-id>/reserved_regions or
through the VFIO ioctl. Document that change.

Signed-off-by: Eric Auger 
---
 .../ABI/testing/sysfs-kernel-iommu_groups |  7 
 drivers/iommu/iommu.c | 39 +++
 2 files changed, 46 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-kernel-iommu_groups b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
index 017f5bc3920c..2f316686c88b 100644
--- a/Documentation/ABI/testing/sysfs-kernel-iommu_groups
+++ b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
@@ -33,3 +33,10 @@ Description:	In case an RMRR is used only by graphics or USB devices
it is now exposed as "direct-relaxable" instead of "direct".
In device assignment use case, for instance, those RMRR
are considered to be relaxable and safe.
+
+What:  /sys/kernel/iommu_groups/reserved_regions
+Date:  Sept 2020
+KernelVersion:  v5.11
+Contact:   Eric Auger 
+Description:Regions beyond the device dma_mask and the iommu aperture
+   now are exposed as IOMMU_RESV_RESERVED reserved regions.
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index dd8cda340e62..d797f07b3625 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2511,9 +2511,48 @@ EXPORT_SYMBOL_GPL(iommu_domain_set_attr);
 void iommu_get_resv_regions(struct device *dev, struct list_head *list)
 {
const struct iommu_ops *ops = dev->bus->iommu_ops;
+   struct iommu_resv_region *region;
+   struct iommu_domain *domain;
+
+   domain = iommu_get_domain_for_dev(dev);
+
+   if (domain) {
+   struct iommu_domain_geometry geo;
+
+   if (iommu_domain_get_attr(domain, DOMAIN_ATTR_GEOMETRY, &geo))
+   return;
+
+   if (geo.aperture_end < ULLONG_MAX &&
+   geo.aperture_end != geo.aperture_start) {
+   region = iommu_alloc_resv_region(geo.aperture_end + 1,
+   ULLONG_MAX - geo.aperture_end,
+   0, IOMMU_RESV_RESERVED);
+   if (!region)
+   return;
+   list_add_tail(&region->list, list);
+   }
+
+   if (geo.aperture_start > 0) {
+   region = iommu_alloc_resv_region(0, geo.aperture_start,
+   0, IOMMU_RESV_RESERVED);
+   if (!region)
+   return;
+   list_add_tail(&region->list, list);
+   }
+   }
 
if (ops && ops->get_resv_regions)
ops->get_resv_regions(dev, list);
+
+   if (!dev->dma_mask || *dev->dma_mask == ULLONG_MAX)
+   return;
+
+   region = iommu_alloc_resv_region(*dev->dma_mask + 1,
+ULLONG_MAX - *dev->dma_mask,
+0, IOMMU_RESV_RESERVED);
+   if (!region)
+   return;
+
+   list_add_tail(&region->list, list);
 }
 
 void iommu_put_resv_regions(struct device *dev, struct list_head *list)
-- 
2.21.3



Re: [PATCH v2] iommu/arm: Add module parameter to set msi iova address

2020-09-28 Thread Will Deacon
On Wed, Sep 23, 2020 at 08:32:43AM +0200, Auger Eric wrote:
> On 9/21/20 10:45 PM, Will Deacon wrote:
> > On Mon, Sep 14, 2020 at 11:13:07AM -0700, Vennila Megavannan wrote:
> >> From: Srinath Mannam 
> >>
> >> Add provision to change default value of MSI IOVA base to platform's
> >> suitable IOVA using module parameter. The present hardcoded MSI IOVA base
> >> may not be the accessible IOVA ranges of platform.
> >>
> >> If any platform has a limitation accessing the default MSI IOVA, then it
> >> can be changed using the "arm-smmu.msi_iova_base=0xa000" command line argument.
> >>
> >> Signed-off-by: Srinath Mannam 
> >> Co-developed-by: Vennila Megavannan 
> >> Signed-off-by: Vennila Megavannan 
> >> ---
> >>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 5 -
> >>  drivers/iommu/arm/arm-smmu/arm-smmu.c   | 5 -
> >>  2 files changed, 8 insertions(+), 2 deletions(-)
> > 
> > This feels pretty fragile. Wouldn't it be better to realise that there's
> > a region conflict with iommu_dma_get_resv_regions() and move the MSI window
> > accordingly at runtime?
> 
> Since cd2c9fcf5c66 ("iommu/dma: Move PCI window region reservation back
> into dma specific path"), the PCI host bridge windows are not exposed by
> iommu_dma_get_resv_regions() anymore. If I understood correctly, what is
> attempted here is to avoid the collision between such PCI host bridge
> window and the MSI IOVA range.

Either way, I think the kernel should figure this out at runtime and not
rely on a cmdline option to tell it where to place the region.

Will


Re: [PATCH 3/5] iommu/tegra-smmu: Use iommu_fwspec in .probe_/.attach_device()

2020-09-28 Thread Nicolin Chen
On Mon, Sep 28, 2020 at 09:52:12AM +0200, Thierry Reding wrote:
> On Sat, Sep 26, 2020 at 01:07:17AM -0700, Nicolin Chen wrote:
> > @@ -13,6 +13,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> 
> Why is this needed? I don't see any of the symbols declared in that file
> used here.

Hmm... I think I mixed it up with some other change that needs this
header file. Removing it.

> >  #include 
> >  
> >  #include 
> > @@ -61,6 +62,8 @@ struct tegra_smmu_as {
> >  	u32 attr;
> >  };
> >  
> > +static const struct iommu_ops tegra_smmu_ops;
> > +
> >  static struct tegra_smmu_as *to_smmu_as(struct iommu_domain *dom)
> >  {
> >  	return container_of(dom, struct tegra_smmu_as, domain);
> > @@ -484,60 +487,49 @@ static void tegra_smmu_as_unprepare(struct tegra_smmu *smmu,
> >  static int tegra_smmu_attach_dev(struct iommu_domain *domain,
> >  				 struct device *dev)
> >  {
> > +	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> >  	struct tegra_smmu *smmu = dev_iommu_priv_get(dev);
> >  	struct tegra_smmu_as *as = to_smmu_as(domain);
> > -	struct device_node *np = dev->of_node;
> > -	struct of_phandle_args args;
> > -	unsigned int index = 0;
> > -	int err = 0;
> > -
> > -	while (!of_parse_phandle_with_args(np, "iommus", "#iommu-cells", index,
> > -					   &args)) {
> > -		unsigned int swgroup = args.args[0];
> > -
> > -		if (args.np != smmu->dev->of_node) {
> > -			of_node_put(args.np);
> > -			continue;
> > -		}
> > +	int index, err = 0;
> >  
> > -		of_node_put(args.np);
> > +	if (!fwspec || fwspec->ops != &tegra_smmu_ops)
> > +		return -ENOENT;
> >  
> > +	for (index = 0; index < fwspec->num_ids; index++) {
> >  		err = tegra_smmu_as_prepare(smmu, as);
> > -		if (err < 0)
> > -			return err;
> > +		if (err)
> > +			goto err_disable;
> 
> I'd personally drop the err_ prefix here because it's pretty obvious
> that we're going to do this as a result of an error happening.

Changing to "goto disable".

> > @@ -845,10 +837,25 @@ static int tegra_smmu_configure(struct tegra_smmu *smmu, struct device *dev,
> >  	return 0;
> >  }
> >  
> > +static struct tegra_smmu *tegra_smmu_get_by_fwnode(struct fwnode_handle *fwnode)
> > +{
> > +	struct device *dev = driver_find_device_by_fwnode(&tegra_mc_driver.driver, fwnode);
> > +	struct tegra_mc *mc;
> > +
> > +	if (!dev)
> > +		return NULL;
> > +
> > +	put_device(dev);
> > +	mc = dev_get_drvdata(dev);
> > +
> > +	return mc->smmu;
> > +}
> > +
> > +
> 
> As I mentioned in another reply, I think tegra_smmu_find() should be all
> you need in this case.

This function is used by .probe_device(), where its dev pointer is
an SMMU client. IIUC, tegra_smmu_find() needs the np pointer of the
"mc" node. For a PCI device that doesn't have a DT node with an
iommus property, I'm not sure how we can use tegra_smmu_find().


Re: [PATCH v10 01/13] mm: Define pasid in mm

2020-09-28 Thread Will Deacon
On Fri, Sep 18, 2020 at 12:18:41PM +0200, Jean-Philippe Brucker wrote:
> From: Fenghua Yu 
> 
> PASID is shared by all threads in a process. So the logical place to keep
> track of it is in the "mm". Both ARM and X86 need to use the PASID in the
> "mm".
> 
> Suggested-by: Christoph Hellwig 
> Signed-off-by: Fenghua Yu 
> Reviewed-by: Tony Luck 
> ---
> https://lore.kernel.org/linux-iommu/1600187413-163670-8-git-send-email-fenghua...@intel.com/
> ---
>  include/linux/mm_types.h | 4 
>  1 file changed, 4 insertions(+)

Acked-by: Will Deacon 

Will
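
The change itself is tiny; based on the commit message, it amounts to roughly
the following (a sketch; the exact config guard and placement within the
struct are assumptions):

/* include/linux/mm_types.h (sketch) */
struct mm_struct {
	/* ... existing fields ... */
#ifdef CONFIG_IOMMU_SUPPORT
	u32 pasid;	/* one PASID per address space, shared by all threads */
#endif
	/* ... */
};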


Re: [PATCH 4/5] iommu/tegra-smmu: Add PCI support

2020-09-28 Thread Nicolin Chen
On Sun, Sep 27, 2020 at 01:18:15AM +0300, Dmitry Osipenko wrote:
> 26.09.2020 11:07, Nicolin Chen wrote:
> ...
> > +#ifdef CONFIG_PCI
> > +	if (!iommu_present(&pci_bus_type)) {
> 
> Is this iommu_present() check really needed?
> 
> > +   pci_request_acs();
> 
> Shouldn't pci_request_acs() be invoked *after* bus_set_iommu() succeeds?

Will move it after that. Thanks!

Re: [PATCH 4/5] iommu/tegra-smmu: Add PCI support

2020-09-28 Thread Nicolin Chen
On Mon, Sep 28, 2020 at 09:55:45AM +0200, Thierry Reding wrote:
> On Sat, Sep 26, 2020 at 01:07:18AM -0700, Nicolin Chen wrote:
> > +#ifdef CONFIG_PCI
> > +	if (!iommu_present(&pci_bus_type)) {
> > +		pci_request_acs();
> > +		err = bus_set_iommu(&pci_bus_type, &tegra_smmu_ops);
> > +		if (err < 0) {
> > +			bus_set_iommu(&pci_bus_type, NULL);
> > +			iommu_device_unregister(&smmu->iommu);
> > +			iommu_device_sysfs_remove(&smmu->iommu);
> > +			return ERR_PTR(err);
> 
> It might be worth factoring out the cleanup code now that there are
> multiple failures from which we may need to clean up.

Will do.

> Also, it'd be great if somehow we could do this without the #ifdef,
> but I guess since we're using the pci_bus_type global variable directly,
> there isn't much we can do here?

Probably not -- both pci_bus_type and pci_request_acs() seem to depend
on it.
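
Putting the review comments from this thread and the previous one together
(cleanup factored out, pci_request_acs() called only once bus_set_iommu() has
succeeded), the probe tail might end up roughly like this sketch (label names
are assumptions, not the posted v2):

#ifdef CONFIG_PCI
	if (!iommu_present(&pci_bus_type)) {
		err = bus_set_iommu(&pci_bus_type, &tegra_smmu_ops);
		if (err < 0)
			goto unregister;
		/* request ACS only once the IOMMU ops are in place */
		pci_request_acs();
	}
#endif
	return smmu;

unregister:
	iommu_device_unregister(&smmu->iommu);
	iommu_device_sysfs_remove(&smmu->iommu);
	return ERR_PTR(err);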


Re: [PATCH v10 00/13] iommu: Shared Virtual Addressing for SMMUv3 (PT sharing part)

2020-09-28 Thread Will Deacon
On Mon, Sep 28, 2020 at 06:23:15PM +0100, Will Deacon wrote:
> On Mon, Sep 28, 2020 at 06:47:31PM +0200, Jean-Philippe Brucker wrote:
> > On Fri, Sep 18, 2020 at 12:18:40PM +0200, Jean-Philippe Brucker wrote:
> > > This is version 10 of the page table sharing support for Arm SMMUv3.
> > > Patch 1 still needs an Ack from mm maintainers. However patches 4-11 do
> > > not depend on it, and could get merged for v5.10 regardless.
> > 
> > Are you OK with taking patches 4-11 for v5.10?
> > 
> > The rest depends on patch 1 which hasn't been acked yet. It's
> > uncontroversial and I'm sure it will eventually make it. In case it
> > doesn't, we'll keep track of mm->pasid within the IOMMU subsystem instead.
> 
> I was off most of last week, but I plan to see how much of this I can queue
> tonight. Stay tuned...

I've queued 4-11 locally, but I've put 4 and 6 on a shared branch with arm64
(for-next/svm) so I'd like that to hit next before I push out the merge into
the branch for Joerg.

Will


[PATCH v2 2/2] iommu/tegra-smmu: Expend mutex protection range

2020-09-28 Thread Nicolin Chen
This is used to protect a potential race condition at use_count,
since probes of client drivers, calling attach_dev(), may run
concurrently.

Signed-off-by: Nicolin Chen 
---

Changelog
v1->v2:
 * N/A

 drivers/iommu/tegra-smmu.c | 34 +-
 1 file changed, 21 insertions(+), 13 deletions(-)

diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
index ec4c9dafff95..acda5902e095 100644
--- a/drivers/iommu/tegra-smmu.c
+++ b/drivers/iommu/tegra-smmu.c
@@ -256,26 +256,19 @@ static int tegra_smmu_alloc_asid(struct tegra_smmu *smmu, unsigned int *idp)
 {
 	unsigned long id;
 
-	mutex_lock(&smmu->lock);
-
 	id = find_first_zero_bit(smmu->asids, smmu->soc->num_asids);
-	if (id >= smmu->soc->num_asids) {
-		mutex_unlock(&smmu->lock);
+	if (id >= smmu->soc->num_asids)
 		return -ENOSPC;
-	}
 
 	set_bit(id, smmu->asids);
 	*idp = id;
 
-	mutex_unlock(&smmu->lock);
 	return 0;
 }
 
 static void tegra_smmu_free_asid(struct tegra_smmu *smmu, unsigned int id)
 {
-	mutex_lock(&smmu->lock);
 	clear_bit(id, smmu->asids);
-	mutex_unlock(&smmu->lock);
 }
 
 static bool tegra_smmu_capable(enum iommu_cap cap)
@@ -420,17 +413,21 @@ static int tegra_smmu_as_prepare(struct tegra_smmu *smmu,
 				 struct tegra_smmu_as *as)
 {
 	u32 value;
-	int err;
+	int err = 0;
+
+	mutex_lock(&smmu->lock);
 
 	if (as->use_count > 0) {
 		as->use_count++;
-		return 0;
+		goto err_unlock;
 	}
 
 	as->pd_dma = dma_map_page(smmu->dev, as->pd, 0, SMMU_SIZE_PD,
 				  DMA_TO_DEVICE);
-	if (dma_mapping_error(smmu->dev, as->pd_dma))
-		return -ENOMEM;
+	if (dma_mapping_error(smmu->dev, as->pd_dma)) {
+		err = -ENOMEM;
+		goto err_unlock;
+	}
 
 	/* We can't handle 64-bit DMA addresses */
 	if (!smmu_dma_addr_valid(smmu, as->pd_dma)) {
@@ -453,24 +450,35 @@ static int tegra_smmu_as_prepare(struct tegra_smmu *smmu,
 	as->smmu = smmu;
 	as->use_count++;
 
+	mutex_unlock(&smmu->lock);
+
 	return 0;
 
 err_unmap:
 	dma_unmap_page(smmu->dev, as->pd_dma, SMMU_SIZE_PD, DMA_TO_DEVICE);
+err_unlock:
+	mutex_unlock(&smmu->lock);
+
 	return err;
 }
 
 static void tegra_smmu_as_unprepare(struct tegra_smmu *smmu,
 				    struct tegra_smmu_as *as)
 {
-	if (--as->use_count > 0)
+	mutex_lock(&smmu->lock);
+
+	if (--as->use_count > 0) {
+		mutex_unlock(&smmu->lock);
 		return;
+	}
 
 	tegra_smmu_free_asid(smmu, as->id);
 
 	dma_unmap_page(smmu->dev, as->pd_dma, SMMU_SIZE_PD, DMA_TO_DEVICE);
 
 	as->smmu = NULL;
+
+	mutex_unlock(&smmu->lock);
 }
 
 static int tegra_smmu_attach_dev(struct iommu_domain *domain,
-- 
2.17.1
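
For context, this is the race the added locking closes, sketched under the
assumption that two client probes call attach_dev() and reach
tegra_smmu_as_prepare() on the same AS concurrently:

	CPU0                              CPU1
	sees as->use_count == 0
	                                  sees as->use_count == 0
	dma_map_page(as->pd)              dma_map_page(as->pd)  /* double map */
	programs the AS, sets as->smmu    repeats the setup, clobbering CPU0's
	as->use_count++                   as->use_count++

With smmu->lock held across the check and the setup, the second caller
instead observes use_count > 0 and only bumps the count.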



[PATCH v2 0/2] iommu/tegra-smmu: Two followup changes

2020-09-28 Thread Nicolin Chen
Two followup patches for tegra-smmu:
PATCH-1 is a clean-up patch for the recently applied SWGROUP change.
PATCH-2 fixes a potential race condition.

Changelog
v1->v2:
 * Separated first two changes of V1 so they may get applied first,
   since the other three changes need some extra time to finalize.

Nicolin Chen (2):
  iommu/tegra-smmu: Unwrap tegra_smmu_group_get
  iommu/tegra-smmu: Expend mutex protection range

 drivers/iommu/tegra-smmu.c | 53 ++
 1 file changed, 25 insertions(+), 28 deletions(-)

-- 
2.17.1



[PATCH v2 1/2] iommu/tegra-smmu: Unwrap tegra_smmu_group_get

2020-09-28 Thread Nicolin Chen
tegra_smmu_group_get() was added to group devices in different
SWGROUPs; it would return a NULL group pointer upon a mismatch at
tegra_smmu_find_group(), so most clients/devices would likely
mismatch and need the generic_device_group() fallback.

But now tegra_smmu_group_get() handles devices in the same SWGROUP
too, which means it either allocates a group for every new SWGROUP
or directly returns an existing one upon matching a SWGROUP, i.e.
every device goes through this function.

So the only way to get a NULL group pointer in device_group() is a
failure of either devm_kzalloc() or iommu_group_alloc(). In either
case, calling generic_device_group() no longer makes sense.
Especially in the devm_kzalloc() failure case, it would cause a
problem if devm_kzalloc() fails yet the fallback
generic_device_group() succeeds, because no group->list would be
created for other devices to match.

This patch simply unwraps the function to clean it up.

Signed-off-by: Nicolin Chen 
---

Changelog
v1->v2:
 * Changed type of swgroup to "unsigned int", following Thierry's
   comments.

 drivers/iommu/tegra-smmu.c | 19 ---
 1 file changed, 4 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
index 0becdbfea306..ec4c9dafff95 100644
--- a/drivers/iommu/tegra-smmu.c
+++ b/drivers/iommu/tegra-smmu.c
@@ -903,10 +903,12 @@ static void tegra_smmu_group_release(void *iommu_data)
 	mutex_unlock(&smmu->lock);
 }
 
-static struct iommu_group *tegra_smmu_group_get(struct tegra_smmu *smmu,
-						unsigned int swgroup)
+static struct iommu_group *tegra_smmu_device_group(struct device *dev)
 {
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+	struct tegra_smmu *smmu = dev_iommu_priv_get(dev);
 	const struct tegra_smmu_group_soc *soc;
+	unsigned int swgroup = fwspec->ids[0];
 	struct tegra_smmu_group *group;
 	struct iommu_group *grp;
 
@@ -950,19 +952,6 @@ static struct iommu_group *tegra_smmu_group_get(struct tegra_smmu *smmu,
 	return group->group;
 }
 
-static struct iommu_group *tegra_smmu_device_group(struct device *dev)
-{
-	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
-	struct tegra_smmu *smmu = dev_iommu_priv_get(dev);
-	struct iommu_group *group;
-
-	group = tegra_smmu_group_get(smmu, fwspec->ids[0]);
-	if (!group)
-		group = generic_device_group(dev);
-
-	return group;
-}
-
 static int tegra_smmu_of_xlate(struct device *dev,
 			       struct of_phandle_args *args)
 {
-- 
2.17.1



[PATCH V2] iommu/qcom: add missing put_device() call in qcom_iommu_of_xlate()

2020-09-28 Thread Yu Kuai
If of_find_device_by_node() succeeds, qcom_iommu_of_xlate() doesn't have
a corresponding put_device(). Thus add put_device() calls to fix the
exception handling for this function implementation.

Fixes: 0ae349a0f33fb ("iommu/qcom: Add qcom_iommu")
Signed-off-by: Yu Kuai 
---

Changes in V2:
 - Fix wrong 'Fixes'
 - add missing '{}' after if

 drivers/iommu/arm/arm-smmu/qcom_iommu.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu/qcom_iommu.c b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
index 9535a6af7553..b30d6c966e2c 100644
--- a/drivers/iommu/arm/arm-smmu/qcom_iommu.c
+++ b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
@@ -584,8 +584,10 @@ static int qcom_iommu_of_xlate(struct device *dev, struct of_phandle_args *args)
 	 * index into qcom_iommu->ctxs:
 	 */
 	if (WARN_ON(asid < 1) ||
-	    WARN_ON(asid > qcom_iommu->num_ctxs))
+	    WARN_ON(asid > qcom_iommu->num_ctxs)) {
+		put_device(&iommu_pdev->dev);
 		return -EINVAL;
+	}
 
 	if (!dev_iommu_priv_get(dev)) {
 		dev_iommu_priv_set(dev, qcom_iommu);
@@ -594,8 +596,10 @@ static int qcom_iommu_of_xlate(struct device *dev, struct of_phandle_args *args)
 	 * multiple different iommu devices.  Multiple context
 	 * banks are ok, but multiple devices are not:
 	 */
-	if (WARN_ON(qcom_iommu != dev_iommu_priv_get(dev)))
+	if (WARN_ON(qcom_iommu != dev_iommu_priv_get(dev))) {
+		put_device(&iommu_pdev->dev);
 		return -EINVAL;
+	}
 
 	return iommu_fwspec_add_ids(dev, &asid, 1);
-- 
2.25.4
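
The rule being enforced is the usual reference-counting contract:
of_find_device_by_node() takes a reference on the embedded struct device, so
every exit path after a successful lookup must drop it. A minimal sketch of
the pattern (hypothetical helper, not from the patch):

#include <linux/of_platform.h>

static int example_use_node(struct device_node *np)
{
	struct platform_device *pdev = of_find_device_by_node(np);
	int ret = 0;

	if (!pdev)
		return -EPROBE_DEFER;

	if (!platform_get_drvdata(pdev))	/* found but not probed yet */
		ret = -EPROBE_DEFER;

	put_device(&pdev->dev);	/* balance the lookup on every exit path */
	return ret;
}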



Re: [PATCH v4 0/7] Convert the intel iommu driver to the dma-iommu api

2020-09-28 Thread Lu Baolu

Hi Tvrtko,

On 9/28/20 5:44 PM, Tvrtko Ursulin wrote:
> 
> On 27/09/2020 07:34, Lu Baolu wrote:
>> Hi,
>>
>> The previous post of this series could be found here.
>>
>> https://lore.kernel.org/linux-iommu/20200912032200.11489-1-baolu...@linux.intel.com/
>>
>> This version introduces a new patch [4/7] to fix an issue reported here.
>>
>> https://lore.kernel.org/linux-iommu/51a1baec-48d1-c0ac-181b-1fba92aa4...@linux.intel.com/
>>
>> There aren't any other changes.
>>
>> Please help to test and review.
>>
>> Best regards,
>> baolu
>>
>> Lu Baolu (3):
>>    iommu: Add quirk for Intel graphic devices in map_sg
> 
> Since I do have patches to fix i915 to handle this, do we want to
> co-ordinate the two and avoid having to add this quirk and then later
> remove it? Or you want to go the staged approach?

I have no preference. It depends on which patch goes first. Let the
maintainers help here.

Best regards,
baolu

> Regards,
> 
> Tvrtko
> 
>>    iommu/vt-d: Update domain geometry in iommu_ops.at(de)tach_dev
>>    iommu/vt-d: Cleanup after converting to dma-iommu ops
>>
>> Tom Murphy (4):
>>    iommu: Handle freelists when using deferred flushing in iommu drivers
>>    iommu: Add iommu_dma_free_cpu_cached_iovas()
>>    iommu: Allow the dma-iommu api to use bounce buffers
>>    iommu/vt-d: Convert intel iommu driver to the iommu ops
>>
>>   .../admin-guide/kernel-parameters.txt |   5 -
>>   drivers/iommu/dma-iommu.c             | 228 -
>>   drivers/iommu/intel/Kconfig           |   1 +
>>   drivers/iommu/intel/iommu.c           | 901 +++---
>>   include/linux/dma-iommu.h             |   8 +
>>   include/linux/iommu.h                 |   1 +
>>   6 files changed, 336 insertions(+), 808 deletions(-)


Re: [PATCH v2 2/2] iommu/tegra-smmu: Expend mutex protection range

2020-09-28 Thread Dmitry Osipenko
...
>  static bool tegra_smmu_capable(enum iommu_cap cap)
> @@ -420,17 +413,21 @@ static int tegra_smmu_as_prepare(struct tegra_smmu *smmu,
>  				 struct tegra_smmu_as *as)
>  {
>  	u32 value;
> -	int err;
> +	int err = 0;
> +
> +	mutex_lock(&smmu->lock);
>  
>  	if (as->use_count > 0) {
>  		as->use_count++;
> -		return 0;
> +		goto err_unlock;

This looks a bit odd because it's not an error condition. Perhaps it would
be better to "goto bump_usecount"?

Or make it similar to tegra_smmu_as_unprepare()?

>  	}
>  
>  	as->pd_dma = dma_map_page(smmu->dev, as->pd, 0, SMMU_SIZE_PD,
>  				  DMA_TO_DEVICE);
> -	if (dma_mapping_error(smmu->dev, as->pd_dma))
> -		return -ENOMEM;
> +	if (dma_mapping_error(smmu->dev, as->pd_dma)) {
> +		err = -ENOMEM;
> +		goto err_unlock;
> +	}
>  
>  	/* We can't handle 64-bit DMA addresses */
>  	if (!smmu_dma_addr_valid(smmu, as->pd_dma)) {
> @@ -453,24 +450,35 @@ static int tegra_smmu_as_prepare(struct tegra_smmu *smmu,
>  	as->smmu = smmu;

bump_usecount:

>  	as->use_count++;
>  
> +	mutex_unlock(&smmu->lock);
> +
>  	return 0;
>  
>  err_unmap:
>  	dma_unmap_page(smmu->dev, as->pd_dma, SMMU_SIZE_PD, DMA_TO_DEVICE);
> +err_unlock:
> +	mutex_unlock(&smmu->lock);
> +
>  	return err;
>  }
>  
>  static void tegra_smmu_as_unprepare(struct tegra_smmu *smmu,
>  				    struct tegra_smmu_as *as)
>  {
> -	if (--as->use_count > 0)
> +	mutex_lock(&smmu->lock);
> +
> +	if (--as->use_count > 0) {
> +		mutex_unlock(&smmu->lock);
>  		return;
> +	}
>  
>  	tegra_smmu_free_asid(smmu, as->id);
>  
>  	dma_unmap_page(smmu->dev, as->pd_dma, SMMU_SIZE_PD, DMA_TO_DEVICE);
>  
>  	as->smmu = NULL;
> +
> +	mutex_unlock(&smmu->lock);
>  }
>  
>  static int tegra_smmu_attach_dev(struct iommu_domain *domain,
> 



Re: [PATCH v2 2/2] iommu/tegra-smmu: Expend mutex protection range

2020-09-28 Thread Nicolin Chen
On Tue, Sep 29, 2020 at 03:17:58AM +0300, Dmitry Osipenko wrote:
> ...
> >  static bool tegra_smmu_capable(enum iommu_cap cap)
> > @@ -420,17 +413,21 @@ static int tegra_smmu_as_prepare(struct tegra_smmu *smmu,
> >  				 struct tegra_smmu_as *as)
> >  {
> >  	u32 value;
> > -	int err;
> > +	int err = 0;
> > +
> > +	mutex_lock(&smmu->lock);
> >  
> >  	if (as->use_count > 0) {
> >  		as->use_count++;
> > -		return 0;
> > +		goto err_unlock;
> 
> This looks a bit odd because it's not an error condition. Perhaps it would
> be better to "goto bump_usecount"?
> 
> Or make it similar to tegra_smmu_as_unprepare()?

Hmm... I think it's simplest to just make it "goto unlock" then.


Re: [PATCH] iommu/qcom: add missing put_device() call in qcom_iommu_of_xlate()

2020-09-28 Thread yukuai (C)



On 2020/09/29 7:08, Will Deacon wrote:

On Mon, Sep 21, 2020 at 09:45:57PM +0100, Will Deacon wrote:

On Tue, Sep 22, 2020 at 03:13:53AM +0800, kernel test robot wrote:

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on iommu/next]
[also build test WARNING on linus/master v5.9-rc6 next-20200921]
[cannot apply to robclark/msm-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Yu-Kuai/iommu-qcom-add-missing-put_device-call-in-qcom_iommu_of_xlate/20200918-091341
base:   https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git next
config: arm64-randconfig-r023-20200920 (attached as .config)
compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project 4e8c028158b56d9c2142a62464e8e0686bde3584)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
 chmod +x ~/bin/make.cross
 # install arm64 cross compiling tool for clang build
 # apt-get install binutils-aarch64-linux-gnu
 # save the attached .config to linux build tree
 COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=arm64

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>):


>> drivers/iommu/arm/arm-smmu/qcom_iommu.c:601:4: warning: misleading indentation; statement is not part of the previous 'if' [-Wmisleading-indentation]
                        return -EINVAL;
                        ^
   drivers/iommu/arm/arm-smmu/qcom_iommu.c:599:3: note: previous statement is here
                if (WARN_ON(qcom_iommu != dev_iommu_priv_get(dev)))


Oh, this looks like a nasty bug. Seems we're missing some braces.


Yu Kuai: please could you send a v2 of this?



Hi, Will

Thanks for your notice, will send a V2 soon.

Yu Kuai


Re: [PATCH V7 0/3] iommu: Add support to change default domain of an iommu group

2020-09-28 Thread Raj, Ashok
Hi Joerg

On Fri, Sep 25, 2020 at 09:34:23AM +0200, Joerg Roedel wrote:
> Hi Ashok,
> 
> On Thu, Sep 24, 2020 at 10:21:48AM -0700, Raj, Ashok wrote:
> > Just trying to followup on this series.
> > 
> > Sai has moved out of Intel, hence I'm trying to followup on his behalf.
> > 
> > Let me know if you have queued this for the next release.
> 
> Not yet, but I think this is mostly ready. Can you please send a new
> version in a new mail thread so that I can pick it up with b4?

I reposted Sai's patches again here.

https://lore.kernel.org/linux-iommu/20200925190620.18732-2-ashok@intel.com/

Can you let me know if they are in a format you can pick up via b4?

With the recent issues we have had sending mail out of Intel, I wanted to
make sure everything made it out in one piece.



Re: [PATCH 3/5] iommu/tegra-smmu: Use iommu_fwspec in .probe_/.attach_device()

2020-09-28 Thread Dmitry Osipenko
...
>> As I mentioned in another reply, I think tegra_smmu_find() should be all
>> you need in this case.
> 
> This function is used by .probe_device() where its dev pointer is
> an SMMU client. IIUC, tegra_smmu_find() needs np pointer of "mc".
> For a PCI device that doesn't have a DT node with iommus property,
> not sure how we can use tegra_smmu_find().

Perhaps you could get np from struct pci_dev.bus?
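
A rough sketch of that suggestion (hypothetical helper; the exact pointer
chain from the root bus to the host bridge's of_node is an assumption):

#include <linux/pci.h>

/* Hypothetical: find the DT node of the host bridge a PCI device sits
 * behind, so tegra_smmu_find() could be reused for PCI clients.
 */
static struct device_node *example_pci_host_of_node(struct pci_dev *pdev)
{
	struct pci_bus *bus = pdev->bus;

	while (!pci_is_root_bus(bus))
		bus = bus->parent;

	/* bus->bridge is the host bridge device; assume its parent is the
	 * platform device that carries the of_node.
	 */
	return bus->bridge->parent->of_node;
}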


[PATCH v3 2/2] iommu/tegra-smmu: Expend mutex protection range

2020-09-28 Thread Nicolin Chen
This is used to protect a potential race condition at use_count,
since probes of client drivers, calling attach_dev(), may run
concurrently.

Signed-off-by: Nicolin Chen 
---

Changelog
v2->v3:
 * Renamed label "err_unlock" to "unlock"
v1->v2:
 * N/A

 drivers/iommu/tegra-smmu.c | 34 +-
 1 file changed, 21 insertions(+), 13 deletions(-)

diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
index ec4c9dafff95..6a3ecc334481 100644
--- a/drivers/iommu/tegra-smmu.c
+++ b/drivers/iommu/tegra-smmu.c
@@ -256,26 +256,19 @@ static int tegra_smmu_alloc_asid(struct tegra_smmu *smmu, unsigned int *idp)
 {
 	unsigned long id;
 
-	mutex_lock(&smmu->lock);
-
 	id = find_first_zero_bit(smmu->asids, smmu->soc->num_asids);
-	if (id >= smmu->soc->num_asids) {
-		mutex_unlock(&smmu->lock);
+	if (id >= smmu->soc->num_asids)
 		return -ENOSPC;
-	}
 
 	set_bit(id, smmu->asids);
 	*idp = id;
 
-	mutex_unlock(&smmu->lock);
 	return 0;
 }
 
 static void tegra_smmu_free_asid(struct tegra_smmu *smmu, unsigned int id)
 {
-	mutex_lock(&smmu->lock);
 	clear_bit(id, smmu->asids);
-	mutex_unlock(&smmu->lock);
 }
 
 static bool tegra_smmu_capable(enum iommu_cap cap)
@@ -420,17 +413,21 @@ static int tegra_smmu_as_prepare(struct tegra_smmu *smmu,
 				 struct tegra_smmu_as *as)
 {
 	u32 value;
-	int err;
+	int err = 0;
+
+	mutex_lock(&smmu->lock);
 
 	if (as->use_count > 0) {
 		as->use_count++;
-		return 0;
+		goto unlock;
 	}
 
 	as->pd_dma = dma_map_page(smmu->dev, as->pd, 0, SMMU_SIZE_PD,
 				  DMA_TO_DEVICE);
-	if (dma_mapping_error(smmu->dev, as->pd_dma))
-		return -ENOMEM;
+	if (dma_mapping_error(smmu->dev, as->pd_dma)) {
+		err = -ENOMEM;
+		goto unlock;
+	}
 
 	/* We can't handle 64-bit DMA addresses */
 	if (!smmu_dma_addr_valid(smmu, as->pd_dma)) {
@@ -453,24 +450,35 @@ static int tegra_smmu_as_prepare(struct tegra_smmu *smmu,
 	as->smmu = smmu;
 	as->use_count++;
 
+	mutex_unlock(&smmu->lock);
+
 	return 0;
 
 err_unmap:
 	dma_unmap_page(smmu->dev, as->pd_dma, SMMU_SIZE_PD, DMA_TO_DEVICE);
+unlock:
+	mutex_unlock(&smmu->lock);
+
 	return err;
 }
 
 static void tegra_smmu_as_unprepare(struct tegra_smmu *smmu,
 				    struct tegra_smmu_as *as)
 {
-	if (--as->use_count > 0)
+	mutex_lock(&smmu->lock);
+
+	if (--as->use_count > 0) {
+		mutex_unlock(&smmu->lock);
 		return;
+	}
 
 	tegra_smmu_free_asid(smmu, as->id);
 
 	dma_unmap_page(smmu->dev, as->pd_dma, SMMU_SIZE_PD, DMA_TO_DEVICE);
 
 	as->smmu = NULL;
+
+	mutex_unlock(&smmu->lock);
 }
 
 static int tegra_smmu_attach_dev(struct iommu_domain *domain,
-- 
2.17.1



[PATCH v3 0/2] iommu/tegra-smmu: Two followup changes

2020-09-28 Thread Nicolin Chen
Two followup patches for tegra-smmu:
PATCH-1 is a clean-up patch for the recently applied SWGROUP change.
PATCH-2 fixes a potential race condition.

Changelog
v2->v3:
 * PATCH-2: renamed "err_unlock" to "unlock"
v1->v2:
 * Separated first two changes of V1 so they may get applied first,
   since the other three changes need some extra time to finalize.

Nicolin Chen (2):
  iommu/tegra-smmu: Unwrap tegra_smmu_group_get
  iommu/tegra-smmu: Expend mutex protection range

 drivers/iommu/tegra-smmu.c | 53 ++
 1 file changed, 25 insertions(+), 28 deletions(-)

-- 
2.17.1



[PATCH v3 1/2] iommu/tegra-smmu: Unwrap tegra_smmu_group_get

2020-09-28 Thread Nicolin Chen
tegra_smmu_group_get() was added to group devices in different
SWGROUPs; it would return a NULL group pointer upon a mismatch at
tegra_smmu_find_group(), so most clients/devices would likely
mismatch and need the generic_device_group() fallback.

But now tegra_smmu_group_get() handles devices in the same SWGROUP
too, which means it either allocates a group for every new SWGROUP
or directly returns an existing one upon matching a SWGROUP, i.e.
every device goes through this function.

So the only way to get a NULL group pointer in device_group() is a
failure of either devm_kzalloc() or iommu_group_alloc(). In either
case, calling generic_device_group() no longer makes sense.
Especially in the devm_kzalloc() failure case, it would cause a
problem if devm_kzalloc() fails yet the fallback
generic_device_group() succeeds, because no group->list would be
created for other devices to match.

This patch simply unwraps the function to clean it up.

Signed-off-by: Nicolin Chen 
---

Changelog
v2->v3:
 * N/A
v1->v2:
 * Changed type of swgroup to "unsigned int", following Thierry's
   comments.

 drivers/iommu/tegra-smmu.c | 19 ---
 1 file changed, 4 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
index 0becdbfea306..ec4c9dafff95 100644
--- a/drivers/iommu/tegra-smmu.c
+++ b/drivers/iommu/tegra-smmu.c
@@ -903,10 +903,12 @@ static void tegra_smmu_group_release(void *iommu_data)
 	mutex_unlock(&smmu->lock);
 }
 
-static struct iommu_group *tegra_smmu_group_get(struct tegra_smmu *smmu,
-						unsigned int swgroup)
+static struct iommu_group *tegra_smmu_device_group(struct device *dev)
 {
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+	struct tegra_smmu *smmu = dev_iommu_priv_get(dev);
 	const struct tegra_smmu_group_soc *soc;
+	unsigned int swgroup = fwspec->ids[0];
 	struct tegra_smmu_group *group;
 	struct iommu_group *grp;
 
@@ -950,19 +952,6 @@ static struct iommu_group *tegra_smmu_group_get(struct tegra_smmu *smmu,
 	return group->group;
 }
 
-static struct iommu_group *tegra_smmu_device_group(struct device *dev)
-{
-	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
-	struct tegra_smmu *smmu = dev_iommu_priv_get(dev);
-	struct iommu_group *group;
-
-	group = tegra_smmu_group_get(smmu, fwspec->ids[0]);
-	if (!group)
-		group = generic_device_group(dev);
-
-	return group;
-}
-
 static int tegra_smmu_of_xlate(struct device *dev,
 			       struct of_phandle_args *args)
 {
-- 
2.17.1



Re: [PATCH 3/5] iommu/tegra-smmu: Use iommu_fwspec in .probe_/.attach_device()

2020-09-28 Thread Nicolin Chen
On Tue, Sep 29, 2020 at 07:06:37AM +0300, Dmitry Osipenko wrote:
> ...
> >> As I mentioned in another reply, I think tegra_smmu_find() should be all
> >> you need in this case.
> > 
> > This function is used by .probe_device() where its dev pointer is
> > an SMMU client. IIUC, tegra_smmu_find() needs np pointer of "mc".
> > For a PCI device that doesn't have a DT node with iommus property,
> > not sure how we can use tegra_smmu_find().
> 
> Perhaps you could get np from struct pci_dev.bus?

Possibly, yes... but I was hoping the solution could be reused by other
devices that don't have DT nodes -- USB, for example -- though I haven't
tested that with the current version.