Re: [PATCH v9 03/10] iommu: Separate IOMMU_DEV_FEAT_IOPF from IOMMU_DEV_FEAT_SVA

2021-01-15 Thread Lu Baolu

Hi Jean,

On 2021/1/15 0:41, Jean-Philippe Brucker wrote:

I guess detailing what's needed for nested IOPF can help the discussion,
although I haven't seen any concrete plan about implementing it, and it
still seems a couple of years away. There are two important steps with
nested IOPF:

(1) Figuring out whether a fault comes from L1 or L2. An SMMU stall event
 comes with this information, but a PRI page request doesn't. The IOMMU
 driver has to first translate the IOVA to a GPA; if this translation
 fails, it injects the fault into the guest using the usual
 iommu_report_device_fault().

(2) Translating the faulting GPA to a HVA that can be fed to
 handle_mm_fault(). That requires help from KVM, so another interface -
 either KVM registering GPA->HVA translation tables or IOMMU driver
 querying each translation. Either way it should be reusable by device
 drivers that implement IOPF themselves.

(1) could be enabled with iommu_dev_enable_feature(). (2) requires a more
complex interface. (2) alone might also be desirable - demand-paging for
level 2 only, no SVA for level 1.

Anyway, back to this patch. What I'm trying to convey is "can the IOMMU
receive incoming I/O page faults for this device and, when SVA is enabled,
feed them to the mm subsystem?  Enable that or return an error." I'm stuck
on the name. IOPF alone is too vague. Not IOPF_L1 as Kevin noted, since L1
is also used in virtualization. IOPF_BIND and IOPF_SVA could also mean (2)
above. IOMMU_DEV_FEAT_IOPF_FLAT?

That leaves space for the nested extensions. (1) above could be
IOMMU_FEAT_IOPF_NESTED, and (2) requires some new interfacing with KVM (or
just an external fault handler) and could be used with either IOPF_FLAT or
IOPF_NESTED. We can figure out the details later. What do you think?


I agree that we can define IOPF_ for current usage and leave space for
future extensions.

IOPF_FLAT represents IOPF on first-level translation. Currently,
first-level translation can be used in the cases below:

1) FL w/ internal Page Table: Kernel IOVA;
2) FL w/ external Page Table: VFIO passthrough;
3) FL w/ shared CPU page table: SVA

We don't need to support IOPF for case 1). Let's put it aside.

IOPF handling for 2) and 3) is different. Do we need to define different
names to distinguish these two cases?
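To make the naming question concrete, a hypothetical split of the
feature bits could look like this (a sketch only; IOMMU_DEV_FEAT_AUX
and IOMMU_DEV_FEAT_SVA exist today, the IOPF_* names are just the ones
floated in this thread):

enum iommu_dev_features {
	IOMMU_DEV_FEAT_AUX,		/* existing */
	IOMMU_DEV_FEAT_SVA,		/* existing */
	IOMMU_DEV_FEAT_IOPF_FLAT,	/* 3) FL w/ shared CPU page table */
	IOMMU_DEV_FEAT_IOPF_EXT,	/* 2) FL w/ external page table   */
	IOMMU_DEV_FEAT_IOPF_NESTED,	/* future nested extension        */
};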

Best regards,
baolu


Re: [PATCH v13 08/15] iommu/smmuv3: Implement cache_invalidate

2021-01-15 Thread chenxiang (M)

Hi Eric,


On 2020/11/18 19:21, Eric Auger wrote:

Implement domain-selective and page-selective IOTLB invalidations.

Signed-off-by: Eric Auger 

---
v7 -> v8:
- ASID based invalidation using iommu_inv_pasid_info
- check ARCHID/PASID flags in addr based invalidation
- use __arm_smmu_tlb_inv_context and __arm_smmu_tlb_inv_range_nosync

v6 -> v7
- check the uapi version

v3 -> v4:
- adapt to changes in the uapi
- add support for leaf parameter
- do not use arm_smmu_tlb_inv_range_nosync or arm_smmu_tlb_inv_context
   anymore

v2 -> v3:
- replace __arm_smmu_tlb_sync by arm_smmu_cmdq_issue_sync

v1 -> v2:
- properly pass the asid
---
  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 53 +
  1 file changed, 53 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index fdecc9f17b36..24124361dd3b 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2771,6 +2771,58 @@ static void arm_smmu_detach_pasid_table(struct 
iommu_domain *domain)
	mutex_unlock(&smmu_domain->init_mutex);
  }
  
+static int
+arm_smmu_cache_invalidate(struct iommu_domain *domain, struct device *dev,
+			  struct iommu_cache_invalidate_info *inv_info)
+{
+   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+   struct arm_smmu_device *smmu = smmu_domain->smmu;
+
+   if (smmu_domain->stage != ARM_SMMU_DOMAIN_NESTED)
+   return -EINVAL;
+
+   if (!smmu)
+   return -EINVAL;
+
+   if (inv_info->version != IOMMU_CACHE_INVALIDATE_INFO_VERSION_1)
+   return -EINVAL;
+
+   if (inv_info->cache & IOMMU_CACHE_INV_TYPE_IOTLB) {
+   if (inv_info->granularity == IOMMU_INV_GRANU_PASID) {
+   struct iommu_inv_pasid_info *info =
+			&inv_info->granu.pasid_info;
+
+   if (!(info->flags & IOMMU_INV_PASID_FLAGS_ARCHID) ||
+(info->flags & IOMMU_INV_PASID_FLAGS_PASID))
+   return -EINVAL;
+
+   __arm_smmu_tlb_inv_context(smmu_domain, info->archid);
+
+   } else if (inv_info->granularity == IOMMU_INV_GRANU_ADDR) {
+		struct iommu_inv_addr_info *info =
+			&inv_info->granu.addr_info;
+   size_t size = info->nb_granules * info->granule_size;
+   bool leaf = info->flags & IOMMU_INV_ADDR_FLAGS_LEAF;
+
+   if (!(info->flags & IOMMU_INV_ADDR_FLAGS_ARCHID) ||
+(info->flags & IOMMU_INV_ADDR_FLAGS_PASID))
+   return -EINVAL;
+
+   __arm_smmu_tlb_inv_range(info->addr, size,
+info->granule_size, leaf,
+ smmu_domain, info->archid);


When debugging with vSMMU on an ARM64 huawei platform, there is an issue:
the RIL feature is enabled on the guest OS while it is not supported on
the host OS. With some operations (such as rmmod of the driver during
iperf between guest and host), SMMU translation errors occur frequently,
and it works well if the RIL feature is disabled on the guest OS.
We find that in the function vfio_iommu_unmap_notify() (qemu code) the
total size of the TLB range (num_pages * granule_size) is passed to the
host OS as the granule size:

addr_info->granule_size = size
addr_info->nb_granules = 1

So the total size (num_pages * granule_size) is passed as the granule
size to the function __arm_smmu_tlb_inv_range(). If the RIL feature is
not supported, the total size may not be a granule size that the host OS
can recognize, and it will only invalidate part of the TLB range it
wants to invalidate. For example, if 4K/2M/1G page sizes are supported
on the host OS and total_size = 8K, then addr_info->granule_size = 8K
and addr_info->nb_granules = 1; as the RIL feature is not supported, the
host invalidates just 4K in one pass while believing it has invalidated
8K.
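To make the arithmetic concrete, a standalone toy demo of the two
encodings (plain userspace C, not QEMU or kernel code):

#include <stdio.h>

int main(void)
{
	unsigned long total = 8192;	/* 8K range to invalidate */
	unsigned long page = 4096;	/* a granule the host understands */

	/* What QEMU currently sends: the whole range as one "granule";
	 * a non-RIL host acts on one real page only. */
	printf("sent:  granule_size=%lu nb_granules=1  -> %lu invalidated\n",
	       total, page);

	/* What a non-RIL host needs: real page-size granules. */
	printf("fixed: granule_size=%lu nb_granules=%lu -> %lu invalidated\n",
	       page, total / page, page * (total / page));
	return 0;
}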
Please have a look at the issue; we have a temporary fix for it as
follows.


hw/arm/smmuv3.c   | 3 ++-
 hw/vfio/common.c  | 6 +++---
 include/exec/memory.h | 1 +
 3 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 6725019..891a65d 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -821,7 +821,8 @@ static void smmuv3_notify_iova(IOMMUMemoryRegion *mr,

 entry.target_as = &address_space_memory;
 entry.iova = iova;
-entry.addr_mask = num_pages * (1 << granule) - 1;
+entry.addr_mask = (1 << granule) - 1;
+entry.num_pages = num_pages;
 entry.perm = IOMMU_NONE;
 entry.arch_id = asid;
 entry.flags = IOMMU_INV_GRANU_ADDR;
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 5d365e0..c0164b1 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -604,7 +604,7 @@ static void vfio_iommu_unmap_notify(IOMMUNotifier 
*n, IOMMUTLBEntry *iotlb)

 hwaddr start = iotlb->iova + 

Re: [RFC PATCH v3 1/2] iommu: Add capability IOMMU_CAP_VIOMMU

2021-01-15 Thread Lu Baolu

Hi,

On 2021/1/15 14:31, Leon Romanovsky wrote:

On Fri, Jan 15, 2021 at 07:49:47AM +0800, Lu Baolu wrote:

Hi Leon,

On 1/14/21 9:26 PM, Leon Romanovsky wrote:

On Thu, Jan 14, 2021 at 09:30:02AM +0800, Lu Baolu wrote:

Some vendor IOMMU drivers are able to declare that they are running in a
VM context. This is very valuable for features that should only be
supported on bare metal. Add a capability bit so that it can be used.


And how is it used? Who and how will set it?


Use the existing iommu_capable(). I should add more descriptions about
who and how to use it.


I want to see the code that sets this capability.


Currently we have Intel VT-d and virtio-iommu setting this capability.

 static bool intel_iommu_capable(enum iommu_cap cap)
 {
if (cap == IOMMU_CAP_CACHE_COHERENCY)
return domain_update_iommu_snooping(NULL) == 1;
if (cap == IOMMU_CAP_INTR_REMAP)
return irq_remapping_enabled == 1;
+   if (cap == IOMMU_CAP_VIOMMU)
+   return caching_mode_enabled();

return false;
 }

And,

+static bool viommu_capable(enum iommu_cap cap)
+{
+   if (cap == IOMMU_CAP_VIOMMU)
+   return true;
+
+   return false;
+}
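For the consumer side, the existing iommu_capable() interface is all
that is needed; a hypothetical bare-metal-only feature might gate
itself like this (a sketch, the caller is made up):

/* Hypothetical caller: refuse a bare-metal-only feature when the
 * IOMMU reports that it is running in a VM. */
static int foo_enable_bare_metal_feature(struct device *dev)
{
	if (iommu_capable(dev->bus, IOMMU_CAP_VIOMMU))
		return -EOPNOTSUPP;

	/* ... enable the feature ... */
	return 0;
}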

Best regards,
baolu


Re: [PATCH 6/6] vfio/iommu_type1: Drop parameter "pgsize" of update_user_bitmap

2021-01-15 Thread Alex Williamson
On Thu, 7 Jan 2021 12:44:01 +0800
Keqian Zhu  wrote:

> We always use the smallest supported page size of vfio_iommu as
> pgsize. Drop parameter "pgsize" of update_user_bitmap.
> 
> Signed-off-by: Keqian Zhu 
> ---
>  drivers/vfio/vfio_iommu_type1.c | 9 -
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 82649a040148..bceda5e8baaa 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -978,10 +978,9 @@ static void vfio_update_pgsize_bitmap(struct vfio_iommu 
> *iommu)
>  }
>  
>  static int update_user_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu,
> -   struct vfio_dma *dma, dma_addr_t base_iova,
> -   size_t pgsize)
> +   struct vfio_dma *dma, dma_addr_t base_iova)
>  {
> - unsigned long pgshift = __ffs(pgsize);
> + unsigned long pgshift = __ffs(iommu->pgsize_bitmap);
>   unsigned long nbits = dma->size >> pgshift;
>   unsigned long bit_offset = (dma->iova - base_iova) >> pgshift;
>   unsigned long copy_offset = bit_offset / BITS_PER_LONG;
> @@ -1046,7 +1045,7 @@ static int vfio_iova_dirty_bitmap(u64 __user *bitmap, 
> struct vfio_iommu *iommu,
>   if (dma->iova > iova + size - 1)
>   break;
>  
> - ret = update_user_bitmap(bitmap, iommu, dma, iova, pgsize);
> + ret = update_user_bitmap(bitmap, iommu, dma, iova);
>   if (ret)
>   return ret;
>  
> @@ -1192,7 +1191,7 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
>  
>   if (unmap->flags & VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP) {
>   ret = update_user_bitmap(bitmap->data, iommu, dma,
> -  unmap->iova, pgsize);
> +  unmap->iova);
>   if (ret)
>   break;
>   }

Same as the previous, both call sites already have both pgsize and
pgshift, pass both rather than recalculate.  Thanks,

Alex
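One way to read that suggestion, as a hedged sketch rather than a
posted patch: both call sites already hold pgsize and pgshift, so the
signature would take both and drop the __ffs() recomputation entirely.

static int update_user_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu,
			      struct vfio_dma *dma, dma_addr_t base_iova,
			      size_t pgsize, unsigned long pgshift);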



Re: [PATCH 5/6] vfio/iommu_type1: Drop parameter "pgsize" of vfio_iova_dirty_bitmap

2021-01-15 Thread Alex Williamson
On Thu, 7 Jan 2021 12:44:00 +0800
Keqian Zhu  wrote:

> We always use the smallest supported page size of vfio_iommu as
> pgsize. Remove parameter "pgsize" of vfio_iova_dirty_bitmap.
> 
> Signed-off-by: Keqian Zhu 
> ---
>  drivers/vfio/vfio_iommu_type1.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 080c05b129ee..82649a040148 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -1015,11 +1015,12 @@ static int update_user_bitmap(u64 __user *bitmap, 
> struct vfio_iommu *iommu,
>  }
>  
>  static int vfio_iova_dirty_bitmap(u64 __user *bitmap, struct vfio_iommu 
> *iommu,
> -   dma_addr_t iova, size_t size, size_t pgsize)
> +   dma_addr_t iova, size_t size)
>  {
>   struct vfio_dma *dma;
>   struct rb_node *n;
> - unsigned long pgshift = __ffs(pgsize);
> + unsigned long pgshift = __ffs(iommu->pgsize_bitmap);
> + size_t pgsize = (size_t)1 << pgshift;
>   int ret;
>  
>   /*
> @@ -2824,8 +2825,7 @@ static int vfio_iommu_type1_dirty_pages(struct 
> vfio_iommu *iommu,
>   if (iommu->dirty_page_tracking)
>   ret = vfio_iova_dirty_bitmap(range.bitmap.data,
>iommu, range.iova,
> -  range.size,
> -  range.bitmap.pgsize);
> +  range.size);
>   else
>   ret = -EINVAL;
>  out_unlock:

In this case the caller has actually already calculated both pgsize and
pgshift, the better optimization would be to pass both rather than
recalculate.  Thanks,

Alex



Re: [PATCH 4/6] vfio/iommu_type1: Drop parameter "pgsize" of vfio_dma_bitmap_alloc_all

2021-01-15 Thread Alex Williamson
On Thu, 7 Jan 2021 12:43:59 +0800
Keqian Zhu  wrote:

> We always use the smallest supported page size of vfio_iommu as
> pgsize. Remove parameter "pgsize" of vfio_dma_bitmap_alloc_all.
> 
> Signed-off-by: Keqian Zhu 
> ---
>  drivers/vfio/vfio_iommu_type1.c | 8 +++-
>  1 file changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index b596c482487b..080c05b129ee 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -236,9 +236,10 @@ static void vfio_dma_populate_bitmap(struct vfio_dma 
> *dma, size_t pgsize)
>   }
>  }
>  
> -static int vfio_dma_bitmap_alloc_all(struct vfio_iommu *iommu, size_t pgsize)
> +static int vfio_dma_bitmap_alloc_all(struct vfio_iommu *iommu)
>  {
>   struct rb_node *n;
> + size_t pgsize = (size_t)1 << __ffs(iommu->pgsize_bitmap);
>  
>   for (n = rb_first(&iommu->dma_list); n; n = rb_next(n)) {
>   struct vfio_dma *dma = rb_entry(n, struct vfio_dma, node);
> @@ -2761,12 +2762,9 @@ static int vfio_iommu_type1_dirty_pages(struct 
> vfio_iommu *iommu,
>   return -EINVAL;
>  
>   if (dirty.flags & VFIO_IOMMU_DIRTY_PAGES_FLAG_START) {
> - size_t pgsize;
> -
>   mutex_lock(&iommu->lock);
> - pgsize = 1 << __ffs(iommu->pgsize_bitmap);
>   if (!iommu->dirty_page_tracking) {
> - ret = vfio_dma_bitmap_alloc_all(iommu, pgsize);
> + ret = vfio_dma_bitmap_alloc_all(iommu);
>   if (!ret)
>   iommu->dirty_page_tracking = true;
>   }

This just moves the same calculation from one place to another, what's
the point?  Thanks,

Alex



Re: [PATCH 3/6] vfio/iommu_type1: Initially set the pinned_page_dirty_scope

2021-01-15 Thread Alex Williamson
On Thu, 7 Jan 2021 12:43:58 +0800
Keqian Zhu  wrote:

> For now there are 3 ways to promote the pinned_page_dirty_scope
> status of vfio_iommu:
> 
> 1. Through the vfio pin interface.
> 2. Detaching a group without pinned_dirty_scope.
> 3. Attaching a group with pinned_dirty_scope.
> 
> For point 3, the only chance to promote the pinned_page_dirty_scope
> status is when the vfio_iommu is newly created. As we can safely set
> an empty vfio_iommu to pinned status, point 3 can be removed to
> reduce operations.
> 
> Signed-off-by: Keqian Zhu 
> ---
>  drivers/vfio/vfio_iommu_type1.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 110ada24ee91..b596c482487b 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -2045,11 +2045,8 @@ static int vfio_iommu_type1_attach_group(void 
> *iommu_data,
>* Non-iommu backed group cannot dirty memory directly,
>* it can only use interfaces that provide dirty
>* tracking.
> -  * The iommu scope can only be promoted with the
> -  * addition of a dirty tracking group.
>*/
>   group->pinned_page_dirty_scope = true;
> - promote_pinned_page_dirty_scope(iommu);
>   mutex_unlock(&iommu->lock);
>  
>   return 0;
> @@ -2436,6 +2433,7 @@ static void *vfio_iommu_type1_open(unsigned long arg)
>   INIT_LIST_HEAD(&iommu->iova_list);
>   iommu->dma_list = RB_ROOT;
>   iommu->dma_avail = dma_entry_limit;
> + iommu->pinned_page_dirty_scope = true;
>   mutex_init(&iommu->lock);
>   BLOCKING_INIT_NOTIFIER_HEAD(&iommu->notifier);
>  

This would be resolved automatically if we used the counter approach I
mentioned on the previous patch, adding a pinned-page scope group simply
wouldn't increment the iommu counter, which would initially be zero
indicating no "all-dma" groups.  Thanks,

Alex



Re: [PATCH 2/6] vfio/iommu_type1: Ignore external domain when promote pinned_scope

2021-01-15 Thread Alex Williamson
On Thu, 7 Jan 2021 12:43:57 +0800
Keqian Zhu  wrote:

> The pinned_scope of an external domain's groups is always true; that
> is to say, we can safely ignore the external domain when promoting
> the pinned_scope status of the vfio_iommu.
> 
> Signed-off-by: Keqian Zhu 
> ---
>  drivers/vfio/vfio_iommu_type1.c | 14 +++---
>  1 file changed, 3 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 334a8240e1da..110ada24ee91 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -1637,14 +1637,7 @@ static void promote_pinned_page_dirty_scope(struct 
> vfio_iommu *iommu)
>   }
>   }
>  
> - if (iommu->external_domain) {
> - domain = iommu->external_domain;
> - list_for_each_entry(group, &domain->group_list, next) {
> - if (!group->pinned_page_dirty_scope)
> - return;
> - }
> - }
> -
> + /* The external domain always passes check */
>   iommu->pinned_page_dirty_scope = true;
>  }
>  
> @@ -2347,7 +2340,6 @@ static void vfio_iommu_type1_detach_group(void 
> *iommu_data,
>   if (iommu->external_domain) {
>   group = find_iommu_group(iommu->external_domain, iommu_group);
>   if (group) {
> - promote_dirty_scope = !group->pinned_page_dirty_scope;


With this, vfio_group.pinned_page_dirty_scope is effectively a dead
field on the struct for groups on the external_domain group list and
handled specially.  That's not great.

If you actually want to make more than a trivial improvement to scope
tracking, what about making a counter on our struct vfio_iommu for all
the non-pinned-page (ie. all-dma) scope groups attached to the
container.  Groups on the external domain would still set their group
dirty scope to pinned pages, groups making use of an iommu domain would
have an all-dma scope initially and increment that counter when
attached.  Groups that still have an all-dma scope on detach would
decrement the counter.  If a group changes from all-dma to pinned-page
scope, the counter is also decremented.  We'd never need to search
across group lists.  Thanks,

Alex
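A minimal sketch of that counter approach (the field and helper names
here are hypothetical):

struct vfio_iommu {
	/* ... existing fields ... */
	unsigned int num_non_pinned_groups;	/* hypothetical counter */
};

/* iommu-wide scope is pinned-page iff no all-dma groups remain;
 * attach/detach and scope transitions adjust the counter, so no
 * group-list walk is ever needed. */
static bool vfio_iommu_pinned_page_scope(struct vfio_iommu *iommu)
{
	return iommu->num_non_pinned_groups == 0;
}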

>   list_del(&group->next);
>   kfree(group);
>  
> @@ -2360,7 +2352,8 @@ static void vfio_iommu_type1_detach_group(void 
> *iommu_data,
>   kfree(iommu->external_domain);
>   iommu->external_domain = NULL;
>   }
> - goto detach_group_done;
> + mutex_unlock(>lock);
> + return;
>   }
>   }
>  
> @@ -2408,7 +2401,6 @@ static void vfio_iommu_type1_detach_group(void 
> *iommu_data,
>   else
>   vfio_iommu_iova_free(_copy);
>  
> -detach_group_done:
>   /*
>* Removal of a group without dirty tracking may allow the iommu scope
>* to be promoted.



Re: [PATCH 1/6] vfio/iommu_type1: Make an explicit "promote" semantic

2021-01-15 Thread Alex Williamson
On Thu, 7 Jan 2021 12:43:56 +0800
Keqian Zhu  wrote:

> When we want to promote the pinned_page_dirty_scope of vfio_iommu,
> we call the "update" function to visit all vfio_groups, but when we
> want to downgrade it, we can set the flag to false directly.

I agree that the transition can only go in one direction, but it's
still conditional on the scope of all groups involved.  We are
"updating" the iommu state based on the change of a group.  Renaming
this to "promote" seems like a matter of personal preference.

> So we'd better give the "update" function an explicit "promote"
> semantic. BTW, if the vfio_iommu has already been promoted, then
> return early.

Currently it's the caller that avoids using this function when the
iommu scope is already correct.  In fact the change induces a
redundant test in the pin_pages code path: we're changing a group from
non-pinned-page-scope to pinned-page-scope, therefore the iommu scope
cannot initially be scope limited.  In the attach_group call path,
we're moving that test from the caller, so at best we've introduced an
additional function call.

The function as it exists today is also more versatile whereas the
"promote" version here forces it to a single task with no appreciable
difference in complexity or code.  This seems like a frivolous change.
Thanks,

Alex

> Signed-off-by: Keqian Zhu 
> ---
>  drivers/vfio/vfio_iommu_type1.c | 30 ++
>  1 file changed, 14 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 0b4dedaa9128..334a8240e1da 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -148,7 +148,7 @@ static int put_pfn(unsigned long pfn, int prot);
>  static struct vfio_group *vfio_iommu_find_iommu_group(struct vfio_iommu 
> *iommu,
>  struct iommu_group *iommu_group);
>  
> -static void update_pinned_page_dirty_scope(struct vfio_iommu *iommu);
> +static void promote_pinned_page_dirty_scope(struct vfio_iommu *iommu);
>  /*
>   * This code handles mapping and unmapping of user data buffers
>   * into DMA'ble space using the IOMMU
> @@ -714,7 +714,7 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
>   group = vfio_iommu_find_iommu_group(iommu, iommu_group);
>   if (!group->pinned_page_dirty_scope) {
>   group->pinned_page_dirty_scope = true;
> - update_pinned_page_dirty_scope(iommu);
> + promote_pinned_page_dirty_scope(iommu);
>   }
>  
>   goto pin_done;
> @@ -1622,27 +1622,26 @@ static struct vfio_group 
> *vfio_iommu_find_iommu_group(struct vfio_iommu *iommu,
>   return group;
>  }
>  
> -static void update_pinned_page_dirty_scope(struct vfio_iommu *iommu)
> +static void promote_pinned_page_dirty_scope(struct vfio_iommu *iommu)
>  {
>   struct vfio_domain *domain;
>   struct vfio_group *group;
>  
> + if (iommu->pinned_page_dirty_scope)
> + return;
> +
>   list_for_each_entry(domain, &iommu->domain_list, next) {
>   list_for_each_entry(group, &domain->group_list, next) {
> - if (!group->pinned_page_dirty_scope) {
> - iommu->pinned_page_dirty_scope = false;
> + if (!group->pinned_page_dirty_scope)
>   return;
> - }
>   }
>   }
>  
>   if (iommu->external_domain) {
>   domain = iommu->external_domain;
>   list_for_each_entry(group, &domain->group_list, next) {
> - if (!group->pinned_page_dirty_scope) {
> - iommu->pinned_page_dirty_scope = false;
> + if (!group->pinned_page_dirty_scope)
>   return;
> - }
>   }
>   }
>  
> @@ -2057,8 +2056,7 @@ static int vfio_iommu_type1_attach_group(void 
> *iommu_data,
>* addition of a dirty tracking group.
>*/
>   group->pinned_page_dirty_scope = true;
> - if (!iommu->pinned_page_dirty_scope)
> - update_pinned_page_dirty_scope(iommu);
> + promote_pinned_page_dirty_scope(iommu);
>   mutex_unlock(&iommu->lock);
>  
>   return 0;
> @@ -2341,7 +2339,7 @@ static void vfio_iommu_type1_detach_group(void 
> *iommu_data,
>   struct vfio_iommu *iommu = iommu_data;
>   struct vfio_domain *domain;
>   struct vfio_group *group;
> - bool update_dirty_scope = false;
> + bool promote_dirty_scope = false;
>   LIST_HEAD(iova_copy);
>  
>   mutex_lock(&iommu->lock);
> @@ -2349,7 +2347,7 @@ static void vfio_iommu_type1_detach_group(void 
> *iommu_data,
>   if (iommu->external_domain) {
>   group = find_iommu_group(iommu->external_domain, iommu_group);
>   if (group) {
> -  

Re: [PATCH v4 3/3] iommu/iova: Flush CPU rcache for when a depot fills

2021-01-15 Thread Robin Murphy

On 2021-01-15 17:32, Jean-Philippe Brucker wrote:

On Thu, Dec 10, 2020 at 02:23:09AM +0800, John Garry wrote:

Leizhen reported some time ago that IOVA performance may degrade over time
[0], but unfortunately his solution to fix this problem was not given
attention.

To summarize, the issue is that as time goes by, the CPU rcache and depot
rcache continue to grow. As such, IOVA RB tree access time also continues
to grow.

At a certain point, a depot may become full, and also some CPU rcaches may
also be full when inserting another IOVA is attempted. For this scenario,
currently the "loaded" CPU rcache is freed and a new one is created. This
freeing means that many IOVAs in the RB tree need to be freed, which
makes IO throughput performance fall off a cliff in some storage scenarios:

Jobs: 12 (f=12): [] [0.0% done] [6314MB/0KB/0KB /s] [1616K/0/0 iops]
Jobs: 12 (f=12): [] [0.0% done] [5669MB/0KB/0KB /s] [1451K/0/0 iops]
Jobs: 12 (f=12): [] [0.0% done] [6031MB/0KB/0KB /s] [1544K/0/0 iops]
Jobs: 12 (f=12): [] [0.0% done] [6673MB/0KB/0KB /s] [1708K/0/0 iops]
Jobs: 12 (f=12): [] [0.0% done] [6705MB/0KB/0KB /s] [1717K/0/0 iops]
Jobs: 12 (f=12): [] [0.0% done] [6031MB/0KB/0KB /s] [1544K/0/0 iops]
Jobs: 12 (f=12): [] [0.0% done] [6761MB/0KB/0KB /s] [1731K/0/0 iops]
Jobs: 12 (f=12): [] [0.0% done] [6705MB/0KB/0KB /s] [1717K/0/0 iops]
Jobs: 12 (f=12): [] [0.0% done] [6685MB/0KB/0KB /s] [1711K/0/0 iops]
Jobs: 12 (f=12): [] [0.0% done] [6178MB/0KB/0KB /s] [1582K/0/0 iops]
Jobs: 12 (f=12): [] [0.0% done] [6731MB/0KB/0KB /s] [1723K/0/0 iops]
Jobs: 12 (f=12): [] [0.0% done] [2387MB/0KB/0KB /s] [611K/0/0 iops]
Jobs: 12 (f=12): [] [0.0% done] [2689MB/0KB/0KB /s] [688K/0/0 iops]
Jobs: 12 (f=12): [] [0.0% done] [2278MB/0KB/0KB /s] [583K/0/0 iops]
Jobs: 12 (f=12): [] [0.0% done] [1288MB/0KB/0KB /s] [330K/0/0 iops]
Jobs: 12 (f=12): [] [0.0% done] [1632MB/0KB/0KB /s] [418K/0/0 iops]
Jobs: 12 (f=12): [] [0.0% done] [1765MB/0KB/0KB /s] [452K/0/0 iops]

And continue in this fashion, without recovering. Note that in this
example it was required to wait 16 hours for this to occur. Also note that
IO throughput also gradually becomes more unstable leading up to
this point.

This problem is only seen for non-strict mode. For strict mode, the rcaches
stay quite compact.


It would be good to understand why the rcache doesn't stabilize. Could be
a bug, or just need some tuning

In strict mode, if a driver does Alloc-Free-Alloc and the first alloc
misses the rcache, the second allocation hits it. The same sequence in
non-strict mode misses the cache twice, because the IOVA is added to the
flush queue on Free.

So rather than AFAFAF.. we get AAA..FFF.., only once the fq_timer triggers
or the FQ is full. Interestingly the FQ size is 2x IOVA_MAG_SIZE, so we
could allocate 2 magazines worth of fresh IOVAs before alloc starts
hitting the cache. If a job allocates more than that, some magazines are
going to the depot, and with multi-CPU jobs those will get used on other
CPUs during the next alloc bursts, causing the progressive increase in
rcache consumption. I wonder if setting IOVA_MAG_SIZE > IOVA_FQ_SIZE helps
reuse of IOVAs?

Then again I haven't worked out the details, might be entirely wrong. I'll
have another look next week.


I did start digging into the data (thanks for that!) before Christmas, 
but between being generally frazzled and trying to remember how to write 
Perl to massage the numbers out of the log dump I never got round to 
responding, sorry.


The partial thoughts that I can recall right now are firstly that the 
total numbers of IOVAs are actually pretty meaningless, it really needs 
to be broken down by size (that's where my Perl-hacking stalled...); 
secondly that the pattern is far more than just a steady increase - the 
CPU rcache count looks to be heading asymptotically towards ~65K IOVAs 
all the time, representing (IIRC) two sizes being in heavy rotation, 
while the depot is happily ticking along in a steady state as expected, 
until it suddenly explodes out of nowhere; thirdly, I'd really like to 
see instrumentation of the flush queues at the same time, since I think 
they're the real culprit.


My theory so far is that everyone is calling queue_iova() frequently 
enough to keep the timer at bay and their own queues drained. Then at 
the ~16H mark, *something* happens that pauses unmaps long enough for 
the timer to fire, and at that point all hell breaks loose. The depot is 
suddenly flooded with IOVAs of *all* sizes, indicative of all the queues 
being flushed at once (note that the two most common sizes have been 
hovering perilously close to "full" the whole time), but then, 
crucially, *that keeps happening*. My guess is that the load of 
fq_flush_timeout() slows things down enough that the the 

Re: [PATCH v2 2/2] vfio/iommu_type1: Sanity check pfn_list when remove vfio_dma

2021-01-15 Thread Alex Williamson
On Fri, 15 Jan 2021 17:26:43 +0800
Keqian Zhu  wrote:

> vfio_sanity_check_pfn_list() is used to check whether the pfn_list of
> a vfio_dma is empty when removing the external domain, so it makes
> the wrong assumption that only the external domain will add pfns to a
> dma pfn_list.
> 
> Now we apply this check when removing a specific vfio_dma and extract
> the notifier check just for the external domain.

The page pinning interface is gated by having a notifier registered for
unmaps, therefore non-external domains would also need to register a
notifier.  There's currently no other way to add entries to the
pfn_list.  So if we allow pinning for such domains, then it's wrong to
WARN_ON() when the notifier list is not-empty when removing an external
domain.  Long term we should probably extend page {un}pinning for the
caller to pass their notifier to be validated against the notifier list
rather than just allowing page pinning if *any* notifier is registered.
Thanks,

Alex
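A rough sketch of that long-term idea (everything below is
hypothetical, not existing code):

/* Pinning callers would pass the notifier they registered, and it
 * would be validated against the iommu's chain before pinning. */
static bool vfio_notifier_registered(struct vfio_iommu *iommu,
				     struct notifier_block *nb)
{
	struct notifier_block *cur;

	for (cur = rcu_dereference_raw(iommu->notifier.head); cur;
	     cur = rcu_dereference_raw(cur->next))
		if (cur == nb)
			return true;
	return false;
}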
 
> Fixes: a54eb55045ae ("vfio iommu type1: Add support for mediated devices")
> Signed-off-by: Keqian Zhu 
> ---
>  drivers/vfio/vfio_iommu_type1.c | 24 +---
>  1 file changed, 5 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 4e82b9a3440f..a9bc15e84a4e 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -958,6 +958,7 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, 
> struct vfio_dma *dma,
>  
>  static void vfio_remove_dma(struct vfio_iommu *iommu, struct vfio_dma *dma)
>  {
> + WARN_ON(!RB_EMPTY_ROOT(&dma->pfn_list));
>   vfio_unmap_unpin(iommu, dma, true);
>   vfio_unlink_dma(iommu, dma);
>   put_task_struct(dma->task);
> @@ -2251,23 +2252,6 @@ static void vfio_iommu_unmap_unpin_reaccount(struct 
> vfio_iommu *iommu)
>   }
>  }
>  
> -static void vfio_sanity_check_pfn_list(struct vfio_iommu *iommu)
> -{
> - struct rb_node *n;
> -
> - n = rb_first(&iommu->dma_list);
> - for (; n; n = rb_next(n)) {
> - struct vfio_dma *dma;
> -
> - dma = rb_entry(n, struct vfio_dma, node);
> -
> - if (WARN_ON(!RB_EMPTY_ROOT(&dma->pfn_list)))
> - break;
> - }
> - /* mdev vendor driver must unregister notifier */
> - WARN_ON(iommu->notifier.head);
> -}
> -
>  /*
>   * Called when a domain is removed in detach. It is possible that
>   * the removed domain decided the iova aperture window. Modify the
> @@ -2367,7 +2351,8 @@ static void vfio_iommu_type1_detach_group(void 
> *iommu_data,
>   kfree(group);
>  
> - if (list_empty(&iommu->external_domain->group_list)) {
> - vfio_sanity_check_pfn_list(iommu);
> + /* mdev vendor driver must unregister notifier 
> */
> + WARN_ON(iommu->notifier.head);
>  
>   if (!IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu))
>   vfio_iommu_unmap_unpin_all(iommu);
> @@ -2491,7 +2476,8 @@ static void vfio_iommu_type1_release(void *iommu_data)
>  
>   if (iommu->external_domain) {
>   vfio_release_domain(iommu->external_domain, true);
> - vfio_sanity_check_pfn_list(iommu);
> + /* mdev vendor driver must unregister notifier */
> + WARN_ON(iommu->notifier.head);
>   kfree(iommu->external_domain);
>   }
>  



Re: [GIT PULL] IOMMU fixes for -rc4

2021-01-15 Thread pr-tracker-bot
The pull request you sent on Fri, 15 Jan 2021 11:26:20 +:

> git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git tags/iommu-fixes

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/7aec71cd9c1f251ef17eae5f898c10133d49421d

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


Re: [PATCH v2 1/2] vfio/iommu_type1: Populate full dirty when detach non-pinned group

2021-01-15 Thread Alex Williamson
On Fri, 15 Jan 2021 17:26:42 +0800
Keqian Zhu  wrote:

> If a group with non-pinned-page dirty scope is detached with dirty
> logging enabled, we should fully populate the dirty bitmaps at the
> time it's removed since we don't know the extent of its previous DMA,
> nor will the group be present to trigger the full bitmap when the user
> retrieves the dirty bitmap.
> 
> Fixes: d6a4c185660c ("vfio iommu: Implementation of ioctl for dirty pages 
> tracking")
> Suggested-by: Alex Williamson 
> Signed-off-by: Keqian Zhu 
> ---
>  drivers/vfio/vfio_iommu_type1.c | 18 +-
>  1 file changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 0b4dedaa9128..4e82b9a3440f 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -236,6 +236,19 @@ static void vfio_dma_populate_bitmap(struct vfio_dma 
> *dma, size_t pgsize)
>   }
>  }
>  
> +static void vfio_iommu_populate_bitmap_full(struct vfio_iommu *iommu)
> +{
> + struct rb_node *n;
> + unsigned long pgshift = __ffs(iommu->pgsize_bitmap);
> +
> + for (n = rb_first(&iommu->dma_list); n; n = rb_next(n)) {
> + struct vfio_dma *dma = rb_entry(n, struct vfio_dma, node);
> +
> + if (dma->iommu_mapped)
> + bitmap_set(dma->bitmap, 0, dma->size >> pgshift);
> + }
> +}
> +
>  static int vfio_dma_bitmap_alloc_all(struct vfio_iommu *iommu, size_t pgsize)
>  {
>   struct rb_node *n;
> @@ -2415,8 +2428,11 @@ static void vfio_iommu_type1_detach_group(void 
> *iommu_data,
>* Removal of a group without dirty tracking may allow the iommu scope
>* to be promoted.
>*/
> - if (update_dirty_scope)
> + if (update_dirty_scope) {
>   update_pinned_page_dirty_scope(iommu);
> + if (iommu->dirty_page_tracking)
> + vfio_iommu_populate_bitmap_full(iommu);
> + }
>   mutex_unlock(&iommu->lock);
>  }
>  

This doesn't do the right thing.  This marks the bitmap dirty if:

 * The detached group dirty scope was not limited to pinned pages

 AND

 * Dirty tracking is enabled

 AND

 * The vfio_dma is *currently* (ie. after the detach) iommu_mapped

We need to mark the bitmap dirty based on whether the vfio_dma *was*
iommu_mapped by the group that is now detached.  Thanks,

Alex
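One possible reading of that, as a hedged sketch (hypothetical helper;
the point is only that iommu_mapped must be sampled *before* the detach
tears the mappings down):

/* Call this before the group's domain is detached, while
 * dma->iommu_mapped still reflects the detaching group. */
static void vfio_iommu_mark_dirty_predetach(struct vfio_iommu *iommu)
{
	struct rb_node *n;
	unsigned long pgshift = __ffs(iommu->pgsize_bitmap);

	for (n = rb_first(&iommu->dma_list); n; n = rb_next(n)) {
		struct vfio_dma *dma = rb_entry(n, struct vfio_dma, node);

		if (dma->iommu_mapped)
			bitmap_set(dma->bitmap, 0, dma->size >> pgshift);
	}
}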



Re: [PATCH v4 3/3] iommu/iova: Flush CPU rcache for when a depot fills

2021-01-15 Thread Jean-Philippe Brucker
On Thu, Dec 10, 2020 at 02:23:09AM +0800, John Garry wrote:
> Leizhen reported some time ago that IOVA performance may degrade over time
> [0], but unfortunately his solution to fix this problem was not given
> attention.
> 
> To summarize, the issue is that as time goes by, the CPU rcache and depot
> rcache continue to grow. As such, IOVA RB tree access time also continues
> to grow.
> 
> At a certain point, a depot may become full, and also some CPU rcaches may
> also be full when inserting another IOVA is attempted. For this scenario,
> currently the "loaded" CPU rcache is freed and a new one is created. This
> freeing means that many IOVAs in the RB tree need to be freed, which
> makes IO throughput performance fall off a cliff in some storage scenarios:
> 
> Jobs: 12 (f=12): [] [0.0% done] [6314MB/0KB/0KB /s] [1616K/0/0 
> iops]
> Jobs: 12 (f=12): [] [0.0% done] [5669MB/0KB/0KB /s] [1451K/0/0 
> iops]
> Jobs: 12 (f=12): [] [0.0% done] [6031MB/0KB/0KB /s] [1544K/0/0 
> iops]
> Jobs: 12 (f=12): [] [0.0% done] [6673MB/0KB/0KB /s] [1708K/0/0 
> iops]
> Jobs: 12 (f=12): [] [0.0% done] [6705MB/0KB/0KB /s] [1717K/0/0 
> iops]
> Jobs: 12 (f=12): [] [0.0% done] [6031MB/0KB/0KB /s] [1544K/0/0 
> iops]
> Jobs: 12 (f=12): [] [0.0% done] [6761MB/0KB/0KB /s] [1731K/0/0 
> iops]
> Jobs: 12 (f=12): [] [0.0% done] [6705MB/0KB/0KB /s] [1717K/0/0 
> iops]
> Jobs: 12 (f=12): [] [0.0% done] [6685MB/0KB/0KB /s] [1711K/0/0 
> iops]
> Jobs: 12 (f=12): [] [0.0% done] [6178MB/0KB/0KB /s] [1582K/0/0 
> iops]
> Jobs: 12 (f=12): [] [0.0% done] [6731MB/0KB/0KB /s] [1723K/0/0 
> iops]
> Jobs: 12 (f=12): [] [0.0% done] [2387MB/0KB/0KB /s] [611K/0/0 
> iops]
> Jobs: 12 (f=12): [] [0.0% done] [2689MB/0KB/0KB /s] [688K/0/0 
> iops]
> Jobs: 12 (f=12): [] [0.0% done] [2278MB/0KB/0KB /s] [583K/0/0 
> iops]
> Jobs: 12 (f=12): [] [0.0% done] [1288MB/0KB/0KB /s] [330K/0/0 
> iops]
> Jobs: 12 (f=12): [] [0.0% done] [1632MB/0KB/0KB /s] [418K/0/0 
> iops]
> Jobs: 12 (f=12): [] [0.0% done] [1765MB/0KB/0KB /s] [452K/0/0 
> iops]
> 
> And continue in this fashion, without recovering. Note that in this
> example it was required to wait 16 hours for this to occur. Also note that
> IO throughput also gradually becomes more unstable leading up to
> this point.
> 
> This problem is only seen for non-strict mode. For strict mode, the rcaches
> stay quite compact.

It would be good to understand why the rcache doesn't stabilize. Could be
a bug, or just need some tuning

In strict mode, if a driver does Alloc-Free-Alloc and the first alloc
misses the rcache, the second allocation hits it. The same sequence in
non-strict mode misses the cache twice, because the IOVA is added to the
flush queue on Free. 

So rather than AFAFAF.. we get AAA..FFF.., only once the fq_timer triggers
or the FQ is full. Interestingly the FQ size is 2x IOVA_MAG_SIZE, so we
could allocate 2 magazines worth of fresh IOVAs before alloc starts
hitting the cache. If a job allocates more than that, some magazines are
going to the depot, and with multi-CPU jobs those will get used on other
CPUs during the next alloc bursts, causing the progressive increase in
rcache consumption. I wonder if setting IOVA_MAG_SIZE > IOVA_FQ_SIZE helps
reuse of IOVAs?

Then again I haven't worked out the details, might be entirely wrong. I'll
have another look next week.
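To illustrate the AFAF-vs-burst point, a standalone toy model (plain
userspace C; IOVA_FQ_SIZE=256 = 2x IOVA_MAG_SIZE is an assumption taken
from the kernel at the time):

#include <stdio.h>

#define FQ 256	/* IOVA_FQ_SIZE */

int main(void)
{
	int cache = 0, fq = 0, strict_miss = 0, lazy_miss = 0;

	/* strict mode: AFAFAF... - each free refills the cache */
	for (int i = 0; i < 1000; i++) {
		if (cache) cache--; else strict_miss++;	/* alloc */
		cache++;				/* free  */
	}

	cache = 0;
	/* non-strict: frees park in the FQ until it fills */
	for (int i = 0; i < 1000; i++) {
		if (cache) cache--; else lazy_miss++;	/* alloc */
		if (++fq == FQ) { cache += fq; fq = 0; } /* deferred free */
	}

	/* prints 1 vs 256: two magazines' worth of cold misses */
	printf("rcache misses: strict=%d non-strict=%d\n",
	       strict_miss, lazy_miss);
	return 0;
}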

Thanks,
Jean

> As a solution to this issue, judge that the IOVA caches have grown too big
> when cached magazines need to be free, and just flush all the CPUs rcaches
> instead.
> 
> The depot rcaches, however, are not flushed, as they can be used to
> immediately replenish active CPUs.
> 
> In future, some IOVA compaction could be implemented to solve the
> instability issue, which I figure could be quite complex to implement.
> 
> [0] 
> https://lore.kernel.org/linux-iommu/20190815121104.29140-3-thunder.leiz...@huawei.com/
> 
> Analyzed-by: Zhen Lei 
> Reported-by: Xiang Chen 
> Tested-by: Xiang Chen 
> Signed-off-by: John Garry 
> Reviewed-by: Zhen Lei 
> ---
>  drivers/iommu/iova.c | 16 ++--
>  1 file changed, 6 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> index 732ee687e0e2..39b7488de8bb 100644
> --- a/drivers/iommu/iova.c
> +++ b/drivers/iommu/iova.c
> @@ -841,7 +841,6 @@ static bool __iova_rcache_insert(struct iova_domain 
> *iovad,
>struct iova_rcache *rcache,
>unsigned long iova_pfn)
>  {
> - struct iova_magazine *mag_to_free = NULL;
>   struct iova_cpu_rcache *cpu_rcache;
>   bool can_insert = false;
>   unsigned long flags;
> @@ -863,13 +862,12 @@ static bool __iova_rcache_insert(struct iova_domain 
> *iovad,
>   if (cpu_rcache->loaded)
>   

Re: [PATCH v4 2/3] iommu/iova: Avoid double-negatives in magazine helpers

2021-01-15 Thread Jean-Philippe Brucker
On Thu, Dec 10, 2020 at 02:23:08AM +0800, John Garry wrote:
> A similar crash to the following could be observed if initial CPU rcache
> magazine allocations fail in init_iova_rcaches():

Any idea why that's happening?  This fix seems ok but if we're expecting
allocation failures for the loaded magazine then we could easily get it
for cpu_rcaches too, and get a similar abort at runtime.

Thanks,
Jean

> Unable to handle kernel NULL pointer dereference at virtual address 
> 
> Mem abort info:
> 
>   free_iova_fast+0xfc/0x280
>   iommu_dma_free_iova+0x64/0x70
>   __iommu_dma_unmap+0x9c/0xf8
>   iommu_dma_unmap_sg+0xa8/0xc8
>   dma_unmap_sg_attrs+0x28/0x50
>   cq_thread_v3_hw+0x2dc/0x528
>   irq_thread_fn+0x2c/0xa0
>   irq_thread+0x130/0x1e0
>   kthread+0x154/0x158
>   ret_from_fork+0x10/0x34
> 
> The issue is that expression !iova_magazine_full(NULL) evaluates true; this
> falls over in __iova_rcache_insert() when we attempt to cache a mag and
> cpu_rcache->loaded == NULL:
> 
> if (!iova_magazine_full(cpu_rcache->loaded)) {
>   can_insert = true;
> ...
> 
> if (can_insert)
>   iova_magazine_push(cpu_rcache->loaded, iova_pfn);
> 
> As above, can_insert is evaluated true, which it shouldn't be, and we try
> to insert pfns in a NULL mag, which is not safe.
> 
> To avoid this, stop using double-negatives, like !iova_magazine_full() and
> !iova_magazine_empty(), and use positive tests, like
> iova_magazine_has_space() and iova_magazine_has_pfns(), respectively; these
> can safely deal with cpu_rcache->{loaded, prev} = NULL.
> 
> Signed-off-by: John Garry 
> Tested-by: Xiang Chen 
> Reviewed-by: Zhen Lei 

> ---
>  drivers/iommu/iova.c | 29 +
>  1 file changed, 17 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> index cf1aacda2fe4..732ee687e0e2 100644
> --- a/drivers/iommu/iova.c
> +++ b/drivers/iommu/iova.c
> @@ -767,14 +767,18 @@ iova_magazine_free_pfns(struct iova_magazine *mag, 
> struct iova_domain *iovad)
>   mag->size = 0;
>  }
>  
> -static bool iova_magazine_full(struct iova_magazine *mag)
> +static bool iova_magazine_has_space(struct iova_magazine *mag)
>  {
> - return (mag && mag->size == IOVA_MAG_SIZE);
> + if (!mag)
> + return false;
> + return mag->size < IOVA_MAG_SIZE;
>  }
>  
> -static bool iova_magazine_empty(struct iova_magazine *mag)
> +static bool iova_magazine_has_pfns(struct iova_magazine *mag)
>  {
> - return (!mag || mag->size == 0);
> + if (!mag)
> + return false;
> + return mag->size;
>  }
>  
>  static unsigned long iova_magazine_pop(struct iova_magazine *mag,
> @@ -783,7 +787,7 @@ static unsigned long iova_magazine_pop(struct 
> iova_magazine *mag,
>   int i;
>   unsigned long pfn;
>  
> - BUG_ON(iova_magazine_empty(mag));
> + BUG_ON(!iova_magazine_has_pfns(mag));
>  
>   /* Only fall back to the rbtree if we have no suitable pfns at all */
>   for (i = mag->size - 1; mag->pfns[i] > limit_pfn; i--)
> @@ -799,7 +803,7 @@ static unsigned long iova_magazine_pop(struct 
> iova_magazine *mag,
>  
>  static void iova_magazine_push(struct iova_magazine *mag, unsigned long pfn)
>  {
> - BUG_ON(iova_magazine_full(mag));
> + BUG_ON(!iova_magazine_has_space(mag));
>  
>   mag->pfns[mag->size++] = pfn;
>  }
> @@ -845,9 +849,9 @@ static bool __iova_rcache_insert(struct iova_domain 
> *iovad,
>   cpu_rcache = raw_cpu_ptr(rcache->cpu_rcaches);
>   spin_lock_irqsave(&cpu_rcache->lock, flags);
>  
> - if (!iova_magazine_full(cpu_rcache->loaded)) {
> + if (iova_magazine_has_space(cpu_rcache->loaded)) {
>   can_insert = true;
> - } else if (!iova_magazine_full(cpu_rcache->prev)) {
> + } else if (iova_magazine_has_space(cpu_rcache->prev)) {
>   swap(cpu_rcache->prev, cpu_rcache->loaded);
>   can_insert = true;
>   } else {
> @@ -856,8 +860,9 @@ static bool __iova_rcache_insert(struct iova_domain 
> *iovad,
>   if (new_mag) {
>   spin_lock(&rcache->lock);
>   if (rcache->depot_size < MAX_GLOBAL_MAGS) {
> - rcache->depot[rcache->depot_size++] =
> - cpu_rcache->loaded;
> + if (cpu_rcache->loaded)
> + rcache->depot[rcache->depot_size++] =
> + cpu_rcache->loaded;
>   } else {
>   mag_to_free = cpu_rcache->loaded;
>   }
> @@ -908,9 +913,9 @@ static unsigned long __iova_rcache_get(struct iova_rcache 
> *rcache,
>   cpu_rcache = raw_cpu_ptr(rcache->cpu_rcaches);
>   spin_lock_irqsave(&cpu_rcache->lock, flags);
>  
> - if (!iova_magazine_empty(cpu_rcache->loaded)) {
> + if (iova_magazine_has_pfns(cpu_rcache->loaded)) {
>   has_pfn = true;
> - } else 
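The NULL hazard described in the quoted commit message can be shown
standalone (a toy demo, not kernel code):

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define IOVA_MAG_SIZE 128

struct iova_magazine { unsigned long size; };

/* old helper: !full(NULL) is true, so NULL looks insertable */
static bool iova_magazine_full(struct iova_magazine *mag)
{
	return (mag && mag->size == IOVA_MAG_SIZE);
}

/* new helper: positive test, NULL correctly has no space */
static bool iova_magazine_has_space(struct iova_magazine *mag)
{
	if (!mag)
		return false;
	return mag->size < IOVA_MAG_SIZE;
}

int main(void)
{
	printf("!full(NULL)     = %d\n", !iova_magazine_full(NULL));
	printf("has_space(NULL) = %d\n", iova_magazine_has_space(NULL));
	return 0;
}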

Re: [PATCH v4 1/3] iommu/iova: Add free_all_cpu_cached_iovas()

2021-01-15 Thread Jean-Philippe Brucker
On Thu, Dec 10, 2020 at 02:23:07AM +0800, John Garry wrote:
> Add a helper function to free the CPU rcache for all online CPUs.
> 
> There also exists a function of the same name in
> drivers/iommu/intel/iommu.c, but the parameters are different, and there
> should be no conflict.
> 
> Signed-off-by: John Garry 
> Tested-by: Xiang Chen 
> Reviewed-by: Zhen Lei 

Reviewed-by: Jean-Philippe Brucker 

(unless we find a better solution for patch 3)

> ---
>  drivers/iommu/iova.c | 13 +
>  1 file changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> index f9c35852018d..cf1aacda2fe4 100644
> --- a/drivers/iommu/iova.c
> +++ b/drivers/iommu/iova.c
> @@ -238,6 +238,14 @@ static int __alloc_and_insert_iova_range(struct 
> iova_domain *iovad,
>   return -ENOMEM;
>  }
>  
> +static void free_all_cpu_cached_iovas(struct iova_domain *iovad)
> +{
> + unsigned int cpu;
> +
> + for_each_online_cpu(cpu)
> + free_cpu_cached_iovas(cpu, iovad);
> +}
> +
>  static struct kmem_cache *iova_cache;
>  static unsigned int iova_cache_users;
>  static DEFINE_MUTEX(iova_cache_mutex);
> @@ -435,15 +443,12 @@ alloc_iova_fast(struct iova_domain *iovad, unsigned 
> long size,
>  retry:
>   new_iova = alloc_iova(iovad, size, limit_pfn, true);
>   if (!new_iova) {
> - unsigned int cpu;
> -
>   if (!flush_rcache)
>   return 0;
>  
>   /* Try replenishing IOVAs by flushing rcache. */
>   flush_rcache = false;
> - for_each_online_cpu(cpu)
> - free_cpu_cached_iovas(cpu, iovad);
> + free_all_cpu_cached_iovas(iovad);
>   free_global_cached_iovas(iovad);
>   goto retry;
>   }
> -- 
> 2.26.2
> 


Re: Consult on ARM SMMU debugfs

2021-01-15 Thread Robin Murphy

On 2021-01-15 15:14, Russell King - ARM Linux admin wrote:

On Mon, Jan 11, 2021 at 08:01:48PM +, Robin Murphy wrote:

On 2021-01-07 02:45, chenxiang (M) wrote:

Hi Will, Robin or other guys,

When debugging an SMMU/SVA issue on a huawei ARM64 board, we find that it
lacks enough debugfs support for the ARM SMMU driver (such as
the value of STE/CD, which we need to check sometimes). Currently it
creates a top-level iommu directory in debugfs, but there is no debugfs
for the ARM SMMU driver specifically. Do you know whether ARM has plans
to do that soon?


FWIW I don't think I've ever felt the need to inspect the Stream
Table on a live system. So far the nature of the STE code has been simple
enough that it's very hard for any given STE to be *wrong* - either it's set
up as expected and thus works fine, or it's not initialised at all and you
get C_BAD_STE, where 99% of the time you then just cross-reference the
Stream ID against the firmware and find that the DT/IORT is wrong.

Similarly I don't think I've even *seen* an issue that could be
attributed to a context descriptor, although I appreciate that as we start
landing more PASID and SVA support the scope for that starts to widen
considerably.

Feel free to propose a patch if you believe it would be genuinely useful and
won't just bit-rot into a maintenance burden, but it's not something that's
on our roadmap here.


I do think that the IOMMU stuff needs better debugging. I've hit the
WARN_ON() in __arm_lpae_map(), and it's been pretty much undebuggable,
so I've resorted to putting the IOMMU into bypass mode permanently to
work around the issue.

The reason that it's undebuggable is if one puts printk() or trace
statements in the code, boots the platform, you get flooded with those
debugging messages, because every access to the rootfs generates and
tears down a mapping.

It would be nice to be able to inspect the IOMMU page tables and state
of the IOMMU, rather than having to resort to effectively disabling
the IOMMU.


Certainly once we get to stuff like unpinned VFIO, having the ability to 
inspect pagetables for arbitrary IOMMU API usage will indeed be useful. 
From the DMA mapping perspective, though, unless you're working on the 
io-pgtable code itself it's not really going to tell you much that 
dumping the mappings from dma-debug can't already.


FWIW whenever I encounter that particular warning in iommu-dma context, 
I don't care where the existing mapping is pointing, since it's merely a 
symptom of the damage already having been done. At that point I'd 
usually go off and audit all the DMA API calls in the offending driver, 
since it's typically caused by corruption in the IOVA allocator from 
passing the wrong size in a dma_unmap_*() call, and those can often be 
spotted by inspection. For active debugging, what you really want to 
know is the *history* of operations around that IOVA, since you're 
primarily interested in the request that last mapped it, then the 
corresponding unmap request for nominally the same buffer (which allowed 
the IOVA region to be freed for reuse) that for some reason didn't cover 
one or more pages that it should have. The IOMMU API tracepoints can be 
a handy tool there.


Robin.

Re: Consult on ARM SMMU debugfs

2021-01-15 Thread Russell King - ARM Linux admin
On Mon, Jan 11, 2021 at 08:01:48PM +, Robin Murphy wrote:
> On 2021-01-07 02:45, chenxiang (M) wrote:
> > Hi Will, Robin or other guys,
> > 
> > When debugging an SMMU/SVA issue on a huawei ARM64 board, we find that it
> > lacks enough debugfs support for the ARM SMMU driver (such as
> > the value of STE/CD, which we need to check sometimes). Currently it
> > creates a top-level iommu directory in debugfs, but there is no debugfs
> > for the ARM SMMU driver specifically. Do you know whether ARM has plans
> > to do that soon?
> 
> FWIW I don't think I've ever felt the need to inspect the Stream
> Table on a live system. So far the nature of the STE code has been simple
> enough that it's very hard for any given STE to be *wrong* - either it's set
> up as expected and thus works fine, or it's not initialised at all and you
> get C_BAD_STE, where 99% of the time you then just cross-reference the
> Stream ID against the firmware and find that the DT/IORT is wrong.
> 
> Similarly I don't think I've even *seen* an issue that could be
> attributed to a context descriptor, although I appreciate that as we start
> landing more PASID and SVA support the scope for that starts to widen
> considerably.
> 
> Feel free to propose a patch if you believe it would be genuinely useful and
> won't just bit-rot into a maintenance burden, but it's not something that's
> on our roadmap here.

I do think that the IOMMU stuff needs better debugging. I've hit the
WARN_ON() in __arm_lpae_map(), and it's been pretty much undebuggable,
so I've resorted to putting the IOMMU into bypass mode permanently to
work around the issue.

The reason that it's undebuggable is if one puts printk() or trace
statements in the code, boots the platform, you get flooded with those
debugging messages, because every access to the rootfs generates and
tears down a mapping.

It would be nice to be able to inspect the IOMMU page tables and state
of the IOMMU, rather than having to resort to effectively disabling
the IOMMU.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [PATCH] iommu: check for the deferred attach when attaching a device

2021-01-15 Thread Robin Murphy

On 2021-01-15 14:26, lijiang wrote:

Hi, Robin

Thank you for the comment.

On 2021-01-13 01:29, Robin Murphy wrote:

On 2021-01-05 07:52, lijiang wrote:

On 2021-01-05 11:55, lijiang wrote:

Hi,

Also add Joerg to cc list.



Also add more people to cc list, Jerry Snitselaar and Tom Lendacky.

Thanks.


Thanks.
Lianbo
On 2020-12-26 13:39, Lianbo Jiang wrote:

Currently, domain attach is allowed to be deferred from the iommu
driver to the device driver, and when the iommu initializes, the devices
on the bus are scanned and the default groups allocated.

Due to the above changes, some devices could be added to the same
group as below:

[    3.859417] pci :01:00.0: Adding to iommu group 16
[    3.864572] pci :01:00.1: Adding to iommu group 16
[    3.869738] pci :02:00.0: Adding to iommu group 17
[    3.874892] pci :02:00.1: Adding to iommu group 17

But when attaching these devices, a group is not allowed to have more
than one device, otherwise an error is returned. This conflicts with
the deferred attaching. Unfortunately, on my side there are two devices
in the same group, for example:

[    9.627014] iommu_group_device_count(): device name[0]::01:00.0
[    9.633545] iommu_group_device_count(): device name[1]::01:00.1
...
[   10.255609] iommu_group_device_count(): device name[0]::02:00.0
[   10.262144] iommu_group_device_count(): device name[1]::02:00.1

Finally, this caused the tg3 driver to fail when it calls
dma_alloc_coherent() to allocate coherent memory in tg3_test_dma().

[    9.660310] tg3 :01:00.0: DMA engine test failed, aborting
[    9.754085] tg3: probe of :01:00.0 failed with error -12
[    9.997512] tg3 :01:00.1: DMA engine test failed, aborting
[   10.043053] tg3: probe of :01:00.1 failed with error -12
[   10.288905] tg3 :02:00.0: DMA engine test failed, aborting
[   10.334070] tg3: probe of :02:00.0 failed with error -12
[   10.578303] tg3 :02:00.1: DMA engine test failed, aborting
[   10.622629] tg3: probe of :02:00.1 failed with error -12

In addition, the similar situations also occur in other drivers such
as the bnxt_en driver. That can be reproduced easily in kdump kernel
when SME is active.

Add a check for the deferred attach in the iommu_attach_device() and
allow to attach the deferred device regardless of how many devices
are in a group.


Is this iommu_attach_device() call coming from iommu-dma? (if not, then
whoever's calling it probably shouldn't be)



Yes, you are right, the iommu_attach_device call is coming from iommu-dma.
  

Assuming so, then probably what should happen is to move the handling currently 
in iommu_dma_deferred_attach() into the core so that it can call 
__iommu_attach_device() directly - the intent is just to replay that exact call 
skipped in iommu_group_add_device(), so the legacy external 
iommu_attach_device() interface isn't really the right tool for the job


Sounds good. I will check if this can work in various cases. If it's OK, I will 
post again.

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index f0305e6aac1b..5e7da902ac36 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -23,7 +23,6 @@
  #include 
  #include 
  #include 
-#include 
  #include 
  
  struct iommu_dma_msi_page {

@@ -378,21 +377,6 @@ static int iommu_dma_init_domain(struct iommu_domain 
*domain, dma_addr_t base,
return iova_reserve_iommu_regions(dev, domain);
  }
  
-static int iommu_dma_deferred_attach(struct device *dev,
-   struct iommu_domain *domain)
-{
-   const struct iommu_ops *ops = domain->ops;
-
-   if (!is_kdump_kernel())
-   return 0;
-
-   if (unlikely(ops->is_attach_deferred &&
-   ops->is_attach_deferred(domain, dev)))
-   return iommu_attach_device(domain, dev);
-
-   return 0;
-}
-
  /**
   * dma_info_to_prot - Translate DMA API directions and attributes to IOMMU API
   *page flags.
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index ffeebda8d6de..4fed1567b498 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -23,6 +23,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  
  static struct kset *iommu_group_kset;

@@ -1952,6 +1953,21 @@ static int __iommu_attach_device(struct iommu_domain 
*domain,
return ret;
  }
  
+int iommu_dma_deferred_attach(struct device *dev,
+			      struct iommu_domain *domain)
+{
+	const struct iommu_ops *ops = domain->ops;
+
+	if (!is_kdump_kernel())
+		return 0;
+
+	if (unlikely(ops->is_attach_deferred &&
+		     ops->is_attach_deferred(domain, dev)))
+		return __iommu_attach_device(domain, dev);
+
+	return 0;
+}
+
  int iommu_attach_device(struct iommu_domain *domain, struct device *dev)
  {
struct iommu_group *group;
diff --git a/include/linux/iommu.h 

Re: [PATCH] iommu: check for the deferred attach when attaching a device

2021-01-15 Thread lijiang
Hi, Robin

Thank you for the comment.

On 2021-01-13 01:29, Robin Murphy wrote:
> On 2021-01-05 07:52, lijiang wrote:
>> On 2021-01-05 11:55, lijiang wrote:
>>> Hi,
>>>
>>> Also add Joerg to cc list.
>>>
>>
>> Also add more people to cc list, Jerry Snitselaar and Tom Lendacky.
>>
>> Thanks.
>>
>>> Thanks.
>>> Lianbo
>>> On 2020-12-26 13:39, Lianbo Jiang wrote:
 Currently, domain attach is allowed to be deferred from the iommu
 driver to the device driver, and when the iommu initializes, the devices
 on the bus are scanned and the default groups allocated.

 Due to the above changes, some devices could be added to the same
 group as below:

 [    3.859417] pci 0000:01:00.0: Adding to iommu group 16
 [    3.864572] pci 0000:01:00.1: Adding to iommu group 16
 [    3.869738] pci 0000:02:00.0: Adding to iommu group 17
 [    3.874892] pci 0000:02:00.1: Adding to iommu group 17

 But when attaching these devices, a group is not allowed to have more
 than one device; otherwise an error is returned. This conflicts with
 the deferred attach. Unfortunately, on my machine there are two devices
 in the same group, for example:

 [    9.627014] iommu_group_device_count(): device name[0]:0000:01:00.0
 [    9.633545] iommu_group_device_count(): device name[1]:0000:01:00.1
 ...
 [   10.255609] iommu_group_device_count(): device name[0]:0000:02:00.0
 [   10.262144] iommu_group_device_count(): device name[1]:0000:02:00.1

 Finally, this caused the tg3 driver to fail when it calls
 dma_alloc_coherent() to allocate coherent memory in tg3_test_dma().

 [    9.660310] tg3 0000:01:00.0: DMA engine test failed, aborting
 [    9.754085] tg3: probe of 0000:01:00.0 failed with error -12
 [    9.997512] tg3 0000:01:00.1: DMA engine test failed, aborting
 [   10.043053] tg3: probe of 0000:01:00.1 failed with error -12
 [   10.288905] tg3 0000:02:00.0: DMA engine test failed, aborting
 [   10.334070] tg3: probe of 0000:02:00.0 failed with error -12
 [   10.578303] tg3 0000:02:00.1: DMA engine test failed, aborting
 [   10.622629] tg3: probe of 0000:02:00.1 failed with error -12

 In addition, similar situations also occur in other drivers such
 as the bnxt_en driver. This can be reproduced easily in a kdump kernel
 when SME is active.

 Add a check for deferred attach in iommu_attach_device() and allow
 attaching the deferred device regardless of how many devices are in
 the group.
> 
> Is this iommu_attach_device() call coming from iommu-dma? (if not, then 
> whoever's calling it probably shouldn't be)
> 

Yes, you are right, the iommu_attach_device() call is coming from iommu-dma.

> Assuming so, then probably what should happen is to move the handling 
> currently in iommu_dma_deferred_attach() into the core so that it can call 
> __iommu_attach_device() directly - the intent is just to replay that exact 
> call skipped in iommu_group_add_device(), so the legacy external 
> iommu_attach_device() interface isn't really the right tool for the job.

Sounds good. I will check if this can work in various cases. If it's OK, I will 
post again.

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index f0305e6aac1b..5e7da902ac36 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -23,7 +23,6 @@
 #include 
 #include 
 #include 
-#include <linux/crash_dump.h>
 #include 
 
 struct iommu_dma_msi_page {
@@ -378,21 +377,6 @@ static int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
return iova_reserve_iommu_regions(dev, domain);
 }
 
-static int iommu_dma_deferred_attach(struct device *dev,
-   struct iommu_domain *domain)
-{
-   const struct iommu_ops *ops = domain->ops;
-
-   if (!is_kdump_kernel())
-   return 0;
-
-   if (unlikely(ops->is_attach_deferred &&
-   ops->is_attach_deferred(domain, dev)))
-   return iommu_attach_device(domain, dev);
-
-   return 0;
-}
-
 /**
  * dma_info_to_prot - Translate DMA API directions and attributes to IOMMU API
  *page flags.
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index ffeebda8d6de..4fed1567b498 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include <linux/crash_dump.h>
 #include 
 
 static struct kset *iommu_group_kset;
@@ -1952,6 +1953,21 @@ static int __iommu_attach_device(struct iommu_domain *domain,
return ret;
 }
 
+int iommu_dma_deferred_attach(struct device *dev,
+struct iommu_domain *domain)
+{
+const struct iommu_ops *ops = domain->ops;
+
+if (!is_kdump_kernel())
+return 0;
+
+if (unlikely(ops->is_attach_deferred &&
+ops->is_attach_deferred(domain, dev)))
+return __iommu_attach_device(domain, dev);
+
+return 0;
+}
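
For reference, the dma-iommu call sites would not need to change shape; they
already follow the pattern below (a sketch of the existing callers, not part
of the posted diff):

	/* e.g. in __iommu_dma_map(), before touching the domain */
	if (unlikely(iommu_dma_deferred_attach(dev, domain)))
		return DMA_MAPPING_ERROR;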

RE: [EXTERNAL] Re: amdgpu error whenever IOMMU is enabled

2021-01-15 Thread Merger, Edgar [AUTOSOL/MAS/AUGS]
Hi Joerg,

Thanks. Hope you are doing well now.

Edgar

-Original Message-
From: jroe...@suse.de  
Sent: Freitag, 15. Januar 2021 09:18
To: Merger, Edgar [AUTOSOL/MAS/AUGS] 
Cc: iommu@lists.linux-foundation.org
Subject: Re: [EXTERNAL] Re: amdgpu error whenever IOMMU is enabled

Hi Edgar,

On Mon, Nov 23, 2020 at 06:41:18AM +0000, Merger, Edgar [AUTOSOL/MAS/AUGS] 
wrote:
> Just wanted to follow-up on that topic.
> Is that quirk already put into upstream kernel?

Sorry for the late reply, I had to take an extended sick leave. I will take 
care of sending this fix upstream next week.

Regards,

Joerg



Re: [PATCH v3 5/6] media: uvcvideo: Use dma_alloc_noncontiguos API

2021-01-15 Thread Ricardo Ribalda
On Mon, Jan 11, 2021 at 9:36 AM Christoph Hellwig wrote:
>
> On Thu, Jan 07, 2021 at 03:14:08PM +0100, Ricardo Ribalda wrote:
> > > So I think we do want these branches for coherent vs non-coherent as they
> > > have very different semantics and I do not think that hiding them under
> > > the same API helps people to understand those vastly different semantics.
> > >
> > > OTOH we should look into a fallback for DMA API instances that do not
> > > support the discontigous allocations.
> > >
> > > I think between your comments and those from Ricardo I have a good idea
> > > for a somewhat updated API.  I'll try to post something in the next days.
> >
> > Did you have time to look into this?
> >
> > No hurry, I just want to make sure that I didn't miss anything ;)
>
> Haven't managed to get to it, sorry.

No worries! Is there something we can do to help you with this?

>
> >
> > Best regards!
> >
> >
> >
> > --
> > Ricardo Ribalda
> ---end quoted text---



-- 
Ricardo Ribalda


[PATCH RFC v1 15/15] iommu/virtio: Update fault type and reason info for viommu fault

2021-01-15 Thread Vivek Gautam
Fault type information can indicate whether a fault is a page request
fault or an unrecoverable fault; additional fault reasons and the
related PASID information can help in handling faults efficiently.
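
The handler below still carries a TODO for routing the fault to the affected
endpoint. As a rough illustration of where this is heading (assuming the
endpoint ID can be resolved to a struct device; the names come from the
existing fault reporting UAPI in include/uapi/linux/iommu.h):

	struct iommu_fault_event evt = {
		.fault = {
			.type = IOMMU_FAULT_DMA_UNRECOV,
			.event = {
				.reason = IOMMU_FAULT_REASON_WALK_EABT,
				.flags  = IOMMU_FAULT_UNRECOV_ADDR_VALID,
				.addr   = address,
			},
		},
	};

	/* looking up 'dev' from 'endpoint' is the piece the TODO refers to */
	ret = iommu_report_device_fault(dev, &evt);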

Signed-off-by: Vivek Gautam 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Michael S. Tsirkin 
Cc: Robin Murphy 
Cc: Jean-Philippe Brucker 
Cc: Eric Auger 
Cc: Alex Williamson 
Cc: Kevin Tian 
Cc: Jacob Pan 
Cc: Liu Yi L 
Cc: Lorenzo Pieralisi 
Cc: Shameerali Kolothum Thodi 
---
 drivers/iommu/virtio-iommu.c  | 27 +--
 include/uapi/linux/virtio_iommu.h | 13 -
 2 files changed, 37 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 9cc3d35125e9..10ef9e98214a 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -652,9 +652,16 @@ static int viommu_fault_handler(struct viommu_dev *viommu,
char *reason_str;
 
u8 reason   = fault->reason;
+   u16 type= fault->flt_type;
u32 flags   = le32_to_cpu(fault->flags);
u32 endpoint= le32_to_cpu(fault->endpoint);
u64 address = le64_to_cpu(fault->address);
+   u32 pasid   = le32_to_cpu(fault->pasid);
+
+   if (type == VIRTIO_IOMMU_FAULT_F_PAGE_REQ) {
+   dev_info(viommu->dev, "Page request fault - unhandled\n");
+   return 0;
+   }
 
switch (reason) {
case VIRTIO_IOMMU_FAULT_R_DOMAIN:
@@ -663,6 +670,21 @@ static int viommu_fault_handler(struct viommu_dev *viommu,
case VIRTIO_IOMMU_FAULT_R_MAPPING:
reason_str = "page";
break;
+   case VIRTIO_IOMMU_FAULT_R_WALK_EABT:
+   reason_str = "page walk external abort";
+   break;
+   case VIRTIO_IOMMU_FAULT_R_PTE_FETCH:
+   reason_str = "pte fetch";
+   break;
+   case VIRTIO_IOMMU_FAULT_R_PERMISSION:
+   reason_str = "permission";
+   break;
+   case VIRTIO_IOMMU_FAULT_R_ACCESS:
+   reason_str = "access";
+   break;
+   case VIRTIO_IOMMU_FAULT_R_OOR_ADDRESS:
+   reason_str = "output address";
+   break;
case VIRTIO_IOMMU_FAULT_R_UNKNOWN:
default:
reason_str = "unknown";
@@ -671,8 +693,9 @@ static int viommu_fault_handler(struct viommu_dev *viommu,
 
/* TODO: find EP by ID and report_iommu_fault */
if (flags & VIRTIO_IOMMU_FAULT_F_ADDRESS)
-   dev_err_ratelimited(viommu->dev, "%s fault from EP %u at %#llx [%s%s%s]\n",
-   reason_str, endpoint, address,
+   dev_err_ratelimited(viommu->dev,
+   "%s fault from EP %u PASID %u at %#llx [%s%s%s]\n",
+   reason_str, endpoint, pasid, address,
    flags & VIRTIO_IOMMU_FAULT_F_READ ? "R" : "",
    flags & VIRTIO_IOMMU_FAULT_F_WRITE ? "W" : "",
    flags & VIRTIO_IOMMU_FAULT_F_EXEC ? "X" : "");
diff --git a/include/uapi/linux/virtio_iommu.h 
b/include/uapi/linux/virtio_iommu.h
index 608c8d642e1f..a537d82777f7 100644
--- a/include/uapi/linux/virtio_iommu.h
+++ b/include/uapi/linux/virtio_iommu.h
@@ -290,19 +290,30 @@ struct virtio_iommu_req_invalidate {
 #define VIRTIO_IOMMU_FAULT_R_UNKNOWN   0
 #define VIRTIO_IOMMU_FAULT_R_DOMAIN1
 #define VIRTIO_IOMMU_FAULT_R_MAPPING   2
+#define VIRTIO_IOMMU_FAULT_R_WALK_EABT 3
+#define VIRTIO_IOMMU_FAULT_R_PTE_FETCH 4
+#define VIRTIO_IOMMU_FAULT_R_PERMISSION5
+#define VIRTIO_IOMMU_FAULT_R_ACCESS6
+#define VIRTIO_IOMMU_FAULT_R_OOR_ADDRESS   7
 
 #define VIRTIO_IOMMU_FAULT_F_READ  (1 << 0)
 #define VIRTIO_IOMMU_FAULT_F_WRITE (1 << 1)
 #define VIRTIO_IOMMU_FAULT_F_EXEC  (1 << 2)
 #define VIRTIO_IOMMU_FAULT_F_ADDRESS   (1 << 8)
 
+#define VIRTIO_IOMMU_FAULT_F_DMA_UNRECOV   1
+#define VIRTIO_IOMMU_FAULT_F_PAGE_REQ  2
+
 struct virtio_iommu_fault {
    __u8    reason;
-   __u8    reserved[3];
+   __le16  flt_type;
+   __u8    reserved;
    __le32  flags;
    __le32  endpoint;
    __u8    reserved2[4];
    __le64  address;
+   __le32  pasid;
+   __u8    reserved3[4];
 };
 
 #endif
-- 
2.17.1



[PATCH RFC v1 14/15] iommu/virtio: Add support for Arm LPAE page table format

2021-01-15 Thread Vivek Gautam
From: Jean-Philippe Brucker 

When PASID isn't supported, we can still register one set of tables.
Add support to register Arm LPAE based page table.
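
To illustrate the VIRTIO_FIELD_PREP() helper introduced below: assuming the
VIRTIO_IOMMU_PGTF_ARM_* masks and shifts mirror the architectural TCR layout
(e.g. T0SZ_SHIFT == 0, T0SZ_MASK == 0x3f), packing the region size for a
48-bit input address space works out as:

	/* tcr->tsz = 64 - 48 = 16 for a 48-bit IOVA space */
	__tcr = VIRTIO_FIELD_PREP(T0SZ_MASK, T0SZ_SHIFT, tcr->tsz);
	/* expands to ((16 << 0) & (0x3f << 0)) == 0x10 */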

Signed-off-by: Jean-Philippe Brucker 
[Vivek: Clean-ups to add the right tcr definitions and accommodate
the parent patches]
Signed-off-by: Vivek Gautam 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Michael S. Tsirkin 
Cc: Robin Murphy 
Cc: Jean-Philippe Brucker 
Cc: Eric Auger 
Cc: Alex Williamson 
Cc: Kevin Tian 
Cc: Jacob Pan 
Cc: Liu Yi L 
Cc: Lorenzo Pieralisi 
Cc: Shameerali Kolothum Thodi 
---
 drivers/iommu/virtio-iommu.c  | 131 +-
 include/uapi/linux/virtio_iommu.h |  30 +++
 2 files changed, 139 insertions(+), 22 deletions(-)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index b5222da1dc74..9cc3d35125e9 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -135,6 +135,13 @@ struct viommu_event {
 #define to_viommu_domain(domain)   \
container_of(domain, struct viommu_domain, domain)
 
+#define VIRTIO_FIELD_PREP(_mask, _shift, _val) \
+   ({  \
+   (((_val) << VIRTIO_IOMMU_PGTF_ARM_ ## _shift) & \
+(VIRTIO_IOMMU_PGTF_ARM_ ## _mask <<\
+ VIRTIO_IOMMU_PGTF_ARM_ ## _shift));   \
+   })
+
 static int viommu_get_req_errno(void *buf, size_t len)
 {
struct virtio_iommu_req_tail *tail = buf + len - sizeof(*tail);
@@ -897,6 +904,76 @@ static int viommu_simple_attach(struct viommu_domain 
*vdomain,
return ret;
 }
 
+static int viommu_config_arm_pgt(struct viommu_endpoint *vdev,
+struct io_pgtable_cfg *cfg,
+struct virtio_iommu_req_attach_pgt_arm *req,
+u64 *asid)
+{
+   int id;
+   struct virtio_iommu_probe_table_format *pgtf = (void *)vdev->pgtf;
+   typeof(&cfg->arm_lpae_s1_cfg.tcr) tcr = &cfg->arm_lpae_s1_cfg.tcr;
+   u64 __tcr;
+
+   if (pgtf->asid_bits != 8 && pgtf->asid_bits != 16)
+   return -EINVAL;
+
+   id = ida_simple_get(&asid_ida, 1, 1 << pgtf->asid_bits, GFP_KERNEL);
+   if (id < 0)
+   return -ENOMEM;
+
+   __tcr = VIRTIO_FIELD_PREP(T0SZ_MASK, T0SZ_SHIFT, tcr->tsz) |
+   VIRTIO_FIELD_PREP(IRGN0_MASK, IRGN0_SHIFT, tcr->irgn) |
+   VIRTIO_FIELD_PREP(ORGN0_MASK, ORGN0_SHIFT, tcr->orgn) |
+   VIRTIO_FIELD_PREP(SH0_MASK, SH0_SHIFT, tcr->sh) |
+   VIRTIO_FIELD_PREP(TG0_MASK, TG0_SHIFT, tcr->tg) |
+   VIRTIO_IOMMU_PGTF_ARM_EPD1 | VIRTIO_IOMMU_PGTF_ARM_HPD0 |
+   VIRTIO_IOMMU_PGTF_ARM_HPD1;
+
+   req->format = cpu_to_le16(VIRTIO_IOMMU_FORMAT_PGTF_ARM_LPAE);
+   req->ttbr   = cpu_to_le64(cfg->arm_lpae_s1_cfg.ttbr);
+   req->tcr= cpu_to_le64(__tcr);
+   req->mair   = cpu_to_le64(cfg->arm_lpae_s1_cfg.mair);
+   req->asid   = cpu_to_le16(id);
+
+   *asid = id;
+   return 0;
+}
+
+static int viommu_attach_pgtable(struct viommu_endpoint *vdev,
+struct viommu_domain *vdomain,
+enum io_pgtable_fmt fmt,
+struct io_pgtable_cfg *cfg,
+u64 *asid)
+{
+   int ret;
+   int i, eid;
+
+   struct virtio_iommu_req_attach_table req = {
+   .head.type  = VIRTIO_IOMMU_T_ATTACH_TABLE,
+   .domain = cpu_to_le32(vdomain->id),
+   };
+
+   switch (fmt) {
+   case ARM_64_LPAE_S1:
+   ret = viommu_config_arm_pgt(vdev, cfg, (void *)&req, &asid);
+   if (ret)
+   return ret;
+   break;
+   default:
+   WARN_ON(1);
+   return -EINVAL;
+   }
+
+   vdev_for_each_id(i, eid, vdev) {
+   req.endpoint = cpu_to_le32(eid);
+   ret = viommu_send_req_sync(vdomain->viommu, &req, sizeof(req));
+   if (ret)
+   return ret;
+   }
+
+   return 0;
+}
+
 static int viommu_teardown_pgtable(struct viommu_domain *vdomain)
 {
struct iommu_vendor_psdtable_cfg *pst_cfg;
@@ -972,32 +1049,42 @@ static int viommu_setup_pgtable(struct viommu_endpoint 
*vdev,
if (!ops)
return -ENOMEM;
 
-   pst_cfg = &tbl->cfg;
-   cfgi = &pst_cfg->vendor.cfg;
-   id = ida_simple_get(&asid_ida, 1, 1 << desc->asid_bits, GFP_KERNEL);
-   if (id < 0) {
-   ret = id;
-   goto err_free_pgtable;
-   }
+   if (!tbl) {
+   /* No PASID support, send attach_table */
+   ret = viommu_attach_pgtable(vdev, vdomain, fmt, &cfg,
+   &vdomain->mm.archid);
+   if (ret)
+   goto err_free_pgtable;
+   } else {
+   pst_cfg = &tbl->cfg;
+   

[PATCH RFC v1 13/15] iommu/virtio: Attach Arm PASID tables when available

2021-01-15 Thread Vivek Gautam
From: Jean-Philippe Brucker 

When the ARM PASID table format is reported in a probe, send an attach
request and install the page tables for iommu_map/iommu_unmap use.
Architecture-specific components are already abstracted to libraries. We
just need to pass config bits around and setup an alternative mechanism to
the mapping tree.

We reuse the convention already adopted by other IOMMU architectures (ARM
SMMU and AMD IOMMU), that entry 0 in the PASID table is reserved for
non-PASID traffic. Bind the PASID table, and setup entry 0 to be modified
with iommu_map/unmap.
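
In terms of the helpers from patch 02, the convention boils down to the
following sketch (using the iommu_psdtable_write() wrapper, with 'cd'
standing for the default context descriptor):

	/* attach: install the non-PASID context at entry 0 */
	ret = iommu_psdtable_write(tbl, &tbl->cfg, 0, &cd);

	/* detach: clear entry 0 again */
	iommu_psdtable_write(tbl, &tbl->cfg, 0, NULL);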

Signed-off-by: Jean-Philippe Brucker 
[Vivek: A bunch of refactoring and clean-ups to use iommu-pasid-table APIs,
creating iommu_pasid_table, and configuring based on the reported
pasid format. A couple of additional methods have also been created
to configure vendor-specific pasid configuration]
Signed-off-by: Vivek Gautam 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Michael S. Tsirkin 
Cc: Robin Murphy 
Cc: Jean-Philippe Brucker 
Cc: Eric Auger 
Cc: Alex Williamson 
Cc: Kevin Tian 
Cc: Jacob Pan 
Cc: Liu Yi L 
Cc: Lorenzo Pieralisi 
Cc: Shameerali Kolothum Thodi 
---
 drivers/iommu/virtio-iommu.c | 314 +++
 1 file changed, 314 insertions(+)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 004ea94e3731..b5222da1dc74 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -25,6 +25,7 @@
 #include 
 
 #include 
+#include "iommu-pasid-table.h"
 
#define MSI_IOVA_BASE  0x8000000
#define MSI_IOVA_LENGTH 0x100000
@@ -33,6 +34,9 @@
 #define VIOMMU_EVENT_VQ1
 #define VIOMMU_NR_VQS  2
 
+/* Some architectures need an Address Space ID for each page table */
+static DEFINE_IDA(asid_ida);
+
 struct viommu_dev {
struct iommu_device iommu;
struct device   *dev;
@@ -55,6 +59,7 @@ struct viommu_dev {
u32 probe_size;
 
    bool has_map:1;
+   bool has_table:1;
 };
 
 struct viommu_mapping {
@@ -76,6 +81,7 @@ struct viommu_domain {
    struct mutex mutex; /* protects viommu pointer */
    unsigned int id;
    u32 map_flags;
+   struct iommu_pasid_table *pasid_tbl;
 
/* Default address space when a table is bound */
    struct viommu_mm mm;
@@ -891,6 +897,285 @@ static int viommu_simple_attach(struct viommu_domain 
*vdomain,
return ret;
 }
 
+static int viommu_teardown_pgtable(struct viommu_domain *vdomain)
+{
+   struct iommu_vendor_psdtable_cfg *pst_cfg;
+   struct arm_smmu_cfg_info *cfgi;
+   u32 asid;
+
+   if (!vdomain->mm.ops)
+   return 0;
+
+   free_io_pgtable_ops(vdomain->mm.ops);
+   vdomain->mm.ops = NULL;
+
+   if (vdomain->pasid_tbl) {
+   pst_cfg = &vdomain->pasid_tbl->cfg;
+   cfgi = &pst_cfg->vendor.cfg;
+   asid = cfgi->s1_cfg->cd.asid;
+
+   iommu_psdtable_write(vdomain->pasid_tbl, pst_cfg, 0, NULL);
+   ida_simple_remove(&asid_ida, asid);
+   }
+
+   return 0;
+}
+
+static int viommu_setup_pgtable(struct viommu_endpoint *vdev,
+   struct viommu_domain *vdomain)
+{
+   int ret, id;
+   u32 asid;
+   enum io_pgtable_fmt fmt;
+   struct io_pgtable_ops *ops = NULL;
+   struct viommu_dev *viommu = vdev->viommu;
+   struct virtio_iommu_probe_table_format *desc = vdev->pgtf;
+   struct iommu_pasid_table *tbl = vdomain->pasid_tbl;
+   struct iommu_vendor_psdtable_cfg *pst_cfg;
+   struct arm_smmu_cfg_info *cfgi;
+   struct io_pgtable_cfg cfg = {
+   .iommu_dev  = viommu->dev->parent,
+   .tlb= &viommu_flush_ops,
+   .pgsize_bitmap  = vdev->pgsize_mask ? vdev->pgsize_mask :
+ vdomain->domain.pgsize_bitmap,
+   .ias= (vdev->input_end ? ilog2(vdev->input_end) :
+  ilog2(vdomain->domain.geometry.aperture_end)) + 1,
+   .oas= vdev->output_bits,
+   };
+
+   if (!desc)
+   return -EINVAL;
+
+   if (!vdev->output_bits)
+   return -ENODEV;
+
+   switch (le16_to_cpu(desc->format)) {
+   case VIRTIO_IOMMU_FORMAT_PGTF_ARM_LPAE:
+   fmt = ARM_64_LPAE_S1;
+   break;
+   default:
+   dev_err(vdev->dev, "unsupported page table format 0x%x\n",
+   le16_to_cpu(desc->format));
+   return -EINVAL;
+   }
+
+   if (vdomain->mm.ops) {
+   /*
+* TODO: attach additional endpoint to the domain. Check that
+* the config is sane.
+*/
+  

[PATCH RFC v1 12/15] iommu/virtio: Add support for INVALIDATE request

2021-01-15 Thread Vivek Gautam
From: Jean-Philippe Brucker 

Add support for tlb invalidation ops that can send invalidation
requests to back-end virtio-iommu when stage-1 page tables are
supported.
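
For the range-based requests, virt_start, granule and nr_pages together
encode the span to invalidate. As a worked example matching the encoding
used in the diff below, invalidating a 2MiB range mapped with 4KiB pages
would be sent as:

	.virt_start = cpu_to_le64(iova),
	.granule    = 12,                  /* ilog2(SZ_4K) */
	.nr_pages   = cpu_to_le64(512),    /* SZ_2M / SZ_4K */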

Signed-off-by: Jean-Philippe Brucker 
[Vivek: Refactoring the iommu_flush_ops, and adding only one pasid sync
op that's needed with current iommu-pasid-table infrastructure.
Also updating uapi defines as required by latest changes]
Signed-off-by: Vivek Gautam 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Michael S. Tsirkin 
Cc: Robin Murphy 
Cc: Jean-Philippe Brucker 
Cc: Eric Auger 
Cc: Alex Williamson 
Cc: Kevin Tian 
Cc: Jacob Pan 
Cc: Liu Yi L 
Cc: Lorenzo Pieralisi 
Cc: Shameerali Kolothum Thodi 
---
 drivers/iommu/virtio-iommu.c | 95 
 1 file changed, 95 insertions(+)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index ae5dfd3f8269..004ea94e3731 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -63,6 +64,8 @@ struct viommu_mapping {
 };
 
 struct viommu_mm {
+   int pasid;
+   u64 archid;
struct io_pgtable_ops   *ops;
    struct viommu_domain *domain;
 };
@@ -692,6 +695,98 @@ static void viommu_event_handler(struct virtqueue *vq)
virtqueue_kick(vq);
 }
 
+/* PASID and pgtable APIs */
+
+static void __viommu_flush_pasid_tlb_all(struct viommu_domain *vdomain,
+int pasid, u64 arch_id, int type)
+{
+   struct virtio_iommu_req_invalidate req = {
+   .head.type  = VIRTIO_IOMMU_T_INVALIDATE,
+   .inv_gran   = cpu_to_le32(VIRTIO_IOMMU_INVAL_G_PASID),
+   .flags  = cpu_to_le32(VIRTIO_IOMMU_INVAL_F_PASID),
+   .inv_type   = cpu_to_le32(type),
+
+   .domain = cpu_to_le32(vdomain->id),
+   .pasid  = cpu_to_le32(pasid),
+   .archid = cpu_to_le64(arch_id),
+   };
+
+   if (viommu_send_req_sync(vdomain->viommu, &req, sizeof(req)))
+   pr_debug("could not send invalidate request\n");
+}
+
+static void viommu_flush_tlb_add(struct iommu_iotlb_gather *gather,
+unsigned long iova, size_t granule,
+void *cookie)
+{
+   struct viommu_mm *viommu_mm = cookie;
+   struct viommu_domain *vdomain = viommu_mm->domain;
+   struct iommu_domain *domain = &vdomain->domain;
+
+   iommu_iotlb_gather_add_page(domain, gather, iova, granule);
+}
+
+static void viommu_flush_tlb_walk(unsigned long iova, size_t size,
+ size_t granule, void *cookie)
+{
+   struct viommu_mm *viommu_mm = cookie;
+   struct viommu_domain *vdomain = viommu_mm->domain;
+   struct virtio_iommu_req_invalidate req = {
+   .head.type  = VIRTIO_IOMMU_T_INVALIDATE,
+   .inv_gran   = cpu_to_le32(VIRTIO_IOMMU_INVAL_G_VA),
+   .inv_type   = cpu_to_le32(VIRTIO_IOMMU_INV_T_IOTLB),
+   .flags  = cpu_to_le32(VIRTIO_IOMMU_INVAL_F_ARCHID),
+
+   .domain = cpu_to_le32(vdomain->id),
+   .pasid  = cpu_to_le32(viommu_mm->pasid),
+   .archid = cpu_to_le64(viommu_mm->archid),
+   .virt_start = cpu_to_le64(iova),
+   .nr_pages   = cpu_to_le64(size / granule),
+   .granule= ilog2(granule),
+   };
+
+   if (viommu_add_req(vdomain->viommu, &req, sizeof(req)))
+   pr_debug("could not add invalidate request\n");
+}
+
+static void viommu_flush_tlb_all(void *cookie)
+{
+   struct viommu_mm *viommu_mm = cookie;
+
+   if (!viommu_mm->archid)
+   return;
+
+   __viommu_flush_pasid_tlb_all(viommu_mm->domain, viommu_mm->pasid,
+viommu_mm->archid,
+VIRTIO_IOMMU_INV_T_IOTLB);
+}
+
+static struct iommu_flush_ops viommu_flush_ops = {
+   .tlb_flush_all  = viommu_flush_tlb_all,
+   .tlb_flush_walk = viommu_flush_tlb_walk,
+   .tlb_add_page   = viommu_flush_tlb_add,
+};
+
+static void viommu_flush_pasid(void *cookie, int pasid, bool leaf)
+{
+   struct viommu_domain *vdomain = cookie;
+   struct virtio_iommu_req_invalidate req = {
+   .head.type  = VIRTIO_IOMMU_T_INVALIDATE,
+   .inv_gran   = cpu_to_le32(VIRTIO_IOMMU_INVAL_G_PASID),
+   .inv_type   = cpu_to_le32(VIRTIO_IOMMU_INV_T_PASID),
+   .flags  = cpu_to_le32(VIRTIO_IOMMU_INVAL_F_PASID),
+
+   .domain = cpu_to_le32(vdomain->id),
+   .pasid  = cpu_to_le32(pasid),
+   };
+
+   if (leaf)
+   req.flags   |= cpu_to_le32(VIRTIO_IOMMU_INVAL_F_LEAF);

[PATCH RFC v1 11/15] iommu/virtio: Add headers for binding pasid table in iommu

2021-01-15 Thread Vivek Gautam
From: Jean-Philippe Brucker 

Add the required UAPI defines for binding pasid tables in virtio-iommu.
This mode allows stage-1 page tables to be handed over to the guest.
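
On the other end, a VMM receiving VIRTIO_IOMMU_T_ATTACH_TABLE with the SMMUv3
descriptor would map the fields onto its stage-1 context configuration,
roughly as sketched here (note that s1contextptr is a guest address that the
VMM must translate before programming the hardware):

	s1_cfg->s1fmt   = req->s1fmt;                /* linear / 4k-L2 / 64k-L2 */
	s1_cfg->s1cdmax = le32_to_cpu(req->s1cdmax); /* log2 of the CD count */
	/* s1contextptr: guest PA of the CD table, translated by the VMM */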

Signed-off-by: Jean-Philippe Brucker 
[Vivek: Refactor to cleanup headers for invalidation]
Signed-off-by: Vivek Gautam 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Michael S. Tsirkin 
Cc: Robin Murphy 
Cc: Jean-Philippe Brucker 
Cc: Eric Auger 
Cc: Alex Williamson 
Cc: Kevin Tian 
Cc: Jacob Pan 
Cc: Liu Yi L 
Cc: Lorenzo Pieralisi 
Cc: Shameerali Kolothum Thodi 
---
 include/uapi/linux/virtio_iommu.h | 68 +++
 1 file changed, 68 insertions(+)

diff --git a/include/uapi/linux/virtio_iommu.h 
b/include/uapi/linux/virtio_iommu.h
index 8a0624bab4b2..3481e4a3dd24 100644
--- a/include/uapi/linux/virtio_iommu.h
+++ b/include/uapi/linux/virtio_iommu.h
@@ -16,6 +16,7 @@
 #define VIRTIO_IOMMU_F_BYPASS  3
 #define VIRTIO_IOMMU_F_PROBE   4
 #define VIRTIO_IOMMU_F_MMIO5
+#define VIRTIO_IOMMU_F_ATTACH_TABLE6
 
 struct virtio_iommu_range_64 {
__le64  start;
@@ -44,6 +45,8 @@ struct virtio_iommu_config {
 #define VIRTIO_IOMMU_T_MAP 0x03
 #define VIRTIO_IOMMU_T_UNMAP   0x04
 #define VIRTIO_IOMMU_T_PROBE   0x05
+#define VIRTIO_IOMMU_T_ATTACH_TABLE0x06
+#define VIRTIO_IOMMU_T_INVALIDATE  0x07
 
 /* Status types */
 #define VIRTIO_IOMMU_S_OK  0x00
@@ -82,6 +85,37 @@ struct virtio_iommu_req_detach {
    struct virtio_iommu_req_tail    tail;
 };
 
+struct virtio_iommu_req_attach_table {
+   struct virtio_iommu_req_head    head;
+   __le32  domain;
+   __le32  endpoint;
+   __le16  format;
+   __u8    reserved[62];
+   struct virtio_iommu_req_tail    tail;
+};
+
+#define VIRTIO_IOMMU_PSTF_ARM_SMMU_V3_LINEAR   0x0
+#define VIRTIO_IOMMU_PSTF_ARM_SMMU_V3_4KL2 0x1
+#define VIRTIO_IOMMU_PSTF_ARM_SMMU_V3_64KL20x2
+
+#define VIRTIO_IOMMU_PSTF_ARM_SMMU_V3_DSS_TERM 0x0
+#define VIRTIO_IOMMU_PSTF_ARM_SMMU_V3_DSS_BYPASS 0x1
+#define VIRTIO_IOMMU_PSTF_ARM_SMMU_V3_DSS_00x2
+
+/* Arm SMMUv3 PASID Table Descriptor */
+struct virtio_iommu_req_attach_pst_arm {
+   struct virtio_iommu_req_head    head;
+   __le32  domain;
+   __le32  endpoint;
+   __le16  format;
+   __u8    s1fmt;
+   __u8    s1dss;
+   __le64  s1contextptr;
+   __le32  s1cdmax;
+   __u8    reserved[48];
+   struct virtio_iommu_req_tail    tail;
+};
+
 #define VIRTIO_IOMMU_MAP_F_READ(1 << 0)
 #define VIRTIO_IOMMU_MAP_F_WRITE   (1 << 1)
 #define VIRTIO_IOMMU_MAP_F_MMIO(1 << 2)
@@ -188,6 +222,40 @@ struct virtio_iommu_req_probe {
 */
 };
 
+#define VIRTIO_IOMMU_INVAL_G_DOMAIN(1 << 0)
+#define VIRTIO_IOMMU_INVAL_G_PASID (1 << 1)
+#define VIRTIO_IOMMU_INVAL_G_VA(1 << 2)
+
+#define VIRTIO_IOMMU_INV_T_IOTLB   (1 << 0)
+#define VIRTIO_IOMMU_INV_T_DEV_IOTLB   (1 << 1)
+#define VIRTIO_IOMMU_INV_T_PASID   (1 << 2)
+
+#define VIRTIO_IOMMU_INVAL_F_PASID (1 << 0)
+#define VIRTIO_IOMMU_INVAL_F_ARCHID(1 << 1)
+#define VIRTIO_IOMMU_INVAL_F_LEAF  (1 << 2)
+
+struct virtio_iommu_req_invalidate {
+   struct virtio_iommu_req_head    head;
+   __le16  inv_gran;
+   __le16  inv_type;
+
+   __le16  flags;
+   __u8    reserved1[2];
+   __le32  domain;
+
+   __le32  pasid;
+   __u8    reserved2[4];
+
+   __le64  archid;
+   __le64  virt_start;
+   __le64  nr_pages;
+
+   /* Page size, in nr of bits (typically 12 for 4k, 21 for 2MB, etc.) */
+   __u8    granule;
+   __u8    reserved3[11];
+   struct virtio_iommu_req_tail    tail;
+};
+
 /* Fault types */
 #define VIRTIO_IOMMU_FAULT_R_UNKNOWN   0
 #define VIRTIO_IOMMU_FAULT_R_DOMAIN1
-- 
2.17.1


[PATCH RFC v1 10/15] iommu/virtio: Prepare to add attach pasid table infrastructure

2021-01-15 Thread Vivek Gautam
In preparation for adding an attach pasid table op, move the existing
attach request code out into a separate method.

Signed-off-by: Vivek Gautam 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Michael S. Tsirkin 
Cc: Robin Murphy 
Cc: Jean-Philippe Brucker 
Cc: Eric Auger 
Cc: Alex Williamson 
Cc: Kevin Tian 
Cc: Jacob Pan 
Cc: Liu Yi L 
Cc: Lorenzo Pieralisi 
Cc: Shameerali Kolothum Thodi 
---
 drivers/iommu/virtio-iommu.c | 73 +---
 1 file changed, 51 insertions(+), 22 deletions(-)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 12d73321dbf4..ae5dfd3f8269 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -52,6 +52,8 @@ struct viommu_dev {
/* Supported MAP flags */
u32 map_flags;
u32 probe_size;
+
+   bool has_map:1;
 };
 
 struct viommu_mapping {
@@ -60,6 +62,11 @@ struct viommu_mapping {
u32 flags;
 };
 
+struct viommu_mm {
+   struct io_pgtable_ops   *ops;
+   struct viommu_domain *domain;
+};
+
 struct viommu_domain {
struct iommu_domain domain;
struct viommu_dev   *viommu;
@@ -67,12 +74,20 @@ struct viommu_domain {
    unsigned int id;
u32 map_flags;
 
+   /* Default address space when a table is bound */
+   struct viommu_mm mm;
+
+   /* When no table is bound, use generic mappings */
spinlock_t  mappings_lock;
struct rb_root_cached   mappings;
 
unsigned long   nr_endpoints;
 };
 
+#define vdev_for_each_id(i, eid, vdev) \
+   for (i = 0; i < vdev->dev->iommu->fwspec->num_ids &&\
+   ({ eid = vdev->dev->iommu->fwspec->ids[i]; 1; }); i++)
+
 struct viommu_endpoint {
struct device   *dev;
struct viommu_dev   *viommu;
@@ -750,12 +765,40 @@ static void viommu_domain_free(struct iommu_domain 
*domain)
kfree(vdomain);
 }
 
+static int viommu_simple_attach(struct viommu_domain *vdomain,
+   struct viommu_endpoint *vdev)
+{
+   int i, eid, ret;
+   struct virtio_iommu_req_attach req = {
+   .head.type  = VIRTIO_IOMMU_T_ATTACH,
+   .domain = cpu_to_le32(vdomain->id),
+   };
+
+   if (!vdomain->viommu->has_map)
+   return -ENODEV;
+
+   vdev_for_each_id(i, eid, vdev) {
+   req.endpoint = cpu_to_le32(eid);
+
+   ret = viommu_send_req_sync(vdomain->viommu, &req, sizeof(req));
+   if (ret)
+   return ret;
+   }
+
+   if (!vdomain->nr_endpoints) {
+   /*
+* This endpoint is the first to be attached to the domain.
+* Replay existing mappings if any (e.g. SW MSI).
+*/
+   ret = viommu_replay_mappings(vdomain);
+   }
+
+   return ret;
+}
+
 static int viommu_attach_dev(struct iommu_domain *domain, struct device *dev)
 {
-   int i;
int ret = 0;
-   struct virtio_iommu_req_attach req;
-   struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct viommu_endpoint *vdev = dev_iommu_priv_get(dev);
struct viommu_domain *vdomain = to_viommu_domain(domain);
 
@@ -790,25 +833,9 @@ static int viommu_attach_dev(struct iommu_domain *domain, 
struct device *dev)
if (vdev->vdomain)
vdev->vdomain->nr_endpoints--;
 
-   req = (struct virtio_iommu_req_attach) {
-   .head.type  = VIRTIO_IOMMU_T_ATTACH,
-   .domain = cpu_to_le32(vdomain->id),
-   };
-
-   for (i = 0; i < fwspec->num_ids; i++) {
-   req.endpoint = cpu_to_le32(fwspec->ids[i]);
-
-   ret = viommu_send_req_sync(vdomain->viommu, &req, sizeof(req));
-   if (ret)
-   return ret;
-   }
-
-   if (!vdomain->nr_endpoints) {
-   /*
-* This endpoint is the first to be attached to the domain.
-* Replay existing mappings (e.g. SW MSI).
-*/
-   ret = viommu_replay_mappings(vdomain);
+   if (!vdomain->mm.ops) {
+   /* If we couldn't bind any table, use the mapping tree */
+   ret = viommu_simple_attach(vdomain, vdev);
if (ret)
return ret;
}
@@ -1142,6 +1169,8 @@ static int viommu_probe(struct virtio_device *vdev)
struct virtio_iommu_config, probe_size,
>probe_size);
 
+   viommu->has_map = virtio_has_feature(vdev, VIRTIO_IOMMU_F_MAP_UNMAP);
+
viommu->geometry = (struct iommu_domain_geometry) {

[PATCH RFC v1 09/15] iommu/virtio: Update table format probing header

2021-01-15 Thread Vivek Gautam
Add info about asid_bits and additional flags to table format
probing header.

Signed-off-by: Vivek Gautam 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Michael S. Tsirkin 
Cc: Robin Murphy 
Cc: Jean-Philippe Brucker 
Cc: Eric Auger 
Cc: Alex Williamson 
Cc: Kevin Tian 
Cc: Jacob Pan 
Cc: Liu Yi L 
Cc: Lorenzo Pieralisi 
Cc: Shameerali Kolothum Thodi 
---
 include/uapi/linux/virtio_iommu.h | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/virtio_iommu.h 
b/include/uapi/linux/virtio_iommu.h
index 43821e33e7af..8a0624bab4b2 100644
--- a/include/uapi/linux/virtio_iommu.h
+++ b/include/uapi/linux/virtio_iommu.h
@@ -169,7 +169,10 @@ struct virtio_iommu_probe_pasid_size {
 struct virtio_iommu_probe_table_format {
struct virtio_iommu_probe_property  head;
__le16  format;
-   __u8 reserved[2];
+   __le16  asid_bits;
+
+   __le32  flags;
+   __u8 reserved[4];
 };
 
 struct virtio_iommu_req_probe {
-- 
2.17.1



[PATCH RFC v1 08/15] iommu: Add asid_bits to arm smmu-v3 stage1 table info

2021-01-15 Thread Vivek Gautam
asid_bits data is required to prepare stage-1 tables for arm-smmu-v3.

Signed-off-by: Vivek Gautam 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Robin Murphy 
Cc: Jean-Philippe Brucker 
Cc: Eric Auger 
Cc: Alex Williamson 
Cc: Kevin Tian 
Cc: Jacob Pan 
Cc: Liu Yi L 
Cc: Lorenzo Pieralisi 
Cc: Shameerali Kolothum Thodi 
---
 include/uapi/linux/iommu.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
index 082d758dd016..96abbfc7c643 100644
--- a/include/uapi/linux/iommu.h
+++ b/include/uapi/linux/iommu.h
@@ -357,7 +357,7 @@ struct iommu_pasid_smmuv3 {
__u32   version;
    __u8 s1fmt;
    __u8 s1dss;
-   __u8 padding[2];
+   __u16   asid_bits;
 };
 
 /**
-- 
2.17.1



[PATCH RFC v1 07/15] iommu/virtio: Add table format probing

2021-01-15 Thread Vivek Gautam
From: Jean-Philippe Brucker 

The device may provide information about hardware tables and additional
capabilities for each device. Parse the new probe fields.

Signed-off-by: Jean-Philippe Brucker 
[Vivek: Refactor to use "struct virtio_iommu_probe_table_format" rather
than separate structures for page table and pasid table format.]
Signed-off-by: Vivek Gautam 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Michael S. Tsirkin 
Cc: Robin Murphy 
Cc: Jean-Philippe Brucker 
Cc: Eric Auger 
Cc: Alex Williamson 
Cc: Kevin Tian 
Cc: Jacob Pan 
Cc: Liu Yi L 
Cc: Lorenzo Pieralisi 
Cc: Shameerali Kolothum Thodi 
---
 drivers/iommu/virtio-iommu.c | 102 ++-
 1 file changed, 101 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 2bfdd5734844..12d73321dbf4 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -78,6 +78,17 @@ struct viommu_endpoint {
struct viommu_dev   *viommu;
struct viommu_domain*vdomain;
struct list_headresv_regions;
+
+   /* properties of the physical IOMMU */
+   u64 pgsize_mask;
+   u64 input_start;
+   u64 input_end;
+   u8  output_bits;
+   u8  pasid_bits;
+   /* Preferred PASID table format */
+   void *pstf;
+   /* Preferred page table format */
+   void *pgtf;
 };
 
 struct viommu_request {
@@ -457,6 +468,72 @@ static int viommu_add_resv_mem(struct viommu_endpoint 
*vdev,
return 0;
 }
 
+static int viommu_add_pgsize_mask(struct viommu_endpoint *vdev,
+ struct virtio_iommu_probe_page_size_mask *prop,
+ size_t len)
+{
+   if (len < sizeof(*prop))
+   return -EINVAL;
+   vdev->pgsize_mask = le64_to_cpu(prop->mask);
+   return 0;
+}
+
+static int viommu_add_input_range(struct viommu_endpoint *vdev,
+ struct virtio_iommu_probe_input_range *prop,
+ size_t len)
+{
+   if (len < sizeof(*prop))
+   return -EINVAL;
+   vdev->input_start   = le64_to_cpu(prop->start);
+   vdev->input_end = le64_to_cpu(prop->end);
+   return 0;
+}
+
+static int viommu_add_output_size(struct viommu_endpoint *vdev,
+ struct virtio_iommu_probe_output_size *prop,
+ size_t len)
+{
+   if (len < sizeof(*prop))
+   return -EINVAL;
+   vdev->output_bits = prop->bits;
+   return 0;
+}
+
+static int viommu_add_pasid_size(struct viommu_endpoint *vdev,
+struct virtio_iommu_probe_pasid_size *prop,
+size_t len)
+{
+   if (len < sizeof(*prop))
+   return -EINVAL;
+   vdev->pasid_bits = prop->bits;
+   return 0;
+}
+
+static int viommu_add_pgtf(struct viommu_endpoint *vdev, void *pgtf, size_t len)
+{
+   /* Select the first page table format available */
+   if (len < sizeof(struct virtio_iommu_probe_table_format) || vdev->pgtf)
+   return -EINVAL;
+
+   vdev->pgtf = kmemdup(pgtf, len, GFP_KERNEL);
+   if (!vdev->pgtf)
+   return -ENOMEM;
+
+   return 0;
+}
+
+static int viommu_add_pstf(struct viommu_endpoint *vdev, void *pstf, size_t len)
+{
+   if (len < sizeof(struct virtio_iommu_probe_table_format) || vdev->pstf)
+   return -EINVAL;
+
+   vdev->pstf = kmemdup(pstf, len, GFP_KERNEL);
+   if (!vdev->pstf)
+   return -ENOMEM;
+
+   return 0;
+}
+
 static int viommu_probe_endpoint(struct viommu_dev *viommu, struct device *dev)
 {
int ret;
@@ -493,11 +570,30 @@ static int viommu_probe_endpoint(struct viommu_dev *viommu, struct device *dev)
 
while (type != VIRTIO_IOMMU_PROBE_T_NONE &&
   cur < viommu->probe_size) {
+   void *value = prop;
len = le16_to_cpu(prop->length) + sizeof(*prop);
 
switch (type) {
case VIRTIO_IOMMU_PROBE_T_RESV_MEM:
-   ret = viommu_add_resv_mem(vdev, (void *)prop, len);
+   ret = viommu_add_resv_mem(vdev, value, len);
+   break;
+   case VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK:
+   ret = viommu_add_pgsize_mask(vdev, value, len);
+   break;
+   case VIRTIO_IOMMU_PROBE_T_INPUT_RANGE:
+   ret = viommu_add_input_range(vdev, value, len);
+   break;
+   case VIRTIO_IOMMU_PROBE_T_OUTPUT_SIZE:
+   ret = viommu_add_output_size(vdev, value, len);
+   break;
+   

[PATCH RFC v1 06/15] iommu/virtio: Add headers for table format probing

2021-01-15 Thread Vivek Gautam
From: Jean-Philippe Brucker 

Add required UAPI defines for probing table format for underlying
iommu hardware. The device may provide information about hardware
tables and additional capabilities for each device.
This allows guest to correctly fabricate stage-1 page tables.

Signed-off-by: Jean-Philippe Brucker 
[Vivek: Use a single "struct virtio_iommu_probe_table_format" rather
than separate structures for page table and pasid table format.
Also update commit message.]
Signed-off-by: Vivek Gautam 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Michael S. Tsirkin 
Cc: Robin Murphy 
Cc: Jean-Philippe Brucker 
Cc: Eric Auger 
Cc: Alex Williamson 
Cc: Kevin Tian 
Cc: Jacob Pan 
Cc: Liu Yi L 
Cc: Lorenzo Pieralisi 
Cc: Shameerali Kolothum Thodi 
---
 include/uapi/linux/virtio_iommu.h | 44 ++-
 1 file changed, 43 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/virtio_iommu.h 
b/include/uapi/linux/virtio_iommu.h
index 237e36a280cb..43821e33e7af 100644
--- a/include/uapi/linux/virtio_iommu.h
+++ b/include/uapi/linux/virtio_iommu.h
@@ -2,7 +2,7 @@
 /*
  * Virtio-iommu definition v0.12
  *
- * Copyright (C) 2019 Arm Ltd.
+ * Copyright (C) 2019-2021 Arm Ltd.
  */
 #ifndef _UAPI_LINUX_VIRTIO_IOMMU_H
 #define _UAPI_LINUX_VIRTIO_IOMMU_H
@@ -111,6 +111,12 @@ struct virtio_iommu_req_unmap {
 
 #define VIRTIO_IOMMU_PROBE_T_NONE  0
 #define VIRTIO_IOMMU_PROBE_T_RESV_MEM  1
+#define VIRTIO_IOMMU_PROBE_T_PAGE_SIZE_MASK2
+#define VIRTIO_IOMMU_PROBE_T_INPUT_RANGE   3
+#define VIRTIO_IOMMU_PROBE_T_OUTPUT_SIZE   4
+#define VIRTIO_IOMMU_PROBE_T_PASID_SIZE5
+#define VIRTIO_IOMMU_PROBE_T_PAGE_TABLE_FMT6
+#define VIRTIO_IOMMU_PROBE_T_PASID_TABLE_FMT   7
 
 #define VIRTIO_IOMMU_PROBE_T_MASK  0xfff
 
@@ -130,6 +136,42 @@ struct virtio_iommu_probe_resv_mem {
__le64  end;
 };
 
+struct virtio_iommu_probe_page_size_mask {
+   struct virtio_iommu_probe_property  head;
+   __u8 reserved[4];
+   __le64  mask;
+};
+
+struct virtio_iommu_probe_input_range {
+   struct virtio_iommu_probe_property  head;
+   __u8 reserved[4];
+   __le64  start;
+   __le64  end;
+};
+
+struct virtio_iommu_probe_output_size {
+   struct virtio_iommu_probe_property  head;
+   __u8 bits;
+   __u8 reserved[3];
+};
+
+struct virtio_iommu_probe_pasid_size {
+   struct virtio_iommu_probe_property  head;
+   __u8 bits;
+   __u8 reserved[3];
+};
+
+/* Arm LPAE page table format */
+#define VIRTIO_IOMMU_FORMAT_PGTF_ARM_LPAE  1
+/* Arm smmu-v3 type PASID table format */
+#define VIRTIO_IOMMU_FORMAT_PSTF_ARM_SMMU_V3   2
+
+struct virtio_iommu_probe_table_format {
+   struct virtio_iommu_probe_property  head;
+   __le16  format;
+   __u8 reserved[2];
+};
+
 struct virtio_iommu_req_probe {
struct virtio_iommu_req_headhead;
__le32  endpoint;
-- 
2.17.1



[PATCH RFC v1 05/15] iommu/arm-smmu-v3: Set sync op from consumer driver of cd-lib

2021-01-15 Thread Vivek Gautam
The change allows different consumers of arm-smmu-v3-cd-lib to set
their respective sync op for pasid entries.
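
With this in place, a different consumer can plug in its own maintenance
callback after registering the table. For instance, the virtio-iommu driver
from this series could do something like the sketch below (the registration
helper name is illustrative):

	tbl = iommu_register_pasid_table(PASID_TABLE_ARM_SMMU_V3, dev, cookie);
	if (!tbl)
		return -ENODEV;
	tbl->ops->sync = viommu_flush_pasid;	/* from the virtio-iommu patches */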

Signed-off-by: Vivek Gautam 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Robin Murphy 
Cc: Jean-Philippe Brucker 
Cc: Eric Auger 
Cc: Alex Williamson 
Cc: Kevin Tian 
Cc: Jacob Pan 
Cc: Liu Yi L 
Cc: Lorenzo Pieralisi 
Cc: Shameerali Kolothum Thodi 
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c | 1 -
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c| 7 +++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c
index ec37476c8d09..acaa09acecdd 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c
@@ -265,7 +265,6 @@ struct iommu_vendor_psdtable_ops arm_cd_table_ops = {
.free= arm_smmu_free_cd_tables,
.prepare = arm_smmu_prepare_cd,
.write   = arm_smmu_write_ctx_desc,
-   .sync= arm_smmu_sync_cd,
 };
 
 struct iommu_pasid_table *arm_smmu_register_cd_table(struct device *dev,
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 2f86c6ac42b6..0c644be22b4b 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1869,6 +1869,13 @@ static int arm_smmu_domain_finalise_s1(struct 
arm_smmu_domain *smmu_domain,
if (ret)
goto out_free_cd_tables;
 
+   /*
+* Strange to setup an op here?
+* cd-lib is the actual user of sync op, and therefore the platform
+* drivers should assign this sync/maintenance ops as per need.
+*/
+   tbl->ops->sync = arm_smmu_sync_cd;
+
/*
 * Note that this will end up calling arm_smmu_sync_cd() before
 * the master has been added to the devices list for this domain.
-- 
2.17.1



[PATCH RFC v1 04/15] iommu/arm-smmu-v3: Update CD base address info for user-space

2021-01-15 Thread Vivek Gautam
Update base address information in vendor pasid table info to pass that
to user-space for stage1 table management.

Signed-off-by: Vivek Gautam 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Robin Murphy 
Cc: Jean-Philippe Brucker 
Cc: Eric Auger 
Cc: Alex Williamson 
Cc: Kevin Tian 
Cc: Jacob Pan 
Cc: Liu Yi L 
Cc: Lorenzo Pieralisi 
Cc: Shameerali Kolothum Thodi 
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c
index 8a7187534706..ec37476c8d09 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c
@@ -55,6 +55,9 @@ static __le64 *arm_smmu_get_cd_ptr(struct 
iommu_vendor_psdtable_cfg *pst_cfg,
if (arm_smmu_alloc_cd_leaf_table(dev, l1_desc))
return NULL;
 
+   if (s1cfg->s1fmt == STRTAB_STE_0_S1FMT_LINEAR)
+   pst_cfg->base = l1_desc->l2ptr_dma;
+
l1ptr = cdcfg->cdtab + idx * CTXDESC_L1_DESC_DWORDS;
arm_smmu_write_cd_l1_desc(l1ptr, l1_desc);
/* An invalid L1CD can be cached */
@@ -211,6 +214,9 @@ static int arm_smmu_alloc_cd_tables(struct 
iommu_vendor_psdtable_cfg *pst_cfg)
goto err_free_l1;
}
 
+   if (s1cfg->s1fmt == STRTAB_STE_0_S1FMT_64K_L2)
+   pst_cfg->base = cdcfg->cdtab_dma;
+
return 0;
 
 err_free_l1:
-- 
2.17.1



[PATCH RFC v1 03/15] iommu/arm-smmu-v3: Update drivers to work with iommu-pasid-table

2021-01-15 Thread Vivek Gautam
Update arm-smmu-v3 context descriptor (CD) library driver to work
with iommu-pasid-table APIs. These APIs are then used in arm-smmu-v3
drivers to manage CD tables.

Signed-off-by: Vivek Gautam 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Robin Murphy 
Cc: Jean-Philippe Brucker 
Cc: Eric Auger 
Cc: Alex Williamson 
Cc: Kevin Tian 
Cc: Jacob Pan 
Cc: Liu Yi L 
Cc: Lorenzo Pieralisi 
Cc: Shameerali Kolothum Thodi 
---
 .../arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c  | 127 +-
 .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   |  16 ++-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   |  47 ---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |   7 +-
 drivers/iommu/iommu-pasid-table.h |  10 +-
 5 files changed, 144 insertions(+), 63 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c
index 97d1786a8a70..8a7187534706 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c
@@ -8,17 +8,17 @@
 #include 
 
 #include "arm-smmu-v3.h"
+#include "../../iommu-pasid-table.h"
 
-static int arm_smmu_alloc_cd_leaf_table(struct arm_smmu_device *smmu,
+static int arm_smmu_alloc_cd_leaf_table(struct device *dev,
struct arm_smmu_l1_ctx_desc *l1_desc)
 {
size_t size = CTXDESC_L2_ENTRIES * (CTXDESC_CD_DWORDS << 3);
 
-   l1_desc->l2ptr = dmam_alloc_coherent(smmu->dev, size,
+   l1_desc->l2ptr = dmam_alloc_coherent(dev, size,
  &l1_desc->l2ptr_dma, GFP_KERNEL);
if (!l1_desc->l2ptr) {
-   dev_warn(smmu->dev,
-"failed to allocate context descriptor table\n");
+   dev_warn(dev, "failed to allocate context descriptor table\n");
return -ENOMEM;
}
return 0;
@@ -34,35 +34,39 @@ static void arm_smmu_write_cd_l1_desc(__le64 *dst,
WRITE_ONCE(*dst, cpu_to_le64(val));
 }
 
-static __le64 *arm_smmu_get_cd_ptr(struct arm_smmu_domain *smmu_domain,
+static __le64 *arm_smmu_get_cd_ptr(struct iommu_vendor_psdtable_cfg *pst_cfg,
   u32 ssid)
 {
__le64 *l1ptr;
unsigned int idx;
+   struct device *dev = pst_cfg->iommu_dev;
+   struct arm_smmu_cfg_info *cfgi = &pst_cfg->vendor.cfg;
+   struct arm_smmu_s1_cfg *s1cfg = cfgi->s1_cfg;
+   struct arm_smmu_ctx_desc_cfg *cdcfg = &s1cfg->cdcfg;
    struct arm_smmu_l1_ctx_desc *l1_desc;
-   struct arm_smmu_device *smmu = smmu_domain->smmu;
-   struct arm_smmu_ctx_desc_cfg *cdcfg = &smmu_domain->s1_cfg.cdcfg;
+   struct iommu_pasid_table *tbl = pasid_table_cfg_to_table(pst_cfg);
 
-   if (smmu_domain->s1_cfg.s1fmt == STRTAB_STE_0_S1FMT_LINEAR)
+   if (s1cfg->s1fmt == STRTAB_STE_0_S1FMT_LINEAR)
return cdcfg->cdtab + ssid * CTXDESC_CD_DWORDS;
 
idx = ssid >> CTXDESC_SPLIT;
    l1_desc = &cdcfg->l1_desc[idx];
if (!l1_desc->l2ptr) {
-   if (arm_smmu_alloc_cd_leaf_table(smmu, l1_desc))
+   if (arm_smmu_alloc_cd_leaf_table(dev, l1_desc))
return NULL;
 
l1ptr = cdcfg->cdtab + idx * CTXDESC_L1_DESC_DWORDS;
arm_smmu_write_cd_l1_desc(l1ptr, l1_desc);
/* An invalid L1CD can be cached */
-   arm_smmu_sync_cd(smmu_domain, ssid, false);
+   if (iommu_psdtable_sync(tbl, tbl->cookie, ssid, false))
+   return NULL;
}
idx = ssid & (CTXDESC_L2_ENTRIES - 1);
return l1_desc->l2ptr + idx * CTXDESC_CD_DWORDS;
 }
 
-int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid,
-   struct arm_smmu_ctx_desc *cd)
+static int arm_smmu_write_ctx_desc(struct iommu_vendor_psdtable_cfg *pst_cfg,
+  int ssid, void *cookie)
 {
/*
 * This function handles the following cases:
@@ -78,12 +82,15 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_domain 
*smmu_domain, int ssid,
u64 val;
bool cd_live;
__le64 *cdptr;
-   struct arm_smmu_device *smmu = smmu_domain->smmu;
+   struct arm_smmu_cfg_info *cfgi = &pst_cfg->vendor.cfg;
+   struct arm_smmu_s1_cfg *s1cfg = cfgi->s1_cfg;
+   struct iommu_pasid_table *tbl = pasid_table_cfg_to_table(pst_cfg);
+   struct arm_smmu_ctx_desc *cd = cookie;
 
-   if (WARN_ON(ssid >= (1 << smmu_domain->s1_cfg.s1cdmax)))
+   if (WARN_ON(ssid >= (1 << s1cfg->s1cdmax)))
return -E2BIG;
 
-   cdptr = arm_smmu_get_cd_ptr(smmu_domain, ssid);
+   cdptr = arm_smmu_get_cd_ptr(pst_cfg, ssid);
if (!cdptr)
return -ENOMEM;
 
@@ -111,7 +118,8 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_domain 
*smmu_domain, int ssid,
 * order. Ensure that it observes valid values before reading
 * V=1.
 */
-   

[PATCH RFC v1 02/15] iommu: Add a simple PASID table library

2021-01-15 Thread Vivek Gautam
Add a small API in iommu subsystem to handle PASID table allocation
requests from different consumer drivers, such as a paravirtualized
iommu driver. The API provides ops for allocating and freeing PASID
table, writing to it and managing the table caches.

This library also provides for registering a vendor API that attaches
to these ops. The vendor APIs would eventually perform arch level
implementations for these PASID tables.
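
A consumer of the library would typically drive it as follows (a minimal
sketch using the inline wrappers defined below; the registration call and
its arguments follow the vendor layer, error handling trimmed):

	struct iommu_pasid_table *tbl;

	tbl = arm_smmu_register_cd_table(dev, cookie);	/* vendor registration */
	ret = iommu_psdtable_alloc(tbl, &tbl->cfg);
	if (ret)
		return ret;

	iommu_psdtable_prepare(tbl, &tbl->cfg, &pgtbl_cfg, asid);
	ret = iommu_psdtable_write(tbl, &tbl->cfg, ssid, &cd);	/* install entry */
	...
	iommu_psdtable_free(tbl, &tbl->cfg);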

Signed-off-by: Vivek Gautam 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Robin Murphy 
Cc: Jean-Philippe Brucker 
Cc: Eric Auger 
Cc: Alex Williamson 
Cc: Kevin Tian 
Cc: Jacob Pan 
Cc: Liu Yi L 
Cc: Lorenzo Pieralisi 
Cc: Shameerali Kolothum Thodi 
---
 drivers/iommu/iommu-pasid-table.h | 134 ++
 1 file changed, 134 insertions(+)
 create mode 100644 drivers/iommu/iommu-pasid-table.h

diff --git a/drivers/iommu/iommu-pasid-table.h 
b/drivers/iommu/iommu-pasid-table.h
new file mode 100644
index ..bd4f57656f67
--- /dev/null
+++ b/drivers/iommu/iommu-pasid-table.h
@@ -0,0 +1,134 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * PASID table management for the IOMMU
+ *
+ * Copyright (C) 2021 Arm Ltd.
+ */
+#ifndef __IOMMU_PASID_TABLE_H
+#define __IOMMU_PASID_TABLE_H
+
+#include 
+
+#include "arm/arm-smmu-v3/arm-smmu-v3.h"
+
+enum pasid_table_fmt {
+   PASID_TABLE_ARM_SMMU_V3,
+   PASID_TABLE_NUM_FMTS,
+};
+
+/**
+ * struct arm_smmu_cfg_info - arm-smmu-v3 specific configuration data
+ *
+ * @s1_cfg: arm-smmu-v3 stage1 config data
+ * @feat_flag: features supported by arm-smmu-v3 implementation
+ */
+struct arm_smmu_cfg_info {
+   struct arm_smmu_s1_cfg  *s1_cfg;
+   u32 feat_flag;
+};
+
+/**
+ * struct iommu_vendor_psdtable_cfg - Configuration data for PASID tables
+ *
+ * @iommu_dev: device performing the DMA table walks
+ * @fmt: The PASID table format
+ * @base: DMA address of the allocated table, set by the vendor driver
+ * @cfg: arm-smmu-v3 specific config data
+ */
+struct iommu_vendor_psdtable_cfg {
+   struct device   *iommu_dev;
+   enum pasid_table_fmt    fmt;
+   dma_addr_t  base;
+   union {
+   struct arm_smmu_cfg_info    cfg;
+   } vendor;
+};
+
+struct iommu_vendor_psdtable_ops;
+
+/**
+ * struct iommu_pasid_table - describes a set of PASID tables
+ *
+ * @cookie: An opaque token provided by the IOMMU driver and passed back to any
+ * callback routine.
+ * @cfg: A copy of the PASID table configuration
+ * @ops: The PASID table operations in use for this set of page tables
+ */
+struct iommu_pasid_table {
+   void    *cookie;
+   struct iommu_vendor_psdtable_cfg    cfg;
+   struct iommu_vendor_psdtable_ops    *ops;
+};
+
+#define pasid_table_cfg_to_table(pst_cfg) \
+   container_of((pst_cfg), struct iommu_pasid_table, cfg)
+
+struct iommu_vendor_psdtable_ops {
+   int (*alloc)(struct iommu_vendor_psdtable_cfg *cfg);
+   void (*free)(struct iommu_vendor_psdtable_cfg *cfg);
+   void (*prepare)(struct iommu_vendor_psdtable_cfg *cfg,
+   struct io_pgtable_cfg *pgtbl_cfg, u32 asid);
+   int (*write)(struct iommu_vendor_psdtable_cfg *cfg, int ssid,
+void *cookie);
+   void (*sync)(void *cookie, int ssid, bool leaf);
+};
+
+static inline int iommu_psdtable_alloc(struct iommu_pasid_table *tbl,
+  struct iommu_vendor_psdtable_cfg *cfg)
+{
+   if (!tbl->ops->alloc)
+   return -ENOSYS;
+
+   return tbl->ops->alloc(cfg);
+}
+
+static inline void iommu_psdtable_free(struct iommu_pasid_table *tbl,
+  struct iommu_vendor_psdtable_cfg *cfg)
+{
+   if (!tbl->ops->free)
+   return;
+
+   tbl->ops->free(cfg);
+}
+
+static inline int iommu_psdtable_prepare(struct iommu_pasid_table *tbl,
+struct iommu_vendor_psdtable_cfg *cfg,
+struct io_pgtable_cfg *pgtbl_cfg,
+u32 asid)
+{
+   if (!tbl->ops->prepare)
+   return -ENOSYS;
+
+   tbl->ops->prepare(cfg, pgtbl_cfg, asid);
+   return 0;
+}
+
+static inline int iommu_psdtable_write(struct iommu_pasid_table *tbl,
+  struct iommu_vendor_psdtable_cfg *cfg,
+  int ssid, void *cookie)
+{
+   if (!tbl->ops->write)
+   return -ENOSYS;
+
+   return tbl->ops->write(cfg, ssid, cookie);
+}
+
+static inline int iommu_psdtable_sync(struct iommu_pasid_table *tbl,
+ void *cookie, int ssid, bool leaf)
+{
+   if (!tbl->ops->sync)
+   return -ENOSYS;
+
+   tbl->ops->sync(cookie, ssid, leaf);
+   return 0;
+}
+
+/* A placeholder to register vendor specific pasid layer */
+static inline struct iommu_pasid_table *

[PATCH RFC v1 01/15] iommu/arm-smmu-v3: Create a Context Descriptor library

2021-01-15 Thread Vivek Gautam
Para-virtualized iommu drivers in a guest may need to create and manage
context descriptor (CD) tables as part of PASID table allocations.
The PASID tables are passed to the host to configure stage-1 tables in
hardware.
Make way for a library driver for CD management to allow a para-
virtualized iommu driver to call such code.

Signed-off-by: Vivek Gautam 
Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Robin Murphy 
Cc: Jean-Philippe Brucker 
Cc: Eric Auger 
Cc: Alex Williamson 
Cc: Kevin Tian 
Cc: Jacob Pan 
Cc: Liu Yi L 
Cc: Lorenzo Pieralisi 
Cc: Shameerali Kolothum Thodi 
---
 drivers/iommu/arm/arm-smmu-v3/Makefile|   2 +-
 .../arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c  | 223 ++
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   | 216 +
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |   3 +
 4 files changed, 228 insertions(+), 216 deletions(-)
 create mode 100644 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c

diff --git a/drivers/iommu/arm/arm-smmu-v3/Makefile 
b/drivers/iommu/arm/arm-smmu-v3/Makefile
index 54feb1ecccad..ca1a05b8b8ad 100644
--- a/drivers/iommu/arm/arm-smmu-v3/Makefile
+++ b/drivers/iommu/arm/arm-smmu-v3/Makefile
@@ -1,5 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_ARM_SMMU_V3) += arm_smmu_v3.o
-arm_smmu_v3-objs-y += arm-smmu-v3.o
+arm_smmu_v3-objs-y += arm-smmu-v3.o arm-smmu-v3-cd-lib.o
 arm_smmu_v3-objs-$(CONFIG_ARM_SMMU_V3_SVA) += arm-smmu-v3-sva.o
 arm_smmu_v3-objs := $(arm_smmu_v3-objs-y)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c
new file mode 100644
index ..97d1786a8a70
--- /dev/null
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c
@@ -0,0 +1,223 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * arm-smmu-v3 context descriptor handling library driver
+ *
+ * Copyright (C) 2021 Arm Ltd.
+ */
+
+#include 
+
+#include "arm-smmu-v3.h"
+
+static int arm_smmu_alloc_cd_leaf_table(struct arm_smmu_device *smmu,
+   struct arm_smmu_l1_ctx_desc *l1_desc)
+{
+   size_t size = CTXDESC_L2_ENTRIES * (CTXDESC_CD_DWORDS << 3);
+
+   l1_desc->l2ptr = dmam_alloc_coherent(smmu->dev, size,
+ &l1_desc->l2ptr_dma, GFP_KERNEL);
+   if (!l1_desc->l2ptr) {
+   dev_warn(smmu->dev,
+"failed to allocate context descriptor table\n");
+   return -ENOMEM;
+   }
+   return 0;
+}
+
+static void arm_smmu_write_cd_l1_desc(__le64 *dst,
+ struct arm_smmu_l1_ctx_desc *l1_desc)
+{
+   u64 val = (l1_desc->l2ptr_dma & CTXDESC_L1_DESC_L2PTR_MASK) |
+ CTXDESC_L1_DESC_V;
+
+   /* See comment in arm_smmu_write_ctx_desc() */
+   WRITE_ONCE(*dst, cpu_to_le64(val));
+}
+
+static __le64 *arm_smmu_get_cd_ptr(struct arm_smmu_domain *smmu_domain,
+  u32 ssid)
+{
+   __le64 *l1ptr;
+   unsigned int idx;
+   struct arm_smmu_l1_ctx_desc *l1_desc;
+   struct arm_smmu_device *smmu = smmu_domain->smmu;
+   struct arm_smmu_ctx_desc_cfg *cdcfg = &smmu_domain->s1_cfg.cdcfg;
+
+   if (smmu_domain->s1_cfg.s1fmt == STRTAB_STE_0_S1FMT_LINEAR)
+   return cdcfg->cdtab + ssid * CTXDESC_CD_DWORDS;
+
+   idx = ssid >> CTXDESC_SPLIT;
+   l1_desc = &cdcfg->l1_desc[idx];
+   if (!l1_desc->l2ptr) {
+   if (arm_smmu_alloc_cd_leaf_table(smmu, l1_desc))
+   return NULL;
+
+   l1ptr = cdcfg->cdtab + idx * CTXDESC_L1_DESC_DWORDS;
+   arm_smmu_write_cd_l1_desc(l1ptr, l1_desc);
+   /* An invalid L1CD can be cached */
+   arm_smmu_sync_cd(smmu_domain, ssid, false);
+   }
+   idx = ssid & (CTXDESC_L2_ENTRIES - 1);
+   return l1_desc->l2ptr + idx * CTXDESC_CD_DWORDS;
+}
+
+int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid,
+   struct arm_smmu_ctx_desc *cd)
+{
+   /*
+* This function handles the following cases:
+*
+* (1) Install primary CD, for normal DMA traffic (SSID = 0).
+* (2) Install a secondary CD, for SID+SSID traffic.
+* (3) Update ASID of a CD. Atomically write the first 64 bits of the
+* CD, then invalidate the old entry and mappings.
+* (4) Quiesce the context without clearing the valid bit. Disable
+* translation, and ignore any translation fault.
+* (5) Remove a secondary CD.
+*/
+   u64 val;
+   bool cd_live;
+   __le64 *cdptr;
+   struct arm_smmu_device *smmu = smmu_domain->smmu;
+
+   if (WARN_ON(ssid >= (1 << smmu_domain->s1_cfg.s1cdmax)))
+   return -E2BIG;
+
+   cdptr = arm_smmu_get_cd_ptr(smmu_domain, ssid);
+   if (!cdptr)
+   return -ENOMEM;
+
+   val = le64_to_cpu(cdptr[0]);
+   cd_live = !!(val & CTXDESC_CD_0_V);
+
+   if (!cd) { /* (5) */
+  

[PATCH RFC v1 00/15] iommu/virtio: Nested stage support with Arm

2021-01-15 Thread Vivek Gautam
This patch-series aims at enabling nested stage translation in guests
using virtio-iommu as the paravirtualized iommu. The backend is supported
on Arm SMMU-v3, which provides nested stage-1 and stage-2 translation.

This series derives its purpose from various efforts happening to add
support for Shared Virtual Addressing (SVA) in host and guest. On Arm,
most of the support for SVA has already landed. The support for nested
stage translation and fault reporting to guest has been proposed [1].
The related changes required in VFIO [2] framework have also been put
forward.

This series proposes changes in virtio-iommu to program PASID tables
and related stage-1 page tables. A simple iommu-pasid-table library
is added for this purpose; it interacts with vendor drivers to
allocate and populate PASID tables.
For Arm SMMUv3 we propose to pull the Context Descriptor (CD) management
code out of the arm-smmu-v3 driver and add it as a vendor glue layer
that supports allocating CD tables and populating them with the right
values. These CD tables are essentially the PASID tables and contain
stage-1 page table configurations too.
A request to set up these CD tables comes from the virtio-iommu driver,
via the iommu-pasid-table library, when running on Arm. The virtio-iommu
driver then passes these PASID tables to the host using the right virtio
backend and supporting code in the VMM.
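
To picture the vendor dispatch such a library implies, a rough C sketch
is below. This is only an illustration: the names (iommu_psdtable_ops,
alloc/write_entry/free) are assumptions made here, not the exact
interface from the series.

struct iommu_pasid_table;

/* Vendor callbacks a generic library could dispatch to (hypothetical). */
struct iommu_psdtable_ops {
	/* allocate the PASID/CD tables for a device or domain */
	int (*alloc)(struct iommu_pasid_table *tbl);
	/* install a stage-1 page-table configuration for one PASID */
	int (*write_entry)(struct iommu_pasid_table *tbl, int pasid,
			   void *stage1_cfg);
	void (*free)(struct iommu_pasid_table *tbl);
};

struct iommu_pasid_table {
	const struct iommu_psdtable_ops *ops;	/* e.g. the Arm CD-lib glue */
	void *cookie;				/* vendor-private state */
};

In this shape, an Arm implementation would back write_entry() with the
CD-table code pulled out of arm-smmu-v3, while other vendors could plug
in their own table formats.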

For testing we have added the necessary support in kvmtool. The changes
in kvmtool are based on the virtio-iommu development branch by
Jean-Philippe Brucker [3].

The tested kernel branch contains the following, ordered bottom to top
in git history:
a) v5.11-rc3
b) arm-smmu-v3 [1] and vfio [2] changes from Eric to add nested page
   table support for Arm.
c) Smmu test engine patches from Jean-Philippe's branch [4]
d) This series
e) Domain nesting info patches [5][6][7].
f) Changes to add arm-smmu-v3 specific nesting info (to be sent to
   the list).

This kernel is tested on the Neoverse reference software stack with the
Fixed Virtual Platform (FVP). Public versions of the software stack and
FVP are available here [8][9].

A big thanks to Jean-Philippe for his contributions towards this work
and for his valuable guidance.

[1] https://lore.kernel.org/linux-iommu/20201118112151.25412-1-eric.au...@redhat.com/T/
[2] https://lore.kernel.org/kvmarm/20201116110030.32335-12-eric.au...@redhat.com/T/
[3] https://jpbrucker.net/git/kvmtool/log/?h=virtio-iommu/devel
[4] https://jpbrucker.net/git/linux/log/?h=sva/smmute
[5] https://lore.kernel.org/kvm/1599734733-6431-2-git-send-email-yi.l@intel.com/
[6] https://lore.kernel.org/kvm/1599734733-6431-3-git-send-email-yi.l@intel.com/
[7] https://lore.kernel.org/kvm/1599734733-6431-4-git-send-email-yi.l@intel.com/
[8] https://developer.arm.com/tools-and-software/open-source-software/arm-platforms-software/arm-ecosystem-fvps
[9] https://git.linaro.org/landing-teams/working/arm/arm-reference-platforms.git/about/docs/rdn1edge/user-guide.rst

Jean-Philippe Brucker (6):
  iommu/virtio: Add headers for table format probing
  iommu/virtio: Add table format probing
  iommu/virtio: Add headers for binding pasid table in iommu
  iommu/virtio: Add support for INVALIDATE request
  iommu/virtio: Attach Arm PASID tables when available
  iommu/virtio: Add support for Arm LPAE page table format

Vivek Gautam (9):
  iommu/arm-smmu-v3: Create a Context Descriptor library
  iommu: Add a simple PASID table library
  iommu/arm-smmu-v3: Update drivers to work with iommu-pasid-table
  iommu/arm-smmu-v3: Update CD base address info for user-space
  iommu/arm-smmu-v3: Set sync op from consumer driver of cd-lib
  iommu: Add asid_bits to arm smmu-v3 stage1 table info
  iommu/virtio: Update table format probing header
  iommu/virtio: Prepare to add attach pasid table infrastructure
  iommu/virtio: Update fault type and reason info for viommu fault

 drivers/iommu/arm/arm-smmu-v3/Makefile|   2 +-
 .../arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c  | 283 +++
 .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   |  16 +-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   | 268 +--
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |   4 +-
 drivers/iommu/iommu-pasid-table.h | 140 
 drivers/iommu/virtio-iommu.c  | 692 +-
 include/uapi/linux/iommu.h|   2 +-
 include/uapi/linux/virtio_iommu.h | 158 +++-
 9 files changed, 1303 insertions(+), 262 deletions(-)
 create mode 100644 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-cd-lib.c
 create mode 100644 drivers/iommu/iommu-pasid-table.h

-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RESEND PATCH v3 0/4] iommu/iova: Solve longterm IOVA issue

2021-01-15 Thread John Garry
+ linux-scsi (see https://lore.kernel.org/linux-iommu/1607538189-237944-4-git-send-email-john.ga...@huawei.com/)


On 17/11/2020 10:25, John Garry wrote:

This series contains a patch to solve the longterm IOVA issue which
leizhen originally tried to address at [0].

A sieved kernel log is at the following link, showing periodic dumps of
IOVA rcache sizes, per CPU and per depot bin, for each IOVA size granule:
https://raw.githubusercontent.com/hisilicon/kernel-dev/topic-iommu-5.10-iova-debug-v3/aging_test

Notice, for example, the following logs:
[13175.355584] print_iova1 cpu_total=40135 depot_total=3866 total=44001
[83483.457858] print_iova1 cpu_total=62532 depot_total=24476 total=87008

Where total IOVA rcache size has grown from 44K->87K over a long time.



JFYI, I am able to reproduce this aging issue on another storage card, 
an LSI SAS 3008, so now it's harder to say it's an issue specific to a 
(buggy) single driver.


A log of the IOVA size dumps is here:
https://raw.githubusercontent.com/hisilicon/kernel-dev/064c4dc8869b3f2ad07edffceafde0b129f276b0/lsi3008_dmesg

Notice again how the total IOVA size goes up over time, like:
[ 68.176914] print_iova1 cpu_total=23663 depot_total=256 total=23919
[ 2337.008194] print_iova1 cpu_total=67361 depot_total=9088 total=76449
[17141.860078] print_iova1 cpu_total=73397 depot_total=10368 total=83765
[27087.850830] print_iova1 cpu_total=73386 depot_total=10624 total=84010
[10434.042877] print_iova1 cpu_total=90652 depot_total=12928 total=103580
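
(For reference, the cpu_total/depot_total figures in these dumps are the
sums of entries sitting in the per-CPU magazines and the depot magazines
respectively. A simplified sketch of the depot-side accounting follows;
the structure and field names are illustrative assumptions, not the
driver's exact layout.)

/* Simplified model of an IOVA rcache depot (illustration only). */
struct iova_magazine_sketch {
	unsigned long size;			/* cached IOVA pfns held */
	unsigned long pfns[128];
};

struct iova_rcache_sketch {
	struct iova_magazine_sketch *depot[32];
	unsigned long depot_size;		/* magazines in the depot */
};

/* Sum the cached entries across depot magazines -- the "depot_total". */
static unsigned long depot_total(const struct iova_rcache_sketch *rc)
{
	unsigned long i, total = 0;

	for (i = 0; i < rc->depot_size; i++)
		total += rc->depot[i]->size;
	return total;
}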

I had to change some settings for that storage card to reproduce it,
though [0]. That could explain why there have been no other reports.


So please consider this issue again...

Thanks,
john

[0] https://lore.kernel.org/linux-scsi/dd8e6fdc-397d-b6ad-3371-0b65d1932...@huawei.com/T/#m953d21446a5756981412c92d0924ca65c8d2f3a5



Along with this patch, I included the following:
- A smaller helper to clear all IOVAs for a domain
- Change the polarity of the IOVA magazine helpers
- A small optimisation from Cong Wang, which was never applied [1].
   There was some debate about the other patches in that series, but this
   one is quite straightforward.

Differences from v2:
- Update commit message for patch 3/4

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[GIT PULL] IOMMU fixes for -rc4

2021-01-15 Thread Will Deacon
Hi Linus,

Please pull these three IOMMU fixes for -rc4. The main one is a change
to the Intel IOMMU driver to fix the handling of unaligned addresses
when invalidating the TLB. The fix itself is a bit ugly (the caller does
a bunch of shifting which is then effectively undone later in the
callchain), but Lu has patches to clean all of this up in 5.12.
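
(For readers wondering what the unaligned-address handling amounts to:
page-granular TLB invalidation interfaces typically take a naturally
aligned power-of-two range, so an unaligned request has to be widened to
a covering aligned region. The sketch below only illustrates that
computation; it is not the actual VT-d fix.)

#include <stdint.h>

/* Widen (addr, npages) to the smallest naturally aligned power-of-two
 * block of pages that covers it. */
static void align_inv_range(uint64_t addr, uint64_t npages,
			    uint64_t *base, unsigned int *order)
{
	const unsigned int shift = 12;		/* 4KiB pages */
	uint64_t first = addr >> shift;
	uint64_t last = first + npages - 1;	/* inclusive last page */
	unsigned int o = 0;

	/* Grow the window until one aligned block holds both ends. */
	while ((first & ~((1ULL << o) - 1)) != (last & ~((1ULL << o) - 1)))
		o++;

	*order = o;
	*base = (first & ~((1ULL << o) - 1)) << shift;
}

Invalidating 1 << order pages starting at base then covers the original
range, at the cost of some over-invalidation.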

Thanks,

Will

--->8

The following changes since commit 7c29ada5e70083805bc3a68daa23441df421fbee:

  iommu/vt-d: Fix ineffective devTLB invalidation for subdevices (2021-01-07 14:38:15 +0000)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git tags/iommu-fixes

for you to fetch changes up to 694a1c0adebee9152a9ba0320468f7921aca647d:

  iommu/vt-d: Fix duplicate included linux/dma-map-ops.h (2021-01-12 16:56:20 +0000)


iommu fixes for -rc4

- Fix address alignment handling for VT-D TLB invalidation

- Enable workarounds for buggy Qualcomm firmware on two more SoCs

- Drop duplicate #include


Konrad Dybcio (1):
  iommu: arm-smmu-qcom: Add sdm630/msm8998 compatibles for qcom quirks

Lu Baolu (1):
  iommu/vt-d: Fix unaligned addresses for intel_flush_svm_range_dev()

Tian Tao (1):
  iommu/vt-d: Fix duplicate included linux/dma-map-ops.h

 drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c |  2 ++
 drivers/iommu/intel/iommu.c|  1 -
 drivers/iommu/intel/svm.c  | 22 --
 3 files changed, 22 insertions(+), 3 deletions(-)
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: Consult on ARM SMMU debugfs

2021-01-15 Thread chenxiang (M)



On 2021/1/12 4:01, Robin Murphy wrote:

On 2021-01-07 02:45, chenxiang (M) wrote:

Hi Will,  Robin or other guys,

When debugging SMMU/SVA issues on a huawei ARM64 board, we found that
there is not enough debugfs support for the ARM SMMU driver (such as
the values of STE/CD, which we sometimes need to check). Currently a
top-level iommu directory is created in debugfs, but there is nothing
specific to the ARM SMMU driver. Do you know whether ARM has plans to
add that soon?


FWIW I don't think I've ever felt the need to inspect the 
Stream Table on a live system. So far the nature of the STE code has 
been simple enough that it's very hard for any given STE to be *wrong* 
- either it's set up as expected and thus works fine, or it's not 
initialised at all and you get C_BAD_STE, where 99% of the time you 
then just cross-reference the Stream ID against the firmware and find 
that the DT/IORT is wrong.


Similarly I don't think I've ever even *seen* an issue that could be 
attributed to a context descriptor, although I appreciate that as we 
start landing more PASID and SVA support the scope for that starts to 
widen considerably.


Thank you for your reply. I agree that it is very hard for the content
of an STE/CD to be wrong with the current code, but more features that
touch the SMMU (such as SVA/vSMMU) are being upstreamed. When debugging
those features, interfaces to dump that info would help us locate
issues. And when debugging together with hardware engineers, the
content of such dumps is important for them too.




Feel free to propose a patch if you believe it would be genuinely 
useful and won't just bit-rot into a maintenance burden, but it's not 
something that's on our roadmap here.


OK, we are considering incorporating the following requirements into
the plan (a rough sketch of the kind of interface follows the list):

- Dump the STE/CD values for devices
- Dump page table entries for devices
- Dump the entry info for the CMDQ/EVENTQ
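
For the first item, a rough sketch of the kind of debugfs hook this
could take is below; the lookup of the STE pointer and the debugfs
parent directory are assumptions for illustration, not an actual
proposal:

#include <linux/debugfs.h>
#include <linux/seq_file.h>

static int arm_smmu_ste_show(struct seq_file *m, void *unused)
{
	__le64 *ste = m->private;	/* STE found by a (hypothetical) lookup */
	int i;

	/* An SMMUv3 Stream Table Entry is 8 64-bit words. */
	for (i = 0; i < 8; i++)
		seq_printf(m, "STE[%d]: 0x%016llx\n", i,
			   (unsigned long long)le64_to_cpu(ste[i]));
	return 0;
}
DEFINE_SHOW_ATTRIBUTE(arm_smmu_ste);

/*
 * Registered with something like:
 *	debugfs_create_file("ste", 0400, iommu_debugfs_dir, ste_ptr,
 *			    &arm_smmu_ste_fops);
 */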



Thanks,
Robin.





___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[RESEND PATCH v2 1/2] vfio/iommu_type1: Populate full dirty when detach non-pinned group

2021-01-15 Thread Keqian Zhu
If a group with non-pinned-page dirty scope is detached with dirty
logging enabled, we should fully populate the dirty bitmaps at the
time it's removed since we don't know the extent of its previous DMA,
nor will the group be present to trigger the full bitmap when the user
retrieves the dirty bitmap.

Fixes: d6a4c185660c ("vfio iommu: Implementation of ioctl for dirty pages tracking")
Suggested-by: Alex Williamson 
Signed-off-by: Keqian Zhu 
---
 drivers/vfio/vfio_iommu_type1.c | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 0b4dedaa9128..c16924cd54e7 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -236,6 +236,19 @@ static void vfio_dma_populate_bitmap(struct vfio_dma *dma, size_t pgsize)
 	}
 }
 
+static void vfio_iommu_populate_bitmap_full(struct vfio_iommu *iommu)
+{
+   struct rb_node *n;
+   unsigned long pgshift = __ffs(iommu->pgsize_bitmap);
+
+	for (n = rb_first(&iommu->dma_list); n; n = rb_next(n)) {
+   struct vfio_dma *dma = rb_entry(n, struct vfio_dma, node);
+
+   if (dma->iommu_mapped)
+   bitmap_set(dma->bitmap, 0, dma->size >> pgshift);
+   }
+}
+
 static int vfio_dma_bitmap_alloc_all(struct vfio_iommu *iommu, size_t pgsize)
 {
struct rb_node *n;
@@ -2415,8 +2428,12 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 * Removal of a group without dirty tracking may allow the iommu scope
 * to be promoted.
 */
-   if (update_dirty_scope)
+   if (update_dirty_scope) {
update_pinned_page_dirty_scope(iommu);
+   /* Promote pinned_scope successfully during dirty tracking? */
+		if (iommu->dirty_page_tracking && iommu->pinned_page_dirty_scope)
+   vfio_iommu_populate_bitmap_full(iommu);
+   }
 	mutex_unlock(&iommu->lock);
 }
 
-- 
2.19.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[RESEND PATCH v2 2/2] vfio/iommu_type1: Sanity check pfn_list when remove vfio_dma

2021-01-15 Thread Keqian Zhu
vfio_sanity_check_pfn_list() is used to check whether the pfn_list of a
vfio_dma is empty when removing the external domain, so it makes the
wrong assumption that only the external domain will add pfns to a dma
pfn_list.

Now we apply this check when removing a specific vfio_dma and keep just
the notifier check for the external domain.

Fixes: a54eb55045ae ("vfio iommu type1: Add support for mediated devices")
Signed-off-by: Keqian Zhu 
---
 drivers/vfio/vfio_iommu_type1.c | 24 +---
 1 file changed, 5 insertions(+), 19 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index c16924cd54e7..9b7fcff6bd81 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -958,6 +958,7 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
 
 static void vfio_remove_dma(struct vfio_iommu *iommu, struct vfio_dma *dma)
 {
+	WARN_ON(!RB_EMPTY_ROOT(&dma->pfn_list));
vfio_unmap_unpin(iommu, dma, true);
vfio_unlink_dma(iommu, dma);
put_task_struct(dma->task);
@@ -2251,23 +2252,6 @@ static void vfio_iommu_unmap_unpin_reaccount(struct vfio_iommu *iommu)
}
 }
 
-static void vfio_sanity_check_pfn_list(struct vfio_iommu *iommu)
-{
-   struct rb_node *n;
-
-	n = rb_first(&iommu->dma_list);
-   for (; n; n = rb_next(n)) {
-   struct vfio_dma *dma;
-
-   dma = rb_entry(n, struct vfio_dma, node);
-
-	if (WARN_ON(!RB_EMPTY_ROOT(&dma->pfn_list)))
-   break;
-   }
-   /* mdev vendor driver must unregister notifier */
-   WARN_ON(iommu->notifier.head);
-}
-
 /*
  * Called when a domain is removed in detach. It is possible that
  * the removed domain decided the iova aperture window. Modify the
@@ -2367,7 +2351,8 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
kfree(group);
 
 	if (list_empty(&iommu->external_domain->group_list)) {
-   vfio_sanity_check_pfn_list(iommu);
+			/* mdev vendor driver must unregister notifier */
+   WARN_ON(iommu->notifier.head);
 
if (!IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu))
vfio_iommu_unmap_unpin_all(iommu);
@@ -2492,7 +2477,8 @@ static void vfio_iommu_type1_release(void *iommu_data)
 
if (iommu->external_domain) {
vfio_release_domain(iommu->external_domain, true);
-   vfio_sanity_check_pfn_list(iommu);
+   /* mdev vendor driver must unregister notifier */
+   WARN_ON(iommu->notifier.head);
kfree(iommu->external_domain);
}
 
-- 
2.19.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[RESEND PATCH v2 0/2] vfio/iommu_type1: some fixes

2021-01-15 Thread Keqian Zhu
changelog:

v2:
 - Address suggestions from Alex.
 - Remove unnecessary patches.

Keqian Zhu (2):
  vfio/iommu_type1: Populate full dirty when detach non-pinned group
  vfio/iommu_type1: Sanity check pfn_list when remove vfio_dma

 drivers/vfio/vfio_iommu_type1.c | 42 +
 1 file changed, 22 insertions(+), 20 deletions(-)

-- 
2.19.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/5] vfio/iommu_type1: Fixes vfio_dma_populate_bitmap to avoid dirty lose

2021-01-15 Thread Keqian Zhu



On 2021/1/15 1:14, Alex Williamson wrote:
> On Thu, 14 Jan 2021 21:05:23 +0800
> Keqian Zhu  wrote:
> 
>> Hi Alex,
>>
>> On 2021/1/13 5:20, Alex Williamson wrote:
>>> On Thu, 7 Jan 2021 17:28:57 +0800
>>> Keqian Zhu  wrote:
>>>   
 Deferring the check of whether a vfio_dma is fully dirty to
 update_user_bitmap makes it easy to lose the dirty log. For example,
 after promoting the pinned_scope of a vfio_iommu, a vfio_dma is no
 longer considered fully dirty, so we may lose dirty log that occurred
 before the vfio_iommu was promoted.

 The key point is that pinned-dirty is not a real dirty tracking
 mechanism; it can't continuously track dirty pages, it only restricts
 the dirty scope. It is essentially the same as fully-dirty: fully-dirty
 covers the full scope and pinned-dirty covers the pinned scope.

 So we must mark pages pinned-dirty or fully-dirty right after we start
 dirty tracking or clear the dirty bitmap, to ensure the dirty log is
 recorded right away.
>>>
>>> I was initially convinced by these first three patches, but upon
>>> further review, I think the premise is wrong.  AIUI, the concern across
>>> these patches is that our dirty bitmap is only populated with pages
>>> dirtied by pinning and we only take into account the pinned page dirty
>>> scope at the time the bitmap is retrieved by the user.  You suppose
>>> this presents a gap where if a vendor driver has not yet identified
>>> with a page pinning scope that the entire bitmap should be considered
>>> dirty regardless of whether that driver later pins pages prior to the
>>> user retrieving the dirty bitmap.  
>> Yes, this is my concern. And there are some other scenarios.
>>
>> 1. If a non-pinned iommu-backed domain is detached between starting the
>> dirty log and retrieving the dirty bitmap, then the dirty log is missed
>> (as you said in the last paragraph).
>>
>> 2. If all vendor drivers have pinned (meaning the iommu pinned_scope is
>> true), and a vendor driver does a pin/unpin pair between starting the
>> dirty log and retrieving the dirty bitmap, then the dirty log for those
>> unpinned pages is missed.
> 
> Can you identify where this happens?  I believe this assertion is
> incorrect.  When dirty logging is enabled vfio_dma_populate_bitmap()
> marks all existing pinned pages as dirty.  In the scenario you
> describe, the iommu pinned page dirty scope is irrelevant.  We only
> track pinned or external DMA access pages for exactly this reason.
> Unpinning a page never clears the dirty bitmap, only the user
> retrieving the bitmap or disabling dirty logging clears the bitmap.  A
> page that has been unpinned is transiently dirty, it is not repopulated
> in the bitmap after the user has retrieved the bitmap because the
> device no longer has access to it to consider it perpetually dirty.
Right, thanks for making the logic clearer to me ;-). Then there is just
one issue to fix.
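
To restate the lifecycle described above as a self-contained toy model
(an illustration only, not the vfio_iommu_type1 code; all names here are
made up for the sketch):

#include <stdbool.h>
#include <string.h>

#define NPAGES 64
static unsigned char dirty[NPAGES];	/* one byte per page, for clarity */
static bool pinned[NPAGES];
static bool tracking;

static void pin_page(int i)
{
	pinned[i] = true;
	if (tracking)
		dirty[i] = 1;		/* pinning marks the page dirty */
}

static void unpin_page(int i)
{
	pinned[i] = false;		/* dirty[i] deliberately left set */
}

static void user_retrieve_bitmap(unsigned char *out)
{
	memcpy(out, dirty, NPAGES);	/* report to userspace */
	memset(dirty, 0, NPAGES);	/* clearing happens only here... */
	for (int i = 0; i < NPAGES; i++)
		if (pinned[i])
			dirty[i] = 1;	/* ...still-pinned pages re-dirty */
}

Running pin -> retrieve -> unpin -> retrieve on this model, the second
bitmap only reports pages that were still pinned, matching the behaviour
described above.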

> 
>>> I don't think this is how we intended the cooperation between the
>>> iommu driver and vendor driver to work.  By pinning pages a vendor
>>> driver is not declaring that only their future dirty page scope is
>>> limited to pinned pages, instead they're declaring themselves as a
>>> participant in dirty page tracking and take responsibility for
>>> pinning any necessary pages.  For example we might extend
>>> VFIO_IOMMU_DIRTY_PAGES_FLAG_START to trigger a blocking
>>> notification to groups to not only begin dirty tracking, but also
>>> to synchronously register their current device DMA footprint.  This
>>> patch would require a vendor driver to possibly perform a
>>> gratuitous page pinning in order to set the scope prior to dirty
>>> logging being enabled, or else the initial bitmap will be fully
>>> dirty.  
>> I get what you mean ;-). You said that there is a time gap between when
>> we attach a device and when the device declares itself dirty-traceable.
>>
>> However, this makes it difficult to manage dirty log tracking (I listed
>> two bugs). If vfio devices could declare themselves dirty-traceable when
>> attached to the container, the logic of dirty tracking would be much
>> clearer ;-).
> 
> There's only one bug above afaict, which should be easily fixed.  I
> think if you actually dig into the problem of a device declaring
> themselves as dirty tracking capable, when the IOMMU backend works on
> group, not devices, and it's groups that are attached to containers,
> you might see that the logic is not so clear.  I also don't see this as
> a significant issue, you're focusing on a niche scenario where a device
> is hot-added to a VM, which is immediately migrated before the device
> identifies itself by pinning pages.  In that scenario the iommu dirty
> scope would be overly broad and we'd indicate all pages are dirty.
> This errors on the side of reporting too much is dirty, which still
> provides a correct result to the user.  The more common scenario would
> be migration of a "long" running, stable VM, where drivers are active
> and devices have already pinned pages if they intend to.
>
OK ;-)

Now I 

[PATCH v2 2/2] vfio/iommu_type1: Sanity check pfn_list when remove vfio_dma

2021-01-15 Thread Keqian Zhu
vfio_sanity_check_pfn_list() is used to check whether the pfn_list of a
vfio_dma is empty when removing the external domain, so it makes the
wrong assumption that only the external domain will add pfns to a dma
pfn_list.

Now we apply this check when removing a specific vfio_dma and keep just
the notifier check for the external domain.

Fixes: a54eb55045ae ("vfio iommu type1: Add support for mediated devices")
Signed-off-by: Keqian Zhu 
---
 drivers/vfio/vfio_iommu_type1.c | 24 +---
 1 file changed, 5 insertions(+), 19 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 4e82b9a3440f..a9bc15e84a4e 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -958,6 +958,7 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
 
 static void vfio_remove_dma(struct vfio_iommu *iommu, struct vfio_dma *dma)
 {
+	WARN_ON(!RB_EMPTY_ROOT(&dma->pfn_list));
vfio_unmap_unpin(iommu, dma, true);
vfio_unlink_dma(iommu, dma);
put_task_struct(dma->task);
@@ -2251,23 +2252,6 @@ static void vfio_iommu_unmap_unpin_reaccount(struct vfio_iommu *iommu)
}
 }
 
-static void vfio_sanity_check_pfn_list(struct vfio_iommu *iommu)
-{
-   struct rb_node *n;
-
-	n = rb_first(&iommu->dma_list);
-   for (; n; n = rb_next(n)) {
-   struct vfio_dma *dma;
-
-   dma = rb_entry(n, struct vfio_dma, node);
-
-	if (WARN_ON(!RB_EMPTY_ROOT(&dma->pfn_list)))
-   break;
-   }
-   /* mdev vendor driver must unregister notifier */
-   WARN_ON(iommu->notifier.head);
-}
-
 /*
  * Called when a domain is removed in detach. It is possible that
  * the removed domain decided the iova aperture window. Modify the
@@ -2367,7 +2351,8 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
kfree(group);
 
 	if (list_empty(&iommu->external_domain->group_list)) {
-   vfio_sanity_check_pfn_list(iommu);
+			/* mdev vendor driver must unregister notifier */
+   WARN_ON(iommu->notifier.head);
 
if (!IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu))
vfio_iommu_unmap_unpin_all(iommu);
@@ -2491,7 +2476,8 @@ static void vfio_iommu_type1_release(void *iommu_data)
 
if (iommu->external_domain) {
vfio_release_domain(iommu->external_domain, true);
-   vfio_sanity_check_pfn_list(iommu);
+   /* mdev vendor driver must unregister notifier */
+   WARN_ON(iommu->notifier.head);
kfree(iommu->external_domain);
}
 
-- 
2.19.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 1/2] vfio/iommu_type1: Populate full dirty when detach non-pinned group

2021-01-15 Thread Keqian Zhu
If a group with non-pinned-page dirty scope is detached with dirty
logging enabled, we should fully populate the dirty bitmaps at the
time it's removed since we don't know the extent of its previous DMA,
nor will the group be present to trigger the full bitmap when the user
retrieves the dirty bitmap.

Fixes: d6a4c185660c ("vfio iommu: Implementation of ioctl for dirty pages tracking")
Suggested-by: Alex Williamson 
Signed-off-by: Keqian Zhu 
---
 drivers/vfio/vfio_iommu_type1.c | 18 +-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 0b4dedaa9128..4e82b9a3440f 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -236,6 +236,19 @@ static void vfio_dma_populate_bitmap(struct vfio_dma *dma, size_t pgsize)
 	}
 }
 
+static void vfio_iommu_populate_bitmap_full(struct vfio_iommu *iommu)
+{
+   struct rb_node *n;
+   unsigned long pgshift = __ffs(iommu->pgsize_bitmap);
+
+	for (n = rb_first(&iommu->dma_list); n; n = rb_next(n)) {
+   struct vfio_dma *dma = rb_entry(n, struct vfio_dma, node);
+
+   if (dma->iommu_mapped)
+   bitmap_set(dma->bitmap, 0, dma->size >> pgshift);
+   }
+}
+
 static int vfio_dma_bitmap_alloc_all(struct vfio_iommu *iommu, size_t pgsize)
 {
struct rb_node *n;
@@ -2415,8 +2428,11 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 * Removal of a group without dirty tracking may allow the iommu scope
 * to be promoted.
 */
-   if (update_dirty_scope)
+   if (update_dirty_scope) {
update_pinned_page_dirty_scope(iommu);
+   if (iommu->dirty_page_tracking)
+   vfio_iommu_populate_bitmap_full(iommu);
+   }
 	mutex_unlock(&iommu->lock);
 }
 
-- 
2.19.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 0/2] vfio/iommu_type1: some fixes

2021-01-15 Thread Keqian Zhu
changelog:

v2:
 - Address suggestions from Alex.
 - Remove unnecessary patches.

Keqian Zhu (2):
  vfio/iommu_type1: Populate full dirty when detach non-pinned group
  vfio/iommu_type1: Sanity check pfn_list when remove vfio_dma

 drivers/vfio/vfio_iommu_type1.c | 42 +
 1 file changed, 22 insertions(+), 20 deletions(-)

-- 
2.19.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 1/2] dt-bindings: arm-smmu: Add sm8350 compatible string

2021-01-15 Thread Vinod Koul
Add compatible string for sm8350 iommu to documentation.

Signed-off-by: Vinod Koul 
---
 Documentation/devicetree/bindings/iommu/arm,smmu.yaml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
index 3b63f2ae24db..161a5d389808 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
@@ -37,6 +37,7 @@ properties:
   - qcom,sdm845-smmu-500
   - qcom,sm8150-smmu-500
   - qcom,sm8250-smmu-500
+  - qcom,sm8350-smmu-500
   - const: arm,mmu-500
   - description: Qcom Adreno GPUs implementing "arm,smmu-v2"
 items:
-- 
2.26.2

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 2/2] iommu: arm-smmu-impl: Add SM8350 qcom iommu implementation

2021-01-15 Thread Vinod Koul
Add the SM8350 qcom iommu implementation to the qcom_smmu_impl_of_match
table, which brings in iommu support for the SM8350 SoC.

Signed-off-by: Vinod Koul 
---
 drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
index 5dff7ffbef11..8044a9bfca66 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
@@ -327,6 +327,7 @@ static const struct of_device_id __maybe_unused qcom_smmu_impl_of_match[] = {
{ .compatible = "qcom,sdm845-smmu-500" },
{ .compatible = "qcom,sm8150-smmu-500" },
{ .compatible = "qcom,sm8250-smmu-500" },
+   { .compatible = "qcom,sm8350-smmu-500" },
{ }
 };
 
-- 
2.26.2

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [EXTERNAL] Re: amdgpu error whenever IOMMU is enabled

2021-01-15 Thread jroe...@suse.de
Hi Edgar,

On Mon, Nov 23, 2020 at 06:41:18AM +0000, Merger, Edgar [AUTOSOL/MAS/AUGS]
wrote:
> Just wanted to follow-up on that topic.
> Is that quirk already put into upstream kernel?

Sorry for the late reply, I had to take an extended sick leave. I will
take care of sending this fix upstream next week.

Regards,

Joerg

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu