Re: [PATCH 0/5] iommu/amd: fixes for suspend/resume

2022-01-26 Thread Mike Lothian
On Wed, 26 Jan 2022 at 10:12, Maxim Levitsky  wrote:
>
> Great, your system does seem to support GA log
> (but a patch to check for it, rather than blindly assuming that it is
> supported, is something that should be done).
>
> So could you bump the LOOP_TIMEOUT by 10x or so and see if the problem
> goes away?
>
> (that code should be rewritten to a time-based wait rather than blindly
> looping like that; I can also prepare a patch for that as well).
>
> Best regards,
> Maxim Levitsky
>

Hi

I've done quite a few restarts with the LOOP_TIMEOUT increased and
I've not seen the issue since

Cheers

Mike


Re: [PATCH v2 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit

2022-01-26 Thread Thomas Gleixner
On Wed, Jan 26 2022 at 09:36, Fenghua Yu wrote:
> On Wed, Jan 26, 2022 at 03:23:42PM +0100, Thomas Gleixner wrote:
>> On Tue, Jan 25 2022 at 07:18, Fenghua Yu wrote:
>> While looking at ioasid_put() usage I tripped over the following UAF
>> issue:
>> 
>> --- a/drivers/iommu/intel/iommu.c
>> +++ b/drivers/iommu/intel/iommu.c
>> @@ -4817,8 +4817,10 @@ static int aux_domain_add_dev(struct dma
>>  auxiliary_unlink_device(domain, dev);
>>  link_failed:
>>  spin_unlock_irqrestore(&device_domain_lock, flags);
>> -if (list_empty(&domain->subdevices) && domain->default_pasid > 0)
>> +if (list_empty(&domain->subdevices) && domain->default_pasid > 0) {
>>  ioasid_put(domain->default_pasid);
>> +domain->default_pasid = INVALID_IOASID;
>> +}
>>  
>>  return ret;
>>  }
>> @@ -4847,8 +4849,10 @@ static void aux_domain_remove_dev(struct
>>  
>>  spin_unlock_irqrestore(&device_domain_lock, flags);
>>  
>> -if (list_empty(&domain->subdevices) && domain->default_pasid > 0)
>> +if (list_empty(&domain->subdevices) && domain->default_pasid > 0) {
>>  ioasid_put(domain->default_pasid);
>> +domain->default_pasid = INVALID_IOASID;
>> +}
>>  }
>>  
>>  static int prepare_domain_attach_device(struct iommu_domain *domain,
>
> The above patch fixes an existing issue. I will put it in a separate patch,
> right?

Correct.

> It cannot be applied cleanly to the upstream tree. Do you want me to base
> the above patch (and the whole patch set) on the upstream tree or a specific
> tip branch?

Against Linus tree please so that the bugfix applies.

> I will fold the following patch into patch #5. The patch #11 (the doc patch)
> also needs to remove one paragraph talking about refcount.
>
> So I will send the whole patch set with the following changes:
> 1. One new bug fix patch (the above patch)
> 2. Updated patch #5 (with the following patch folded)
> 3. Updated patch #11 (removing refcount description)

Looks good.

Thanks,

tglx


Re: [PATCH] iommu/iova: Separate out rcache init

2022-01-26 Thread John Garry via iommu

Hi Robin,



Signed-off-by: John Garry

Mangled patch? (no "---" separator here)


hmm... not sure. As an experiment, I just downloaded this patch from 
lore.kernel.org and it applies ok.




Overall this looks great, just a few comments further down...



...


+}
+EXPORT_SYMBOL_GPL(iova_domain_init_rcaches);
+
+void iova_domain_free_rcaches(struct iova_domain *iovad)
+{
+   cpuhp_state_remove_instance_nocalls(CPUHP_IOMMU_IOVA_DEAD,
+   &iovad->cpuhp_dead);
+   free_iova_rcaches(iovad);
   }
+EXPORT_SYMBOL_GPL(iova_domain_free_rcaches);

I think we should continue to expect external callers to clean up with
put_iova_domain(). 


ok, fine, makes sense


If they aren't doing that already they have a bug
(albeit minor), and we don't want to give the impression that it's OK to
free the caches at any point*other*  than tearing down the whole
iova_domain, since the implementation really wouldn't expect that.
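
Right, so from a caller's point of view the pattern would be roughly the
sketch below (illustrative only, not a real driver, and assuming the
iova_domain_init_rcaches() interface added by this patch):

#include <linux/iova.h>

static int example_iovad_setup(struct iova_domain *iovad, unsigned long order,
			       unsigned long base_pfn)
{
	int ret;

	init_iova_domain(iovad, 1UL << order, base_pfn);

	/* rcache init is now explicit and checked */
	ret = iova_domain_init_rcaches(iovad);
	if (ret)
		return ret;

	/* ... alloc_iova_fast()/free_iova_fast() while the domain is in use ... */

	/* Teardown: no separate rcache free for callers,
	 * put_iova_domain() handles it internally. */
	put_iova_domain(iovad);
	return 0;
}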


   /*
* Try inserting IOVA range starting with 'iova_pfn' into 'rcache', and
@@ -831,7 +872,7 @@ static unsigned long iova_rcache_get(struct iova_domain 
*iovad,
   {
unsigned int log_size = order_base_2(size);
   
-	if (log_size >= IOVA_RANGE_CACHE_MAX_SIZE)

+   if (log_size >= IOVA_RANGE_CACHE_MAX_SIZE || !iovad->rcaches)
return 0;
   


..


@@ -102,6 +92,8 @@ struct iova *reserve_iova(struct iova_domain *iovad, 
unsigned long pfn_lo,
unsigned long pfn_hi);
   void init_iova_domain(struct iova_domain *iovad, unsigned long granule,
unsigned long start_pfn);
+int iova_domain_init_rcaches(struct iova_domain *iovad);
+void iova_domain_free_rcaches(struct iova_domain *iovad);

As above, I vote for just forward-declaring the free routine in iova.c
and keeping it entirely private.


ok




   struct iova *find_iova(struct iova_domain *iovad, unsigned long pfn);
   void put_iova_domain(struct iova_domain *iovad);
   #else
@@ -157,6 +149,15 @@ static inline void init_iova_domain(struct iova_domain 
*iovad,
   {
   }
   
+static inline int iova_domain_init_rcaches(struct iova_domain *iovad)

+{
+   return -ENOTSUPP;
+}
+
+static inline void iova_domain_free_rcaches(struct iova_domain *iovad)
+{
+}
+

I'd be inclined not to add stubs at all - I think it's a reasonable
assumption that anyone involved enough to care about rcaches has a hard
dependency on IOMMU_IOVA already.


So iova_domain_free_rcaches() would disappear from here as a result of 
the changes discussed above.


As for iova_domain_init_rcaches(), I was just following the IOMMU/IOVA 
coding practice - that is, always stub.



It's certainly the case today, and I'd
hardly want to encourage more users anyway.


I think that stronger deterrents would be needed :)

Anyway, I can remove it.

Thanks,
John






Re: [PATCH v2 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit

2022-01-26 Thread Fenghua Yu
Hi, Thomas,

On Wed, Jan 26, 2022 at 03:23:42PM +0100, Thomas Gleixner wrote:
> On Tue, Jan 25 2022 at 07:18, Fenghua Yu wrote:
> > On Mon, Jan 24, 2022 at 09:55:56PM +0100, Thomas Gleixner wrote:
> >  /**
> >   * ioasid_put - Release a reference to an ioasid
> >   * @ioasid: the ID to remove
> 
> which in turn makes ioasid_put() a misnomer and the whole refcounting of
> the ioasid a pointless exercise.
> 
> While looking at ioasid_put() usage I tripped over the following UAF
> issue:
> 
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -4817,8 +4817,10 @@ static int aux_domain_add_dev(struct dma
>   auxiliary_unlink_device(domain, dev);
>  link_failed:
>   spin_unlock_irqrestore(&device_domain_lock, flags);
> - if (list_empty(&domain->subdevices) && domain->default_pasid > 0)
> + if (list_empty(&domain->subdevices) && domain->default_pasid > 0) {
>   ioasid_put(domain->default_pasid);
> + domain->default_pasid = INVALID_IOASID;
> + }
>  
>   return ret;
>  }
> @@ -4847,8 +4849,10 @@ static void aux_domain_remove_dev(struct
>  
>   spin_unlock_irqrestore(&device_domain_lock, flags);
>  
> - if (list_empty(&domain->subdevices) && domain->default_pasid > 0)
> + if (list_empty(&domain->subdevices) && domain->default_pasid > 0) {
>   ioasid_put(domain->default_pasid);
> + domain->default_pasid = INVALID_IOASID;
> + }
>  }
>  
>  static int prepare_domain_attach_device(struct iommu_domain *domain,

The above patch fixes an existing issue. I will put it in a separate patch,
right?

It cannot be applied cleanly to the upstream tree. Do you want me to base
the above patch (and the whole patch set) on the upstream tree or a specific
tip branch?

I will fold the following patch into patch #5. The patch #11 (the doc patch)
also needs to remove one paragraph talking about refcount.

So I will send the whole patch set with the following changes:
1. One new bug fix patch (the above patch)
2. Updated patch #5 (with the following patch folded)
3. Updated patch #11 (removing refcount description)

Are the changes OK with you?
 
> 
> Vs. ioasid_put() I think we should fold the following:
> 
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -4818,7 +4818,7 @@ static int aux_domain_add_dev(struct dma
>  link_failed:
>   spin_unlock_irqrestore(&device_domain_lock, flags);
>   if (list_empty(&domain->subdevices) && domain->default_pasid > 0) {
> - ioasid_put(domain->default_pasid);
> + ioasid_free(domain->default_pasid);
>   domain->default_pasid = INVALID_IOASID;
>   }
>  
> @@ -4850,7 +4850,7 @@ static void aux_domain_remove_dev(struct
>   spin_unlock_irqrestore(&device_domain_lock, flags);
>  
>   if (list_empty(&domain->subdevices) && domain->default_pasid > 0) {
> - ioasid_put(domain->default_pasid);
> + ioasid_free(domain->default_pasid);
>   domain->default_pasid = INVALID_IOASID;
>   }
>  }
> --- a/drivers/iommu/ioasid.c
> +++ b/drivers/iommu/ioasid.c
> @@ -2,7 +2,7 @@
>  /*
>   * I/O Address Space ID allocator. There is one global IOASID space, split 
> into
>   * subsets. Users create a subset with DECLARE_IOASID_SET, then allocate and
> - * free IOASIDs with ioasid_alloc and ioasid_put.
> + * free IOASIDs with ioasid_alloc() and ioasid_free().
>   */
>  #include 
>  #include 
> @@ -15,7 +15,6 @@ struct ioasid_data {
>   struct ioasid_set *set;
>   void *private;
>   struct rcu_head rcu;
> - refcount_t refs;
>  };
>  
>  /*
> @@ -315,7 +314,6 @@ ioasid_t ioasid_alloc(struct ioasid_set
>  
>   data->set = set;
>   data->private = private;
> - refcount_set(&data->refs, 1);
>  
>   /*
>* Custom allocator needs allocator data to perform platform specific
> @@ -348,17 +346,11 @@ ioasid_t ioasid_alloc(struct ioasid_set
>  EXPORT_SYMBOL_GPL(ioasid_alloc);
>  
>  /**
> - * ioasid_put - Release a reference to an ioasid
> + * ioasid_free - Free an ioasid
>   * @ioasid: the ID to remove
> - *
> - * Put a reference to the IOASID, free it when the number of references 
> drops to
> - * zero.
> - *
> - * Return: %true if the IOASID was freed, %false otherwise.
>   */
> -bool ioasid_put(ioasid_t ioasid)
> +void ioasid_free(ioasid_t ioasid)
>  {
> - bool free = false;
>   struct ioasid_data *ioasid_data;
>  
>   spin_lock(&ioasid_allocator_lock);
> @@ -368,10 +360,6 @@ bool ioasid_put(ioasid_t ioasid)
>   goto exit_unlock;
>   }
>  
> - free = refcount_dec_and_test(&ioasid_data->refs);
> - if (!free)
> - goto exit_unlock;
> -
>   active_allocator->ops->free(ioasid, active_allocator->ops->pdata);
>   /* Custom allocator needs additional steps to free the xa element */
>   if (active_allocator->flags & IOASID_ALLOCATOR_CUSTOM) {
> @@ -381,9 +369,8 @@ bool ioasid_put(ioasid_t ioasid)
>  
>  exit_unlock:
>  

Re: [PATCH] iommu/iova: Separate out rcache init

2022-01-26 Thread Robin Murphy

On 2022-01-26 13:55, John Garry wrote:

Currently the rcache structures are allocated for all IOVA domains, even if
they do not use the "fast" alloc+free interface. This is wasteful of memory.

In addition, failures in init_iova_rcaches() are not handled safely, which is
less than ideal.

Make "fast" users call a separate rcache init explicitly, which includes
error checking.

Signed-off-by: John Garry 


Mangled patch? (no "---" separator here)

Overall this looks great, just a few comments further down...


diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 3a46f2cc9e5d..dd066d990809 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -525,6 +525,7 @@ static int iommu_dma_init_domain(struct iommu_domain 
*domain, dma_addr_t base,
struct iommu_dma_cookie *cookie = domain->iova_cookie;
unsigned long order, base_pfn;
struct iova_domain *iovad;
+   int ret;
  
  	if (!cookie || cookie->type != IOMMU_DMA_IOVA_COOKIE)

return -EINVAL;
@@ -559,6 +560,9 @@ static int iommu_dma_init_domain(struct iommu_domain 
*domain, dma_addr_t base,
}
  
  	init_iova_domain(iovad, 1UL << order, base_pfn);

+   ret = iova_domain_init_rcaches(iovad);
+   if (ret)
+   return ret;
  
  	/* If the FQ fails we can simply fall back to strict mode */

if (domain->type == IOMMU_DOMAIN_DMA_FQ && iommu_dma_init_fq(domain))
diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index b28c9435b898..d3adc6ea5710 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -15,13 +15,14 @@
  /* The anchor node sits above the top of the usable address space */
  #define IOVA_ANCHOR   ~0UL
  
+#define IOVA_RANGE_CACHE_MAX_SIZE 6	/* log of max cached IOVA range size (in pages) */

+
  static bool iova_rcache_insert(struct iova_domain *iovad,
   unsigned long pfn,
   unsigned long size);
  static unsigned long iova_rcache_get(struct iova_domain *iovad,
 unsigned long size,
 unsigned long limit_pfn);
-static void init_iova_rcaches(struct iova_domain *iovad);
  static void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain 
*iovad);
  static void free_iova_rcaches(struct iova_domain *iovad);
  
@@ -64,8 +65,6 @@ init_iova_domain(struct iova_domain *iovad, unsigned long granule,

iovad->anchor.pfn_lo = iovad->anchor.pfn_hi = IOVA_ANCHOR;
rb_link_node(&iovad->anchor.node, NULL, &iovad->rbroot.rb_node);
rb_insert_color(&iovad->anchor.node, &iovad->rbroot);
-   cpuhp_state_add_instance_nocalls(CPUHP_IOMMU_IOVA_DEAD, 
&iovad->cpuhp_dead);
-   init_iova_rcaches(iovad);
  }
  EXPORT_SYMBOL_GPL(init_iova_domain);
  
@@ -497,9 +496,9 @@ void put_iova_domain(struct iova_domain *iovad)

  {
struct iova *iova, *tmp;
  
-	cpuhp_state_remove_instance_nocalls(CPUHP_IOMMU_IOVA_DEAD,

-   &iovad->cpuhp_dead);
-   free_iova_rcaches(iovad);
+   if (iovad->rcaches)
+   iova_domain_free_rcaches(iovad);
+
rbtree_postorder_for_each_entry_safe(iova, tmp, &iovad->rbroot, node)
free_iova_mem(iova);
  }
@@ -608,6 +607,7 @@ EXPORT_SYMBOL_GPL(reserve_iova);
   */
  
  #define IOVA_MAG_SIZE 128

+#define MAX_GLOBAL_MAGS 32 /* magazines per bin */
  
  struct iova_magazine {

unsigned long size;
@@ -620,6 +620,13 @@ struct iova_cpu_rcache {
struct iova_magazine *prev;
  };
  
+struct iova_rcache {

+   spinlock_t lock;
+   unsigned long depot_size;
+   struct iova_magazine *depot[MAX_GLOBAL_MAGS];
+   struct iova_cpu_rcache __percpu *cpu_rcaches;
+};
+
  static struct iova_magazine *iova_magazine_alloc(gfp_t flags)
  {
return kzalloc(sizeof(struct iova_magazine), flags);
@@ -693,28 +700,62 @@ static void iova_magazine_push(struct iova_magazine *mag, 
unsigned long pfn)
mag->pfns[mag->size++] = pfn;
  }
  
-static void init_iova_rcaches(struct iova_domain *iovad)

+int iova_domain_init_rcaches(struct iova_domain *iovad)
  {
-   struct iova_cpu_rcache *cpu_rcache;
-   struct iova_rcache *rcache;
unsigned int cpu;
-   int i;
+   int i, ret;
+
+   iovad->rcaches = kcalloc(IOVA_RANGE_CACHE_MAX_SIZE,
+sizeof(struct iova_rcache),
+GFP_KERNEL);
+   if (!iovad->rcaches)
+   return -ENOMEM;
  
  	for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) {

+   struct iova_cpu_rcache *cpu_rcache;
+   struct iova_rcache *rcache;
+
rcache = &iovad->rcaches[i];
spin_lock_init(&rcache->lock);
rcache->depot_size = 0;
-   rcache->cpu_rcaches = __alloc_percpu(sizeof(*cpu_rcache), 
cache_line_size());
-   if (WARN_ON(!rcache->cpu_rcaches))
-   continue;
+ 

Re: [PATCH 0/3] Some minor SWIOTLB cleanups

2022-01-26 Thread Christoph Hellwig
Thanks,

applied to the dma-mapping for-next tree.


[PATCH 2/2] x86/hyperv: Set swiotlb_alloc_from_low_pages to false

2022-01-26 Thread Tianyu Lan
From: Tianyu Lan 

In a Hyper-V Isolation VM, the swiotlb bounce buffer may be up to 1G in
size, and there may not be enough memory from 0 to 4G for it depending on
the memory layout. Devices in an Isolation VM can use memory above 4G as
DMA memory, so set swiotlb_alloc_from_low_pages to false in Isolation VMs.

Signed-off-by: Tianyu Lan 
---
 arch/x86/kernel/cpu/mshyperv.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 5a99f993e639..80a0423ac75d 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -343,6 +343,7 @@ static void __init ms_hyperv_init_platform(void)
 * use swiotlb bounce buffer for dma transaction.
 */
swiotlb_force = SWIOTLB_FORCE;
+   swiotlb_alloc_from_low_pages = false;
 #endif
}
 
-- 
2.25.1



[PATCH 1/2] Swiotlb: Add swiotlb_alloc_from_low_pages switch

2022-01-26 Thread Tianyu Lan
From: Tianyu Lan 

Hyper-V Isolation VMs and AMD SEV VMs use the swiotlb bounce buffer to
share memory with the hypervisor. Currently the swiotlb bounce buffer is
only allocated from 0 to ARCH_LOW_ADDRESS_LIMIT, which defaults to
0xUL. Isolation VMs and AMD SEV VMs may need up to a 1G bounce buffer.
That allocation will fail when there is not enough contiguous memory in
the 0 to 4G address space, and devices may also use memory above the 4G
address space as DMA memory. Expose swiotlb_alloc_from_low_pages so that a
platform may set it to false when it is not necessary to limit the bounce
buffer to memory below 4G.

Signed-off-by: Tianyu Lan 
---
 include/linux/swiotlb.h |  1 +
 kernel/dma/swiotlb.c| 13 +++--
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index f6c3638255d5..55c178e8eee0 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -191,5 +191,6 @@ static inline bool is_swiotlb_for_alloc(struct device *dev)
 #endif /* CONFIG_DMA_RESTRICTED_POOL */
 
 extern phys_addr_t swiotlb_unencrypted_base;
+extern bool swiotlb_alloc_from_low_pages;
 
 #endif /* __LINUX_SWIOTLB_H */
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index f1e7ea160b43..159fef80f3db 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -73,6 +73,12 @@ enum swiotlb_force swiotlb_force;
 
 struct io_tlb_mem io_tlb_default_mem;
 
+/*
+ * Get IO TLB memory from the low pages if swiotlb_alloc_from_low_pages
+ * is set.
+ */
+bool swiotlb_alloc_from_low_pages = true;
+
 phys_addr_t swiotlb_unencrypted_base;
 
 /*
@@ -284,8 +290,11 @@ swiotlb_init(int verbose)
if (swiotlb_force == SWIOTLB_NO_FORCE)
return;
 
-   /* Get IO TLB memory from the low pages */
-   tlb = memblock_alloc_low(bytes, PAGE_SIZE);
+   if (swiotlb_alloc_from_low_pages)
+   tlb = memblock_alloc_low(bytes, PAGE_SIZE);
+   else
+   tlb = memblock_alloc(bytes, PAGE_SIZE);
+
if (!tlb)
goto fail;
if (swiotlb_init_with_tbl(tlb, default_nslabs, verbose))
-- 
2.25.1



[PATCH 0/2] x86/hyperv/Swiotlb: Add swiotlb_alloc_from_low_pages switch

2022-01-26 Thread Tianyu Lan
From: Tianyu Lan 

A Hyper-V Isolation VM may fail to allocate the swiotlb bounce buffer
because, in some cases, there is not enough contiguous memory from 0 to 4G.
The current swiotlb code allocates the bounce buffer in low-end memory.
This patchset adds a switch "swiotlb_alloc_from_low_pages", which is set
to true by default; a platform may clear it if necessary. Devices in a
Hyper-V Isolation VM may use memory above 4G as DMA memory, so Hyper-V
clears the switch to avoid depending on enough contiguous memory being
available in the low-end address space.

Tianyu Lan (2):
  Swiotlb: Add swiotlb_alloc_from_low_pages switch
  x86/hyperv: Set swiotlb_alloc_from_low_pages to false

 arch/x86/kernel/cpu/mshyperv.c |  1 +
 include/linux/swiotlb.h|  1 +
 kernel/dma/swiotlb.c   | 13 +++--
 3 files changed, 13 insertions(+), 2 deletions(-)

-- 
2.25.1



Re: [PATCH] swiotlb: Do not zero buffer in set_memory_decrypted()

2022-01-26 Thread Christoph Hellwig
Thanks,

applied to the dma-mapping for-next tree.


Re: [PATCH v2 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit

2022-01-26 Thread Thomas Gleixner
On Tue, Jan 25 2022 at 07:18, Fenghua Yu wrote:
> On Mon, Jan 24, 2022 at 09:55:56PM +0100, Thomas Gleixner wrote:
>  /**
>   * ioasid_put - Release a reference to an ioasid
>   * @ioasid: the ID to remove

which in turn makes ioasid_put() a misnomer and the whole refcounting of
the ioasid a pointless exercise.

While looking at ioasid_put() usage I tripped over the following UAF
issue:

--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4817,8 +4817,10 @@ static int aux_domain_add_dev(struct dma
auxiliary_unlink_device(domain, dev);
 link_failed:
spin_unlock_irqrestore(&device_domain_lock, flags);
-   if (list_empty(&domain->subdevices) && domain->default_pasid > 0)
+   if (list_empty(&domain->subdevices) && domain->default_pasid > 0) {
ioasid_put(domain->default_pasid);
+   domain->default_pasid = INVALID_IOASID;
+   }
 
return ret;
 }
@@ -4847,8 +4849,10 @@ static void aux_domain_remove_dev(struct
 
spin_unlock_irqrestore(&device_domain_lock, flags);
 
-   if (list_empty(&domain->subdevices) && domain->default_pasid > 0)
+   if (list_empty(&domain->subdevices) && domain->default_pasid > 0) {
ioasid_put(domain->default_pasid);
+   domain->default_pasid = INVALID_IOASID;
+   }
 }
 
 static int prepare_domain_attach_device(struct iommu_domain *domain,

Vs. ioasid_put() I think we should fold the following:

--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4818,7 +4818,7 @@ static int aux_domain_add_dev(struct dma
 link_failed:
spin_unlock_irqrestore(&device_domain_lock, flags);
if (list_empty(&domain->subdevices) && domain->default_pasid > 0) {
-   ioasid_put(domain->default_pasid);
+   ioasid_free(domain->default_pasid);
domain->default_pasid = INVALID_IOASID;
}
 
@@ -4850,7 +4850,7 @@ static void aux_domain_remove_dev(struct
spin_unlock_irqrestore(&device_domain_lock, flags);
 
if (list_empty(&domain->subdevices) && domain->default_pasid > 0) {
-   ioasid_put(domain->default_pasid);
+   ioasid_free(domain->default_pasid);
domain->default_pasid = INVALID_IOASID;
}
 }
--- a/drivers/iommu/ioasid.c
+++ b/drivers/iommu/ioasid.c
@@ -2,7 +2,7 @@
 /*
  * I/O Address Space ID allocator. There is one global IOASID space, split into
  * subsets. Users create a subset with DECLARE_IOASID_SET, then allocate and
- * free IOASIDs with ioasid_alloc and ioasid_put.
+ * free IOASIDs with ioasid_alloc() and ioasid_free().
  */
 #include 
 #include 
@@ -15,7 +15,6 @@ struct ioasid_data {
struct ioasid_set *set;
void *private;
struct rcu_head rcu;
-   refcount_t refs;
 };
 
 /*
@@ -315,7 +314,6 @@ ioasid_t ioasid_alloc(struct ioasid_set
 
data->set = set;
data->private = private;
-   refcount_set(&data->refs, 1);
 
/*
 * Custom allocator needs allocator data to perform platform specific
@@ -348,17 +346,11 @@ ioasid_t ioasid_alloc(struct ioasid_set
 EXPORT_SYMBOL_GPL(ioasid_alloc);
 
 /**
- * ioasid_put - Release a reference to an ioasid
+ * ioasid_free - Free an ioasid
  * @ioasid: the ID to remove
- *
- * Put a reference to the IOASID, free it when the number of references drops 
to
- * zero.
- *
- * Return: %true if the IOASID was freed, %false otherwise.
  */
-bool ioasid_put(ioasid_t ioasid)
+void ioasid_free(ioasid_t ioasid)
 {
-   bool free = false;
struct ioasid_data *ioasid_data;
 
spin_lock(&ioasid_allocator_lock);
@@ -368,10 +360,6 @@ bool ioasid_put(ioasid_t ioasid)
goto exit_unlock;
}
 
-   free = refcount_dec_and_test(&ioasid_data->refs);
-   if (!free)
-   goto exit_unlock;
-
active_allocator->ops->free(ioasid, active_allocator->ops->pdata);
/* Custom allocator needs additional steps to free the xa element */
if (active_allocator->flags & IOASID_ALLOCATOR_CUSTOM) {
@@ -381,9 +369,8 @@ bool ioasid_put(ioasid_t ioasid)
 
 exit_unlock:
spin_unlock(&ioasid_allocator_lock);
-   return free;
 }
-EXPORT_SYMBOL_GPL(ioasid_put);
+EXPORT_SYMBOL_GPL(ioasid_free);
 
 /**
  * ioasid_find - Find IOASID data
--- a/include/linux/ioasid.h
+++ b/include/linux/ioasid.h
@@ -34,7 +34,7 @@ struct ioasid_allocator_ops {
 #if IS_ENABLED(CONFIG_IOASID)
 ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min, ioasid_t max,
  void *private);
-bool ioasid_put(ioasid_t ioasid);
+void ioasid_free(ioasid_t ioasid);
 void *ioasid_find(struct ioasid_set *set, ioasid_t ioasid,
  bool (*getter)(void *));
 int ioasid_register_allocator(struct ioasid_allocator_ops *allocator);
@@ -52,10 +52,7 @@ static inline ioasid_t ioasid_alloc(stru
return INVALID_IOASID;
 }
 
-static inline bool ioasid_put(ioasid_t ioasid)
-{
-   return false;
-}
+static inline void ioa

[PATCH] iommu/iova: Separate out rcache init

2022-01-26 Thread John Garry via iommu
Currently the rcache structures are allocated for all IOVA domains, even if
they do not use the "fast" alloc+free interface. This is wasteful of memory.

In addition, failures in init_iova_rcaches() are not handled safely, which is
less than ideal.

Make "fast" users call a separate rcache init explicitly, which includes
error checking.

Signed-off-by: John Garry 

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 3a46f2cc9e5d..dd066d990809 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -525,6 +525,7 @@ static int iommu_dma_init_domain(struct iommu_domain 
*domain, dma_addr_t base,
struct iommu_dma_cookie *cookie = domain->iova_cookie;
unsigned long order, base_pfn;
struct iova_domain *iovad;
+   int ret;
 
if (!cookie || cookie->type != IOMMU_DMA_IOVA_COOKIE)
return -EINVAL;
@@ -559,6 +560,9 @@ static int iommu_dma_init_domain(struct iommu_domain 
*domain, dma_addr_t base,
}
 
init_iova_domain(iovad, 1UL << order, base_pfn);
+   ret = iova_domain_init_rcaches(iovad);
+   if (ret)
+   return ret;
 
/* If the FQ fails we can simply fall back to strict mode */
if (domain->type == IOMMU_DOMAIN_DMA_FQ && iommu_dma_init_fq(domain))
diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index b28c9435b898..d3adc6ea5710 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -15,13 +15,14 @@
 /* The anchor node sits above the top of the usable address space */
 #define IOVA_ANCHOR~0UL
 
+#define IOVA_RANGE_CACHE_MAX_SIZE 6/* log of max cached IOVA range size 
(in pages) */
+
 static bool iova_rcache_insert(struct iova_domain *iovad,
   unsigned long pfn,
   unsigned long size);
 static unsigned long iova_rcache_get(struct iova_domain *iovad,
 unsigned long size,
 unsigned long limit_pfn);
-static void init_iova_rcaches(struct iova_domain *iovad);
 static void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain *iovad);
 static void free_iova_rcaches(struct iova_domain *iovad);
 
@@ -64,8 +65,6 @@ init_iova_domain(struct iova_domain *iovad, unsigned long 
granule,
iovad->anchor.pfn_lo = iovad->anchor.pfn_hi = IOVA_ANCHOR;
rb_link_node(&iovad->anchor.node, NULL, &iovad->rbroot.rb_node);
rb_insert_color(&iovad->anchor.node, &iovad->rbroot);
-   cpuhp_state_add_instance_nocalls(CPUHP_IOMMU_IOVA_DEAD, 
&iovad->cpuhp_dead);
-   init_iova_rcaches(iovad);
 }
 EXPORT_SYMBOL_GPL(init_iova_domain);
 
@@ -497,9 +496,9 @@ void put_iova_domain(struct iova_domain *iovad)
 {
struct iova *iova, *tmp;
 
-   cpuhp_state_remove_instance_nocalls(CPUHP_IOMMU_IOVA_DEAD,
-   &iovad->cpuhp_dead);
-   free_iova_rcaches(iovad);
+   if (iovad->rcaches)
+   iova_domain_free_rcaches(iovad);
+
rbtree_postorder_for_each_entry_safe(iova, tmp, &iovad->rbroot, node)
free_iova_mem(iova);
 }
@@ -608,6 +607,7 @@ EXPORT_SYMBOL_GPL(reserve_iova);
  */
 
 #define IOVA_MAG_SIZE 128
+#define MAX_GLOBAL_MAGS 32 /* magazines per bin */
 
 struct iova_magazine {
unsigned long size;
@@ -620,6 +620,13 @@ struct iova_cpu_rcache {
struct iova_magazine *prev;
 };
 
+struct iova_rcache {
+   spinlock_t lock;
+   unsigned long depot_size;
+   struct iova_magazine *depot[MAX_GLOBAL_MAGS];
+   struct iova_cpu_rcache __percpu *cpu_rcaches;
+};
+
 static struct iova_magazine *iova_magazine_alloc(gfp_t flags)
 {
return kzalloc(sizeof(struct iova_magazine), flags);
@@ -693,28 +700,62 @@ static void iova_magazine_push(struct iova_magazine *mag, 
unsigned long pfn)
mag->pfns[mag->size++] = pfn;
 }
 
-static void init_iova_rcaches(struct iova_domain *iovad)
+int iova_domain_init_rcaches(struct iova_domain *iovad)
 {
-   struct iova_cpu_rcache *cpu_rcache;
-   struct iova_rcache *rcache;
unsigned int cpu;
-   int i;
+   int i, ret;
+
+   iovad->rcaches = kcalloc(IOVA_RANGE_CACHE_MAX_SIZE,
+sizeof(struct iova_rcache),
+GFP_KERNEL);
+   if (!iovad->rcaches)
+   return -ENOMEM;
 
for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) {
+   struct iova_cpu_rcache *cpu_rcache;
+   struct iova_rcache *rcache;
+
rcache = &iovad->rcaches[i];
spin_lock_init(&rcache->lock);
rcache->depot_size = 0;
-   rcache->cpu_rcaches = __alloc_percpu(sizeof(*cpu_rcache), 
cache_line_size());
-   if (WARN_ON(!rcache->cpu_rcaches))
-   continue;
+   rcache->cpu_rcaches = __alloc_percpu(sizeof(*cpu_rcache),
+cache_line_size());
+   if

Re: [PATCH 0/7] iommu cleanup and refactoring

2022-01-26 Thread Robin Murphy

On 2022-01-26 13:27, Jason Gunthorpe via iommu wrote:

On Wed, Jan 26, 2022 at 09:51:36AM +0800, Lu Baolu wrote:


they are fundamentally different things in their own right, and the ideal
API should give us the orthogonality to also bind a device to an SVA domain
without PASID (e.g. for KVM stage 2, or userspace assignment of simpler
fault/stall-tolerant devices), or attach PASIDs to regular iommu_domains.


Yes, these are orthogonal things. An iommu driver that supports PASID
ideally should support PASID-enabled attach/detach for every
iommu_domain type it supports.

SVA should not be entangled with PASID beyond that SVA is often used
with PASID - an SVA iommu_domain should be fully usable with a RID too.


The prototype of PASID enabled attach/detach ops could look like:

int (*attach_dev_pasid)(struct iommu_domain *domain,
struct device *dev, ioasid_t id);
void (*detach_dev_pasid)(struct iommu_domain *domain,
 struct device *dev, ioasid_t id);


It seems reasonable and straightforward to me..

These would be domain ops?
  

But the iommu driver should implement different callbacks for

1) attaching an IOMMU DMA domain to a PASID on device;
- kernel DMA with PASID
- mdev-like device passthrough
- etc.
2) attaching a CPU-shared domain to a PASID on device;
- SVA
- guest PASID
- etc.


But this you mean domain->ops would be different? Seems fine, up to
the driver.


Indeed, it makes little practical difference whether the driver provides 
distinct sets of ops or just common callbacks which switch on the domain 
type internally. I expect that's largely going to come down to personal 
preference and how much internal driver code is common between the paths.



I'd hope to see some flow like:

  domain = device->bus->iommu_ops->alloc_sva_domain(dev)
  domain->ops->attach_dev_pasid(domain, dev, current->pasid)

To duplicate the current SVA APIs


+1 - I'd concur that attach/detach belong as domain ops in general. 
There's quite a nice logical split forming here where domain ops are the 
ones that make sense for external subsystems and endpoint drivers to use 
too, while device ops, with the sole exception of domain_alloc, are 
IOMMU API internals. By that reasoning, attach/bind/etc. are firmly in 
the former category.
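
To make that split concrete, the per-domain side might end up looking
something like the sketch below (purely illustrative, hypothetical names
rather than the proposed code):

#include <linux/iommu.h>
#include <linux/ioasid.h>

/* Hypothetical sketch: per-domain ops carry attach/detach, including the
 * PASID-aware variants discussed above; domain_alloc stays a device/bus op. */
struct example_iommu_domain_ops {
	int  (*attach_dev)(struct iommu_domain *domain, struct device *dev);
	void (*detach_dev)(struct iommu_domain *domain, struct device *dev);
	int  (*attach_dev_pasid)(struct iommu_domain *domain,
				 struct device *dev, ioasid_t pasid);
	void (*detach_dev_pasid)(struct iommu_domain *domain,
				 struct device *dev, ioasid_t pasid);
	int  (*map)(struct iommu_domain *domain, unsigned long iova,
		    phys_addr_t paddr, size_t size, int prot, gfp_t gfp);
	size_t (*unmap)(struct iommu_domain *domain, unsigned long iova,
			size_t size, struct iommu_iotlb_gather *gather);
};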


Thanks,
Robin.


Re: [PATCH 0/7] iommu cleanup and refactoring

2022-01-26 Thread Jason Gunthorpe via iommu
On Wed, Jan 26, 2022 at 09:51:36AM +0800, Lu Baolu wrote:

> > > they are fundamentally different things in their own right, and the ideal
> > > API should give us the orthogonality to also bind a device to an SVA 
> > > domain
> > > without PASID (e.g. for KVM stage 2, or userspace assignment of simpler
> > > fault/stall-tolerant devices), or attach PASIDs to regular iommu_domains.
> > 
> > Yes, these are orthogonal things. An iommu driver that supports PASID
> > ideally should support PASID-enabled attach/detach for every
> > iommu_domain type it supports.
> > 
> > SVA should not be entangled with PASID beyond that SVA is often used
> > with PASID - an SVA iommu_domain should be fully usable with a RID too.
> 
> The prototype of PASID enabled attach/detach ops could look like:
> 
>int (*attach_dev_pasid)(struct iommu_domain *domain,
>struct device *dev, ioasid_t id);
>void (*detach_dev_pasid)(struct iommu_domain *domain,
> struct device *dev, ioasid_t id);

It seems reasonable and straightforward to me..

These would be domain ops?
 
> But the iommu driver should implement different callbacks for
> 
> 1) attaching an IOMMU DMA domain to a PASID on device;
>- kernel DMA with PASID
>- mdev-like device passthrough
>- etc.
> 2) attaching a CPU-shared domain to a PASID on device;
>- SVA
>- guest PASID
>- etc.

But this you mean domain->ops would be different? Seems fine, up to
the driver.

I'd hope to see some flow like:

 domain = device->bus->iommu_ops->alloc_sva_domain(dev)
 domain->ops->attach_dev_pasid(domain, dev, current->pasid)

To duplicate the current SVA APIs

Jason


Re: [RFC][DISCUSSION] dma-mapping: allocating noncoherent buffer without mapping

2022-01-26 Thread Robin Murphy

On 2022-01-26 11:54, Hyeonggon Yoo wrote:

Last month we discussed that drivers which use kmalloc(GFP_DMA) for
noncoherent mappings should be converted to use the DMA API [1]. A simple
grep for GFP_DMA shows that many drivers are mistakenly using the
GFP_DMA flag.

So our purpose was to make the DMA API choose the right zone depending on
the device's DMA mask. Baoquan and I are trying to make drivers
use dma_alloc_noncoherent() when allocating the buffer.

But it's not simple for some drivers; there is a gap between
dma_alloc_noncoherent() and the previous convention (allocating a buffer
from the buddy or slab allocator and mapping it when needed).

For example, some drivers allocate a buffer and reuse it, just mapping
and unmapping it whenever needed. And some drivers do not even map the
buffer (some drivers do not use the DMA API at all).

So I think we need to implement a version of dma_alloc_noncoherent()
that does not map the buffer.


This doesn't make sense to me. The point of dma_alloc_* is you get back 
a dedicated DMA buffer which can then be used at any time. In the 
noncoherent case you have to put in explicit dma_sync_*() calls around 
accesses when the CPU or device is expected to have updated the buffer 
contents, but it's still fundamentally the same paradigm.
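
For illustration only (not taken from any driver in this thread), that
pattern looks roughly like the following sketch; the device pointer, size
and direction are whatever the driver actually needs:

#include <linux/device.h>
#include <linux/dma-mapping.h>
#include <linux/errno.h>

struct example_ctx {
	void *buf;
	dma_addr_t buf_dma;
	size_t size;
};

static int example_alloc(struct device *dev, struct example_ctx *ctx)
{
	/* Dedicated buffer, allocated once and kept for the device's lifetime */
	ctx->buf = dma_alloc_noncoherent(dev, ctx->size, &ctx->buf_dma,
					 DMA_BIDIRECTIONAL, GFP_KERNEL);
	return ctx->buf ? 0 : -ENOMEM;
}

static void example_transfer(struct device *dev, struct example_ctx *ctx)
{
	/* CPU fills ctx->buf ... then hands ownership to the device */
	dma_sync_single_for_device(dev, ctx->buf_dma, ctx->size,
				   DMA_BIDIRECTIONAL);
	/* ... device DMA happens here ... */
	dma_sync_single_for_cpu(dev, ctx->buf_dma, ctx->size,
				DMA_BIDIRECTIONAL);
	/* CPU may now read the device's updates */
}

static void example_free(struct device *dev, struct example_ctx *ctx)
{
	dma_free_noncoherent(dev, ctx->size, ctx->buf, ctx->buf_dma,
			     DMA_BIDIRECTIONAL);
}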


If you want to update drivers from using streaming mappings on 
specially-allocated pages to using proper dedicated DMA buffers, then 
update the logic in those drivers to use dedicated DMA buffers 
appropriately. Adding an API to allocate a DMA buffer which isn't 
actually a DMA buffer so we can "update" drivers from misusing streaming 
mappings to still misusing streaming mappings is nonsense.


Robin.


I think implementing a helper that internally calls
__dma_direct_alloc_pages() will be okay.

As I'm not an expert in this area, I want to hear
others' opinions.

[1] https://lkml.org/lkml/2021/12/13/1121

Thanks,
Hyeonggon.



[PATCH] dt-bindings: iommu: renesas, ipmmu-vmsa: Reformat renesas, ipmmu-main description

2022-01-26 Thread Geert Uytterhoeven
Remove trailing whitespace and break overly long lines.

Signed-off-by: Geert Uytterhoeven 
---
 .../devicetree/bindings/iommu/renesas,ipmmu-vmsa.yaml   | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.yaml 
b/Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.yaml
index 507853fcc746eea1..67da53e8d4d10aa0 100644
--- a/Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.yaml
+++ b/Documentation/devicetree/bindings/iommu/renesas,ipmmu-vmsa.yaml
@@ -69,9 +69,9 @@ properties:
 items:
   - items:
   - description: phandle to main IPMMU
-  - description: the interrupt bit number associated with the 
particular 
-  cache IPMMU device. The interrupt bit number needs to match the 
main 
-  IPMMU IMSSTR register. Only used by cache IPMMU instances.
+  - description: the interrupt bit number associated with the 
particular
+  cache IPMMU device. The interrupt bit number needs to match the
+  main IPMMU IMSSTR register. Only used by cache IPMMU instances.
 description:
   Reference to the main IPMMU phandle plus 1 cell. The cell is
   the interrupt bit number associated with the particular cache IPMMU
-- 
2.25.1



[RFC][DISCUSSION] dma-mapping: allocating noncoherent buffer without mapping

2022-01-26 Thread Hyeonggon Yoo
Last month we discussed that drivers which use kmalloc(GFP_DMA) for
noncoherent mappings should be converted to use the DMA API [1]. A simple
grep for GFP_DMA shows that many drivers are mistakenly using the
GFP_DMA flag.

So our purpose was to make the DMA API choose the right zone depending on
the device's DMA mask. Baoquan and I are trying to make drivers
use dma_alloc_noncoherent() when allocating the buffer.

But it's not simple for some drivers; there is a gap between
dma_alloc_noncoherent() and the previous convention (allocating a buffer
from the buddy or slab allocator and mapping it when needed).

For example, some drivers allocate a buffer and reuse it, just mapping
and unmapping it whenever needed. And some drivers do not even map the
buffer (some drivers do not use the DMA API at all).

So I think we need to implement a version of dma_alloc_noncoherent()
that does not map the buffer.

I think implementing a helper that internally calls
__dma_direct_alloc_pages() will be okay.

As I'm not an expert in this area, I want to hear
others' opinions.

[1] https://lkml.org/lkml/2021/12/13/1121

Thanks,
Hyeonggon.


Re: [PATCH 0/5] iommu/amd: fixes for suspend/resume

2022-01-26 Thread Maxim Levitsky
On Wed, 2022-01-26 at 09:54 +, Mike Lothian wrote:
> On Wed, 26 Jan 2022 at 07:34, Maxim Levitsky  wrote:
> > Could you post the whole dmesg, or at least:
> > 
> > dmesg | grep AMD-Vi
> > 
> > 
> > What CPU does your system have?
> > 
> > I suspect that your system doesn't have the GA log feature enabled in the
> > IOMMU, and the code never checks for that, so here it fails to enable it,
> > which before my patches was just ignored silently.
> > 
> > 
> > Best regards,
> > Maxim Levitsky
> > > Hope that helps
> > > 
> > > Mike
> > > 
> 
> Hi
> 
> It's an AMD Ryzen 9 5900HX
> 
> [0.186350] AMD-Vi: ivrs, add hid:AMDI0020, uid:\_SB.FUR0, rdevid:160
> [0.186353] AMD-Vi: ivrs, add hid:AMDI0020, uid:\_SB.FUR1, rdevid:160
> [0.186354] AMD-Vi: ivrs, add hid:AMDI0020, uid:\_SB.FUR2, rdevid:160
> [0.186355] AMD-Vi: ivrs, add hid:AMDI0020, uid:\_SB.FUR3, rdevid:160
> [0.355628] pci :00:00.2: AMD-Vi: IOMMU performance counters supported
> [0.356134] pci :00:00.2: AMD-Vi: Found IOMMU cap 0x40
> [0.356136] AMD-Vi: Extended features (0x206d73ef22254ade): PPR
> X2APIC NX GT IA GA PC GA_vAPIC
> [0.356140] AMD-Vi: Interrupt remapping enabled
> [0.356141] AMD-Vi: Virtual APIC enabled
> [0.356142] AMD-Vi: X2APIC enabled
> [0.431377] AMD-Vi: AMD IOMMUv2 loaded and initialized
> 
> I've attached the dmesg, I notice that some boots it doesn't happen
> 
> Cheers
> 
> Mike

Great, your system does seem to support GA log
(but a patch to check for it, rather than blindly assuming that it is
supported, is something that should be done).

So could you bump the LOOP_TIMEOUT by 10x or so and see if the problem
goes away?

(that code should be rewritten to a time-based wait rather than blindly
looping like that; I can also prepare a patch for that as well).
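
For illustration, such a time-based wait could look roughly like the sketch
below; the register offset, bit mask and timeout value are placeholders
rather than the actual AMD IOMMU definitions:

#include <linux/delay.h>
#include <linux/errno.h>
#include <linux/io.h>
#include <linux/ktime.h>
#include <linux/types.h>

/* Placeholder budget; the real value would need to be justified. */
#define GA_LOG_ENABLE_TIMEOUT_MS	100

static int wait_for_ga_log_running(void __iomem *status_reg, u64 run_mask)
{
	ktime_t timeout = ktime_add_ms(ktime_get(), GA_LOG_ENABLE_TIMEOUT_MS);

	do {
		/* Poll the (placeholder) status register for the GA log run bit */
		if (readq(status_reg) & run_mask)
			return 0;
		udelay(10);
	} while (ktime_before(ktime_get(), timeout));

	return -ETIMEDOUT;	/* caller decides how loudly to warn */
}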

Best regards,
Maxim Levitsky



Re: [PATCH 0/5] iommu/amd: fixes for suspend/resume

2022-01-26 Thread Mike Lothian
On Wed, 26 Jan 2022 at 07:34, Maxim Levitsky  wrote:
>
> Could you post the whole dmesg, or at least:
>
> dmesg | grep AMD-Vi
>
>
> What CPU does your system have?
>
> I suspect that your system doesn't have the GA log feature enabled in the
> IOMMU, and the code never checks for that, so here it fails to enable it,
> which before my patches was just ignored silently.
>
>
> Best regards,
> Maxim Levitsky
> >
> > Hope that helps
> >
> > Mike
> >

Hi

It's an AMD Ryzen 9 5900HX

[0.186350] AMD-Vi: ivrs, add hid:AMDI0020, uid:\_SB.FUR0, rdevid:160
[0.186353] AMD-Vi: ivrs, add hid:AMDI0020, uid:\_SB.FUR1, rdevid:160
[0.186354] AMD-Vi: ivrs, add hid:AMDI0020, uid:\_SB.FUR2, rdevid:160
[0.186355] AMD-Vi: ivrs, add hid:AMDI0020, uid:\_SB.FUR3, rdevid:160
[0.355628] pci :00:00.2: AMD-Vi: IOMMU performance counters supported
[0.356134] pci :00:00.2: AMD-Vi: Found IOMMU cap 0x40
[0.356136] AMD-Vi: Extended features (0x206d73ef22254ade): PPR
X2APIC NX GT IA GA PC GA_vAPIC
[0.356140] AMD-Vi: Interrupt remapping enabled
[0.356141] AMD-Vi: Virtual APIC enabled
[0.356142] AMD-Vi: X2APIC enabled
[0.431377] AMD-Vi: AMD IOMMUv2 loaded and initialized

I've attached the dmesg, I notice that some boots it doesn't happen

Cheers

Mike



Re: [PATCH 7/7] iommu: Add iommu_domain::domain_ops

2022-01-26 Thread Jean-Philippe Brucker
On Mon, Jan 24, 2022 at 12:33:02PM -0400, Jason Gunthorpe wrote:
> On Mon, Jan 24, 2022 at 10:16:07AM +, Jean-Philippe Brucker wrote:
> > On Mon, Jan 24, 2022 at 09:58:18AM +, Tian, Kevin wrote:
> > > > From: Lu Baolu 
> > > > Sent: Monday, January 24, 2022 3:11 PM
> > > > +/**
> > > > + * struct domain_ops - per-domain ops
> > > > + * @attach_dev: attach an iommu domain to a device
> > > > + * @detach_dev: detach an iommu domain from a device
> > > 
> > > What is the criteria about whether an op should be iommu_ops or domain_ops
> > > when it requires both domain and device pointers like above two (and 
> > > future
> > > PASID-based attach)?
> > > 
> > > Other examples include:
> > >   @apply_resv_region
> > >   @is_attach_deferred
> > 
> > Could attach_dev() be an IOMMU op?  So a driver could set the domain ops
> > in attach_dev() rather than domain_alloc(). That would allow installing
> > map()/unmap() ops that are tailored for the device's IOMMU, which we don't
> > know at domain_alloc() time. 
> 
> I think we should be moving toward 'domain_alloc' returning the
> correct domain and the way the driver implements the domain shouldn't
> change after that.
> 
> > I'm thinking about a guest that has both physical and virtual
> > endpoints, which would ideally use different kinds of domain ops to
> > support both efficiently (caching mode vs page tables)
> 
> In this case shouldn't domain_alloc() reached from the struct device
> already do the correct thing?

Sure, if we can finalise the domains before attach that could also clean
up the drivers a bit.

Thanks,
Jean