Re: [PATCH RFC v2 02/11] iommu: Add iommu_group_singleton_lockdown()

2022-03-31 Thread Yi Liu




On 2022/3/29 19:42, Jason Gunthorpe wrote:

On Tue, Mar 29, 2022 at 08:42:13AM +, Tian, Kevin wrote:


btw I'm not sure whether this is what SVA requires. IIRC the problem with
SVA is because PASID TLP prefix is not counted in PCI packet routing thus
a DMA target address with PASID might be treated as P2P if the address
falls into the MMIO BAR of other devices in the group. This is why the
original code needs to strictly apply SVA in a group containing a single
device, instead of a group attached by a single driver, unless we want to
reserve those MMIO ranges in CPU VA space.


I think it is not such a good idea to mix up group with this test

Here you want to say that all TLPs from the RID route to the host
bridge - ie ACS is on/etc. This is subtly different from a group with
a single device. Specifically it is an immutable property of the
fabric and doesn't change after hot plug events.


So the group size can be immutable for a specific topology, right? I think
non-multi-function devices plugged in behind a PCIe bridge with ACS enabled
should have their own groups. Under such a topology the group size stays 1
constantly, so we may just enable SVA for such devices.



ie if we have a singleton group that doesn't have ACS and someone
hotplugs in another device on a bridge, then our SVA is completely
broken and we get data corruption.


I think this would be a device plugged in behind a PCIe-to-PCI bridge, with
another device then hotplugged onto the same bridge. The group size is
variable in that case, right? Per my understanding, such a bridge probably
cannot support the PASID prefix at all, hence no SVA support for such
devices.


--
Regards,
Yi Liu
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH RFC v2 02/11] iommu: Add iommu_group_singleton_lockdown()

2022-03-31 Thread Yi Liu



On 2022/3/29 16:42, Tian, Kevin wrote:

From: Lu Baolu 
Sent: Tuesday, March 29, 2022 1:38 PM

Some of the interfaces in the IOMMU core require that only a single
kernel device driver controls the device in the IOMMU group. The
existing method is to check the device count in the IOMMU group in
the interfaces. This is unreliable because any device added to the
IOMMU group later breaks this assumption without notifying the driver
using the interface. This adds iommu_group_singleton_lockdown() that
checks the requirement and locks down the IOMMU group with only single
device driver bound.

Signed-off-by: Lu Baolu 
---
  drivers/iommu/iommu.c | 30 ++
  1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 0c42ece25854..9fb8a5b4491e 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -48,6 +48,7 @@ struct iommu_group {
struct list_head entry;
unsigned int owner_cnt;
void *owner;
+   bool singleton_lockdown;
  };

  struct group_device {
@@ -968,15 +969,16 @@ void iommu_group_remove_device(struct device
*dev)
  }
  EXPORT_SYMBOL_GPL(iommu_group_remove_device);

-static int iommu_group_device_count(struct iommu_group *group)
+/* Callers should hold the group->mutex. */
+static bool iommu_group_singleton_lockdown(struct iommu_group *group)
  {
-   struct group_device *entry;
-   int ret = 0;
-
-   list_for_each_entry(entry, &group->devices, list)
-   ret++;
+   if (group->owner_cnt != 1 ||
+   group->domain != group->default_domain ||
+   group->owner)
+   return false;


Curious why there would be a case where the group uses default_domain
while still having an owner? I have the impression that owner is used
for userspace DMA, where a different domain is used.


+   group->singleton_lockdown = true;

-   return ret;
+   return true;
  }


btw I'm not sure whether this is what SVA requires. IIRC the problem with
SVA is because PASID TLP prefix is not counted in PCI packet routing thus
a DMA target address with PASID might be treated as P2P if the address
falls into the MMIO BAR of other devices in the group. This is why the
original code needs to strictly apply SVA in a group containing a single
device, instead of a group attached by a single driver, unless we want to
reserve those MMIO ranges in CPU VA space.

Jean can correct me if my memory is wrong.


I think so. I remember the major gap is that PASID info is not used in
PCI's address-based TLP routing mechanism, so P2P may happen if the
address happens to hit device MMIO. Per the commit message from Jean,
even singular groups are not an easy thing, so non-singleton groups are
not considered so far.

"
IOMMU groups with more than one device aren't supported for SVA at the
moment. There may be P2P traffic between devices within a group, which
cannot be seen by an IOMMU (note that supporting PASID doesn't add any
form of isolation with regard to P2P). Supporting groups would require
calling bind() for all bound processes every time a device is added to a
group, to perform sanity checks (e.g. ensure that new devices support
PASIDs at least as big as those already allocated in the group). It also
means making sure that reserved ranges (IOMMU_RESV_*) of all devices are
carved out of processes. This is already tricky with single devices, but
becomes very difficult with groups. Since SVA-capable devices are expected
to be cleanly isolated, and since we don't have any way to test groups or
hot-plug, we only allow singular groups for now.
"

https://lore.kernel.org/kvm/20180511190641.23008-3-jean-philippe.bruc...@arm.com/

--
Regards,
Yi Liu


[PATCH v3] iommu/vt-d: calculate mask for non-aligned flushes

2022-03-31 Thread David Stevens
From: David Stevens 

Calculate the appropriate mask for non-size-aligned page selective
invalidation. Since psi uses the mask value to mask out the lower order
bits of the target address, properly flushing the iotlb requires using a
mask value such that [pfn, pfn+pages) all lie within the flushed
size-aligned region.  This is not normally an issue because iova.c
always allocates iovas that are aligned to their size. However, iovas
which come from other sources (e.g. userspace via VFIO) may not be
aligned.

To properly flush the IOTLB, both the start and end pfns need to be
equal after applying the mask. That means that the most efficient mask
to use is the index of the lowest bit that is equal where all higher
bits are also equal. For example, if pfn=0x17f and pages=3, then
end_pfn=0x181, so the smallest mask we can use is 8. Any differences
above the highest bit of pages are due to carrying, so by xnor'ing pfn
and end_pfn and then masking out the lower order bits based on pages, we
get 0xff00, where the first set bit is the mask we want to use.

Fixes: 6fe1010d6d9c ("vfio/type1: DMA unmap chunking")
Cc: sta...@vger.kernel.org
Signed-off-by: David Stevens 
Reviewed-by: Kevin Tian 
---
The seeds of the bug were introduced by f76aec76ec7f6, which
simultaneously added the alignment requirement to the iommu driver and
made the iova allocator return aligned iovas. However, I don't think
there was any way to trigger the bug at that time. The tagged VFIO
change is one that actually introduced a code path that could trigger
the bug. There may also be other ways to trigger the bug that I am not
aware of.

v1 -> v2:
 - Calculate an appropriate mask for non-size-aligned iovas instead
   of falling back to domain selective flush.
v2 -> v3:
 - Add more detail to commit message.

 drivers/iommu/intel/iommu.c | 27 ---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 5b196cfe9ed2..ab2273300346 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1717,7 +1717,8 @@ static void iommu_flush_iotlb_psi(struct intel_iommu 
*iommu,
  unsigned long pfn, unsigned int pages,
  int ih, int map)
 {
-   unsigned int mask = ilog2(__roundup_pow_of_two(pages));
+   unsigned int aligned_pages = __roundup_pow_of_two(pages);
+   unsigned int mask = ilog2(aligned_pages);
uint64_t addr = (uint64_t)pfn << VTD_PAGE_SHIFT;
u16 did = domain->iommu_did[iommu->seq_id];
 
@@ -1729,10 +1730,30 @@ static void iommu_flush_iotlb_psi(struct intel_iommu 
*iommu,
if (domain_use_first_level(domain)) {
domain_flush_piotlb(iommu, domain, addr, pages, ih);
} else {
+   unsigned long bitmask = aligned_pages - 1;
+
+   /*
+* PSI masks the low order bits of the base address. If the
+* address isn't aligned to the mask, then compute a mask value
+* needed to ensure the target range is flushed.
+*/
+   if (unlikely(bitmask & pfn)) {
+   unsigned long end_pfn = pfn + pages - 1, shared_bits;
+
+   /*
+* Since end_pfn <= pfn + bitmask, the only way bits
+* higher than bitmask can differ in pfn and end_pfn is
+* by carrying. This means after masking out bitmask,
+* high bits starting with the first set bit in
+* shared_bits are all equal in both pfn and end_pfn.
+*/
+   shared_bits = ~(pfn ^ end_pfn) & ~bitmask;
+   mask = shared_bits ? __ffs(shared_bits) : BITS_PER_LONG;
+   }
+
/*
 * Fallback to domain selective flush if no PSI support or
-* the size is too big. PSI requires page size to be 2 ^ x,
-* and the base address is naturally aligned to the size.
+* the size is too big.
 */
if (!cap_pgsel_inv(iommu->cap) ||
mask > cap_max_amask_val(iommu->cap))
-- 
2.35.1.1094.g7c7d902a7c-goog



Re: [PATCH RFC v2 08/11] iommu/sva: Use attach/detach_pasid_dev in SVA interfaces

2022-03-31 Thread Jason Gunthorpe via iommu
On Thu, Mar 31, 2022 at 01:59:22PM -0700, Jacob Pan wrote:

> > +   handle->dev = dev;
> > +   handle->domain = domain;
> > +   handle->pasid = mm->pasid;

> why do we need to store pasid here? Conceptually, pasid is per sva domain
> not per bind. You can get it from handle->domain->sva_cookie.

That is a mistake - SVA needs to follow the general PASID design - the
domain does not encode the PASID, the PASID comes from the device
attachment only.

Jason


Re: [PATCH] iommu/omap: Fix regression in probe for NULL pointer dereference

2022-03-31 Thread Drew Fustini
On Thu, Mar 31, 2022 at 09:23:01AM +0300, Tony Lindgren wrote:
> Commit 3f6634d997db ("iommu: Use right way to retrieve iommu_ops") started
> triggering a NULL pointer dereference for some omap variants:
> 
> __iommu_probe_device from probe_iommu_group+0x2c/0x38
> probe_iommu_group from bus_for_each_dev+0x74/0xbc
> bus_for_each_dev from bus_iommu_probe+0x34/0x2e8
> bus_iommu_probe from bus_set_iommu+0x80/0xc8
> bus_set_iommu from omap_iommu_init+0x88/0xcc
> omap_iommu_init from do_one_initcall+0x44/0x24
> 
> This is caused by omap iommu probe returning 0 instead of ERR_PTR(-ENODEV)
> as noted by Jason Gunthorpe .
> 
> Looks like the regression already happened with an earlier commit
> 6785eb9105e3 ("iommu/omap: Convert to probe/release_device() call-backs")
> that changed the function return type and missed converting one place.
> 
> Cc: Drew Fustini 
> Cc: Lu Baolu 
> Cc: Suman Anna 
> Suggested-by: Jason Gunthorpe 
> Fixes: 6785eb9105e3 ("iommu/omap: Convert to probe/release_device() 
> call-backs")
> Fixes: 3f6634d997db ("iommu: Use right way to retrieve iommu_ops")
> Signed-off-by: Tony Lindgren 
> ---
>  drivers/iommu/omap-iommu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/omap-iommu.c b/drivers/iommu/omap-iommu.c
> --- a/drivers/iommu/omap-iommu.c
> +++ b/drivers/iommu/omap-iommu.c
> @@ -1661,7 +1661,7 @@ static struct iommu_device 
> *omap_iommu_probe_device(struct device *dev)
>   num_iommus = of_property_count_elems_of_size(dev->of_node, "iommus",
>sizeof(phandle));
>   if (num_iommus < 0)
> - return 0;
> + return ERR_PTR(-ENODEV);
>  
>   arch_data = kcalloc(num_iommus + 1, sizeof(*arch_data), GFP_KERNEL);
>   if (!arch_data)
> -- 
> 2.35.1

Mainline with omap2plus_defconfig now boots ok on my BeagleBoard-X15
with the TI AM5728 SoC after this patch is applied.

Tested-by: Drew Fustini 

thanks,
drew


Re: [PATCH RFC v2 08/11] iommu/sva: Use attach/detach_pasid_dev in SVA interfaces

2022-03-31 Thread Jacob Pan
Hi Lu,

On Tue, 29 Mar 2022 13:37:57 +0800, Lu Baolu 
wrote:

> The existing iommu SVA interfaces are implemented by calling the SVA
> specific iommu ops provided by the IOMMU drivers. There's no need for
> any SVA specific ops in iommu_ops vector anymore as we can achieve
> this through the generic attach/detach_dev_pasid domain ops.
> 
> This refactors the IOMMU SVA interfaces implementation by using the
> attach/detach_pasid_dev ops and align them with the concept of the
> iommu domain. Put the new SVA code in the sva related file in order
> to make it self-contained.
> 
> Signed-off-by: Lu Baolu 
> ---
>  include/linux/iommu.h |  51 +---
>  drivers/iommu/iommu-sva-lib.c | 110 +-
>  drivers/iommu/iommu.c |  92 
>  3 files changed, 138 insertions(+), 115 deletions(-)
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index a46285488a57..11c4d99e122d 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -629,7 +629,12 @@ struct iommu_fwspec {
>   * struct iommu_sva - handle to a device-mm bond
>   */
>  struct iommu_sva {
> - struct device   *dev;
> + struct device   *dev;
> + ioasid_tpasid;
> + struct iommu_domain *domain;
> + /* Link to sva domain's bonds list */
> + struct list_headnode;
> + refcount_t  users;
>  };
>  
>  int iommu_fwspec_init(struct device *dev, struct fwnode_handle
> *iommu_fwnode, @@ -672,12 +677,6 @@ int iommu_dev_enable_feature(struct
> device *dev, enum iommu_dev_features f); int
> iommu_dev_disable_feature(struct device *dev, enum iommu_dev_features f);
> bool iommu_dev_feature_enabled(struct device *dev, enum
> iommu_dev_features f); -struct iommu_sva *iommu_sva_bind_device(struct
> device *dev,
> - struct mm_struct *mm,
> - void *drvdata);
> -void iommu_sva_unbind_device(struct iommu_sva *handle);
> -u32 iommu_sva_get_pasid(struct iommu_sva *handle);
> -
>  int iommu_device_use_default_domain(struct device *dev);
>  void iommu_device_unuse_default_domain(struct device *dev);
>  
> @@ -1018,21 +1017,6 @@ iommu_dev_disable_feature(struct device *dev, enum
> iommu_dev_features feat) return -ENODEV;
>  }
>  
> -static inline struct iommu_sva *
> -iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, void
> *drvdata) -{
> - return NULL;
> -}
> -
> -static inline void iommu_sva_unbind_device(struct iommu_sva *handle)
> -{
> -}
> -
> -static inline u32 iommu_sva_get_pasid(struct iommu_sva *handle)
> -{
> - return IOMMU_PASID_INVALID;
> -}
> -
>  static inline struct iommu_fwspec *dev_iommu_fwspec_get(struct device
> *dev) {
>   return NULL;
> @@ -1085,6 +1069,29 @@ iommu_put_domain_for_dev_pasid(struct iommu_domain
> *domain) }
>  #endif /* CONFIG_IOMMU_API */
>  
> +#ifdef CONFIG_IOMMU_SVA
> +struct iommu_sva *iommu_sva_bind_device(struct device *dev,
> + struct mm_struct *mm,
> + void *drvdata);
> +void iommu_sva_unbind_device(struct iommu_sva *handle);
> +u32 iommu_sva_get_pasid(struct iommu_sva *handle);
> +#else /* CONFIG_IOMMU_SVA */
> +static inline struct iommu_sva *
> +iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, void
> *drvdata) +{
> + return NULL;
> +}
> +
> +static inline void iommu_sva_unbind_device(struct iommu_sva *handle)
> +{
> +}
> +
> +static inline u32 iommu_sva_get_pasid(struct iommu_sva *handle)
> +{
> + return IOMMU_PASID_INVALID;
> +}
> +#endif /* CONFIG_IOMMU_SVA */
> +
>  /**
>   * iommu_map_sgtable - Map the given buffer to the IOMMU domain
>   * @domain:  The IOMMU domain to perform the mapping
> diff --git a/drivers/iommu/iommu-sva-lib.c b/drivers/iommu/iommu-sva-lib.c
> index 78820be23f15..1b45b7d01836 100644
> --- a/drivers/iommu/iommu-sva-lib.c
> +++ b/drivers/iommu/iommu-sva-lib.c
> @@ -17,6 +17,7 @@ struct iommu_sva_cookie {
>   struct mm_struct *mm;
>   ioasid_t pasid;
>   refcount_t users;
> + struct list_head bonds;
>  };
>  
>  /**
> @@ -101,6 +102,7 @@ iommu_sva_alloc_domain(struct device *dev, struct
> mm_struct *mm) cookie->mm = mm;
>   cookie->pasid = mm->pasid;
>   refcount_set(&cookie->users, 1);
> + INIT_LIST_HEAD(&cookie->bonds);
>   domain->type = IOMMU_DOMAIN_SVA;
>   domain->sva_cookie = cookie;
>   curr = xa_store(&sva_domain_array, mm->pasid, domain,
> GFP_KERNEL); @@ -118,6 +120,7 @@ iommu_sva_alloc_domain(struct device
> *dev, struct mm_struct *mm) static void iommu_sva_free_domain(struct
> iommu_domain *domain) {
>   xa_erase(&sva_domain_array, domain->sva_cookie->pasid);
> + WARN_ON(!list_empty(&domain->sva_cookie->bonds));
>   kfree(domain->sva_cookie);
>   domain->ops->free(domain);
>  }
> @@ -137,7 +140,7 @@ void iommu_sva_domain_put_user(struct iommu_domain
> *domain) iom

Re: [PATCH] dma-mapping: move pgprot_decrypted out of dma_pgprot

2022-03-31 Thread Christoph Hellwig
On Thu, Mar 31, 2022 at 12:13:19PM -0400, Alex Xu (Hello71) wrote:
> Tested-by: Alex Xu (Hello71) 

Thanks!


Re: [PATCH v2 1/5] dt-bindings: reserved-memory: Document memory region specifier

2022-03-31 Thread Thierry Reding
On Fri, Feb 11, 2022 at 12:15:44AM +0100, Janne Grunau wrote:
> On 2022-02-09 17:31:16 +0100, Thierry Reding wrote:
> > On Sun, Feb 06, 2022 at 11:27:00PM +0100, Janne Grunau wrote:
> > > On 2021-09-15 17:19:39 +0200, Thierry Reding wrote:
> > > > On Tue, Sep 07, 2021 at 07:44:44PM +0200, Thierry Reding wrote:
> > > > > On Tue, Sep 07, 2021 at 10:33:24AM -0500, Rob Herring wrote:
> > > > > > On Fri, Sep 3, 2021 at 10:36 AM Thierry Reding 
> > > > > >  wrote:
> > > > > > >
> > > > > > > On Fri, Sep 03, 2021 at 09:36:33AM -0500, Rob Herring wrote:
> > > > > > > > On Fri, Sep 3, 2021 at 8:52 AM Thierry Reding 
> > > > > > > >  wrote:
> > > > > > > > >
> > > > > > > > > On Fri, Sep 03, 2021 at 08:20:55AM -0500, Rob Herring wrote:
> > > > > > > > > >
> > > > > > > > > > Couldn't we keep this all in /reserved-memory? Just add an 
> > > > > > > > > > iova
> > > > > > > > > > version of reg. Perhaps abuse 'assigned-address' for this 
> > > > > > > > > > purpose. The
> > > > > > > > > > issue I see would be handling reserved iova areas without a 
> > > > > > > > > > physical
> > > > > > > > > > area. That can be handled with just a iova and no reg. We 
> > > > > > > > > > already have
> > > > > > > > > > a no reg case.
> > > > > > > > >
> > > > > > > > > I had thought about that initially. One thing I'm worried 
> > > > > > > > > about is that
> > > > > > > > > every child node in /reserved-memory will effectively cause 
> > > > > > > > > the memory
> > > > > > > > > that it described to be reserved. But we don't want that for 
> > > > > > > > > regions
> > > > > > > > > that are "virtual only" (i.e. IOMMU reservations).
> > > > > > > >
> > > > > > > > By virtual only, you mean no physical mapping, just a region of
> > > > > > > > virtual space, right? For that we'd have no 'reg' and therefore 
> > > > > > > > no
> > > > > > > > (physical) reservation by the OS. It's similar to non-static 
> > > > > > > > regions.
> > > > > > > > You need a specific handler for them. We'd probably want a 
> > > > > > > > compatible
> > > > > > > > as well for these virtual reservations.
> > > > > > >
> > > > > > > Yeah, these would be purely used for reserving regions in the 
> > > > > > > IOVA so
> > > > > > > that they won't be used by the IOVA allocator. Typically these 
> > > > > > > would be
> > > > > > > used for cases where those addresses have some special meaning.
> > > > > > >
> > > > > > > Do we want something like:
> > > > > > >
> > > > > > > compatible = "iommu-reserved";
> > > > > > >
> > > > > > > for these? Or would that need to be:
> > > > > > >
> > > > > > > compatible = "linux,iommu-reserved";
> > > > > > >
> > > > > > > ? There seems to be a mix of vendor-prefix vs. non-vendor-prefix
> > > > > > > compatible strings in the reserved-memory DT bindings directory.
> > > > > > 
> > > > > > I would not use 'linux,' here.
> > > > > > 
> > > > > > >
> > > > > > > On the other hand, do we actually need the compatible string? 
> > > > > > > Because we
> > > > > > > don't really want to associate much extra information with this 
> > > > > > > like we
> > > > > > > do for example with "shared-dma-pool". The logic to handle this 
> > > > > > > would
> > > > > > > all be within the IOMMU framework. All we really need is for the
> > > > > > > standard reservation code to skip nodes that don't have a reg 
> > > > > > > property
> > > > > > > so we don't reserve memory for "virtual-only" allocations.
> > > > > > 
> > > > > > It doesn't hurt to have one and I can imagine we might want to 
> > > > > > iterate
> > > > > > over all the nodes. It's slightly easier and more common to iterate
> > > > > > over compatible nodes rather than nodes with some property.
> > > > > > 
> > > > > > > > Are these being global in DT going to be a problem? Presumably 
> > > > > > > > we have
> > > > > > > > a virtual space per IOMMU. We'd know which IOMMU based on a 
> > > > > > > > device's
> > > > > > > > 'iommus' and 'memory-region' properties, but within 
> > > > > > > > /reserved-memory
> > > > > > > > we wouldn't be able to distinguish overlapping addresses from 
> > > > > > > > separate
> > > > > > > > address spaces. Or we could have 2 different IOVAs for 1 
> > > > > > > > physical
> > > > > > > > space. That could be solved with something like this:
> > > > > > > >
> > > > > > > > iommu-addresses = <&iommu1  >;
> > > > > > >
> > > > > > > The only case that would be problematic would be if we have 
> > > > > > > overlapping
> > > > > > > physical regions, because that will probably trip up the standard 
> > > > > > > code.
> > > > > > >
> > > > > > > But this could also be worked around by looking at 
> > > > > > > iommu-addresses. For
> > > > > > > example, if we had something like this:
> > > > > > >
> > > > > > > reserved-memory {
> > > > > > > fb_dc0: fb@8000 {
> > > > > > > reg = <0x8000 0x0100>;
> > > > > > > iommu-addresses = <0xa00

Re: [PATCH] dma-mapping: move pgprot_decrypted out of dma_pgprot

2022-03-31 Thread Alex Xu (Hello71) via iommu
Excerpts from Christoph Hellwig's message of March 31, 2022 2:06 am:
> pgprot_decrypted is used by AMD SME systems to allow access to memory
> that was set to not encrypted using set_memory_decrypted.  That only
> happens for dma-direct memory as the IOMMU solves the addressing
> challenges for the encryption bit using its own remapping.
> 
> Move the pgprot_decrypted call out of dma_pgprot which is also used
> by the IOMMU mappings and into dma-direct so that it is only used with
> memory that was set decrypted.
> 
> Fixes: 5ff79fddf0ef ("dma-mapping: remove CONFIG_DMA_REMAP")
> Reported-by: Alex Xu (Hello71) 
> Signed-off-by: Christoph Hellwig 
> ---
>  kernel/dma/direct.c  | 10 --
>  kernel/dma/mapping.c |  2 --
>  2 files changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> index 35a1d29d6a2e9..9743c6ccce1a9 100644
> --- a/kernel/dma/direct.c
> +++ b/kernel/dma/direct.c
> @@ -277,12 +277,16 @@ void *dma_direct_alloc(struct device *dev, size_t size,
>   }
>  
>   if (remap) {
> + pgprot_t prot = dma_pgprot(dev, PAGE_KERNEL, attrs);
> +
> + if (force_dma_unencrypted(dev))
> + prot = pgprot_decrypted(prot);
> +
>   /* remove any dirty cache lines on the kernel alias */
>   arch_dma_prep_coherent(page, size);
>  
>   /* create a coherent mapping */
> - ret = dma_common_contiguous_remap(page, size,
> - dma_pgprot(dev, PAGE_KERNEL, attrs),
> + ret = dma_common_contiguous_remap(page, size, prot,
>   __builtin_return_address(0));
>   if (!ret)
>   goto out_free_pages;
> @@ -535,6 +539,8 @@ int dma_direct_mmap(struct device *dev, struct 
> vm_area_struct *vma,
>   int ret = -ENXIO;
>  
>   vma->vm_page_prot = dma_pgprot(dev, vma->vm_page_prot, attrs);
> + if (force_dma_unencrypted(dev))
> + vma->vm_page_prot = pgprot_decrypted(vma->vm_page_prot);
>  
>   if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret))
>   return ret;
> diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
> index 559461a826bab..db7244291b745 100644
> --- a/kernel/dma/mapping.c
> +++ b/kernel/dma/mapping.c
> @@ -407,8 +407,6 @@ EXPORT_SYMBOL(dma_get_sgtable_attrs);
>   */
>  pgprot_t dma_pgprot(struct device *dev, pgprot_t prot, unsigned long attrs)
>  {
> - if (force_dma_unencrypted(dev))
> - prot = pgprot_decrypted(prot);
>   if (dev_is_dma_coherent(dev))
>   return prot;
>  #ifdef CONFIG_ARCH_HAS_DMA_WRITE_COMBINE
> -- 
> 2.30.2
> 
> 

Tested-by: Alex Xu (Hello71) 

Thanks,
Alex.


Re: [PATCH] iommu/omap: Fix regression in probe for NULL pointer dereference

2022-03-31 Thread Jason Gunthorpe
On Thu, Mar 31, 2022 at 09:23:01AM +0300, Tony Lindgren wrote:
> Commit 3f6634d997db ("iommu: Use right way to retrieve iommu_ops") started
> triggering a NULL pointer dereference for some omap variants:
> 
> __iommu_probe_device from probe_iommu_group+0x2c/0x38
> probe_iommu_group from bus_for_each_dev+0x74/0xbc
> bus_for_each_dev from bus_iommu_probe+0x34/0x2e8
> bus_iommu_probe from bus_set_iommu+0x80/0xc8
> bus_set_iommu from omap_iommu_init+0x88/0xcc
> omap_iommu_init from do_one_initcall+0x44/0x24
> 
> This is caused by omap iommu probe returning 0 instead of ERR_PTR(-ENODEV)
> as noted by Jason Gunthorpe .
> 
> Looks like the regression already happened with an earlier commit
> 6785eb9105e3 ("iommu/omap: Convert to probe/release_device() call-backs")
> that changed the function return type and missed converting one place.
> 
> Cc: Drew Fustini 
> Cc: Lu Baolu 
> Cc: Suman Anna 
> Suggested-by: Jason Gunthorpe 
> Fixes: 6785eb9105e3 ("iommu/omap: Convert to probe/release_device() 
> call-backs")
> Fixes: 3f6634d997db ("iommu: Use right way to retrieve iommu_ops")
> Signed-off-by: Tony Lindgren 
> ---
>  drivers/iommu/omap-iommu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Jason Gunthorpe 

Jason


Re: "dma-mapping: remove CONFIG_DMA_REMAP" causes AMD SME boot fail

2022-03-31 Thread Thorsten Leemhuis
[TLDR: I'm adding the regression report below to regzbot, the Linux
kernel regression tracking bot; all text you find below is compiled from
a few template paragraphs you might have encountered already from
similar mails.]

Hi, this is your Linux kernel regression tracker. Sending this just to
the lists, as it's already handled.

On 30.03.22 19:51, Alex Xu (Hello71) wrote:
> 
> After a recent kernel update, booting one of my machines causes it to 
> hang on a black screen. Pressing Lock keys on the USB keyboard does not 
> turn on the indicators, and the machine does not appear on the Ethernet 
> network. I don't have a serial port on this machine. I didn't try 
> netconsole, but I suspect it won't work.
> 
> Setting mem_encrypt=0 seems to resolve the issue. Reverting f5ff79fddf0e 
> ("dma-mapping: remove CONFIG_DMA_REMAP") also appears to resolve the 
> issue.
> 
> The machine in question has an AMD Ryzen 5 1600 and ASRock B450 Pro4.

To be sure below issue doesn't fall through the cracks unnoticed, I'm
adding it to regzbot, my Linux kernel regression tracking bot:

#regzbot ^introduced f5ff79fddf0e
#regzbot title dma: "dma-mapping: remove CONFIG_DMA_REMAP" causes AMD
SME boot fail
#regzbot ignore-activity

If it turns out this isn't a regression, feel free to remove it from the
tracking by sending a reply to this thread containing a paragraph like
"#regzbot invalid: reason why this is invalid" (without the quotes).

Ciao, Thorsten


Re: [PATCH v4 8/9] iommu: Remove unused argument in is_attach_deferred

2022-03-31 Thread Drew Fustini
On Wed, Mar 30, 2022 at 02:33:23PM -0300, Jason Gunthorpe wrote:
> On Wed, Mar 30, 2022 at 08:19:37PM +0300, Tony Lindgren wrote:
> 
> > > > __iommu_probe_device from probe_iommu_group+0x2c/0x38
> > > > probe_iommu_group from bus_for_each_dev+0x74/0xbc
> > > > bus_for_each_dev from bus_iommu_probe+0x34/0x2e8
> > > > bus_iommu_probe from bus_set_iommu+0x80/0xc8
> > > > bus_set_iommu from omap_iommu_init+0x88/0xcc
> > > > omap_iommu_init from do_one_initcall+0x44/0x24c
> > > > 
> > > > Any ideas for a fix?
> > > > 
> > > > It would be good to fix this quickly so we don't end up with a broken
> > > > v5.18-rc1..
> > > > 
> > > > For reference, this is mainline commit 41bb23e70b50 ("iommu: Remove 
> > > > unused
> > > > argument in is_attach_deferred").
> > > 
> > > Are you confident in the bisection? I don't see how that commit could
> > > NULL deref..
> > 
> > Oops sorry you're right, the breaking commit is a different patch, it's
> > 3f6634d997db ("iommu: Use right way to retrieve iommu_ops") instead. I must
> > have picked the wrong commit while testing.
> 
> That makes alot more sense
>  
> > > Can you find the code that is the NULL deref?
> > 
> > I can debug a bit more tomorrow.
> 
> Looks like omap's omap_iommu_probe_device() is buggy, it returns 0 on
> error paths:
> 
>   num_iommus = of_property_count_elems_of_size(dev->of_node, "iommus",
>sizeof(phandle));
>   if (num_iommus < 0)
>   return 0;
> 
> This function needs to return an errno -ENODEV
> 
> Otherwise it causes a NULL dev->iommu->iommu_dev and dev_iommu_ops() will
> crash.
> 
> Jason

I tried to boot current mainline (787af64d05cd) on am57xx-evm and it
failed to boot. I made the change you suggested and it boots okay now:
https://gist.github.com/pdp7/918eefe03b5c5db3b5d28d819f6a27f6

thanks,
drew

diff --git a/drivers/iommu/omap-iommu.c b/drivers/iommu/omap-iommu.c
index 4aab631ef517..d9cf2820c02e 100644
--- a/drivers/iommu/omap-iommu.c
+++ b/drivers/iommu/omap-iommu.c
@@ -1661,7 +1661,7 @@ static struct iommu_device 
*omap_iommu_probe_device(struct device *dev)
num_iommus = of_property_count_elems_of_size(dev->of_node, "iommus",
 sizeof(phandle));
if (num_iommus < 0)
-   return 0;
+   return ERR_PTR(-ENODEV);

arch_data = kcalloc(num_iommus + 1, sizeof(*arch_data), GFP_KERNEL);
if (!arch_data)



Re: [PATCH RFC 08/12] iommufd: IOCTLs for the io_pagetable

2022-03-31 Thread Jason Gunthorpe via iommu
On Wed, Mar 30, 2022 at 09:35:52PM +0800, Yi Liu wrote:

> > +/**
> > + * struct iommu_ioas_copy - ioctl(IOMMU_IOAS_COPY)
> > + * @size: sizeof(struct iommu_ioas_copy)
> > + * @flags: Combination of enum iommufd_ioas_map_flags
> > + * @dst_ioas_id: IOAS ID to change the mapping of
> > + * @src_ioas_id: IOAS ID to copy from
> 
> So the dst and src ioas_ids are allocated via the same iommufd, right?
> Just out of curiosity, do you think it is possible for the src/dst
> ioas_ids to come from different iommufds? In that case we may need to
> add src/dst iommufd fields. It's not needed today, just to see if there
> is any blocker in the kernel to supporting such a copy. :-)

Yes, all IDs in all ioctls are within the scope of a single iommufd.

There should be no reason for a single application to open multiple
iommufds.

Jason


Re: [PATCH RFC 08/12] iommufd: IOCTLs for the io_pagetable

2022-03-31 Thread Jason Gunthorpe via iommu
On Thu, Mar 31, 2022 at 03:36:29PM +1100, David Gibson wrote:

> > +/**
> > + * struct iommu_ioas_iova_ranges - ioctl(IOMMU_IOAS_IOVA_RANGES)
> > + * @size: sizeof(struct iommu_ioas_iova_ranges)
> > + * @ioas_id: IOAS ID to read ranges from
> > + * @out_num_iovas: Output total number of ranges in the IOAS
> > + * @__reserved: Must be 0
> > + * @out_valid_iovas: Array of valid IOVA ranges. The array length is the 
> > smaller
> > + *   of out_num_iovas or the length implied by size.
> > + * @out_valid_iovas.start: First IOVA in the allowed range
> > + * @out_valid_iovas.last: Inclusive last IOVA in the allowed range
> > + *
> > + * Query an IOAS for ranges of allowed IOVAs. Operation outside these 
> > ranges is
> > + * not allowed. out_num_iovas will be set to the total number of iovas
> > + * and the out_valid_iovas[] will be filled in as space permits.
> > + * size should include the allocated flex array.
> > + */
> > +struct iommu_ioas_iova_ranges {
> > +   __u32 size;
> > +   __u32 ioas_id;
> > +   __u32 out_num_iovas;
> > +   __u32 __reserved;
> > +   struct iommu_valid_iovas {
> > +   __aligned_u64 start;
> > +   __aligned_u64 last;
> > +   } out_valid_iovas[];
> > +};
> > +#define IOMMU_IOAS_IOVA_RANGES _IO(IOMMUFD_TYPE, 
> > IOMMUFD_CMD_IOAS_IOVA_RANGES)
> 
> Is the information returned by this valid for the lifeime of the IOAS,
> or can it change?  If it can change, what events can change it?
>
> If it *can't* change, then how do we have enough information to
> determine this at ALLOC time, since we don't necessarily know which
> (if any) hardware IOMMU will be attached to it.

It is a good point worth documenting. It can change. Particularly
after any device attachment.

I added this:

 * Query an IOAS for ranges of allowed IOVAs. Mapping IOVA outside these ranges
 * is not allowed. out_num_iovas will be set to the total number of iovas and
 * the out_valid_iovas[] will be filled in as space permits. size should include
 * the allocated flex array.
 *
 * The allowed ranges are dependent on the HW path the DMA operation takes, and
 * can change during the lifetime of the IOAS. A fresh empty IOAS will have a
 * full range, and each attached device will narrow the ranges based on that
 * devices HW restrictions.


> > +#define IOMMU_IOAS_COPY _IO(IOMMUFD_TYPE, IOMMUFD_CMD_IOAS_COPY)
> 
> Since it can only copy a single mapping, what's the benefit of this
> over just repeating an IOAS_MAP in the new IOAS?

It causes the underlying pin accounting to be shared and can avoid
calling GUP entirely.

Thanks,
Jason