RE: [PATCH 1/1] iommu/vt-d: Use device numa domain if RHSA is missing

2020-08-27 Thread Tian, Kevin
> From: Lu Baolu 
> Sent: Thursday, August 27, 2020 1:57 PM
> 
> If there are multiple NUMA domains but the RHSA is missing in ACPI/DMAR
> table, we could default to the device NUMA domain as fall back. This also
> benefits the vIOMMU use case where only a single vIOMMU is exposed,
> hence
> no RHSA will be present but device numa domain can be correct.

This benefits vIOMMU, but it does not apply only to the single-vIOMMU
case. The logic still holds in multiple-vIOMMU cases as long as RHSA is
not provided.

> 
> Cc: Jacob Pan 
> Cc: Kevin Tian 
> Cc: Ashok Raj 
> Signed-off-by: Lu Baolu 
> ---
>  drivers/iommu/intel/iommu.c | 31 +--
>  1 file changed, 29 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index e0516d64d7a3..bce158468abf 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -700,12 +700,41 @@ static int
> domain_update_iommu_superpage(struct dmar_domain *domain,
>   return fls(mask);
>  }
> 
> +static int domain_update_device_node(struct dmar_domain *domain)
> +{
> + struct device_domain_info *info;
> + int nid = NUMA_NO_NODE;
> +
> + assert_spin_locked(&device_domain_lock);
> +
> + if (list_empty(&domain->devices))
> + return NUMA_NO_NODE;
> +
> + list_for_each_entry(info, &domain->devices, link) {
> + if (!info->dev)
> + continue;
> +
> + nid = dev_to_node(info->dev);
> + if (nid != NUMA_NO_NODE)
> + break;
> + }

There could be multiple device NUMA nodes, as devices within the
same domain may sit behind different IOMMUs. Of course there
is no perfect answer in such a situation, and this patch is still an
obvious improvement over the current always-on-node0 policy. But
a comment about this implication would be welcome.

> +
> + return nid;
> +}
> +
>  /* Some capabilities may be different across iommus */
>  static void domain_update_iommu_cap(struct dmar_domain *domain)
>  {
>   domain_update_iommu_coherency(domain);
>   domain->iommu_snooping =
> domain_update_iommu_snooping(NULL);
>   domain->iommu_superpage =
> domain_update_iommu_superpage(domain, NULL);
> +
> + /*
> +  * If RHSA is missing, we should default to the device numa domain
> +  * as fall back.
> +  */
> + if (domain->nid == NUMA_NO_NODE)
> + domain->nid = domain_update_device_node(domain);
>  }
> 
>  struct context_entry *iommu_context_addr(struct intel_iommu *iommu, u8
> bus,
> @@ -5086,8 +5115,6 @@ static struct iommu_domain
> *intel_iommu_domain_alloc(unsigned type)
>   if (type == IOMMU_DOMAIN_DMA)
>   intel_init_iova_domain(dmar_domain);
> 
> - domain_update_iommu_cap(dmar_domain);
> -

Is this intended or a mistake? If the former, it looks like a separate fix...

>   domain = &dmar_domain->domain;
>   domain->geometry.aperture_start = 0;
>   domain->geometry.aperture_end   =
> --
> 2.17.1
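
For readers following the review comment about devices behind different
IOMMUs: the fallback in the patch takes the first device that reports a
valid node. Below is a minimal userspace sketch of that selection policy,
assuming a plain array of node IDs in place of the kernel's device list.
The names pick_fallback_node and NO_NODE are hypothetical and are not
kernel API.

```c
#include <stddef.h>

#define NO_NODE (-1)	/* stands in for the kernel's NUMA_NO_NODE */

/*
 * Return the first valid node ID in nids[], or NO_NODE if none is valid.
 * Devices in one domain may report different nodes; this policy simply
 * takes the first one found, so the result depends on list order.
 */
static int pick_fallback_node(const int *nids, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (nids[i] != NO_NODE)
			return nids[i];
	}

	return NO_NODE;
}
```

With nodes { NO_NODE, 2, 1 } the sketch returns 2 even though a device on
node 1 is also attached, which is the order dependence the review asks to
have documented in a comment.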

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [PATCH 1/2] dt-bindings: iommu: renesas,ipmmu-vmsa: Add r8a7742 support

2020-08-27 Thread Yoshihiro Shimoda
Hi Lad-san,

> From: Lad Prabhakar, Sent: Tuesday, August 25, 2020 11:18 PM
> 
> Document RZ/G1H (R8A7742) SoC bindings.
> 
> No driver change is needed due to the fallback compatible value
> "renesas,ipmmu-vmsa".
> 
> Signed-off-by: Lad Prabhakar 
> Reviewed-by: Chris Paterson 
> ---

Thank you for the patch!

Reviewed-by: Yoshihiro Shimoda 

Best regards,
Yoshihiro Shimoda



RE: [PATCH v3 1/1] iommu/vt-d: Serialize IOMMU GCMD register modifications

2020-08-27 Thread Tian, Kevin
> From: Lu Baolu 
> Sent: Friday, August 28, 2020 8:06 AM
> 
> The VT-d spec requires (10.4.4 Global Command Register, GCMD_REG
> General
> Description) that:
> 
> If multiple control fields in this register need to be modified, software
> must serialize the modifications through multiple writes to this register.
> 
> However, in irq_remapping.c, modifications of IRE and CFI are done in one
> write. We need to do two separate writes with STS checking after each. It
> also checks the status register before writing command register to avoid
> unnecessary register write.
> 
> Fixes: af8d102f999a4 ("x86/intel/irq_remapping: Clean up x2apic opt-out
> security warning mess")
> Cc: Andy Lutomirski 
> Cc: Jacob Pan 
> Cc: Kevin Tian 
> Cc: Ashok Raj 
> Signed-off-by: Lu Baolu 
> ---
>  drivers/iommu/intel/irq_remapping.c | 10 --
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> Change log:
> v1->v2:
>   - v1 posted here
> https://lore.kernel.org/linux-iommu/20200826025825.2322-1-
> baolu...@linux.intel.com/
>   - Add status check before disabling CFI (Kevin)
> v2->v3:
>   - v2 posted here
> https://lore.kernel.org/linux-iommu/20200827042513.30292-1-
> baolu...@linux.intel.com/
>   - Remove unnecessary register read (Kevin)
> 
> diff --git a/drivers/iommu/intel/irq_remapping.c
> b/drivers/iommu/intel/irq_remapping.c
> index 9564d23d094f..a91dd997d268 100644
> --- a/drivers/iommu/intel/irq_remapping.c
> +++ b/drivers/iommu/intel/irq_remapping.c
> @@ -507,12 +507,18 @@ static void iommu_enable_irq_remapping(struct
> intel_iommu *iommu)
> 
>   /* Enable interrupt-remapping */
>   iommu->gcmd |= DMA_GCMD_IRE;
> - iommu->gcmd &= ~DMA_GCMD_CFI;  /* Block compatibility-format
> MSIs */
>   writel(iommu->gcmd, iommu->reg + DMAR_GCMD_REG);
> -
>   IOMMU_WAIT_OP(iommu, DMAR_GSTS_REG,
> readl, (sts & DMA_GSTS_IRES), sts);
> 
> + /* Block compatibility-format MSIs */
> + if (sts & DMA_GSTS_CFIS) {
> + iommu->gcmd &= ~DMA_GCMD_CFI;
> + writel(iommu->gcmd, iommu->reg + DMAR_GCMD_REG);
> + IOMMU_WAIT_OP(iommu, DMAR_GSTS_REG,
> +   readl, !(sts & DMA_GSTS_CFIS), sts);
> + }
> +
>   /*
>* With CFI clear in the Global Command register, we should be
>* protected from dangerous (i.e. compatibility) interrupts
> --
> 2.17.1

Reviewed-by: Kevin Tian 
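
The serialization rule in the commit message can be illustrated with a
small userspace model. This is only a sketch under stated assumptions: the
fake_iommu structure, the FAKE_* bit positions, and the "hardware" that
updates the status instantly are all made up for illustration and do not
reflect the real register interface or IOMMU_WAIT_OP.

```c
#include <stdint.h>

/* Illustrative bit positions, assumed for this model only. */
#define FAKE_GCMD_IRE (UINT32_C(1) << 25)
#define FAKE_GCMD_CFI (UINT32_C(1) << 23)

struct fake_iommu {
	uint32_t gcmd;			/* software shadow of the command register */
	uint32_t gsts;			/* simulated status register */
	unsigned int writes;		/* number of command register writes */
	unsigned int multi_field_writes;/* writes that changed two fields at once */
};

/* Model a command write; the fake hardware completes immediately. */
static void fake_write_gcmd(struct fake_iommu *iommu, uint32_t val)
{
	uint32_t changed = (val ^ iommu->gsts) & (FAKE_GCMD_IRE | FAKE_GCMD_CFI);

	if (changed & (changed - 1))	/* more than one bit differs */
		iommu->multi_field_writes++;

	iommu->gsts = val;
	iommu->writes++;
}

/* One field per write, with a status check after each write. */
static void fake_enable_irq_remapping(struct fake_iommu *iommu)
{
	iommu->gcmd |= FAKE_GCMD_IRE;
	fake_write_gcmd(iommu, iommu->gcmd);
	while (!(iommu->gsts & FAKE_GCMD_IRE))
		;	/* real code polls with a timeout */

	/* Skip the second write entirely if CFI is already clear. */
	if (iommu->gsts & FAKE_GCMD_CFI) {
		iommu->gcmd &= ~FAKE_GCMD_CFI;
		fake_write_gcmd(iommu, iommu->gcmd);
		while (iommu->gsts & FAKE_GCMD_CFI)
			;
	}
}
```

Starting with CFI set, the model issues two writes and never changes two
fields in a single write. Starting with CFI clear, it issues one write,
which matches the "avoid unnecessary register write" note in the commit
message.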


Re: [patch V2 29/46] irqdomain/msi: Allow to override msi_domain_alloc/free_irqs()

2020-08-27 Thread Dey, Megha

Hi Thomas,

On 8/26/2020 4:16 AM, Thomas Gleixner wrote:

From: Thomas Gleixner 

To support MSI irq domains which do not fit at all into the regular MSI
irqdomain scheme, like the XEN MSI interrupt management for PV/HVM/DOM0,
it's necessary to allow to override the alloc/free implementation.

This is a preparatory step to switch X86 away from arch_*_msi_irqs() and
store the irq domain pointer right in struct device.

No functional change for existing MSI irq domain users.

Aside of the evil XEN wrapper this is also useful for special MSI domains
which need to do extra alloc/free work before/after calling the generic
core function. Work like allocating/freeing MSI descriptors, MSI storage
space etc.

Signed-off-by: Thomas Gleixner 

---
  include/linux/msi.h |   27 
  kernel/irq/msi.c|   70 
+++-
  2 files changed, 75 insertions(+), 22 deletions(-)

--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -241,6 +241,10 @@ struct msi_domain_info;
   * @msi_finish:   Optional callback to finalize the allocation
   * @set_desc: Set the msi descriptor for an interrupt
   * @handle_error: Optional error handler if the allocation fails
+ * @domain_alloc_irqs: Optional function to override the default allocation
+ * function.
+ * @domain_free_irqs:  Optional function to override the default free
+ * function.
   *
   * @get_hwirq, @msi_init and @msi_free are callbacks used by
   * msi_create_irq_domain() and related interfaces
@@ -248,6 +252,22 @@ struct msi_domain_info;
   * @msi_check, @msi_prepare, @msi_finish, @set_desc and @handle_error
   * are callbacks used by msi_domain_alloc_irqs() and related
   * interfaces which are based on msi_desc.
+ *
+ * @domain_alloc_irqs, @domain_free_irqs can be used to override the
+ * default allocation/free functions (__msi_domain_alloc/free_irqs). This
+ * is initially for a wrapper around XENs seperate MSI universe which can't
+ * be wrapped into the regular irq domains concepts by mere mortals.  This
+ * allows to universally use msi_domain_alloc/free_irqs without having to
+ * special case XEN all over the place.
+ *
+ * Contrary to other operations @domain_alloc_irqs and @domain_free_irqs
+ * are set to the default implementation if NULL and even when
+ * MSI_FLAG_USE_DEF_DOM_OPS is not set to avoid breaking existing users and
+ * because these callbacks are obviously mandatory.
+ *
+ * This is NOT meant to be abused, but it can be useful to build wrappers
+ * for specialized MSI irq domains which need extra work before and after
+ * calling __msi_domain_alloc_irqs()/__msi_domain_free_irqs().
   */
  struct msi_domain_ops {
irq_hw_number_t (*get_hwirq)(struct msi_domain_info *info,
@@ -270,6 +290,10 @@ struct msi_domain_ops {
struct msi_desc *desc);
int (*handle_error)(struct irq_domain *domain,
struct msi_desc *desc, int error);
+   int (*domain_alloc_irqs)(struct irq_domain *domain,
+struct device *dev, int nvec);
+   void(*domain_free_irqs)(struct irq_domain *domain,
+   struct device *dev);
  };
  
  /**

@@ -327,8 +351,11 @@ int msi_domain_set_affinity(struct irq_d
  struct irq_domain *msi_create_irq_domain(struct fwnode_handle *fwnode,
 struct msi_domain_info *info,
 struct irq_domain *parent);
+int __msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev,
+   int nvec);
  int msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev,
  int nvec);
+void __msi_domain_free_irqs(struct irq_domain *domain, struct device *dev);
  void msi_domain_free_irqs(struct irq_domain *domain, struct device *dev);
  struct msi_domain_info *msi_get_domain_info(struct irq_domain *domain);
  
--- a/kernel/irq/msi.c

+++ b/kernel/irq/msi.c
@@ -229,11 +229,13 @@ static int msi_domain_ops_check(struct i
  }
  
  static struct msi_domain_ops msi_domain_ops_default = {

-   .get_hwirq  = msi_domain_ops_get_hwirq,
-   .msi_init   = msi_domain_ops_init,
-   .msi_check  = msi_domain_ops_check,
-   .msi_prepare= msi_domain_ops_prepare,
-   .set_desc   = msi_domain_ops_set_desc,
+   .get_hwirq  = msi_domain_ops_get_hwirq,
+   .msi_init   = msi_domain_ops_init,
+   .msi_check  = msi_domain_ops_check,
+   .msi_prepare= msi_domain_ops_prepare,
+   .set_desc   = msi_domain_ops_set_desc,
+   .domain_alloc_irqs  = __msi_domain_alloc_irqs,
+   .domain_free_irqs   = __msi_domain_free_irqs,
  };
  
  static void msi_domain_update_dom_ops(struct msi_domain_info *info)
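
The kernel-doc comment in the patch says the two new callbacks are set to
the default implementation when they are NULL. A minimal userspace sketch
of that pattern follows, assuming simplified signatures. Every name
prefixed with fake_ is hypothetical and only stands in for the
msi_domain_ops members and the __msi_domain_alloc_irqs() and
__msi_domain_free_irqs() defaults.

```c
#include <stddef.h>

struct fake_msi_ops {
	int  (*domain_alloc_irqs)(int nvec);
	void (*domain_free_irqs)(void);
};

static int fake_allocated;	/* vectors handed out by the default path */
static int fake_extra_calls;	/* counts the wrapper's extra work */

/* Stand-ins for the generic core functions. */
static int fake_default_alloc(int nvec)
{
	fake_allocated += nvec;
	return 0;
}

static void fake_default_free(void)
{
	fake_allocated = 0;
}

/* A specialized domain does extra work, then calls the generic core. */
static int fake_wrapped_alloc(int nvec)
{
	fake_extra_calls++;
	return fake_default_alloc(nvec);
}

/* Fill in defaults for any callback the domain left NULL. */
static void fake_update_ops(struct fake_msi_ops *ops)
{
	if (!ops->domain_alloc_irqs)
		ops->domain_alloc_irqs = fake_default_alloc;
	if (!ops->domain_free_irqs)
		ops->domain_free_irqs = fake_default_free;
}
```

Callers always go through the ops pointers, so a domain that overrides
only the allocation side still gets the default free path without any
special casing at the call site.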


Re: [patch V2 15/46] x86/irq: Consolidate DMAR irq allocation

2020-08-27 Thread Dey, Megha

Hi Thomas,

On 8/26/2020 1:50 PM, Thomas Gleixner wrote:

On Wed, Aug 26 2020 at 20:32, Thomas Gleixner wrote:

On Wed, Aug 26 2020 at 09:50, Megha Dey wrote:

@@ -329,15 +329,15 @@ static struct irq_chip dmar_msi_controll
   static irq_hw_number_t dmar_msi_get_hwirq(struct msi_domain_info *info,
  msi_alloc_info_t *arg)
   {
-   return arg->dmar_id;
+   return arg->hwirq;

Shouldn't this return the arg->devid which gets set in dmar_alloc_hwirq?

Indeed.

But for simplicity we can set arg->hwirq to the dmar id right in the
alloc function and then once the generic ops are enabled remove the dmar
callback completely

True, can get rid of more code that way.


[PATCH v3 1/1] iommu/vt-d: Serialize IOMMU GCMD register modifications

2020-08-27 Thread Lu Baolu
The VT-d spec requires (10.4.4 Global Command Register, GCMD_REG General
Description) that:

If multiple control fields in this register need to be modified, software
must serialize the modifications through multiple writes to this register.

However, in irq_remapping.c, modifications of IRE and CFI are done in one
write. We need to do two separate writes with STS checking after each. It
also checks the status register before writing command register to avoid
unnecessary register write.

Fixes: af8d102f999a4 ("x86/intel/irq_remapping: Clean up x2apic opt-out 
security warning mess")
Cc: Andy Lutomirski 
Cc: Jacob Pan 
Cc: Kevin Tian 
Cc: Ashok Raj 
Signed-off-by: Lu Baolu 
---
 drivers/iommu/intel/irq_remapping.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

Change log:
v1->v2:
  - v1 posted here

https://lore.kernel.org/linux-iommu/20200826025825.2322-1-baolu...@linux.intel.com/
  - Add status check before disabling CFI (Kevin)
v2->v3:
  - v2 posted here

https://lore.kernel.org/linux-iommu/20200827042513.30292-1-baolu...@linux.intel.com/
  - Remove unnecessary register read (Kevin)

diff --git a/drivers/iommu/intel/irq_remapping.c 
b/drivers/iommu/intel/irq_remapping.c
index 9564d23d094f..a91dd997d268 100644
--- a/drivers/iommu/intel/irq_remapping.c
+++ b/drivers/iommu/intel/irq_remapping.c
@@ -507,12 +507,18 @@ static void iommu_enable_irq_remapping(struct intel_iommu 
*iommu)
 
/* Enable interrupt-remapping */
iommu->gcmd |= DMA_GCMD_IRE;
-   iommu->gcmd &= ~DMA_GCMD_CFI;  /* Block compatibility-format MSIs */
writel(iommu->gcmd, iommu->reg + DMAR_GCMD_REG);
-
IOMMU_WAIT_OP(iommu, DMAR_GSTS_REG,
  readl, (sts & DMA_GSTS_IRES), sts);
 
+   /* Block compatibility-format MSIs */
+   if (sts & DMA_GSTS_CFIS) {
+   iommu->gcmd &= ~DMA_GCMD_CFI;
+   writel(iommu->gcmd, iommu->reg + DMAR_GCMD_REG);
+   IOMMU_WAIT_OP(iommu, DMAR_GSTS_REG,
+ readl, !(sts & DMA_GSTS_CFIS), sts);
+   }
+
/*
 * With CFI clear in the Global Command register, we should be
 * protected from dangerous (i.e. compatibility) interrupts
-- 
2.17.1



Re: [PATCH v2 1/1] iommu/vt-d: Serialize IOMMU GCMD register modifications

2020-08-27 Thread Lu Baolu

Hi,

On 8/27/20 1:39 PM, Tian, Kevin wrote:

From: Lu Baolu 
Sent: Thursday, August 27, 2020 12:25 PM

The VT-d spec requires (10.4.4 Global Command Register, GCMD_REG
General
Description) that:

If multiple control fields in this register need to be modified, software
must serialize the modifications through multiple writes to this register.

However, in irq_remapping.c, modifications of IRE and CFI are done in one
write. We need to do two separate writes with STS checking after each.

Fixes: af8d102f999a4 ("x86/intel/irq_remapping: Clean up x2apic opt-out
security warning mess")
Cc: Andy Lutomirski 
Cc: Jacob Pan 
Cc: Kevin Tian 
Cc: Ashok Raj 
Signed-off-by: Lu Baolu 
---
  drivers/iommu/intel/irq_remapping.c | 11 +--
  1 file changed, 9 insertions(+), 2 deletions(-)

Change log:
v1->v2:
   - v1 posted here
 https://lore.kernel.org/linux-iommu/20200826025825.2322-1-
baolu...@linux.intel.com/;
   - Add status check before disabling CFI. (Kevin)

diff --git a/drivers/iommu/intel/irq_remapping.c
b/drivers/iommu/intel/irq_remapping.c
index 9564d23d094f..7552bb7e92c8 100644
--- a/drivers/iommu/intel/irq_remapping.c
+++ b/drivers/iommu/intel/irq_remapping.c
@@ -507,12 +507,19 @@ static void iommu_enable_irq_remapping(struct
intel_iommu *iommu)

/* Enable interrupt-remapping */
iommu->gcmd |= DMA_GCMD_IRE;
-   iommu->gcmd &= ~DMA_GCMD_CFI;  /* Block compatibility-format
MSIs */
writel(iommu->gcmd, iommu->reg + DMAR_GCMD_REG);
-
IOMMU_WAIT_OP(iommu, DMAR_GSTS_REG,
  readl, (sts & DMA_GSTS_IRES), sts);

+   /* Block compatibility-format MSIs */
+   sts = readl(iommu->reg + DMAR_GSTS_REG);


No need for this readl, as the status is already there from IOMMU_WAIT_OP.


Yes. Really!

Best regards,
baolu




+   if (sts & DMA_GSTS_CFIS) {
+   iommu->gcmd &= ~DMA_GCMD_CFI;
+   writel(iommu->gcmd, iommu->reg + DMAR_GCMD_REG);
+   IOMMU_WAIT_OP(iommu, DMAR_GSTS_REG,
+ readl, !(sts & DMA_GSTS_CFIS), sts);
+   }
+
/*
 * With CFI clear in the Global Command register, we should be
 * protected from dangerous (i.e. compatibility) interrupts
--
2.17.1





Re: [PATCH 0/8] Convert the intel iommu driver to the dma-iommu api

2020-08-27 Thread Tom Murphy
On Thu, 27 Aug 2020 at 22:36, Logan Gunthorpe  wrote:
>
>
>
> On 2020-08-23 6:04 p.m., Tom Murphy wrote:
> > I have added a check for the sg_dma_len == 0 :
> > """
> >  } __sgt_iter(struct scatterlist *sgl, bool dma) {
> > struct sgt_iter s = { .sgp = sgl };
> >
> > +   if (sgl && sg_dma_len(sgl) == 0)
> > +   s.sgp = NULL;
> >
> > if (s.sgp) {
> > .
> > """
> > at location [1].
> > but it doesn't fix the problem.
>
> Based on my read of the code, it looks like we also need to change usage
> of sgl->length... Something like the rough patch below, maybe?
>
> Also, Tom, do you have an updated version of the patchset to convert the
> Intel IOMMU to dma-iommu available? The last one I've found doesn't
> apply cleanly (I'm assuming parts of it have been merged in slightly
> modified forms).
>

I'll try to post one in the next 24 hours.

> Thanks,
>
> Logan
>
> --
>
> diff --git a/drivers/gpu/drm/i915/i915_scatterlist.h
> b/drivers/gpu/drm/i915/i915
> index b7b59328cb76..9367ac801f0c 100644
> --- a/drivers/gpu/drm/i915/i915_scatterlist.h
> +++ b/drivers/gpu/drm/i915/i915_scatterlist.h
> @@ -27,13 +27,19 @@ static __always_inline struct sgt_iter {
>  } __sgt_iter(struct scatterlist *sgl, bool dma) {
> struct sgt_iter s = { .sgp = sgl };
>
> +   if (sgl && !sg_dma_len(s.sgp))
> +   s.sgp = NULL;
> +
> if (s.sgp) {
> s.max = s.curr = s.sgp->offset;
> -   s.max += s.sgp->length;
> -   if (dma)
> +
> +   if (dma) {
> +   s.max += sg_dma_len(s.sgp);
> s.dma = sg_dma_address(s.sgp);
> -   else
> +   } else {
> +   s.max += s.sgp->length;
> s.pfn = page_to_pfn(sg_page(s.sgp));
> +   }
> }
>
> return s;


Re: [PATCH 0/8] Convert the intel iommu driver to the dma-iommu api

2020-08-27 Thread Logan Gunthorpe



On 2020-08-23 6:04 p.m., Tom Murphy wrote:
> I have added a check for the sg_dma_len == 0 :
> """
>  } __sgt_iter(struct scatterlist *sgl, bool dma) {
> struct sgt_iter s = { .sgp = sgl };
> 
> +   if (sgl && sg_dma_len(sgl) == 0)
> +   s.sgp = NULL;
> 
> if (s.sgp) {
> .
> """
> at location [1].
> but it doesn't fix the problem.

Based on my read of the code, it looks like we also need to change usage
of sgl->length... Something like the rough patch below, maybe?

Also, Tom, do you have an updated version of the patchset to convert the
Intel IOMMU to dma-iommu available? The last one I've found doesn't
apply cleanly (I'm assuming parts of it have been merged in slightly
modified forms).

Thanks,

Logan

--

diff --git a/drivers/gpu/drm/i915/i915_scatterlist.h
b/drivers/gpu/drm/i915/i915
index b7b59328cb76..9367ac801f0c 100644
--- a/drivers/gpu/drm/i915/i915_scatterlist.h
+++ b/drivers/gpu/drm/i915/i915_scatterlist.h
@@ -27,13 +27,19 @@ static __always_inline struct sgt_iter {
 } __sgt_iter(struct scatterlist *sgl, bool dma) {
struct sgt_iter s = { .sgp = sgl };

+   if (sgl && !sg_dma_len(s.sgp))
+   s.sgp = NULL;
+
if (s.sgp) {
s.max = s.curr = s.sgp->offset;
-   s.max += s.sgp->length;
-   if (dma)
+
+   if (dma) {
+   s.max += sg_dma_len(s.sgp);
s.dma = sg_dma_address(s.sgp);
-   else
+   } else {
+   s.max += s.sgp->length;
s.pfn = page_to_pfn(sg_page(s.sgp));
+   }
}

return s;
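
The rough patch above can be exercised in userspace with a simplified
model. This is a sketch only: fake_sg and fake_iter are hypothetical
stand-ins for struct scatterlist and struct sgt_iter, and the dma_length
field plays the role of sg_dma_len(). The zero-length check mirrors the
rough patch as posted, which applies it regardless of the dma flag.

```c
#include <stdbool.h>
#include <stddef.h>

struct fake_sg {
	unsigned int offset;
	unsigned int length;		/* CPU-side length of the segment */
	unsigned int dma_length;	/* length after DMA mapping, may differ */
};

struct fake_iter {
	const struct fake_sg *sgp;
	unsigned int curr;
	unsigned int max;
};

static struct fake_iter fake_sgt_iter(const struct fake_sg *sgl, bool dma)
{
	struct fake_iter s = { .sgp = sgl };

	/* A zero DMA length marks the end of the mapped segments. */
	if (sgl && !s.sgp->dma_length)
		s.sgp = NULL;

	if (s.sgp) {
		s.max = s.curr = s.sgp->offset;

		/* Use the length that matches the address space being walked. */
		if (dma)
			s.max += s.sgp->dma_length;
		else
			s.max += s.sgp->length;
	}

	return s;
}
```

For a segment with offset 16, length 4096 and DMA length 8192, the DMA
walk ends at 8208 while the page walk ends at 4112, which is the
difference the old code missed by always using the CPU-side length.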


Re: [PATCH 6/9] dt-bindings: gpio: renesas, rcar-gpio: Add r8a774e1 support

2020-08-27 Thread Bartosz Golaszewski
On Thu, Aug 27, 2020 at 6:40 PM Lad, Prabhakar
 wrote:
>
> Hi Linus and Bartosz,
>
> On Mon, Jul 13, 2020 at 10:35 PM Lad Prabhakar
>  wrote:
> >
> > Document Renesas RZ/G2H (R8A774E1) GPIO blocks compatibility within the
> > relevant dt-bindings.
> >
> > Signed-off-by: Lad Prabhakar 
> > ---
> >  Documentation/devicetree/bindings/gpio/renesas,rcar-gpio.yaml | 1 +
> >  1 file changed, 1 insertion(+)
> >
> Gentle ping.
>
> Cheers,
> Prabhakar

This doesn't apply on top of v5.9-rc1.

Bart


Re: [PATCH] iommu/iova: Replace cmpxchg with xchg in queue_iova

2020-08-27 Thread Robin Murphy

On 2020-08-27 09:43, Shaokun Zhang wrote:

From: Yuqi Jin 

The performance of atomic_xchg is better than atomic_cmpxchg because
no comparison is required, and the value of @fq_timer_on can only be 0
or 1. Let's use atomic_xchg instead of atomic_cmpxchg here because we
only need to check whether the value changes from 0 to 1 or stays at 1.


Looks reasonable to me - the "compare" part is already covered by the 
separate atomic_read(), and as you say there's no harm if the "exchange" 
part races as long as it's still atomic.


Reviewed-by: Robin Murphy 


Cc: Joerg Roedel 
Signed-off-by: Yuqi Jin 
Signed-off-by: Shaokun Zhang 
---
  drivers/iommu/iova.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 45a251da5453..30d969a4c5fd 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -579,7 +579,7 @@ void queue_iova(struct iova_domain *iovad,
  
  	/* Avoid false sharing as much as possible. */

if (!atomic_read(&iovad->fq_timer_on) &&
-   !atomic_cmpxchg(&iovad->fq_timer_on, 0, 1))
+   !atomic_xchg(&iovad->fq_timer_on, 1))
mod_timer(&iovad->fq_timer,
  jiffies + msecs_to_jiffies(IOVA_FQ_TIMEOUT));
  }
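
The equivalence discussed in the review can be checked with C11 atomics in
userspace. This is a sketch, not the kernel code: atomic_exchange() and
atomic_compare_exchange_strong() from <stdatomic.h> stand in for the
kernel's atomic_xchg() and atomic_cmpxchg(), and the two helper names are
hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Old form: arm the timer only if we flipped the flag from 0 to 1. */
static bool should_arm_cmpxchg(atomic_int *timer_on)
{
	int expected = 0;

	return !atomic_load(timer_on) &&
	       atomic_compare_exchange_strong(timer_on, &expected, 1);
}

/*
 * New form: unconditionally store 1 and look at the old value. Because
 * the flag is only ever 0 or 1, storing 1 over 1 is harmless, and exactly
 * one caller observes the old value 0.
 */
static bool should_arm_xchg(atomic_int *timer_on)
{
	return !atomic_load(timer_on) && !atomic_exchange(timer_on, 1);
}
```

In both forms the plain load in front keeps callers from touching the
cache line with a write once the flag is already set, which is what the
"Avoid false sharing" comment in the patched function refers to.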




Re: [RESEND PATCH v4] iommu/mediatek: check 4GB mode by reading infracfg

2020-08-27 Thread Robin Murphy

On 2020-08-27 06:31, Yong Wu wrote:

On Wed, 2020-08-26 at 16:56 +0800, Miles Chen wrote:

In previous discussion [1] and [2], we found that it is risky to
use max_pfn or totalram_pages to tell if 4GB mode is enabled.

Check 4GB mode by reading infracfg register, remove the usage
of the un-exported symbol max_pfn.

This is a step towards building mtk_iommu as a kernel module.

[1] https://lore.kernel.org/lkml/20200603161132.2441-1-miles.c...@mediatek.com/
[2] https://lore.kernel.org/lkml/20200604080120.2628-1-miles.c...@mediatek.com/
[3] https://lore.kernel.org/lkml/20200715205120.GA778876@bogus/

Cc: Mike Rapoport 
Cc: David Hildenbrand 
Cc: Yong Wu 
Cc: Yingjoe Chen 
Cc: Christoph Hellwig 
Cc: Rob Herring 
Cc: Matthias Brugger 
Signed-off-by: Miles Chen 

---

Change since v3
- use lore.kernel.org links
- move "change since..." after "---"

Change since v2:
- determine compatible string by m4u_plat
- rebase to next-20200720
- add "---"

Change since v1:
- remove the phandle usage, search for infracfg instead [3]
- use infracfg instead of infracfg_regmap
- move infracfg definitions to linux/soc/mediatek/infracfg.h
- update enable_4GB only when has_4gb_mode
---
  drivers/iommu/mtk_iommu.c | 34 +++
  include/linux/soc/mediatek/infracfg.h |  3 +++
  2 files changed, 32 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 785b228d39a6..adc350150492 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -3,7 +3,6 @@
   * Copyright (c) 2015-2016 MediaTek Inc.
   * Author: Yong Wu 
   */
-#include 
  #include 
  #include 
  #include 
@@ -15,13 +14,16 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 
+#include 
  #include 
  #include 
  
@@ -640,8 +642,11 @@ static int mtk_iommu_probe(struct platform_device *pdev)

struct resource *res;
resource_size_t ioaddr;
struct component_match  *match = NULL;
+   struct regmap   *infracfg;
void*protect;
int i, larb_nr, ret;
+   u32 val;
+   char*p;
  
  	data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL);

if (!data)
@@ -655,10 +660,29 @@ static int mtk_iommu_probe(struct platform_device *pdev)
return -ENOMEM;
data->protect_base = ALIGN(virt_to_phys(protect), MTK_PROTECT_PA_ALIGN);
  
-	/* Whether the current dram is over 4GB */

-   data->enable_4GB = !!(max_pfn > (BIT_ULL(32) >> PAGE_SHIFT));
-   if (!MTK_IOMMU_HAS_FLAG(data->plat_data, HAS_4GB_MODE))
-   data->enable_4GB = false;
+   data->enable_4GB = false;


Nit: this isn't really necessary, since the structure is kzalloc()ed.


+   if (MTK_IOMMU_HAS_FLAG(data->plat_data, HAS_4GB_MODE)) {
+   switch (data->plat_data->m4u_plat) {
+   case M4U_MT2712:
+   p = "mediatek,mt2712-infracfg";
+   break;
+   case M4U_MT8173:
+   p = "mediatek,mt8173-infracfg";
+   break;
+   default:
+   p = NULL;
+   }
+


This can be simplified:

	if (data->plat_data->m4u_plat == M4U_MT2712)
		p = "mediatek,mt2712-infracfg";
	else if (data->plat_data->m4u_plat == M4U_MT8173)
		p = "mediatek,mt8173-infracfg";
	else
		return -EINVAL;


Right, at this point the HAS_4GB_MODE flag is entirely redundant and 
should be removed. FWIW I still think your suggestion of putting the 
infracfg names into plat_data would be even better and cleaner - there's 
plenty of precedent for that sort of thing (see "git grep '\.clk_name'" 
for example).


Robin.



Then,
Reviewed-by: Yong Wu 



+   infracfg = syscon_regmap_lookup_by_compatible(p);
+
+   if (IS_ERR(infracfg))
+   return PTR_ERR(infracfg);
+
+   ret = regmap_read(infracfg, REG_INFRA_MISC, &val);
+   if (ret)
+   return ret;
+   data->enable_4GB = !!(val & F_DDR_4GB_SUPPORT_EN);
+   }
  
  	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);

data->base = devm_ioremap_resource(dev, res);
diff --git a/include/linux/soc/mediatek/infracfg.h 
b/include/linux/soc/mediatek/infracfg.h
index fd25f0148566..233463d789c6 100644
--- a/include/linux/soc/mediatek/infracfg.h
+++ b/include/linux/soc/mediatek/infracfg.h
@@ -32,6 +32,9 @@
  #define MT7622_TOP_AXI_PROT_EN_WB (BIT(2) | BIT(6) | \
 BIT(7) | BIT(8))
  
+#define REG_INFRA_MISC         0xf00

+#define F_DDR_4GB_SUPPORT_EN   BIT(13)
+
  int mtk_infracfg_set_bus_protection(struct regmap *infracfg, u32 mask,
bool reg_update);
  

Re: [patch V2 34/46] PCI/MSI: Make arch_.*_msi_irq[s] fallbacks selectable

2020-08-27 Thread Bjorn Helgaas
[+cc Rob,
cover https://lore.kernel.org/r/20200826111628.794979...@linutronix.de/
this  https://lore.kernel.org/r/20200826112333.992429...@linutronix.de/]

On Wed, Aug 26, 2020 at 01:17:02PM +0200, Thomas Gleixner wrote:
> From: Thomas Gleixner 
> 
> The arch_.*_msi_irq[s] fallbacks are compiled in whether an architecture
> requires them or not. Architectures which are fully utilizing hierarchical
> irq domains should never call into that code.
> 
> It's not only architectures which depend on that by implementing one or
> more of the weak functions, there is also a bunch of drivers which relies
> on the weak functions which invoke msi_controller::setup_irq[s] and
> msi_controller::teardown_irq.
> 
> Make the architectures and drivers which rely on them select them in Kconfig
> and if not selected replace them by stub functions which emit a warning and
> fail the PCI/MSI interrupt allocation.

Sorry, I really don't understand this, so these are probably stupid
questions.

If CONFIG_PCI_MSI_ARCH_FALLBACKS is defined, we will supply
implementations of:

  arch_setup_msi_irq
  arch_teardown_msi_irq
  arch_setup_msi_irqs
  arch_teardown_msi_irqs
  default_teardown_msi_irqs# non-weak

You select CONFIG_PCI_MSI_ARCH_FALLBACKS for ia64, mips, powerpc,
s390, sparc, and x86.  I see that all of those arches except x86
implement at least one of the functions above, and I can't figure
out why x86 needs to select CONFIG_PCI_MSI_ARCH_FALLBACKS.

I assume there's a way to convert these arches to hierarchical irq
domains so they wouldn't need this at all?  Is there a sample
conversion to look at?

And I can't figure out what's special about tegra, rcar, and xilinx
that makes them need it as well.  Is there something I could grep for
to identify them?  Is there a way to convert them so they don't need
it?

> --- a/include/linux/msi.h
> +++ b/include/linux/msi.h
> @@ -193,17 +193,38 @@ void pci_msi_mask_irq(struct irq_data *d
>  void pci_msi_unmask_irq(struct irq_data *data);
>  
>  /*
> - * The arch hooks to setup up msi irqs. Those functions are
> - * implemented as weak symbols so that they /can/ be overriden by
> - * architecture specific code if needed.
> + * The arch hooks to setup up msi irqs. Default functions are implemented

s/msi/MSI/ to match the one below.

> + * as weak symbols so that they /can/ be overriden by architecture specific
> + * code if needed. These hooks must be enabled by the architecture or by
> + * drivers which depend on them via msi_controller based MSI handling.
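
The "select it or get a failing stub" arrangement described in the commit
message can be sketched in plain C with a preprocessor switch. This is an
illustration only: FAKE_MSI_ARCH_FALLBACKS and fake_arch_setup_msi_irqs
are hypothetical names, the warning is a plain fprintf rather than a
kernel warning, and the -ENODEV error code is an assumption made for the
example.

```c
#include <errno.h>
#include <stdio.h>

#ifdef FAKE_MSI_ARCH_FALLBACKS
/* Selected: a real fallback implementation is compiled in. */
static int fake_arch_setup_msi_irqs(int nvec)
{
	(void)nvec;	/* a real fallback would set up nvec interrupts */
	return 0;
}
#else
/*
 * Not selected: callers still link, but any call is a bug, so the stub
 * warns and fails the allocation.
 */
static inline int fake_arch_setup_msi_irqs(int nvec)
{
	(void)nvec;
	fprintf(stderr, "warning: MSI arch fallback called but not selected\n");
	return -ENODEV;
}
#endif
```

Built without the macro, any call fails loudly, which would expose an
architecture or driver that still depends on the fallbacks without
selecting them.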


Re: [PATCH 13/18] iommu/tegra: Add IOMMU_DOMAIN_DMA support

2020-08-27 Thread Robin Murphy

On 2020-08-27 16:45, Thierry Reding wrote:

On Thu, Aug 20, 2020 at 04:08:32PM +0100, Robin Murphy wrote:

Now that arch/arm is wired up for default domains and iommu-dma,
implement the corresponding driver-side support for DMA domains.

Signed-off-by: Robin Murphy 
---
  drivers/iommu/tegra-smmu.c | 37 +
  1 file changed, 21 insertions(+), 16 deletions(-)


We can't do that yet because it will currently still break for use-cases
such as display where we don't properly set up identity mappings during
boot. The result is that the dma-iommu code will enable translations
before the driver gets a chance to set up any mappings and if the
display controller was left on by the bootloader, scanning out a splash
screen, this causes faults between the point where dma-iommu is being
set up for the display controller and where the display controller
starts mapping its own buffers (rather than the ones mapped by the
bootloader).


Rest assured that I understand the situation all too well ;) As with 
tegra-gart, the unspoken point here is that since tegra-smmu implements 
of_xlate(), then arm_setup_iommu_dma_ops() must already be causing the 
exact same problem, no? This patch only seeks to move any existing 
behaviour over to the common backend, regardless of whether it was ever 
really appropriate in the first place.



That said, I do have a series that I've been carrying around for longer
than I've wanted that does exactly this for Tegra SMMU and I'd prefer if
you could drop this particular change from your series so that I can
keep working on resolving the identity mapping issues first.


That would mean you'd see a functional change from the final patch, 
wherein nothing would ever be able to get translation unless drivers do 
their own explicit IOMMU API management. If you definitely want that 
change then OK, but it would still be nice to do it "properly" with 
IOMMU_DOMAIN_IDENTITY support, rather than just forcibly failing default 
domain allocation. I'm having a go at reworking the tegra-gart patch in 
that direction, so I can certainly try it for tegra-smmu as well.


Robin.


Re: [PATCH 6/9] dt-bindings: gpio: renesas, rcar-gpio: Add r8a774e1 support

2020-08-27 Thread Lad, Prabhakar
Hi Linus and Bartosz,

On Mon, Jul 13, 2020 at 10:35 PM Lad Prabhakar
 wrote:
>
> Document Renesas RZ/G2H (R8A774E1) GPIO blocks compatibility within the
> relevant dt-bindings.
>
> Signed-off-by: Lad Prabhakar 
> ---
>  Documentation/devicetree/bindings/gpio/renesas,rcar-gpio.yaml | 1 +
>  1 file changed, 1 insertion(+)
>
Gentle ping.

Cheers,
Prabhakar


Re: [PATCH v2 1/9] docs: Document IO Address Space ID (IOASID) APIs

2020-08-27 Thread Auger Eric
Hi Jacob,
On 8/24/20 12:32 PM, Jean-Philippe Brucker wrote:
> On Fri, Aug 21, 2020 at 09:35:10PM -0700, Jacob Pan wrote:
>> IOASID is used to identify address spaces that can be targeted by device
>> DMA. It is a system-wide resource that is essential to its many users.
>> This document is an attempt to help developers from all vendors navigate
>> the APIs. At this time, ARM SMMU and Intel’s Scalable IO Virtualization
>> (SIOV) enabled platforms are the primary users of IOASID. Examples of
>> how SIOV components interact with IOASID APIs are provided in that many
>> APIs are driven by the requirements from SIOV.
>>
>> Signed-off-by: Liu Yi L 
>> Signed-off-by: Wu Hao 
>> Signed-off-by: Jacob Pan 
>> ---
>>  Documentation/ioasid.rst | 618 
>> +++
>>  1 file changed, 618 insertions(+)
>>  create mode 100644 Documentation/ioasid.rst
>>
>> diff --git a/Documentation/ioasid.rst b/Documentation/ioasid.rst
> 
> Thanks for writing this up. Should it go to Documentation/driver-api/, or
> Documentation/driver-api/iommu/? I think this also needs to Cc
> linux-...@vger.kernel.org and cor...@lwn.net
> 
>> new file mode 100644
>> index ..b6a8cdc885ff
>> --- /dev/null
>> +++ b/Documentation/ioasid.rst
>> @@ -0,0 +1,618 @@
>> +.. ioasid:
>> +
>> +===================
>> +IO Address Space ID
>> +===================
>> +
>> +IOASID is a generic name for PCIe Process Address ID (PASID) or ARM
>> +SMMU sub-stream ID. An IOASID identifies an address space that DMA
> 
> "SubstreamID"
On ARM if we don't use PASIDs we have streamids (SID) which can also
identify address spaces that DMA requests can target. So maybe this
definition is not sufficient.

> 
>> +requests can target.
>> +
>> +The primary use cases for IOASID are Shared Virtual Address (SVA) and
>> +IO Virtual Address (IOVA). However, the requirements for IOASID
> 
> IOVA alone isn't a use case, maybe "multiple IOVA spaces per device"?
> 
>> +management can vary among hardware architectures.
>> +
>> +This document covers the generic features supported by IOASID
>> +APIs. Vendor-specific use cases are also illustrated with Intel's VT-d
>> +based platforms as the first example.
>> +
>> +.. contents:: :local:
>> +
>> +Glossary
>> +
>> +PASID - Process Address Space ID
>> +
>> +IOASID - IO Address Space ID (generic term for PCIe PASID and
>> +sub-stream ID in SMMU)
> 
> "SubstreamID"
> 
>> +
>> +SVA/SVM - Shared Virtual Addressing/Memory
>> +
>> +ENQCMD - New Intel X86 ISA for efficient workqueue submission [1]
> 
> Maybe drop the "New", to keep the documentation perennial. It might be
> good to add internal links here to the specifications URLs at the bottom.
> 
>> +
>> +DSA - Intel Data Streaming Accelerator [2]
>> +
>> +VDCM - Virtual device composition module [3]
>> +
>> +SIOV - Intel Scalable IO Virtualization
>> +
>> +
>> +Key Concepts
>> +
>> +
>> +IOASID Set
>> +---
>> +An IOASID set is a group of IOASIDs allocated from the system-wide
>> +IOASID pool. An IOASID set is created and can be identified by a
>> +token of u64. Refer to IOASID set APIs for more details.
> 
> Identified either by an u64 or an mm_struct, right?  Maybe just drop the
> second sentence if it's detailed in the IOASID set section below.
> 
>> +
>> +IOASID set is particularly useful for guest SVA where each guest could
>> +have its own IOASID set for security and efficiency reasons.
>> +
>> +IOASID Set Private ID (SPID)
>> +
>> +SPIDs are introduced as IOASIDs within its set. Each SPID maps to a
>> +system-wide IOASID but the namespace of SPID is within its IOASID
>> +set.
> 
> The intro isn't super clear. Perhaps this is simpler:
> "Each IOASID set has a private namespace of SPIDs. An SPID maps to a
> single system-wide IOASID."
or, "within an ioasid set, each ioasid can be associated with an alias
ID, named SPID."
> 
>> SPIDs can be used as guest IOASIDs where each guest could do
>> +IOASID allocation from its own pool and map them to host physical
>> +IOASIDs. SPIDs are particularly useful for supporting live migration
>> +where decoupling guest and host physical resources are necessary.
>> +
>> +For example, two VMs can both allocate guest PASID/SPID #101 but map to
>> +different host PASIDs #201 and #202 respectively as shown in the
>> +diagram below.
>> +::
>> +
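>> + .------------------.    .------------------.
>> + |   VM 1           |    |   VM 2           |
>> + |                  |    |                  |
>> + |------------------|    |------------------|
>> + | GPASID/SPID 101  |    | GPASID/SPID 101  |
>> + '------------------'    '------------------' Guest
>> + __________|______________________|__________
>> +           |                      |            Host
>> +           v                      v
>> + .------------------.    .------------------.
>> + | Host IOASID 201  |    | Host IOASID 202  |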
>> + 

Re: [PATCH 16/18] staging/media/tegra-vde: Clean up IOMMU workaround

2020-08-27 Thread Thierry Reding
On Thu, Aug 27, 2020 at 10:05:14AM +0300, Dmitry Osipenko wrote:
> 24.08.2020 17:01, Robin Murphy wrote:
> ...
> >> Robin, thank you very much for the clarifications!
> >>
> >> In accordance with your comments, this patch can't be applied until Tegra
> >> SMMU supports IOMMU_DOMAIN_IDENTITY and implements a def_domain_type()
> >> callback that returns IOMMU_DOMAIN_IDENTITY for the VDE device.
> >>
> >> Otherwise you're breaking the VDE driver because
> >> dma_buf_map_attachment() [1] returns the IOMMU SGT of the implicit
> >> domain which is then mapped into the VDE's explicit domain [2], and this
> >> is nonsense.
> > 
> > It's true that iommu_dma_ops will do some work in the unattached default
> > domain, but non-coherent cache maintenance will still be performed
> > correctly on the underlying memory, which is really all that you care
> > about for this case. As for tegra_vde_iommu_map(), that seems to do the
> > right thing in only referencing the physical side of the scatterlist
> > (via iommu_map_sg()) and ignoring the DMA side, so things ought to work
> > out OK even if it is a little non-obvious.
> 
> I'll need to double-check this, it's indeed not clear to me right now.
> 
> I see that if the Tegra DRM driver uses the implicit IOMMU domain, then
> when the VDE driver imports a DMA-buf from Tegra DRM, the imported buffer
> will be auto-mapped to the implicit VDE IOVA [1].
> 
> [1]
> https://elixir.bootlin.com/linux/v5.9-rc2/source/drivers/gpu/drm/tegra/gem.c#L574
> 
> >> Hence, either VDE driver should bypass iommu_dma_ops from the start or
> >> it needs a way to kick out the ops, like it does this using ARM's
> >> arm_iommu_detach_device().
> >>
> >>
> >> The same applies to the Tegra GPU devices, otherwise you're breaking
> >> them as well because Tegra DRM is sensitive to implicit vs explicit
> >> domain.
> > 
> > Note that Tegra DRM will only be as broken as its current state on
> > arm64, and I was under the impression that that was OK now - at least I
> > don't recall seeing any complaints since 43c5bf11a610. Although that
> > commit and the one before it are resolving the scalability issue that
> > they describe, it was very much in my mind at the time that they also
> > have the happy side-effect described above - the default domain isn't
> > *completely* out of the way, but it's far enough that sensible cases
> > should be able to work as expected.
> 
> The Tegra DRM has a very special quirk for ARM32 that was added in this
> commit [2], and the driver relies on checking whether an explicit or
> implicit IOMMU is used in order to activate the quirk.
> 
> [2]
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=273da5a046965ccf0ec79eb63f2d5173467e20fa
> 
> Once the implicit IOMMU is used for the DRM driver, the quirk no longer
> works (if I'm not missing something). This problem needs to be resolved
> before implicit IOMMU could be used by the Tegra DRM on ARM32.
> 
> >> BTW, I tried to apply this series and T30 doesn't boot anymore. I don't
> >> have more info for now.
> > 
> > Yeah, I'm still trying to get to the bottom of whether it's actually
> > working as intended at all, even on my RK3288. So far my debugging
> > instrumentation has been confusingly inconclusive :/
> 
> Surely it will take some time to resolve all the problems and it's great
> that you're pushing this work!
> 
> I'll try to help with fixing the ARM32 Tegra side of the problems. I
> added this to my "TODO" list and should be able to take a closer look
> during this week or next!

I do have a patch lying around somewhere that implements the mapping
cache that was referenced in the above commit. Let me know if I should
dig that up and send it out.

Thierry



Re: [PATCH 13/18] iommu/tegra: Add IOMMU_DOMAIN_DMA support

2020-08-27 Thread Thierry Reding
On Thu, Aug 20, 2020 at 04:08:32PM +0100, Robin Murphy wrote:
> Now that arch/arm is wired up for default domains and iommu-dma,
> implement the corresponding driver-side support for DMA domains.
> 
> Signed-off-by: Robin Murphy 
> ---
>  drivers/iommu/tegra-smmu.c | 37 +
>  1 file changed, 21 insertions(+), 16 deletions(-)

We can't do that yet because it will currently still break for use-cases
such as display where we don't properly set up identity mappings during
boot. The result is that the dma-iommu code will enable translations
before the driver gets a chance to set up any mappings and if the
display controller was left on by the bootloader, scanning out a splash
screen, this causes faults between the point where dma-iommu is being
set up for the display controller and where the display controller
starts mapping its own buffers (rather than the ones mapped by the
bootloader).
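To make the failure mode above concrete, here is a toy user-space model, offered only as a sketch and not as anything resembling the tegra-smmu driver. Every name in it (struct toy_iommu, toy_map_identity(), toy_translate()) is hypothetical, and it assumes page-aligned addresses and a tiny single-level table. It shows that a device address passes through while translation is off, faults once translation is enabled with no mappings, and works again once the range is mapped to itself.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12
#define TOY_NR_PAGES   16

/* Toy single-level translation table; entry 0 means "not mapped". */
struct toy_iommu {
	bool translating;            /* false: addresses pass through */
	uint64_t pte[TOY_NR_PAGES];  /* physical page address | 1 when valid */
};

/* Map [addr, addr + size) to itself, page by page (addr must be page-aligned). */
static void toy_map_identity(struct toy_iommu *mmu, uint64_t addr, uint64_t size)
{
	for (uint64_t a = addr; a < addr + size; a += 1u << TOY_PAGE_SHIFT) {
		size_t idx = a >> TOY_PAGE_SHIFT;

		if (idx < TOY_NR_PAGES)
			mmu->pte[idx] = (a & ~UINT64_C(0xfff)) | 1;
	}
}

/* Translate a device address; returns false to model an IOMMU fault. */
static bool toy_translate(const struct toy_iommu *mmu, uint64_t iova, uint64_t *pa)
{
	size_t idx = iova >> TOY_PAGE_SHIFT;

	if (!mmu->translating) {
		*pa = iova;
		return true;
	}
	if (idx >= TOY_NR_PAGES || !(mmu->pte[idx] & 1))
		return false;
	*pa = (mmu->pte[idx] & ~UINT64_C(0xfff)) | (iova & 0xfff);
	return true;
}
```

In this model, the splash-screen case corresponds to setting translating to true before toy_map_identity() has covered the framebuffer range: every scanout access in that window returns false.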

That said, I do have a series that I've been carrying around for longer
than I've wanted that does exactly this for Tegra SMMU and I'd prefer if
you could drop this particular change from your series so that I can
keep working on resolving the identity mapping issues first.

Thierry



[PATCH v7 9/9] x86/mmu: Allocate/free PASID

2020-08-27 Thread Fenghua Yu
A PASID is allocated for an "mm" the first time any thread binds
to an SVM capable device and is freed from the "mm" when the SVM is
unbound by the last thread. It's possible for the "mm" to have different
PASID values in different binding/unbinding SVM cycles.
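The lifetime rule in the paragraph above can be modelled in a few lines of user-space C. This is only a sketch under stated assumptions: struct toy_mm, toy_bind(), toy_unbind() and the counting allocator are hypothetical stand-ins, not the kernel's mm_struct or the IOMMU allocator. It shows why the same "mm" may see a different PASID in a later bind/unbind cycle.

```c
#include <stdint.h>

#define PASID_DISABLED 0u

/* Toy stand-in for the per-process "mm"; not the kernel's struct mm_struct. */
struct toy_mm {
	uint32_t pasid;      /* PASID_DISABLED while no thread is bound */
	unsigned int binds;  /* number of active SVM bindings */
};

static uint32_t next_pasid = 1;  /* hypothetical allocator: just counts up */

/* First bind allocates a PASID for the mm; later binds reuse it. */
static uint32_t toy_bind(struct toy_mm *mm)
{
	if (mm->binds++ == 0)
		mm->pasid = next_pasid++;
	return mm->pasid;
}

/* Last unbind frees the PASID, so a later bind cycle may get a new value. */
static void toy_unbind(struct toy_mm *mm)
{
	if (mm->binds == 0)
		return;  /* unbalanced unbind: ignored in this sketch */
	if (--mm->binds == 0)
		mm->pasid = PASID_DISABLED;
}
```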

The mm's PASID (non-zero for valid PASID or 0 for invalid PASID) is
propagated to the per-thread PASID MSR for all threads within the mm
through IPI, context switch, or inheritance to ensure a running thread has
the right PASID MSR matching the mm's PASID.

Suggested-by: Andy Lutomirski 
Signed-off-by: Fenghua Yu 
Reviewed-by: Tony Luck 
---
v7:
- Don't fix up PASID in #GP. Instead, update the PASID MSR by IPI and
  context switch after PASID allocation and free. Inherit PASID from
  parent. (Andy)

Before v7:
- Allocate a PASID for the mm and free it until mm exit.

 arch/x86/include/asm/disabled-features.h |  2 +-
 arch/x86/include/asm/fpu/api.h   | 12 +
 arch/x86/include/asm/fpu/internal.h  |  2 +
 arch/x86/kernel/fpu/xstate.c | 56 
 drivers/iommu/intel/svm.c| 28 +++-
 5 files changed, 98 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/disabled-features.h 
b/arch/x86/include/asm/disabled-features.h
index 588d83e9da49..5861d34f9771 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -56,7 +56,7 @@
 # define DISABLE_PTI   (1 << (X86_FEATURE_PTI & 31))
 #endif
 
-#ifdef CONFIG_INTEL_IOMMU_SVM
+#ifdef CONFIG_IOMMU_SUPPORT
 # define DISABLE_ENQCMD 0
 #else
 # define DISABLE_ENQCMD (1 << (X86_FEATURE_ENQCMD & 31))
diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h
index b774c52e5411..dcd9503b1098 100644
--- a/arch/x86/include/asm/fpu/api.h
+++ b/arch/x86/include/asm/fpu/api.h
@@ -62,4 +62,16 @@ extern void switch_fpu_return(void);
  */
 extern int cpu_has_xfeatures(u64 xfeatures_mask, const char **feature_name);
 
+/*
+ * Tasks that are not using SVA have mm->pasid set to zero to note that they
+ * will not have the valid bit set in MSR_IA32_PASID while they are running.
+ */
+#define PASID_DISABLED 0
+
+#ifdef CONFIG_IOMMU_SUPPORT
+/* Update current's PASID MSR/state by mm's PASID. */
+void update_pasid(void);
+#else
+static inline void update_pasid(void) { }
+#endif
 #endif /* _ASM_X86_FPU_API_H */
diff --git a/arch/x86/include/asm/fpu/internal.h 
b/arch/x86/include/asm/fpu/internal.h
index 0a460f2a3f90..2d737e02b59a 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -583,6 +583,8 @@ static inline void switch_fpu_finish(struct fpu *new_fpu)
pkru_val = pk->pkru;
}
__write_pkru(pkru_val);
+
+   update_pasid();
 }
 
 /*
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 67f1a03b9b23..556040e14f1c 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1402,3 +1402,59 @@ int proc_pid_arch_status(struct seq_file *m, struct 
pid_namespace *ns,
return 0;
 }
 #endif /* CONFIG_PROC_PID_ARCH_STATUS */
+
+#ifdef CONFIG_IOMMU_SUPPORT
+void update_pasid(void)
+{
+   u64 pasid_state;
+   u32 pasid;
+
+   if (!cpu_feature_enabled(X86_FEATURE_ENQCMD))
+   return;
+
+   if (!current->mm)
+   return;
+
+   pasid = READ_ONCE(current->mm->pasid);
+   /* Set the valid bit in the PASID MSR/state only for valid pasid. */
+   pasid_state = pasid == PASID_DISABLED ?
+ pasid : pasid | MSR_IA32_PASID_VALID;
+
+   /*
+* No need to hold fregs_lock() since the task's fpstate won't
+* be changed by others (e.g. ptrace) while the task is being
+* switched to or is in IPI.
+*/
+   if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
+   /* The MSR is active and can be directly updated. */
+   wrmsrl(MSR_IA32_PASID, pasid_state);
+   } else {
+   struct fpu *fpu = &current->thread.fpu;
+   struct ia32_pasid_state *ppasid_state;
+   struct xregs_state *xsave;
+
+   /*
+* The CPU's xstate registers are not currently active. Just
+* update the PASID state in the memory buffer here. The
+* PASID MSR will be loaded when returning to user mode.
+*/
+   xsave = &fpu->state.xsave;
+   xsave->header.xfeatures |= XFEATURE_MASK_PASID;
+   ppasid_state = get_xsave_addr(xsave, XFEATURE_PASID);
+   if (ppasid_state) {
+   /*
+* Only update the task's PASID state when it's
+* different from the mm's pasid.
+*/
+   if (ppasid_state->pasid != pasid_state) {
+   /*
+* Invalid fpregs so that xrstors will pick up
+

[PATCH v7 2/9] iommu/vt-d: Change flags type to unsigned int in binding mm

2020-08-27 Thread Fenghua Yu
"flags" passed to intel_svm_bind_mm() is a bit mask and should be
defined as "unsigned int" instead of "int".

Change its type to "unsigned int".
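For readers wondering why the signedness of a bit mask matters, here is a small stand-alone sketch. The flag names are made up for illustration and are not the driver's real flag values, and a 32-bit unsigned int is assumed. With an unsigned type, every bit including bit 31 can be set, tested and shifted with fully defined behaviour, whereas "1 << 31" on a 32-bit signed int overflows.

```c
#include <stdbool.h>

/* Hypothetical flag names for illustration; not the driver's real flag values. */
#define DEMO_FLAG_LOW   (1u << 0)
#define DEMO_FLAG_TOP   (1u << 31)  /* 1 << 31 on a signed int would overflow */

/* Set one or more bits in the mask. */
static unsigned int demo_set(unsigned int flags, unsigned int bit)
{
	return flags | bit;
}

/* Test whether any of the given bits is set. */
static bool demo_test(unsigned int flags, unsigned int bit)
{
	return (flags & bit) != 0;
}

/* Shifting an unsigned mask right always fills with zeros. */
static unsigned int demo_top_bit(unsigned int flags)
{
	return flags >> 31;
}
```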

Suggested-by: Thomas Gleixner 
Signed-off-by: Fenghua Yu 
Reviewed-by: Tony Luck 
Reviewed-by: Lu Baolu 
---
v5:
- Reviewed by Lu Baolu

v2:
- Add this new patch per Thomas' comment.

 drivers/iommu/intel/svm.c   | 7 ---
 include/linux/intel-iommu.h | 2 +-
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index e78a74a9c1cf..fc90a079e228 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -446,7 +446,8 @@ int intel_svm_unbind_gpasid(struct device *dev, u32 pasid)
 
 /* Caller must hold pasid_mutex, mm reference */
 static int
-intel_svm_bind_mm(struct device *dev, int flags, struct svm_dev_ops *ops,
+intel_svm_bind_mm(struct device *dev, unsigned int flags,
+ struct svm_dev_ops *ops,
  struct mm_struct *mm, struct intel_svm_dev **sd)
 {
struct intel_iommu *iommu = device_to_iommu(dev, NULL, NULL);
@@ -1033,7 +1034,7 @@ intel_svm_bind(struct device *dev, struct mm_struct *mm, 
void *drvdata)
 {
struct iommu_sva *sva = ERR_PTR(-EINVAL);
struct intel_svm_dev *sdev = NULL;
-   int flags = 0;
+   unsigned int flags = 0;
int ret;
 
/*
@@ -1042,7 +1043,7 @@ intel_svm_bind(struct device *dev, struct mm_struct *mm, 
void *drvdata)
 * and intel_svm etc.
 */
if (drvdata)
-   flags = *(int *)drvdata;
+   flags = *(unsigned int *)drvdata;
    mutex_lock(&pasid_mutex);
    ret = intel_svm_bind_mm(dev, flags, NULL, mm, &sdev);
if (ret)
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 7322073f62d0..9c3e8337442a 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -765,7 +765,7 @@ struct intel_svm {
struct mm_struct *mm;
 
struct intel_iommu *iommu;
-   int flags;
+   unsigned int flags;
u32 pasid;
int gpasid; /* In case that guest PASID is different from host PASID */
struct list_head devs;
-- 
2.19.1



[PATCH v7 1/9] iommu: Change type of pasid to u32

2020-08-27 Thread Fenghua Yu
PASID is defined as a few different types in iommu including "int",
"u32", and "unsigned int". To be consistent and to match with uapi
definitions, define PASID and its variations (e.g. max PASID) as "u32".
"u32" is also shorter and a little more explicit than "unsigned int".

No PASID type change in uapi although it defines PASID as __u64 in
some places.
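One practical consequence of mixing "int" and "u32" for the same value shows up when it is widened to a 64-bit field such as the __u64 mentioned above. The sketch below illustrates that general C behaviour and is not a rationale stated by the patch; the helper names are hypothetical.

```c
#include <stdint.h>

/* Widening an unsigned 32-bit PASID zero-extends, so the value is preserved. */
static uint64_t widen_u32(uint32_t pasid)
{
	return pasid;
}

/* Widening a negative int sign-extends: the upper 32 bits all become 1. */
static uint64_t widen_int(int pasid)
{
	return (uint64_t)pasid;
}
```

A 20-bit PASID never sets the sign bit, so the difference only appears for error-style values such as -1, which is where an inconsistent type is most likely to cause confusion.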

Suggested-by: Thomas Gleixner 
Signed-off-by: Fenghua Yu 
Reviewed-by: Tony Luck 
Reviewed-by: Lu Baolu 
Acked-by: Felix Kuehling 
---
v7:
- Add "Acked-by: Felix Kuehling "

v6:
- Change return type to u32 for kfd_pasid_alloc() (Felix)

v5:
- Reviewed by Lu Baolu

v4:
- Change PASID type from "unsigned int" to "u32" (Christoph)

v2:
- Create this new patch to define PASID as "unsigned int" consistently in
  iommu (Thomas)

 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|  4 +--
 .../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c|  2 +-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c |  2 +-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c |  2 +-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c |  2 +-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.h |  2 +-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  4 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c   |  6 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_ids.h   |  4 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c|  8 ++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h|  8 ++---
 .../gpu/drm/amd/amdkfd/cik_event_interrupt.c  |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c   |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.h   |  2 +-
 .../drm/amd/amdkfd/kfd_device_queue_manager.c |  7 ++---
 drivers/gpu/drm/amd/amdkfd/kfd_events.c   |  8 ++---
 drivers/gpu/drm/amd/amdkfd/kfd_events.h   |  4 +--
 drivers/gpu/drm/amd/amdkfd/kfd_iommu.c|  6 ++--
 drivers/gpu/drm/amd/amdkfd/kfd_pasid.c|  4 +--
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 20 ++--
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  |  2 +-
 .../gpu/drm/amd/include/kgd_kfd_interface.h   |  2 +-
 drivers/iommu/amd/amd_iommu.h | 10 +++---
 drivers/iommu/amd/iommu.c | 31 ++-
 drivers/iommu/amd/iommu_v2.c  | 20 ++--
 drivers/iommu/intel/dmar.c|  7 +++--
 drivers/iommu/intel/iommu.c   |  4 +--
 drivers/iommu/intel/pasid.c   | 31 +--
 drivers/iommu/intel/pasid.h   | 24 +++---
 drivers/iommu/intel/svm.c | 12 +++
 drivers/iommu/iommu.c |  2 +-
 drivers/misc/uacce/uacce.c|  2 +-
 include/linux/amd-iommu.h |  8 ++---
 include/linux/intel-iommu.h   | 12 +++
 include/linux/intel-svm.h |  2 +-
 include/linux/iommu.h | 10 +++---
 include/linux/uacce.h |  2 +-
 38 files changed, 141 insertions(+), 141 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index ffe149aafc39..dfef5a7e0f5a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -207,11 +207,11 @@ uint8_t amdgpu_amdkfd_get_xgmi_hops_count(struct kgd_dev 
*dst, struct kgd_dev *s
})
 
 /* GPUVM API */
-int amdgpu_amdkfd_gpuvm_create_process_vm(struct kgd_dev *kgd, unsigned int 
pasid,
+int amdgpu_amdkfd_gpuvm_create_process_vm(struct kgd_dev *kgd, u32 pasid,
void **vm, void **process_info,
struct dma_fence **ef);
 int amdgpu_amdkfd_gpuvm_acquire_process_vm(struct kgd_dev *kgd,
-   struct file *filp, unsigned int pasid,
+   struct file *filp, u32 pasid,
void **vm, void **process_info,
struct dma_fence **ef);
 void amdgpu_amdkfd_gpuvm_destroy_cb(struct amdgpu_device *adev,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
index bf927f432506..ee531c3988d1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c
@@ -105,7 +105,7 @@ static void kgd_program_sh_mem_settings(struct kgd_dev 
*kgd, uint32_t vmid,
unlock_srbm(kgd);
 }
 
-static int kgd_set_pasid_vmid_mapping(struct kgd_dev *kgd, unsigned int pasid,
+static int kgd_set_pasid_vmid_mapping(struct kgd_dev *kgd, u32 pasid,
unsigned int vmid)
 {
struct amdgpu_device *adev = get_amdgpu_device(kgd);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
index 744366c7ee85..4d41317b9292 100644
--- 

[PATCH v7 0/9] x86: tag application address space for devices

2020-08-27 Thread Fenghua Yu
Typical hardware devices require a driver stack to translate application
buffers to hardware addresses, and a kernel-user transition to notify the
hardware of new work. What if both the translation and transition overhead
could be eliminated? This is what Shared Virtual Address (SVA) and ENQCMD
enabled hardware like Data Streaming Accelerator (DSA) aims to achieve.
Applications map portals in their local-address-space and directly submit
work to them using a new instruction.

This series enables ENQCMD and associated management of the new MSR
(MSR_IA32_PASID). This new MSR allows an application address space to be
associated with what the PCIe spec calls a Process Address Space ID (PASID).
This PASID tag is carried along with all requests between applications and
devices and allows devices to interact with the process address space.

SVA and ENQCMD enabled device drivers need this series. The phase 2 DSA
patches with SVA and ENQCMD support were released on top of this series:
https://lore.kernel.org/patchwork/cover/1244060/

This series only provides simple and basic support for ENQCMD and the MSR:
1. Clean up type definitions (patch 1-2). These patches can be in a
   separate series.
   - Define "pasid" as "u32" consistently
   - Define "flags" as "unsigned int"
2. Explain various technical terms used in the series (patch 3).
3. Enumerate support for ENQCMD in the processor (patch 4).
4. Handle FPU PASID state and the MSR during context switch (patches 5-6).
5. Define "pasid" in mm_struct (patch 7).
6. Clear PASID state for new mm and forked and cloned thread (patch 8-9).
7. Allocate and free PASID for a process (patch 10).
8. Fix up the PASID MSR in #GP handler when one thread in a process
   executes ENQCMD for the first time (patches 11-12).

This patch series and the DSA phase 2 series are in
https://github.com/intel/idxd-driver/tree/idxd-stage2

References:
1. Detailed information on the ENQCMD/ENQCMDS instructions and the
IA32_PASID MSR can be found in Intel Architecture Instruction Set
Extensions and Future Features Programming Reference:
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf

2. Detailed information on DSA can be found in DSA specification:
https://software.intel.com/en-us/download/intel-data-streaming-accelerator-preliminary-architecture-specification

Change log:
v7:
- Don't fix up PASID in #GP. Instead, update the PASID MSR by IPI and
  context switch after PASID allocation and free. Inherit PASID from
  parent. (Andy)

v6:
- Change return type to u32 for kfd_pasid_alloc() in patch 1 (Felix)

v5:
- Mark ENQCMD disabled when configured out and use cpu_feature_enabled()
  to simplify the feature checking code in patch 10 and 12 (PeterZ and
  Dave Hansen)
- Add Reviewed-by: Lu Baolu to patch 1, 2, 10, and 12.

v4:
- Define PASID as "u32" instead of "unsigned int" in patch 1, 7, 10, 12.
  (Christoph)
- Drop v3 patch 2 which changes PASID type in ocxl because it's not related
  to x86 and was rejected by ocxl maintainer Frederic Barrat
- A split patch which changes PASID type to u32 in crypto/hisilicon/qm.c
  was released separately to linux-crypto mailing list because it's not
  related to x86 and is a standalone patch:

v3:
- Change names of bind_mm() and unbind_mm() to match to new APIs in
  patch 4 (Baolu)
- Change CONFIG_PCI_PASID to CONFIG_IOMMU_SUPPORT because non-PCI device
  can have PASID in ARM in patch 8 (Jean)
- Add a few sanity checks in __free_pasid() and alloc_pasid() in
  patch 11 (Baolu)
- Add patch 12 to define a new flag "has_valid_pasid" for a task and
  use the flag to identify if the task has a valid PASID MSR (PeterZ)
- Add fpu__pasid_write() to update the MSR in fixup() in patch 13
- Check if mm->pasid can be found in fixup() in patch 13

v2:
- Add patches 1-3 to define "pasid" and "flags" as "unsigned int"
  consistently (Thomas)
  (these 3 patches could be in a separate patch set)
- Add patch 8 to move "pasid" to generic mm_struct (Christoph).
  Jean-Philippe Brucker released a virtually same patch. Upstream only
  needs one of the two.
- Add patch 9 to initialize PASID in a new mm.
- Plus other changes described in each patch (Thomas)

Ashok Raj (1):
  docs: x86: Add documentation for SVA (Shared Virtual Addressing)

Fenghua Yu (7):
  iommu: Change type of pasid to u32
  iommu/vt-d: Change flags type to unsigned int in binding mm
  x86/cpufeatures: Enumerate ENQCMD and ENQCMDS instructions
  x86/msr-index: Define IA32_PASID MSR
  mm: Define pasid in mm
  x86/cpufeatures: Mark ENQCMD as disabled when configured out
  x86/mmu: Allocate/free PASID

Yu-cheng Yu (1):
  x86/fpu/xstate: Add supervisor PASID state for ENQCMD feature

 Documentation/x86/index.rst   |   1 +
 Documentation/x86/sva.rst | 254 ++
 arch/x86/include/asm/cpufeatures.h|   1 +
 arch/x86/include/asm/disabled-features.h  |   9 +-
 

[PATCH v7 3/9] docs: x86: Add documentation for SVA (Shared Virtual Addressing)

2020-08-27 Thread Fenghua Yu
From: Ashok Raj 

ENQCMD and Data Streaming Accelerator (DSA) and all of their associated
features are a complicated stack with lots of interconnected pieces.
This documentation provides a big picture overview for all of the
features.

Signed-off-by: Ashok Raj 
Co-developed-by: Fenghua Yu 
Signed-off-by: Fenghua Yu 
Reviewed-by: Tony Luck 
---
v7:
- Change the doc for updating PASID by IPI and context switch (Andy).

v3:
- Replace deprecated intel_svm_bind_mm() by iommu_sva_bind_mm() (Baolu)
- Fix a couple of typos (Baolu)

v2:
- Fix the doc format and add the doc in toctree (Thomas)
- Modify the doc for better description (Thomas, Tony, Dave)

 Documentation/x86/index.rst |   1 +
 Documentation/x86/sva.rst   | 254 
 2 files changed, 255 insertions(+)
 create mode 100644 Documentation/x86/sva.rst

diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index 265d9e9a093b..e5d5ff096685 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -30,3 +30,4 @@ x86-specific Documentation
usb-legacy-support
i386/index
x86_64/index
+   sva
diff --git a/Documentation/x86/sva.rst b/Documentation/x86/sva.rst
new file mode 100644
index ..6e7ac565e127
--- /dev/null
+++ b/Documentation/x86/sva.rst
@@ -0,0 +1,254 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===========================================
+Shared Virtual Addressing (SVA) with ENQCMD
+===========================================
+
+Background
+==========
+
+Shared Virtual Addressing (SVA) allows the processor and device to use the
+same virtual addresses, avoiding the need for software to translate virtual
+addresses to physical addresses. SVA is what PCIe calls Shared Virtual
+Memory (SVM).
+
+In addition to the convenience of using application virtual addresses
+by the device, SVA also doesn't require pinning pages for DMA.
+PCIe Address Translation Services (ATS) along with Page Request Interface
+(PRI) allow devices to function much the same way as the CPU handling
+application page-faults. For more information please refer to PCIe
+specification Chapter 10: ATS Specification.
+
+Use of SVA requires IOMMU support in the platform. IOMMU is also required
+to support PCIe features ATS and PRI. ATS allows devices to cache
+translations for the virtual address. The IOMMU driver uses the
+mmu_notifier() support to keep the device TLB cache and the CPU cache in
+sync. PRI allows the device to request that virtual addresses be paged in
+before use if they are not present in the CPU page tables.
+
+
+Shared Hardware Workqueues
+==========================
+
+Unlike Single Root I/O Virtualization (SRIOV), Scalable IOV (SIOV) permits
+the use of Shared Work Queues (SWQ) by both applications and Virtual
+Machines (VMs). This allows better hardware utilization vs. hard
+partitioning resources that could result in underutilization. In order to
+allow the hardware to distinguish the context for which work is being
+executed in the hardware by SWQ interface, SIOV uses Process Address Space
+ID (PASID), which is a 20-bit number defined by the PCIe SIG.
+
+PASID value is encoded in all transactions from the device. This allows the
+IOMMU to track I/O on a per-PASID granularity in addition to using the PCIe
+Resource Identifier (RID) which is the Bus/Device/Function.
+
+
+ENQCMD
+======
+
+ENQCMD is a new instruction on Intel platforms that atomically submits a
+work descriptor to a device. The descriptor includes the operation to be
+performed, virtual addresses of all parameters, virtual address of a completion
+record, and the PASID (process address space ID) of the current process.
+
+ENQCMD works with non-posted semantics and carries a status back if the
+command was accepted by hardware. This allows the submitter to know whether
+the submission needs to be retried, or whether other device-specific
+mechanisms are needed to implement fairness or ensure forward progress.
+
+ENQCMD is the glue that ensures applications can directly submit commands
+to the hardware and also permit hardware to be aware of application context
+to perform I/O operations via use of PASID.
+
+Process Address Space Tagging
+=============================
+
+A new thread scoped MSR (IA32_PASID) provides the connection between
+user processes and the rest of the hardware. When an application first
+accesses an SVA capable device this MSR is initialized with a newly
+allocated PASID. The driver for the device calls an IOMMU specific API
+that sets up the routing for DMA and page-requests.
+
+For example, the Intel Data Streaming Accelerator (DSA) uses
+iommu_sva_bind_device(), which will do the following.
+
+- Allocate the PASID, and program the process page-table (cr3) in the PASID
+  context entries.
+- Register for mmu_notifier() to track any page-table invalidations to keep
+  the device tlb in sync. For example, when a page-table entry is invalidated,
+  IOMMU propagates the invalidation to device tlb. This will force any
+  

[PATCH v7 4/9] x86/cpufeatures: Enumerate ENQCMD and ENQCMDS instructions

2020-08-27 Thread Fenghua Yu
Work submission instruction comes in two flavors. ENQCMD can be called
both in ring 3 and ring 0 and always uses the contents of PASID MSR when
shipping the command to the device. ENQCMDS allows a kernel driver to
submit commands on behalf of a user process. The driver supplies the
PASID value in ENQCMDS. There isn't any usage of ENQCMD in the kernel
as of now.

The CPU feature flag is shown as "enqcmd" in /proc/cpuinfo.
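As a rough illustration of how user space could look for that flag, here is a small helper that checks one line of text for a whole-word match. It is a sketch only: the function name cpu_flags_has() is hypothetical, and the assumption that flags are separated by spaces on a "flags" line reflects the usual /proc/cpuinfo layout rather than anything stated in this patch. Matching whole words matters so that "enqcmd" is not confused with a longer token that merely contains it.

```c
#include <stdbool.h>
#include <string.h>

/* Return true if `flag` appears as a whole space-separated word in `line`. */
static bool cpu_flags_has(const char *line, const char *flag)
{
	size_t len = strlen(flag);
	const char *p = line;

	if (len == 0)
		return false;

	while ((p = strstr(p, flag)) != NULL) {
		/* A match only counts if it is delimited on both sides. */
		bool starts = (p == line) || p[-1] == ' ' || p[-1] == '\t';
		bool ends = p[len] == '\0' || p[len] == ' ' || p[len] == '\n';

		if (starts && ends)
			return true;
		p += len;
	}
	return false;
}
```

A caller would typically read /proc/cpuinfo line by line with fgets() and pass each line that begins with "flags" to this helper.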

Signed-off-by: Fenghua Yu 
Reviewed-by: Tony Luck 
---
v2:
- Re-write commit message (Thomas)

 arch/x86/include/asm/cpufeatures.h | 1 +
 arch/x86/kernel/cpu/cpuid-deps.c   | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h 
b/arch/x86/include/asm/cpufeatures.h
index 2901d5df4366..fea10d04d05f 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -353,6 +353,7 @@
 #define X86_FEATURE_CLDEMOTE   (16*32+25) /* CLDEMOTE instruction */
 #define X86_FEATURE_MOVDIRI (16*32+27) /* MOVDIRI instruction */
 #define X86_FEATURE_MOVDIR64B  (16*32+28) /* MOVDIR64B instruction */
+#define X86_FEATURE_ENQCMD (16*32+29) /* ENQCMD and ENQCMDS instructions */
 
 /* AMD-defined CPU features, CPUID level 0x80000007 (EBX), word 17 */
 #define X86_FEATURE_OVERFLOW_RECOV (17*32+ 0) /* MCA overflow recovery support */
diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
index 3cbe24ca80ab..3a02707c1f4d 100644
--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -69,6 +69,7 @@ static const struct cpuid_dep cpuid_deps[] = {
    { X86_FEATURE_CQM_MBM_TOTAL,    X86_FEATURE_CQM_LLC   },
    { X86_FEATURE_CQM_MBM_LOCAL,    X86_FEATURE_CQM_LLC   },
    { X86_FEATURE_AVX512_BF16,      X86_FEATURE_AVX512VL  },
+   { X86_FEATURE_ENQCMD,           X86_FEATURE_XSAVES    },
    {}
 };
 
-- 
2.19.1
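
The commit message above says the feature flag appears as "enqcmd" in
/proc/cpuinfo. A minimal user-space sketch of matching that flag as a whole
token follows. The helper name cpuinfo_has_flag() and the idea of passing in
an already-read "flags" line are assumptions made for illustration; nothing
here is part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: return true if `flag` appears as a complete,
 * space-separated token in a /proc/cpuinfo style "flags" line.
 * A plain strstr() is not enough, because a longer flag name could
 * contain the shorter one as a substring.
 */
static bool cpuinfo_has_flag(const char *flags_line, const char *flag)
{
    assert(flags_line != NULL && flag != NULL);

    size_t len = strlen(flag);
    const char *p = flags_line;

    if (len == 0)
        return false;

    while ((p = strstr(p, flag)) != NULL) {
        /* The match must start at the line start or right after a separator. */
        bool starts_token = (p == flags_line) || p[-1] == ' ' || p[-1] == '\t';
        /* The match must be followed by a separator or the end of the line. */
        bool ends_token = p[len] == '\0' || p[len] == ' ' || p[len] == '\n';

        if (starts_token && ends_token)
            return true;
        p++;
    }
    return false;
}
```

A caller would read the "flags" line of /proc/cpuinfo with fgets() and pass
it to the helper together with "enqcmd".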

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 6/9] x86/msr-index: Define IA32_PASID MSR

2020-08-27 Thread Fenghua Yu
The IA32_PASID MSR (0xd93) contains the Process Address Space Identifier
(PASID), a 20-bit value. Bit 31 must be set to indicate the value
programmed in the MSR is valid. Hardware uses PASID to identify process
address space and direct responses to the right address space.

Signed-off-by: Fenghua Yu 
Reviewed-by: Tony Luck 
---
v2:
- Change "identify process" to "identify process address space" in the
  commit message (Thomas)

 arch/x86/include/asm/msr-index.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 2859ee4f39a8..aaddc6a9e237 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -257,6 +257,9 @@
 #define MSR_IA32_LASTINTFROMIP 0x01dd
 #define MSR_IA32_LASTINTTOIP   0x01de
 
+#define MSR_IA32_PASID 0x0d93
+#define MSR_IA32_PASID_VALID   BIT_ULL(31)
+
 /* DEBUGCTLMSR bits (others vary by model): */
 #define DEBUGCTLMSR_LBR            (1UL <<  0) /* last branch recording */
 #define DEBUGCTLMSR_BTF_SHIFT  1
-- 
2.19.1



[PATCH v7 7/9] mm: Define pasid in mm

2020-08-27 Thread Fenghua Yu
PASID is shared by all threads in a process. So the logical place to keep
track of it is in the "mm". Both ARM and X86 need to use the PASID in the
"mm".

Suggested-by: Christoph Hellwig 
Signed-off-by: Fenghua Yu 
Reviewed-by: Tony Luck 
---
v4:
- Change PASID type to u32 (Christoph)

v3:
- Change CONFIG_PCI_PASID to CONFIG_IOMMU_SUPPORT because non-PCI device
  can have PASID in ARM (Jean)

v2:
- This new patch moves "pasid" from x86 specific mm_context_t to generic
  struct mm_struct per Christopher's comment: 
https://lore.kernel.org/linux-iommu/20200414170252.714402-1-jean-phili...@linaro.org/T/#mb57110ffe1aaa24750eeea4f93b611f0d1913911
- Jean-Philippe Brucker released a virtually same patch. I still put this
  patch in the series for better review. The upstream kernel only needs one
  of the two patches eventually.
https://lore.kernel.org/linux-iommu/20200519175502.2504091-2-jean-phili...@linaro.org/
- Change CONFIG_IOASID to CONFIG_PCI_PASID (Ashok)

 include/linux/mm_types.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 496c3ff97cce..1ff0615ef19f 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -542,6 +542,10 @@ struct mm_struct {
atomic_long_t hugetlb_usage;
 #endif
struct work_struct async_put_work;
+
+#ifdef CONFIG_IOMMU_SUPPORT
+   u32 pasid;
+#endif
} __randomize_layout;
 
/*
-- 
2.19.1



[PATCH v7 5/9] x86/fpu/xstate: Add supervisor PASID state for ENQCMD feature

2020-08-27 Thread Fenghua Yu
From: Yu-cheng Yu 

ENQCMD instruction reads PASID from IA32_PASID MSR. The MSR is stored
in the task's supervisor FPU PASID state and is context switched by
XSAVES/XRSTORS.

Signed-off-by: Yu-cheng Yu 
Co-developed-by: Fenghua Yu 
Signed-off-by: Fenghua Yu 
Reviewed-by: Tony Luck 
---
v2:
- Modify the commit message (Thomas)

 arch/x86/include/asm/fpu/types.h  | 11 ++-
 arch/x86/include/asm/fpu/xstate.h |  2 +-
 arch/x86/kernel/fpu/xstate.c  |  6 +-
 3 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index c87364ea6446..f5a38a5f3ae1 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -114,7 +114,7 @@ enum xfeature {
XFEATURE_Hi16_ZMM,
XFEATURE_PT_UNIMPLEMENTED_SO_FAR,
XFEATURE_PKRU,
-   XFEATURE_RSRVD_COMP_10,
+   XFEATURE_PASID,
XFEATURE_RSRVD_COMP_11,
XFEATURE_RSRVD_COMP_12,
XFEATURE_RSRVD_COMP_13,
@@ -134,6 +134,7 @@ enum xfeature {
 #define XFEATURE_MASK_Hi16_ZMM     (1 << XFEATURE_Hi16_ZMM)
 #define XFEATURE_MASK_PT           (1 << XFEATURE_PT_UNIMPLEMENTED_SO_FAR)
 #define XFEATURE_MASK_PKRU         (1 << XFEATURE_PKRU)
+#define XFEATURE_MASK_PASID        (1 << XFEATURE_PASID)
 #define XFEATURE_MASK_LBR          (1 << XFEATURE_LBR)
 
 #define XFEATURE_MASK_FPSSE        (XFEATURE_MASK_FP | XFEATURE_MASK_SSE)
@@ -256,6 +257,14 @@ struct arch_lbr_state {
    struct lbr_entry        entries[];
 } __packed;
 
+/*
+ * State component 10 is supervisor state used for context-switching the
+ * PASID state.
+ */
+struct ia32_pasid_state {
+   u64 pasid;
+} __packed;
+
 struct xstate_header {
u64 xfeatures;
u64 xcomp_bv;
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 14ab815132d4..47a92232d595 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -35,7 +35,7 @@
  XFEATURE_MASK_BNDCSR)
 
 /* All currently supported supervisor features */
-#define XFEATURE_MASK_SUPERVISOR_SUPPORTED (0)
+#define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID)
 
 /*
  * A supervisor state component may not always contain valuable information,
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 038e19c0019e..67f1a03b9b23 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -37,6 +37,7 @@ static const char *xfeature_names[] =
"AVX-512 ZMM_Hi256" ,
"Processor Trace (unused)"  ,
"Protection Keys User registers",
+   "PASID state",
"unknown xstate feature",
 };
 
@@ -51,6 +52,7 @@ static short xsave_cpuid_features[] __initdata = {
X86_FEATURE_AVX512F,
X86_FEATURE_INTEL_PT,
X86_FEATURE_PKU,
+   X86_FEATURE_ENQCMD,
 };
 
 /*
@@ -318,6 +320,7 @@ static void __init print_xstate_features(void)
print_xstate_feature(XFEATURE_MASK_ZMM_Hi256);
print_xstate_feature(XFEATURE_MASK_Hi16_ZMM);
print_xstate_feature(XFEATURE_MASK_PKRU);
+   print_xstate_feature(XFEATURE_MASK_PASID);
 }
 
 /*
@@ -592,6 +595,7 @@ static void check_xstate_against_struct(int nr)
XCHECK_SZ(sz, nr, XFEATURE_ZMM_Hi256, struct avx_512_zmm_uppers_state);
XCHECK_SZ(sz, nr, XFEATURE_Hi16_ZMM,  struct avx_512_hi16_state);
XCHECK_SZ(sz, nr, XFEATURE_PKRU,  struct pkru_state);
+   XCHECK_SZ(sz, nr, XFEATURE_PASID, struct ia32_pasid_state);
 
/*
 * Make *SURE* to add any feature numbers in below if
@@ -601,7 +605,7 @@ static void check_xstate_against_struct(int nr)
if ((nr < XFEATURE_YMM) ||
(nr >= XFEATURE_MAX) ||
(nr == XFEATURE_PT_UNIMPLEMENTED_SO_FAR) ||
-   ((nr >= XFEATURE_RSRVD_COMP_10) && (nr <= XFEATURE_LBR))) {
+   ((nr >= XFEATURE_RSRVD_COMP_11) && (nr <= XFEATURE_LBR))) {
WARN_ONCE(1, "no structure for xstate: %d\n", nr);
XSTATE_WARN_ON(1);
}
-- 
2.19.1
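
The relationship between an xstate component number and its feature mask,
as used by the patch above, can be sketched in plain C. The enum and helper
names below are stand-ins invented for this sketch; only the component
number 10 for PASID (from the patch comment) and the fact that PASID is now
the only supported supervisor feature come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in component numbers: PKRU directly precedes PASID (component 10). */
enum xfeature_sketch {
    XF_SKETCH_PKRU  = 9,
    XF_SKETCH_PASID = 10,
};

/* Each component owns one bit in the 64-bit xfeatures bitmap. */
static uint64_t xfeature_mask(unsigned int nr)
{
    assert(nr < 64);
    return UINT64_C(1) << nr;
}

/* After the patch, the supported supervisor set is just the PASID bit. */
static bool is_supported_supervisor_feature(unsigned int nr)
{
    const uint64_t supervisor_supported = xfeature_mask(XF_SKETCH_PASID);

    return (xfeature_mask(nr) & supervisor_supported) != 0;
}
```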



[PATCH v7 8/9] x86/cpufeatures: Mark ENQCMD as disabled when configured out

2020-08-27 Thread Fenghua Yu
Currently the ENQCMD feature cannot be used if CONFIG_INTEL_IOMMU_SVM
is not set. Add X86_FEATURE_ENQCMD to the disabled features mask.

Signed-off-by: Fenghua Yu 
Reviewed-by: Tony Luck 
---
v7:
- Split this patch from a previous patch.

 arch/x86/include/asm/disabled-features.h | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h
index 4ea8584682f9..588d83e9da49 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -56,6 +56,12 @@
 # define DISABLE_PTI   (1 << (X86_FEATURE_PTI & 31))
 #endif
 
+#ifdef CONFIG_INTEL_IOMMU_SVM
+# define DISABLE_ENQCMD 0
+#else
+# define DISABLE_ENQCMD (1 << (X86_FEATURE_ENQCMD & 31))
+#endif
+
 /*
  * Make sure to add features to the correct mask
  */
@@ -75,7 +81,8 @@
 #define DISABLED_MASK13    0
 #define DISABLED_MASK14    0
 #define DISABLED_MASK15    0
-#define DISABLED_MASK16    (DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP)
+#define DISABLED_MASK16    (DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP| \
+                            DISABLE_ENQCMD)
 #define DISABLED_MASK17    0
 #define DISABLED_MASK18    0
 #define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19)
-- 
2.19.1
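
To see why DISABLE_ENQCMD lands in DISABLED_MASK16, the feature number from
patch 4/9 can be split into a word index and a bit within that word. This is
a stand-alone sketch; the helper names are made up for illustration and the
svm_configured parameter merely stands in for CONFIG_INTEL_IOMMU_SVM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FEATURE_ENQCMD  (16 * 32 + 29)  /* value of X86_FEATURE_ENQCMD in patch 4/9 */
#define NCAPINTS_SKETCH 19              /* word count checked by DISABLED_MASK_CHECK */

/* Which 32-bit capability word holds the feature. */
static unsigned int feature_word(unsigned int feature)
{
    assert(feature < NCAPINTS_SKETCH * 32);
    return feature / 32;
}

/* The feature's bit inside its word, same expression as in the patch. */
static uint32_t feature_bit_mask(unsigned int feature)
{
    return UINT32_C(1) << (feature & 31);
}

/* Mask contribution: zero when the config option is set, the bit otherwise. */
static uint32_t disable_enqcmd_mask(bool svm_configured)
{
    return svm_configured ? 0 : feature_bit_mask(FEATURE_ENQCMD);
}
```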



Re: [patch V2 01/46] iommu/amd: Prevent NULL pointer dereference

2020-08-27 Thread Joerg Roedel
On Wed, Aug 26, 2020 at 01:16:29PM +0200, Thomas Gleixner wrote:
> From: Thomas Gleixner 
> 
> Dereferencing irq_data before checking it for NULL is suboptimal.
> 
> Signed-off-by: Thomas Gleixner 

Acked-by: Joerg Roedel 
Reviewed-by: Joerg Roedel 



Re: [PATCH v11 00/11] PCI: brcmstb: enable PCIe for STB chips

2020-08-27 Thread Jim Quinlan via iommu
On Thu, Aug 27, 2020 at 2:35 AM Christoph Hellwig  wrote:
>
> On Tue, Aug 25, 2020 at 10:40:27AM -0700, Florian Fainelli wrote:
> > Hi,
> >
> > On 8/24/2020 12:30 PM, Jim Quinlan wrote:
> >>
> >> Patchset Summary:
> >>Enhance a PCIe host controller driver.  Because of its unusual design
> >>we are forced to change dev->dma_pfn_offset into a more general role
> >>allowing multiple offsets.  See the 'v1' notes below for more info.
> >
> > We are version 11 and counting, and it is not clear to me whether there is
> > any chance of getting these patches reviewed and hopefully merged for the
> > 5.10 merge window.
> >
> > There are a lot of different files being touched, so what would be the
> > ideal way of routing those changes towards inclusion?
>
> FYI, I offered to take the dma-mapping bits through the dma-mapping tree.
> I have a bit of a backlog, but plan to review and if Jim is ok with that
> apply the current version.
Sounds good to me.
Thanks, Jim


Re: Aw: [PATCH 00/18] Convert arch/arm to use iommu-dma

2020-08-27 Thread Matthias Brugger




On 27/08/2020 14:31, Frank Wunderlich wrote:
> Tested full series on bananapi r2 (mt7623/mt2701, 5.9-rc1 + hdmi-patches),
> works so far fbcon+x without issues
>
> Tested-by: Frank Wunderlich

Thanks for testing.

Robin this is especially relevant for:
[PATCH 09/18] iommu/mediatek-v1: Add IOMMU_DOMAIN_DMA support

Regards,
Matthias


Aw: [PATCH 00/18] Convert arch/arm to use iommu-dma

2020-08-27 Thread Frank Wunderlich
Tested full series on bananapi r2 (mt7623/mt2701, 5.9-rc1 + hdmi-patches), 
works so far fbcon+x without issues

Tested-by: Frank Wunderlich 

regards Frank


Re: [PATCH 8/9] dt-bindings: net: renesas,ravb: Add support for r8a774e1 SoC

2020-08-27 Thread Lad, Prabhakar
Hi David,

On Mon, Jul 13, 2020 at 10:36 PM Lad Prabhakar
 wrote:
>
> From: Marian-Cristian Rotariu 
>
> Document RZ/G2H (R8A774E1) SoC bindings.
>
> Signed-off-by: Marian-Cristian Rotariu 
> 
> Signed-off-by: Lad Prabhakar 
> ---
>  Documentation/devicetree/bindings/net/renesas,ravb.txt | 1 +
>  1 file changed, 1 insertion(+)
>
Gentle ping, this patch is not queued up yet at [1].

[1] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/log/

Cheers,
Prabhakar


[PATCH v3 0/2] Add support for ACPI device in RMRR

2020-08-27 Thread FelixCuioc
The BIOS allocates reserved memory ranges that may be DMA targets.
It may report each such reserved memory region through the
RMRR structures, along with the devices that require access to
the specified reserved memory region.

The purpose of this series is to let ACPI devices listed in RMRR
access their reserved memory. To do so, it adds parsing of ACPI
devices in RMRR and establishes a mapping for each such device.

The first patch adds interfaces for detecting ACPI devices
in RMRR and, to distinguish them from PCI devices, modifies
some interface functions.

The second patch adds support for probing ACPI devices in RMRR.
In probe_acpi_namespace_devices(), it adds support for direct mapping
of ACPI devices and handles the case where the physical node of an
ACPI device is NULL.

v2->v3:
   - Add the blank line between functions.
   - Make dmar_acpi_insert_dev_scope() bool,change the 1/0 to true/false
 and add a comment explaining.
   - Delete unused initialization.
   - if dmar_acpi_insert_dev_scope() always returns zero,will not
 call dmar_rmrr_add_acpi_dev().
   - Use a proper error code.
   - Use if(!pdev).
   - Use goto unlock instead of mutex_unlock().


FelixCuioc (2):
  iommu/vt-d:Add support for detecting ACPI device in RMRR
  iommu/vt-d:Add support for probing ACPI device in RMRR

 drivers/iommu/intel/dmar.c  | 76 +
 drivers/iommu/intel/iommu.c | 52 -
 drivers/iommu/iommu.c   |  6 +++
 include/linux/dmar.h| 12 +-
 include/linux/iommu.h   |  3 ++
 5 files changed, 114 insertions(+), 35 deletions(-)

-- 
2.17.1
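
The detection step described in this cover letter amounts to walking a
list of variable-length device scope entries and matching on entry type and
enumeration ID. The sketch below shows that walk in user-space C. The struct
layout is simplified and hypothetical (the real ACPI device scope structure
has additional fields), the function name is invented, and the length
sanity checks are an addition of this sketch, not something the series does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SCOPE_TYPE_NAMESPACE 5  /* assumed value of the ACPI namespace scope type */

/* Simplified, hypothetical entry header; path bytes may follow it. */
struct scope_hdr {
    uint8_t entry_type;
    uint8_t length;          /* total entry length in bytes, header included */
    uint8_t enumeration_id;
    uint8_t bus;
};

/* Return the first namespace entry whose enumeration ID matches, or NULL. */
static const struct scope_hdr *find_namespace_scope(const uint8_t *start,
                                                    const uint8_t *end,
                                                    uint8_t device_number)
{
    assert(start != NULL && end != NULL && start <= end);

    while ((size_t)(end - start) >= sizeof(struct scope_hdr)) {
        const struct scope_hdr *scope = (const struct scope_hdr *)start;

        /* Reject malformed entries so the walk always makes progress. */
        if (scope->length < sizeof(*scope) ||
            scope->length > (size_t)(end - start))
            return NULL;

        if (scope->entry_type == SCOPE_TYPE_NAMESPACE &&
            scope->enumeration_id == device_number)
            return scope;

        start += scope->length;
    }
    return NULL;
}
```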



[PATCH v3 2/2] iommu/vt-d:Add support for probing ACPI device in RMRR

2020-08-27 Thread FelixCuioc
After an ACPI device in RMRR is detected, a mapping must be
established for it.
acpi_device_create_direct_mappings() creates the mapping
for the ACPI device in RMRR.
Add a helper so that the ACPI namespace device can
access the RMRR region.

Signed-off-by: FelixCuioc 
---
 drivers/iommu/intel/iommu.c | 29 +
 drivers/iommu/iommu.c   |  6 ++
 include/linux/iommu.h   |  3 +++
 3 files changed, 38 insertions(+)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 208a91605288..51d7a5b18f41 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4799,6 +4799,21 @@ static int __init platform_optin_force_iommu(void)
return 1;
 }
 
+static int acpi_device_create_direct_mappings(struct device *pn_dev, struct device *acpi_device)
+{
+   struct iommu_group *group;
+
+   acpi_device->bus->iommu_ops = &intel_iommu_ops;
+   group = iommu_group_get(pn_dev);
+   if (!group) {
+   pr_warn("ACPI name space devices create direct mappings wrong!\n");
+   return -EINVAL;
+   }
+   __acpi_device_create_direct_mappings(group, acpi_device);
+
+   return 0;
+}
+
 static int __init probe_acpi_namespace_devices(void)
 {
struct dmar_drhd_unit *drhd;
@@ -4813,6 +4828,7 @@ static int __init probe_acpi_namespace_devices(void)
struct acpi_device_physical_node *pn;
struct iommu_group *group;
struct acpi_device *adev;
+   struct device *pn_dev = NULL;
 
if (dev->bus != &acpi_bus_type)
continue;
@@ -4823,6 +4839,7 @@ static int __init probe_acpi_namespace_devices(void)
&adev->physical_node_list, node) {
group = iommu_group_get(pn->dev);
if (group) {
+   pn_dev = pn->dev;
iommu_group_put(group);
continue;
}
@@ -4831,7 +4848,19 @@ static int __init probe_acpi_namespace_devices(void)
ret = iommu_probe_device(pn->dev);
if (ret)
break;
+   pn_dev = pn->dev;
+   }
+   if (!pn_dev) {
+   dev->bus->iommu_ops = &intel_iommu_ops;
+   ret = iommu_probe_device(dev);
+   if (ret) {
+   pr_err("acpi_device probe fail! ret:%d\n", ret);
+   goto unlock;
+   }
+   goto unlock;
}
+   ret = acpi_device_create_direct_mappings(pn_dev, dev);
+unlock:
mutex_unlock(&adev->physical_node_lock);
 
if (ret)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 609bd25bf154..4f714a2d5ef7 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -779,6 +779,12 @@ static bool iommu_is_attach_deferred(struct iommu_domain *domain,
return false;
 }
 
+void __acpi_device_create_direct_mappings(struct iommu_group *group, struct device *acpi_device)
+{
+   iommu_create_device_direct_mappings(group, acpi_device);
+}
+EXPORT_SYMBOL_GPL(__acpi_device_create_direct_mappings);
+
 /**
  * iommu_group_add_device - add a device to an iommu group
  * @group: the group into which to add the device (reference should be held)
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index fee209efb756..9be134775886 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -514,6 +514,9 @@ extern void iommu_domain_window_disable(struct iommu_domain *domain, u32 wnd_nr)
 extern int report_iommu_fault(struct iommu_domain *domain, struct device *dev,
  unsigned long iova, int flags);
 
+extern void __acpi_device_create_direct_mappings(struct iommu_group *group,
+   struct device *acpi_device);
+
 static inline void iommu_flush_tlb_all(struct iommu_domain *domain)
 {
if (domain->ops->flush_iotlb_all)
-- 
2.17.1



[PATCH v3 1/2] iommu/vt-d:Add support for detecting ACPI device in RMRR

2020-08-27 Thread FelixCuioc
Some ACPI devices need to issue DMA requests to access
the reserved memory area. The BIOS uses the device scope type
ACPI_NAMESPACE_DEVICE in RMRR to report these ACPI devices.
This patch adds support for detecting ACPI devices in RMRR.

Signed-off-by: FelixCuioc 
---
 drivers/iommu/intel/dmar.c  | 76 +
 drivers/iommu/intel/iommu.c | 23 ++-
 include/linux/dmar.h| 12 +-
 3 files changed, 76 insertions(+), 35 deletions(-)

diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
index 93e6345f3414..f6691c36bd3f 100644
--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -215,7 +215,7 @@ static bool dmar_match_pci_path(struct dmar_pci_notify_info *info, int bus,
 }
 
 /* Return: > 0 if match found, 0 if no match found, < 0 if error happens */
-int dmar_insert_dev_scope(struct dmar_pci_notify_info *info,
+int dmar_pci_insert_dev_scope(struct dmar_pci_notify_info *info,
  void *start, void*end, u16 segment,
  struct dmar_dev_scope *devices,
  int devices_cnt)
@@ -304,7 +304,7 @@ static int dmar_pci_bus_add_dev(struct dmar_pci_notify_info *info)
 
drhd = container_of(dmaru->hdr,
struct acpi_dmar_hardware_unit, header);
-   ret = dmar_insert_dev_scope(info, (void *)(drhd + 1),
+   ret = dmar_pci_insert_dev_scope(info, (void *)(drhd + 1),
((void *)drhd) + drhd->header.length,
dmaru->segment,
dmaru->devices, dmaru->devices_cnt);
@@ -697,47 +697,59 @@ dmar_find_matched_drhd_unit(struct pci_dev *dev)
return dmaru;
 }
 
-static void __init dmar_acpi_insert_dev_scope(u8 device_number,
- struct acpi_device *adev)
+/* Return: > 0 if match found, 0 if no match found */
+bool dmar_acpi_insert_dev_scope(u8 device_number,
+   struct acpi_device *adev,
+   void *start, void *end,
+   struct dmar_dev_scope *devices,
+   int devices_cnt)
 {
-   struct dmar_drhd_unit *dmaru;
-   struct acpi_dmar_hardware_unit *drhd;
struct acpi_dmar_device_scope *scope;
struct device *tmp;
int i;
struct acpi_dmar_pci_path *path;
 
+   for (; start < end; start += scope->length) {
+   scope = start;
+   if (scope->entry_type != ACPI_DMAR_SCOPE_TYPE_NAMESPACE)
+   continue;
+   if (scope->enumeration_id != device_number)
+   continue;
+   path = (void *)(scope + 1);
+   for_each_dev_scope(devices, devices_cnt, i, tmp)
+   if (tmp == NULL) {
+   devices[i].bus = scope->bus;
+   devices[i].devfn = PCI_DEVFN(path->device, path->function);
+   rcu_assign_pointer(devices[i].dev,
+  get_device(&adev->dev));
+   return true;
+   }
+   WARN_ON(i >= devices_cnt);
+   }
+   return false;
+}
+
+static int dmar_acpi_bus_add_dev(u8 device_number, struct acpi_device *adev)
+{
+   struct dmar_drhd_unit *dmaru;
+   struct acpi_dmar_hardware_unit *drhd;
+   int ret;
+
for_each_drhd_unit(dmaru) {
drhd = container_of(dmaru->hdr,
struct acpi_dmar_hardware_unit,
header);
+   ret = dmar_acpi_insert_dev_scope(device_number, adev, (void *)(drhd+1),
+   ((void *)drhd)+drhd->header.length,
+   dmaru->devices, dmaru->devices_cnt);
+   if (ret)
+   break;
+   }
+   if (ret > 0)
+   ret = dmar_rmrr_add_acpi_dev(device_number, adev);
 
-   for (scope = (void *)(drhd + 1);
-(unsigned long)scope < ((unsigned long)drhd) + drhd->header.length;
-scope = ((void *)scope) + scope->length) {
-   if (scope->entry_type != ACPI_DMAR_SCOPE_TYPE_NAMESPACE)
-   continue;
-   if (scope->enumeration_id != device_number)
-   continue;
+   return ret;
 
-   path = (void *)(scope + 1);
-   pr_info("ACPI device \"%s\" under DMAR at %llx as %02x:%02x.%d\n",
-   dev_name(&adev->dev), dmaru->reg_base_addr,
-   scope->bus, path->device, path->function);
-   for_each_dev_scope(dmaru->devices, dmaru->devices_cnt, i, tmp)
-   

Re: [RESEND PATCH v4] iommu/mediatek: check 4GB mode by reading infracfg

2020-08-27 Thread Matthias Brugger




On 26/08/2020 10:56, Miles Chen wrote:

In previous discussion [1] and [2], we found that it is risky to
use max_pfn or totalram_pages to tell if 4GB mode is enabled.

Check 4GB mode by reading infracfg register, remove the usage
of the un-exported symbol max_pfn.

This is a step towards building mtk_iommu as a kernel module.

[1] https://lore.kernel.org/lkml/20200603161132.2441-1-miles.c...@mediatek.com/
[2] https://lore.kernel.org/lkml/20200604080120.2628-1-miles.c...@mediatek.com/
[3] https://lore.kernel.org/lkml/20200715205120.GA778876@bogus/

Cc: Mike Rapoport 
Cc: David Hildenbrand 
Cc: Yong Wu 
Cc: Yingjoe Chen 
Cc: Christoph Hellwig 
Cc: Rob Herring 
Cc: Matthias Brugger 
Signed-off-by: Miles Chen 


Reviewed-by: Matthias Brugger 



---

Change since v3
- use lore.kernel.org links
- move "change since..." after "---"

Change since v2:
- determine compatible string by m4u_plat
- rebase to next-20200720
- add "---"

Change since v1:
- remove the phandle usage, search for infracfg instead [3]
- use infracfg instead of infracfg_regmap
- move infracfg definitions to linux/soc/mediatek/infracfg.h
- update enable_4GB only when has_4gb_mode
---
  drivers/iommu/mtk_iommu.c | 34 +++
  include/linux/soc/mediatek/infracfg.h |  3 +++
  2 files changed, 32 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 785b228d39a6..adc350150492 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -3,7 +3,6 @@
   * Copyright (c) 2015-2016 MediaTek Inc.
   * Author: Yong Wu 
   */
-#include 
  #include 
  #include 
  #include 
@@ -15,13 +14,16 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 
+#include 
  #include 
  #include 
  
@@ -640,8 +642,11 @@ static int mtk_iommu_probe(struct platform_device *pdev)

struct resource *res;
resource_size_t ioaddr;
struct component_match  *match = NULL;
+   struct regmap   *infracfg;
void*protect;
int i, larb_nr, ret;
+   u32 val;
+   char*p;
  
  	data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL);

if (!data)
@@ -655,10 +660,29 @@ static int mtk_iommu_probe(struct platform_device *pdev)
return -ENOMEM;
data->protect_base = ALIGN(virt_to_phys(protect), MTK_PROTECT_PA_ALIGN);
  
-   /* Whether the current dram is over 4GB */
-   data->enable_4GB = !!(max_pfn > (BIT_ULL(32) >> PAGE_SHIFT));
-   if (!MTK_IOMMU_HAS_FLAG(data->plat_data, HAS_4GB_MODE))
-   data->enable_4GB = false;
+   data->enable_4GB = false;
+   if (MTK_IOMMU_HAS_FLAG(data->plat_data, HAS_4GB_MODE)) {
+   switch (data->plat_data->m4u_plat) {
+   case M4U_MT2712:
+   p = "mediatek,mt2712-infracfg";
+   break;
+   case M4U_MT8173:
+   p = "mediatek,mt8173-infracfg";
+   break;
+   default:
+   p = NULL;
+   }
+
+   infracfg = syscon_regmap_lookup_by_compatible(p);
+
+   if (IS_ERR(infracfg))
+   return PTR_ERR(infracfg);
+
+   ret = regmap_read(infracfg, REG_INFRA_MISC, &val);
+   if (ret)
+   return ret;
+   data->enable_4GB = !!(val & F_DDR_4GB_SUPPORT_EN);
+   }
  
  	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);

data->base = devm_ioremap_resource(dev, res);
diff --git a/include/linux/soc/mediatek/infracfg.h 
b/include/linux/soc/mediatek/infracfg.h
index fd25f0148566..233463d789c6 100644
--- a/include/linux/soc/mediatek/infracfg.h
+++ b/include/linux/soc/mediatek/infracfg.h
@@ -32,6 +32,9 @@
  #define MT7622_TOP_AXI_PROT_EN_WB (BIT(2) | BIT(6) | \
 BIT(7) | BIT(8))
  
+#define REG_INFRA_MISC         0xf00
+#define F_DDR_4GB_SUPPORT_EN   BIT(13)
+
  int mtk_infracfg_set_bus_protection(struct regmap *infracfg, u32 mask,
bool reg_update);
  int mtk_infracfg_clear_bus_protection(struct regmap *infracfg, u32 mask,
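
The change above swaps a memory-size heuristic for a hardware register
bit. A user-space sketch of the two decisions side by side, with
hypothetical helper names and the page shift passed in as a parameter
(the kernel uses PAGE_SHIFT), might look like this. The bit position is the
one defined by the patch; everything else is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define F_DDR_4GB_SUPPORT_EN_SKETCH (UINT32_C(1) << 13)  /* BIT(13), as in the patch */

/* New approach: 4GB mode is whatever the infracfg register value says. */
static bool enable_4gb_from_reg(uint32_t infra_misc)
{
    return (infra_misc & F_DDR_4GB_SUPPORT_EN_SKETCH) != 0;
}

/* Old heuristic: 4GB mode if the highest page frame lies above 4 GiB. */
static bool enable_4gb_from_max_pfn(uint64_t max_pfn, unsigned int page_shift)
{
    assert(page_shift < 32);
    return max_pfn > ((UINT64_C(1) << 32) >> page_shift);
}
```

The two can disagree, for example when the register bit is set on a board
whose highest page frame is at or below the 4 GiB boundary, which is the
kind of mismatch the commit message calls risky.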




[PATCH] iommu/arm-smmu-v3: add tracepoints for cmdq_issue_cmdlist

2020-08-27 Thread Barry Song
cmdq_issue_cmdlist() is a hotspot that consumes a lot of time. This patch
adds entry and exit tracepoints for it to help with debugging.

Signed-off-by: Barry Song 
---
 * An eBPF program can additionally be developed to benchmark using these
   tracepoints:

  cmdlistlat.c:
#include 

BPF_HASH(start, u32);
BPF_HISTOGRAM(dist);

TRACEPOINT_PROBE(arm_smmu_v3, issue_cmdlist_entry)
{
        u32 pid;
        u64 ts, *val;

        pid = bpf_get_current_pid_tgid();
        ts = bpf_ktime_get_ns();
        start.update(&pid, &ts);
        return 0;
}

TRACEPOINT_PROBE(arm_smmu_v3, issue_cmdlist_exit)
{
        u32 pid;
        u64 *tsp, delta;

        pid = bpf_get_current_pid_tgid();
        tsp = start.lookup(&pid);

        if (tsp != 0) {
                delta = bpf_ktime_get_ns() - *tsp;
                dist.increment(bpf_log2l(delta));
                start.delete(&pid);
        }

        return 0;
}

 cmdlistlat.py:
#!/usr/bin/python3
#
from __future__ import print_function
from bcc import BPF
from ctypes import c_ushort, c_int, c_ulonglong
from time import sleep
from sys import argv

def usage():
    print("USAGE: %s [interval [count]]" % argv[0])
    exit()

# arguments
interval = 5
count = -1
if len(argv) > 1:
    try:
        interval = int(argv[1])
        if interval == 0:
            raise
        if len(argv) > 2:
            count = int(argv[2])
    except: # also catches -h, --help
        usage()

# load BPF program
b = BPF(src_file = "cmdlistlat.c")

# header
print("Tracing... Hit Ctrl-C to end.")

# output
loop = 0
do_exit = 0
while (1):
    if count > 0:
        loop += 1
        if loop > count:
            exit()
    try:
        sleep(interval)
    except KeyboardInterrupt:
        pass; do_exit = 1

    print()
    b["dist"].print_log2_hist("nsecs")
    b["dist"].clear()
    if do_exit:
        exit()


 drivers/iommu/arm/arm-smmu-v3/Makefile|  1 +
 .../iommu/arm/arm-smmu-v3/arm-smmu-v3-trace.h | 48 +++
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   |  8 
 3 files changed, 57 insertions(+)
 create mode 100644 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-trace.h

diff --git a/drivers/iommu/arm/arm-smmu-v3/Makefile b/drivers/iommu/arm/arm-smmu-v3/Makefile
index 569e24e9f162..dba1087f91f3 100644
--- a/drivers/iommu/arm/arm-smmu-v3/Makefile
+++ b/drivers/iommu/arm/arm-smmu-v3/Makefile
@@ -1,2 +1,3 @@
 # SPDX-License-Identifier: GPL-2.0
+ccflags-y += -I$(src)   # needed for trace events
 obj-$(CONFIG_ARM_SMMU_V3) += arm-smmu-v3.o
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-trace.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-trace.h
new file mode 100644
index 000000000000..29ab96706124
--- /dev/null
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-trace.h
@@ -0,0 +1,48 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2020 Hisilicon Limited.
+ */
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM arm_smmu_v3
+
+#if !defined(_ARM_SMMU_V3_TRACE_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _ARM_SMMU_V3_TRACE_H
+
+#include 
+
+struct device;
+
+DECLARE_EVENT_CLASS(issue_cmdlist_class,
+   TP_PROTO(struct device *dev, int n, bool sync),
+   TP_ARGS(dev, n, sync),
+
+   TP_STRUCT__entry(
+   __string(device, dev_name(dev))
+   __field(int, n)
+   __field(bool, sync)
+   ),
+   TP_fast_assign(
+   __assign_str(device, dev_name(dev));
+   __entry->n = n;
+   __entry->sync = sync;
+   ),
+   TP_printk("%s cmd number=%d sync=%d",
+   __get_str(device), __entry->n, __entry->sync)
+);
+
+#define DEFINE_ISSUE_CMDLIST_EVENT(name)   \
+DEFINE_EVENT(issue_cmdlist_class, name,\
+   TP_PROTO(struct device *dev, int n, bool sync), \
+   TP_ARGS(dev, n, sync))
+
+DEFINE_ISSUE_CMDLIST_EVENT(issue_cmdlist_entry);
+DEFINE_ISSUE_CMDLIST_EVENT(issue_cmdlist_exit);
+
+#endif /* _ARM_SMMU_V3_TRACE_H */
+
+#undef TRACE_INCLUDE_PATH
+#undef TRACE_INCLUDE_FILE
+#define TRACE_INCLUDE_PATH .
+#define TRACE_INCLUDE_FILE arm-smmu-v3-trace
+#include 
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 7332251dd8cd..e2d7d5f1d234 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -33,6 +33,8 @@
 
 #include 
 
+#include "arm-smmu-v3-trace.h"
+
 /* MMIO registers */
 #define ARM_SMMU_IDR0              0x0
 #define IDR0_ST_LVL                GENMASK(28, 27)
@@ -1389,6 +1391,8 @@ static int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
}, head = llq;
int ret = 0;
 
+   trace_issue_cmdlist_entry(smmu->dev, n, sync);
+
/* 1. Allocate some space in the queue */
local_irq_save(flags);
llq.val = READ_ONCE(cmdq->q.llq.val);
@@ -1493,6 +1497,7 @@ static int 

[PATCH v5 1/3] iommu/arm-smmu-v3: replace symbolic permissions by octal permissions for module parameter

2020-08-27 Thread Barry Song
This fixes the checkpatch warning below:
WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using
octal permissions '0444'.
417: FILE: drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:417:
module_param_named(disable_bypass, disable_bypass, bool, S_IRUGO);

Reviewed-by: Robin Murphy 
Signed-off-by: Barry Song 
---
 -v5: add Robin's reviewed-by

 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 7196207be7ea..eea5f7c6d9ab 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -414,7 +414,7 @@
 #define MSI_IOVA_LENGTH            0x100000
 
 static bool disable_bypass = 1;
-module_param_named(disable_bypass, disable_bypass, bool, S_IRUGO);
+module_param_named(disable_bypass, disable_bypass, bool, 0444);
 MODULE_PARM_DESC(disable_bypass,
    "Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU.");
 
-- 
2.27.0
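
The equivalence behind this checkpatch warning can be checked in user
space. S_IRUGO is a kernel-only shorthand, so the sketch below spells it out
with the POSIX per-class read bits from <sys/stat.h>; the function name is
made up for illustration.

```c
#include <assert.h>
#include <sys/stat.h>

/*
 * Read permission for user, group and others: what the kernel's S_IRUGO
 * shorthand expands to, written with the POSIX mode bits.
 */
static unsigned int read_only_for_all(void)
{
    unsigned int mode = S_IRUSR | S_IRGRP | S_IROTH;

    /* No write bit may be set in a read-only mode. */
    assert((mode & (S_IWUSR | S_IWGRP | S_IWOTH)) == 0);
    return mode;
}
```

The octal literal 0444 encodes the same three read bits and can be read at a
glance, which is why checkpatch prefers it.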




[PATCH v5 0/3] iommu/arm-smmu-v3: permit users to disable msi polling

2020-08-27 Thread Barry Song
Patches 1/3 and 2/3 prepare for patch 3/3, which permits users
to disable MSI-based polling from the command line.

-v5:
  add Robin's reviewed-by

-v4:
  with respect to Robin's comments
  * cleanup the code of the existing module parameter disable_bypass
  * add ARM_SMMU_OPT_MSIPOLL flag. on the other hand, we only need to check
a bit in options rather than two bits in features

Barry Song (3):
  iommu/arm-smmu-v3: replace symbolic permissions by octal permissions
for module parameter
  iommu/arm-smmu-v3: replace module_param_named by module_param for
disable_bypass
  iommu/arm-smmu-v3: permit users to disable msi polling

 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 19 +--
 1 file changed, 13 insertions(+), 6 deletions(-)

-- 
2.27.0




[PATCH v5 2/3] iommu/arm-smmu-v3: replace module_param_named by module_param for disable_bypass

2020-08-27 Thread Barry Song
Just use module_param() - going out of the way to specify a "different"
name that's identical to the variable name is silly.

Reviewed-by: Robin Murphy 
Signed-off-by: Barry Song 
---
 -v5: add Robin's reviewed-by

 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index eea5f7c6d9ab..5b40d535a7c8 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -414,7 +414,7 @@
 #define MSI_IOVA_LENGTH0x10
 
 static bool disable_bypass = 1;
-module_param_named(disable_bypass, disable_bypass, bool, 0444);
+module_param(disable_bypass, bool, 0444);
 MODULE_PARM_DESC(disable_bypass,
"Disable bypass streams such that incoming transactions from devices 
that are not attached to an iommu domain will report an abort back to the 
device and will not be allowed to pass through the SMMU.");
 
-- 
2.27.0




[PATCH v5 3/3] iommu/arm-smmu-v3: permit users to disable msi polling

2020-08-27 Thread Barry Song
Polling by MSI isn't necessarily faster than polling by SEV. Tests on
hi1620 show hns3 100G NIC network throughput can improve from 25G to
27G if we disable MSI polling while running 16 netperf threads sending
32KB UDP packets. Single-thread TX throughput can improve from 7G to
7.7G.

The reason for the throughput improvement is that the latency to poll
the completion of CMD_SYNC becomes smaller. After sending a CMD_SYNC
in an empty cmd queue, we typically need to wait for 280ns using MSI
polling, but only around 190ns after disabling MSI polling.

This patch provides a command line option so that users can decide
whether to use MSI polling based on their own tests.

Reviewed-by: Robin Murphy 
Signed-off-by: Barry Song 
---
 -v5: add Robin's reviewed-by

 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 17 -
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 5b40d535a7c8..7332251dd8cd 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -418,6 +418,11 @@ module_param(disable_bypass, bool, 0444);
 MODULE_PARM_DESC(disable_bypass,
"Disable bypass streams such that incoming transactions from devices 
that are not attached to an iommu domain will report an abort back to the 
device and will not be allowed to pass through the SMMU.");
 
+static bool disable_msipolling;
+module_param(disable_msipolling, bool, 0444);
+MODULE_PARM_DESC(disable_msipolling,
+   "Disable MSI-based polling for CMD_SYNC completion.");
+
 enum pri_resp {
PRI_RESP_DENY = 0,
PRI_RESP_FAIL = 1,
@@ -652,6 +657,7 @@ struct arm_smmu_device {
 
 #define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0)
 #define ARM_SMMU_OPT_PAGE0_REGS_ONLY   (1 << 1)
+#define ARM_SMMU_OPT_MSIPOLL   (1 << 2)
u32 options;
 
struct arm_smmu_cmdq cmdq;
@@ -992,8 +998,7 @@ static void arm_smmu_cmdq_build_sync_cmd(u64 *cmd, struct 
arm_smmu_device *smmu,
 * Beware that Hi16xx adds an extra 32 bits of goodness to its MSI
 * payload, so the write will zero the entire command on that platform.
 */
-   if (smmu->features & ARM_SMMU_FEAT_MSI &&
-   smmu->features & ARM_SMMU_FEAT_COHERENCY) {
+   if (smmu->options & ARM_SMMU_OPT_MSIPOLL) {
ent.sync.msiaddr = q->base_dma + Q_IDX(&q->llq, prod) *
   q->ent_dwords * 8;
}
@@ -1332,8 +1337,7 @@ static int __arm_smmu_cmdq_poll_until_consumed(struct 
arm_smmu_device *smmu,
 static int arm_smmu_cmdq_poll_until_sync(struct arm_smmu_device *smmu,
 struct arm_smmu_ll_queue *llq)
 {
-   if (smmu->features & ARM_SMMU_FEAT_MSI &&
-   smmu->features & ARM_SMMU_FEAT_COHERENCY)
+   if (smmu->options & ARM_SMMU_OPT_MSIPOLL)
return __arm_smmu_cmdq_poll_until_msi(smmu, llq);
 
return __arm_smmu_cmdq_poll_until_consumed(smmu, llq);
@@ -3741,8 +3745,11 @@ static int arm_smmu_device_hw_probe(struct 
arm_smmu_device *smmu)
if (reg & IDR0_SEV)
smmu->features |= ARM_SMMU_FEAT_SEV;
 
-   if (reg & IDR0_MSI)
+   if (reg & IDR0_MSI) {
smmu->features |= ARM_SMMU_FEAT_MSI;
+   if (coherent && !disable_msipolling)
+   smmu->options |= ARM_SMMU_OPT_MSIPOLL;
+   }
 
if (reg & IDR0_HYP)
smmu->features |= ARM_SMMU_FEAT_HYP;
-- 
2.27.0
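
For readers skimming the archive, the probe-time decision in this patch can be sketched in userspace C. This is only an illustration under stated assumptions: the `demo_*` names and the `DEMO_IDR0_MSI` / `DEMO_FEAT_MSI` bit values are hypothetical and are not the driver's definitions; only the `(1 << 2)` option bit and the `coherent && !disable_msipolling` condition are taken from the diff above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values; only DEMO_OPT_MSIPOLL mirrors the patch (1 << 2). */
#define DEMO_IDR0_MSI		(1u << 0)
#define DEMO_FEAT_MSI		(1u << 0)
#define DEMO_OPT_MSIPOLL	(1u << 2)

/* Hypothetical, cut-down stand-in for struct arm_smmu_device. */
struct demo_smmu {
	uint32_t features;
	uint32_t options;
};

/* Decide once, at probe time, whether CMD_SYNC completion is polled by MSI. */
static void demo_hw_probe(struct demo_smmu *smmu, uint32_t idr0,
			  bool coherent, bool disable_msipolling)
{
	smmu->features = 0;
	smmu->options = 0;

	if (idr0 & DEMO_IDR0_MSI) {
		smmu->features |= DEMO_FEAT_MSI;
		if (coherent && !disable_msipolling)
			smmu->options |= DEMO_OPT_MSIPOLL;
	}
}

/* Hot path: one option bit to test instead of two feature bits. */
static bool demo_polls_by_msi(const struct demo_smmu *smmu)
{
	return (smmu->options & DEMO_OPT_MSIPOLL) != 0;
}

/* Small usage example. */
static void demo_self_check(void)
{
	struct demo_smmu smmu;

	demo_hw_probe(&smmu, DEMO_IDR0_MSI, true, false);
	assert(demo_polls_by_msi(&smmu));

	/* The module parameter turns polling off but leaves the feature set. */
	demo_hw_probe(&smmu, DEMO_IDR0_MSI, true, true);
	assert(!demo_polls_by_msi(&smmu));
	assert(smmu.features & DEMO_FEAT_MSI);
}
```

The point of the sketch is that the hot path no longer needs to know why MSI polling is off (no MSI support, no coherency, or the user's choice); it only reads the precomputed option bit.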




[PATCH] iommu/iova: Replace cmpxchg with xchg in queue_iova

2020-08-27 Thread Shaokun Zhang
From: Yuqi Jin 

atomic_xchg performs better than atomic_cmpxchg because no comparison is
required. Since the value of @fq_timer_on can only be 0 or 1, use
atomic_xchg instead of atomic_cmpxchg here: we only need to know whether
the value changed from 0 to 1 or was already 1.

Cc: Joerg Roedel 
Signed-off-by: Yuqi Jin 
Signed-off-by: Shaokun Zhang 
---
 drivers/iommu/iova.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 45a251da5453..30d969a4c5fd 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -579,7 +579,7 @@ void queue_iova(struct iova_domain *iovad,
 
/* Avoid false sharing as much as possible. */
if (!atomic_read(&iovad->fq_timer_on) &&
-   !atomic_cmpxchg(&iovad->fq_timer_on, 0, 1))
+   !atomic_xchg(&iovad->fq_timer_on, 1))
mod_timer(&iovad->fq_timer,
  jiffies + msecs_to_jiffies(IOVA_FQ_TIMEOUT));
 }
-- 
2.7.4
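
The equivalence the commit message relies on can be checked with a small userspace sketch using C11 atomics. This is an illustration only, not the kernel code: `demo_timer_on` is a hypothetical stand-in for `iovad->fq_timer_on`, and `atomic_exchange` / `atomic_compare_exchange_strong` stand in for the kernel's `atomic_xchg` / `atomic_cmpxchg`. It makes no claim about relative performance.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for iovad->fq_timer_on; it only ever holds 0 or 1. */
static atomic_int demo_timer_on;

/* Old form: store 1 only if the flag is currently 0. */
static bool demo_claim_cmpxchg(void)
{
	int expected = 0;

	return atomic_compare_exchange_strong(&demo_timer_on, &expected, 1);
}

/* New form: always store 1, then look at the previous value. */
static bool demo_claim_xchg(void)
{
	return atomic_exchange(&demo_timer_on, 1) == 0;
}

/* Both forms elect exactly one caller to arm the timer. */
static void demo_self_check(void)
{
	atomic_store(&demo_timer_on, 0);
	assert(demo_claim_cmpxchg());
	assert(!demo_claim_cmpxchg());

	atomic_store(&demo_timer_on, 0);
	assert(demo_claim_xchg());
	assert(!demo_claim_xchg());
}
```

Because the flag is only ever 0 or 1, writing 1 over an existing 1 changes nothing, so the unconditional exchange gives the same winner as the compare-and-exchange.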



Re: [patch V2 43/46] genirq/msi: Provide and use msi_domain_set_default_info_flags()

2020-08-27 Thread Marc Zyngier

On 2020-08-26 12:17, Thomas Gleixner wrote:

MSI interrupts have some common flags which should be set not only for
PCI/MSI interrupts.

Move the PCI/MSI flag setting into a common function so it can be reused.


Signed-off-by: Thomas Gleixner 
---
V2: New patch
---
 drivers/pci/msi.c   |7 +--
 include/linux/msi.h |1 +
 kernel/irq/msi.c|   24 
 3 files changed, 26 insertions(+), 6 deletions(-)

--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -1469,12 +1469,7 @@ struct irq_domain *pci_msi_create_irq_do
if (info->flags & MSI_FLAG_USE_DEF_CHIP_OPS)
pci_msi_domain_update_chip_ops(info);

-   info->flags |= MSI_FLAG_ACTIVATE_EARLY;
-   if (IS_ENABLED(CONFIG_GENERIC_IRQ_RESERVATION_MODE))
-   info->flags |= MSI_FLAG_MUST_REACTIVATE;
-
-   /* PCI-MSI is oneshot-safe */
-   info->chip->flags |= IRQCHIP_ONESHOT_SAFE;
+   msi_domain_set_default_info_flags(info);

domain = msi_create_irq_domain(fwnode, info, parent);
if (!domain)
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -410,6 +410,7 @@ int platform_msi_domain_alloc(struct irq
 void platform_msi_domain_free(struct irq_domain *domain, unsigned int virq,
			       unsigned int nvec);
 void *platform_msi_get_host_data(struct irq_domain *domain);
+void msi_domain_set_default_info_flags(struct msi_domain_info *info);
 #endif /* CONFIG_GENERIC_MSI_IRQ_DOMAIN */

 #ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -70,6 +70,30 @@ void get_cached_msi_msg(unsigned int irq
 EXPORT_SYMBOL_GPL(get_cached_msi_msg);

 #ifdef CONFIG_GENERIC_MSI_IRQ_DOMAIN
+void msi_domain_set_default_info_flags(struct msi_domain_info *info)
+{
+	/* Required so that a device latches a valid MSI message on startup */

+   info->flags |= MSI_FLAG_ACTIVATE_EARLY;


As far as I remember the story behind this flag (it's been a while),
it was working around a PCI-specific issue, hence being located in
the PCI code.

Now, the "program the MSI before enabling it" concept makes sense no matter
what bus this is on, and I wonder why we are even keeping this flag around.
Can't we just drop it together with the check in msi_domain_alloc_irqs()?


Thanks,

M.
--
Jazz is not dead. It just smells funny...


Re: [PATCH v3 0/6] Add virtio-iommu built-in topology

2020-08-27 Thread Jean-Philippe Brucker
On Wed, Aug 26, 2020 at 09:26:02AM -0400, Michael S. Tsirkin wrote:
> On Fri, Aug 21, 2020 at 03:15:34PM +0200, Jean-Philippe Brucker wrote:
> > Add a topology description to the virtio-iommu driver and enable x86
> > platforms.
> > 
> > Since [v2] we have made some progress on adding ACPI support for
> > virtio-iommu, which is the preferred boot method on x86. It will be a
> > new vendor-agnostic table describing para-virtual topologies in a
> > minimal format. However some platforms don't use either ACPI or DT for
> > booting (for example microvm), and will need the alternative topology
> > description method proposed here. In addition, since the process to get
> > a new ACPI table will take a long time, this provides a boot method even
> > to ACPI-based platforms, if only temporarily for testing and
> > development.
> 
> OK should I park this in next now? Seems appropriate ...

Yes that sounds like a good idea. It could uncover new bugs since there is
more automated testing happening for x86.

Thanks,
Jean


Re: [patch V2 28/46] x86/xen: Consolidate XEN-MSI init

2020-08-27 Thread Jürgen Groß

On 26.08.20 13:16, Thomas Gleixner wrote:

From: Thomas Gleixner 

X86 cannot store the irq domain pointer in struct device without breaking
XEN because the irq domain pointer takes precedence over arch_*_msi_irqs()
fallbacks.

To achieve this XEN MSI interrupt management needs to be wrapped into an
irq domain.

Move the x86_msi ops setup into a single function to prepare for this.

Signed-off-by: Thomas Gleixner 


Reviewed-by: Juergen Gross 


Juergen


Re: [patch V2 27/46] x86/xen: Rework MSI teardown

2020-08-27 Thread Jürgen Groß

On 26.08.20 13:16, Thomas Gleixner wrote:

From: Thomas Gleixner 

X86 cannot store the irq domain pointer in struct device without breaking
XEN because the irq domain pointer takes precedence over arch_*_msi_irqs()
fallbacks.

XENs MSI teardown relies on default_teardown_msi_irqs() which invokes
arch_teardown_msi_irq(). default_teardown_msi_irqs() is a trivial iterator
over the msi entries associated to a device.

Implement this loop in xen_teardown_msi_irqs() to prepare for removal of
the fallbacks for X86.

This is a preparatory step to wrap XEN MSI alloc/free into a irq domain
which in turn allows to store the irq domain pointer in struct device and
to use the irq domain functions directly.

Signed-off-by: Thomas Gleixner 


Reviewed-by: Juergen Gross 


Juergen


[PATCH] dma-pool: turn the end variable in cma_in_zone into a u64

2020-08-27 Thread Christoph Hellwig
phys_addr_t can be 32-bit, in which case smatch will complain:

kernel/dma/pool.c:79 cma_in_zone() warn: always true condition '(end <= 32) 
== 64)) ?~0:((1 << (32)) - 1))) => (0-u32max <= u32max)'

Just turn the variable into a u64 to make the range check valid.

Fixes: d7e673ec2c8e ("dma-pool: Only allocate from CMA when in same memory zone")
Reported-by: kernel test robot 
Signed-off-by: Christoph Hellwig 
---
 kernel/dma/pool.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/dma/pool.c b/kernel/dma/pool.c
index 1281c0f0442bc5..b59325bad8edda 100644
--- a/kernel/dma/pool.c
+++ b/kernel/dma/pool.c
@@ -60,7 +60,7 @@ static void dma_atomic_pool_size_add(gfp_t gfp, size_t size)
 static bool cma_in_zone(gfp_t gfp)
 {
unsigned long size;
-   phys_addr_t end;
+   u64 end;
struct cma *cma;
 
cma = dev_get_cma_area(NULL);
-- 
2.28.0
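
As a rough illustration of why smatch calls the check "always true", the comparison can be reproduced in userspace C. The sketch makes assumptions that are not in the patch: `DEMO_DMA_BIT_MASK` is a local copy of the shape of the kernel's `DMA_BIT_MASK()`, the `fits_dma32_*` helpers are hypothetical, and the 64-bit inputs exist only so both variants can be driven with the same values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copy of the shape of DMA_BIT_MASK(); redefined here for the demo. */
#define DEMO_DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* 32-bit 'end': every possible value is <= the 32-bit mask. */
static bool fits_dma32_narrow(uint64_t base, uint64_t size)
{
	uint32_t end = (uint32_t)(base + size - 1);	/* truncates */

	return end <= DEMO_DMA_BIT_MASK(32);		/* tautology */
}

/* 64-bit 'end': the range check can actually fail. */
static bool fits_dma32_wide(uint64_t base, uint64_t size)
{
	uint64_t end = base + size - 1;

	return end <= DEMO_DMA_BIT_MASK(32);
}

/* Small usage example. */
static void demo_self_check(void)
{
	/* A region starting at 4 GiB is above the 32-bit limit. */
	assert(fits_dma32_narrow(0x100000000ULL, 0x1000));	/* cannot fail */
	assert(!fits_dma32_wide(0x100000000ULL, 0x1000));
}
```

With a 32-bit variable the comparison carries no information, which is what the warning points out; widening the variable makes the expression a real range check on every configuration.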



Re: [PATCH] dma-pool: Fix an uninitialized variable bug in atomic_pool_expand()

2020-08-27 Thread Christoph Hellwig
Thanks,

applied to the dma-mapping for-linux tree.


Re: [PATCH 16/18] staging/media/tegra-vde: Clean up IOMMU workaround

2020-08-27 Thread Dmitry Osipenko
24.08.2020 17:01, Robin Murphy wrote:
...
>> Robin, thank you very much for the clarifications!
>>
>> In accordance with your comments, this patch can't be applied until Tegra
>> SMMU will support IOMMU_DOMAIN_IDENTITY and implement def_domain_type()
>> callback that returns IOMMU_DOMAIN_IDENTITY for the VDE device.
>>
>> Otherwise you're breaking the VDE driver because
>> dma_buf_map_attachment() [1] returns the IOMMU SGT of the implicit
>> domain which is then mapped into the VDE's explicit domain [2], and this
>> is a nonsense.
> 
> It's true that iommu_dma_ops will do some work in the unattached default
> domain, but non-coherent cache maintenance will still be performed
> correctly on the underlying memory, which is really all that you care
> about for this case. As for tegra_vde_iommu_map(), that seems to do the
> right thing in only referencing the physical side of the scatterlist
> (via iommu_map_sg()) and ignoring the DMA side, so things ought to work
> out OK even if it is a little non-obvious.

I'll need to double-check this, it's indeed not clear to me right now.

I see that if the Tegra DRM driver uses the implicit IOMMU domain, then
when the VDE driver imports a DMA-buf from Tegra DRM, the imported buffer
will be auto-mapped to the implicit VDE IOVA [1].

[1]
https://elixir.bootlin.com/linux/v5.9-rc2/source/drivers/gpu/drm/tegra/gem.c#L574

>> Hence, either VDE driver should bypass iommu_dma_ops from the start or
>> it needs a way to kick out the ops, like it does this using ARM's
>> arm_iommu_detach_device().
>>
>>
>> The same applies to the Tegra GPU devices, otherwise you're breaking
>> them as well because Tegra DRM is sensitive to implicit vs explicit
>> domain.
> 
> Note that Tegra DRM will only be as broken as its current state on
> arm64, and I was under the impression that that was OK now - at least I
> don't recall seeing any complaints since 43c5bf11a610. Although that
> commit and the one before it are resolving the scalability issue that
> they describe, it was very much in my mind at the time that they also
> have the happy side-effect described above - the default domain isn't
> *completely* out of the way, but it's far enough that sensible cases
> should be able to work as expected.

The Tegra DRM driver has a very special quirk for ARM32 that was added in
commit [2], and the driver relies on checking whether explicit or implicit
IOMMU is used in order to activate the quirk.

[2]
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=273da5a046965ccf0ec79eb63f2d5173467e20fa

Once the implicit IOMMU is used for the DRM driver, the quirk no longer
works (if I'm not missing something). This problem needs to be resolved
before the implicit IOMMU can be used by Tegra DRM on ARM32.

>> BTW, I tried to apply this series and T30 doesn't boot anymore. I don't
>> have more info for now.
> 
> Yeah, I'm still trying to get to the bottom of whether it's actually
> working as intended at all, even on my RK3288. So far my debugging
> instrumentation has been confusingly inconclusive :/

Surely it will take some time to resolve all the problems and it's great
that you're pushing this work!

I'll try to help with fixing the ARM32 Tegra side of the problems. I
added this to my "TODO" list and should be able to take a closer look
during this week or next!

Re: [PATCH v11 00/11] PCI: brcmstb: enable PCIe for STB chips

2020-08-27 Thread Christoph Hellwig
On Tue, Aug 25, 2020 at 10:40:27AM -0700, Florian Fainelli wrote:
> Hi,
>
> On 8/24/2020 12:30 PM, Jim Quinlan wrote:
>>
>> Patchset Summary:
>>Enhance a PCIe host controller driver.  Because of its unusual design
>>we are forced to change dev->dma_pfn_offset into a more general role
>>allowing multiple offsets.  See the 'v1' notes below for more info.
>
> We are version 11 and counting, and it is not clear to me whether there is 
> any chance of getting these patches reviewed and hopefully merged for the 
> 5.10 merge window.
>
> There are a lot of different files being touched, so what would be the 
> ideal way of routing those changes towards inclusion?

FYI, I offered to take the dma-mapping bits through the dma-mapping tree.
I have a bit of a backlog, but plan to review and, if Jim is ok with that,
apply the current version.


[PATCH 1/1] iommu/vt-d: Use device numa domain if RHSA is missing

2020-08-27 Thread Lu Baolu
If there are multiple NUMA domains but the RHSA is missing from the
ACPI/DMAR table, we can fall back to the device NUMA domain. This also
benefits the vIOMMU use case where only a single vIOMMU is exposed, hence
no RHSA will be present but the device NUMA domain can be correct.

Cc: Jacob Pan 
Cc: Kevin Tian 
Cc: Ashok Raj 
Signed-off-by: Lu Baolu 
---
 drivers/iommu/intel/iommu.c | 31 +--
 1 file changed, 29 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index e0516d64d7a3..bce158468abf 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -700,12 +700,41 @@ static int domain_update_iommu_superpage(struct 
dmar_domain *domain,
return fls(mask);
 }
 
+static int domain_update_device_node(struct dmar_domain *domain)
+{
+   struct device_domain_info *info;
+   int nid = NUMA_NO_NODE;
+
+   assert_spin_locked(&device_domain_lock);
+
+   if (list_empty(&domain->devices))
+   return NUMA_NO_NODE;
+
+   list_for_each_entry(info, &domain->devices, link) {
+   if (!info->dev)
+   continue;
+
+   nid = dev_to_node(info->dev);
+   if (nid != NUMA_NO_NODE)
+   break;
+   }
+
+   return nid;
+}
+
 /* Some capabilities may be different across iommus */
 static void domain_update_iommu_cap(struct dmar_domain *domain)
 {
domain_update_iommu_coherency(domain);
domain->iommu_snooping = domain_update_iommu_snooping(NULL);
domain->iommu_superpage = domain_update_iommu_superpage(domain, NULL);
+
+   /*
+* If RHSA is missing, we should default to the device numa domain
+* as fall back.
+*/
+   if (domain->nid == NUMA_NO_NODE)
+   domain->nid = domain_update_device_node(domain);
 }
 
 struct context_entry *iommu_context_addr(struct intel_iommu *iommu, u8 bus,
@@ -5086,8 +5115,6 @@ static struct iommu_domain 
*intel_iommu_domain_alloc(unsigned type)
if (type == IOMMU_DOMAIN_DMA)
intel_init_iova_domain(dmar_domain);
 
-   domain_update_iommu_cap(dmar_domain);
-
domain = &dmar_domain->domain;
domain->geometry.aperture_start = 0;
domain->geometry.aperture_end   =
-- 
2.17.1
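
The fallback policy in this patch can be sketched in userspace C as follows. This is only an illustration under stated assumptions: `struct demo_dev`, `demo_first_device_node()` and `demo_domain_nid()` are hypothetical names, and a plain array replaces the kernel's device list and locking. `DEMO_NUMA_NO_NODE` is set to -1 to mirror the kernel's NUMA_NO_NODE.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NUMA_NO_NODE	(-1)

/* Hypothetical stand-in for struct device_domain_info. */
struct demo_dev {
	int present;	/* models the info->dev != NULL check */
	int node;	/* models dev_to_node(info->dev) */
};

/* Return the node of the first present device that reports one. */
static int demo_first_device_node(const struct demo_dev *devs, size_t n)
{
	int nid = DEMO_NUMA_NO_NODE;

	for (size_t i = 0; i < n; i++) {
		if (!devs[i].present)
			continue;

		nid = devs[i].node;
		if (nid != DEMO_NUMA_NO_NODE)
			break;
	}

	return nid;
}

/* Keep a node that is already known; otherwise fall back to the devices. */
static int demo_domain_nid(int known_nid, const struct demo_dev *devs, size_t n)
{
	if (known_nid != DEMO_NUMA_NO_NODE)
		return known_nid;

	return demo_first_device_node(devs, n);
}

/* Small usage example. */
static void demo_self_check(void)
{
	const struct demo_dev devs[] = {
		{ 0, 3 }, { 1, DEMO_NUMA_NO_NODE }, { 1, 2 }, { 1, 1 },
	};

	assert(demo_domain_nid(DEMO_NUMA_NO_NODE, devs, 4) == 2);
	assert(demo_domain_nid(0, devs, 4) == 0);
}
```

Note that the loop stops at the first device that reports a node, so if devices in the same domain sit on different nodes, the result depends on list order.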
