Re: [PATCHv3] iommu/arm-smmu-qcom: Add debug support for TLB sync timeouts

2022-07-10 Thread Will Deacon
On Fri, 8 Jul 2022 15:12:30 +0530, Sai Prakash Ranjan wrote:
> TLB sync timeouts can be due to various reasons such as TBU power down
> or pending TCU/TBU invalidation/sync and so on. Debugging these often
> require dumping of some implementation defined registers to know the
> status of TBU/TCU operations and some of these registers are not
> accessible in non-secure world such as from kernel and requires SMC
> calls to read them in the secure world. So, add this debug support
> to dump implementation defined registers for TLB sync timeout issues.
> 
> [...]

Applied to will (for-joerg/arm-smmu/updates), thanks!

[1/1] iommu/arm-smmu-qcom: Add debug support for TLB sync timeouts
  https://git.kernel.org/will/c/b9b721d117e9

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 4/4] iommu/arm-smmu-v3: cleanup arm_smmu_dev_{enable,disable}_feature

2022-07-10 Thread Will Deacon
On Fri, Jul 08, 2022 at 10:06:16AM +0200, Christoph Hellwig wrote:
> Fold the arm_smmu_dev_has_feature arm_smmu_dev_feature_enabled into
> the main methods.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 55 ++---
>  1 file changed, 14 insertions(+), 41 deletions(-)

Thanks for the cleanup:

Acked-by: Will Deacon 

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 3/4] iommu: remove the put_resv_regions method

2022-07-10 Thread Will Deacon
On Fri, Jul 08, 2022 at 12:19:51PM +0200, Christoph Hellwig wrote:
> On Fri, Jul 08, 2022 at 11:12:45AM +0100, Will Deacon wrote:
> > Heads up, but I think this might collide (trivially?) with:
> > 
> > https://lore.kernel.org/r/20220615101044.1972-1-shameerali.kolothum.th...@huawei.com
> > 
> > which Joerg has queued up already. It looks like the cleanup still makes
> > sense though, so that's good.
> 
> This series sits on top of that one - I waited for it to hit the IOMMU
> tree before resending to avoid the conflict.

Ah brill, sorry for the noise.

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 3/4] iommu: remove the put_resv_regions method

2022-07-10 Thread Will Deacon
On Fri, Jul 08, 2022 at 10:06:15AM +0200, Christoph Hellwig wrote:
> All drivers that implement get_resv_regions just use
> generic_put_resv_regions to implement the put side.  Remove the
> indirections and document the allocations constraints.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  drivers/iommu/amd/iommu.c   |  1 -
>  drivers/iommu/apple-dart.c  |  1 -
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |  1 -
>  drivers/iommu/arm/arm-smmu/arm-smmu.c   |  1 -
>  drivers/iommu/intel/iommu.c |  1 -
>  drivers/iommu/iommu.c   | 21 -
>  drivers/iommu/mtk_iommu.c   |  1 -
>  drivers/iommu/virtio-iommu.c|  5 ++---
>  include/linux/iommu.h   |  4 
>  9 files changed, 6 insertions(+), 30 deletions(-)

Heads up, but I think this might collide (trivially?) with:

https://lore.kernel.org/r/20220615101044.1972-1-shameerali.kolothum.th...@huawei.com

which Joerg has queued up already. It looks like the cleanup still makes
sense though, so that's good.

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/2] iommu: arm-smmu-impl: Add 8250 display compatible to the client list.

2022-07-06 Thread Will Deacon
On Tue, 14 Jun 2022 16:01:35 -0700, Emma Anholt wrote:
> Required for turning on per-process page tables for the GPU.
> 
> 

Applied to will (for-joerg/arm-smmu/updates), thanks!

[1/2] iommu: arm-smmu-impl: Add 8250 display compatible to the client list.
  https://git.kernel.org/will/c/3482c0b73073
[2/2] arm64: dts: qcom: sm8250: Enable per-process page tables.
  (no commit info)

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH RESEND v5 1/5] iommu: Refactor iommu_group_store_type()

2022-07-06 Thread Will Deacon
On Wed, Jul 06, 2022 at 01:03:44PM +0100, John Garry wrote:
> On 06/07/2022 13:00, Will Deacon wrote:
> > On Mon, Apr 04, 2022 at 07:27:10PM +0800, John Garry wrote:
> > > Function iommu_group_store_type() supports changing the default domain
> > > of an IOMMU group.
> > > 
> > > Many conditions need to be satisfied and steps taken for this action to be
> > > successful.
> > > 
> > > Satisfying these conditions and steps will be required for setting other
> > > IOMMU group attributes, so factor into a common part and a part specific
> > > to update the IOMMU group attribute.
> > > 
> > > No functional change intended.
> > > 
> > > Some code comments are tidied up also.
> > > 
> > > Signed-off-by: John Garry
> > > ---
> > >   drivers/iommu/iommu.c | 96 ---
> > >   1 file changed, 62 insertions(+), 34 deletions(-)
> > Acked-by: Will Deacon
> > 
> 
> Thanks, but currently I have no plans to progress this series, in favour of
> this 
> https://lore.kernel.org/linux-iommu/1656590892-42307-1-git-send-email-john.ga...@huawei.com/T/#me0e806913050c95f6e6ba2c7f7d96d51ce191204

heh, then I'll stop reviewing it then :) Shame, I quite liked it so far!

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH RESEND v5 2/5] iova: Allow rcache range upper limit to be flexible

2022-07-06 Thread Will Deacon
On Thu, Apr 07, 2022 at 03:52:53PM +0800, Leizhen (ThunderTown) wrote:
> On 2022/4/4 19:27, John Garry wrote:
> > Some low-level drivers may request DMA mappings whose IOVA length exceeds
> > that of the current rcache upper limit.
> > 
> > This means that allocations for those IOVAs will never be cached, and
> > always must be allocated and freed from the RB tree per DMA mapping cycle.
> > This has a significant effect on performance, more so since commit
> > 4e89dce72521 ("iommu/iova: Retry from last rb tree node if iova search
> > fails"), as discussed at [0].
> > 
> > As a first step towards allowing the rcache range upper limit be
> > configured, hold this value in the IOVA rcache structure, and allocate
> > the rcaches separately.
> > 
> > Delete macro IOVA_RANGE_CACHE_MAX_SIZE in case it's reused by mistake.
> > 
> > [0] 
> > https://lore.kernel.org/linux-iommu/20210129092120.1482-1-thunder.leiz...@huawei.com/
> > 
> > Signed-off-by: John Garry 
> > ---
> >  drivers/iommu/iova.c | 20 ++--
> >  include/linux/iova.h |  3 +++
> >  2 files changed, 13 insertions(+), 10 deletions(-)
> > 
> > diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> > index db77aa675145..5c22b9187b79 100644
> > --- a/drivers/iommu/iova.c
> > +++ b/drivers/iommu/iova.c
> > @@ -15,8 +15,6 @@
> >  /* The anchor node sits above the top of the usable address space */
> >  #define IOVA_ANCHOR~0UL
> >  
> > -#define IOVA_RANGE_CACHE_MAX_SIZE 6/* log of max cached IOVA range 
> > size (in pages) */
> > -
> >  static bool iova_rcache_insert(struct iova_domain *iovad,
> >unsigned long pfn,
> >unsigned long size);
> > @@ -443,7 +441,7 @@ alloc_iova_fast(struct iova_domain *iovad, unsigned 
> > long size,
> >  * rounding up anything cacheable to make sure that can't happen. The
> >  * order of the unadjusted size will still match upon freeing.
> >  */
> > -   if (size < (1 << (IOVA_RANGE_CACHE_MAX_SIZE - 1)))
> > +   if (size < (1 << (iovad->rcache_max_size - 1)))
> > size = roundup_pow_of_two(size);
> >  
> > iova_pfn = iova_rcache_get(iovad, size, limit_pfn + 1);
> > @@ -713,13 +711,15 @@ int iova_domain_init_rcaches(struct iova_domain 
> > *iovad)
> > unsigned int cpu;
> > int i, ret;
> >  
> > -   iovad->rcaches = kcalloc(IOVA_RANGE_CACHE_MAX_SIZE,
> > +   iovad->rcache_max_size = 6; /* Arbitrarily high default */
> 
> It would be better to assign this constant value to iovad->rcache_max_size
> in init_iova_domain().

I think it's fine where it is as it's a meaningless number outside of the
rcache code.

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH RESEND v5 2/5] iova: Allow rcache range upper limit to be flexible

2022-07-06 Thread Will Deacon
On Mon, Apr 04, 2022 at 07:27:11PM +0800, John Garry wrote:
> Some low-level drivers may request DMA mappings whose IOVA length exceeds
> that of the current rcache upper limit.
> 
> This means that allocations for those IOVAs will never be cached, and
> always must be allocated and freed from the RB tree per DMA mapping cycle.
> This has a significant effect on performance, more so since commit
> 4e89dce72521 ("iommu/iova: Retry from last rb tree node if iova search
> fails"), as discussed at [0].
> 
> As a first step towards allowing the rcache range upper limit be
> configured, hold this value in the IOVA rcache structure, and allocate
> the rcaches separately.
> 
> Delete macro IOVA_RANGE_CACHE_MAX_SIZE in case it's reused by mistake.
> 
> [0] 
> https://lore.kernel.org/linux-iommu/20210129092120.1482-1-thunder.leiz...@huawei.com/
> 
> Signed-off-by: John Garry 
> ---
>  drivers/iommu/iova.c | 20 ++--
>  include/linux/iova.h |  3 +++
>  2 files changed, 13 insertions(+), 10 deletions(-)

Acked-by: Will Deacon 

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH RESEND v5 1/5] iommu: Refactor iommu_group_store_type()

2022-07-06 Thread Will Deacon
On Mon, Apr 04, 2022 at 07:27:10PM +0800, John Garry wrote:
> Function iommu_group_store_type() supports changing the default domain
> of an IOMMU group.
> 
> Many conditions need to be satisfied and steps taken for this action to be
> successful.
> 
> Satisfying these conditions and steps will be required for setting other
> IOMMU group attributes, so factor into a common part and a part specific
> to update the IOMMU group attribute.
> 
> No functional change intended.
> 
> Some code comments are tidied up also.
> 
> Signed-off-by: John Garry 
> ---
>  drivers/iommu/iommu.c | 96 ---
>  1 file changed, 62 insertions(+), 34 deletions(-)

Acked-by: Will Deacon 

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCHv2] iommu/arm-smmu-qcom: Add debug support for TLB sync timeouts

2022-07-06 Thread Will Deacon
On Thu, May 26, 2022 at 09:44:03AM +0530, Sai Prakash Ranjan wrote:
> TLB sync timeouts can be due to various reasons such as TBU power down
> or pending TCU/TBU invalidation/sync and so on. Debugging these often
> require dumping of some implementation defined registers to know the
> status of TBU/TCU operations and some of these registers are not
> accessible in non-secure world such as from kernel and requires SMC
> calls to read them in the secure world. So, add this debug support
> to dump implementation defined registers for TLB sync timeout issues.
> 
> Signed-off-by: Sai Prakash Ranjan 
> ---
> 
> Changes in v2:
>  * Use scm call consistently so that it works on older chipsets where
>some of these regs are secure registers.
>  * Add device specific data to get the implementation defined register
>offsets.
> 
> ---
>  drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 161 ++---
>  drivers/iommu/arm/arm-smmu/arm-smmu.c  |   2 +
>  drivers/iommu/arm/arm-smmu/arm-smmu.h  |   1 +
>  3 files changed, 146 insertions(+), 18 deletions(-)

If this is useful to you, then I suppose it's something we could support,
however I'm pretty worried about our ability to maintain/scale this stuff
as it is extended to support additional SoCs and other custom debugging
features.

Perhaps you could stick it all in arm-smmu-qcom-debug.c and have a new
config option for that, so at least it's even further out of the way?

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v12 0/2] iommu/mediatek: TTBR up to 35bit support

2022-07-06 Thread Will Deacon
On Thu, Jun 30, 2022 at 05:29:24PM +0800, yf.w...@mediatek.com wrote:
> This patchset adds MediaTek TTBR up to 35bit support for single normal zone.
> 
> Changes in v12:
> - Update [PATCH 1/2]: remove GENMASK(31, 7)
> - Update [PATCH 2/2]: remove MMU_PT_ADDR_MASK definition.

For both patches:

Acked-by: Will Deacon 

Joerg -- please can you pick these up for 5.20?

Thanks,

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 1/4] dt-bindings: qcom-iommu: Add Qualcomm MSM8953 compatible

2022-07-06 Thread Will Deacon
On Sun, Jun 12, 2022 at 11:22:13AM +0200, Luca Weiss wrote:
> Document the compatible used for IOMMU on the msm8953 SoC.
> 
> Signed-off-by: Luca Weiss 
> ---
> Changes from v1:
> - new patch
> 
>  Documentation/devicetree/bindings/iommu/qcom,iommu.txt | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/Documentation/devicetree/bindings/iommu/qcom,iommu.txt 
> b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
> index 059139abce35..e6cecfd360eb 100644
> --- a/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
> +++ b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
> @@ -10,6 +10,7 @@ to non-secure vs secure interrupt line.
>  - compatible   : Should be one of:
>  
>  "qcom,msm8916-iommu"
> +"qcom,msm8953-iommu"

I'm assuming Andy or Bjorn will pick this up.

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/arm-smmu-v3: Fix undefined behavior in GBPA_UPDATE

2022-07-01 Thread Will Deacon
On Thu, Jun 30, 2022 at 09:39:59AM +0300, Xenia Ragiadakou wrote:
> The expression 1 << 31 results in undefined behaviour because the type of
> integer constant 1 is (signed) int and the result of shifting 1 by 31 bits
> is not representable in the (signed) int type.
> 
> Change the type of 1 to unsigned int by adding the U suffix.
> 
> Signed-off-by: Xenia Ragiadakou 
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h 
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index cd48590ada30..44fbd499edea 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -96,7 +96,7 @@
>  #define CR2_E2H  (1 << 0)
>  
>  #define ARM_SMMU_GBPA0x44
> -#define GBPA_UPDATE  (1 << 31)
> +#define GBPA_UPDATE  (1U << 31)

There are loads of these kicking around in the kernel sources and we compile
with -fno-strict-overflow.

If you really want to change these, then let's use the BIT() macro instead,
but I think it's really just churn.

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v8 1/3] iommu/io-pgtable-arm-v7s: Add a quirk to allow pgtable PA up to 35bit

2022-06-14 Thread Will Deacon
Hi,

For some reason, this series has landed in my spam folder so apologies
for the delay :/

On Sat, Jun 11, 2022 at 06:26:53PM +0800, yf.w...@mediatek.com wrote:
> From: Yunfei Wang 
> 
> Single memory zone feature will remove ZONE_DMA32 and ZONE_DMA and
> cause pgtable PA size larger than 32bit.
> 
> Since Mediatek IOMMU hardware support at most 35bit PA in pgtable,
> so add a quirk to allow the PA of pgtables support up to bit35.
> 
> Signed-off-by: Ning Li 
> Signed-off-by: Yunfei Wang 
> ---
>  drivers/iommu/io-pgtable-arm-v7s.c | 48 ++
>  include/linux/io-pgtable.h | 17 +++
>  2 files changed, 46 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
> b/drivers/iommu/io-pgtable-arm-v7s.c
> index be066c1503d3..d4702d8d825a 100644
> --- a/drivers/iommu/io-pgtable-arm-v7s.c
> +++ b/drivers/iommu/io-pgtable-arm-v7s.c
> @@ -182,14 +182,8 @@ static bool arm_v7s_is_mtk_enabled(struct io_pgtable_cfg 
> *cfg)
>   (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_EXT);
>  }
>  
> -static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl,
> - struct io_pgtable_cfg *cfg)
> +static arm_v7s_iopte to_iopte_mtk(phys_addr_t paddr, arm_v7s_iopte pte)
>  {
> - arm_v7s_iopte pte = paddr & ARM_V7S_LVL_MASK(lvl);
> -
> - if (!arm_v7s_is_mtk_enabled(cfg))
> - return pte;
> -
>   if (paddr & BIT_ULL(32))
>   pte |= ARM_V7S_ATTR_MTK_PA_BIT32;
>   if (paddr & BIT_ULL(33))
> @@ -199,6 +193,17 @@ static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, 
> int lvl,
>   return pte;
>  }
>  
> +static arm_v7s_iopte paddr_to_iopte(phys_addr_t paddr, int lvl,
> + struct io_pgtable_cfg *cfg)
> +{
> + arm_v7s_iopte pte = paddr & ARM_V7S_LVL_MASK(lvl);
> +
> + if (!arm_v7s_is_mtk_enabled(cfg))
> + return pte;
> +
> + return to_iopte_mtk(paddr, pte);

nit, but can we rename and rework this so it reads a bit better, please?
Something like:


if (arm_v7s_is_mtk_enabled(cfg))
return to_mtk_iopte(paddr, pte);

return pte;


>  static phys_addr_t iopte_to_paddr(arm_v7s_iopte pte, int lvl,
> struct io_pgtable_cfg *cfg)
>  {
> @@ -234,6 +239,7 @@ static arm_v7s_iopte *iopte_deref(arm_v7s_iopte pte, int 
> lvl,
>  static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp,
>  struct arm_v7s_io_pgtable *data)
>  {
> + gfp_t gfp_l1 = __GFP_ZERO | ARM_V7S_TABLE_GFP_DMA;
>   struct io_pgtable_cfg *cfg = >iop.cfg;
>   struct device *dev = cfg->iommu_dev;
>   phys_addr_t phys;
> @@ -241,9 +247,11 @@ static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp,
>   size_t size = ARM_V7S_TABLE_SIZE(lvl, cfg);
>   void *table = NULL;
>  
> + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT)
> + gfp_l1 = GFP_KERNEL | __GFP_ZERO;

I think it's a bit grotty to override the flags inline like this (same for
the slab flag later on). Something like this is a bit cleaner:


/*
 * Comment explaining why GFP_KERNEL is desirable here.
 * I'm assuming it's because the walker can address all of memory.
 */
gfp_l1 = cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT ?
 GFP_KERNEL : ARM_V7S_TABLE_GFP_DMA;

...

__get_free_pages(gfp_l1 | __GFP_ZERO, ...);


and similar for the slab flag.

>   if (lvl == 1)
> - table = (void *)__get_free_pages(
> - __GFP_ZERO | ARM_V7S_TABLE_GFP_DMA, get_order(size));
> + table = (void *)__get_free_pages(gfp_l1, get_order(size));
>   else if (lvl == 2)
>   table = kmem_cache_zalloc(data->l2_tables, gfp);
>  
> @@ -251,7 +259,8 @@ static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp,
>   return NULL;
>  
>   phys = virt_to_phys(table);
> - if (phys != (arm_v7s_iopte)phys) {
> + if (phys != (arm_v7s_iopte)phys &&
> + !(cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT)) {
>   /* Doesn't fit in PTE */

Shouldn't we be checking that the address is within 35 bits here? Perhaps we
should generate a mask from the oas instead of just using the cast.

>   dev_err(dev, "Page table does not fit in PTE: %pa", );
>   goto out_free;
> @@ -457,9 +466,14 @@ static arm_v7s_iopte arm_v7s_install_table(arm_v7s_iopte 
> *table,
>  arm_v7s_iopte curr,
>  struct io_pgtable_cfg *cfg)
>  {
> + phys_addr_t phys = virt_to_phys(table);
>   arm_v7s_iopte old, new;
>  
> - new = virt_to_phys(table) | ARM_V7S_PTE_TYPE_TABLE;
> + new = phys | ARM_V7S_PTE_TYPE_TABLE;
> +
> + if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_MTK_TTBR_EXT)
> + new = to_iopte_mtk(phys, new);
> +
>   if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS)
>   

Re: [PATCH v12 6/9] iommu/arm-smmu-v3: Introduce strtab init helper

2022-06-09 Thread Will Deacon
On Tue, May 03, 2022 at 05:33:27PM +0100, Shameer Kolothum wrote:
> Introduce a helper to check the sid range and to init the l2 strtab
> entries(bypass). This will be useful when we have to initialize the
> l2 strtab with bypass for RMR SIDs.
> 
> Signed-off-by: Shameer Kolothum 
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 28 +++--
>  1 file changed, 15 insertions(+), 13 deletions(-)

Acked-by: Will Deacon 

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/6] iommu/qcom: Use the asid read from device-tree if specified

2022-05-31 Thread Will Deacon
On Tue, May 31, 2022 at 09:15:22AM -0700, Rob Clark wrote:
> On Tue, May 31, 2022 at 8:46 AM Will Deacon  wrote:
> >
> > On Fri, May 27, 2022 at 11:28:56PM +0200, Konrad Dybcio wrote:
> > > From: AngeloGioacchino Del Regno 
> > > 
> > >
> > > As specified in this driver, the context banks are 0x1000 apart.
> > > Problem is that sometimes the context number (our asid) does not
> > > match this logic and we end up using the wrong one: this starts
> > > being a problem in the case that we need to send TZ commands
> > > to do anything on a specific context.
> >
> > I don't understand this. The ASID is a software construct, so it shouldn't
> > matter what we use. If it does matter, then please can you explain why? The
> > fact that the context banks are 0x1000 apart seems unrelated.
> 
> I think the connection is that mapping from ctx bank to ASID is 1:1

But in what sense? How is the ASID used beyond a tag in the TLB? The commit
message hints at "TZ commands" being a problem.

I'm not doubting that this is needed to make the thing work, I just don't
understand why.

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 2/6] iommu/qcom: Write TCR before TTBRs to fix ASID access behavior

2022-05-31 Thread Will Deacon
On Fri, May 27, 2022 at 11:28:57PM +0200, Konrad Dybcio wrote:
> From: AngeloGioacchino Del Regno 
> 
> As also stated in the arm-smmu driver, we must write the TCR before
> writing the TTBRs, since the TCR determines the access behavior of
> some fields.

Where is this stated in the arm-smmu driver?

> 
> Signed-off-by: AngeloGioacchino Del Regno 
> 
> Signed-off-by: Marijn Suijten 
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/iommu/arm/arm-smmu/qcom_iommu.c | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu/qcom_iommu.c 
> b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
> index 1728d4d7fe25..75f353866c40 100644
> --- a/drivers/iommu/arm/arm-smmu/qcom_iommu.c
> +++ b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
> @@ -273,18 +273,18 @@ static int qcom_iommu_init_domain(struct iommu_domain 
> *domain,
>   ctx->secure_init = true;
>   }
>  
> - /* TTBRs */
> - iommu_writeq(ctx, ARM_SMMU_CB_TTBR0,
> - pgtbl_cfg.arm_lpae_s1_cfg.ttbr |
> - FIELD_PREP(ARM_SMMU_TTBRn_ASID, ctx->asid));
> - iommu_writeq(ctx, ARM_SMMU_CB_TTBR1, 0);
> -
>   /* TCR */
>   iommu_writel(ctx, ARM_SMMU_CB_TCR2,
>   arm_smmu_lpae_tcr2(_cfg));
>   iommu_writel(ctx, ARM_SMMU_CB_TCR,
>arm_smmu_lpae_tcr(_cfg) | ARM_SMMU_TCR_EAE);
>  
> + /* TTBRs */
> + iommu_writeq(ctx, ARM_SMMU_CB_TTBR0,
> + pgtbl_cfg.arm_lpae_s1_cfg.ttbr |
> + FIELD_PREP(ARM_SMMU_TTBRn_ASID, ctx->asid));
> + iommu_writeq(ctx, ARM_SMMU_CB_TTBR1, 0);

I'd have thought that SCTLR.M would be clear here, so it shouldn't matter
what order we write these in.

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/6] iommu/qcom: Use the asid read from device-tree if specified

2022-05-31 Thread Will Deacon
On Fri, May 27, 2022 at 11:28:56PM +0200, Konrad Dybcio wrote:
> From: AngeloGioacchino Del Regno 
> 
> As specified in this driver, the context banks are 0x1000 apart.
> Problem is that sometimes the context number (our asid) does not
> match this logic and we end up using the wrong one: this starts
> being a problem in the case that we need to send TZ commands
> to do anything on a specific context.

I don't understand this. The ASID is a software construct, so it shouldn't
matter what we use. If it does matter, then please can you explain why? The
fact that the context banks are 0x1000 apart seems unrelated.

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] vfio: Remove VFIO_TYPE1_NESTING_IOMMU

2022-05-20 Thread Will Deacon
On Fri, May 20, 2022 at 12:02:23PM +0100, Robin Murphy wrote:
> On 2022-05-10 17:55, Jason Gunthorpe via iommu wrote:
> > This control causes the ARM SMMU drivers to choose a stage 2
> > implementation for the IO pagetable (vs the stage 1 usual default),
> > however this choice has no visible impact to the VFIO user.
> 
> Oh, I should have read more carefully... this isn't entirely true. Stage 2
> has a different permission model from stage 1, so although it's arguably
> undocumented behaviour, VFIO users that know enough about the underlying
> system could use this to get write-only mappings if they so wish.

There's also an impact on combining memory attributes, but it's not hugely
clear how that impacts userspace via things like VFIO_DMA_CC_IOMMU.

> There may potentially also be visible differences in translation performance
> between stages, although I imagine that's firmly over in the niche of things
> that users might look at for system validation purposes, rather than for
> practical usefulness.
> 
> > Further qemu
> > never implemented this and no other userspace user is known.
> > 
> > The original description in commit f5c9ecebaf2a ("vfio/iommu_type1: add
> > new VFIO_TYPE1_NESTING_IOMMU IOMMU type") suggested this was to "provide
> > SMMU translation services to the guest operating system" however the rest
> > of the API to set the guest table pointer for the stage 1 was never
> > completed, or at least never upstreamed, rendering this part useless dead
> > code.
> > 
> > Since the current patches to enable nested translation, aka userspace page
> > tables, rely on iommufd and will not use the enable_nesting()
> > iommu_domain_op, remove this infrastructure. However, don't cut too deep
> > into the SMMU drivers for now expecting the iommufd work to pick it up -
> > we still need to create S2 IO page tables.
> > 
> > Remove VFIO_TYPE1_NESTING_IOMMU and everything under it including the
> > enable_nesting iommu_domain_op.
> > 
> > Just in-case there is some userspace using this continue to treat
> > requesting it as a NOP, but do not advertise support any more.
> 
> This also seems a bit odd, especially given that it's not actually a no-op;
> surely either it's supported and functional or it isn't?
> 
> In all honesty, I'm not personally attached to this code either way. If this
> patch had come 5 years ago, when the interface already looked like a bit of
> a dead end, I'd probably have agreed more readily. But now, when we're
> possibly mere months away from implementing the functional equivalent for
> IOMMUFD, which if done right might be able to support a trivial compat layer
> for this anyway, I just don't see what we gain from not at least waiting to
> see where that ends up. The given justification reads as "get rid of this
> code that we already know we'll need to bring back in some form, and
> half-break an unpopular VFIO ABI because it doesn't do *everything* that its
> name might imply", which just isn't convincing me.

I'm inclined to agree although we can very easily revert this patch when we
need to bring the stuff back, it would just be a bit churny.

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 5/9] iommu/arm-smmu: Attach to host1x context device bus

2022-05-16 Thread Will Deacon
On Mon, May 16, 2022 at 11:52:54AM +0300, cyn...@kapsi.fi wrote:
> From: Mikko Perttunen 
> 
> Set itself as the IOMMU for the host1x context device bus, containing
> "dummy" devices used for Host1x context isolation.
> 
> Signed-off-by: Mikko Perttunen 
> ---
>  drivers/iommu/arm/arm-smmu/arm-smmu.c | 13 +
>  1 file changed, 13 insertions(+)
> 
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
> b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> index 568cce590ccc..9ff54eaecf81 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> @@ -39,6 +39,7 @@
>  
>  #include 
>  #include 
> +#include 
>  
>  #include "arm-smmu.h"
>  
> @@ -2053,8 +2054,20 @@ static int arm_smmu_bus_init(struct iommu_ops *ops)
>   goto err_reset_pci_ops;
>   }
>  #endif
> +#ifdef CONFIG_TEGRA_HOST1X_CONTEXT_BUS
> + if (!iommu_present(_context_device_bus_type)) {
> + err = bus_set_iommu(_context_device_bus_type, ops);
> + if (err)
> + goto err_reset_fsl_mc_ops;
> + }
> +#endif
> +
>   return 0;
>  
> +err_reset_fsl_mc_ops: __maybe_unused;
> +#ifdef CONFIG_FSL_MC_BUS
> + bus_set_iommu(_mc_bus_type, NULL);
> +#endif

bus_set_iommu() is going away:

https://lore.kernel.org/r/cover.1650890638.git.robin.mur...@arm.com

Will
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[GIT PULL] iommu/arm-smmu: Updates for 5.19

2022-05-10 Thread Will Deacon
Hi Joerg,

Please pull these Arm SMMU updates for 5.19. The bulk of it is just adding
new device-tree compatible strings for the existing drivers, but there
are some non-critical fixes in here as well. Please see the tag for more
details.

I used the previous fixes pull as a base for this so that you shouldn't
run into any conflicts with upstream.

Cheers,

Will

--->8

The following changes since commit 4a25f2ea0e030b2fc852c4059a50181bfc5b2f57:

  iommu: arm-smmu: disable large page mappings for Nvidia arm-smmu (2022-04-22 
11:21:30 +0100)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git 
tags/arm-smmu-updates

for you to fetch changes up to 628bf55b620497a105f4963ee8fb84769f7e6bb4:

  iommu/arm-smmu: Force identity domains for legacy binding (2022-05-10 
12:01:31 +0100)


Arm SMMU updates for 5.19

- Add new Qualcomm device-tree compatible strings

- Add new Nvidia device-tree compatible string for Tegra234

- Fix UAF in SMMUv3 shared virtual addressing code

- Force identity-mapped domains for users of ye olde SMMU legacy binding

- Minor cleanups


Bjorn Andersson (2):
  dt-bindings: arm-smmu: Add compatible for Qualcomm SC8280XP
  iommu/arm-smmu-qcom: Add SC8280XP support

Jean-Philippe Brucker (1):
  iommu/arm-smmu-v3-sva: Fix mm use-after-free

Robin Murphy (1):
  iommu/arm-smmu: Force identity domains for legacy binding

Rohit Agarwal (1):
  dt-bindings: arm-smmu: Add binding for SDX65 SMMU

Thierry Reding (3):
  dt-bindings: arm-smmu: Document nvidia,memory-controller property
  dt-bindings: arm-smmu: Add compatible for Tegra234 SOC
  iommu/arm-smmu: Support Tegra234 SMMU

Yang Yingliang (2):
  iommu/arm-smmu: fix possible null-ptr-deref in arm_smmu_device_probe()
  iommu/arm-smmu-v3: check return value after calling 
platform_get_resource()

 .../devicetree/bindings/iommu/arm,smmu.yaml| 25 --
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c| 13 +--
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c|  2 ++
 drivers/iommu/arm/arm-smmu/arm-smmu-impl.c |  3 ++-
 drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c |  1 +
 drivers/iommu/arm/arm-smmu/arm-smmu.c  |  8 ---
 6 files changed, 44 insertions(+), 8 deletions(-)
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/arm-smmu: Force identity domains for legacy binding

2022-05-10 Thread Will Deacon
On Tue, 10 May 2022 09:38:58 +0100, Robin Murphy wrote:
> When using the legacy "mmu-masters" DT binding, we reject DMA domains
> since we have no guarantee of driver probe order and thus can't rely on
> client drivers getting the correct DMA ops. However, we can do better
> than fall back to the old no-default-domain behaviour now, by forcing an
> identity default domain instead. This also means that detaching from a
> VFIO domain can actually work - that looks to have been broken for over
> 6 years, so clearly isn't something that legacy binding users care
> about, but we may as well make the driver code make sense anyway.
> 
> [...]

Applied to will (for-joerg/arm-smmu/updates), thanks!

[1/1] iommu/arm-smmu: Force identity domains for legacy binding
  https://git.kernel.org/will/c/628bf55b6204

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v7 1/7] iommu/arm-smmu-v3: Make default domain type of HiSilicon PTT device to identity

2022-05-10 Thread Will Deacon
On Thu, Apr 07, 2022 at 08:58:35PM +0800, Yicong Yang wrote:
> The DMA operations of HiSilicon PTT device can only work properly with
> identical mappings. So add a quirk for the device to force the domain
> as passthrough.
> 
> Signed-off-by: Yicong Yang 
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 16 
>  1 file changed, 16 insertions(+)

I still don't like this being part of the SMMU driver, but given that
(a) Robin doesn't seem to agree with the objection and (b) you've been
refreshing the patch series:

Acked-by: Will Deacon 

If you do respin, then:

> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 627a3ed5ee8f..5ec15ae2a9b1 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -2839,6 +2839,21 @@ static int arm_smmu_dev_disable_feature(struct device 
> *dev,
>   }
>  }

It might be worth adding a brief comment here to explain what this device is
and why it needs an identity mapping.

> +#define IS_HISI_PTT_DEVICE(pdev) ((pdev)->vendor == PCI_VENDOR_ID_HUAWEI 
> && \
> +  (pdev)->device == 0xa12e)

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/arm-smmu: fix possible null-ptr-deref in arm_smmu_device_probe()

2022-05-06 Thread Will Deacon
On Mon, 25 Apr 2022 19:41:36 +0800, Yang Yingliang wrote:
> It will cause null-ptr-deref when using 'res', if platform_get_resource()
> returns NULL, so move using 'res' after devm_ioremap_resource() that
> will check it to avoid null-ptr-deref.
> And use devm_platform_get_and_ioremap_resource() to simplify code.
> 
> 

Applied to will (for-joerg/arm-smmu/updates), thanks!

[1/1] iommu/arm-smmu: fix possible null-ptr-deref in arm_smmu_device_probe()
  https://git.kernel.org/will/c/d9ed8af1dee3

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 0/3] iommu/arm-smmu: Support Tegra234 SMMU

2022-05-06 Thread Will Deacon
On Fri, 29 Apr 2022 10:22:40 +0200, Thierry Reding wrote:
> From: Thierry Reding 
> 
> Hi Joerg,
> 
> this is essentially a resend of v2 with a Acked-by:s from Robin and Will
> added. These have been on the list for quite a while now, but apparently
> there was a misunderstanding, so neither you nor Will picked this up.
> 
> [...]

Applied to will (for-joerg/arm-smmu/updates), thanks!

[1/3] dt-bindings: arm-smmu: Document nvidia,memory-controller property
  https://git.kernel.org/will/c/c02bda09f91a
[2/3] dt-bindings: arm-smmu: Add compatible for Tegra234 SOC
  https://git.kernel.org/will/c/95d5aeabda00
[3/3] iommu/arm-smmu: Support Tegra234 SMMU
  https://git.kernel.org/will/c/5ca216155b5e

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/arm-smmu-v3: check return value after calling platform_get_resource()

2022-05-06 Thread Will Deacon
On Mon, 25 Apr 2022 19:45:25 +0800, Yang Yingliang wrote:
> It will cause null-ptr-deref if platform_get_resource() returns NULL,
> we need check the return value.
> 
> 

Applied to will (for-joerg/arm-smmu/updates), thanks!

[1/1] iommu/arm-smmu-v3: check return value after calling 
platform_get_resource()
  https://git.kernel.org/will/c/b131fa8c1d2a

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 0/7] SDX65 devicetree updates

2022-05-06 Thread Will Deacon
On Mon, 11 Apr 2022 15:20:08 +0530, Rohit Agarwal wrote:
> This series adds devicetree nodes for SDX65. It adds reserved memory
> nodes, SDHCI, smmu and tcsr mutex support.
> 
> Changes from v1:
>  - Addressed Mani's Comments and made necessary.
>  - Rebased on top of v5.18-rc2.
> 
> [...]

Applied SMMU binding patch to will (for-joerg/arm-smmu/updates), thanks!

[4/7] dt-bindings: arm-smmu: Add binding for SDX65 SMMU
  https://git.kernel.org/will/c/5a4eb9163471

However, I must confess that I don't understand the point in updating
the binding documentation but not the driver. Should we be matching on
the new compatible string somewhere?

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2] iommu/arm-smmu-v3-sva: Fix mm use-after-free

2022-05-06 Thread Will Deacon
On Tue, 26 Apr 2022 14:04:45 +0100, Jean-Philippe Brucker wrote:
> We currently call arm64_mm_context_put() without holding a reference to
> the mm, which can result in use-after-free. Call mmgrab()/mmdrop() to
> ensure the mm only gets freed after we unpinned the ASID.
> 
> 

Applied to will (for-joerg/arm-smmu/updates), thanks!

[1/1] iommu/arm-smmu-v3-sva: Fix mm use-after-free
  https://git.kernel.org/will/c/cbd23144f766

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 0/2] iommu/arm-smmu-qcom: Add SC8280XP support

2022-05-06 Thread Will Deacon
On Tue, 3 May 2022 09:34:27 -0700, Bjorn Andersson wrote:
> This adds the compatible for the Qualcomm SC8280XP platform and associate the
> Qualcomm impl in the ARM SMMU driver to it.
> 
> Bjorn Andersson (2):
>   dt-bindings: arm-smmu: Add compatible for Qualcomm SC8280XP
>   iommu/arm-smmu-qcom: Add SC8280XP support
> 
> [...]

Applied to will (for-joerg/arm-smmu/updates), thanks!

[1/2] dt-bindings: arm-smmu: Add compatible for Qualcomm SC8280XP
  https://git.kernel.org/will/c/38db6b41b2f4
[2/2] iommu/arm-smmu-qcom: Add SC8280XP support
  https://git.kernel.org/will/c/d044023e219d

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 0/3] iommu/arm-smmu: Support Tegra234 SMMU

2022-05-05 Thread Will Deacon
On Thu, May 05, 2022 at 04:15:29PM +0200, Thierry Reding wrote:
> On Fri, Apr 29, 2022 at 10:22:40AM +0200, Thierry Reding wrote:
> > From: Thierry Reding 
> > 
> > Hi Joerg,
> > 
> > this is essentially a resend of v2 with a Acked-by:s from Robin and Will
> > added. These have been on the list for quite a while now, but apparently
> > there was a misunderstanding, so neither you nor Will picked this up.
> > 
> > Since Will acked these, I think it's probably best for you to pick these
> > up directly. If not, let me know and I'll work with Will to merge via
> > the ARM SMMU tree.
> > 
> > Thanks,
> > Thierry
> > 
> > Thierry Reding (3):
> >   dt-bindings: arm-smmu: Document nvidia,memory-controller property
> >   dt-bindings: arm-smmu: Add compatible for Tegra234 SOC
> >   iommu/arm-smmu: Support Tegra234 SMMU
> > 
> >  .../devicetree/bindings/iommu/arm,smmu.yaml   | 23 +--
> >  drivers/iommu/arm/arm-smmu/arm-smmu-impl.c|  3 ++-
> >  2 files changed, 23 insertions(+), 3 deletions(-)
> 
> Joerg,
> 
> anything left to do on this from your perspective, or can this go into
> v5.19?

I'll pick them up in the Arm SMMU queue, as there are some other SMMU
patches kicking around and we may as well keep them all together.

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[GIT PULL] iommu/arm-smmu: Fixes for 5.18

2022-04-22 Thread Will Deacon
Hi Joerg,

Unusually, we've got some SMMU driver fixes this time around. Summary in
the tag -- please can you pull these for 5.18?

Cheers,

Will

--->8

The following changes since commit 3123109284176b1532874591f7c81f3837bbdc17:

  Linux 5.18-rc1 (2022-04-03 14:08:21 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git 
tags/arm-smmu-fixes

for you to fetch changes up to 4a25f2ea0e030b2fc852c4059a50181bfc5b2f57:

  iommu: arm-smmu: disable large page mappings for Nvidia arm-smmu (2022-04-22 
11:21:30 +0100)


Arm SMMU fixes for 5.18

- Fix off-by-one in SMMUv3 SVA TLB invalidation

- Disable large mappings to workaround nvidia erratum


Ashish Mhetre (1):
  iommu: arm-smmu: disable large page mappings for Nvidia arm-smmu

Nicolin Chen (1):
  iommu/arm-smmu-v3: Fix size calculation in arm_smmu_mm_invalidate_range()

 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c |  9 +++-
 drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c| 30 +
 2 files changed, 38 insertions(+), 1 deletion(-)
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [Patch v2] iommu: arm-smmu: disable large page mappings for Nvidia arm-smmu

2022-04-22 Thread Will Deacon
On Thu, 21 Apr 2022 13:45:04 +0530, Ashish Mhetre wrote:
> Tegra194 and Tegra234 SoCs have the erratum that causes walk cache
> entries to not be invalidated correctly. The problem is that the walk
> cache index generated for IOVA is not same across translation and
> invalidation requests. This is leading to page faults when PMD entry is
> released during unmap and populated with new PTE table during subsequent
> map request. Disabling large page mappings avoids the release of PMD
> entry and avoid translations seeing stale PMD entry in walk cache.
> Fix this by limiting the page mappings to PAGE_SIZE for Tegra194 and
> Tegra234 devices. This is recommended fix from Tegra hardware design
> team.
> 
> [...]

Applied to will (for-joerg/arm-smmu/fixes), thanks!

[1/1] iommu: arm-smmu: disable large page mappings for Nvidia arm-smmu
  https://git.kernel.org/will/c/4a25f2ea0e03

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 04/13] iommu/arm-smmu: Clean up bus_set_iommu()

2022-04-21 Thread Will Deacon
On Wed, Apr 20, 2022 at 05:05:03PM +0100, Robin Murphy wrote:
> On 2022-04-19 15:40, Will Deacon wrote:
> > On Thu, Apr 14, 2022 at 01:42:33PM +0100, Robin Murphy wrote:
> > > Stop calling bus_set_iommu() since it's now unnecessary. With device
> > > probes now replayed for every IOMMU instance registration, the whole
> > > sorry ordering workaround for legacy DT bindings goes too, hooray!
> > 
> > Ha, I hope you tested this!
> 
> Oh alright then, since it's you... :)
> 
> I've hacked up a Juno DT with the old bindings, and (after needing a while
> to remember that they're fundamentally incompatible with disable_bypass),
> can confirm that with my whole dev branch including this series applied, it
> boots and creates IOMMU groups as expected. I then made the mistake of
> trying without the branch to check whether the squawks from
> iommu_setup_dma_ops() were new or not, and... well... plain rc3 doesn't even
> boot on the same setup - it's somehow blowing up in the failure cleanup path
> of iommu_bus_init(), apparently calling iommu_release_device() on something
> where dev->iommu->iommu_dev is NULL, for reasons that are far from clear and
> I'm not sure I can really be bothered to debug further... :/

Great, so your series is a fix!

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/arm-smmu-v3: Fix size calculation in arm_smmu_mm_invalidate_range()

2022-04-20 Thread Will Deacon
On Tue, 19 Apr 2022 14:01:58 -0700, Nicolin Chen wrote:
> The arm_smmu_mm_invalidate_range function is designed to be called
> by mm core for Shared Virtual Addressing purpose between IOMMU and
> CPU MMU. However, the ways of two subsystems defining their "end"
> addresses are slightly different. IOMMU defines its "end" address
> using the last address of an address range, while mm core defines
> that using the following address of an address range:
> 
> [...]

Applied to will (for-joerg/arm-smmu/fixes), thanks!

[1/1] iommu/arm-smmu-v3: Fix size calculation in arm_smmu_mm_invalidate_range()
  https://git.kernel.org/will/c/95d4782c34a6

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 05/13] iommu/arm-smmu-v3: Clean up bus_set_iommu()

2022-04-19 Thread Will Deacon
On Thu, Apr 14, 2022 at 01:42:34PM +0100, Robin Murphy wrote:
> Stop calling bus_set_iommu() since it's now unnecessary, and simplify
> the probe failure path accordingly.
> 
> Signed-off-by: Robin Murphy 
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 53 +
>  1 file changed, 2 insertions(+), 51 deletions(-)

Acked-by: Will Deacon 

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 04/13] iommu/arm-smmu: Clean up bus_set_iommu()

2022-04-19 Thread Will Deacon
On Thu, Apr 14, 2022 at 01:42:33PM +0100, Robin Murphy wrote:
> Stop calling bus_set_iommu() since it's now unnecessary. With device
> probes now replayed for every IOMMU instance registration, the whole
> sorry ordering workaround for legacy DT bindings goes too, hooray!

Ha, I hope you tested this!

> Signed-off-by: Robin Murphy 
> ---
>  drivers/iommu/arm/arm-smmu/arm-smmu.c | 84 +--
>  1 file changed, 2 insertions(+), 82 deletions(-)

Assuming it works,

Acked-by: Will Deacon 

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 1/4] dt-bindings: arm-smmu: Document nvidia,memory-controller property

2022-03-09 Thread Will Deacon
On Tue, Mar 08, 2022 at 04:59:05PM +0100, Thierry Reding wrote:
> On Wed, Feb 16, 2022 at 02:25:20PM +0100, Thierry Reding wrote:
> > On Thu, Dec 09, 2021 at 05:35:57PM +0100, Thierry Reding wrote:
> > > From: Thierry Reding 
> > > 
> > > On NVIDIA SoC's the ARM SMMU needs to interact with the memory
> > > controller in order to map memory clients to the corresponding stream
> > > IDs. Document how the nvidia,memory-controller property can be used to
> > > achieve this.
> > > 
> > > Note that this is a backwards-incompatible change that is, however,
> > > necessary to ensure correctness. Without the new property, most of the
> > > devices would still work but it is not guaranteed that all will.
> > > 
> > > Signed-off-by: Thierry Reding 
> > > ---
> > > Changes in v2:
> > > - clarify why the new nvidia,memory-controller property is required
> > > 
> > >  .../devicetree/bindings/iommu/arm,smmu.yaml | 17 +
> > >  1 file changed, 17 insertions(+)
> > 
> > Hi Joerg,
> > 
> > can you pick up patches 1-3 of this series? DT bindings have been
> > reviewed by Rob and Will acked the ARM SMMU change. I can take the
> > device tree changes (patch 4) through the Tegra tree.
> 
> Will, Robin, Joerg,
> 
> I haven't seen this show up in linux-next yet but was hoping to see this
> go in for v5.18. Anything I can do to help this move along?

Hmm, I guess I could've taken 1-3, but after your message to Joerg I just
Acked the driver change on the assumption that was a dependency or
something.

In any case, I'm happy for Joerg to pick these three up directly if he has
time this late in the cycle.

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[GIT PULL] iommu/arm-smmu: Updates for 5.18

2022-03-08 Thread Will Deacon
Hi Joerg,

Please pull this handful of Arm SMMU updates for 5.18. Summary in the tag.

Cheers,

Will

--->8

The following changes since commit 26291c54e111ff6ba87a164d85d4a4e134b7315c:

  Linux 5.17-rc2 (2022-01-30 15:37:07 +0200)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git 
tags/arm-smmu-updates

for you to fetch changes up to 97dfad194ca8de04c7292d4f4c8dc493c0d20f85:

  iommu/arm-smmu: Account for PMU interrupts (2022-03-07 21:09:39 +)


Arm SMMU updates for 5.18

- Fix SMMUv3 soft lockup during continuous stream of events

- Fix error path for Qualcomm SMMU probe()

- Rework SMMU IRQ setup to prepare the ground for PMU support

- Minor cleanups and refactoring


Andy Shevchenko (1):
  perf/smmuv3: Don't cast parameter in bit operations

Christophe JAILLET (2):
  iommu/arm-smmu-v3: Avoid open coded arithmetic in memory allocation
  iommu/arm-smmu-v3: Simplify memory allocation

Miaoqian Lin (1):
  iommu/arm-smmu: Add missing pm_runtime_disable() in 
qcom_iommu_device_probe

Robin Murphy (1):
  iommu/arm-smmu: Account for PMU interrupts

Zhou Guanghui (1):
  iommu/arm-smmu-v3: fix event handling soft lockup

 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 27 +++--
 drivers/iommu/arm/arm-smmu/arm-smmu.c   | 93 ++---
 drivers/iommu/arm/arm-smmu/arm-smmu.h   |  5 +-
 drivers/iommu/arm/arm-smmu/qcom_iommu.c | 10 +++-
 drivers/perf/arm_smmuv3_pmu.c   |  4 +-
 5 files changed, 64 insertions(+), 75 deletions(-)
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 8/8] iommu/arm-smmu-v3: Make default domain type of HiSilicon PTT device to identity

2022-02-15 Thread Will Deacon
On Tue, Feb 15, 2022 at 01:30:26PM +, Robin Murphy wrote:
> On 2022-02-15 13:00, Will Deacon wrote:
> > On Mon, Feb 14, 2022 at 08:55:20PM +0800, Yicong Yang wrote:
> > > On 2022/1/24 21:11, Yicong Yang wrote:
> > > > The DMA of HiSilicon PTT device can only work with identical
> > > > mapping. So add a quirk for the device to force the domain
> > > > passthrough.
> > > > 
> > > > Signed-off-by: Yicong Yang 
> > > > ---
> > > >   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 16 
> > > >   1 file changed, 16 insertions(+)
> > > > 
> > > > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
> > > > b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > > > index 6dc6d8b6b368..6f67a2b1dd27 100644
> > > > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > > > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > > > @@ -2838,6 +2838,21 @@ static int arm_smmu_dev_disable_feature(struct 
> > > > device *dev,
> > > > }
> > > >   }
> > > > +#define IS_HISI_PTT_DEVICE(pdev)   ((pdev)->vendor == 
> > > > PCI_VENDOR_ID_HUAWEI && \
> > > > +(pdev)->device == 0xa12e)
> > > > +
> > > > +static int arm_smmu_def_domain_type(struct device *dev)
> > > > +{
> > > > +   if (dev_is_pci(dev)) {
> > > > +   struct pci_dev *pdev = to_pci_dev(dev);
> > > > +
> > > > +   if (IS_HISI_PTT_DEVICE(pdev))
> > > > +   return IOMMU_DOMAIN_IDENTITY;
> > > > +   }
> > > > +
> > > > +   return 0;
> > > > +}
> > > > +
> > > >   static struct iommu_ops arm_smmu_ops = {
> > > > .capable= arm_smmu_capable,
> > > > .domain_alloc   = arm_smmu_domain_alloc,
> > > > @@ -2863,6 +2878,7 @@ static struct iommu_ops arm_smmu_ops = {
> > > > .sva_unbind = arm_smmu_sva_unbind,
> > > > .sva_get_pasid  = arm_smmu_sva_get_pasid,
> > > > .page_response  = arm_smmu_page_response,
> > > > +   .def_domain_type= arm_smmu_def_domain_type,
> > > > .pgsize_bitmap  = -1UL, /* Restricted during device 
> > > > attach */
> > > > .owner  = THIS_MODULE,
> > > >   };
> > > > 
> > > 
> > > Is this quirk ok with the SMMU v3 driver? Just want to confirm that I'm 
> > > on the
> > > right way to dealing with the issue of our device.
> > 
> > I don't think the quirk should be in the SMMUv3 driver. Assumedly, you would
> > have the exact same problem if you stuck the PTT device behind a different
> > type of IOMMU, and so the quirk should be handled by a higher level of the
> > stack.
> 
> Conceptually, yes, but I'm inclined to be pragmatic here. Default domain
> quirks could only move out as far as the other end of the call from
> iommu_get_def_domain_type() - it's not like we could rely on some flag in a
> driver which may not even be loaded yet, let alone matched to the device.
> And even then there's an equal and opposite argument for why the core code
> should have to maintain a list of platform-specific quirks rather than code
> specific to the relevant platforms. The fact is that a HiSilicon RCiEP is
> not going to end up behind anything other than a HiSilicon IOMMU, and if
> those ever stop being SMMUv3 *and* such a quirk still exists we can worry
> about it then.

Perhaps, but you know that by adding this hook it's only a matter of time
before we get random compatible string matches in there, so I'd rather keep
the flood gates closed as long as we can.

Given that this is a PCI device, why can't we have a PCI quirk for devices
which require an identity mapping and then handle that in the IOMMU core?

> Ugly as it is, this is the status quo. I don't recall anyone ever arguing
> that the equivalent quirks for Intel integrated graphics should be made
> generic ;)

I don't know anything about Intel integrated graphics. Have they solved this
problem in a better way, or could they equally make use of a generic quirk?

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 8/8] iommu/arm-smmu-v3: Make default domain type of HiSilicon PTT device to identity

2022-02-15 Thread Will Deacon
On Mon, Feb 14, 2022 at 08:55:20PM +0800, Yicong Yang wrote:
> On 2022/1/24 21:11, Yicong Yang wrote:
> > The DMA of HiSilicon PTT device can only work with identical
> > mapping. So add a quirk for the device to force the domain
> > passthrough.
> > 
> > Signed-off-by: Yicong Yang 
> > ---
> >  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 16 
> >  1 file changed, 16 insertions(+)
> > 
> > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
> > b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > index 6dc6d8b6b368..6f67a2b1dd27 100644
> > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > @@ -2838,6 +2838,21 @@ static int arm_smmu_dev_disable_feature(struct 
> > device *dev,
> > }
> >  }
> >  
> > +#define IS_HISI_PTT_DEVICE(pdev)   ((pdev)->vendor == PCI_VENDOR_ID_HUAWEI 
> > && \
> > +(pdev)->device == 0xa12e)
> > +
> > +static int arm_smmu_def_domain_type(struct device *dev)
> > +{
> > +   if (dev_is_pci(dev)) {
> > +   struct pci_dev *pdev = to_pci_dev(dev);
> > +
> > +   if (IS_HISI_PTT_DEVICE(pdev))
> > +   return IOMMU_DOMAIN_IDENTITY;
> > +   }
> > +
> > +   return 0;
> > +}
> > +
> >  static struct iommu_ops arm_smmu_ops = {
> > .capable= arm_smmu_capable,
> > .domain_alloc   = arm_smmu_domain_alloc,
> > @@ -2863,6 +2878,7 @@ static struct iommu_ops arm_smmu_ops = {
> > .sva_unbind = arm_smmu_sva_unbind,
> > .sva_get_pasid  = arm_smmu_sva_get_pasid,
> > .page_response  = arm_smmu_page_response,
> > +   .def_domain_type= arm_smmu_def_domain_type,
> > .pgsize_bitmap  = -1UL, /* Restricted during device attach */
> > .owner  = THIS_MODULE,
> >  };
> > 
>
> Is this quirk ok with the SMMU v3 driver? Just want to confirm that I'm on the
> right way to dealing with the issue of our device.

I don't think the quirk should be in the SMMUv3 driver. Assumedly, you would
have the exact same problem if you stuck the PTT device behind a different
type of IOMMU, and so the quirk should be handled by a higher level of the
stack.

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/arm-smmu-v3: fix event handling soft lockup

2022-02-08 Thread Will Deacon
On Wed, 19 Jan 2022 07:07:54 +, Zhou Guanghui wrote:
> During event processing, events are read from the event queue one
> by one until the queue is empty.If the master device continuously
> requests address access at the same time and the SMMU generates
> events, the cyclic processing of the event takes a long time and
> softlockup warnings may be reported.
> 
> arm-smmu-v3 arm-smmu-v3.34.auto: event 0x0a received:
> arm-smmu-v3 arm-smmu-v3.34.auto:  0x7f22280a
> arm-smmu-v3 arm-smmu-v3.34.auto:  0x107e
> arm-smmu-v3 arm-smmu-v3.34.auto:  0x034e8670
> watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [irq/268-arm-smm:247]
> Call trace:
>  _dev_info+0x7c/0xa0
>  arm_smmu_evtq_thread+0x1c0/0x230
>  irq_thread_fn+0x30/0x80
>  irq_thread+0x128/0x210
>  kthread+0x134/0x138
>  ret_from_fork+0x10/0x1c
> Kernel panic - not syncing: softlockup: hung tasks
> 
> [...]

Applied to will (for-joerg/arm-smmu/updates), thanks!

[1/1] iommu/arm-smmu-v3: fix event handling soft lockup
  https://git.kernel.org/will/c/30de2b541af9

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/arm-smmu: Add missing pm_runtime_disable() in qcom_iommu_device_probe

2022-02-08 Thread Will Deacon
On Wed, 5 Jan 2022 10:16:19 +, Miaoqian Lin wrote:
> If the probe fails, we should use pm_runtime_disable() to balance
> pm_runtime_enable().
> Add missing pm_runtime_disable() for error handling.
> 
> 

Applied to will (for-joerg/arm-smmu/updates), thanks!

[1/1] iommu/arm-smmu: Add missing pm_runtime_disable() in 
qcom_iommu_device_probe
  https://git.kernel.org/will/c/93665e0275a2

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 1/2] iommu/arm-smmu-v3: Avoid open coded arithmetic in memory allocation

2022-02-08 Thread Will Deacon
On Mon, 7 Feb 2022 23:50:48 +0100, Christophe JAILLET wrote:
> kmalloc_array()/kcalloc() should be used to avoid potential overflow when
> a multiplication is needed to compute the size of the requested memory.
> 
> So turn a devm_kzalloc()+explicit size computation into an equivalent
> devm_kcalloc().
> 
> 
> [...]

Applied to will (for-joerg/arm-smmu/updates), thanks!

[1/2] iommu/arm-smmu-v3: Avoid open coded arithmetic in memory allocation
  https://git.kernel.org/will/c/98b64741d611
[2/2] iommu/arm-smmu-v3: Simplify memory allocation
  https://git.kernel.org/will/c/fcdeb8c34043

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/2] iommu/arm-smmu: Use platform_irq_count() to get the interrupt count

2022-02-08 Thread Will Deacon
On Tue, Feb 08, 2022 at 03:28:50PM +, Robin Murphy wrote:
> On 2022-02-08 15:19, Will Deacon wrote:
> > On Thu, Dec 23, 2021 at 02:14:35PM +, Robin Murphy wrote:
> > > On 2021-12-23 13:00, Lad Prabhakar wrote:
> > > > platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static
> > > > allocation of IRQ resources in DT core code, this causes an issue
> > > > when using hierarchical interrupt domains using "interrupts" property
> > > > in the node as this bypasses the hierarchical setup and messes up the
> > > > irq chaining.
> > > > 
> > > > In preparation for removal of static setup of IRQ resource from DT core
> > > > code use platform_get_irq_count().
> > > 
> > > Nit: platform_irq_count()
> > > 
> > > > Signed-off-by: Lad Prabhakar 
> > > > ---
> > > >drivers/iommu/arm/arm-smmu/arm-smmu.c | 12 ++--
> > > >1 file changed, 6 insertions(+), 6 deletions(-)
> > > > 
> > > > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
> > > > b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > > > index 4bc75c4ce402..4844cd075644 100644
> > > > --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > > > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > > > @@ -2105,12 +2105,12 @@ static int arm_smmu_device_probe(struct 
> > > > platform_device *pdev)
> > > > if (IS_ERR(smmu))
> > > > return PTR_ERR(smmu);
> > > > -   num_irqs = 0;
> > > > -   while ((res = platform_get_resource(pdev, IORESOURCE_IRQ, 
> > > > num_irqs))) {
> > > > -   num_irqs++;
> > > > -   if (num_irqs > smmu->num_global_irqs)
> > > > -   smmu->num_context_irqs++;
> > > > -   }
> > > > +   num_irqs = platform_irq_count(pdev);
> > > > +   if (num_irqs < 0)
> > > > +   return num_irqs;
> > > > +
> > > > +   if (num_irqs > smmu->num_global_irqs)
> > > > +   smmu->num_context_irqs += (num_irqs - 
> > > > smmu->num_global_irqs);
> > > 
> > > This seems a bit overcomplicated. I reckon:
> > > 
> > >   smmu->num_context_irqs = num_irqs - smmu->num_global_irqs;
> > >   if (num_irqs <= smmu->num_global_irqs) {
> > >   dev_err(...
> > > 
> > > should do it.
> > > 
> > > However, FYI I have some patches refactoring most of the IRQ stuff here 
> > > that
> > > I plan to post next cycle (didn't quite have time to get them done for 
> > > 5.17
> > > as I'd hoped...), so unless this needs to go in right now as an urgent 
> > > fix,
> > > I'm happy to take care of removing platform_get_resource() as part of that
> > > if it's easier.
> > 
> > Did you get anywhere with this? December 23rd is long forgotten by now ;)
> 
> Yup: 
> https://gitlab.arm.com/linux-arm/linux-rm/-/commit/b2a40caaf1622eb35c555074a0d72f4f0513cff9
> 
> I'm still cleaning up the PMU driver that justifies the whole thing, but I
> can send that patch now if you reckon it's not complete nonsense out of
> context. Otherwise, I'll aim to get the whole lot out next week.

Next week is fine, no rush. I was just trying to work out what to do with
Lad's patches in this thread and it sounds like I should ignore them and
wait for your series instead.

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/2] iommu/arm-smmu: Use platform_irq_count() to get the interrupt count

2022-02-08 Thread Will Deacon
On Thu, Dec 23, 2021 at 02:14:35PM +, Robin Murphy wrote:
> On 2021-12-23 13:00, Lad Prabhakar wrote:
> > platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static
> > allocation of IRQ resources in DT core code, this causes an issue
> > when using hierarchical interrupt domains using "interrupts" property
> > in the node as this bypasses the hierarchical setup and messes up the
> > irq chaining.
> > 
> > In preparation for removal of static setup of IRQ resource from DT core
> > code use platform_get_irq_count().
> 
> Nit: platform_irq_count()
> 
> > Signed-off-by: Lad Prabhakar 
> > ---
> >   drivers/iommu/arm/arm-smmu/arm-smmu.c | 12 ++--
> >   1 file changed, 6 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
> > b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > index 4bc75c4ce402..4844cd075644 100644
> > --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > @@ -2105,12 +2105,12 @@ static int arm_smmu_device_probe(struct 
> > platform_device *pdev)
> > if (IS_ERR(smmu))
> > return PTR_ERR(smmu);
> > -   num_irqs = 0;
> > -   while ((res = platform_get_resource(pdev, IORESOURCE_IRQ, num_irqs))) {
> > -   num_irqs++;
> > -   if (num_irqs > smmu->num_global_irqs)
> > -   smmu->num_context_irqs++;
> > -   }
> > +   num_irqs = platform_irq_count(pdev);
> > +   if (num_irqs < 0)
> > +   return num_irqs;
> > +
> > +   if (num_irqs > smmu->num_global_irqs)
> > +   smmu->num_context_irqs += (num_irqs - smmu->num_global_irqs);
> 
> This seems a bit overcomplicated. I reckon:
> 
>   smmu->num_context_irqs = num_irqs - smmu->num_global_irqs;
>   if (num_irqs <= smmu->num_global_irqs) {
>   dev_err(...
> 
> should do it.
> 
> However, FYI I have some patches refactoring most of the IRQ stuff here that
> I plan to post next cycle (didn't quite have time to get them done for 5.17
> as I'd hoped...), so unless this needs to go in right now as an urgent fix,
> I'm happy to take care of removing platform_get_resource() as part of that
> if it's easier.

Did you get anywhere with this? December 23rd is long forgotten by now ;)

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[GIT PULL] iommu/arm-smmu: Updates for 5.17

2021-12-15 Thread Will Deacon
Hi Joerg,

Please pull these Arm SMMU updates for 5.17.

Once again, there's not a lot here. In fact, it's mostly a combination
of non-critical fixes and DT compatible string additions. Summary in
the tag.

Cheers,

Will

--->8

The following changes since commit 0fcfb00b28c0b7884635dacf38e46d60bf3d4eb1:

  Linux 5.16-rc4 (2021-12-05 14:08:22 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git 
tags/arm-smmu-updates

for you to fetch changes up to 477436699e7801276fa7306e20318156cb535249:

  Revert "iommu/arm-smmu-v3: Decrease the queue size of evtq and priq" 
(2021-12-15 10:14:06 +)


Arm SMMU updates for 5.17

- Revert evtq and priq back to their former sizes

- Return early on short-descriptor page-table allocation failure

- Fix page fault reporting for Adreno GPU on SMMUv2

- Make SMMUv3 MMU notifier ops 'const'

- Numerous new compatible strings for Qualcomm SMMUv2 implementations


David Heidelberg (1):
  dt-bindings: arm-smmu: Add compatible for the SDX55 SoC

Rikard Falkeborn (1):
  iommu/arm-smmu-v3: Constify arm_smmu_mmu_notifier_ops

Rob Clark (1):
  iommu/arm-smmu-qcom: Fix TTBR0 read

Vinod Koul (2):
  dt-bindings: arm-smmu: Add compatible for SM8450 SoC
  iommu: arm-smmu-impl: Add SM8450 qcom iommu implementation

Yunfei Wang (1):
  iommu/io-pgtable-arm-v7s: Add error handle for page table allocation 
failure

Zhou Wang (1):
  Revert "iommu/arm-smmu-v3: Decrease the queue size of evtq and priq"

 Documentation/devicetree/bindings/iommu/arm,smmu.yaml | 2 ++
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   | 2 +-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   | 5 ++---
 drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c| 3 ++-
 drivers/iommu/io-pgtable-arm-v7s.c| 6 +-
 5 files changed, 12 insertions(+), 6 deletions(-)

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] Revert "iommu/arm-smmu-v3: Decrease the queue size of evtq and priq"

2021-12-15 Thread Will Deacon
On Tue, 7 Dec 2021 14:32:48 +0800, Zhou Wang wrote:
> The commit f115f3c0d5d8 ("iommu/arm-smmu-v3: Decrease the queue size of
> evtq and priq") decreases evtq and priq, which may lead evtq/priq to be
> full with fault events, e.g HiSilicon ZIP/SEC/HPRE have maximum 1024 queues
> in one device, every queue could be binded with one process and trigger a
> fault event. So let's revert f115f3c0d5d8.
> 
> In fact, if an implementation of SMMU really does not need so long evtq
> and priq, value of IDR1_EVTQS and IDR1_PRIQS can be set to proper ones.
> 
> [...]

Applied to will (for-joerg/arm-smmu/updates), thanks!

[1/1] Revert "iommu/arm-smmu-v3: Decrease the queue size of evtq and priq"
  https://git.kernel.org/will/c/477436699e78

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 1/3] iommu/io-pgtable-arm: Add way to debug pgtable walk

2021-12-14 Thread Will Deacon
On Tue, Oct 05, 2021 at 08:16:25AM -0700, Rob Clark wrote:
> From: Rob Clark 
> 
> Add an io-pgtable method to retrieve the raw PTEs that would be
> traversed for a given iova access.
> 
> Signed-off-by: Rob Clark 
> ---
>  drivers/iommu/io-pgtable-arm.c | 40 +++---
>  include/linux/io-pgtable.h |  9 
>  2 files changed, 41 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index dd9e47189d0d..c470fc0b3c2b 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -700,38 +700,61 @@ static size_t arm_lpae_unmap(struct io_pgtable_ops 
> *ops, unsigned long iova,
>   return arm_lpae_unmap_pages(ops, iova, size, 1, gather);
>  }
>  
> -static phys_addr_t arm_lpae_iova_to_phys(struct io_pgtable_ops *ops,
> -  unsigned long iova)
> +static int arm_lpae_pgtable_walk(struct io_pgtable_ops *ops, unsigned long 
> iova,
> +  void *_ptes, int *num_ptes)
>  {
>   struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
>   arm_lpae_iopte pte, *ptep = data->pgd;
> + arm_lpae_iopte *ptes = _ptes;
> + int max_ptes = *num_ptes;
>   int lvl = data->start_level;
>  
> + *num_ptes = 0;
> +
>   do {
> + if (*num_ptes >= max_ptes)
> + return -ENOSPC;
> +
>   /* Valid IOPTE pointer? */
>   if (!ptep)
> - return 0;
> + return -EFAULT;
>  
>   /* Grab the IOPTE we're interested in */
>   ptep += ARM_LPAE_LVL_IDX(iova, lvl, data);
>   pte = READ_ONCE(*ptep);
>  
> + ptes[(*num_ptes)++] = pte;
> +
>   /* Valid entry? */
>   if (!pte)
> - return 0;
> + return -EFAULT;
>  
>   /* Leaf entry? */
>   if (iopte_leaf(pte, lvl, data->iop.fmt))
> - goto found_translation;
> + return 0;
>  
>   /* Take it to the next level */
>   ptep = iopte_deref(pte, data);
>   } while (++lvl < ARM_LPAE_MAX_LEVELS);
>  
> - /* Ran out of page tables to walk */
> - return 0;
> + return -EFAULT;
> +}
> +
> +static phys_addr_t arm_lpae_iova_to_phys(struct io_pgtable_ops *ops,
> +  unsigned long iova)
> +{
> + struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
> + arm_lpae_iopte pte, ptes[ARM_LPAE_MAX_LEVELS];
> + int lvl, num_ptes = ARM_LPAE_MAX_LEVELS;
> + int ret;
> +
> + ret = arm_lpae_pgtable_walk(ops, iova, ptes, _ptes);
> + if (ret)
> + return 0;
> +
> + pte = ptes[num_ptes - 1];
> + lvl = num_ptes - 1 + data->start_level;
>  
> -found_translation:
>   iova &= (ARM_LPAE_BLOCK_SIZE(lvl, data) - 1);
>   return iopte_to_paddr(pte, data) | iova;
>  }
> @@ -816,6 +839,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
>   .unmap  = arm_lpae_unmap,
>   .unmap_pages= arm_lpae_unmap_pages,
>   .iova_to_phys   = arm_lpae_iova_to_phys,
> + .pgtable_walk   = arm_lpae_pgtable_walk,
>   };
>  
>   return data;
> diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
> index 86af6f0a00a2..501f362a929c 100644
> --- a/include/linux/io-pgtable.h
> +++ b/include/linux/io-pgtable.h
> @@ -148,6 +148,13 @@ struct io_pgtable_cfg {
>   * @unmap:Unmap a physically contiguous memory region.
>   * @unmap_pages:  Unmap a range of virtually contiguous pages of the same 
> size.
>   * @iova_to_phys: Translate iova to physical address.
> + * @pgtable_walk: Return details of a page table walk for a given iova.
> + *This returns the array of PTEs in a format that is
> + *specific to the page table format.  The number of
> + *PTEs can be format specific.  The num_ptes parameter
> + *on input specifies the size of the ptes array, and
> + *on output the number of PTEs filled in (which depends
> + *on the number of PTEs walked to resolve the iova)

I think this would be a fair bit cleaner if the interface instead took a
callback function to invoke at each page-table level. It would be invoked
with the pte value and the level. Depending on its return value the walk
could be terminated early. That would also potentially scale to walking
ranges of iovas as well if we ever need it and it may be more readily
implementable by other formats too.

>   *
>   * These functions map directly onto the iommu_ops member functions with
>   * the same names.

This bit of the comment is no longer true with your change.

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3] iommu/io-pgtable-arm-v7s: Add error handle for page table allocation failure

2021-12-14 Thread Will Deacon
On Tue, 7 Dec 2021 19:33:15 +0800, yf.w...@mediatek.com wrote:
> From: Yunfei Wang 
> 
> In __arm_v7s_alloc_table function:
> iommu call kmem_cache_alloc to allocate page table, this function
> allocate memory may fail, when kmem_cache_alloc fails to allocate
> table, call virt_to_phys will be abnomal and return unexpected phys
> and goto out_free, then call kmem_cache_free to release table will
> trigger KE, __get_free_pages and free_pages have similar problem,
> so add error handle for page table allocation failure.
> 
> [...]

Applied to will (for-joerg/arm-smmu/updates), thanks!

[1/1] iommu/io-pgtable-arm-v7s: Add error handle for page table allocation 
failure
  https://git.kernel.org/will/c/a556cfe4cabc

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 0/2] Add support from SM8450 IOMMU

2021-12-14 Thread Will Deacon
On Wed, 1 Dec 2021 13:09:41 +0530, Vinod Koul wrote:
> This adds the binding and support for IOMMU found in SM8450 SoC
> 
> Vinod Koul (2):
>   dt-bindings: arm-smmu: Add compatible for SM8450 SoC
>   iommu: arm-smmu-impl: Add SM8450 qcom iommu implementation
> 
> Documentation/devicetree/bindings/iommu/arm,smmu.yaml | 1 +
>  drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c| 1 +
>  2 files changed, 2 insertions(+)
> 
> [...]

Applied to will (for-joerg/arm-smmu/updates), thanks!

[1/2] dt-bindings: arm-smmu: Add compatible for SM8450 SoC
  https://git.kernel.org/will/c/810d8cabaab5
[2/2] iommu: arm-smmu-impl: Add SM8450 qcom iommu implementation
  https://git.kernel.org/will/c/cd76990c94bb

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/arm-smmu-qcom: Fix TTBR0 read

2021-12-14 Thread Will Deacon
On Mon, 8 Nov 2021 09:17:23 -0800, Rob Clark wrote:
> From: Rob Clark 
> 
> It is a 64b register, lets not lose the upper bits.
> 
> 

Applied to will (for-joerg/arm-smmu/updates), thanks!

[1/1] iommu/arm-smmu-qcom: Fix TTBR0 read
  https://git.kernel.org/will/c/c31112fbd407

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/arm-smmu-v3: Constify arm_smmu_mmu_notifier_ops

2021-12-14 Thread Will Deacon
On Sat, 4 Dec 2021 23:33:01 +0100, Rikard Falkeborn wrote:
> The only usage of arm_smmu_mmu_notifier_ops is to assign its address to
> the ops field in the mmu_notifier struct, which is a pointer to const
> struct mmu_notifier_ops. Make it const to allow the compiler to put it
> in read-only memory.
> 
> 

Applied to will (for-joerg/arm-smmu/updates), thanks!

[1/1] iommu/arm-smmu-v3: Constify arm_smmu_mmu_notifier_ops
  https://git.kernel.org/will/c/17d9a4b43b28

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] dt-bindings: arm-smmu: Add compatible for the SDX55 SoC

2021-12-14 Thread Will Deacon
On Thu, 21 Oct 2021 01:17:00 +0200, David Heidelberg wrote:
> Add missing compatible for the SDX55 SoC.
> 
> 

Applied to will (for-joerg/arm-smmu/updates), thanks!

[1/1] dt-bindings: arm-smmu: Add compatible for the SDX55 SoC
  https://git.kernel.org/will/c/ae377d342006

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] Revert "iommu/arm-smmu-v3: Decrease the queue size of evtq and priq"

2021-12-14 Thread Will Deacon
On Tue, Dec 07, 2021 at 02:32:48PM +0800, Zhou Wang wrote:
> The commit f115f3c0d5d8 ("iommu/arm-smmu-v3: Decrease the queue size of
> evtq and priq") decreases evtq and priq, which may lead evtq/priq to be
> full with fault events, e.g HiSilicon ZIP/SEC/HPRE have maximum 1024 queues
> in one device, every queue could be binded with one process and trigger a
> fault event. So let's revert f115f3c0d5d8.
> 
> In fact, if an implementation of SMMU really does not need so long evtq
> and priq, value of IDR1_EVTQS and IDR1_PRIQS can be set to proper ones.
> 
> Signed-off-by: Zhou Wang 
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)

I'd like an Ack from Zhen Lei on this, as the aim of the original patch
was to reduce memory consumption.

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 0/3] perf/smmuv3: Support devicetree

2021-12-14 Thread Will Deacon
On Wed, 17 Nov 2021 14:48:42 +, Jean-Philippe Brucker wrote:
> Add devicetree binding for the SMMUv3 PMU, called Performance Monitoring
> Counter Group (PMCG) in the spec. Each SMMUv3 implementation can have
> multiple independent PMCGs, for example one for the Translation Control
> Unit (TCU) and one per Translation Buffer Unit (TBU).
> 
> Since v1 [1]:
> * Fixed warnings in the binding doc
> * Removed hip08 support
> * Merged Robin's version. I took the liberty of splitting the driver
>   patch into 2 and 3. One fix in patch 3, and whitespace changes (the
>   driver uses spaces instead of tabs to align #define values, which I
>   was going to fix but actually seems more common across the tree.)
> 
> [...]

Applied to arm64 (for-next/perf-smmu), thanks!

[1/3] dt-bindings: Add Arm SMMUv3 PMCG binding
  https://git.kernel.org/arm64/c/2704e7594383
[2/3] perf/smmuv3: Add devicetree support
  https://git.kernel.org/arm64/c/3f7be4356176
[3/3] perf/smmuv3: Synthesize IIDR from CoreSight ID registers
  https://git.kernel.org/arm64/c/df457ca973fe

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 3/4] iommu/arm-smmu: Support Tegra234 SMMU

2021-12-13 Thread Will Deacon
On Thu, Dec 09, 2021 at 05:35:59PM +0100, Thierry Reding wrote:
> From: Thierry Reding 
> 
> Allow the NVIDIA-specific ARM SMMU implementation to bind to the SMMU
> instances found on Tegra234.
> 
> Signed-off-by: Thierry Reding 
> ---
>  drivers/iommu/arm/arm-smmu/arm-smmu-impl.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c 
> b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
> index 2c25cce38060..658f3cc83278 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
> @@ -211,7 +211,8 @@ struct arm_smmu_device *arm_smmu_impl_init(struct 
> arm_smmu_device *smmu)
>   if (of_property_read_bool(np, "calxeda,smmu-secure-config-access"))
>   smmu->impl = _impl;
>  
> - if (of_device_is_compatible(np, "nvidia,tegra194-smmu") ||
> + if (of_device_is_compatible(np, "nvidia,tegra234-smmu") ||
> + of_device_is_compatible(np, "nvidia,tegra194-smmu") ||
>   of_device_is_compatible(np, "nvidia,tegra186-smmu"))
>   return nvidia_smmu_impl_init(smmu);

Acked-by: Will Deacon 

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/io-pgtable-arm-v7s: Add error handle for page table allocation failure

2021-12-07 Thread Will Deacon
On Tue, Dec 07, 2021 at 10:47:22AM +0800, yf.w...@mediatek.com wrote:
> From: Yunfei Wang 
> 
> In __arm_v7s_alloc_table function:
> iommu call kmem_cache_alloc to allocate page table, this function
> allocate memory may fail, when kmem_cache_alloc fails to allocate
> table, call virt_to_phys will be abnomal and return unexpected phys
> and goto out_free, then call kmem_cache_free to release table will
> trigger KE, __get_free_pages and free_pages have similar problem,
> so add error handle for page table allocation failure.
> 
> Signed-off-by: Yunfei Wang 
> ---
>  drivers/iommu/io-pgtable-arm-v7s.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
> b/drivers/iommu/io-pgtable-arm-v7s.c
> index bfb6acb651e5..d84240308f4b 100644
> --- a/drivers/iommu/io-pgtable-arm-v7s.c
> +++ b/drivers/iommu/io-pgtable-arm-v7s.c
> @@ -246,6 +246,12 @@ static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp,
>   __GFP_ZERO | ARM_V7S_TABLE_GFP_DMA, get_order(size));
>   else if (lvl == 2)
>   table = kmem_cache_zalloc(data->l2_tables, gfp);
> +
> + if (!table) {
> + dev_err(dev, "Page table allocation failure lvl:%d\n", lvl);

I'd expect the allocator to shout loudly on failure anyway, so I don't think
we need to print another message here.

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [patch 33/37] iommu/arm-smmu-v3: Use msi_get_virq()

2021-11-30 Thread Will Deacon
On Mon, Nov 29, 2021 at 02:54:18PM +, Robin Murphy wrote:
> On 2021-11-29 14:42, Thomas Gleixner wrote:
> > On Mon, Nov 29 2021 at 13:13, Robin Murphy wrote:
> > > On 2021-11-29 10:55, Will Deacon wrote:
> > > > > - }
> > > > > + smmu->evtq.q.irq = msi_get_virq(dev, EVTQ_MSI_INDEX);
> > > > > + smmu->gerr_irq = msi_get_virq(dev, GERROR_MSI_INDEX);
> > > > > + smmu->priq.q.irq = msi_get_virq(dev, PRIQ_MSI_INDEX);
> > > > 
> > > > Prviously, if retrieval of the MSI failed then we'd fall back to wired
> > > > interrupts. Now, I think we'll clobber the interrupt with 0 instead. Can
> > > > we make the assignments to smmu->*irq here conditional on the MSI being
> > > > valid, please?
> > > 
> > > I was just looking at that too, but reached the conclusion that it's
> > > probably OK, since consumption of this value later is gated on
> > > ARM_SMMU_FEAT_PRI, so the fact that it changes from 0 to an error value
> > > in the absence of PRI should make no practical difference.
> > 
> > It's actually 0 when the vector cannot be found.
> 
> Oh, -1 for my reading comprehension but +1 for my confidence in the patch
> then :)
> 
> I'll let Will have the final say over how cautious we really want to be
> here, but as far as I'm concerned it's a welcome cleanup as-is. Ditto for
> patch #32 based on the same reasoning, although I don't have a suitable test
> platform on-hand to sanity-check that one.

If, as it appears, msi_get_virq() isn't going to fail meaningfully after
we've successfully called platform_msi_domain_alloc_irqs() then it sounds
like the patch is fine. Just wanted to check though, as Spring cleaning at
the end of November raised an eyebrow over here :)

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [patch 33/37] iommu/arm-smmu-v3: Use msi_get_virq()

2021-11-29 Thread Will Deacon
Hi Thomas,

On Sat, Nov 27, 2021 at 02:20:59AM +0100, Thomas Gleixner wrote:
> Let the core code fiddle with the MSI descriptor retrieval.
> 
> Signed-off-by: Thomas Gleixner 
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |   19 +++
>  1 file changed, 3 insertions(+), 16 deletions(-)
> 
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -3154,7 +3154,6 @@ static void arm_smmu_write_msi_msg(struc
>  
>  static void arm_smmu_setup_msis(struct arm_smmu_device *smmu)
>  {
> - struct msi_desc *desc;
>   int ret, nvec = ARM_SMMU_MAX_MSIS;
>   struct device *dev = smmu->dev;
>  
> @@ -3182,21 +3181,9 @@ static void arm_smmu_setup_msis(struct a
>   return;
>   }
>  
> - for_each_msi_entry(desc, dev) {
> - switch (desc->msi_index) {
> - case EVTQ_MSI_INDEX:
> - smmu->evtq.q.irq = desc->irq;
> - break;
> - case GERROR_MSI_INDEX:
> - smmu->gerr_irq = desc->irq;
> - break;
> - case PRIQ_MSI_INDEX:
> - smmu->priq.q.irq = desc->irq;
> - break;
> - default:/* Unknown */
> - continue;
> - }
> - }
> + smmu->evtq.q.irq = msi_get_virq(dev, EVTQ_MSI_INDEX);
> + smmu->gerr_irq = msi_get_virq(dev, GERROR_MSI_INDEX);
> + smmu->priq.q.irq = msi_get_virq(dev, PRIQ_MSI_INDEX);

Prviously, if retrieval of the MSI failed then we'd fall back to wired
interrupts. Now, I think we'll clobber the interrupt with 0 instead. Can
we make the assignments to smmu->*irq here conditional on the MSI being
valid, please?

Cheers,

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[GIT PULL] iommu/arm-smmu: Updates for 5.16

2021-10-19 Thread Will Deacon
Hi Joerg,

Please pull this tiny batch of Arm SMMU updates for 5.16. It's dominated
by compatible string additions for Qualcomm SMMUv2 implementations, but
there's a bit of cleanup on the SMMUv3 command-submission side as well.

Cheers,

Will

--->8

The following changes since commit 5816b3e6577eaa676ceb00a848f0fd65fe2adc29:

  Linux 5.15-rc3 (2021-09-26 14:08:19 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git 
tags/arm-smmu-updates

for you to fetch changes up to e37f1fe4332491bf2f7b7849d5c3adba0d2a77b3:

  iommu/arm-smmu-qcom: Request direct mapping for modem device (2021-10-08 
09:07:44 +0100)


Arm SMMU updates for 5.16

- Minor optimisations to SMMUv3 command creation and submission

- Numerous new compatible string for Qualcomm SMMUv2 implementations


Konrad Dybcio (2):
  dt-bindings: arm-smmu: Add compatible for SM6350 SoC
  iommu/arm-smmu-qcom: Add SM6350 SMMU compatible

Loic Poulain (2):
  dt-bindings: arm-smmu: Add compatible for QCM2290 SoC
  iommu: arm-smmu-qcom: Add compatible for QCM2290

Sibi Sankar (1):
  iommu/arm-smmu-qcom: Request direct mapping for modem device

Zhen Lei (2):
  iommu/arm-smmu-v3: Stop pre-zeroing batch commands in 
arm_smmu_atc_inv_master()
  iommu/arm-smmu-v3: Properly handle the return value of 
arm_smmu_cmdq_build_cmd()

 .../devicetree/bindings/iommu/arm,smmu.yaml |  2 ++
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 21 ++---
 drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c  |  3 +++
 3 files changed, 19 insertions(+), 7 deletions(-)
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/arm: fix ARM_SMMU_QCOM compilation

2021-10-14 Thread Will Deacon
On Wed, Oct 13, 2021 at 09:31:40PM +0200, Arnd Bergmann wrote:
> On Wed, Oct 13, 2021 at 6:20 PM Will Deacon  wrote:
> > On Wed, Oct 13, 2021 at 10:33:55AM +0200, Arnd Bergmann wrote:
> > > On Wed, Oct 13, 2021 at 9:58 AM Will Deacon  wrote:
> > > > On Tue, Oct 12, 2021 at 05:18:00PM +0200, Arnd Bergmann wrote:
> 
> > > I was hoping you and Joerg could just pick your preferred patch
> > > into the iommu fixes tree for v5.15.
> > >
> > > I currently have nothing else pending for my asm-generic tree that
> > > introduced the regression, but I can take it through there if that helps
> > > you.
> >
> > I also don't have any fixes pending, and I don't see any in Joerg's tree so
> > it's probably quickest if you send it on yourself. Is that ok?
> 
> Sure, no problem. I ended up adding it to the arm/fixes branch of the
> soc tree, as I just merged some other fixes there, and it seems as good
> as any of the other trees.

Thanks, Arnd!

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/arm: fix ARM_SMMU_QCOM compilation

2021-10-13 Thread Will Deacon
Hi Arnd,

On Wed, Oct 13, 2021 at 10:33:55AM +0200, Arnd Bergmann wrote:
> On Wed, Oct 13, 2021 at 9:58 AM Will Deacon  wrote:
> > On Tue, Oct 12, 2021 at 05:18:00PM +0200, Arnd Bergmann wrote:
> > > From: Arnd Bergmann 
> > >
> > > My previous bugfix ended up making things worse for the QCOM IOMMU
> > > driver when it forgot to add the Kconfig symbol that is getting used to
> > > control the compilation of the SMMU implementation specific code
> > > for Qualcomm.
> > >
> > > Fixes: 424953cf3c66 ("qcom_scm: hide Kconfig symbol")
> > > Reported-by: Daniel Lezcano 
> > > Reported-by: Dmitry Baryshkov 
> > > Reported-by: John Stultz 
> > > Link: 
> > > https://lore.kernel.org/lkml/20211010023350.978638-1-dmitry.barysh...@linaro.org/
> > > Signed-off-by: Arnd Bergmann 
> > > ---
> > > In case we want fix it this way after all, here is the patch
> > > I made. Either this one or Dmitry patch from the link above
> > > is required for v5.15
> > > ---
> > >  drivers/iommu/Kconfig | 8 
> > >  1 file changed, 8 insertions(+)
> > >
> > > diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> > > index c5c71b7ab7e8..3eb68fa1b8cc 100644
> > > --- a/drivers/iommu/Kconfig
> > > +++ b/drivers/iommu/Kconfig
> > > @@ -355,6 +355,14 @@ config ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT
> > > 'arm-smmu.disable_bypass' will continue to override this
> > > config.
> > >
> > > +config ARM_SMMU_QCOM
> > > + def_tristate y
> > > + depends on ARM_SMMU && ARCH_QCOM
> > > + select QCOM_SCM
> > > + help
> > > +   When running on a Qualcomm platform that has the custom variant
> > > +   of the ARM SMMU, this needs to be built into the SMMU driver.
> > > +
> >
> > FWIW, I prefer this solution over changing the driver code, so:
> >
> > Acked-by: Will Deacon 
> >
> > I assume you'll be getting this fixed for 5.15?
> 
> I was hoping you and Joerg could just pick your preferred patch
> into the iommu fixes tree for v5.15.
> 
> I currently have nothing else pending for my asm-generic tree that
> introduced the regression, but I can take it through there if that helps
> you.

I also don't have any fixes pending, and I don't see any in Joerg's tree so
it's probably quickest if you send it on yourself. Is that ok?

Cheers,

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/arm: fix ARM_SMMU_QCOM compilation

2021-10-13 Thread Will Deacon
On Tue, Oct 12, 2021 at 05:18:00PM +0200, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> My previous bugfix ended up making things worse for the QCOM IOMMU
> driver when it forgot to add the Kconfig symbol that is getting used to
> control the compilation of the SMMU implementation specific code
> for Qualcomm.
> 
> Fixes: 424953cf3c66 ("qcom_scm: hide Kconfig symbol")
> Reported-by: Daniel Lezcano 
> Reported-by: Dmitry Baryshkov 
> Reported-by: John Stultz 
> Link: 
> https://lore.kernel.org/lkml/20211010023350.978638-1-dmitry.barysh...@linaro.org/
> Signed-off-by: Arnd Bergmann 
> ---
> In case we want fix it this way after all, here is the patch
> I made. Either this one or Dmitry patch from the link above
> is required for v5.15
> ---
>  drivers/iommu/Kconfig | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index c5c71b7ab7e8..3eb68fa1b8cc 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -355,6 +355,14 @@ config ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT
> 'arm-smmu.disable_bypass' will continue to override this
> config.
>  
> +config ARM_SMMU_QCOM
> + def_tristate y
> + depends on ARM_SMMU && ARCH_QCOM
> + select QCOM_SCM
> + help
> +   When running on a Qualcomm platform that has the custom variant
> +   of the ARM SMMU, this needs to be built into the SMMU driver.
> +

FWIW, I prefer this solution over changing the driver code, so:

Acked-by: Will Deacon 

I assume you'll be getting this fixed for 5.15?

Cheers,

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 2/2] iommu/arm-smmu-v3: Simplify useless instructions in arm_smmu_cmdq_build_cmd()

2021-10-04 Thread Will Deacon
On Wed, Aug 18, 2021 at 04:04:52PM +0800, Zhen Lei wrote:
> Although the parameter 'cmd' is always passed by a local array variable,
> and only this function modifies it, the compiler does not know this. The
> compiler almost always reads the value of cmd[i] from memory rather than
> directly using the value cached in the register. This generates many
> useless instruction operations and affects the performance to some extent.
> 
> To guide the compiler for proper optimization, 'cmd' is defined as a local
> array variable, marked as register, and copied to the output parameter at
> a time when the function is returned.
> 
> The optimization effect can be viewed by running the "size arm-smmu-v3.o"
> command.
> 
> Before:
>textdata bss dec hex
>   269541348  56   283586ec6
> 
> After:
>textdata bss dec hex
>   267621348  56   281666e06
> 
> Signed-off-by: Zhen Lei 
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 11 ---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 01e95b56ffa07d1..7cec0c967f71d86 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -234,10 +234,12 @@ static int queue_remove_raw(struct arm_smmu_queue *q, 
> u64 *ent)
>  }
>  
>  /* High-level queue accessors */
> -static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
> +static int arm_smmu_cmdq_build_cmd(u64 *out_cmd, struct arm_smmu_cmdq_ent 
> *ent)
>  {
> - memset(cmd, 0, 1 << CMDQ_ENT_SZ_SHIFT);
> - cmd[0] |= FIELD_PREP(CMDQ_0_OP, ent->opcode);
> + register u64 cmd[CMDQ_ENT_DWORDS];

'register' is pretty badly specified outside of a handful of limited uses in
conjunction with inline asm, so I really don't think we should be using it
here.

> + cmd[0] = FIELD_PREP(CMDQ_0_OP, ent->opcode);

This generates a compiler warning about taking the address of a 'register'
variable.

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/arm-smmu-v3: Stop pre-zeroing batch commands in arm_smmu_atc_inv_master()

2021-10-04 Thread Will Deacon
On Tue, 17 Aug 2021 19:34:11 +0800, Zhen Lei wrote:
> Pre-zeroing the batched commands structure is inefficient, as individual
> commands are zeroed later in arm_smmu_cmdq_build_cmd(). Therefore, only
> the member 'num' needs to be initialized to 0.
> 
> 

Applied to will (for-joerg/arm-smmu/updates), thanks!

[1/1] iommu/arm-smmu-v3: Stop pre-zeroing batch commands in 
arm_smmu_atc_inv_master()
  https://git.kernel.org/will/c/93f9f7958f12

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 0/2] iommu/arm-smmu-v3: Perform some simple optimizations for arm_smmu_cmdq_build_cmd()

2021-10-04 Thread Will Deacon
On Wed, 18 Aug 2021 16:04:50 +0800, Zhen Lei wrote:
> v1 --> v2:
> 1. Add patch 1, Properly handle the return value of arm_smmu_cmdq_build_cmd()
> 2. Remove arm_smmu_cmdq_copy_cmd(). In addition, when build command fails, 
> out_cmd is not filled.
> 
> 
> Zhen Lei (2):
>   iommu/arm-smmu-v3: Properly handle the return value of
> arm_smmu_cmdq_build_cmd()
>   iommu/arm-smmu-v3: Simplify useless instructions in
> arm_smmu_cmdq_build_cmd()
> 
> [...]

Applied first patch only to will (for-joerg/arm-smmu/updates), thanks!

[1/2] iommu/arm-smmu-v3: Properly handle the return value of 
arm_smmu_cmdq_build_cmd()
  https://git.kernel.org/will/c/59d9bd727495

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/2] dt-bindings: arm-smmu: Add compatible for SM6350 SoC

2021-10-04 Thread Will Deacon
On Fri, 20 Aug 2021 22:29:04 +0200, Konrad Dybcio wrote:
> Add the SoC specific compatible for SM6350 implementing
> arm,mmu-500.
> 
> 

Applied to will (for-joerg/arm-smmu/updates), thanks!

[1/2] dt-bindings: arm-smmu: Add compatible for SM6350 SoC
  https://git.kernel.org/will/c/e4a40f15b031
[2/2] iommu/arm-smmu-qcom: Add SM6350 SMMU compatible
  https://git.kernel.org/will/c/bc53c8b8b087

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/2] iommu: arm-smmu-qcom: Add compatible for qcm2290

2021-10-04 Thread Will Deacon
On Fri, 1 Oct 2021 16:00:31 +0200, Loic Poulain wrote:
> 


Applied to will (for-joerg/arm-smmu/updates), thanks!

[1/2] iommu: arm-smmu-qcom: Add compatible for qcm2290
  https://git.kernel.org/will/c/756a622c8f06
[2/2] dt-bindings: arm-smmu: Add qcm2290 compatible strings
  https://git.kernel.org/will/c/f1edce3db543

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 0/5] iommu: Some IOVA code reorganisation

2021-10-04 Thread Will Deacon
On Fri, Sep 24, 2021 at 06:01:52PM +0800, John Garry wrote:
> The IOVA domain structure is a bit overloaded, holding:
> - IOVA tree management
> - FQ control
> - IOVA rcache memories
> 
> Indeed only a couple of IOVA users use the rcache, and only dma-iommu.c
> uses the FQ feature.
> 
> This series separates out that structure. In addition, it moves the FQ
> code into dma-iommu.c . This is not strictly necessary, but it does make
> it easier for the FQ domain lookup the rcache domain.
> 
> The rcache code stays where it is, as it may be reworked in future, so
> there is not much point in relocating and then discarding.
> 
> This topic was initially discussed and suggested (I think) by Robin here:
> https://lore.kernel.org/linux-iommu/1d06eda1-9961-d023-f5e7-fe87e768f...@arm.com/

It would be useful to have Robin's Ack on patches 2-4. The implementation
looks straightforward to me, but the thread above isn't very clear about
what is being suggested.

To play devil's advocate: there aren't many direct users of the iovad code:
either they'll die out entirely (and everybody will use the dma-iommu code)
and it's fine having the flush queue code where it is, or we'll get more
users and the likelihood of somebody else wanting flush queues increases.

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 5/5] iommu/iova: Avoid double-negatives in magazine helpers

2021-10-04 Thread Will Deacon
On Fri, Sep 24, 2021 at 06:01:57PM +0800, John Garry wrote:
> A similar crash to the following could be observed if initial CPU rcache
> magazine allocations fail in init_iova_rcaches():
> 
> Unable to handle kernel NULL pointer dereference at virtual address 
> 
> Mem abort info:
> 
>   free_iova_fast+0xfc/0x280
>   iommu_dma_free_iova+0x64/0x70
>   __iommu_dma_unmap+0x9c/0xf8
>   iommu_dma_unmap_sg+0xa8/0xc8
>   dma_unmap_sg_attrs+0x28/0x50
>   cq_thread_v3_hw+0x2dc/0x528
>   irq_thread_fn+0x2c/0xa0
>   irq_thread+0x130/0x1e0
>   kthread+0x154/0x158
>   ret_from_fork+0x10/0x34
> 
> The issue is that expression !iova_magazine_full(NULL) evaluates true; this
> falls over in __iova_rcache_insert() when we attempt to cache a mag and
> cpu_rcache->loaded == NULL:
> 
> if (!iova_magazine_full(cpu_rcache->loaded)) {
>   can_insert = true;
> ...
> 
> if (can_insert)
>   iova_magazine_push(cpu_rcache->loaded, iova_pfn);
> 
> As above, can_insert is evaluated true, which it shouldn't be, and we try
> to insert pfns in a NULL mag, which is not safe.
> 
> To avoid this, stop using double-negatives, like !iova_magazine_full() and
> !iova_magazine_empty(), and use positive tests, like
> iova_magazine_has_space() and iova_magazine_has_pfns(), respectively; these
> can safely deal with cpu_rcache->{loaded, prev} = NULL.

I don't understand why you're saying that things like !iova_magazine_empty()
are double-negatives. What about e.g. !list_empty() elsewhre in the kernel?

The crux of the fix seems to be:

> @@ -783,8 +787,9 @@ static bool __iova_rcache_insert(struct 
> iova_caching_domain *rcached,
>   if (new_mag) {
>   spin_lock(>lock);
>   if (rcache->depot_size < MAX_GLOBAL_MAGS) {
> - rcache->depot[rcache->depot_size++] =
> - cpu_rcache->loaded;
> + if (cpu_rcache->loaded)
> + rcache->depot[rcache->depot_size++] =
> + cpu_rcache->loaded;

Which could be independent of the renaming?

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/5] iova: Move fast alloc size roundup into alloc_iova_fast()

2021-10-04 Thread Will Deacon
On Fri, Sep 24, 2021 at 06:01:53PM +0800, John Garry wrote:
> It really is a property of the IOVA rcache code that we need to alloc a
> power-of-2 size, so relocate the functionality to resize into
> alloc_iova_fast(), rather than the callsites.
> 
> Signed-off-by: John Garry 
> ---
>  drivers/iommu/dma-iommu.c| 8 
>  drivers/iommu/iova.c | 9 +
>  drivers/vdpa/vdpa_user/iova_domain.c | 8 
>  3 files changed, 9 insertions(+), 16 deletions(-)

Acked-by: Will Deacon 

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 0/2] arm64: retry dropping HAVE_ARCH_PFN_VALID

2021-10-01 Thread Will Deacon
On Thu, 30 Sep 2021 04:30:37 +0300, Mike Rapoport wrote:
> From: Mike Rapoport 
> 
> Hi,
> 
> This is a new attempt to drop HAVE_ARCH_PFN_VALID on arm64 and start using
> the generic implementation of pfn_valid().
> 
> [...]

Applied to arm64 (for-next/pfn-valid), thanks!

[1/2] dma-mapping: remove bogus test for pfn_valid from dma_map_resource
  https://git.kernel.org/arm64/c/a9c38c5d267c
[2/2] arm64/mm: drop HAVE_ARCH_PFN_VALID
  https://git.kernel.org/arm64/c/3de360c3fdb3

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] [RFC] qcom_scm: hide Kconfig symbol

2021-09-29 Thread Will Deacon
On Mon, Sep 27, 2021 at 05:22:13PM +0200, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> Now that SCM can be a loadable module, we have to add another
> dependency to avoid link failures when ipa or adreno-gpu are
> built-in:
> 
> aarch64-linux-ld: drivers/net/ipa/ipa_main.o: in function `ipa_probe':
> ipa_main.c:(.text+0xfc4): undefined reference to `qcom_scm_is_available'
> 
> ld.lld: error: undefined symbol: qcom_scm_is_available
> >>> referenced by adreno_gpu.c
> >>>   gpu/drm/msm/adreno/adreno_gpu.o:(adreno_zap_shader_load) in 
> >>> archive drivers/built-in.a
> 
> This can happen when CONFIG_ARCH_QCOM is disabled and we don't select
> QCOM_MDT_LOADER, but some other module selects QCOM_SCM. Ideally we'd
> use a similar dependency here to what we have for QCOM_RPROC_COMMON,
> but that causes dependency loops from other things selecting QCOM_SCM.
> 
> This appears to be an endless problem, so try something different this
> time:
> 
>  - CONFIG_QCOM_SCM becomes a hidden symbol that nothing 'depends on'
>but that is simply selected by all of its users
> 
>  - All the stubs in include/linux/qcom_scm.h can go away
> 
>  - arm-smccc.h needs to provide a stub for __arm_smccc_smc() to
>allow compile-testing QCOM_SCM on all architectures.
> 
>  - To avoid a circular dependency chain involving RESET_CONTROLLER
>and PINCTRL_SUNXI, change the 'depends on RESET_CONTROLLER' in
>the latter one to 'select'.
> 
> The last bit is rather annoying, as drivers should generally never
> 'select' another subsystem, and about half the users of the reset
> controller interface do this anyway.
> 
> Nevertheless, this version seems to pass all my randconfig tests
> and is more robust than any of the prior versions.
> 
> Comments?
> 
> Signed-off-by: Arnd Bergmann 
> ---
>  drivers/firmware/Kconfig|  4 +-
>  drivers/gpu/drm/msm/Kconfig |  4 +-
>  drivers/iommu/Kconfig   |  2 +-
>  drivers/media/platform/Kconfig  |  2 +-
>  drivers/mmc/host/Kconfig|  2 +-
>  drivers/net/ipa/Kconfig |  1 +
>  drivers/net/wireless/ath/ath10k/Kconfig |  2 +-
>  drivers/pinctrl/qcom/Kconfig|  3 +-
>  drivers/pinctrl/sunxi/Kconfig   |  6 +--
>  include/linux/arm-smccc.h   | 10 
>  include/linux/qcom_scm.h| 71 -
>  11 files changed, 23 insertions(+), 84 deletions(-)
> 
> diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
> index 220a58cf0a44..f7dd82ef0b9c 100644
> --- a/drivers/firmware/Kconfig
> +++ b/drivers/firmware/Kconfig
> @@ -203,9 +203,7 @@ config INTEL_STRATIX10_RSU
> Say Y here if you want Intel RSU support.
>  
>  config QCOM_SCM
> - tristate "Qcom SCM driver"
> - depends on ARM || ARM64
> - depends on HAVE_ARM_SMCCC
> + tristate
>   select RESET_CONTROLLER
>  
>  config QCOM_SCM_DOWNLOAD_MODE_DEFAULT
> diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig
> index e9c6af78b1d7..3ddf739a6f9b 100644
> --- a/drivers/gpu/drm/msm/Kconfig
> +++ b/drivers/gpu/drm/msm/Kconfig
> @@ -17,7 +17,7 @@ config DRM_MSM
>   select DRM_SCHED
>   select SHMEM
>   select TMPFS
> - select QCOM_SCM if ARCH_QCOM
> + select QCOM_SCM
>   select WANT_DEV_COREDUMP
>   select SND_SOC_HDMI_CODEC if SND_SOC
>   select SYNC_FILE
> @@ -55,7 +55,7 @@ config DRM_MSM_GPU_SUDO
>  
>  config DRM_MSM_HDMI_HDCP
>   bool "Enable HDMI HDCP support in MSM DRM driver"
> - depends on DRM_MSM && QCOM_SCM
> + depends on DRM_MSM
>   default y
>   help
> Choose this option to enable HDCP state machine
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index 124c41adeca1..989c83acbfee 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -308,7 +308,7 @@ config APPLE_DART
>  config ARM_SMMU
>   tristate "ARM Ltd. System MMU (SMMU) Support"
>   depends on ARM64 || ARM || (COMPILE_TEST && !GENERIC_ATOMIC64)
> - depends on QCOM_SCM || !QCOM_SCM #if QCOM_SCM=m this can't be =y
> + select QCOM_SCM
>   select IOMMU_API
>   select IOMMU_IO_PGTABLE_LPAE
>   select ARM_DMA_USE_IOMMU if ARM

I don't want to get in the way of this patch because I'm also tired of the
randconfig failures caused by QCOM_SCM. However, ARM_SMMU is applicable to
a wide variety of (non-qcom) SoCs and so it seems a shame to require the
QCOM_SCM code to be included for all of those when it's not strictly needed
at all.

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] swiotlb: use depends on for DMA_RESTRICTED_POOL

2021-08-31 Thread Will Deacon
On Fri, Aug 27, 2021 at 11:48:02AM +0800, Claire Chang wrote:
> Use depends on instead of select for DMA_RESTRICTED_POOL; otherwise it
> will make SWIOTLB user configurable and cause compile errors for some
> arch (e.g. mips).
> 
> Fixes: 0b84e4f8b793 ("swiotlb: Add restricted DMA pool initialization")
> Reported-by: Guenter Roeck 
> Signed-off-by: Claire Chang 
> ---
>  kernel/dma/Kconfig | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig
> index fd4db714d86b..1b02179758cb 100644
> --- a/kernel/dma/Kconfig
> +++ b/kernel/dma/Kconfig
> @@ -82,8 +82,7 @@ config SWIOTLB
>  
>  config DMA_RESTRICTED_POOL
>   bool "DMA Restricted Pool"
> - depends on OF && OF_RESERVED_MEM
> - select SWIOTLB
> + depends on OF && OF_RESERVED_MEM && SWIOTLB
>   help
> This enables support for restricted DMA pools which provide a level of
> DMA memory protection on systems with limited hardware protection

Acked-by: Will Deacon 

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v11 01/12] iova: Export alloc_iova_fast() and free_iova_fast()

2021-08-25 Thread Will Deacon
On Tue, Aug 24, 2021 at 02:08:33PM -0400, Michael S. Tsirkin wrote:
> On Wed, Aug 18, 2021 at 08:06:31PM +0800, Xie Yongji wrote:
> > Export alloc_iova_fast() and free_iova_fast() so that
> > some modules can make use of the per-CPU cache to get
> > rid of rbtree spinlock in alloc_iova() and free_iova()
> > during IOVA allocation.
> > 
> > Signed-off-by: Xie Yongji 
> 
> 
> This needs ack from iommu maintainers. Guys?

Looks fine to me:

Acked-by: Will Deacon 

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 0/2] Don't fail device probing due to of_dma_set_restricted_buffer()

2021-08-23 Thread Will Deacon
On Mon, Aug 23, 2021 at 06:47:19AM -0400, Konrad Rzeszutek Wilk wrote:
> On Mon, Aug 16, 2021 at 02:26:15PM +0100, Will Deacon wrote:
> > Hi all,
> > 
> > This is v2 of the patch I previously posted here:
> > 
> >   https://lore.kernel.org/r/20210805094736.902-1-w...@kernel.org
> > 
> > Changes since v1 are:
> > 
> >   * Move of_dma_set_restricted_buffer() into of/device.c (Rob)
> >   * Use IS_ENABLED() instead of 'static inline' stub (Rob)
> > 
> > This applies on Konrad's devel/for-linus-5.15 branch in swiotlb.git
> 
> It should show up later today.

Brill, thanks Konrad!

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH -next] iommu/arm-smmu: Fix missing unlock on error in arm_smmu_device_group()

2021-08-20 Thread Will Deacon
On Fri, Aug 20, 2021 at 03:49:49PM +0800, Yang Yingliang wrote:
> Add the missing unlock before return from function arm_smmu_device_group()
> in the error handling case.
> 
> Fixes: b1a1347912a7 ("iommu/arm-smmu: Fix race condition during iommu_group 
> creation")
> Reported-by: Hulk Robot 
> Signed-off-by: Yang Yingliang 
> ---
>  drivers/iommu/arm/arm-smmu/arm-smmu.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
> b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> index 73893180ec7e..4bc75c4ce402 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> @@ -1478,8 +1478,10 @@ static struct iommu_group 
> *arm_smmu_device_group(struct device *dev)
>   mutex_lock(>stream_map_mutex);
>   for_each_cfg_sme(cfg, fwspec, i, idx) {
>   if (group && smmu->s2crs[idx].group &&
> - group != smmu->s2crs[idx].group)
> + group != smmu->s2crs[idx].group) {
> + mutex_unlock(>stream_map_mutex);
>   return ERR_PTR(-EINVAL);
> + }

Urgh, I should really have spotted that in review. Thanks:

Acked-by: Will Deacon 

Joerg -- please can you throw this on top?

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [GIT PULL] iommu/arm-smmu: Updates for 5.15

2021-08-18 Thread Will Deacon
On Wed, Aug 18, 2021 at 02:08:25PM +0200, Joerg Roedel wrote:
> On Fri, Aug 13, 2021 at 05:47:35PM +0100, Will Deacon wrote:
> > This applies cleanly against iommu/next, but I suspect it will conflict
> > with Robin's series on the list. Please shout if you need anything from
> > me to help with that (e.g. rebase, checking a merge conflict).
> 
> For now there were at least no conflicts which git couldn't resolve
> automatically, but the compile tests are still running :)

Ok, I won't hold my breath!

> > The following changes since commit ff1176468d368232b684f75e82563369208bc371:
> > 
> >   Linux 5.14-rc3 (2021-07-25 15:35:14 -0700)
> > 
> > are available in the Git repository at:
> > 
> >   git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git 
> > tags/arm-smmu-updates
> 
> So this is pulled now, thanks.

Cheers,

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 1/2] of: Move of_dma_set_restricted_buffer() into device.c

2021-08-16 Thread Will Deacon
Rob observes that:

  | of_dma_set_restricted_buffer() [...] should also be moved to
  | of/device.c. There's no reason for it to be in of/address.c. It has
  | nothing to do with address parsing.

Move it to of/device.c, as he suggests.

Cc: Claire Chang 
Cc: Konrad Rzeszutek Wilk 
Cc: Christoph Hellwig 
Cc: Robin Murphy 
Suggested-by: Rob Herring 
Link: 
https://lore.kernel.org/r/CAL_JsqJ7ROWWJX84x2kEex9NQ8G+2=ybrunoobx+j8bjzzs...@mail.gmail.com
Signed-off-by: Will Deacon 
---
 drivers/of/address.c| 33 -
 drivers/of/device.c | 34 ++
 drivers/of/of_private.h |  7 ---
 3 files changed, 34 insertions(+), 40 deletions(-)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 973257434398..94f017d808c4 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -8,7 +8,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -996,38 +995,6 @@ int of_dma_get_range(struct device_node *np, const struct 
bus_dma_region **map)
of_node_put(node);
return ret;
 }
-
-int of_dma_set_restricted_buffer(struct device *dev, struct device_node *np)
-{
-   struct device_node *node, *of_node = dev->of_node;
-   int count, i;
-
-   count = of_property_count_elems_of_size(of_node, "memory-region",
-   sizeof(u32));
-   /*
-* If dev->of_node doesn't exist or doesn't contain memory-region, try
-* the OF node having DMA configuration.
-*/
-   if (count <= 0) {
-   of_node = np;
-   count = of_property_count_elems_of_size(
-   of_node, "memory-region", sizeof(u32));
-   }
-
-   for (i = 0; i < count; i++) {
-   node = of_parse_phandle(of_node, "memory-region", i);
-   /*
-* There might be multiple memory regions, but only one
-* restricted-dma-pool region is allowed.
-*/
-   if (of_device_is_compatible(node, "restricted-dma-pool") &&
-   of_device_is_available(node))
-   return of_reserved_mem_device_init_by_idx(dev, of_node,
- i);
-   }
-
-   return 0;
-}
 #endif /* CONFIG_HAS_DMA */
 
 /**
diff --git a/drivers/of/device.c b/drivers/of/device.c
index 2defdca418ec..089c5b4b97fb 100644
--- a/drivers/of/device.c
+++ b/drivers/of/device.c
@@ -5,6 +5,7 @@
 #include 
 #include 
 #include 
+#include 
 #include  /* for bus_dma_region */
 #include 
 #include 
@@ -52,6 +53,39 @@ int of_device_add(struct platform_device *ofdev)
return device_add(>dev);
 }
 
+static int
+of_dma_set_restricted_buffer(struct device *dev, struct device_node *np)
+{
+   struct device_node *node, *of_node = dev->of_node;
+   int count, i;
+
+   count = of_property_count_elems_of_size(of_node, "memory-region",
+   sizeof(u32));
+   /*
+* If dev->of_node doesn't exist or doesn't contain memory-region, try
+* the OF node having DMA configuration.
+*/
+   if (count <= 0) {
+   of_node = np;
+   count = of_property_count_elems_of_size(
+   of_node, "memory-region", sizeof(u32));
+   }
+
+   for (i = 0; i < count; i++) {
+   node = of_parse_phandle(of_node, "memory-region", i);
+   /*
+* There might be multiple memory regions, but only one
+* restricted-dma-pool region is allowed.
+*/
+   if (of_device_is_compatible(node, "restricted-dma-pool") &&
+   of_device_is_available(node))
+   return of_reserved_mem_device_init_by_idx(dev, of_node,
+ i);
+   }
+
+   return 0;
+}
+
 /**
  * of_dma_configure_id - Setup DMA configuration
  * @dev:   Device to apply DMA configuration
diff --git a/drivers/of/of_private.h b/drivers/of/of_private.h
index f557bd22b0cf..631489f7f8c0 100644
--- a/drivers/of/of_private.h
+++ b/drivers/of/of_private.h
@@ -163,19 +163,12 @@ struct bus_dma_region;
 #if defined(CONFIG_OF_ADDRESS) && defined(CONFIG_HAS_DMA)
 int of_dma_get_range(struct device_node *np,
const struct bus_dma_region **map);
-int of_dma_set_restricted_buffer(struct device *dev, struct device_node *np);
 #else
 static inline int of_dma_get_range(struct device_node *np,
const struct bus_dma_region **map)
 {
return -ENODEV;
 }
-static inline int of_dma_set_restricted_buffer(struct device *dev,
-  struct device_node *np)
-{
-   /*

[PATCH v2 2/2] of: restricted dma: Don't fail device probe on rmem init failure

2021-08-16 Thread Will Deacon
If CONFIG_DMA_RESTRICTED_POOL=n then probing a device with a reference
to a "restricted-dma-pool" will fail with a reasonably cryptic error:

  | pci-host-generic: probe of 1.pci failed with error -22

Rework of_dma_set_restricted_buffer() so that it does not cause probing
failure and instead either returns early if CONFIG_DMA_RESTRICTED_POOL=n
or emits a diagnostic if the reserved DMA pool fails to initialise.

Cc: Claire Chang 
Cc: Konrad Rzeszutek Wilk 
Cc: Christoph Hellwig 
Cc: Rob Herring 
Cc: Robin Murphy 
Signed-off-by: Will Deacon 
---
 drivers/of/device.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/of/device.c b/drivers/of/device.c
index 089c5b4b97fb..5b043ee30824 100644
--- a/drivers/of/device.c
+++ b/drivers/of/device.c
@@ -53,12 +53,15 @@ int of_device_add(struct platform_device *ofdev)
return device_add(>dev);
 }
 
-static int
+static void
 of_dma_set_restricted_buffer(struct device *dev, struct device_node *np)
 {
struct device_node *node, *of_node = dev->of_node;
int count, i;
 
+   if (!IS_ENABLED(CONFIG_DMA_RESTRICTED_POOL))
+   return;
+
count = of_property_count_elems_of_size(of_node, "memory-region",
sizeof(u32));
/*
@@ -79,11 +82,11 @@ of_dma_set_restricted_buffer(struct device *dev, struct 
device_node *np)
 */
if (of_device_is_compatible(node, "restricted-dma-pool") &&
of_device_is_available(node))
-   return of_reserved_mem_device_init_by_idx(dev, of_node,
- i);
+   break;
}
 
-   return 0;
+   if (i != count && of_reserved_mem_device_init_by_idx(dev, of_node, i))
+   dev_warn(dev, "failed to initialise \"restricted-dma-pool\" 
memory node\n");
 }
 
 /**
@@ -200,7 +203,7 @@ int of_dma_configure_id(struct device *dev, struct 
device_node *np,
arch_setup_dma_ops(dev, dma_start, size, iommu, coherent);
 
if (!iommu)
-   return of_dma_set_restricted_buffer(dev, np);
+   of_dma_set_restricted_buffer(dev, np);
 
return 0;
 }
-- 
2.33.0.rc1.237.g0d66db33f3-goog

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 0/2] Don't fail device probing due to of_dma_set_restricted_buffer()

2021-08-16 Thread Will Deacon
Hi all,

This is v2 of the patch I previously posted here:

  https://lore.kernel.org/r/20210805094736.902-1-w...@kernel.org

Changes since v1 are:

  * Move of_dma_set_restricted_buffer() into of/device.c (Rob)
  * Use IS_ENABLED() instead of 'static inline' stub (Rob)

This applies on Konrad's devel/for-linus-5.15 branch in swiotlb.git

Cheers,

Will

Cc: Claire Chang 
Cc: Konrad Rzeszutek Wilk 
Cc: Christoph Hellwig 
Cc: Rob Herring 
Cc: Robin Murphy 

--->8

Will Deacon (2):
  of: Move of_dma_set_restricted_buffer() into device.c
  of: restricted dma: Don't fail device probe on rmem init failure

 drivers/of/address.c| 33 -
 drivers/of/device.c | 39 ++-
 drivers/of/of_private.h |  7 ---
 3 files changed, 38 insertions(+), 41 deletions(-)

-- 
2.33.0.rc1.237.g0d66db33f3-goog

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/4] iommu/arm-smmu-v3: Use command queue batching helpers to improve performance

2021-08-16 Thread Will Deacon
On Mon, Aug 16, 2021 at 03:47:58PM +0800, Leizhen (ThunderTown) wrote:
> 
> 
> On 2021/8/16 15:24, John Garry wrote:
> >> In addition, I find that function arm_smmu_cmdq_build_cmd() can also be 
> >> optimized
> >> slightly, three useless instructions can be reduced.
> > 
> > I think that you could optimise further by pre-building commonly used 
> > commands.
> > 
> > For example, CMD_SYNC without MSI polling is always the same. And then only 
> > different in 1 field for MSI polling.
> > 
> > But you need to check if the performance gain is worth the change.
> 
> Good advice. I can give it a try.

Please send it as a new patch on top. I've queued the old one and sent
it to Joerg. Since this is just further cleanup, it can be done separately.

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[GIT PULL] iommu/arm-smmu: Updates for 5.15

2021-08-13 Thread Will Deacon
Hi Joerg,

Please pull these Arm SMMU updates for 5.15. There's not tonnes here, but
a good mixture of optimisations and cleanups -- summary in the tag.

This applies cleanly against iommu/next, but I suspect it will conflict
with Robin's series on the list. Please shout if you need anything from
me to help with that (e.g. rebase, checking a merge conflict).

Cheers,

Will

--->8

The following changes since commit ff1176468d368232b684f75e82563369208bc371:

  Linux 5.14-rc3 (2021-07-25 15:35:14 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git 
tags/arm-smmu-updates

for you to fetch changes up to fac956710ab0812f9e395e9f7a27da551412830f:

  iommu/arm-smmu-v3: Stop pre-zeroing batch commands (2021-08-13 14:26:06 +0100)


Arm SMMU updates for 5.15

- SMMUv3

  * Minor optimisation to avoid zeroing struct members on CMD submission

  * Increased use of batched commands to reduce submission latency

  * Refactoring in preparation for ECMDQ support

- SMMUv2

  * Fix races when probing devices with identical StreamIDs

  * Optimise walk cache flushing for Qualcomm implementations

  * Allow deep sleep states for some Qualcomm SoCs with shared clocks


Ashish Mhetre (1):
  iommu: Fix race condition during default domain allocation

John Garry (2):
  iommu/arm-smmu-v3: Remove some unneeded init in 
arm_smmu_cmdq_issue_cmdlist()
  iommu/arm-smmu-v3: Stop pre-zeroing batch commands

Krishna Reddy (1):
  iommu/arm-smmu: Fix race condition during iommu_group creation

Sai Prakash Ranjan (2):
  iommu/arm-smmu: Add clk_bulk_{prepare/unprepare} to system pm callbacks
  iommu/arm-smmu: Optimize ->tlb_flush_walk() for qcom implementation

Zhen Lei (4):
  iommu/arm-smmu-v3: Use command queue batching helpers to improve 
performance
  iommu/arm-smmu-v3: Add and use static helper function 
arm_smmu_cmdq_issue_cmd_with_sync()
  iommu/arm-smmu-v3: Add and use static helper function arm_smmu_get_cmdq()
  iommu/arm-smmu-v3: Extract reusable function __arm_smmu_cmdq_skip_err()

 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 89 +
 drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c  | 11 
 drivers/iommu/arm/arm-smmu/arm-smmu.c   | 45 ---
 drivers/iommu/arm/arm-smmu/arm-smmu.h   |  1 +
 drivers/iommu/iommu.c   |  2 +
 5 files changed, 106 insertions(+), 42 deletions(-)
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 0/4] Prepare for ECMDQ support

2021-08-13 Thread Will Deacon
On Wed, 11 Aug 2021 19:48:48 +0800, Zhen Lei wrote:
> RFC --> v1
> 1. Resend the patches for ECMDQ preparation and remove the patches for ECMDQ 
> implementation.
> 2. Patch 2 is modified. Other patches remain unchanged.
>1) Add static helper __arm_smmu_cmdq_issue_cmd(), and make 
> arm_smmu_cmdq_issue_cmd()
>   and arm_smmu_cmdq_issue_cmd_with_sync() implement based on it.
>2) Remove unused arm_smmu_cmdq_issue_sync().
> 
> [...]

Applied to will (for-joerg/arm-smmu/updates), thanks!

[1/4] iommu/arm-smmu-v3: Use command queue batching helpers to improve 
performance
  https://git.kernel.org/arm64/c/eff19474b1bd
[2/4] iommu/arm-smmu-v3: Add and use static helper function 
arm_smmu_cmdq_issue_cmd_with_sync()
  https://git.kernel.org/arm64/c/4537f6f1e2d8
[3/4] iommu/arm-smmu-v3: Add and use static helper function arm_smmu_get_cmdq()
  https://git.kernel.org/arm64/c/8639cc83aac5
[4/4] iommu/arm-smmu-v3: Extract reusable function __arm_smmu_cmdq_skip_err()
  https://git.kernel.org/arm64/c/2cbeaf3f36eb

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCHv5] iommu/arm-smmu: Optimize ->tlb_flush_walk() for qcom implementation

2021-08-13 Thread Will Deacon
On Wed, 11 Aug 2021 21:34:26 +0530, Sai Prakash Ranjan wrote:
> Currently for iommu_unmap() of large scatter-gather list with page size
> elements, the majority of time is spent in flushing of partial walks in
> __arm_lpae_unmap() which is a VA based TLB invalidation invalidating
> page-by-page on iommus like arm-smmu-v2 (TLBIVA).
> 
> For example: to unmap a 32MB scatter-gather list with page size elements
> (8192 entries), there are 16->2MB buffer unmaps based on the pgsize (2MB
> for 4K granule) and each of 2MB will further result in 512 TLBIVAs (2MB/4K)
> resulting in a total of 8192 TLBIVAs (512*16) for 16->2MB causing a huge
> overhead.
> 
> [...]

Applied to will (for-joerg/arm-smmu/updates), thanks!

[1/1] iommu/arm-smmu: Optimize ->tlb_flush_walk() for qcom implementation
  https://git.kernel.org/will/c/ef75702d6d65

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/arm-smmu-v3: Stop pre-zeroing batch commands

2021-08-13 Thread Will Deacon
On Wed, 11 Aug 2021 23:49:26 +0800, John Garry wrote:
> Pre-zeroing the batched commands structure is inefficient, as individual
> commands are zeroed later in arm_smmu_cmdq_build_cmd(). The size is quite
> large and commonly most commands won't even be used:
> 
>   struct arm_smmu_cmdq_batch cmds = {};
> 345c: 5281mov w1, #0x0// #0
> 3460: d2808102mov x2, #0x408  // #1032
> 3464: 910143a0add x0, x29, #0x50
> 3468: 9400bl  0 
> 
> [...]

Applied to will (for-joerg/arm-smmu/updates), thanks!

[1/1] iommu/arm-smmu-v3: Stop pre-zeroing batch commands
  https://git.kernel.org/will/c/fac956710ab0

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH RFC 2/8] iommu/arm-smmu-v3: Add and use static helper function arm_smmu_cmdq_issue_cmd_with_sync()

2021-08-11 Thread Will Deacon
On Wed, Aug 11, 2021 at 11:31:08AM +0100, John Garry wrote:
> > > > > Obviously, inserting as many commands at a time as possible can 
> > > > > reduce the
> > > > > number of times the mutex contention participates, thereby improving 
> > > > > the
> > > > > overall performance. At least it reduces the number of calls to 
> > > > > function
> > > > > arm_smmu_cmdq_issue_cmdlist().
> > > > > 
> > > > > Therefore, function arm_smmu_cmdq_issue_cmd_with_sync() is added to 
> > > > > insert
> > > > > the 'cmd+sync' commands at a time.
> > > > > 
> > > > > Signed-off-by: Zhen Lei 
> > > > > ---
> > > > >   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 33 
> > > > > +
> > > > >   1 file changed, 21 insertions(+), 12 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
> > > > > b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > > > > index 2433d3c29b49ff2..a5361153ca1d6a4 100644
> > > > > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > > > > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > > > > @@ -858,11 +858,25 @@ static int arm_smmu_cmdq_issue_cmd(struct 
> > > > > arm_smmu_device *smmu,
> > > > >   return arm_smmu_cmdq_issue_cmdlist(smmu, cmd, 1, false);
> > > > >   }
> > > > > -static int arm_smmu_cmdq_issue_sync(struct arm_smmu_device *smmu)
> > > > > +static int __maybe_unused arm_smmu_cmdq_issue_sync(struct 
> > > > > arm_smmu_device *smmu)
> > > > >   {
> > > > >   return arm_smmu_cmdq_issue_cmdlist(smmu, NULL, 0, true);
> > > > >   }
> > > > > +static int arm_smmu_cmdq_issue_cmd_with_sync(struct arm_smmu_device 
> > > > > *smmu,
> > > > > +  struct arm_smmu_cmdq_ent 
> > > > > *ent)
> > > > > +{
> > > > > + u64 cmd[CMDQ_ENT_DWORDS];
> > > > > +
> > > > > + if (arm_smmu_cmdq_build_cmd(cmd, ent)) {
> > > > > + dev_warn(smmu->dev, "ignoring unknown CMDQ opcode 
> > > > > 0x%x\n",
> > > > > +  ent->opcode);
> > > > > + return -EINVAL;
> 
> Are any of the errors returned from the "issue command" functions actually
> consumed? I couldn't see it on mainline code from a brief browse.

I don't think so. Can we actually propagate them?

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCHv4] iommu/arm-smmu: Optimize ->tlb_flush_walk() for qcom implementation

2021-08-11 Thread Will Deacon
On Wed, Aug 11, 2021 at 11:37:25AM +0530, Sai Prakash Ranjan wrote:
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
> b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> index f7da8953afbe..3904b598e0f9 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> @@ -327,9 +327,16 @@ static void arm_smmu_tlb_inv_range_s2(unsigned long 
> iova, size_t size,
>  static void arm_smmu_tlb_inv_walk_s1(unsigned long iova, size_t size,
>size_t granule, void *cookie)
>  {
> - arm_smmu_tlb_inv_range_s1(iova, size, granule, cookie,
> -   ARM_SMMU_CB_S1_TLBIVA);
> - arm_smmu_tlb_sync_context(cookie);
> + struct arm_smmu_domain *smmu_domain = cookie;
> + struct arm_smmu_cfg *cfg = _domain->cfg;
> +
> + if (cfg->flush_walk_prefer_tlbiasid) {
> + arm_smmu_tlb_inv_context_s1(cookie);

Hmm, this introduces an unconditional wmb() if tlbiasid is preferred. I
think that should be predicated on ARM_SMMU_FEAT_COHERENT_WALK like it is
for the by-VA ops. Worth doing as a separate patch.

> + } else {
> + arm_smmu_tlb_inv_range_s1(iova, size, granule, cookie,
> +   ARM_SMMU_CB_S1_TLBIVA);
> + arm_smmu_tlb_sync_context(cookie);
> + }
>  }
>  
>  static void arm_smmu_tlb_add_page_s1(struct iommu_iotlb_gather *gather,
> @@ -765,8 +772,10 @@ static int arm_smmu_init_domain_context(struct 
> iommu_domain *domain,
>   .iommu_dev  = smmu->dev,
>   };
>  
> - if (!iommu_get_dma_strict(domain))
> + if (!iommu_get_dma_strict(domain)) {
>   pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
> + cfg->flush_walk_prefer_tlbiasid = true;

This is going to interact badly with Robin's series to allow dynamic
transition to non-strict mode, as we don't have a mechanism to switch
over to the by-ASID behaviour. Yes, it should _work_, but it's ugly having
different TLBI behaviour just because of the how the domain became
non-strict.

Robin -- I think this originated from your idea at [1]. Any idea how to make
it work with your other series, or shall we drop this part for now and leave
the TLB invalidation behaviour the same for now?

Will

[1] https://lore.kernel.org/r/da62ff1c-9b49-34d3-69a1-1a674e4a3...@arm.com
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH RFC 2/8] iommu/arm-smmu-v3: Add and use static helper function arm_smmu_cmdq_issue_cmd_with_sync()

2021-08-11 Thread Will Deacon
On Wed, Aug 11, 2021 at 10:16:39AM +0800, Leizhen (ThunderTown) wrote:
> 
> 
> On 2021/8/11 2:24, Will Deacon wrote:
> > On Sat, Jun 26, 2021 at 07:01:24PM +0800, Zhen Lei wrote:
> >> The obvious key to the performance optimization of commit 587e6c10a7ce
> >> ("iommu/arm-smmu-v3: Reduce contention during command-queue insertion") is
> >> to allow multiple cores to insert commands in parallel after a brief mutex
> >> contention.
> >>
> >> Obviously, inserting as many commands at a time as possible can reduce the
> >> number of times the mutex contention participates, thereby improving the
> >> overall performance. At least it reduces the number of calls to function
> >> arm_smmu_cmdq_issue_cmdlist().
> >>
> >> Therefore, function arm_smmu_cmdq_issue_cmd_with_sync() is added to insert
> >> the 'cmd+sync' commands at a time.
> >>
> >> Signed-off-by: Zhen Lei 
> >> ---
> >>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 33 +
> >>  1 file changed, 21 insertions(+), 12 deletions(-)
> >>
> >> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
> >> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> >> index 2433d3c29b49ff2..a5361153ca1d6a4 100644
> >> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> >> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> >> @@ -858,11 +858,25 @@ static int arm_smmu_cmdq_issue_cmd(struct 
> >> arm_smmu_device *smmu,
> >>return arm_smmu_cmdq_issue_cmdlist(smmu, cmd, 1, false);
> >>  }
> >>  
> >> -static int arm_smmu_cmdq_issue_sync(struct arm_smmu_device *smmu)
> >> +static int __maybe_unused arm_smmu_cmdq_issue_sync(struct arm_smmu_device 
> >> *smmu)
> >>  {
> >>return arm_smmu_cmdq_issue_cmdlist(smmu, NULL, 0, true);
> >>  }
> >>  
> >> +static int arm_smmu_cmdq_issue_cmd_with_sync(struct arm_smmu_device *smmu,
> >> +   struct arm_smmu_cmdq_ent *ent)
> >> +{
> >> +  u64 cmd[CMDQ_ENT_DWORDS];
> >> +
> >> +  if (arm_smmu_cmdq_build_cmd(cmd, ent)) {
> >> +  dev_warn(smmu->dev, "ignoring unknown CMDQ opcode 0x%x\n",
> >> +   ent->opcode);
> >> +  return -EINVAL;
> >> +  }
> >> +
> >> +  return arm_smmu_cmdq_issue_cmdlist(smmu, cmd, 1, true);
> >> +}
> > 
> > This function is almost identical to arm_smmu_cmdq_issue_cmd(). How about
> > moving the guts out into a helper:
> > 
> > static int __arm_smmu_cmdq_issue_cmd(struct arm_smmu_device *smmu,
> >  struct arm_smmu_cmdq_ent *ent,
> >  bool sync);
> > 
> > and then having arm_smmu_cmdq_issue_cmd_with_sync() and
> > arm_smmu_cmdq_issue_cmd() wrap that?
> 
> OK, I will do it.
> 
> How about remove arm_smmu_cmdq_issue_sync()? It's unused now.

Sure.

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCHv2] iommu/arm-smmu: Add clk_bulk_{prepare/unprepare} to system pm callbacks

2021-08-10 Thread Will Deacon
On Tue, 10 Aug 2021 12:18:08 +0530, Sai Prakash Ranjan wrote:
> Some clocks for SMMU can have parent as XO such as gpu_cc_hub_cx_int_clk
> of GPU SMMU in QTI SC7280 SoC and in order to enter deep sleep states in
> such cases, we would need to drop the XO clock vote in unprepare call and
> this unprepare callback for XO is in RPMh (Resource Power Manager-Hardened)
> clock driver which controls RPMh managed clock resources for new QTI SoCs.
> 
> Given we cannot have a sleeping calls such as clk_bulk_prepare() and
> clk_bulk_unprepare() in arm-smmu runtime pm callbacks since the iommu
> operations like map and unmap can be in atomic context and are in fast
> path, add this prepare and unprepare call to drop the XO vote only for
> system pm callbacks since it is not a fast path and we expect the system
> to enter deep sleep states with system pm as opposed to runtime pm.
> 
> [...]

Applied to will (for-joerg/arm-smmu/updates), thanks!

[1/1] iommu/arm-smmu: Add clk_bulk_{prepare/unprepare} to system pm callbacks
  https://git.kernel.org/will/c/afefe67e0893

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [Patch V3 0/2] iommu/arm-smmu: Fix races in iommu domain/group creation

2021-08-10 Thread Will Deacon
On Tue, 10 Aug 2021 10:13:59 +0530, Ashish Mhetre wrote:
> When two devices with same SID are getting probed concurrently through
> iommu_probe_device(), the iommu_group and iommu_domain are allocated more
> than once because they are not protected for concurrency. This is leading
> to context faults when one device is accessing IOVA from other device.
> Fix this by protecting iommu_domain and iommu_group creation with mutexes.
> 
> Changes in v3:
> * Updated commit messages.
> * Added Signed-off-by in patch 2.
> 
> [...]

Applied to will (for-joerg/arm-smmu/updates), thanks!

[1/2] iommu: Fix race condition during default domain allocation
  https://git.kernel.org/will/c/211ff31b3d33
[2/2] iommu/arm-smmu: Fix race condition during iommu_group creation
  https://git.kernel.org/will/c/b1a1347912a7

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH RFC 0/8] iommu/arm-smmu-v3: add support for ECMDQ register mode

2021-08-10 Thread Will Deacon
On Sat, Jun 26, 2021 at 07:01:22PM +0800, Zhen Lei wrote:
> SMMU v3.3 added a new feature, which is Enhanced Command queue interface
> for reducing contention when submitting Commands to the SMMU, in this
> patch set, ECMDQ is the abbreviation of Enhanced Command Queue.
> 
> When the hardware supports ECMDQ and each core can exclusively use one ECMDQ,
> each core does not need to compete with other cores when using its own ECMDQ.
> This means that each core can insert commands in parallel. If each ECMDQ can
> execute commands in parallel, the overall performance may be better. However,
> our hardware currently does not support multiple ECMDQ execute commands in
> parallel.
> 
> In order to reuse existing code, I originally still call 
> arm_smmu_cmdq_issue_cmdlist()
> to insert commands. Even so, however, there was a performance improvement of 
> nearly 12%
> in strict mode.
> 
> The test environment is the EMU, which simulates the connection of the 200 
> Gbit/s NIC.
> Number of queues:passthrough   lazy   strict(ECMDQ)  strict(CMDQ)
>   6  188180   162   145--> 
> 11.7% improvement
>   8  188188   184   183--> 
> 0.55% improvement

Sorry, I don't quite follow the numbers here. Why does the number of queues
affect the classic "CMDQ" mode? We only have one queue there, right?

> In recent days, I implemented a new function without competition with other
> cores to replace arm_smmu_cmdq_issue_cmdlist() when a core can have an ECMDQ.
> I'm guessing it might get better performance results. Because the EMU is too
> slow, it will take a while before the relevant data is available.

I'd certainly prefer to wait until we have something we know is
representative. However, I can take the first four prep patches now if you
respin the second one. At least that's then less for you to carry.

I'd also like review from the Arm side on this (and thank you for adopting
the architecture unlike others seem to have done judging by the patches
floating around).

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH RFC 2/8] iommu/arm-smmu-v3: Add and use static helper function arm_smmu_cmdq_issue_cmd_with_sync()

2021-08-10 Thread Will Deacon
On Sat, Jun 26, 2021 at 07:01:24PM +0800, Zhen Lei wrote:
> The obvious key to the performance optimization of commit 587e6c10a7ce
> ("iommu/arm-smmu-v3: Reduce contention during command-queue insertion") is
> to allow multiple cores to insert commands in parallel after a brief mutex
> contention.
> 
> Obviously, inserting as many commands at a time as possible can reduce the
> number of times the mutex contention participates, thereby improving the
> overall performance. At least it reduces the number of calls to function
> arm_smmu_cmdq_issue_cmdlist().
> 
> Therefore, function arm_smmu_cmdq_issue_cmd_with_sync() is added to insert
> the 'cmd+sync' commands at a time.
> 
> Signed-off-by: Zhen Lei 
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 33 +
>  1 file changed, 21 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 2433d3c29b49ff2..a5361153ca1d6a4 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -858,11 +858,25 @@ static int arm_smmu_cmdq_issue_cmd(struct 
> arm_smmu_device *smmu,
>   return arm_smmu_cmdq_issue_cmdlist(smmu, cmd, 1, false);
>  }
>  
> -static int arm_smmu_cmdq_issue_sync(struct arm_smmu_device *smmu)
> +static int __maybe_unused arm_smmu_cmdq_issue_sync(struct arm_smmu_device 
> *smmu)
>  {
>   return arm_smmu_cmdq_issue_cmdlist(smmu, NULL, 0, true);
>  }
>  
> +static int arm_smmu_cmdq_issue_cmd_with_sync(struct arm_smmu_device *smmu,
> +  struct arm_smmu_cmdq_ent *ent)
> +{
> + u64 cmd[CMDQ_ENT_DWORDS];
> +
> + if (arm_smmu_cmdq_build_cmd(cmd, ent)) {
> + dev_warn(smmu->dev, "ignoring unknown CMDQ opcode 0x%x\n",
> +  ent->opcode);
> + return -EINVAL;
> + }
> +
> + return arm_smmu_cmdq_issue_cmdlist(smmu, cmd, 1, true);
> +}

This function is almost identical to arm_smmu_cmdq_issue_cmd(). How about
moving the guts out into a helper:

static int __arm_smmu_cmdq_issue_cmd(struct arm_smmu_device *smmu,
 struct arm_smmu_cmdq_ent *ent,
 bool sync);

and then having arm_smmu_cmdq_issue_cmd_with_sync() and
arm_smmu_cmdq_issue_cmd() wrap that?

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCHv3] iommu/arm-smmu: Optimize ->tlb_flush_walk() for qcom implementation

2021-08-10 Thread Will Deacon
On Tue, Aug 03, 2021 at 11:09:17AM +0530, Sai Prakash Ranjan wrote:
> On 2021-08-02 21:13, Will Deacon wrote:
> > On Wed, Jun 23, 2021 at 07:12:01PM +0530, Sai Prakash Ranjan wrote:
> > > diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > > b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > > index d3c6f54110a5..f3845e822565 100644
> > > --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > > @@ -341,6 +341,12 @@ static void arm_smmu_tlb_add_page_s1(struct
> > > iommu_iotlb_gather *gather,
> > > ARM_SMMU_CB_S1_TLBIVAL);
> > >  }
> > > 
> > > +static void arm_smmu_tlb_inv_walk_impl_s1(unsigned long iova,
> > > size_t size,
> > > +  size_t granule, void *cookie)
> > > +{
> > > + arm_smmu_tlb_inv_context_s1(cookie);
> > > +}
> > > +
> > >  static void arm_smmu_tlb_inv_walk_s2(unsigned long iova, size_t size,
> > >size_t granule, void *cookie)
> > >  {
> > > @@ -388,6 +394,12 @@ static const struct iommu_flush_ops
> > > arm_smmu_s1_tlb_ops = {
> > >   .tlb_add_page   = arm_smmu_tlb_add_page_s1,
> > >  };
> > > 
> > > +const struct iommu_flush_ops arm_smmu_s1_tlb_impl_ops = {
> > > + .tlb_flush_all  = arm_smmu_tlb_inv_context_s1,
> > > + .tlb_flush_walk = arm_smmu_tlb_inv_walk_impl_s1,
> > > + .tlb_add_page   = arm_smmu_tlb_add_page_s1,
> > > +};
> > 
> > Hmm, dunno about this. Wouldn't it be a lot cleaner if the
> > tlb_flush_walk
> > callbacks just did the right thing based on the smmu_domain (maybe in
> > the
> > arm_smmu_cfg?) rather than having an entirely new set of ops just
> > because
> > they're const and you can't overide the bit you want?
> > 
> > I don't think there's really an awful lot qcom-specific about the
> > principle
> > here -- there's a trade-off between over-invalidation and invalidation
> > latency. That happens on the CPU as well.
> > 
> 
> Sorry didn't understand, based on smmu_domain what? How do we make
> this implementation specific? Do you mean something like a quirk?
> The reason we didn't make this common was because nvidia folks weren't
> so happy with that, you can find the discussion in this thread [1].
> 
> [1] 
> https://lore.kernel.org/lkml/20210609145315.25750-1-saiprakash.ran...@codeaurora.org/

The ->tlb_flush_walk() callbacks take a 'void *cookie' which, for this
driver, is a 'struct arm_smmu_domain *'. From that, you can get to the
'struct arm_smmu_cfg' which could have something as coarse as:

boolflush_walk_prefer_tlbiasid;

which you can set when you initialise the domain (maybe in the
->init_context callback?). It shouldn't affect anybody else.

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [Freedreno] [PATCH 0/3] iommu/drm/msm: Allow non-coherent masters to use system cache

2021-08-10 Thread Will Deacon
On Mon, Aug 09, 2021 at 11:17:40PM +0530, Sai Prakash Ranjan wrote:
> On 2021-08-09 23:10, Will Deacon wrote:
> > On Mon, Aug 09, 2021 at 10:18:21AM -0700, Rob Clark wrote:
> > > On Mon, Aug 9, 2021 at 10:05 AM Will Deacon  wrote:
> > > > On Mon, Aug 09, 2021 at 09:57:08AM -0700, Rob Clark wrote:
> > > > > But I suppose we could call it instead IOMMU_QCOM_LLC or something
> > > > > like that to make it more clear that it is not necessarily something
> > > > > that would work with a different outer level cache implementation?
> > > >
> > > > ... or we could just deal with the problem so that other people can 
> > > > reuse
> > > > the code. I haven't really understood the reluctance to solve this 
> > > > properly.
> > > >
> > > > Am I missing some reason this isn't solvable?
> > > 
> > > Oh, was there another way to solve it (other than foregoing setting
> > > INC_OCACHE in the pgtables)?  Maybe I misunderstood, is there a
> > > corresponding setting on the MMU pgtables side of things?
> > 
> > Right -- we just need to program the CPU's MMU with the matching memory
> > attributes! It's a bit more fiddly if you're just using ioremap_wc()
> > though, as it's usually the DMA API which handles the attributes under
> > the
> > hood.
> > 
> > Anyway, sorry, I should've said that explicitly earlier on. We've done
> > this
> > sort of thing in the Android tree so I assumed Sai knew what needed to
> > be
> > done and then I didn't think to explain to you :(
> > 
> 
> Right I was aware of that but even in the android tree there is no user :)

I'm assuming there are vendor modules using it there, otherwise we wouldn't
have been asked to put it in. Since you work at Qualcomm, maybe you could
talk to your colleagues (Isaac and Patrick) directly?

> I think we can't have a new memory type without any user right in upstream
> like android tree?

Correct. But I don't think we should be adding IOMMU_* anything upstream
if we don't have a user.

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [Freedreno] [PATCH 0/3] iommu/drm/msm: Allow non-coherent masters to use system cache

2021-08-09 Thread Will Deacon
On Mon, Aug 09, 2021 at 10:18:21AM -0700, Rob Clark wrote:
> On Mon, Aug 9, 2021 at 10:05 AM Will Deacon  wrote:
> >
> > On Mon, Aug 09, 2021 at 09:57:08AM -0700, Rob Clark wrote:
> > > On Mon, Aug 9, 2021 at 7:56 AM Will Deacon  wrote:
> > > > On Mon, Aug 02, 2021 at 06:36:04PM -0700, Rob Clark wrote:
> > > > > On Mon, Aug 2, 2021 at 8:14 AM Will Deacon  wrote:
> > > > > > On Mon, Aug 02, 2021 at 08:08:07AM -0700, Rob Clark wrote:
> > > > > > > On Mon, Aug 2, 2021 at 3:55 AM Will Deacon  
> > > > > > > wrote:
> > > > > > > > On Thu, Jul 29, 2021 at 10:08:22AM +0530, Sai Prakash Ranjan 
> > > > > > > > wrote:
> > > > > > > > > On 2021-07-28 19:30, Georgi Djakov wrote:
> > > > > > > > > > On Mon, Jan 11, 2021 at 07:45:02PM +0530, Sai Prakash 
> > > > > > > > > > Ranjan wrote:
> > > > > > > > > > > commit ecd7274fb4cd ("iommu: Remove unused 
> > > > > > > > > > > IOMMU_SYS_CACHE_ONLY flag")
> > > > > > > > > > > removed unused IOMMU_SYS_CACHE_ONLY prot flag and along 
> > > > > > > > > > > with it went
> > > > > > > > > > > the memory type setting required for the non-coherent 
> > > > > > > > > > > masters to use
> > > > > > > > > > > system cache. Now that system cache support for GPU is 
> > > > > > > > > > > added, we will
> > > > > > > > > > > need to set the right PTE attribute for GPU buffers to be 
> > > > > > > > > > > sys cached.
> > > > > > > > > > > Without this, the system cache lines are not allocated 
> > > > > > > > > > > for GPU.
> > > > > > > > > > >
> > > > > > > > > > > So the patches in this series introduces a new prot flag 
> > > > > > > > > > > IOMMU_LLC,
> > > > > > > > > > > renames IO_PGTABLE_QUIRK_ARM_OUTER_WBWA to 
> > > > > > > > > > > IO_PGTABLE_QUIRK_PTW_LLC
> > > > > > > > > > > and makes GPU the user of this protection flag.
> > > > > > > > > >
> > > > > > > > > > Thank you for the patchset! Are you planning to refresh it, 
> > > > > > > > > > as it does
> > > > > > > > > > not apply anymore?
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > I was waiting on Will's reply [1]. If there are no changes 
> > > > > > > > > needed, then
> > > > > > > > > I can repost the patch.
> > > > > > > >
> > > > > > > > I still think you need to handle the mismatched alias, no? 
> > > > > > > > You're adding
> > > > > > > > a new memory type to the SMMU which doesn't exist on the CPU 
> > > > > > > > side. That
> > > > > > > > can't be right.
> > > > > > > >
> > > > > > >
> > > > > > > Just curious, and maybe this is a dumb question, but what is your
> > > > > > > concern about mismatched aliases?  I mean the cache hierarchy on 
> > > > > > > the
> > > > > > > GPU device side (anything beyond the LLC) is pretty different and
> > > > > > > doesn't really care about the smmu pgtable attributes..
> > > > > >
> > > > > > If the CPU accesses a shared buffer with different attributes to 
> > > > > > those which
> > > > > > the device is using then you fall into the "mismatched memory 
> > > > > > attributes"
> > > > > > part of the Arm architecture. It's reasonably unforgiving (you 
> > > > > > should go and
> > > > > > read it) and in some cases can apply to speculative accesses as 
> > > > > > well, but
> > > > > > the end result is typically loss of coherency.
> > > > >
> > > > > Ok, I might have a few other sections to read first to decipher the
> > > > > terminology..
> 

Re: [Freedreno] [PATCH 0/3] iommu/drm/msm: Allow non-coherent masters to use system cache

2021-08-09 Thread Will Deacon
On Mon, Aug 09, 2021 at 09:57:08AM -0700, Rob Clark wrote:
> On Mon, Aug 9, 2021 at 7:56 AM Will Deacon  wrote:
> > On Mon, Aug 02, 2021 at 06:36:04PM -0700, Rob Clark wrote:
> > > On Mon, Aug 2, 2021 at 8:14 AM Will Deacon  wrote:
> > > > On Mon, Aug 02, 2021 at 08:08:07AM -0700, Rob Clark wrote:
> > > > > On Mon, Aug 2, 2021 at 3:55 AM Will Deacon  wrote:
> > > > > > On Thu, Jul 29, 2021 at 10:08:22AM +0530, Sai Prakash Ranjan wrote:
> > > > > > > On 2021-07-28 19:30, Georgi Djakov wrote:
> > > > > > > > On Mon, Jan 11, 2021 at 07:45:02PM +0530, Sai Prakash Ranjan 
> > > > > > > > wrote:
> > > > > > > > > commit ecd7274fb4cd ("iommu: Remove unused 
> > > > > > > > > IOMMU_SYS_CACHE_ONLY flag")
> > > > > > > > > removed unused IOMMU_SYS_CACHE_ONLY prot flag and along with 
> > > > > > > > > it went
> > > > > > > > > the memory type setting required for the non-coherent masters 
> > > > > > > > > to use
> > > > > > > > > system cache. Now that system cache support for GPU is added, 
> > > > > > > > > we will
> > > > > > > > > need to set the right PTE attribute for GPU buffers to be sys 
> > > > > > > > > cached.
> > > > > > > > > Without this, the system cache lines are not allocated for 
> > > > > > > > > GPU.
> > > > > > > > >
> > > > > > > > > So the patches in this series introduces a new prot flag 
> > > > > > > > > IOMMU_LLC,
> > > > > > > > > renames IO_PGTABLE_QUIRK_ARM_OUTER_WBWA to 
> > > > > > > > > IO_PGTABLE_QUIRK_PTW_LLC
> > > > > > > > > and makes GPU the user of this protection flag.
> > > > > > > >
> > > > > > > > Thank you for the patchset! Are you planning to refresh it, as 
> > > > > > > > it does
> > > > > > > > not apply anymore?
> > > > > > > >
> > > > > > >
> > > > > > > I was waiting on Will's reply [1]. If there are no changes 
> > > > > > > needed, then
> > > > > > > I can repost the patch.
> > > > > >
> > > > > > I still think you need to handle the mismatched alias, no? You're 
> > > > > > adding
> > > > > > a new memory type to the SMMU which doesn't exist on the CPU side. 
> > > > > > That
> > > > > > can't be right.
> > > > > >
> > > > >
> > > > > Just curious, and maybe this is a dumb question, but what is your
> > > > > concern about mismatched aliases?  I mean the cache hierarchy on the
> > > > > GPU device side (anything beyond the LLC) is pretty different and
> > > > > doesn't really care about the smmu pgtable attributes..
> > > >
> > > > If the CPU accesses a shared buffer with different attributes to those 
> > > > which
> > > > the device is using then you fall into the "mismatched memory 
> > > > attributes"
> > > > part of the Arm architecture. It's reasonably unforgiving (you should 
> > > > go and
> > > > read it) and in some cases can apply to speculative accesses as well, 
> > > > but
> > > > the end result is typically loss of coherency.
> > >
> > > Ok, I might have a few other sections to read first to decipher the
> > > terminology..
> > >
> > > But my understanding of LLC is that it looks just like system memory
> > > to the CPU and GPU (I think that would make it "the point of
> > > coherence" between the GPU and CPU?)  If that is true, shouldn't it be
> > > invisible from the point of view of different CPU mapping options?
> >
> > You could certainly build a system where mismatched attributes don't cause
> > loss of coherence, but as it's not guaranteed by the architecture and the
> > changes proposed here affect APIs which are exposed across SoCs, then I
> > don't think it helps much.
> >
> 
> Hmm, the description of the new mapping flag is that it applies only
> to transparent outer level cache:
> 
> +/*
> + * Non-coherent masters can use this page protection flag to set cacheable
> + * memory attributes for only a transparent outer level of cache, also known 
> as
> + * the last-level or system cache.
> + */
> +#define IOMMU_LLC  (1 << 6)
> 
> But I suppose we could call it instead IOMMU_QCOM_LLC or something
> like that to make it more clear that it is not necessarily something
> that would work with a different outer level cache implementation?

... or we could just deal with the problem so that other people can reuse
the code. I haven't really understood the reluctance to solve this properly.

Am I missing some reason this isn't solvable?

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [Freedreno] [PATCH 0/3] iommu/drm/msm: Allow non-coherent masters to use system cache

2021-08-09 Thread Will Deacon
On Mon, Aug 02, 2021 at 06:36:04PM -0700, Rob Clark wrote:
> On Mon, Aug 2, 2021 at 8:14 AM Will Deacon  wrote:
> >
> > On Mon, Aug 02, 2021 at 08:08:07AM -0700, Rob Clark wrote:
> > > On Mon, Aug 2, 2021 at 3:55 AM Will Deacon  wrote:
> > > >
> > > > On Thu, Jul 29, 2021 at 10:08:22AM +0530, Sai Prakash Ranjan wrote:
> > > > > On 2021-07-28 19:30, Georgi Djakov wrote:
> > > > > > On Mon, Jan 11, 2021 at 07:45:02PM +0530, Sai Prakash Ranjan wrote:
> > > > > > > commit ecd7274fb4cd ("iommu: Remove unused IOMMU_SYS_CACHE_ONLY 
> > > > > > > flag")
> > > > > > > removed unused IOMMU_SYS_CACHE_ONLY prot flag and along with it 
> > > > > > > went
> > > > > > > the memory type setting required for the non-coherent masters to 
> > > > > > > use
> > > > > > > system cache. Now that system cache support for GPU is added, we 
> > > > > > > will
> > > > > > > need to set the right PTE attribute for GPU buffers to be sys 
> > > > > > > cached.
> > > > > > > Without this, the system cache lines are not allocated for GPU.
> > > > > > >
> > > > > > > So the patches in this series introduces a new prot flag 
> > > > > > > IOMMU_LLC,
> > > > > > > renames IO_PGTABLE_QUIRK_ARM_OUTER_WBWA to 
> > > > > > > IO_PGTABLE_QUIRK_PTW_LLC
> > > > > > > and makes GPU the user of this protection flag.
> > > > > >
> > > > > > Thank you for the patchset! Are you planning to refresh it, as it 
> > > > > > does
> > > > > > not apply anymore?
> > > > > >
> > > > >
> > > > > I was waiting on Will's reply [1]. If there are no changes needed, 
> > > > > then
> > > > > I can repost the patch.
> > > >
> > > > I still think you need to handle the mismatched alias, no? You're adding
> > > > a new memory type to the SMMU which doesn't exist on the CPU side. That
> > > > can't be right.
> > > >
> > >
> > > Just curious, and maybe this is a dumb question, but what is your
> > > concern about mismatched aliases?  I mean the cache hierarchy on the
> > > GPU device side (anything beyond the LLC) is pretty different and
> > > doesn't really care about the smmu pgtable attributes..
> >
> > If the CPU accesses a shared buffer with different attributes to those which
> > the device is using then you fall into the "mismatched memory attributes"
> > part of the Arm architecture. It's reasonably unforgiving (you should go and
> > read it) and in some cases can apply to speculative accesses as well, but
> > the end result is typically loss of coherency.
> 
> Ok, I might have a few other sections to read first to decipher the
> terminology..
> 
> But my understanding of LLC is that it looks just like system memory
> to the CPU and GPU (I think that would make it "the point of
> coherence" between the GPU and CPU?)  If that is true, shouldn't it be
> invisible from the point of view of different CPU mapping options?

You could certainly build a system where mismatched attributes don't cause
loss of coherence, but as it's not guaranteed by the architecture and the
changes proposed here affect APIs which are exposed across SoCs, then I
don't think it helps much.

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [Patch V2 0/2] iommu/arm-smmu: Fix races in iommu domain/group creation

2021-08-09 Thread Will Deacon
On Mon, Aug 02, 2021 at 04:46:37PM +0100, Robin Murphy wrote:
> On 2021-08-02 16:16, Will Deacon wrote:
> > On Fri, Jun 18, 2021 at 02:00:35AM +0530, Ashish Mhetre wrote:
> > > Multiple iommu domains and iommu groups are getting created for the 
> > > devices
> > > sharing same SID. It is expected for devices sharing same SID to be in 
> > > same
> > > iommu group and same iommu domain.
> > > This is leading to context faults when one device is accessing IOVA from
> > > other device which shouldn't be the case for devices sharing same SID.
> > > Fix this by protecting iommu domain and iommu group creation with mutexes.
> > 
> > Robin -- any chance you could take a look at these, please? You had some
> > comments on the first version which convinced me that they are needed,
> > but I couldn't tell whether you wanted to solve this a different way or not.
> 
> Sorry, I was lamenting that this came to light due to the
> of_iommu_configure() flow being yucky, but that wasn't meant to imply that
> there aren't - or couldn't be in future - better reasons for
> iommu_probe_device() to be robust against concurrency anyway. I do think
> these are legitimate fixes to make in their own right, even if the current
> need might get swept back under the rug in future.
> 
> I would say, however, that the commit messages seem to focus too much on the
> wrong details and aren't overly useful, and patch #2 is missing Ashish's
> sign-off.

Ashish -- please can you send a v3 fixing these issues?

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 24/25] iommu/dma: Factor out flush queue init

2021-08-09 Thread Will Deacon
On Wed, Aug 04, 2021 at 06:15:52PM +0100, Robin Murphy wrote:
> Factor out flush queue setup from the initial domain init so that we
> can potentially trigger it from sysfs later on in a domain's lifetime.
> 
> Reviewed-by: Lu Baolu 
> Reviewed-by: John Garry 
> Signed-off-by: Robin Murphy 
> ---
>  drivers/iommu/dma-iommu.c | 30 --
>  include/linux/dma-iommu.h |  9 ++---
>  2 files changed, 26 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 2e19505dddf9..f51b8dc99ac6 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -310,6 +310,25 @@ static bool dev_is_untrusted(struct device *dev)
>   return dev_is_pci(dev) && to_pci_dev(dev)->untrusted;
>  }
>  
> +int iommu_dma_init_fq(struct iommu_domain *domain)
> +{
> + struct iommu_dma_cookie *cookie = domain->iova_cookie;
> +
> + if (domain->type != IOMMU_DOMAIN_DMA_FQ)
> + return -EINVAL;
> + if (cookie->fq_domain)
> + return 0;
> +
> + if (init_iova_flush_queue(>iovad, iommu_dma_flush_iotlb_all,
> +   iommu_dma_entry_dtor)) {
> + pr_warn("iova flush queue initialization failed\n");
> + domain->type = IOMMU_DOMAIN_DMA;
> + return -ENODEV;

I do find this a bit odd: we assert that the caller has set domain->type
to IOMMU_DOMAIN_DMA_FQ but then on failure we reset it to IOMMU_DOMAIN_DMA
here. I think it would be less error-prone if the setting of domain->type
was handled in the same function.

Will
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


  1   2   3   4   5   6   7   8   9   10   >