Re: [git pull] IOMMU Fixes for Linux v5.11-rc5

2021-01-29 Thread pr-tracker-bot
The pull request you sent on Fri, 29 Jan 2021 17:41:29 +0100:

> git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git 
> tags/iommu-fixes-v5.11-rc5

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/8ef24c2011b77bd6344d16630d3cd95d63de63f8

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 3/3] iommu/arm-smmu-v3: Reserving the entire SMMU register space

2021-01-29 Thread Leizhen (ThunderTown)


On 2021/1/29 23:27, Robin Murphy wrote:
> On 2021-01-27 11:32, Zhen Lei wrote:
>> commit 52f3fab0067d ("iommu/arm-smmu-v3: Don't reserve implementation
>> defined register space") only reserves the basic SMMU register space, so
>> the ECMDQ register space is not covered and would need to be mapped again.
>> Because the size of the ECMDQ region is not fixed (it depends on
>> SMMU_IDR6.CMDQ_CONTROL_PAGE_LOG2NUMQ), handling its resource reservation
>> so that it avoids a conflict with the PMCG is a bit more complicated.
>>
>> Therefore, stop reserving the PMCG resources and reserve the entire SMMU
>> register space instead.
> 
> This is the opposite of what I suggested. My point was that it will make the 
> most sense to map the ECMDQ pages as a separate request anyway, therefore 
> there is no reason to stop mapping page 0 and page 1 separately either.

I don't understand why the ECMDQ mapping must be performed separately. If the
conflict with the PMCG resources is eliminated, there is no need: unlike the
PMCG, ECMDQ cannot be a separate driver.

> 
> If we need to expand the page 0 mapping to cover more of page 0 to reach the 
> SMMU_CMDQ_CONTROL_PAGE_* registers, we can do that when we actually add ECMDQ 
> support.
> 
> Robin.
> 
>> Suggested-by: Robin Murphy 
>> Signed-off-by: Zhen Lei 
>> ---
>>   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 24 
>>   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  2 --
>>   2 files changed, 4 insertions(+), 22 deletions(-)
>>
>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> index f04c55a7503c790..fde24403b06a9e3 100644
>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> @@ -3476,14 +3476,6 @@ static int arm_smmu_set_bus_ops(struct iommu_ops *ops)
>>   return err;
>>   }
>> -static void __iomem *arm_smmu_ioremap(struct device *dev, resource_size_t start,
>> -  resource_size_t size)
>> -{
>> -    struct resource res = DEFINE_RES_MEM(start, size);
>> -
>> -    return devm_ioremap_resource(dev, &res);
>> -}
>> -
>>   static int arm_smmu_device_probe(struct platform_device *pdev)
>>   {
>>   int irq, ret;
>> @@ -3519,22 +3511,14 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
>>   }
>>   ioaddr = res->start;
>>   -    /*
>> - * Don't map the IMPLEMENTATION DEFINED regions, since they may contain
>> - * the PMCG registers which are reserved by the PMU driver.
>> - */
>> -    smmu->base = arm_smmu_ioremap(dev, ioaddr, ARM_SMMU_REG_SZ);
>> +    smmu->base = devm_ioremap_resource(dev, res);
>>   if (IS_ERR(smmu->base))
>>   return PTR_ERR(smmu->base);
>>   -    if (arm_smmu_resource_size(smmu) > SZ_64K) {
>> -    smmu->page1 = arm_smmu_ioremap(dev, ioaddr + SZ_64K,
>> -   ARM_SMMU_REG_SZ);
>> -    if (IS_ERR(smmu->page1))
>> -    return PTR_ERR(smmu->page1);
>> -    } else {
>> +    if (arm_smmu_resource_size(smmu) > SZ_64K)
>> +    smmu->page1 = smmu->base + SZ_64K;
>> +    else
>>   smmu->page1 = smmu->base;
>> -    }
>>     /* Interrupt lines */
>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
>> index da525f46dab4fc1..63f2b476987d6ae 100644
>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
>> @@ -152,8 +152,6 @@
>>   #define ARM_SMMU_PRIQ_IRQ_CFG1    0xd8
>>   #define ARM_SMMU_PRIQ_IRQ_CFG2    0xdc
>>   -#define ARM_SMMU_REG_SZ    0xe00
>> -
>>   /* Common MSI config fields */
>>   #define MSI_CFG0_ADDR_MASK    GENMASK_ULL(51, 2)
>>   #define MSI_CFG2_SH    GENMASK(5, 4)
>>
> 
> .
> 


[PATCH v4 0/2] perf/smmuv3: Don't reserve the PMCG register spaces

2021-01-29 Thread Zhen Lei
v3 --> v4:
1. Delete the unnecessary encapsulation function 
smmu_pmu_get_and_ioremap_resource().
2. Discard adding MODULE_SOFTDEP.

v2 --> v3:
Patch 3 is updated because https://lkml.org/lkml/2021/1/22/532 has been queued 
in advance.

v1 --> v2:
According to Robin Murphy's suggestion: https://lkml.org/lkml/2021/1/20/470
Don't reserve the PMCG register spaces, and reserve the entire SMMU register 
space.

v1:
The PMCG may implement its register space (4KB Page 0 and an optional 4KB
Page 1) within the SMMUv3 64KB Page 0. In that case, when the SMMUv3 driver
reserves its 64KB Page 0 resource in advance and the PMCG driver then tries
to reserve its Page 0 and Page 1 resources, a resource conflict occurs.

commit 52f3fab0067d6fa ("iommu/arm-smmu-v3: Don't reserve implementation
defined register space") reduced the resource reservation range of the SMMUv3
driver to only the first 0xe00 bytes of the 64KB Page 0, to avoid the
above-mentioned resource conflict.

But SMMUv3.3 adds support for ECMDQ, whose register space is also implemented
in the SMMUv3 64KB Page 0. This means we would need to build two separate
mappings, and new features added in the future may require yet more
independent mappings. A simple problem is made complicated, because the user
expects to map the entire SMMUv3 64KB Page 0.

Therefore, the proper solution is: if the PMCG register resources are located
in the 64KB Page 0 of the SMMU, the PMCG driver does not reserve them when
the SMMUv3 driver has already reserved them. Instead, the PMCG driver only
performs devm_ioremap() to ensure that it can work properly.

Zhen Lei (2):
  perf/smmuv3: Don't reserve the PMCG register spaces
  iommu/arm-smmu-v3: Reserving the entire SMMU register space

 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 24 
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  2 --
 drivers/perf/arm_smmuv3_pmu.c   | 25 +++--
 3 files changed, 23 insertions(+), 28 deletions(-)

-- 
1.8.3




[PATCH v4 1/2] perf/smmuv3: Don't reserve the PMCG register spaces

2021-01-29 Thread Zhen Lei
According to the SMMUv3 specification:
Each PMCG counter group is represented by one 4KB page (Page 0) with one
optional additional 4KB page (Page 1), both of which are at IMPLEMENTATION
DEFINED base addresses.

This means that the PMCG register spaces may be within the 64KB pages of
the SMMUv3 register space. When both the SMMU and PMCG drivers reserve
their own resources, a resource conflict occurs.

To avoid this conflict, don't reserve the PMCG regions.

Suggested-by: Robin Murphy 
Signed-off-by: Zhen Lei 
---
 drivers/perf/arm_smmuv3_pmu.c | 25 +++--
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/drivers/perf/arm_smmuv3_pmu.c b/drivers/perf/arm_smmuv3_pmu.c
index 74474bb322c3f26..5e894f957c7b935 100644
--- a/drivers/perf/arm_smmuv3_pmu.c
+++ b/drivers/perf/arm_smmuv3_pmu.c
@@ -793,17 +793,30 @@ static int smmu_pmu_probe(struct platform_device *pdev)
.capabilities   = PERF_PMU_CAP_NO_EXCLUDE,
};
 
-	smmu_pmu->reg_base = devm_platform_get_and_ioremap_resource(pdev, 0, &res_0);
-	if (IS_ERR(smmu_pmu->reg_base))
-		return PTR_ERR(smmu_pmu->reg_base);
+	/*
+	 * The register spaces of the PMCG may lie within the register space of
+	 * another device (for example, the SMMU). Therefore, the PMCG resources
+	 * are not reserved, to avoid resource conflicts with other drivers.
+	 */
+	res_0 = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (!res_0)
+		return -EINVAL;
+	smmu_pmu->reg_base = devm_ioremap(dev, res_0->start, resource_size(res_0));
+	if (!smmu_pmu->reg_base)
+		return -ENOMEM;
 
cfgr = readl_relaxed(smmu_pmu->reg_base + SMMU_PMCG_CFGR);
 
/* Determine if page 1 is present */
if (cfgr & SMMU_PMCG_CFGR_RELOC_CTRS) {
-   smmu_pmu->reloc_base = devm_platform_ioremap_resource(pdev, 1);
-   if (IS_ERR(smmu_pmu->reloc_base))
-   return PTR_ERR(smmu_pmu->reloc_base);
+   struct resource *res_1;
+
		res_1 = platform_get_resource(pdev, IORESOURCE_MEM, 1);
+		if (!res_1)
+			return -EINVAL;
+		smmu_pmu->reloc_base = devm_ioremap(dev, res_1->start, resource_size(res_1));
+		if (!smmu_pmu->reloc_base)
+			return -ENOMEM;
} else {
smmu_pmu->reloc_base = smmu_pmu->reg_base;
}
-- 
1.8.3




[PATCH v4 2/2] iommu/arm-smmu-v3: Reserving the entire SMMU register space

2021-01-29 Thread Zhen Lei
commit 52f3fab0067d ("iommu/arm-smmu-v3: Don't reserve implementation
defined register space") only reserves the basic SMMU register space, so
the ECMDQ register space is not covered and would need to be mapped again.
Because the size of the ECMDQ region is not fixed (it depends on
SMMU_IDR6.CMDQ_CONTROL_PAGE_LOG2NUMQ), handling its resource reservation
so that it avoids a conflict with the PMCG is a bit more complicated.

Therefore, stop reserving the PMCG resources and reserve the entire SMMU
register space instead.

Signed-off-by: Zhen Lei 
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 24 
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  2 --
 2 files changed, 4 insertions(+), 22 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index f04c55a7503c790..fde24403b06a9e3 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3476,14 +3476,6 @@ static int arm_smmu_set_bus_ops(struct iommu_ops *ops)
return err;
 }
 
-static void __iomem *arm_smmu_ioremap(struct device *dev, resource_size_t start,
-				      resource_size_t size)
-{
-	struct resource res = DEFINE_RES_MEM(start, size);
-
-	return devm_ioremap_resource(dev, &res);
-}
-
 static int arm_smmu_device_probe(struct platform_device *pdev)
 {
int irq, ret;
@@ -3519,22 +3511,14 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
}
ioaddr = res->start;
 
-   /*
-* Don't map the IMPLEMENTATION DEFINED regions, since they may contain
-* the PMCG registers which are reserved by the PMU driver.
-*/
-   smmu->base = arm_smmu_ioremap(dev, ioaddr, ARM_SMMU_REG_SZ);
+   smmu->base = devm_ioremap_resource(dev, res);
if (IS_ERR(smmu->base))
return PTR_ERR(smmu->base);
 
-   if (arm_smmu_resource_size(smmu) > SZ_64K) {
-   smmu->page1 = arm_smmu_ioremap(dev, ioaddr + SZ_64K,
-  ARM_SMMU_REG_SZ);
-   if (IS_ERR(smmu->page1))
-   return PTR_ERR(smmu->page1);
-   } else {
+   if (arm_smmu_resource_size(smmu) > SZ_64K)
+   smmu->page1 = smmu->base + SZ_64K;
+   else
smmu->page1 = smmu->base;
-   }
 
/* Interrupt lines */
 
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index da525f46dab4fc1..63f2b476987d6ae 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -152,8 +152,6 @@
 #define ARM_SMMU_PRIQ_IRQ_CFG1 0xd8
 #define ARM_SMMU_PRIQ_IRQ_CFG2 0xdc
 
-#define ARM_SMMU_REG_SZ0xe00
-
 /* Common MSI config fields */
 #define MSI_CFG0_ADDR_MASK GENMASK_ULL(51, 2)
 #define MSI_CFG2_SHGENMASK(5, 4)
-- 
1.8.3




Re: [PATCH v4 1/2] perf/smmuv3: Don't reserve the PMCG register spaces

2021-01-29 Thread Leizhen (ThunderTown)
Hi, Robin:
  Can you review this patch again?


On 2021/1/30 15:14, Zhen Lei wrote:
> According to the SMMUv3 specification:
> Each PMCG counter group is represented by one 4KB page (Page 0) with one
> optional additional 4KB page (Page 1), both of which are at IMPLEMENTATION
> DEFINED base addresses.
> 
> This means that the PMCG register spaces may be within the 64KB pages of
> the SMMUv3 register space. When both the SMMU and PMCG drivers reserve
> their own resources, a resource conflict occurs.
> 
> To avoid this conflict, don't reserve the PMCG regions.
> 
> Suggested-by: Robin Murphy 
> Signed-off-by: Zhen Lei 
> ---
>  drivers/perf/arm_smmuv3_pmu.c | 25 +++--
>  1 file changed, 19 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/perf/arm_smmuv3_pmu.c b/drivers/perf/arm_smmuv3_pmu.c
> index 74474bb322c3f26..5e894f957c7b935 100644
> --- a/drivers/perf/arm_smmuv3_pmu.c
> +++ b/drivers/perf/arm_smmuv3_pmu.c
> @@ -793,17 +793,30 @@ static int smmu_pmu_probe(struct platform_device *pdev)
>   .capabilities   = PERF_PMU_CAP_NO_EXCLUDE,
>   };
>  
> -	smmu_pmu->reg_base = devm_platform_get_and_ioremap_resource(pdev, 0, &res_0);
> -	if (IS_ERR(smmu_pmu->reg_base))
> -		return PTR_ERR(smmu_pmu->reg_base);
> +	/*
> +	 * The register spaces of the PMCG may lie within the register space of
> +	 * another device (for example, the SMMU). Therefore, the PMCG resources
> +	 * are not reserved, to avoid resource conflicts with other drivers.
> +	 */
> +	res_0 = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +	if (!res_0)
> +		return -EINVAL;
> +	smmu_pmu->reg_base = devm_ioremap(dev, res_0->start, resource_size(res_0));
> +	if (!smmu_pmu->reg_base)
> +		return -ENOMEM;
>  
>   cfgr = readl_relaxed(smmu_pmu->reg_base + SMMU_PMCG_CFGR);
>  
>   /* Determine if page 1 is present */
>   if (cfgr & SMMU_PMCG_CFGR_RELOC_CTRS) {
> - smmu_pmu->reloc_base = devm_platform_ioremap_resource(pdev, 1);
> - if (IS_ERR(smmu_pmu->reloc_base))
> - return PTR_ERR(smmu_pmu->reloc_base);
> + struct resource *res_1;
> +
> +		res_1 = platform_get_resource(pdev, IORESOURCE_MEM, 1);
> +		if (!res_1)
> +			return -EINVAL;
> +		smmu_pmu->reloc_base = devm_ioremap(dev, res_1->start, resource_size(res_1));
> +		if (!smmu_pmu->reloc_base)
> +			return -ENOMEM;
>   } else {
>   smmu_pmu->reloc_base = smmu_pmu->reg_base;
>   }
> 



Re: [PATCH v3 1/3] perf/smmuv3: Don't reserve the PMCG register spaces

2021-01-29 Thread Leizhen (ThunderTown)


On 2021/1/29 23:06, Robin Murphy wrote:
> On 2021-01-27 11:32, Zhen Lei wrote:
>> According to the SMMUv3 specification:
>> Each PMCG counter group is represented by one 4KB page (Page 0) with one
>> optional additional 4KB page (Page 1), both of which are at IMPLEMENTATION
>> DEFINED base addresses.
>>
>> This means that the PMCG register spaces may be within the 64KB pages of
>> the SMMUv3 register space. When both the SMMU and PMCG drivers reserve
>> their own resources, a resource conflict occurs.
>>
>> To avoid this conflict, don't reserve the PMCG regions.
> 
> I'm still not a fan of this get_and_ioremap notion in general, especially 
> when the "helper" function ends up over twice the size of all the code it 
> replaces[1], but for the actual functional change here,

OK, I'll change it to [1] and add some comments.

> 
> Reviewed-by: Robin Murphy 
> 
>> Suggested-by: Robin Murphy 
>> Signed-off-by: Zhen Lei 
>> ---
>>   drivers/perf/arm_smmuv3_pmu.c | 27 +--
>>   1 file changed, 25 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/perf/arm_smmuv3_pmu.c b/drivers/perf/arm_smmuv3_pmu.c
>> index 74474bb322c3f26..e5e505a0804fe53 100644
>> --- a/drivers/perf/arm_smmuv3_pmu.c
>> +++ b/drivers/perf/arm_smmuv3_pmu.c
>> @@ -761,6 +761,29 @@ static void smmu_pmu_get_acpi_options(struct smmu_pmu *smmu_pmu)
>>   dev_notice(smmu_pmu->dev, "option mask 0x%x\n", smmu_pmu->options);
>>   }
>>   +static void __iomem *
>> +smmu_pmu_get_and_ioremap_resource(struct platform_device *pdev,
>> +  unsigned int index,
>> +  struct resource **res)
>> +{
>> +    void __iomem *base;
>> +    struct resource *r;
>> +
>> +    r = platform_get_resource(pdev, IORESOURCE_MEM, index);
>> +    if (!r) {
>> +    dev_err(&pdev->dev, "invalid resource\n");
>> +    return ERR_PTR(-EINVAL);
>> +    }
>> +    if (res)
>> +    *res = r;
>> +
>> +    base = devm_ioremap(&pdev->dev, r->start, resource_size(r));
>> +    if (!base)
>> +    return ERR_PTR(-ENOMEM);
>> +
>> +    return base;
>> +}
>> +
>>   static int smmu_pmu_probe(struct platform_device *pdev)
>>   {
>>   struct smmu_pmu *smmu_pmu;
>> @@ -793,7 +816,7 @@ static int smmu_pmu_probe(struct platform_device *pdev)
>>   .capabilities    = PERF_PMU_CAP_NO_EXCLUDE,
>>   };
>> -    smmu_pmu->reg_base = devm_platform_get_and_ioremap_resource(pdev, 0, &res_0);
>> +    smmu_pmu->reg_base = smmu_pmu_get_and_ioremap_resource(pdev, 0, &res_0);
>>   if (IS_ERR(smmu_pmu->reg_base))
>>   return PTR_ERR(smmu_pmu->reg_base);
>>   @@ -801,7 +824,7 @@ static int smmu_pmu_probe(struct platform_device *pdev)
>>     /* Determine if page 1 is present */
>>   if (cfgr & SMMU_PMCG_CFGR_RELOC_CTRS) {
>> -    smmu_pmu->reloc_base = devm_platform_ioremap_resource(pdev, 1);
>> +    smmu_pmu->reloc_base = smmu_pmu_get_and_ioremap_resource(pdev, 1, NULL);
>>   if (IS_ERR(smmu_pmu->reloc_base))
>>   return PTR_ERR(smmu_pmu->reloc_base);
>>   } else {
>>
> 
> [1]
> ->8-
>  drivers/perf/arm_smmuv3_pmu.c | 35 +--
>  1 file changed, 9 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/perf/arm_smmuv3_pmu.c b/drivers/perf/arm_smmuv3_pmu.c
> index e5e505a0804f..c9adbc7b55a1 100644
> --- a/drivers/perf/arm_smmuv3_pmu.c
> +++ b/drivers/perf/arm_smmuv3_pmu.c
> @@ -761,33 +761,10 @@ static void smmu_pmu_get_acpi_options(struct smmu_pmu *smmu_pmu)
>  dev_notice(smmu_pmu->dev, "option mask 0x%x\n", smmu_pmu->options);
>  }
> 
> -static void __iomem *
> -smmu_pmu_get_and_ioremap_resource(struct platform_device *pdev,
> -  unsigned int index,
> -  struct resource **res)
> -{
> -    void __iomem *base;
> -    struct resource *r;
> -
> -    r = platform_get_resource(pdev, IORESOURCE_MEM, index);
> -    if (!r) {
> -    dev_err(&pdev->dev, "invalid resource\n");
> -    return ERR_PTR(-EINVAL);
> -    }
> -    if (res)
> -    *res = r;
> -
> -    base = devm_ioremap(&pdev->dev, r->start, resource_size(r));
> -    if (!base)
> -    return ERR_PTR(-ENOMEM);
> -
> -    return base;
> -}
> -
>  static int smmu_pmu_probe(struct platform_device *pdev)
>  {
>  struct smmu_pmu *smmu_pmu;
> -    struct resource *res_0;
> +    struct resource *res_0, *res_1;
>  u32 cfgr, reg_size;
>  u64 ceid_64[2];
>  int irq, err;
> @@ -816,7 +793,10 @@ static int smmu_pmu_probe(struct platform_device *pdev)
>  .capabilities    = PERF_PMU_CAP_NO_EXCLUDE,
>  };
> 
> -    smmu_pmu->reg_base = smmu_pmu_get_and_ioremap_resource(pdev, 0, &res_0);
> +    res_0 = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +    if (!res_0)
> +    return -EINVAL;
> +    smmu_pmu->reg_base = devm_ioremap(dev, res_0->start, resource_size(res_0));
>  if (IS_ERR(smmu_pmu->reg_base))
>  return PTR_ERR(smmu_pmu->reg_base);
> 
> @@ -824,7 +804,10 @@ static int smmu_pmu_probe(struct platform_device 

Re: [PATCH v3 2/3] perf/smmuv3: Add a MODULE_SOFTDEP() to indicate dependency on SMMU

2021-01-29 Thread Leizhen (ThunderTown)



On 2021/1/30 1:03, Robin Murphy wrote:
> On 2021-01-29 15:34, John Garry wrote:
>> On 29/01/2021 15:12, Robin Murphy wrote:
>>> On 2021-01-27 11:32, Zhen Lei wrote:
>>>> The MODULE_SOFTDEP() gives user space a hint of the loading sequence. And
>>>> when command "modprobe arm_smmuv3_pmu" is executed, the arm_smmu_v3.ko is
>>>> automatically loaded in advance.
>>>
>>> Why do we need this? If probe order doesn't matter when both drivers are 
>>> built-in, why should module load order?
>>>
>>> TBH I'm not sure why we even have a Kconfig dependency on ARM_SMMU_V3, 
>>> given that the drivers operate completely independently :/
>>
>> Can that Kconfig dependency just be removed? I think that it was added under 
>> the idea that there is no point in having the SMMUv3 PMU driver without the 
>> SMMUv3 driver.
> 
> A PMCG *might* be usable for simply counting transactions to measure device 
> activity regardless of its associated SMMU being enabled.

If that's the case, the SOFTDEP really shouldn't be added. I wasn't trying to
enforce a particular load order, only to make sure that loading the SMMU
driver was not forgotten.

> Either way, it's not really Kconfig's job to decide what makes sense (beyond 
> the top-level "can this driver *ever* be used on this platform" visibility 
> choices). Imagine if we gave every PCI/USB/etc. device driver an explicit 
> dependency on at least one host controller driver being enabled...
> 
> Robin.
> 
> .
> 



Re: [PATCH 2/3] iommu/io-pgtable-arm: Add IOMMU_LLC page protection flag

2021-01-29 Thread Sai Prakash Ranjan

On 2021-01-29 14:35, Will Deacon wrote:
> On Mon, Jan 11, 2021 at 07:45:04PM +0530, Sai Prakash Ranjan wrote:
>> Add a new page protection flag IOMMU_LLC which can be used
>> by non-coherent masters to set cacheable memory attributes
>> for an outer level of cache called as last-level cache or
>> system cache. Initial user of this page protection flag is
>> the adreno gpu and then can later be used by other clients
>> such as video where this can be used for per-buffer based
>> mapping.
>>
>> Signed-off-by: Sai Prakash Ranjan 
>> ---
>>  drivers/iommu/io-pgtable-arm.c | 3 +++
>>  include/linux/iommu.h  | 6 ++
>>  2 files changed, 9 insertions(+)
>>
>> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
>> index 7439ee7fdcdb..ebe653ef601b 100644
>> --- a/drivers/iommu/io-pgtable-arm.c
>> +++ b/drivers/iommu/io-pgtable-arm.c
>> @@ -415,6 +415,9 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
>> 	else if (prot & IOMMU_CACHE)
>> 		pte |= (ARM_LPAE_MAIR_ATTR_IDX_CACHE
>> 			<< ARM_LPAE_PTE_ATTRINDX_SHIFT);
>> +	else if (prot & IOMMU_LLC)
>> +		pte |= (ARM_LPAE_MAIR_ATTR_IDX_INC_OCACHE
>> +			<< ARM_LPAE_PTE_ATTRINDX_SHIFT);
>> 	}
>>
>> 	if (prot & IOMMU_CACHE)
>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>> index ffaa389ea128..1f82057df531 100644
>> --- a/include/linux/iommu.h
>> +++ b/include/linux/iommu.h
>> @@ -31,6 +31,12 @@
>>   * if the IOMMU page table format is equivalent.
>>   */
>>  #define IOMMU_PRIV (1 << 5)
>> +/*
>> + * Non-coherent masters can use this page protection flag to set cacheable
>> + * memory attributes for only a transparent outer level of cache, also known as
>> + * the last-level or system cache.
>> + */
>> +#define IOMMU_LLC  (1 << 6)
>
> On reflection, I'm a bit worried about exposing this because I think it will
> introduce a mismatched virtual alias with the CPU (we don't even have a MAIR
> set up for this memory type). Now, we also have that issue for the PTW, but
> since we always use cache maintenance (i.e. the streaming API) for
> publishing the page-tables to a non-coherent walker, it works out. However,
> if somebody expects IOMMU_LLC to be coherent with a DMA API coherent
> allocation, then they're potentially in for a nasty surprise due to the
> mismatched outer-cacheability attributes.

Can't we add the syscached memory type, similar to what is done on android?

> So I can take patch (1) as a trivial rename, but unfortunately I think this
> needs more thought before exposing it beyond the PTW.

That wouldn't be of much use, would it :) , we would be losing on
perf gain for GPU usecases without the rest of the patches.

Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH v6 07/33] iommu: Avoid reallocate default domain for a group

2021-01-29 Thread Will Deacon
On Fri, Jan 29, 2021 at 09:52:42AM +0800, Yong Wu wrote:
> On Thu, 2021-01-28 at 21:14 +, Will Deacon wrote:
> > On Thu, Jan 28, 2021 at 09:10:20PM +, Will Deacon wrote:
> > > On Wed, Jan 27, 2021 at 05:39:16PM +0800, Yong Wu wrote:
> > > > On Tue, 2021-01-26 at 22:23 +, Will Deacon wrote:
> > > > > On Mon, Jan 11, 2021 at 07:18:48PM +0800, Yong Wu wrote:
> > > > > > If group->default_domain exists, avoid reallocate it.
> > > > > > 
> > > > > > In some iommu drivers, there may be several devices share a group. 
> > > > > > Avoid
> > > > > > realloc the default domain for this case.
> > > > > > 
> > > > > > Signed-off-by: Yong Wu 
> > > > > > ---
> > > > > >  drivers/iommu/iommu.c | 3 ++-
> > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > > 
> > > > > > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> > > > > > index 3d099a31ddca..f4b87e6abe80 100644
> > > > > > --- a/drivers/iommu/iommu.c
> > > > > > +++ b/drivers/iommu/iommu.c
> > > > > > @@ -266,7 +266,8 @@ int iommu_probe_device(struct device *dev)
> > > > > >  * support default domains, so the return value is not yet
> > > > > >  * checked.
> > > > > >  */
> > > > > > -   iommu_alloc_default_domain(group, dev);
> > > > > > +   if (!group->default_domain)
> > > > > > +   iommu_alloc_default_domain(group, dev);
> > > > > 
> > > > > I don't really get what this achieves, since 
> > > > > iommu_alloc_default_domain()
> > > > > looks like this:
> > > > > 
> > > > > static int iommu_alloc_default_domain(struct iommu_group *group,
> > > > > struct device *dev)
> > > > > {
> > > > >   unsigned int type;
> > > > > 
> > > > >   if (group->default_domain)
> > > > >   return 0;
> > > > > 
> > > > >   ...
> > > > > 
> > > > > in which case, it should be fine?
> > > > 
> > > > oh. sorry, I overlooked this. the current code is enough.
> > > > I will remove this patch. and send the next version in this week.
> > > > Thanks very much.
> > > 
> > > Actually, looking at this again, if we're dropping this patch and patch 6
> > > just needs the kfree() moving about, then there's no need to repost. The
> > > issue that Robin and Paul are discussing can be handled separately.
> > 
> > Argh, except that this version of the patches doesn't apply :)
> > 
> > So after all that, please go ahead and post a v7 ASAP based on this branch:
> > 
> > https://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git/log/?h=for-joerg/mtk
> 
> After confirm with Tomasz, He still need some time to take a look at v6.
> 
> thus I need wait some time to send v7 after his feedback.
> 
> Thanks for your comment. and Thanks Tomasz for the review.

Ok, but please be aware that I'm planning to send my queue to Joerg on
Monday, so if it doesn't show up today then it will miss 5.12.

Will


Re: [PATCH 0/2] iommu/ipmmu-vmsa: refactoring and allow SDHI devices

2021-01-29 Thread Joerg Roedel
On Thu, Jan 28, 2021 at 10:02:58PM +0900, Yoshihiro Shimoda wrote:
> Yoshihiro Shimoda (2):
>   iommu/ipmmu-vmsa: refactor ipmmu_of_xlate()
>   iommu/ipmmu-vmsa: Allow SDHI devices
> 
>  drivers/iommu/ipmmu-vmsa.c | 53 +++---
>  1 file changed, 22 insertions(+), 31 deletions(-)

Applied, thanks.


Re: [PATCH 0/3] iommu/drm/msm: Allow non-coherent masters to use system cache

2021-01-29 Thread Sai Prakash Ranjan

On 2021-01-20 10:48, Sai Prakash Ranjan wrote:
> On 2021-01-11 19:45, Sai Prakash Ranjan wrote:
>> commit ecd7274fb4cd ("iommu: Remove unused IOMMU_SYS_CACHE_ONLY flag")
>> removed unused IOMMU_SYS_CACHE_ONLY prot flag and along with it went
>> the memory type setting required for the non-coherent masters to use
>> system cache. Now that system cache support for GPU is added, we will
>> need to set the right PTE attribute for GPU buffers to be sys cached.
>> Without this, the system cache lines are not allocated for GPU.
>>
>> So the patches in this series introduces a new prot flag IOMMU_LLC,
>> renames IO_PGTABLE_QUIRK_ARM_OUTER_WBWA to IO_PGTABLE_QUIRK_PTW_LLC
>> and makes GPU the user of this protection flag.
>>
>> The series slightly depends on following 2 patches posted earlier and
>> is based on msm-next branch:
>>  * https://lore.kernel.org/patchwork/patch/1363008/
>>  * https://lore.kernel.org/patchwork/patch/1363010/
>>
>> Sai Prakash Ranjan (3):
>>   iommu/io-pgtable: Rename last-level cache quirk to
>>     IO_PGTABLE_QUIRK_PTW_LLC
>>   iommu/io-pgtable-arm: Add IOMMU_LLC page protection flag
>>   drm/msm: Use IOMMU_LLC page protection flag to map gpu buffers
>>
>>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 3 +++
>>  drivers/gpu/drm/msm/adreno/adreno_gpu.c | 2 +-
>>  drivers/gpu/drm/msm/msm_iommu.c | 3 +++
>>  drivers/gpu/drm/msm/msm_mmu.h   | 4 
>>  drivers/iommu/io-pgtable-arm.c  | 9 ++---
>>  include/linux/io-pgtable.h  | 6 +++---
>>  include/linux/iommu.h   | 6 ++
>>  7 files changed, 26 insertions(+), 7 deletions(-)
>>
>> base-commit: 00fd44a1a4700718d5d962432b55c09820f7e709
>
> Gentle Ping!

Gentle Ping!!

Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH 2/3] iommu/io-pgtable-arm: Add IOMMU_LLC page protection flag

2021-01-29 Thread Will Deacon
On Mon, Jan 11, 2021 at 07:45:04PM +0530, Sai Prakash Ranjan wrote:
> Add a new page protection flag IOMMU_LLC which can be used
> by non-coherent masters to set cacheable memory attributes
> for an outer level of cache called as last-level cache or
> system cache. Initial user of this page protection flag is
> the adreno gpu and then can later be used by other clients
> such as video where this can be used for per-buffer based
> mapping.
> 
> Signed-off-by: Sai Prakash Ranjan 
> ---
>  drivers/iommu/io-pgtable-arm.c | 3 +++
>  include/linux/iommu.h  | 6 ++
>  2 files changed, 9 insertions(+)
> 
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 7439ee7fdcdb..ebe653ef601b 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -415,6 +415,9 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct 
> arm_lpae_io_pgtable *data,
>   else if (prot & IOMMU_CACHE)
>   pte |= (ARM_LPAE_MAIR_ATTR_IDX_CACHE
>   << ARM_LPAE_PTE_ATTRINDX_SHIFT);
> + else if (prot & IOMMU_LLC)
> + pte |= (ARM_LPAE_MAIR_ATTR_IDX_INC_OCACHE
> + << ARM_LPAE_PTE_ATTRINDX_SHIFT);
>   }
>  
>   if (prot & IOMMU_CACHE)
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index ffaa389ea128..1f82057df531 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -31,6 +31,12 @@
>   * if the IOMMU page table format is equivalent.
>   */
>  #define IOMMU_PRIV   (1 << 5)
> +/*
> + * Non-coherent masters can use this page protection flag to set cacheable
> + * memory attributes for only a transparent outer level of cache, also known 
> as
> + * the last-level or system cache.
> + */
> +#define IOMMU_LLC(1 << 6)

On reflection, I'm a bit worried about exposing this because I think it will
introduce a mismatched virtual alias with the CPU (we don't even have a MAIR
set up for this memory type). Now, we also have that issue for the PTW, but
since we always use cache maintenance (i.e. the streaming API) for
publishing the page-tables to a non-coherent walker, it works out. However,
if somebody expects IOMMU_LLC to be coherent with a DMA API coherent
allocation, then they're potentially in for a nasty surprise due to the
mismatched outer-cacheability attributes.

So I can take patch (1) as a trivial rename, but unfortunately I think this
needs more thought before exposing it beyond the PTW.

Will


[PATCH 1/1] Revert "iommu/iova: Retry from last rb tree node if iova search fails"

2021-01-29 Thread Zhen Lei
This reverts commit 4e89dce725213d3d0b0475211b500eda4ef4bf2f.

We find that this patch has a great impact on performance. According to
our test, the IOPS decrease from 1655.6K to 893.5K, about half.

Hardware: 1 SAS expander with 12 SAS SSD
Command:  Only the main parameters are listed.
  fio bs=4k rw=read iodepth=128 cpus_allowed=0-127

Fixes: 4e89dce72521 ("iommu/iova: Retry from last rb tree node if iova search 
fails")
Tested-by: Xiang Chen 
Signed-off-by: Zhen Lei 
---
 drivers/iommu/iova.c | 23 ++-
 1 file changed, 6 insertions(+), 17 deletions(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index d20b8b333d30d17..f840c7207efbced 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -185,9 +185,8 @@ static int __alloc_and_insert_iova_range(struct iova_domain 
*iovad,
struct rb_node *curr, *prev;
struct iova *curr_iova;
unsigned long flags;
-   unsigned long new_pfn, retry_pfn;
+   unsigned long new_pfn;
unsigned long align_mask = ~0UL;
-   unsigned long high_pfn = limit_pfn, low_pfn = iovad->start_pfn;
 
if (size_aligned)
align_mask <<= fls_long(size - 1);
@@ -200,25 +199,15 @@ static int __alloc_and_insert_iova_range(struct 
iova_domain *iovad,
 
curr = __get_cached_rbnode(iovad, limit_pfn);
curr_iova = rb_entry(curr, struct iova, node);
-   retry_pfn = curr_iova->pfn_hi + 1;
-
-retry:
do {
-   high_pfn = min(high_pfn, curr_iova->pfn_lo);
-   new_pfn = (high_pfn - size) & align_mask;
+   limit_pfn = min(limit_pfn, curr_iova->pfn_lo);
+   new_pfn = (limit_pfn - size) & align_mask;
prev = curr;
curr = rb_prev(curr);
curr_iova = rb_entry(curr, struct iova, node);
-   } while (curr && new_pfn <= curr_iova->pfn_hi && new_pfn >= low_pfn);
-
-   if (high_pfn < size || new_pfn < low_pfn) {
-   if (low_pfn == iovad->start_pfn && retry_pfn < limit_pfn) {
-   high_pfn = limit_pfn;
-   low_pfn = retry_pfn;
-   curr = &iovad->anchor.node;
-   curr_iova = rb_entry(curr, struct iova, node);
-   goto retry;
-   }
+   } while (curr && new_pfn <= curr_iova->pfn_hi);
+
+   if (limit_pfn < size || new_pfn < iovad->start_pfn) {
iovad->max32_alloc_size = size;
goto iova32_full;
}
-- 
1.8.3
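To make the restored behaviour concrete, here is a sketch of the top-down search outside the kernel; the rbtree walk is modelled with a sorted list, and all names are illustrative rather than the kernel API:

```python
def alloc_iova_pfn(allocated, start_pfn, limit_pfn, size, align=1):
    """Top-down first-fit, as in the restored __alloc_and_insert_iova_range().

    allocated: non-overlapping (pfn_lo, pfn_hi) ranges; returns the pfn of a
    free, aligned run of `size` frames below limit_pfn, or None on failure.
    """
    align_mask = ~(align - 1)
    candidate = (limit_pfn - size) & align_mask
    # Walk existing ranges from high to low, like the rb_prev() loop.
    for pfn_lo, pfn_hi in sorted(allocated, reverse=True):
        if candidate > pfn_hi:
            break                                   # gap above this range fits
        limit_pfn = min(limit_pfn, pfn_lo)          # squeeze below the range
        candidate = (limit_pfn - size) & align_mask
    if limit_pfn < size or candidate < start_pfn:
        return None                                 # space exhausted
    return candidate
```

Unlike the code being reverted, there is no second pass retried from the anchor node: a single downward scan either finds an aligned gap or fails, which is the fast path whose loss the fio numbers above measure.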




Re: [PATCH v2 0/2] iommu/vt-d: Some misc tweaks in SVA

2021-01-29 Thread Joerg Roedel
On Tue, Jan 26, 2021 at 04:07:28PM +0800, Lu Baolu wrote:
> Lu Baolu (2):
>   iommu/vt-d: Clear PRQ overflow only when PRQ is empty
>   iommu/vt-d: Use INVALID response code instead of FAILURE
> 
>  drivers/iommu/intel/svm.c | 18 --
>  1 file changed, 12 insertions(+), 6 deletions(-)

Applied, thanks.


[RFC PATCH 1/3] iommu/arm-smmu-v3: Export cd/ste get functions

2021-01-29 Thread Zhou Wang
Export arm_smmu_get_cd_ptr and arm_smmu_get_step_for_sid so that the debug
interface can retrieve the related CD and STE.

Signed-off-by: Zhou Wang 
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 7 ---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 2 ++
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 8ca7415..b65f63e2 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -947,8 +947,7 @@ static void arm_smmu_write_cd_l1_desc(__le64 *dst,
WRITE_ONCE(*dst, cpu_to_le64(val));
 }
 
-static __le64 *arm_smmu_get_cd_ptr(struct arm_smmu_domain *smmu_domain,
-  u32 ssid)
+__le64 *arm_smmu_get_cd_ptr(struct arm_smmu_domain *smmu_domain, u32 ssid)
 {
__le64 *l1ptr;
unsigned int idx;
@@ -973,6 +972,7 @@ static __le64 *arm_smmu_get_cd_ptr(struct arm_smmu_domain 
*smmu_domain,
idx = ssid & (CTXDESC_L2_ENTRIES - 1);
return l1_desc->l2ptr + idx * CTXDESC_CD_DWORDS;
 }
+EXPORT_SYMBOL_GPL(arm_smmu_get_cd_ptr);
 
 int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid,
struct arm_smmu_ctx_desc *cd)
@@ -2013,7 +2013,7 @@ static int arm_smmu_domain_finalise(struct iommu_domain 
*domain,
return 0;
 }
 
-static __le64 *arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid)
+__le64 *arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid)
 {
__le64 *step;
struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
@@ -2034,6 +2034,7 @@ static __le64 *arm_smmu_get_step_for_sid(struct 
arm_smmu_device *smmu, u32 sid)
 
return step;
 }
+EXPORT_SYMBOL_GPL(arm_smmu_get_step_for_sid);
 
 static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master)
 {
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 96c2e95..3e7af39 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -697,6 +697,8 @@ void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, 
u16 asid);
 bool arm_smmu_free_asid(struct arm_smmu_ctx_desc *cd);
 int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, int ssid,
unsigned long iova, size_t size);
+__le64 *arm_smmu_get_cd_ptr(struct arm_smmu_domain *smmu_domain, u32 ssid);
+__le64 *arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid);
 
 #ifdef CONFIG_ARM_SMMU_V3_SVA
 bool arm_smmu_sva_supported(struct arm_smmu_device *smmu);
-- 
2.8.1
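The SSID indexing that the exported arm_smmu_get_cd_ptr() performs for a two-level table can be modelled in a few lines. CTXDESC_L2_ENTRIES = 1024 assumes the driver's 10-bit split; the table contents here are purely illustrative:

```python
CTXDESC_L2_ENTRIES = 1024   # CDs per leaf table (assumed 10-bit split)

def get_cd_ptr(l1_tables, ssid):
    """Toy model of the two-level context-descriptor lookup: the upper SSID
    bits pick the leaf (L2) table, the low 10 bits pick the entry in it."""
    l1_idx = ssid // CTXDESC_L2_ENTRIES          # which leaf table
    l2_idx = ssid & (CTXDESC_L2_ENTRIES - 1)     # which CD within it
    return l1_tables[l1_idx][l2_idx]
```

A debugfs dumper only needs these two indices plus the table base pointers to locate the CD for a given device/PASID pair.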



[RFC PATCH 3/3] iommu/arm-smmu-v3: Add debug interfaces for SMMUv3

2021-01-29 Thread Zhou Wang
This patch adds debug interfaces for the SMMUv3 driver, creating debug-related
files under /sys/kernel/debug/iommu/smmuv3 in debugfs.

The user should first select the target device and PASID by writing to the
pci_dev and pasid files (currently only PCI devices are supported):
echo <domain>:<bus>:<device>.<function> > /sys/kernel/debug/iommu/smmuv3/pci_dev
echo <pasid> > /sys/kernel/debug/iommu/smmuv3/pasid

The values of the CD and STE can then be read with:
cat /sys/kernel/debug/iommu/smmuv3/ste
cat /sys/kernel/debug/iommu/smmuv3/cd

The stage-1 (S1) and stage-2 (S2) page tables can be dumped with:
cat /sys/kernel/debug/iommu/smmuv3/pt_dump_s1
cat /sys/kernel/debug/iommu/smmuv3/pt_dump_s2

For the STE, CD and page tables, the related device and PASID are those set
via the pci_dev and pasid files above.

Signed-off-by: Zhou Wang 
---
 drivers/iommu/Kconfig   |  11 +
 drivers/iommu/arm/arm-smmu-v3/Makefile  |   1 +
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |   3 +
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |   8 +
 drivers/iommu/arm/arm-smmu-v3/debugfs.c | 398 
 5 files changed, 421 insertions(+)
 create mode 100644 drivers/iommu/arm/arm-smmu-v3/debugfs.c

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 192ef8f..4822c88 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -325,6 +325,17 @@ config ARM_SMMU_V3_SVA
  Say Y here if your system supports SVA extensions such as PCIe PASID
  and PRI.
 
+config ARM_SMMU_V3_DEBUGFS
+   bool "Export ARM SMMUv3 internals in Debugfs"
+   depends on ARM_SMMU_V3 && IOMMU_DEBUGFS
+   help
+ DO NOT ENABLE THIS OPTION UNLESS YOU REALLY KNOW WHAT YOU ARE DOING!
+
+ Expose ARM SMMUv3 internals in Debugfs.
+
+ This option is -NOT- intended for production environments, and should
+ only be enabled for debugging ARM SMMUv3.
+
 config S390_IOMMU
def_bool y if S390 && PCI
depends on S390 && PCI
diff --git a/drivers/iommu/arm/arm-smmu-v3/Makefile 
b/drivers/iommu/arm/arm-smmu-v3/Makefile
index 54feb1ec..55b411a 100644
--- a/drivers/iommu/arm/arm-smmu-v3/Makefile
+++ b/drivers/iommu/arm/arm-smmu-v3/Makefile
@@ -3,3 +3,4 @@ obj-$(CONFIG_ARM_SMMU_V3) += arm_smmu_v3.o
 arm_smmu_v3-objs-y += arm-smmu-v3.o
 arm_smmu_v3-objs-$(CONFIG_ARM_SMMU_V3_SVA) += arm-smmu-v3-sva.o
 arm_smmu_v3-objs := $(arm_smmu_v3-objs-y)
+obj-$(CONFIG_ARM_SMMU_V3_DEBUGFS) += debugfs.o
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index b65f63e2..aac7fdb 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3602,6 +3602,8 @@ static int arm_smmu_device_probe(struct platform_device 
*pdev)
return ret;
}
 
+   arm_smmu_debugfs_init();
+
return arm_smmu_set_bus_ops(_smmu_ops);
 }
 
@@ -3610,6 +3612,7 @@ static int arm_smmu_device_remove(struct platform_device 
*pdev)
struct arm_smmu_device *smmu = platform_get_drvdata(pdev);
 
arm_smmu_set_bus_ops(NULL);
+   arm_smmu_debugfs_uninit();
   iommu_device_unregister(&smmu->iommu);
   iommu_device_sysfs_remove(&smmu->iommu);
arm_smmu_device_disable(smmu);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 3e7af39..31c4580 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -752,4 +752,12 @@ static inline u32 arm_smmu_sva_get_pasid(struct iommu_sva 
*handle)
 
 static inline void arm_smmu_sva_notifier_synchronize(void) {}
 #endif /* CONFIG_ARM_SMMU_V3_SVA */
+
+#ifdef CONFIG_ARM_SMMU_V3_DEBUGFS
+void arm_smmu_debugfs_init(void);
+void arm_smmu_debugfs_uninit(void);
+#else
+static inline void arm_smmu_debugfs_init(void) {}
+static inline void arm_smmu_debugfs_uninit(void) {}
+#endif /* CONFIG_ARM_SMMU_V3_DEBUGFS */
 #endif /* _ARM_SMMU_V3_H */
diff --git a/drivers/iommu/arm/arm-smmu-v3/debugfs.c 
b/drivers/iommu/arm/arm-smmu-v3/debugfs.c
new file mode 100644
index 000..1af219a
--- /dev/null
+++ b/drivers/iommu/arm/arm-smmu-v3/debugfs.c
@@ -0,0 +1,398 @@
+// SPDX-License-Identifier: GPL-2.0
+#include 
+#include 
+#include 
+#include 
+#include "arm-smmu-v3.h"
+#include "../../io-pgtable-arm.h"
+
+#undef pr_fmt
+#define pr_fmt(fmt) "SMMUv3 debug: " fmt
+
+#define NAME_BUF_LEN 32
+
+static struct dentry *arm_smmu_debug;
+static char dump_pci_dev[NAME_BUF_LEN];
+static u32 pasid;
+static struct mutex lock;
+
+static ssize_t master_pdev_read(struct file *filp, char __user *buf,
+   size_t count, loff_t *pos)
+{
+   char pdev_name[NAME_BUF_LEN];
+   char name[NAME_BUF_LEN];
+   int ret;
+
+   mutex_lock(&lock);
+   strncpy(pdev_name, dump_pci_dev, NAME_BUF_LEN);
+   mutex_unlock(&lock);
+
+   if (!strlen(pdev_name)) {
+   pr_err("Please set pci_dev firstly\n");
+   return 0;
+   }
+
+   ret = scnprintf(name, NAME_BUF_LEN, "%s\n", pdev_name);
+   return simple_read_from_buffer(buf, 

[RFC PATCH 2/3] iommu/io-pgtable: Export page table walk needed functions and macros

2021-01-29 Thread Zhou Wang
Export the functions and macros needed for page-table walking, so that the
page-table dump in the later debug interface can use them directly.

Signed-off-by: Zhou Wang 
---
 drivers/iommu/io-pgtable-arm.c | 47 +-
 drivers/iommu/io-pgtable-arm.h | 43 ++
 2 files changed, 48 insertions(+), 42 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 87def58..920a92b 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -24,35 +24,12 @@
 
 #define ARM_LPAE_MAX_ADDR_BITS 52
 #define ARM_LPAE_S2_MAX_CONCAT_PAGES   16
-#define ARM_LPAE_MAX_LEVELS4
-
-/* Struct accessors */
-#define io_pgtable_to_data(x)  \
-   container_of((x), struct arm_lpae_io_pgtable, iop)
-
-#define io_pgtable_ops_to_data(x)  \
-   io_pgtable_to_data(io_pgtable_ops_to_pgtable(x))
-
-/*
- * Calculate the right shift amount to get to the portion describing level l
- * in a virtual address mapped by the pagetable in d.
- */
-#define ARM_LPAE_LVL_SHIFT(l,d)
\
-   (((ARM_LPAE_MAX_LEVELS - (l)) * (d)->bits_per_level) +  \
-   ilog2(sizeof(arm_lpae_iopte)))
 
 #define ARM_LPAE_GRANULE(d)\
(sizeof(arm_lpae_iopte) << (d)->bits_per_level)
 #define ARM_LPAE_PGD_SIZE(d)   \
(sizeof(arm_lpae_iopte) << (d)->pgd_bits)
 
-/*
- * Calculate the index at level l used to map virtual address a using the
- * pagetable in d.
- */
-#define ARM_LPAE_PGD_IDX(l,d)  \
-   ((l) == (d)->start_level ? (d)->pgd_bits - (d)->bits_per_level : 0)
-
 #define ARM_LPAE_LVL_IDX(a,l,d)
\
(((u64)(a) >> ARM_LPAE_LVL_SHIFT(l,d)) &\
 ((1 << ((d)->bits_per_level + ARM_LPAE_PGD_IDX(l,d))) - 1))
@@ -127,34 +104,19 @@
 #define ARM_MALI_LPAE_MEMATTR_IMP_DEF  0x88ULL
 #define ARM_MALI_LPAE_MEMATTR_WRITE_ALLOC 0x8DULL
 
-/* IOPTE accessors */
-#define iopte_deref(pte,d) __va(iopte_to_paddr(pte, d))
-
 #define iopte_type(pte)\
(((pte) >> ARM_LPAE_PTE_TYPE_SHIFT) & ARM_LPAE_PTE_TYPE_MASK)
 
 #define iopte_prot(pte)((pte) & ARM_LPAE_PTE_ATTR_MASK)
 
-struct arm_lpae_io_pgtable {
-   struct io_pgtable   iop;
-
-   int pgd_bits;
-   int start_level;
-   int bits_per_level;
-
-   void*pgd;
-};
-
-typedef u64 arm_lpae_iopte;
-
-static inline bool iopte_leaf(arm_lpae_iopte pte, int lvl,
- enum io_pgtable_fmt fmt)
+bool iopte_leaf(arm_lpae_iopte pte, int lvl, enum io_pgtable_fmt fmt)
 {
if (lvl == (ARM_LPAE_MAX_LEVELS - 1) && fmt != ARM_MALI_LPAE)
return iopte_type(pte) == ARM_LPAE_PTE_TYPE_PAGE;
 
return iopte_type(pte) == ARM_LPAE_PTE_TYPE_BLOCK;
 }
+EXPORT_SYMBOL_GPL(iopte_leaf);
 
 static arm_lpae_iopte paddr_to_iopte(phys_addr_t paddr,
 struct arm_lpae_io_pgtable *data)
@@ -165,8 +127,8 @@ static arm_lpae_iopte paddr_to_iopte(phys_addr_t paddr,
return (pte | (pte >> (48 - 12))) & ARM_LPAE_PTE_ADDR_MASK;
 }
 
-static phys_addr_t iopte_to_paddr(arm_lpae_iopte pte,
- struct arm_lpae_io_pgtable *data)
+phys_addr_t iopte_to_paddr(arm_lpae_iopte pte,
+  struct arm_lpae_io_pgtable *data)
 {
u64 paddr = pte & ARM_LPAE_PTE_ADDR_MASK;
 
@@ -176,6 +138,7 @@ static phys_addr_t iopte_to_paddr(arm_lpae_iopte pte,
/* Rotate the packed high-order bits back to the top */
return (paddr | (paddr << (48 - 12))) & (ARM_LPAE_PTE_ADDR_MASK << 4);
 }
+EXPORT_SYMBOL_GPL(iopte_to_paddr);
 
 static bool selftest_running = false;
 
diff --git a/drivers/iommu/io-pgtable-arm.h b/drivers/iommu/io-pgtable-arm.h
index ba7cfdf..45e6d38 100644
--- a/drivers/iommu/io-pgtable-arm.h
+++ b/drivers/iommu/io-pgtable-arm.h
@@ -2,6 +2,8 @@
 #ifndef IO_PGTABLE_ARM_H_
 #define IO_PGTABLE_ARM_H_
 
+#include 
+
 #define ARM_LPAE_TCR_TG0_4K    0
 #define ARM_LPAE_TCR_TG0_64K   1
 #define ARM_LPAE_TCR_TG0_16K   2
@@ -27,4 +29,45 @@
 #define ARM_LPAE_TCR_PS_48_BIT 0x5ULL
 #define ARM_LPAE_TCR_PS_52_BIT 0x6ULL
 
+#define ARM_LPAE_MAX_LEVELS4
+
+struct arm_lpae_io_pgtable {
+   struct io_pgtable   iop;
+
+   int pgd_bits;
+   int start_level;
+   int bits_per_level;
+
+   void*pgd;
+};
+
+/* Struct accessors */
+#define io_pgtable_to_data(x)  \
+   container_of((x), struct arm_lpae_io_pgtable, 
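The level arithmetic behind the exported ARM_LPAE_LVL_SHIFT/ARM_LPAE_LVL_IDX macros can be checked numerically. The constants below assume a 4K granule (9 bits per level, 8-byte PTEs); the example is illustrative only:

```python
ARM_LPAE_MAX_LEVELS = 4
PTE_SIZE = 8          # sizeof(arm_lpae_iopte)
BITS_PER_LEVEL = 9    # 4K granule: 512 PTEs per table

def lvl_shift(lvl):
    # Mirrors ARM_LPAE_LVL_SHIFT(l,d):
    # ((MAX_LEVELS - l) * bits_per_level) + ilog2(sizeof(arm_lpae_iopte))
    return (ARM_LPAE_MAX_LEVELS - lvl) * BITS_PER_LEVEL + (PTE_SIZE.bit_length() - 1)

def lvl_idx(iova, lvl):
    # Mirrors ARM_LPAE_LVL_IDX(a,l,d) for a full (non-concatenated) level.
    return (iova >> lvl_shift(lvl)) & ((1 << BITS_PER_LEVEL) - 1)
```

A level-3 entry thus maps 4 KiB (shift 12) and a level-2 entry 2 MiB (shift 21): exactly the numbers a page-table dumper needs to walk a table and print the address range each descriptor covers.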

[RFC PATCH 0/3] iommu/arm-smmu-v3: Add debug interfaces for SMMUv3

2021-01-29 Thread Zhou Wang
This RFC series is the follow-up to this discussion:
https://www.spinics.net/lists/arm-kernel/msg866187.html.

Currently there is no debug interface for the SMMUv3 driver, which is
inconvenient when we want to dump information such as the CD/STE values,
S1/S2 page tables, SMMU registers, or the cmd/event/pri queues.

This series tries to add support for dumping the CD/STE and page tables. The
interface design is that the user first sets the device/PASID via debugfs
files and then reads the related file to get the information:

 (currently only PCI devices are supported)
 echo <domain>:<bus>:<device>.<function> > /sys/kernel/debug/iommu/smmuv3/pci_dev
 echo <pasid> > /sys/kernel/debug/iommu/smmuv3/pasid
 
 The values of the CD and STE can then be read with:
 cat /sys/kernel/debug/iommu/smmuv3/ste
 cat /sys/kernel/debug/iommu/smmuv3/cd
 
 The stage-1 (S1) and stage-2 (S2) page tables can be dumped with:
 cat /sys/kernel/debug/iommu/smmuv3/pt_dump_s1
 cat /sys/kernel/debug/iommu/smmuv3/pt_dump_s2

For the STE, CD and page tables, the related device and PASID are those set
via the pci_dev and pasid files above.

The first and second patches export some helper functions and macros from the
arm-smmu-v3 and io-pgtable-arm code so that they can be reused in debugfs.c.
As an RFC, this series does not go further and dump the SMMU registers or the
cmd/event/pri queues. I am not sure this series takes the right approach, so
let's post it and have a discussion. Looking forward to any feedback.

Zhou Wang (3):
  iommu/arm-smmu-v3: Export cd/ste get functions
  iommu/io-pgtable: Export page table walk needed functions and macros
  iommu/arm-smmu-v3: Add debug interfaces for SMMUv3

 drivers/iommu/Kconfig   |  11 +
 drivers/iommu/arm/arm-smmu-v3/Makefile  |   1 +
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |  10 +-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  10 +
 drivers/iommu/arm/arm-smmu-v3/debugfs.c | 398 
 drivers/iommu/io-pgtable-arm.c  |  47 +---
 drivers/iommu/io-pgtable-arm.h  |  43 +++
 7 files changed, 475 insertions(+), 45 deletions(-)
 create mode 100644 drivers/iommu/arm/arm-smmu-v3/debugfs.c

-- 
2.8.1



Re: [PATCH 1/1] Revert "iommu/iova: Retry from last rb tree node if iova search fails"

2021-01-29 Thread Leizhen (ThunderTown)


We are still thinking about the proper solution to the problem. However,
because the v5.11 release is approaching, this revert is being sent first.


On 2021/1/29 17:21, Zhen Lei wrote:
> This reverts commit 4e89dce725213d3d0b0475211b500eda4ef4bf2f.
> 
> We find that this patch has a great impact on performance. According to
> our test, the IOPS decrease from 1655.6K to 893.5K, about half.
> 
> Hardware: 1 SAS expander with 12 SAS SSD
> Command:  Only the main parameters are listed.
>   fio bs=4k rw=read iodepth=128 cpus_allowed=0-127
> 
> Fixes: 4e89dce72521 ("iommu/iova: Retry from last rb tree node if iova search 
> fails")
> Tested-by: Xiang Chen 
> Signed-off-by: Zhen Lei 
> ---
>  drivers/iommu/iova.c | 23 ++-
>  1 file changed, 6 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> index d20b8b333d30d17..f840c7207efbced 100644
> --- a/drivers/iommu/iova.c
> +++ b/drivers/iommu/iova.c
> @@ -185,9 +185,8 @@ static int __alloc_and_insert_iova_range(struct 
> iova_domain *iovad,
>   struct rb_node *curr, *prev;
>   struct iova *curr_iova;
>   unsigned long flags;
> - unsigned long new_pfn, retry_pfn;
> + unsigned long new_pfn;
>   unsigned long align_mask = ~0UL;
> - unsigned long high_pfn = limit_pfn, low_pfn = iovad->start_pfn;
>  
>   if (size_aligned)
>   align_mask <<= fls_long(size - 1);
> @@ -200,25 +199,15 @@ static int __alloc_and_insert_iova_range(struct 
> iova_domain *iovad,
>  
>   curr = __get_cached_rbnode(iovad, limit_pfn);
>   curr_iova = rb_entry(curr, struct iova, node);
> - retry_pfn = curr_iova->pfn_hi + 1;
> -
> -retry:
>   do {
> - high_pfn = min(high_pfn, curr_iova->pfn_lo);
> - new_pfn = (high_pfn - size) & align_mask;
> + limit_pfn = min(limit_pfn, curr_iova->pfn_lo);
> + new_pfn = (limit_pfn - size) & align_mask;
>   prev = curr;
>   curr = rb_prev(curr);
>   curr_iova = rb_entry(curr, struct iova, node);
> - } while (curr && new_pfn <= curr_iova->pfn_hi && new_pfn >= low_pfn);
> -
> - if (high_pfn < size || new_pfn < low_pfn) {
> - if (low_pfn == iovad->start_pfn && retry_pfn < limit_pfn) {
> - high_pfn = limit_pfn;
> - low_pfn = retry_pfn;
> - curr = &iovad->anchor.node;
> - curr_iova = rb_entry(curr, struct iova, node);
> - goto retry;
> - }
> + } while (curr && new_pfn <= curr_iova->pfn_hi);
> +
> + if (limit_pfn < size || new_pfn < iovad->start_pfn) {
>   iovad->max32_alloc_size = size;
>   goto iova32_full;
>   }
> 



Re: [PATCH v5 06/27] dt-bindings: mediatek: Add binding for mt8192 IOMMU

2021-01-29 Thread Tomasz Figa
On Mon, Jan 25, 2021 at 4:34 PM Yong Wu  wrote:
>
> On Mon, 2021-01-25 at 13:18 +0900, Tomasz Figa wrote:
> > On Wed, Jan 20, 2021 at 4:08 PM Yong Wu  wrote:
> > >
> > > On Wed, 2021-01-20 at 13:15 +0900, Tomasz Figa wrote:
> > > > On Wed, Jan 13, 2021 at 3:45 PM Yong Wu  wrote:
> > > > >
> > > > > On Wed, 2021-01-13 at 14:30 +0900, Tomasz Figa wrote:
> > > > > > On Thu, Dec 24, 2020 at 8:35 PM Yong Wu  
> > > > > > wrote:
> > > > > > >
> > > > > > > On Wed, 2020-12-23 at 17:18 +0900, Tomasz Figa wrote:
> > > > > > > > On Wed, Dec 09, 2020 at 04:00:41PM +0800, Yong Wu wrote:
> > > > > > > > > This patch adds descriptions for mt8192 IOMMU and SMI.
> > > > > > > > >
> > > > > > > > > mt8192 also is MTK IOMMU gen2 which uses ARM Short-Descriptor 
> > > > > > > > > translation
> > > > > > > > > table format. The M4U-SMI HW diagram is as below:
> > > > > > > > >
> > > > > > > > >   EMI
> > > > > > > > >|
> > > > > > > > >   M4U
> > > > > > > > >|
> > > > > > > > >   
> > > > > > > > >SMI Common
> > > > > > > > >   
> > > > > > > > >|
> > > > > > > > >   +---+--+--+--+---+
> > > > > > > > >   |   |  |  |   .. |   |
> > > > > > > > >   |   |  |  |  |   |
> > > > > > > > > larb0   larb1  larb2  larb4 ..  larb19   larb20
> > > > > > > > > disp0   disp1   mdp   vdec  ..   IPE  IPE
> > > > > > > > >
> > > > > > > > > All the connections are HW fixed, SW can NOT adjust it.
> > > > > > > > >
> > > > > > > > > mt8192 M4U support 0~16GB iova range. we preassign different 
> > > > > > > > > engines
> > > > > > > > > into different iova ranges:
> > > > > > > > >
> > > > > > > > > domain-id  module iova-range  larbs
> > > > > > > > >0   disp   0 ~ 4G  larb0/1
> > > > > > > > >1   vcodec  4G ~ 8G larb4/5/7
> > > > > > > > >2   cam/mdp 8G ~ 12G 
> > > > > > > > > larb2/9/11/13/14/16/17/18/19/20
> > > > > > > >
> > > > > > > > Why do we preassign these addresses in DT? Shouldn't it be a 
> > > > > > > > user's or
> > > > > > > > integrator's decision to split the 16 GB address range into 
> > > > > > > > sub-ranges
> > > > > > > > and define which larbs those sub-ranges are shared with?
> > > > > > >
> > > > > > > The problem is that we can't split the 16GB range with the larb 
> > > > > > > as unit.
> > > > > > > The example is the below ccu0(larb13 port9/10) is a independent
> > > > > > > range(domain), the others ports in larb13 is in another domain.
> > > > > > >
> > > > > > > disp/vcodec/cam/mdp don't have special iova requirement, they 
> > > > > > > could
> > > > > > > access any range. vcodec also can locate 8G~12G. it don't care 
> > > > > > > about
> > > > > > > where its iova locate. here I preassign like this following with 
> > > > > > > our
> > > > > > > internal project setting.
> > > > > >
> > > > > > Let me try to understand this a bit more. Given the split you're
> > > > > > proposing, is there actually any isolation enforced between 
> > > > > > particular
> > > > > > domains? For example, if I program vcodec to with a DMA address from
> > > > > > the 0-4G range, would the IOMMU actually generate a fault, even if
> > > > > > disp had some memory mapped at that address?
> > > > >
> > > > > In this case. we will get fault in current SW setting.
> > > > >
> > > >
> > > > Okay, thanks.
> > > >
> > > > > >
> > > > > > >
> > > > > > > Why set this in DT?, this is only for simplifying the code. 
> > > > > > > Assume we
> > > > > > > put it in the platform data. We have up to 32 larbs, each larb 
> > > > > > > has up to
> > > > > > > 32 ports, each port may be in different iommu domains. we should 
> > > > > > > have a
> > > > > > > big array for this..however we only use a macro to get the domain 
> > > > > > > in the
> > > > > > > DT method.
> > > > > > >
> > > > > > > When replying this mail, I happen to see there is a 
> > > > > > > "dev->dev_range_map"
> > > > > > > which has "dma-range" information, I think I could use this value 
> > > > > > > to get
> > > > > > > which domain the device belong to. then no need put domid in DT. 
> > > > > > > I will
> > > > > > > test this.
> > > > > >
> > > > > > My feeling is that the only part that needs to be enforced 
> > > > > > statically
> > > > > > is the reserved IOVA range for CCUs. The other ranges should be
> > > > > > determined dynamically, although I think I need to understand better
> > > > > > how the hardware and your proposed design work to tell what would be
> > > > > > likely the best choice here.
> > > > >
> > > > > I have removed the domid patch in v6. and get the domain id in [27/33]
> > > > > in 

Re: [PATCH RFC 0/9] Possible set of VT-d optimizations

2021-01-29 Thread Lu Baolu

On 2021/1/28 22:52, Chuck Lever wrote:




On Jan 28, 2021, at 8:59 AM, Robin Murphy  wrote:

On 2021-01-27 20:00, Chuck Lever wrote:

Hi-
This collection of patches seems to get the best throughput results
so far. The NFS WRITE result is fully restored, and the NFS READ
result is very close to fully restored.
Children see throughput for 12 initial writers  = 5008474.03 kB/sec
Parent sees throughput for 12 initial writers   = 4996927.80 kB/sec
Min throughput per process  = 416956.88 kB/sec
Max throughput per process  = 417910.22 kB/sec
Avg throughput per process  = 417372.84 kB/sec
Min xfer= 1046272.00 kB
CPU Utilization: Wall time 2.515, CPU time 1.996, CPU utilization 79.37 %
Children see throughput for 12 rewriters= 5020584.59 kB/sec
Parent sees throughput for 12 rewriters = 5012539.29 kB/sec
Min throughput per process  = 417799.00 kB/sec
Max throughput per process  = 419082.22 kB/sec
Avg throughput per process  = 418382.05 kB/sec
Min xfer= 1046528.00 kB
CPU utilization: Wall time 2.507, CPU time 2.024, CPU utilization 80.73 %
Children see throughput for 12 readers  = 5805484.25 kB/sec
Parent sees throughput for 12 readers   = 5799535.68 kB/sec
Min throughput per process  = 482888.16 kB/sec
Max throughput per process  = 48.16 kB/sec
Avg throughput per process  = 483790.35 kB/sec
Min xfer= 1045760.00 kB
CPU utilization: Wall time 2.167, CPU time 1.964, CPU utilization 90.63 %
Children see throughput for 12 re-readers   = 5812227.16 kB/sec
Parent sees throughput for 12 re-readers= 5803793.06 kB/sec
Min throughput per process  = 483242.97 kB/sec
Max throughput per process  = 485724.41 kB/sec
Avg throughput per process  = 484352.26 kB/sec
Min xfer= 1043456.00 kB
CPU utilization: Wall time 2.161, CPU time 1.976, CPU utilization 91.45 %
I've included a simple-minded implementation of a map_sg op for
the Intel IOMMU. This is nothing more than a copy of the loop in
__iommu_map_sg() with the call to __iommu_map() replaced with a
call to intel_iommu_map().


...which is the main reason I continue to strongly dislike patches #4-#9 (#3 definitely 
seems to makes sense either way, now that #1 and #2 are going to land). If a common 
operation is worth optimising anywhere, then it deserves optimising everywhere, so we end 
up with a dozen diverging copies of essentially the same code - particularly when the 
driver-specific functionality *is* already in the drivers, so what gets duplicated is 
solely the "generic" parts.


I don't disagree with that assessment, but I don't immediately see an
alternative API arrangement that would be more successful in the short
term. If 4/9 - 9/9 are not acceptable, then the responsible thing to
do would be to revert:

  - 58a8bb39490d ("iommu/vt-d: Cleanup after converting to dma-iommu ops")
  - c588072bba6b ("iommu/vt-d: Convert intel iommu driver to the iommu ops")

for v5.11, work out the proper API design, and then try the VT-d conversion
again.


Probably we could introduce an iommu_ops->map/unmap_range() callback and
let the iommu driver select either a "one call per page" map() or simply
mapping a range in a loop.

Best regards,
baolu



IMHO.



And if there's justification for pushing iommu_map_sg() entirely into drivers, then it's 
verging on self-contradictory not to do the same for iommu_map() and iommu_unmap(). Some 
IOMMU drivers - mainly intel-iommu, as it happens - are already implementing hacks around 
the "one call per page" interface being inherently inefficient, so the logical 
thing to do here is take a step back and reconsider the fundamental design of the whole 
map/unmap interface. Implementing hacks on top of hacks to make particular things faster 
on particular systems that particular people care about is not going to do us any favours 
in the long run.

As it stands, I can easily see a weird anti-pattern emerging where people start 
adding code to fake up scatterlists in random drivers because they see 
dma_map_sg() performing paradoxically better than dma_map_page().

Robin.


---
Chuck Lever (1):
   iommu/vt-d: Introduce map_sg() for Intel IOMMUs
Isaac J. Manjarres (5):
   iommu/io-pgtable: Introduce map_sg() as a page table op
   iommu/io-pgtable-arm: Hook up map_sg()
   iommu/io-pgtable-arm-v7s: Hook up map_sg()
   iommu: Introduce 
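The map_sg() shape being debated here - one driver call per contiguous scatterlist entry rather than one generic call per page - can be sketched abstractly. This is a toy model of the loop, not the kernel interface:

```python
def map_sg(iommu_map, iova, sg):
    """Map a scatterlist with one call per contiguous entry.

    iommu_map(iova, phys, size) stands in for the driver's mapping
    primitive; sg is a list of (phys, size) entries.
    Returns the total number of bytes mapped.
    """
    mapped = 0
    for phys, size in sg:
        iommu_map(iova + mapped, phys, size)  # one call per entry, not per page
        mapped += size
    return mapped
```

With 4 KiB pages, a single 2 MiB entry costs one call here versus 512 with a strict per-page interface - the constant-factor saving the thread measures, and also the loop that would end up duplicated in every driver, which is the objection raised above.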

RE: [RFC PATCH v2] uacce: Add uacce_ctrl misc device

2021-01-29 Thread Tian, Kevin
> From: Song Bao Hua (Barry Song)
> Sent: Tuesday, January 26, 2021 9:27 AM
> 
> > -Original Message-
> > From: Jason Gunthorpe [mailto:j...@ziepe.ca]
> > Sent: Tuesday, January 26, 2021 2:13 PM
> > To: Song Bao Hua (Barry Song) 
> > Cc: Wangzhou (B) ; Greg Kroah-Hartman
> > ; Arnd Bergmann ;
> Zhangfei Gao
> > ; linux-accelerat...@lists.ozlabs.org;
> > linux-ker...@vger.kernel.org; iommu@lists.linux-foundation.org;
> > linux...@kvack.org; Liguozhu (Kenneth) ;
> chensihang
> > (A) 
> > Subject: Re: [RFC PATCH v2] uacce: Add uacce_ctrl misc device
> >
> > On Mon, Jan 25, 2021 at 11:35:22PM +, Song Bao Hua (Barry Song)
> wrote:
> >
> > > > On Mon, Jan 25, 2021 at 10:21:14PM +, Song Bao Hua (Barry Song)
> wrote:
> > > > > mlock, while certainly be able to prevent swapping out, it won't
> > > > > be able to stop page moving due to:
> > > > > * memory compaction in alloc_pages()
> > > > > * making huge pages
> > > > > * numa balance
> > > > > * memory compaction in CMA
> > > >
> > > > Enabling those things is a major reason to have SVA device in the
> > > > first place, providing a SW API to turn it all off seems like the
> > > > wrong direction.
> > >
> > > I wouldn't say this is a major reason to have SVA. If we read the
> > > history of SVA and papers, people would think easy programming due
> > > to data struct sharing between cpu and device, and process space
> > > isolation in device would be the major reasons for SVA. SVA also
> > > declares it supports zero-copy while zero-copy doesn't necessarily
> > > depend on SVA.
> >
> > Once you have to explicitly make system calls to declare memory under
> > IO, you lose all of that.
> >
> > Since you've asked the app to be explicit about the DMAs it intends to
> > do, there is not really much reason to use SVA for those DMAs anymore.
> 
> Let's see a non-SVA case. We are not using SVA, we can have
> a memory pool by hugetlb or pin, and app can allocate memory
> from this pool, and get stable I/O performance on the memory
> from the pool. But device has its separate page table which
> is not bound with this process, thus lacking the protection
> of process space isolation. Plus, CPU and device are using
> different address.
> 
> And then we move to SVA case, we can still have a memory pool
> by hugetlb or pin, and app can allocate memory from this pool
> since this pool is mapped to the address space of the process,
> and we are able to get stable I/O performance since it is always
> there. But in this case, device is using the page table of
> process with the full permission control.
> And they are using same address and can possibly enjoy the easy
> programming if HW supports.
> 
> SVA is not doom to work with IO page fault only. If we have SVA+pin,
> we would get both sharing address and stable I/O latency.
> 

Isn't it like a traditional MAP_DMA API (implying pinning) plus specifying
the cpu_va of the memory pool as the iova?

Thanks
Kevin


RE: [RFC PATCH v2] uacce: Add uacce_ctrl misc device

2021-01-29 Thread Song Bao Hua (Barry Song)



> -Original Message-
> From: Tian, Kevin [mailto:kevin.t...@intel.com]
> Sent: Friday, January 29, 2021 11:09 PM
> To: Song Bao Hua (Barry Song) ; Jason Gunthorpe
> 
> Cc: chensihang (A) ; Arnd Bergmann
> ; Greg Kroah-Hartman ;
> linux-ker...@vger.kernel.org; iommu@lists.linux-foundation.org;
> linux...@kvack.org; Zhangfei Gao ; Liguozhu
> (Kenneth) ; linux-accelerat...@lists.ozlabs.org
> Subject: RE: [RFC PATCH v2] uacce: Add uacce_ctrl misc device
> 
> > From: Song Bao Hua (Barry Song)
> > Sent: Tuesday, January 26, 2021 9:27 AM
> >
> > > -Original Message-
> > > From: Jason Gunthorpe [mailto:j...@ziepe.ca]
> > > Sent: Tuesday, January 26, 2021 2:13 PM
> > > To: Song Bao Hua (Barry Song) 
> > > Cc: Wangzhou (B) ; Greg Kroah-Hartman
> > > ; Arnd Bergmann ;
> > Zhangfei Gao
> > > ; linux-accelerat...@lists.ozlabs.org;
> > > linux-ker...@vger.kernel.org; iommu@lists.linux-foundation.org;
> > > linux...@kvack.org; Liguozhu (Kenneth) ;
> > chensihang
> > > (A) 
> > > Subject: Re: [RFC PATCH v2] uacce: Add uacce_ctrl misc device
> > >
> > > On Mon, Jan 25, 2021 at 11:35:22PM +, Song Bao Hua (Barry Song)
> > wrote:
> > >
> > > > > On Mon, Jan 25, 2021 at 10:21:14PM +, Song Bao Hua (Barry Song)
> > wrote:
> > > > > > mlock, while certainly be able to prevent swapping out, it won't
> > > > > > be able to stop page moving due to:
> > > > > > * memory compaction in alloc_pages()
> > > > > > * making huge pages
> > > > > > * numa balance
> > > > > > * memory compaction in CMA
> > > > >
> > > > > Enabling those things is a major reason to have SVA device in the
> > > > > first place, providing a SW API to turn it all off seems like the
> > > > > wrong direction.
> > > >
> > > > I wouldn't say this is a major reason to have SVA. If we read the
> > > > history of SVA and papers, people would think easy programming due
> > > > to data struct sharing between cpu and device, and process space
> > > > isolation in device would be the major reasons for SVA. SVA also
> > > > declares it supports zero-copy while zero-copy doesn't necessarily
> > > > depend on SVA.
> > >
> > > Once you have to explicitly make system calls to declare memory under
> > > IO, you loose all of that.
> > >
> > > Since you've asked the app to be explicit about the DMAs it intends to
> > > do, there is not really much reason to use SVA for those DMAs anymore.
> >
> > Let's see a non-SVA case. We are not using SVA, we can have
> > a memory pool by hugetlb or pin, and app can allocate memory
> > from this pool, and get stable I/O performance on the memory
> > from the pool. But device has its separate page table which
> > is not bound with this process, thus lacking the protection
> > of process space isolation. Plus, CPU and device are using
> > different address.
> >
> > And then we move to SVA case, we can still have a memory pool
> > by hugetlb or pin, and app can allocate memory from this pool
> > since this pool is mapped to the address space of the process,
> > and we are able to get stable I/O performance since it is always
> > there. But in this case, device is using the page table of
> > process with the full permission control.
> > And they are using same address and can possibly enjoy the easy
> > programming if HW supports.
> >
> > SVA is not doomed to work with IO page fault only. If we have SVA+pin,
> > we would get both sharing address and stable I/O latency.
> >
> 
> Isn't it like a traditional MAP_DMA API (which implies pinning) plus specifying
> the cpu_va of the memory pool as the iova?

I think it enjoys the stable I/O latency of a traditional MAP_DMA, and
also uses the process page table which SVA provides. The major
difference is that in the SVA case, the iova belongs entirely to the
process and is as ordinary as any other heap/stack/data address:
p = mmap(.MAP_ANON);
ioctl(/dev/acc, p, PIN);

SVA by itself provides the guarantee of address space isolation
between processes. If the device can directly access data structures
such as lists and trees, applications can further enjoy the
programming convenience SVA gives.

So we are looking for a combination of the stable I/O latency of a
traditional DMA map and the capabilities of SVA.
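
Spelled out slightly, the SVA+pin flow sketched above might look like the
following pseudocode. Note that /dev/acc, the ACC_PIN ioctl and its argument
struct are hypothetical placeholders from this discussion, not an existing
uacce interface:

```
/* pseudocode: hypothetical SVA + pin flow */
fd = open("/dev/acc", O_RDWR);    /* device bound to this process's page table (SVA) */
p = mmap(NULL, POOL_SZ, PROT_READ | PROT_WRITE,
         MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
ioctl(fd, ACC_PIN, &(struct acc_pin){ .addr = p, .size = POOL_SZ });
/* pages of [p, p + POOL_SZ) are now pinned: stable I/O latency, no IOPF,
 * and the device uses exactly the same virtual address p as the CPU */
```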

> 
> Thanks
> Kevin

Thanks
Barry



Re: [PATCH 1/1] Revert "iommu/iova: Retry from last rb tree node if iova search fails"

2021-01-29 Thread Robin Murphy

On 2021-01-29 09:48, Leizhen (ThunderTown) wrote:


We are still working out a proper solution to the problem, but since the
v5.11 release is approaching, this patch is being sent first.


However, that commit was made for a reason - how do we justify that one 
thing being slow is more important than another thing being completely 
broken? It's not practical to just keep doing the patch hokey-cokey 
based on whoever shouts loudest :(



On 2021/1/29 17:21, Zhen Lei wrote:

This reverts commit 4e89dce725213d3d0b0475211b500eda4ef4bf2f.

We find that this patch has a large impact on performance. According to
our test, IOPS decreases from 1655.6K to 893.5K, roughly half.

Hardware: 1 SAS expander with 12 SAS SSD
Command:  Only the main parameters are listed.
   fio bs=4k rw=read iodepth=128 cpus_allowed=0-127


FWIW, I'm 99% sure that what you really want is [1], but then you get to 
battle against an unknown quantity of dodgy firmware instead.


Robin.

[1] 
https://lore.kernel.org/linux-iommu/d412c292d222eb36469effd338e985f9d9e24cd6.1594207679.git.robin.mur...@arm.com/



Fixes: 4e89dce72521 ("iommu/iova: Retry from last rb tree node if iova search 
fails")
Tested-by: Xiang Chen 
Signed-off-by: Zhen Lei 
---
  drivers/iommu/iova.c | 23 ++-
  1 file changed, 6 insertions(+), 17 deletions(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index d20b8b333d30d17..f840c7207efbced 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -185,9 +185,8 @@ static int __alloc_and_insert_iova_range(struct iova_domain 
*iovad,
struct rb_node *curr, *prev;
struct iova *curr_iova;
unsigned long flags;
-   unsigned long new_pfn, retry_pfn;
+   unsigned long new_pfn;
unsigned long align_mask = ~0UL;
-   unsigned long high_pfn = limit_pfn, low_pfn = iovad->start_pfn;
  
  	if (size_aligned)

align_mask <<= fls_long(size - 1);
@@ -200,25 +199,15 @@ static int __alloc_and_insert_iova_range(struct 
iova_domain *iovad,
  
  	curr = __get_cached_rbnode(iovad, limit_pfn);

curr_iova = rb_entry(curr, struct iova, node);
-   retry_pfn = curr_iova->pfn_hi + 1;
-
-retry:
do {
-   high_pfn = min(high_pfn, curr_iova->pfn_lo);
-   new_pfn = (high_pfn - size) & align_mask;
+   limit_pfn = min(limit_pfn, curr_iova->pfn_lo);
+   new_pfn = (limit_pfn - size) & align_mask;
prev = curr;
curr = rb_prev(curr);
curr_iova = rb_entry(curr, struct iova, node);
-   } while (curr && new_pfn <= curr_iova->pfn_hi && new_pfn >= low_pfn);
-
-   if (high_pfn < size || new_pfn < low_pfn) {
-   if (low_pfn == iovad->start_pfn && retry_pfn < limit_pfn) {
-   high_pfn = limit_pfn;
-   low_pfn = retry_pfn;
-   curr = &iovad->anchor.node;
-   curr_iova = rb_entry(curr, struct iova, node);
-   goto retry;
-   }
+   } while (curr && new_pfn <= curr_iova->pfn_hi);
+
+   if (limit_pfn < size || new_pfn < iovad->start_pfn) {
iovad->max32_alloc_size = size;
goto iova32_full;
}
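
As a standalone illustration of the allocation arithmetic this revert returns
to, here is a userspace re-implementation of the size_aligned case; fls_long()
is a stand-in for the kernel helper, and the rbtree walk over existing IOVAs
is omitted:

```c
#include <assert.h>

/* stand-in for the kernel's fls_long(): 1-based index of the highest
 * set bit, or 0 if the input is 0 */
static unsigned int fls_long(unsigned long x)
{
	unsigned int r = 0;

	while (x) {
		x >>= 1;
		r++;
	}
	return r;
}

/* mirrors __alloc_and_insert_iova_range() for the size_aligned case:
 * place a block of `size` pfns at the highest size-aligned address
 * that still ends at or below limit_pfn */
static unsigned long alloc_pfn(unsigned long limit_pfn, unsigned long size)
{
	unsigned long align_mask = ~0UL;

	align_mask <<= fls_long(size - 1);
	return (limit_pfn - size) & align_mask;
}
```

For example, allocating 6 pfns below 0x100000 rounds the start down to the
8-pfn boundary 0xffff8, exactly what the `(limit_pfn - size) & align_mask`
line in the hunk above computes.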






Re: [PATCH v6 00/33] MT8192 IOMMU support

2021-01-29 Thread Tomasz Figa
Hi Yong,

On Mon, Jan 11, 2021 at 07:18:41PM +0800, Yong Wu wrote:
> This patch mainly adds support for mt8192 Multimedia IOMMU and SMI.
> 
> mt8192 also is MTK IOMMU gen2 which uses ARM Short-Descriptor translation
> table format. The M4U-SMI HW diagram is as below:
> 
>   EMI
>|
>   M4U
>|
>   ----------
>   SMI Common
>   ----------
>|
>   +---+--+--+--+---+
>   |   |  |  |   .. |   |
>   |   |  |  |  |   |
> larb0   larb1  larb2  larb4 ..  larb19   larb20
> disp0   disp1   mdp    vdec   ..   IPE      IPE
> 
> All the connections are HW fixed, SW can NOT adjust it.
> 
> Comparing with the previous SoC, this patchset mainly adds two new functions:
> a) add iova 34 bits support.
> b) add multi domains support since several HW has the special iova
> region requirement.
> 
> change note:
> v6:a) based on v5.11-rc1 and tlb v4:
>   
> https://lore.kernel.org/linux-mediatek/20210107122909.16317-1-yong...@mediatek.com/T/#t
>  
>b) Remove the "domain id" definition in the binding header file.
>   Get the domain from dev->dma_range_map.
>   After this, Change many codes flow.
>c) the patchset adds a new common file (mtk_smi-larb-port.h).
>   This version changes that name to mtk-memory-port.h, which reflects
>   its file path. This only changes the file name, no other change,
>   thus I keep all the Reviewed-by tags.
>   (another reason is that we will add some iommu ports unrelated to
>   smi-larb)
>d) Refactor the power-domain flow as suggested by Tomasz.
>e) Some other small fixes: use different oas values for different SoCs;
>   change the macro for the 34-bit iova TLB flush.
> 

Thanks for the fixes.

I still think the concept of dma-ranges is not quite right for the
problem we need to solve here, but it certainly works for the time being,
and it's possible to remove it in a follow-up patch, so I'm fine with
merging this as is.

Reviewed-by: Tomasz Figa 

I'll comment on my suggestion for a replacement for the dma-ranges that
doesn't need hardcoding arbitrary address ranges in DT in a separate
reply.

Best regards,
Tomasz


Re: [PATCH 1/1] Revert "iommu/iova: Retry from last rb tree node if iova search fails"

2021-01-29 Thread chenxiang (M)

Hi Robin,


在 2021/1/29 20:03, Robin Murphy 写道:

On 2021-01-29 09:48, Leizhen (ThunderTown) wrote:


Currently, we are thinking about the solution to the problem. 
However, because the end time of v5.11 is approaching, this patch is 
sent first.


However, that commit was made for a reason - how do we justify that 
one thing being slow is more important than another thing being 
completely broken? It's not practical to just keep doing the patch 
hokey-cokey based on whoever shouts loudest :(



On 2021/1/29 17:21, Zhen Lei wrote:

This reverts commit 4e89dce725213d3d0b0475211b500eda4ef4bf2f.

We find that this patch has a great impact on performance. According to
our test: the iops decreases from 1655.6K to 893.5K, about half.

Hardware: 1 SAS expander with 12 SAS SSD
Command:  Only the main parameters are listed.
   fio bs=4k rw=read iodepth=128 cpus_allowed=0-127


FWIW, I'm 99% sure that what you really want is [1], but then you get 
to battle against an unknown quantity of dodgy firmware instead.


Robin.

[1] 
https://lore.kernel.org/linux-iommu/d412c292d222eb36469effd338e985f9d9e24cd6.1594207679.git.robin.mur...@arm.com/


Thank you for pointing that out. I have tested it, and it solves the
performance drop issue mentioned above.
I noticed that you sent it in July 2020; do you have a plan to merge it
soon?





Fixes: 4e89dce72521 ("iommu/iova: Retry from last rb tree node if 
iova search fails")

Tested-by: Xiang Chen 
Signed-off-by: Zhen Lei 
---
  drivers/iommu/iova.c | 23 ++-
  1 file changed, 6 insertions(+), 17 deletions(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index d20b8b333d30d17..f840c7207efbced 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -185,9 +185,8 @@ static int __alloc_and_insert_iova_range(struct 
iova_domain *iovad,

  struct rb_node *curr, *prev;
  struct iova *curr_iova;
  unsigned long flags;
-unsigned long new_pfn, retry_pfn;
+unsigned long new_pfn;
  unsigned long align_mask = ~0UL;
-unsigned long high_pfn = limit_pfn, low_pfn = iovad->start_pfn;
if (size_aligned)
  align_mask <<= fls_long(size - 1);
@@ -200,25 +199,15 @@ static int 
__alloc_and_insert_iova_range(struct iova_domain *iovad,

curr = __get_cached_rbnode(iovad, limit_pfn);
  curr_iova = rb_entry(curr, struct iova, node);
-retry_pfn = curr_iova->pfn_hi + 1;
-
-retry:
  do {
-high_pfn = min(high_pfn, curr_iova->pfn_lo);
-new_pfn = (high_pfn - size) & align_mask;
+limit_pfn = min(limit_pfn, curr_iova->pfn_lo);
+new_pfn = (limit_pfn - size) & align_mask;
  prev = curr;
  curr = rb_prev(curr);
  curr_iova = rb_entry(curr, struct iova, node);
-} while (curr && new_pfn <= curr_iova->pfn_hi && new_pfn >= low_pfn);
-
-if (high_pfn < size || new_pfn < low_pfn) {
-if (low_pfn == iovad->start_pfn && retry_pfn < limit_pfn) {
-high_pfn = limit_pfn;
-low_pfn = retry_pfn;
-curr = &iovad->anchor.node;
-curr_iova = rb_entry(curr, struct iova, node);
-goto retry;
-}
+} while (curr && new_pfn <= curr_iova->pfn_hi);
+
+if (limit_pfn < size || new_pfn < iovad->start_pfn) {
  iovad->max32_alloc_size = size;
  goto iova32_full;
  }





.





Re: [PATCH v3 1/3] perf/smmuv3: Don't reserve the PMCG register spaces

2021-01-29 Thread Robin Murphy

On 2021-01-27 11:32, Zhen Lei wrote:

According to the SMMUv3 specification:
Each PMCG counter group is represented by one 4KB page (Page 0) with one
optional additional 4KB page (Page 1), both of which are at IMPLEMENTATION
DEFINED base addresses.

This means that the PMCG register spaces may be within the 64KB pages of
the SMMUv3 register space. When both the SMMU and PMCG drivers reserve
their own resources, a resource conflict occurs.

To avoid this conflict, don't reserve the PMCG regions.
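
The conflict described here is a plain address-range overlap: the PMCG's 4KB
pages may sit inside the SMMU's 64KB Page 0, so two resource reservations
collide. A minimal userspace model of the overlap test that the kernel's
resource tree effectively performs (the addresses below are illustrative
only; the kernel's implementation does considerably more):

```c
#include <assert.h>
#include <stdbool.h>

/* like struct resource: start and end are both inclusive */
struct res {
	unsigned long start, end;
};

/* two reservations conflict iff their [start, end] ranges overlap;
 * this is the condition that makes a second, overlapping
 * request_mem_region() fail */
static bool res_conflict(struct res a, struct res b)
{
	return a.start <= b.end && b.start <= a.end;
}
```

With an SMMU Page 0 at [0x0, 0xffff] and a PMCG Page 0 implemented at
[0xe000, 0xefff] inside it, the second reservation fails; a PMCG at a
disjoint IMPLEMENTATION DEFINED address does not conflict.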


I'm still not a fan of this get_and_ioremap notion in general, 
especially when the "helper" function ends up over twice the size of all 
the code it replaces[1], but for the actual functional change here,


Reviewed-by: Robin Murphy 


Suggested-by: Robin Murphy 
Signed-off-by: Zhen Lei 
---
  drivers/perf/arm_smmuv3_pmu.c | 27 +--
  1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/drivers/perf/arm_smmuv3_pmu.c b/drivers/perf/arm_smmuv3_pmu.c
index 74474bb322c3f26..e5e505a0804fe53 100644
--- a/drivers/perf/arm_smmuv3_pmu.c
+++ b/drivers/perf/arm_smmuv3_pmu.c
@@ -761,6 +761,29 @@ static void smmu_pmu_get_acpi_options(struct smmu_pmu 
*smmu_pmu)
dev_notice(smmu_pmu->dev, "option mask 0x%x\n", smmu_pmu->options);
  }
  
+static void __iomem *

+smmu_pmu_get_and_ioremap_resource(struct platform_device *pdev,
+ unsigned int index,
+ struct resource **res)
+{
+   void __iomem *base;
+   struct resource *r;
+
+   r = platform_get_resource(pdev, IORESOURCE_MEM, index);
+   if (!r) {
+   dev_err(&pdev->dev, "invalid resource\n");
+   return ERR_PTR(-EINVAL);
+   }
+   if (res)
+   *res = r;
+
+   base = devm_ioremap(>dev, r->start, resource_size(r));
+   if (!base)
+   return ERR_PTR(-ENOMEM);
+
+   return base;
+}
+
  static int smmu_pmu_probe(struct platform_device *pdev)
  {
struct smmu_pmu *smmu_pmu;
@@ -793,7 +816,7 @@ static int smmu_pmu_probe(struct platform_device *pdev)
.capabilities   = PERF_PMU_CAP_NO_EXCLUDE,
};
  
-	smmu_pmu->reg_base = devm_platform_get_and_ioremap_resource(pdev, 0, &res_0);
+   smmu_pmu->reg_base = smmu_pmu_get_and_ioremap_resource(pdev, 0, &res_0);
if (IS_ERR(smmu_pmu->reg_base))
return PTR_ERR(smmu_pmu->reg_base);
  
@@ -801,7 +824,7 @@ static int smmu_pmu_probe(struct platform_device *pdev)
  
  	/* Determine if page 1 is present */

if (cfgr & SMMU_PMCG_CFGR_RELOC_CTRS) {
-   smmu_pmu->reloc_base = devm_platform_ioremap_resource(pdev, 1);
+   smmu_pmu->reloc_base = smmu_pmu_get_and_ioremap_resource(pdev, 1, NULL);
if (IS_ERR(smmu_pmu->reloc_base))
return PTR_ERR(smmu_pmu->reloc_base);
} else {



[1]
->8-
 drivers/perf/arm_smmuv3_pmu.c | 35 +--
 1 file changed, 9 insertions(+), 26 deletions(-)

diff --git a/drivers/perf/arm_smmuv3_pmu.c b/drivers/perf/arm_smmuv3_pmu.c
index e5e505a0804f..c9adbc7b55a1 100644
--- a/drivers/perf/arm_smmuv3_pmu.c
+++ b/drivers/perf/arm_smmuv3_pmu.c
@@ -761,33 +761,10 @@ static void smmu_pmu_get_acpi_options(struct 
smmu_pmu *smmu_pmu)

dev_notice(smmu_pmu->dev, "option mask 0x%x\n", smmu_pmu->options);
 }

-static void __iomem *
-smmu_pmu_get_and_ioremap_resource(struct platform_device *pdev,
- unsigned int index,
- struct resource **res)
-{
-   void __iomem *base;
-   struct resource *r;
-
-   r = platform_get_resource(pdev, IORESOURCE_MEM, index);
-   if (!r) {
-   dev_err(&pdev->dev, "invalid resource\n");
-   return ERR_PTR(-EINVAL);
-   }
-   if (res)
-   *res = r;
-
-   base = devm_ioremap(>dev, r->start, resource_size(r));
-   if (!base)
-   return ERR_PTR(-ENOMEM);
-
-   return base;
-}
-
 static int smmu_pmu_probe(struct platform_device *pdev)
 {
struct smmu_pmu *smmu_pmu;
-   struct resource *res_0;
+   struct resource *res_0, *res_1;
u32 cfgr, reg_size;
u64 ceid_64[2];
int irq, err;
@@ -816,7 +793,10 @@ static int smmu_pmu_probe(struct platform_device *pdev)
.capabilities   = PERF_PMU_CAP_NO_EXCLUDE,
};

-   smmu_pmu->reg_base = smmu_pmu_get_and_ioremap_resource(pdev, 0, &res_0);
+   res_0 = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+   if (!res_0)
+   return -EINVAL;
+	smmu_pmu->reg_base = devm_ioremap(dev, res_0->start, resource_size(res_0));

if (IS_ERR(smmu_pmu->reg_base))
return PTR_ERR(smmu_pmu->reg_base);

@@ -824,7 +804,10 @@ static int smmu_pmu_probe(struct platform_device *pdev)

/* Determine if page 1 is present */
if (cfgr & SMMU_PMCG_CFGR_RELOC_CTRS) {
-   smmu_pmu->reloc_base = 

Re: [PATCH v3 3/3] iommu/arm-smmu-v3: Reserving the entire SMMU register space

2021-01-29 Thread Robin Murphy

On 2021-01-27 11:32, Zhen Lei wrote:

commit 52f3fab0067d ("iommu/arm-smmu-v3: Don't reserve implementation
defined register space") only reserves the basic SMMU register space. So
the ECMDQ register space is not covered and must be mapped separately. The
size of the ECMDQ resource is not fixed; it depends on
SMMU_IDR6.CMDQ_CONTROL_PAGE_LOG2NUMQ, so handling its resource reservation
to avoid conflicts with the PMCG is a bit more complicated.

Therefore, the resources of the PMCG are not reserved, and the entire SMMU
resources are reserved.


This is the opposite of what I suggested. My point was that it will make 
the most sense to map the ECMDQ pages as a separate request anyway, 
therefore there is no reason to stop mapping page 0 and page 1 
separately either.


If we need to expand the page 0 mapping to cover more of page 0 to reach 
the SMMU_CMDQ_CONTROL_PAGE_* registers, we can do that when we actually 
add ECMDQ support.


Robin.


Suggested-by: Robin Murphy 
Signed-off-by: Zhen Lei 
---
  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 24 
  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  2 --
  2 files changed, 4 insertions(+), 22 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index f04c55a7503c790..fde24403b06a9e3 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3476,14 +3476,6 @@ static int arm_smmu_set_bus_ops(struct iommu_ops *ops)
return err;
  }
  
-static void __iomem *arm_smmu_ioremap(struct device *dev, resource_size_t start,

- resource_size_t size)
-{
-   struct resource res = DEFINE_RES_MEM(start, size);
-
-   return devm_ioremap_resource(dev, &res);
-}
-
  static int arm_smmu_device_probe(struct platform_device *pdev)
  {
int irq, ret;
@@ -3519,22 +3511,14 @@ static int arm_smmu_device_probe(struct platform_device 
*pdev)
}
ioaddr = res->start;
  
-	/*

-* Don't map the IMPLEMENTATION DEFINED regions, since they may contain
-* the PMCG registers which are reserved by the PMU driver.
-*/
-   smmu->base = arm_smmu_ioremap(dev, ioaddr, ARM_SMMU_REG_SZ);
+   smmu->base = devm_ioremap_resource(dev, res);
if (IS_ERR(smmu->base))
return PTR_ERR(smmu->base);
  
-	if (arm_smmu_resource_size(smmu) > SZ_64K) {

-   smmu->page1 = arm_smmu_ioremap(dev, ioaddr + SZ_64K,
-  ARM_SMMU_REG_SZ);
-   if (IS_ERR(smmu->page1))
-   return PTR_ERR(smmu->page1);
-   } else {
+   if (arm_smmu_resource_size(smmu) > SZ_64K)
+   smmu->page1 = smmu->base + SZ_64K;
+   else
smmu->page1 = smmu->base;
-   }
  
  	/* Interrupt lines */
  
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h

index da525f46dab4fc1..63f2b476987d6ae 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -152,8 +152,6 @@
   #define ARM_SMMU_PRIQ_IRQ_CFG1		0xd8
   #define ARM_SMMU_PRIQ_IRQ_CFG2		0xdc
  
-#define ARM_SMMU_REG_SZ			0xe00

-
  /* Common MSI config fields */
   #define MSI_CFG0_ADDR_MASK		GENMASK_ULL(51, 2)
  #define MSI_CFG2_SH   GENMASK(5, 4)




linux-next-20210129: drivers/iommu/intel/dmar.c

2021-01-29 Thread Randy Dunlap
on x86_64:

../drivers/iommu/intel/dmar.c: In function 'qi_submit_sync':
../drivers/iommu/intel/dmar.c:1311:3: error: implicit declaration of function 
'trace_qi_submit'; did you mean 'ftrace_nmi_exit'? 
[-Werror=implicit-function-declaration]
   trace_qi_submit(iommu, desc[i].qw0, desc[i].qw1,
   ^~~
   ftrace_nmi_exit

Full randconfig file is attached.

-- 
~Randy
Reported-by: Randy Dunlap 



config-r7511.gz
Description: application/gzip

Re: [PATCH v3 2/3] perf/smmuv3: Add a MODULE_SOFTDEP() to indicate dependency on SMMU

2021-01-29 Thread Robin Murphy

On 2021-01-29 15:34, John Garry wrote:

On 29/01/2021 15:12, Robin Murphy wrote:

On 2021-01-27 11:32, Zhen Lei wrote:
The MODULE_SOFTDEP() gives user space a hint of the loading sequence. And
when command "modprobe arm_smmuv3_pmu" is executed, the arm_smmu_v3.ko is
automatically loaded in advance.


Why do we need this? If probe order doesn't matter when both drivers 
are built-in, why should module load order?


TBH I'm not sure why we even have a Kconfig dependency on ARM_SMMU_V3, 
given that the drivers operate completely independently :/


Can that Kconfig dependency just be removed? I think that it was added 
under the idea that there is no point in having the SMMUv3 PMU driver 
without the SMMUv3 driver.


A PMCG *might* be usable for simply counting transactions to measure 
device activity regardless of its associated SMMU being enabled. Either 
way, it's not really Kconfig's job to decide what makes sense (beyond 
the top-level "can this driver *ever* be used on this platform" 
visibility choices). Imagine if we gave every PCI/USB/etc. device driver 
an explicit dependency on at least one host controller driver being 
enabled...


Robin.


Re: [PATCH v3 0/3] perf/smmuv3: Don't reserve the PMCG register spaces

2021-01-29 Thread Leizhen (ThunderTown)



On 2021/1/29 4:31, Will Deacon wrote:
> On Wed, Jan 27, 2021 at 07:32:55PM +0800, Zhen Lei wrote:
>> v2 --> v3:
>> Patch 3 is updated because https://lkml.org/lkml/2021/1/22/532 has been 
>> queued in advance.
>>
>> v1 --> v2:
>> According to Robin Murphy's suggestion: https://lkml.org/lkml/2021/1/20/470
>> Don't reserve the PMCG register spaces, and reserve the entire SMMU register 
>> space.
>>
>> v1:
>> Since the PMCG may implement its register space (4KB Page0 and 4KB Page1)
>> within the SMMUv3 64KB Page0. In this case, when the SMMUv3 driver reserves
>> the 64KB Page0 resource in advance and the PMCG driver tries to reserve its
>> Page0 and Page1 resources, a resource conflict occurs.
>>
>> commit 52f3fab0067d6fa ("iommu/arm-smmu-v3: Don't reserve implementation
>> defined register space") reduced the resource reservation range of the SMMUv3
>> driver; it only reserves the first 0xe00 bytes in the 64KB Page0, to avoid
>> the above-mentioned resource conflicts.
>>
>> But SMMUv3.3 adds support for ECMDQ, whose register space is also implemented
>> in the SMMUv3 64KB Page0. This means we need to build two separate mappings.
>> New features may be added in the future, and more independent mappings may be
>> required. The simple problem is complicated because the user expects to map
>> the entire SMMUv3 64KB Page0.
>>
>> Therefore, the proper solution is: if the PMCG register resources are located
>> in the 64KB Page0 of the SMMU, the PMCG driver does not reserve the
>> conflicting resources when the SMMUv3 driver has already reserved them.
>> Instead, the PMCG driver only performs devm_ioremap() to ensure that it can
>> work properly.
>>
>> Zhen Lei (3):
>>   perf/smmuv3: Don't reserve the PMCG register spaces
>>   perf/smmuv3: Add a MODULE_SOFTDEP() to indicate dependency on SMMU
>>   iommu/arm-smmu-v3: Reserving the entire SMMU register space
> 
> I'll need Robin's ack on these.

Hi, Robin:
  What's your opinion?
  In fact, I have written patches in which whichever of the SMMU and PMCG
drivers comes first reserves the resources, and whichever comes later does not.
However, the processing logic is relatively complex.

> 
> Will
> 
> .
> 



Re: [PATCH v3 2/3] perf/smmuv3: Add a MODULE_SOFTDEP() to indicate dependency on SMMU

2021-01-29 Thread Robin Murphy

On 2021-01-27 11:32, Zhen Lei wrote:

The MODULE_SOFTDEP() gives user space a hint of the loading sequence. And
when command "modprobe arm_smmuv3_pmu" is executed, the arm_smmu_v3.ko is
automatically loaded in advance.


Why do we need this? If probe order doesn't matter when both drivers are 
built-in, why should module load order?


TBH I'm not sure why we even have a Kconfig dependency on ARM_SMMU_V3, 
given that the drivers operate completely independently :/


Robin.


Signed-off-by: Zhen Lei 
---
  drivers/perf/arm_smmuv3_pmu.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/perf/arm_smmuv3_pmu.c b/drivers/perf/arm_smmuv3_pmu.c
index e5e505a0804fe53..9a305ac51208cd2 100644
--- a/drivers/perf/arm_smmuv3_pmu.c
+++ b/drivers/perf/arm_smmuv3_pmu.c
@@ -950,6 +950,7 @@ static void __exit arm_smmu_pmu_exit(void)
  module_exit(arm_smmu_pmu_exit);
  
  MODULE_DESCRIPTION("PMU driver for ARM SMMUv3 Performance Monitors Extension");

+MODULE_SOFTDEP("pre: arm_smmu_v3");
  MODULE_AUTHOR("Neil Leeder ");
  MODULE_AUTHOR("Shameer Kolothum ");
  MODULE_LICENSE("GPL v2");




Re: [PATCH v3 2/3] perf/smmuv3: Add a MODULE_SOFTDEP() to indicate dependency on SMMU

2021-01-29 Thread John Garry

On 29/01/2021 15:12, Robin Murphy wrote:

On 2021-01-27 11:32, Zhen Lei wrote:

The MODULE_SOFTDEP() gives user space a hint of the loading sequence. And
when command "modprobe arm_smmuv3_pmu" is executed, the arm_smmu_v3.ko is
automatically loaded in advance.


Why do we need this? If probe order doesn't matter when both drivers are 
built-in, why should module load order?


TBH I'm not sure why we even have a Kconfig dependency on ARM_SMMU_V3, 
given that the drivers operate completely independently :/


Can that Kconfig dependency just be removed? I think that it was added 
under the idea that there is no point in having the SMMUv3 PMU driver 
without the SMMUv3 driver.


However even on that basis it seems broken, as we cannot have 
ARM_SMMU_V3=m + ARM_SMMU_V3_PMU=y.
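
For background on the broken combination mentioned here: with a plain tristate
`depends on`, a =m dependency caps the dependent symbol at =m, because Kconfig
assumes built-in code may call into its dependency. A minimal sketch
(illustrative symbols only, not the actual Kconfig entries):

```
# Sketch: with FOO=m, Kconfig limits BAR to m or n, never y.
# So "depends on ARM_SMMU_V3" rules out ARM_SMMU_V3_PMU=y whenever
# ARM_SMMU_V3=m, even though the two drivers never call each other.
config BAR
	tristate "bar driver"
	depends on FOO
```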




[git pull] IOMMU Fixes for Linux v5.11-rc5

2021-01-29 Thread Joerg Roedel
Hi Linus,

The following changes since commit 6ee1d745b7c9fd573fba142a2efdad76a9f1cb04:

  Linux 5.11-rc5 (2021-01-24 16:47:14 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git 
tags/iommu-fixes-v5.11-rc5

for you to fetch changes up to 29b32839725f8c89a41cb6ee054c85f3116ea8b5:

  iommu/vt-d: Do not use flush-queue when caching-mode is on (2021-01-28 
13:59:02 +0100)


IOMMU Fixes for Linux v5.11-rc5

Including:

- AMD IOMMU Fix to make sure features are detected before they
  are queried.

- Intel IOMMU address alignment check fix for an IOLTB flushing
  command.

- Performance fix for Intel IOMMU to make sure the code does not
  do full IOTLB flushes all the time. Those flushes are very
  expensive on emulated IOMMUs.


Lu Baolu (1):
  iommu/vt-d: Correctly check addr alignment in qi_flush_dev_iotlb_pasid()

Nadav Amit (1):
  iommu/vt-d: Do not use flush-queue when caching-mode is on

Suravee Suthikulpanit (1):
  iommu/amd: Use IVHD EFR for early initialization of IOMMU features

 drivers/iommu/amd/amd_iommu.h   |  7 ++---
 drivers/iommu/amd/amd_iommu_types.h |  4 +++
 drivers/iommu/amd/init.c| 56 +++--
 drivers/iommu/intel/dmar.c  |  2 +-
 drivers/iommu/intel/iommu.c | 32 -
 5 files changed, 92 insertions(+), 9 deletions(-)

Please pull.

Thanks,

Joerg

