On 04/11/16 09:55, Nipun Gupta wrote:
> The SMTNMB_TLBEN bit in the Auxiliary Configuration Register (ACR) provides
> an option to enable TLB updates for transactions that bypass the SMMU
> because no entry in the stream match table matches. This reduces the
> latency of subsequent transactions with the same stream ID that bypass
> the SMMU, which provides a significant performance benefit for certain
> networking workloads.
>
> With this change, a performance improvement of ~9% is observed with the
> DPDK l3fwd application
> (http://dpdk.org/doc/guides/sample_app_ug/l3_forward.html)
> on NXP's LS2088a platform.

Reviewed-by: Robin Murphy <[email protected]>

> Signed-off-by: Nipun Gupta <[email protected]>
> ---
> Changes for v2:
>   - Incorporated Robin's comments on v1 related to
>       Setting SMTNMB_TLBEN in ACR only for MMU-500 as ACR is implementation
>       dependent
>       Code comments and Naming convention
> Changes for v3:
>   - Added correct patch version
>
>  drivers/iommu/arm-smmu.c | 25 ++++++++++++++++---------
>  1 file changed, 16 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index ce2a9d4..05901be 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -247,6 +247,7 @@ enum arm_smmu_s2cr_privcfg {
>  #define ARM_MMU500_ACTLR_CPRE		(1 << 1)
>
>  #define ARM_MMU500_ACR_CACHE_LOCK	(1 << 26)
> +#define ARM_MMU500_ACR_SMTNMB_TLBEN	(1 << 8)
>
>  #define CB_PAR_F			(1 << 0)
>
> @@ -1569,16 +1570,22 @@ static void arm_smmu_device_reset(struct arm_smmu_device *smmu)
>  	for (i = 0; i < smmu->num_mapping_groups; ++i)
>  		arm_smmu_write_sme(smmu, i);
>
> -	/*
> -	 * Before clearing ARM_MMU500_ACTLR_CPRE, need to
> -	 * clear CACHE_LOCK bit of ACR first. And, CACHE_LOCK
> -	 * bit is only present in MMU-500r2 onwards.
> -	 */
> -	reg = readl_relaxed(gr0_base + ARM_SMMU_GR0_ID7);
> -	major = (reg >> ID7_MAJOR_SHIFT) & ID7_MAJOR_MASK;
> -	if ((smmu->model == ARM_MMU500) && (major >= 2)) {
> +	if (smmu->model == ARM_MMU500) {
> +		/*
> +		 * Before clearing ARM_MMU500_ACTLR_CPRE, need to
> +		 * clear CACHE_LOCK bit of ACR first. And, CACHE_LOCK
> +		 * bit is only present in MMU-500r2 onwards.
> +		 */
> +		reg = readl_relaxed(gr0_base + ARM_SMMU_GR0_ID7);
> +		major = (reg >> ID7_MAJOR_SHIFT) & ID7_MAJOR_MASK;
>  		reg = readl_relaxed(gr0_base + ARM_SMMU_GR0_sACR);
> -		reg &= ~ARM_MMU500_ACR_CACHE_LOCK;
> +		if (major >= 2)
> +			reg &= ~ARM_MMU500_ACR_CACHE_LOCK;
> +		/*
> +		 * Allow unmatched Stream IDs to allocate bypass
> +		 * TLB entries for reduced latency.
> +		 */
> +		reg |= ARM_MMU500_ACR_SMTNMB_TLBEN;
>  		writel_relaxed(reg, gr0_base + ARM_SMMU_GR0_sACR);
>  	}
>
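
For anyone following the second hunk, the net effect on sACR can be sketched as a pure function of the value read back and the major revision from ID7. This is only an illustration under stated assumptions: the helper name mmu500_acr_reset_value() is hypothetical and does not exist in the driver, and the bit positions are copied from the defines in the patch. The real code does the read-modify-write in place with readl_relaxed()/writel_relaxed().

```c
#include <stdint.h>

/* Bit positions copied from the defines in the quoted patch. */
#define ARM_MMU500_ACR_CACHE_LOCK	(1u << 26)
#define ARM_MMU500_ACR_SMTNMB_TLBEN	(1u << 8)

/*
 * Hypothetical helper, not part of arm-smmu.c: given the value read
 * from sACR and the major revision extracted from ID7, return the
 * value the MMU-500 reset path would write back.
 */
static uint32_t mmu500_acr_reset_value(uint32_t acr, unsigned int major)
{
	/* CACHE_LOCK only exists from r2 onwards, so only clear it there. */
	if (major >= 2)
		acr &= ~ARM_MMU500_ACR_CACHE_LOCK;

	/* SMTNMB_TLBEN is set on every MMU-500 revision. */
	acr |= ARM_MMU500_ACR_SMTNMB_TLBEN;

	return acr;
}
```

The point of the restructuring is visible here: before the patch, sACR was only touched when major >= 2, whereas now the write happens for every MMU-500 and only the CACHE_LOCK clearing stays conditional on the revision.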
