[PATCH RFC 2/8] iommu/arm-smmu-v3: Add and use static helper function arm_smmu_cmdq_issue_cmd_with_sync()

2021-06-26 Thread Zhen Lei
The key to the performance optimization of commit 587e6c10a7ce ("iommu/arm-smmu-v3: Reduce contention during command-queue insertion") is to allow multiple cores to insert commands in parallel after a brief mutex contention. Inserting as many commands at a time as possible can

[PATCH RFC 7/8] iommu/arm-smmu-v3: Add arm_smmu_ecmdq_issue_cmdlist() for non-shared ECMDQ

2021-06-26 Thread Zhen Lei
When a core can exclusively own an ECMDQ, competition with other cores does not need to be considered during command insertion. Therefore, we can delete the part of arm_smmu_cmdq_issue_cmdlist() that deals with multi-core contention and generate a more efficient ECMDQ-specific function

[PATCH RFC 4/8] iommu/arm-smmu-v3: Extract reusable function __arm_smmu_cmdq_skip_err()

2021-06-26 Thread Zhen Lei
When SMMU_GERROR.CMDQP_ERR is different to SMMU_GERRORN.CMDQP_ERR, it indicates that one or more errors have been encountered on a command queue control page interface. We need to traverse all ECMDQs in that control page to find all errors. For each ECMDQ error handling, it is much the same as the

[PATCH RFC 5/8] iommu/arm-smmu-v3: Add support for ECMDQ register mode

2021-06-26 Thread Zhen Lei
Ensure that each core exclusively occupies an ECMDQ and that all of them are enabled during initialization. Any error during this initialization process results in a fallback to the normal CMDQ. When GERROR is triggered by ECMDQ, all ECMDQs need to be traversed: the ECMDQs with errors will be

[PATCH RFC 8/8] iommu/arm-smmu-v3: Add support for less than one ECMDQ per core

2021-06-26 Thread Zhen Lei
Due to limited hardware resources, the number of ECMDQs may be less than the number of cores. If the number of ECMDQs is greater than the number of NUMA nodes, ensure that each node has at least one ECMDQ. This is because ECMDQ queue memory is requested from the NUMA node where it resides, which

[PATCH RFC 3/8] iommu/arm-smmu-v3: Add and use static helper function arm_smmu_get_cmdq()

2021-06-26 Thread Zhen Lei
One SMMU has only one normal CMDQ. Therefore, this CMDQ is used regardless of the core on which the command is inserted. It can be referenced directly through "smmu->cmdq". However, one SMMU has multiple ECMDQs, and the ECMDQ used by the core on which the command insertion is executed may be

[PATCH RFC 6/8] iommu/arm-smmu-v3: Ensure that a set of associated commands are inserted in the same ECMDQ

2021-06-26 Thread Zhen Lei
The SYNC command only ensures that the commands preceding it in the same ECMDQ have been executed; it cannot synchronize the commands in other ECMDQs. If an unmap involves multiple commands, some commands are executed on one core, and the other commands are executed on another core. In this

[PATCH RFC 1/8] iommu/arm-smmu-v3: Use command queue batching helpers to improve performance

2021-06-26 Thread Zhen Lei
The key to the performance optimization of commit 587e6c10a7ce ("iommu/arm-smmu-v3: Reduce contention during command-queue insertion") is to allow multiple cores to insert commands in parallel after a brief mutex contention. Inserting as many commands at a time as possible can

[PATCH RFC 0/8] iommu/arm-smmu-v3: add support for ECMDQ register mode

2021-06-26 Thread Zhen Lei
SMMU v3.3 added a new feature, the Enhanced Command Queue interface, which reduces contention when submitting commands to the SMMU. In this patch set, ECMDQ is the abbreviation of Enhanced Command Queue. When the hardware supports ECMDQ and each core can exclusively use one ECMDQ, each core

[PATCH 9/9] dma-debug: Use memory_intersects() directly

2021-06-26 Thread Kefeng Wang
Use memory_intersects() directly instead of private overlap() function. Cc: Christoph Hellwig Cc: Marek Szyprowski Cc: Robin Murphy Cc: iommu@lists.linux-foundation.org Signed-off-by: Kefeng Wang --- kernel/dma/debug.c | 14 ++ 1 file changed, 2 insertions(+), 12 deletions(-)
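
For readers unfamiliar with this kind of check, here is a minimal user-space sketch of testing whether two address ranges overlap. It is not the kernel's memory_intersects() and not the removed overlap() helper: the name range_intersects(), the integer address model, and the half-open [begin, end) convention are all assumptions made for illustration, and the kernel helper's exact boundary semantics may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API): report whether the region
 * [addr, addr + size) shares at least one byte with [begin, end).
 * Addresses are modelled as plain integers; both ranges are half-open.
 */
static bool range_intersects(uintptr_t begin, uintptr_t end,
			     uintptr_t addr, size_t size)
{
	uintptr_t addr_end = addr + size;

	assert(begin <= end);     /* caller must pass an ordered range */
	assert(addr_end >= addr); /* addr + size must not wrap around */

	/* An empty range cannot share a byte with anything. */
	if (size == 0 || begin == end)
		return false;

	/* Two non-empty half-open ranges overlap iff each starts before the other ends. */
	return addr < end && begin < addr_end;
}
```

Under these assumptions, range_intersects(100, 200, 150, 10) is true, while range_intersects(100, 200, 200, 10) is false because the second region starts exactly where the first one ends.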

[PATCH 0/9] sections: Unify kernel sections range check and use

2021-06-26 Thread Kefeng Wang
There are three header files (kallsyms.h, kernel.h and sections.h) which include the kernel sections range check; let's clean them up and unify them. 1. cleanup arch specific text/data check and fix address boundary check in kallsyms.h 2. make all the basic kernel range check function into