NOTICE: this series is based on the arm64 for-next/tlbi branch:
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-next/tlbi

--
ARMv8.4-TLBI provides TLBI invalidation instructions that apply to a
range of input addresses. This series adds support for this feature.
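
For readers unfamiliar with the range operand, here is a minimal sketch of
how one range invalidation sizes its span. It assumes the encoding from the
Arm architecture description, where a 2-bit SCALE field and a 5-bit NUM
field select (NUM + 1) * 2^(5 * SCALE + 1) pages. The helper name
tlbi_range_pages() is hypothetical and is not taken from these patches.

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of this series: number of pages covered
 * by one range invalidation, assuming the architectural encoding
 *   pages = (NUM + 1) * 2^(5 * SCALE + 1)
 * with NUM in 0..31 (5 bits) and SCALE in 0..3 (2 bits).
 */
unsigned long tlbi_range_pages(unsigned int num, unsigned int scale)
{
	assert(num <= 31);	/* NUM is a 5-bit field */
	assert(scale <= 3);	/* SCALE is a 2-bit field */

	return (unsigned long)(num + 1) << (5 * scale + 1);
}
```

Under this assumption a single operation covers between 2 pages
(NUM = 0, SCALE = 0) and 2097152 pages (NUM = 31, SCALE = 3), which would
explain why the cost of a range invalidation need not grow with the page
count.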

I tested this feature on an FPGA machine whose CPUs support the TLBI range
instructions. As the number of pages increases, performance improves
significantly. With 256 pages, it improves by more than 10 times
(145514 vs. 11089 in the data below).

Below is the test data when the stride = PTE:

        [page num]      [classic]               [tlbi range]
        1               16051                   13524
        2               11366                   11146
        3               11582                   12171
        4               11694                   11101
        5               12138                   12267
        6               12290                   11105
        7               12400                   12002
        8               12837                   11097
        9               14791                   12140
        10              15461                   11087
        16              18233                   11094
        32              26983                   11079
        64              43840                   11092
        128             77754                   11098
        256             145514                  11089
        512             280932                  11111

See more details in:

https://lore.kernel.org/linux-arm-kernel/[email protected]/

--
RFC patches:
- Link: https://lore.kernel.org/linux-arm-kernel/[email protected]/

Zhenyu Ye (2):
  arm64: tlb: Detect the ARMv8.4 TLBI RANGE feature
  arm64: tlb: Use the TLBI RANGE feature in arm64

 arch/arm64/include/asm/cpucaps.h  |   3 +-
 arch/arm64/include/asm/sysreg.h   |   3 +
 arch/arm64/include/asm/tlbflush.h | 156 ++++++++++++++++++++++++------
 arch/arm64/kernel/cpufeature.c    |  10 ++
 4 files changed, 141 insertions(+), 31 deletions(-)

-- 
2.19.1

