Here is a bit more TLB flush work that mostly attempts to improve range flushes by reducing barriers, and by reducing the cases where we resort to flushing the entire PID.
I haven't done much benchmarking yet to settle on good numbers for the exact heuristic settings; at this stage I'm mainly interested in comments on the overall idea.

Thanks,
Nick

Nicholas Piggin (8):
  powerpc/64s/radix: Fix theoretical process table entry cache
    invalidation
  powerpc/64s/radix: tlbie improve preempt handling
  powerpc/64s/radix: optimize TLB range flush barriers
  powerpc/64s/radix: Implement _tlbie(l)_va_range flush functions
  powerpc/64s/radix: Introduce local single page ceiling for TLB range
    flush
  powerpc/64s/radix: Optimize flush_tlb_range
  powerpc/64s/radix: Improve TLB flushing for unmaps that free a page
    table
  powerpc/64s/radix: Only flush local TLB for spurious fault flushes

 .../powerpc/include/asm/book3s/64/tlbflush-radix.h |   7 +-
 arch/powerpc/include/asm/book3s/64/tlbflush.h      |  11 +
 arch/powerpc/include/asm/mmu_context.h             |   4 +
 arch/powerpc/mm/mmu_context_book3s64.c             |  23 +-
 arch/powerpc/mm/pgtable-book3s64.c                 |   5 +-
 arch/powerpc/mm/pgtable.c                          |   2 +-
 arch/powerpc/mm/tlb-radix.c                        | 263 +++++++++++++++------
 7 files changed, 234 insertions(+), 81 deletions(-)

-- 
2.13.3