Song noticed switch_mm_irqs_off taking a lot of CPU time in recent
kernels, using 1.8% of a 48 CPU system during a netperf to localhost
run. Digging into the profile, we noticed that cpumask_clear_cpu and
cpumask_set_cpu together account for about half of the CPU time spent
in switch_mm_irqs_off.

However, the CPUs running netperf end up switching back and forth
between netperf and the idle task, which does not require changes
to the mm_cpumask. Furthermore, the init_mm cpumask ends up being
the most heavily contended one in the system. Keeping it up to date
buys us nothing: kernel address space flushes are sent to every CPU
anyway, regardless of what is in mm_cpumask.

Simply skipping changes to mm_cpumask(&init_mm) reduces that overhead.
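For intuition, here is a minimal userspace sketch (not part of the
patch; thread and iteration counts are arbitrary) of the contention
pattern: cpumask_set_cpu and cpumask_clear_cpu boil down to atomic
read-modify-write operations, so every context switch on every CPU
bounces the same mm_cpumask cache line around the system:

	#include <pthread.h>
	#include <stdatomic.h>
	#include <stdio.h>

	#define NTHREADS 8
	#define ITERS (1 << 22)

	/* Stand-in for a shared cpumask, e.g. mm_cpumask(&init_mm). */
	static _Atomic unsigned long shared_mask;

	static void *hammer(void *arg)
	{
		unsigned long bit = 1UL << ((long)arg % 64);
		long i;

		for (i = 0; i < ITERS; i++) {
			/* Same atomic RMW pattern as cpumask_{set,clear}_cpu() */
			atomic_fetch_or(&shared_mask, bit);
			atomic_fetch_and(&shared_mask, ~bit);
		}
		return NULL;
	}

	int main(void)
	{
		pthread_t tid[NTHREADS];
		long t;

		for (t = 0; t < NTHREADS; t++)
			pthread_create(&tid[t], NULL, hammer, (void *)t);
		for (t = 0; t < NTHREADS; t++)
			pthread_join(tid[t], NULL);

		printf("final mask: %#lx\n", atomic_load(&shared_mask));
		return 0;
	}

Pointing all threads at per-thread words instead of shared_mask makes
the contention disappear, which is the effect the patch below gets by
not touching mm_cpumask(&init_mm) at all.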

Signed-off-by: Rik van Riel <[email protected]>
Reported-and-tested-by: Song Liu <[email protected]>
---
 arch/x86/mm/tlb.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 5a01fcb22a7e..b55e6b7df7c9 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -311,14 +311,17 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
                }
 
                /* Stop remote flushes for the previous mm */
-               VM_WARN_ON_ONCE(!cpumask_test_cpu(cpu, mm_cpumask(real_prev)) &&
-                               real_prev != &init_mm);
-               cpumask_clear_cpu(cpu, mm_cpumask(real_prev));
+               if (real_prev != &init_mm) {
+                       VM_WARN_ON_ONCE(!cpumask_test_cpu(cpu,
+                                               mm_cpumask(real_prev)));
+                       cpumask_clear_cpu(cpu, mm_cpumask(real_prev));
+               }
 
                /*
                 * Start remote flushes and then read tlb_gen.
                 */
-               cpumask_set_cpu(cpu, mm_cpumask(next));
+               if (next != &init_mm)
+                       cpumask_set_cpu(cpu, mm_cpumask(next));
                next_tlb_gen = atomic64_read(&next->context.tlb_gen);
 
                choose_new_asid(next, next_tlb_gen, &new_asid, &need_flush);
-- 
2.14.4
