On 03/10/16 17:40, Anton Blanchard wrote:
> From: Anton Blanchard <an...@samba.org>
>
> During context switch, switch_mm() sets our current CPU in mm_cpumask.
> We can avoid this atomic sequence in most cases by checking before
> setting the bit.
>
> Testing on a POWER8 using our context switch microbenchmark:
>
>   tools/testing/selftests/powerpc/benchmarks/context_switch \
>     --process --no-fp --no-altivec --no-vector
>
> Performance improves 2%.
>
> Signed-off-by: Anton Blanchard <an...@samba.org>
> ---
>  arch/powerpc/include/asm/mmu_context.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
> index 475d1be..5c45114 100644
> --- a/arch/powerpc/include/asm/mmu_context.h
> +++ b/arch/powerpc/include/asm/mmu_context.h
> @@ -72,7 +72,8 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
>  			     struct task_struct *tsk)
>  {
>  	/* Mark this context has been used on the new CPU */
> -	cpumask_set_cpu(smp_processor_id(), mm_cpumask(next));
> +	if (!cpumask_test_cpu(smp_processor_id(), mm_cpumask(next)))
> +		cpumask_set_cpu(smp_processor_id(), mm_cpumask(next));
>
I think this makes sense. In fact, I think in the longer term we can even
use the reorderable __set_bit() version, since we have a sync coming out
of schedule(). The read side for the TLB flush can use an rmb() (rough
sketch below).

Acked-by: Balbir Singh <bsinghar...@gmail.com>

Balbir Singh.
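
For illustration only, roughly what I mean by the reorderable version.
This is just an untested sketch, not the applied patch, and it assumes
the full barrier (sync) in schedule() provides the ordering:

	/*
	 * Sketch: use the non-atomic __set_bit() instead of the
	 * atomic cpumask_set_cpu(), relying on the barrier coming
	 * out of schedule() for ordering.  The TLB flush read side
	 * would pair this with a read barrier (rmb()).
	 */
	if (!cpumask_test_cpu(smp_processor_id(), mm_cpumask(next)))
		__set_bit(smp_processor_id(),
			  cpumask_bits(mm_cpumask(next)));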