On Wed, 2017-08-02 at 10:02 +0100, Will Deacon wrote:
> > > So flush_tlb_range is actually weaker than smp_mb in some respects,
> > > yet the flush_tlb_pending stuff will still work correctly.
> >
> > So while I think you're right, and we could live with this, after all,
> > if we know the mm is CPU local, there shouldn't be any SMP concerns wrt
> > its page tables. Do you really want to make this more complicated?
>
> It gives us a nice performance lift on arm64 and I have a patch...[1]
We do that on powerpc too, though there are ongoing questions as to
whether an smp_mb() after setting the mask bit in switch_mm is
sufficient vs. prefetch bringing entries into the TLB after the context
is switched. But that's a powerpc-specific issue; Nick Piggin is working
on sorting that out.

Cheers,
Ben.