On Fri, Jun 1, 2018 at 3:13 PM Rik van Riel <r...@surriel.com> wrote:
>
> On Fri, 1 Jun 2018 14:21:58 -0700
> Andy Lutomirski <l...@kernel.org> wrote:
>
> > Hmm.  I wonder if there's a more clever data structure than a bitmap
> > that we could be using here.  Each CPU only ever needs to be in one
> > mm's cpumask, and each cpu only ever changes its own state in the
> > bitmask.  And writes are much less common than reads for most
> > workloads.
>
> It would be easy enough to add an mm_struct pointer to the
> per-cpu tlbstate struct, and iterate over those.
>
> However, that would be an orthogonal change to optimizing
> lazy TLB mode.
>
> Does the (untested) patch below make sense as a potential
> improvement to the lazy TLB heuristic?
>
> ---8<---
> Subject: x86,tlb: workload dependent per CPU lazy TLB switch
>
> Lazy TLB mode is a tradeoff between flushing the TLB and touching
> the mm_cpumask(&init_mm) at context switch time, versus potentially
> incurring a remote TLB flush IPI while in lazy TLB mode.
>
> Whether this pays off is likely to be workload dependent more than
> anything else.  However, the current heuristic keys off hardware type.
>
> This patch changes the lazy TLB mode heuristic to a dynamic, per-CPU
> decision, dependent on whether we recently received a remote TLB
> shootdown while in lazy TLB mode.
>
> This is a very simple heuristic.  When a CPU receives a remote TLB
> shootdown IPI while in lazy TLB mode, a counter in the same cache
> line is set to 16.  Every time we skip lazy TLB mode, the counter
> is decremented.
>
> While the counter is zero (no recent TLB flush IPIs), allow lazy
> TLB mode.
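[For concreteness, a rough sketch of the heuristic the changelog
describes. The actual diff was trimmed above, so every name here is
hypothetical; only the behavior -- set a counter to 16 on a shootdown
IPI received while lazy, decrement it each time a lazy switch is
skipped, and allow lazy mode once it hits zero -- comes from the
changelog.]

	/*
	 * Hypothetical sketch only; NOT the actual patch. The counter
	 * would live in the same cache line as the rest of the per-CPU
	 * TLB state, as the changelog says.
	 */
	#define LAZY_TLB_BACKOFF	16

	struct lazy_tlb_state {
		unsigned int flush_backoff;
	};
	static DEFINE_PER_CPU(struct lazy_tlb_state, lazy_tlb_state);

	/* Called from the TLB shootdown IPI while this CPU is lazy. */
	static void lazy_tlb_got_flush_ipi(void)
	{
		this_cpu_write(lazy_tlb_state.flush_backoff,
			       LAZY_TLB_BACKOFF);
	}

	/* Called at context switch time: may this CPU go lazy? */
	static bool lazy_tlb_allowed(void)
	{
		unsigned int backoff;

		backoff = this_cpu_read(lazy_tlb_state.flush_backoff);
		if (backoff) {
			/* Recently flushed while lazy; skip lazy mode. */
			this_cpu_write(lazy_tlb_state.flush_backoff,
				       backoff - 1);
			return false;
		}

		/* No recent TLB flush IPIs: lazy TLB mode is allowed. */
		return true;
	}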
Hmm, cute.  That's not a bad idea at all.  It would be nice to get some
kind of real benchmark on both PCID and !PCID.  If nothing else, I would
expect the threshold (16 in your patch) to want to be lower on PCID
systems.

--Andy