Re: [PATCH v5 2/2] x86/mm: Improve TLB flush documentation
On Mon, Jul 24, 2017 at 09:41:39PM -0700, Andy Lutomirski wrote:

> +	/*
> +	 * Resume remote flushes and then read tlb_gen.  The
> +	 * implied barrier in atomic64_read() synchronizes

There is no barrier in atomic64_read().

> +	 * with inc_mm_tlb_gen() like this:
> +	 *
> +	 * switch_mm_irqs_off():		flush request:
> +	 *   cpumask_set_cpu(...);		  inc_mm_tlb_gen();
> +	 *   MB					  MB
> +	 *   atomic64_read(.tlb_gen);		flush_tlb_others(mm_cpumask());
> +	 */
>  	cpumask_set_cpu(cpu, mm_cpumask(next));
>  	next_tlb_gen = atomic64_read(&next->context.tlb_gen);
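
For reference, atomic64_read() on x86-64 at the time was just a plain
load, roughly (simplified from arch/x86/include/asm/atomic64_64.h; the
exact code in any given tree may differ):

	static inline long atomic64_read(const atomic64_t *v)
	{
		/*
		 * A volatile load: no LOCK prefix, no fence, and
		 * therefore no implied memory barrier of any kind.
		 */
		return READ_ONCE((v)->counter);
	}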
Re: [PATCH v5 2/2] x86/mm: Improve TLB flush documentation
On Mon, Jul 24, 2017 at 9:47 PM, Nadav Amit wrote:
> Andy Lutomirski wrote:
>
>> Improve comments as requested by PeterZ and also add some
>> documentation at the top of the file.
>>
>> Signed-off-by: Andy Lutomirski
>> ---
>>  arch/x86/mm/tlb.c | 43 +++++++++++++++++++++++++++++++++----------
>>  1 file changed, 33 insertions(+), 10 deletions(-)
>>
>> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
>> index ce104b962a17..d4ee781ca656 100644
>> --- a/arch/x86/mm/tlb.c
>> +++ b/arch/x86/mm/tlb.c
>> @@ -15,17 +15,24 @@
>>  #include
>>
>>  /*
>> - * TLB flushing, formerly SMP-only
>> - *	c/o Linus Torvalds.
>> + * The code in this file handles mm switches and TLB flushes.
>>   *
>> - * These mean you can really definitely utterly forget about
>> - * writing to user space from interrupts. (Its not allowed anyway).
>> + * An mm's TLB state is logically represented by a totally ordered sequence
>> + * of TLB flushes.  Each flush increments the mm's tlb_gen.
>>   *
>> - * Optimizations Manfred Spraul
>> + * Each CPU that might have an mm in its TLB (and that might ever use
>> + * those TLB entries) will have an entry for it in its cpu_tlbstate.ctxs
>> + * array.  The kernel maintains the following invariant: for each CPU and
>> + * for each mm in its cpu_tlbstate.ctxs array, the CPU has performed all
>> + * flushes in that mm's history up to the tlb_gen in cpu_tlbstate.ctxs
>> + * or the CPU has performed an equivalent set of flushes.
>>   *
>> - * More scalable flush, from Andi Kleen
>> - *
>> - * Implement flush IPI by CALL_FUNCTION_VECTOR, Alex Shi
>> + * For this purpose, an equivalent set is a set that is at least as strong.
>> + * So, for example, if the flush history is a full flush at time 1,
>> + * a full flush after time 1 is sufficient, but a full flush before time 1
>> + * is not.  Similarly, any number of flushes can be replaced by a single
>> + * full flush so long as that replacement flush is after all the flushes
>> + * that it's replacing.
>>   */
>>
>>  atomic64_t last_mm_ctx_id = ATOMIC64_INIT(1);
>>
>> @@ -138,7 +145,16 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
>>  		return;
>>  	}
>>
>> -	/* Resume remote flushes and then read tlb_gen. */
>> +	/*
>> +	 * Resume remote flushes and then read tlb_gen.  The
>> +	 * implied barrier in atomic64_read() synchronizes
>> +	 * with inc_mm_tlb_gen() like this:
>
> You mean the implied memory barrier in cpumask_set_cpu(), no?

Ugh, yes.  And I misread PeterZ's email and incorrectly removed the
smp_mb__after_atomic().  I'll respin this patch.
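
A condensed sketch of the ordering the respin would presumably restore
(illustrative, not the literal v6 code; &info stands in for whatever
flush_tlb_others() takes in this tree):

	/* switch_mm_irqs_off() side: */
	cpumask_set_cpu(cpu, mm_cpumask(next));
	/*
	 * cpumask_set_cpu() is a non-value-returning atomic RMW, which
	 * the generic memory model leaves unordered, so the explicit
	 * barrier below is needed.  On x86 it compiles down to
	 * barrier(), since LOCK'd instructions already order everything.
	 */
	smp_mb__after_atomic();
	next_tlb_gen = atomic64_read(&next->context.tlb_gen);

	/*
	 * Flush-request side: inc_mm_tlb_gen() is atomic64_inc_return(),
	 * a value-returning atomic and therefore a full barrier, so the
	 * tlb_gen bump is ordered before reading mm_cpumask().
	 */
	inc_mm_tlb_gen(mm);
	flush_tlb_others(mm_cpumask(mm), &info);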
Re: [PATCH v5 2/2] x86/mm: Improve TLB flush documentation
Andy Lutomirski wrote:

> Improve comments as requested by PeterZ and also add some
> documentation at the top of the file.
>
> Signed-off-by: Andy Lutomirski
> ---
>  arch/x86/mm/tlb.c | 43 +++++++++++++++++++++++++++++++++----------
>  1 file changed, 33 insertions(+), 10 deletions(-)
>
> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> index ce104b962a17..d4ee781ca656 100644
> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c
> @@ -15,17 +15,24 @@
>  #include
>
>  /*
> - * TLB flushing, formerly SMP-only
> - *	c/o Linus Torvalds.
> + * The code in this file handles mm switches and TLB flushes.
>   *
> - * These mean you can really definitely utterly forget about
> - * writing to user space from interrupts. (Its not allowed anyway).
> + * An mm's TLB state is logically represented by a totally ordered sequence
> + * of TLB flushes.  Each flush increments the mm's tlb_gen.
>   *
> - * Optimizations Manfred Spraul
> + * Each CPU that might have an mm in its TLB (and that might ever use
> + * those TLB entries) will have an entry for it in its cpu_tlbstate.ctxs
> + * array.  The kernel maintains the following invariant: for each CPU and
> + * for each mm in its cpu_tlbstate.ctxs array, the CPU has performed all
> + * flushes in that mm's history up to the tlb_gen in cpu_tlbstate.ctxs
> + * or the CPU has performed an equivalent set of flushes.
>   *
> - * More scalable flush, from Andi Kleen
> - *
> - * Implement flush IPI by CALL_FUNCTION_VECTOR, Alex Shi
> + * For this purpose, an equivalent set is a set that is at least as strong.
> + * So, for example, if the flush history is a full flush at time 1,
> + * a full flush after time 1 is sufficient, but a full flush before time 1
> + * is not.  Similarly, any number of flushes can be replaced by a single
> + * full flush so long as that replacement flush is after all the flushes
> + * that it's replacing.
>   */
>
>  atomic64_t last_mm_ctx_id = ATOMIC64_INIT(1);
>
> @@ -138,7 +145,16 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
>  		return;
>  	}
>
> -	/* Resume remote flushes and then read tlb_gen. */
> +	/*
> +	 * Resume remote flushes and then read tlb_gen.  The
> +	 * implied barrier in atomic64_read() synchronizes
> +	 * with inc_mm_tlb_gen() like this:

You mean the implied memory barrier in cpumask_set_cpu(), no?
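
For context, cpumask_set_cpu() was roughly (simplified from
include/linux/cpumask.h):

	static inline void cpumask_set_cpu(unsigned int cpu, struct cpumask *dstp)
	{
		/*
		 * set_bit() is an atomic RMW.  On x86 it is a LOCK'd
		 * instruction and therefore a full memory barrier, but
		 * the generic memory model guarantees no ordering at all
		 * for atomics that do not return a value.
		 */
		set_bit(cpu, cpumask_bits(dstp));
	}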
[PATCH v5 2/2] x86/mm: Improve TLB flush documentation
Improve comments as requested by PeterZ and also add some
documentation at the top of the file.

Signed-off-by: Andy Lutomirski
---
 arch/x86/mm/tlb.c | 43 +++++++++++++++++++++++++++++++++----------
 1 file changed, 33 insertions(+), 10 deletions(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index ce104b962a17..d4ee781ca656 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -15,17 +15,24 @@
 #include

 /*
- * TLB flushing, formerly SMP-only
- *	c/o Linus Torvalds.
+ * The code in this file handles mm switches and TLB flushes.
  *
- * These mean you can really definitely utterly forget about
- * writing to user space from interrupts. (Its not allowed anyway).
+ * An mm's TLB state is logically represented by a totally ordered sequence
+ * of TLB flushes.  Each flush increments the mm's tlb_gen.
  *
- * Optimizations Manfred Spraul
+ * Each CPU that might have an mm in its TLB (and that might ever use
+ * those TLB entries) will have an entry for it in its cpu_tlbstate.ctxs
+ * array.  The kernel maintains the following invariant: for each CPU and
+ * for each mm in its cpu_tlbstate.ctxs array, the CPU has performed all
+ * flushes in that mm's history up to the tlb_gen in cpu_tlbstate.ctxs
+ * or the CPU has performed an equivalent set of flushes.
  *
- * More scalable flush, from Andi Kleen
- *
- * Implement flush IPI by CALL_FUNCTION_VECTOR, Alex Shi
+ * For this purpose, an equivalent set is a set that is at least as strong.
+ * So, for example, if the flush history is a full flush at time 1,
+ * a full flush after time 1 is sufficient, but a full flush before time 1
+ * is not.  Similarly, any number of flushes can be replaced by a single
+ * full flush so long as that replacement flush is after all the flushes
+ * that it's replacing.
  */

 atomic64_t last_mm_ctx_id = ATOMIC64_INIT(1);

@@ -138,7 +145,16 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 		return;
 	}

-	/* Resume remote flushes and then read tlb_gen. */
+	/*
+	 * Resume remote flushes and then read tlb_gen.  The
+	 * implied barrier in atomic64_read() synchronizes
+	 * with inc_mm_tlb_gen() like this:
+	 *
+	 * switch_mm_irqs_off():		flush request:
+	 *   cpumask_set_cpu(...);		  inc_mm_tlb_gen();
+	 *   MB					  MB
+	 *   atomic64_read(.tlb_gen);		flush_tlb_others(mm_cpumask());
+	 */
 	cpumask_set_cpu(cpu, mm_cpumask(next));
 	next_tlb_gen = atomic64_read(&next->context.tlb_gen);

@@ -186,7 +202,14 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 		VM_WARN_ON_ONCE(cpumask_test_cpu(cpu, mm_cpumask(next)));

 		/*
-		 * Start remote flushes and then read tlb_gen.
+		 * Start remote flushes and then read tlb_gen.  As
+		 * above, the implied barrier in atomic64_read()
+		 * synchronizes with inc_mm_tlb_gen() like this:
+		 *
+		 * switch_mm_irqs_off():		flush request:
+		 *   cpumask_set_cpu(...);		  inc_mm_tlb_gen();
+		 *   MB					  MB
+		 *   atomic64_read(.tlb_gen);		flush_tlb_others(mm_cpumask());
 		 */
 		cpumask_set_cpu(cpu, mm_cpumask(next));
 		next_tlb_gen = atomic64_read(&next->context.tlb_gen);
-- 
2.9.4
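
A hypothetical illustration of the invariant the new header comment
states (the function name and the idx parameter are made up for this
sketch; the real logic lives in flush_tlb_func_common() in this tree):

	static void tlb_gen_catch_up(int idx, struct mm_struct *mm)
	{
		u64 mm_tlb_gen    = atomic64_read(&mm->context.tlb_gen);
		u64 local_tlb_gen = this_cpu_read(cpu_tlbstate.ctxs[idx].tlb_gen);

		if (local_tlb_gen == mm_tlb_gen)
			return;	/* every flush in this mm's history is done */

		/*
		 * A single full flush is "at least as strong" as all the
		 * flushes in (local_tlb_gen, mm_tlb_gen], so it may replace
		 * them, provided it happens after all of them -- which it
		 * does here, since mm_tlb_gen was read before flushing.
		 */
		local_flush_tlb();
		this_cpu_write(cpu_tlbstate.ctxs[idx].tlb_gen, mm_tlb_gen);
	}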