Re: [PATCH v5 2/2] x86/mm: Improve TLB flush documentation

2017-07-25 Thread Peter Zijlstra
On Mon, Jul 24, 2017 at 09:41:39PM -0700, Andy Lutomirski wrote:
> +	/*
> +	 * Resume remote flushes and then read tlb_gen.  The
> +	 * implied barrier in atomic64_read() synchronizes

There is no barrier in atomic64_read().
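
(On x86-64 it is just a plain load; paraphrasing the definition in
arch/x86/include/asm/atomic64_64.h:

	static inline long atomic64_read(const atomic64_t *v)
	{
		/* A volatile read: no LOCK prefix, no fence. */
		return READ_ONCE((v)->counter);
	}

so it provides no ordering by itself.)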

> +	 * with inc_mm_tlb_gen() like this:
> +	 *
> +	 *	switch_mm_irqs_off():		flush request:
> +	 *	  cpumask_set_cpu(...);		  inc_mm_tlb_gen();
> +	 *	  MB				  MB
> +	 *	  atomic64_read(.tlb_gen);	  flush_tlb_others(mm_cpumask());
> +	 */
>  		cpumask_set_cpu(cpu, mm_cpumask(next));
>  		next_tlb_gen = atomic64_read(&next->context.tlb_gen);
>  


Re: [PATCH v5 2/2] x86/mm: Improve TLB flush documentation

2017-07-24 Thread Andy Lutomirski
On Mon, Jul 24, 2017 at 9:47 PM, Nadav Amit wrote:
> Andy Lutomirski wrote:
>
>> Improve comments as requested by PeterZ and also add some
>> documentation at the top of the file.
>>
>> Signed-off-by: Andy Lutomirski 
>> ---
>> arch/x86/mm/tlb.c | 43 +++++++++++++++++++++++++++++++++----------
>> 1 file changed, 33 insertions(+), 10 deletions(-)
>>
>> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
>> index ce104b962a17..d4ee781ca656 100644
>> --- a/arch/x86/mm/tlb.c
>> +++ b/arch/x86/mm/tlb.c
>> @@ -15,17 +15,24 @@
>> #include 
>>
>> /*
>> - *   TLB flushing, formerly SMP-only
>> - *   c/o Linus Torvalds.
>> + * The code in this file handles mm switches and TLB flushes.
>>  *
>> - *   These mean you can really definitely utterly forget about
>> - *   writing to user space from interrupts. (Its not allowed anyway).
>> + * An mm's TLB state is logically represented by a totally ordered sequence
>> + * of TLB flushes.  Each flush increments the mm's tlb_gen.
>>  *
>> - *   Optimizations Manfred Spraul 
>> + * Each CPU that might have an mm in its TLB (and that might ever use
>> + * those TLB entries) will have an entry for it in its cpu_tlbstate.ctxs
>> + * array.  The kernel maintains the following invariant: for each CPU and
>> + * for each mm in its cpu_tlbstate.ctxs array, the CPU has performed all
>> + * flushes in that mm's history up to the tlb_gen in cpu_tlbstate.ctxs,
>> + * or the CPU has performed an equivalent set of flushes.
>>  *
>> - *   More scalable flush, from Andi Kleen
>> - *
>> - *   Implement flush IPI by CALL_FUNCTION_VECTOR, Alex Shi
>> + * For this purpose, an equivalent set is a set that is at least as strong.
>> + * So, for example, if the flush history is a full flush at time 1,
>> + * a full flush after time 1 is sufficient, but a full flush before time 1
>> + * is not.  Similarly, any number of flushes can be replaced by a single
>> + * full flush so long as that replacement flush is after all the flushes
>> + * that it's replacing.
>>  */
>>
>> atomic64_t last_mm_ctx_id = ATOMIC64_INIT(1);
>> @@ -138,7 +145,16 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
>>  			return;
>>  		}
>>
>> -		/* Resume remote flushes and then read tlb_gen. */
>> +		/*
>> +		 * Resume remote flushes and then read tlb_gen.  The
>> +		 * implied barrier in atomic64_read() synchronizes
>> +		 * with inc_mm_tlb_gen() like this:
>
> You mean the implied memory barrier in cpumask_set_cpu(), no?
>


Ugh, yes.  And I misread PeterZ's email and incorrectly removed the
smp_mb__after_atomic().  I'll respin this patch.
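
Roughly what the respun comment should say, I think: the MBs come from
the atomic RMWs on the two sides, not from atomic64_read().  A sketch
of the pairing (not the final wording, and with arguments elided):

	/* switch_mm_irqs_off() side: */
	cpumask_set_cpu(cpu, mm_cpumask(next));	/* atomic RMW: full barrier on x86 */
	next_tlb_gen = atomic64_read(&next->context.tlb_gen);

	/* flush request side: */
	inc_mm_tlb_gen(mm);			/* atomic64_inc_return(): full barrier on x86 */
	flush_tlb_others(mm_cpumask(mm), ...);	/* reads the cpumask */

Each side's write is ordered before its read of the other side's
state, so at least one of the two always observes the other.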


Re: [PATCH v5 2/2] x86/mm: Improve TLB flush documentation

2017-07-24 Thread Nadav Amit
Andy Lutomirski wrote:

> Improve comments as requested by PeterZ and also add some
> documentation at the top of the file.
> 
> Signed-off-by: Andy Lutomirski 
> ---
> arch/x86/mm/tlb.c | 43 +++++++++++++++++++++++++++++++++----------
> 1 file changed, 33 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> index ce104b962a17..d4ee781ca656 100644
> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c
> @@ -15,17 +15,24 @@
> #include 
> 
> /*
> - *   TLB flushing, formerly SMP-only
> - *   c/o Linus Torvalds.
> + * The code in this file handles mm switches and TLB flushes.
>  *
> - *   These mean you can really definitely utterly forget about
> - *   writing to user space from interrupts. (Its not allowed anyway).
> + * An mm's TLB state is logically represented by a totally ordered sequence
> + * of TLB flushes.  Each flush increments the mm's tlb_gen.
>  *
> - *   Optimizations Manfred Spraul 
> + * Each CPU that might have an mm in its TLB (and that might ever use
> + * those TLB entries) will have an entry for it in its cpu_tlbstate.ctxs
> + * array.  The kernel maintains the following invariant: for each CPU and
> + * for each mm in its cpu_tlbstate.ctxs array, the CPU has performed all
> + * flushes in that mm's history up to the tlb_gen in cpu_tlbstate.ctxs,
> + * or the CPU has performed an equivalent set of flushes.
>  *
> - *   More scalable flush, from Andi Kleen
> - *
> - *   Implement flush IPI by CALL_FUNCTION_VECTOR, Alex Shi
> + * For this purpose, an equivalent set is a set that is at least as strong.
> + * So, for example, if the flush history is a full flush at time 1,
> + * a full flush after time 1 is sufficient, but a full flush before time 1
> + * is not.  Similarly, any number of flushes can be replaced by a single
> + * full flush so long as that replacement flush is after all the flushes
> + * that it's replacing.
>  */
> 
> atomic64_t last_mm_ctx_id = ATOMIC64_INIT(1);
> @@ -138,7 +145,16 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
>  			return;
>  		}
> 
> -		/* Resume remote flushes and then read tlb_gen. */
> +		/*
> +		 * Resume remote flushes and then read tlb_gen.  The
> +		 * implied barrier in atomic64_read() synchronizes
> +		 * with inc_mm_tlb_gen() like this:

You mean the implied memory barrier in cpumask_set_cpu(), no?
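
(For reference, cpumask_set_cpu() boils down to a locked bit op;
roughly, from include/linux/cpumask.h:

	static inline void cpumask_set_cpu(unsigned int cpu, struct cpumask *dstp)
	{
		set_bit(cpumask_check(cpu), cpumask_bits(dstp));
	}

and on x86, set_bit() is a LOCK-prefixed RMW, which also acts as a
full memory barrier.)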



[PATCH v5 2/2] x86/mm: Improve TLB flush documentation

2017-07-24 Thread Andy Lutomirski
Improve comments as requested by PeterZ and also add some
documentation at the top of the file.

Signed-off-by: Andy Lutomirski 
---
 arch/x86/mm/tlb.c | 43 +++++++++++++++++++++++++++++++++----------
 1 file changed, 33 insertions(+), 10 deletions(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index ce104b962a17..d4ee781ca656 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -15,17 +15,24 @@
 #include 
 
 /*
- * TLB flushing, formerly SMP-only
- * c/o Linus Torvalds.
+ * The code in this file handles mm switches and TLB flushes.
  *
- * These mean you can really definitely utterly forget about
- * writing to user space from interrupts. (Its not allowed anyway).
+ * An mm's TLB state is logically represented by a totally ordered sequence
+ * of TLB flushes.  Each flush increments the mm's tlb_gen.
  *
- * Optimizations Manfred Spraul 
+ * Each CPU that might have an mm in its TLB (and that might ever use
+ * those TLB entries) will have an entry for it in its cpu_tlbstate.ctxs
+ * array.  The kernel maintains the following invariant: for each CPU and
+ * for each mm in its cpu_tlbstate.ctxs array, the CPU has performed all
+ * flushes in that mm's history up to the tlb_gen in cpu_tlbstate.ctxs,
+ * or the CPU has performed an equivalent set of flushes.
  *
- * More scalable flush, from Andi Kleen
- *
- * Implement flush IPI by CALL_FUNCTION_VECTOR, Alex Shi
+ * For this purpose, an equivalent set is a set that is at least as strong.
+ * So, for example, if the flush history is a full flush at time 1,
+ * a full flush after time 1 is sufficient, but a full flush before time 1
+ * is not.  Similarly, any number of flushes can be replaced by a single
+ * full flush so long as that replacement flush is after all the flushes
+ * that it's replacing.
  */
 
 atomic64_t last_mm_ctx_id = ATOMIC64_INIT(1);
@@ -138,7 +145,16 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 			return;
 		}
 
-		/* Resume remote flushes and then read tlb_gen. */
+		/*
+		 * Resume remote flushes and then read tlb_gen.  The
+		 * implied barrier in atomic64_read() synchronizes
+		 * with inc_mm_tlb_gen() like this:
+		 *
+		 *	switch_mm_irqs_off():		flush request:
+		 *	  cpumask_set_cpu(...);		  inc_mm_tlb_gen();
+		 *	  MB				  MB
+		 *	  atomic64_read(.tlb_gen);	  flush_tlb_others(mm_cpumask());
+		 */
 		cpumask_set_cpu(cpu, mm_cpumask(next));
 		next_tlb_gen = atomic64_read(&next->context.tlb_gen);
 
@@ -186,7 +202,14 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 		VM_WARN_ON_ONCE(cpumask_test_cpu(cpu, mm_cpumask(next)));
 
 		/*
-		 * Start remote flushes and then read tlb_gen.
+		 * Start remote flushes and then read tlb_gen.  As
+		 * above, the implied barrier in atomic64_read()
+		 * synchronizes with inc_mm_tlb_gen() like this:
+		 *
+		 *	switch_mm_irqs_off():		flush request:
+		 *	  cpumask_set_cpu(...);		  inc_mm_tlb_gen();
+		 *	  MB				  MB
+		 *	  atomic64_read(.tlb_gen);	  flush_tlb_others(mm_cpumask());
 		 */
 		cpumask_set_cpu(cpu, mm_cpumask(next));
 		next_tlb_gen = atomic64_read(&next->context.tlb_gen);
-- 
2.9.4
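
As an illustration of the invariant described in the new header
comment, here is a toy model (hypothetical names, not kernel code): a
CPU whose recorded generation is behind catches up with one full
flush, which is "at least as strong" as all the flushes it replaces.

	/* Toy model of the tlb_gen invariant; hypothetical names. */
	struct toy_mm  { atomic64_t tlb_gen; };	/* bumped once per flush request */
	struct toy_ctx { u64 flushed_gen; };	/* per-(CPU, mm) bookkeeping */

	static void toy_catch_up(struct toy_ctx *ctx, struct toy_mm *mm)
	{
		u64 gen = atomic64_read(&mm->tlb_gen);

		if (ctx->flushed_gen < gen) {
			local_flush_tlb();	/* full flush supersedes all older flushes */
			ctx->flushed_gen = gen;	/* all flushes up to gen now covered */
		}
	}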