[PATCH 3/7] smp: use __cpumask_set_cpu in on_each_cpu_cond

2018-09-25 Thread Rik van Riel
The code in on_each_cpu_cond sets CPUs in a locally allocated bitmask, which should never be used by other CPUs simultaneously. There is no need to use locked memory accesses to set the bits in this bitmap. Switch to __cpumask_set_cpu. Suggested-by: Peter Zijlstra Signed-off-by: Rik van Riel

[PATCH 2/7] x86/mm/tlb: Restructure switch_mm_irqs_off()

2018-09-25 Thread Rik van Riel
Move some code that will be needed for the lazy -> !lazy state transition when a lazy TLB CPU has gotten out of date. No functional changes, since the if (real_prev == next) branch always returns. Suggested-by: Andy Lutomirski Signed-off-by: Rik van Riel Acked-by: Dave Hansen Cc: Li

[PATCH 1/7] x86/mm/tlb: Always use lazy TLB mode

2018-09-25 Thread Rik van Riel
Now that CPUs in lazy TLB mode no longer receive TLB shootdown IPIs, except at page table freeing time, and idle CPUs will no longer get shootdown IPIs for things like mprotect and madvise, we can always use lazy TLB mode. Tested-by: Song Liu Signed-off-by: Rik van Riel Acked-by: Dave Hansen

[PATCH 6/7] Add freed_tables element to flush_tlb_info

2018-09-25 Thread Rik van Riel
Pass the information on to native_flush_tlb_others. No functional changes. Signed-off-by: Rik van Riel --- arch/x86/include/asm/tlbflush.h | 1 + arch/x86/mm/tlb.c | 1 + 2 files changed, 2 insertions(+) diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm

[PATCH v2 0/7] x86/mm/tlb: make lazy TLB mode even lazier

2018-09-25 Thread Rik van Riel
Linus asked me to come up with a smaller patch set to get the benefits of lazy TLB mode, so I spent some time trying out various permutations of the code, with a few workloads that do lots of context switches, and also happen to have a fair number of TLB flushes a second. Both of the workloads

Re: [PATCH 0/7] x86/mm/tlb: make lazy TLB mode even lazier

2018-09-24 Thread Rik van Riel
On Mon, 2018-09-24 at 14:37 -0400, Rik van Riel wrote: > Linus asked me to come up with a smaller patch set to get the > benefits > of lazy TLB mode, so I spent some time trying out various > permutations > of the code, with a few workloads that do lots of context switches, >

[PATCH 5/7] Add freed_tables argument to flush_tlb_mm_range

2018-09-24 Thread Rik van Riel
Add an argument to flush_tlb_mm_range to indicate whether page tables are about to be freed after this TLB flush. This allows for an optimization of flush_tlb_mm_range to skip CPUs in lazy TLB mode. No functional changes. Signed-off-by: Rik van Riel --- arch/x86/include/asm/tlb.h | 6

[PATCH 4/7] smp,cpumask: introduce on_each_cpu_cond_mask

2018-09-24 Thread Rik van Riel
Introduce a variant of on_each_cpu_cond that iterates only over the CPUs in a cpumask, in order to avoid making callbacks for every single CPU in the system when we only need to test a subset. Signed-off-by: Rik van Riel --- include/linux/smp.h | 4 kernel/smp.c | 17

[PATCH 3/7] smp: use __cpumask_set_cpu in on_each_cpu_cond

2018-09-24 Thread Rik van Riel
The code in on_each_cpu_cond sets CPUs in a locally allocated bitmask, which should never be used by other CPUs simultaneously. There is no need to use locked memory accesses to set the bits in this bitmap. Switch to __cpumask_set_cpu. Suggested-by: Peter Zijlstra Signed-off-by: Rik van Riel

[PATCH 2/7] x86/mm/tlb: Restructure switch_mm_irqs_off()

2018-09-24 Thread Rik van Riel
Move some code that will be needed for the lazy -> !lazy state transition when a lazy TLB CPU has gotten out of date. No functional changes, since the if (real_prev == next) branch always returns. Suggested-by: Andy Lutomirski Signed-off-by: Rik van Riel Acked-by: Dave Hansen Cc: Li

[PATCH 6/7] Add freed_tables element to flush_tlb_info

2018-09-24 Thread Rik van Riel
Pass the information on to native_flush_tlb_others. No functional changes. Signed-off-by: Rik van Riel --- arch/x86/include/asm/tlbflush.h | 1 + arch/x86/mm/tlb.c | 1 + 2 files changed, 2 insertions(+) diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm

[PATCH 0/7] x86/mm/tlb: make lazy TLB mode even lazier

2018-09-24 Thread Rik van Riel
Linus asked me to come up with a smaller patch set to get the benefits of lazy TLB mode, so I spent some time trying out various permutations of the code, with a few workloads that do lots of context switches, and also happen to have a fair number of TLB flushes a second. Both of the workloads

[PATCH 1/7] x86/mm/tlb: Always use lazy TLB mode

2018-09-24 Thread Rik van Riel
Now that CPUs in lazy TLB mode no longer receive TLB shootdown IPIs, except at page table freeing time, and idle CPUs will no longer get shootdown IPIs for things like mprotect and madvise, we can always use lazy TLB mode. Tested-by: Song Liu Signed-off-by: Rik van Riel Acked-by: Dave Hansen

[PATCH 7/7] x86/mm/tlb: Make lazy TLB mode lazier

2018-09-24 Thread Rik van Riel
ode remain part of the mm_cpumask(mm), both because that allows TLB flush IPIs to be sent at page table freeing time, and because the cache line bouncing on the mm_cpumask(mm) was responsible for about half the CPU use in switch_mm_irqs_off(). Tested-by: Song Liu Signed-off-by: Rik van Riel --- a

Re: [RFC 00/60] Coscheduling for Linux

2018-09-24 Thread Rik van Riel
On Mon, 2018-09-24 at 17:23 +0200, Jan H. Schönherr wrote: > On 09/18/2018 04:40 PM, Rik van Riel wrote: > > On Fri, 2018-09-14 at 18:25 +0200, Jan H. Schönherr wrote: > > > On 09/14/2018 01:12 PM, Peter Zijlstra wrote: > > > > On Fri, Sep 07, 2018 at 11:3

Re: Code of Conduct: Let's revamp it.

2018-09-20 Thread Rik van Riel
On Thu, 2018-09-20 at 03:14 +0100, Edward Cree wrote: > I think there are important differences between code to be run by > CPUs > and a Code to be run by humans. And when the author goes on a > victory > lap on Twitter and declares the Code to be "a political document", > is > it any

Re: [RFC PATCH 04/10 v2 ] x86/fpu: eager switch PKRU state

2018-09-19 Thread Rik van Riel
> On Sep 19, 2018, at 1:00 PM, Paolo Bonzini wrote: > > On 19/09/2018 18:57, Sebastian Andrzej Siewior wrote: >> On 2018-09-19 07:55:51 [+0200], Paolo Bonzini wrote: >>> A kthread can do use_mm/unuse_mm. >> >> indeed. The FPU struct for the kernel thread isn't valid / does not >> contain the

Re: [RFC PATCH 04/10 v2 ] x86/fpu: eager switch PKRU state

2018-09-18 Thread Rik van Riel
On Tue, 2018-09-18 at 18:04 +0200, Sebastian Andrzej Siewior wrote: > On 2018-09-18 17:29:52 [+0200], Paolo Bonzini wrote: > > > I don't think it matters what the PKRU state is > > > for kernel threads, since kernel PTEs should not > > > be using protection keys anyway. > > > > What about

Re: [RFC PATCH 04/10 v2 ] x86/fpu: eager switch PKRU state

2018-09-18 Thread Rik van Riel
On Tue, 2018-09-18 at 17:07 +0200, Paolo Bonzini wrote: > On 18/09/2018 16:27, Sebastian Andrzej Siewior wrote: > > > Likewise, move this to fpu__clear and outside "if > > > (static_cpu_has(X86_FEATURE_FPU))"? > > > > okay. But if there is no FPU we did not save/restore the pkru > > value. Is > >

Re: [RFC 00/60] Coscheduling for Linux

2018-09-18 Thread Rik van Riel
On Fri, 2018-09-14 at 18:25 +0200, Jan H. Schönherr wrote: > On 09/14/2018 01:12 PM, Peter Zijlstra wrote: > > On Fri, Sep 07, 2018 at 11:39:47PM +0200, Jan H. Schönherr wrote: > > > > > > B) Why would I want this? > > >In the L1TF context, it prevents other applications from > > > loading >

Re: Task group cleanups and optimizations (was: Re: [RFC 00/60] Coscheduling for Linux)

2018-09-18 Thread Rik van Riel
On Tue, 2018-09-18 at 15:22 +0200, Jan H. Schönherr wrote: > On 09/17/2018 11:48 AM, Peter Zijlstra wrote: > > On Sat, Sep 15, 2018 at 10:48:20AM +0200, Jan H. Schönherr wrote: > > > > > > > > CFS bandwidth control would also need to change significantly as > > > we would now > > > have to

Re: [RFC PATCH 04/10] x86/fpu: eager switch PKRU state

2018-09-12 Thread Rik van Riel
On Wed, 2018-09-12 at 08:20 -0700, Andy Lutomirski wrote: > > > > --- a/arch/x86/mm/pkeys.c > > +++ b/arch/x86/mm/pkeys.c > > @@ -18,6 +18,20 @@ > > > > #include /* boot_cpu_has, > > ...*/ > > #include /* > > vma_pkey() */ > > +#include > >

Re: [RFC PATCH 1/2] mm: move tlb_table_flush to tlb_flush_mmu_free

2018-09-07 Thread Rik van Riel
On Fri, 2018-09-07 at 14:44 +0100, Will Deacon wrote: > On Thu, Sep 06, 2018 at 04:29:59PM -0400, Rik van Riel wrote: > > On Thu, 2018-08-23 at 18:47 +1000, Nicholas Piggin wrote: > > > There is no need to call this from tlb_flush_mmu_tlbonly, it > > > logically belo

Re: [RFC PATCH 1/2] mm: move tlb_table_flush to tlb_flush_mmu_free

2018-09-06 Thread Rik van Riel
On Thu, 2018-08-23 at 18:47 +1000, Nicholas Piggin wrote: > There is no need to call this from tlb_flush_mmu_tlbonly, it > logically belongs with tlb_flush_mmu_free. This allows some > code consolidation with a subsequent fix. > > Signed-off-by: Nicholas Piggin Reviewed-b

Re: [PATCH] mm: slowly shrink slabs with a relatively small number of objects

2018-08-31 Thread Rik van Riel
On Fri, 2018-08-31 at 14:31 -0700, Roman Gushchin wrote: > On Fri, Aug 31, 2018 at 05:15:39PM -0400, Rik van Riel wrote: > > On Fri, 2018-08-31 at 13:34 -0700, Roman Gushchin wrote: > > > > > diff --git a/mm/vmscan.c b/mm/vmscan.c > > > index fa2c150ab7b9..c

Re: [PATCH] mm: slowly shrink slabs with a relatively small number of objects

2018-08-31 Thread Rik van Riel
it? With this patch, a slab with 5000 objects on it will get 1 item scanned, while a slab with 4000 objects on it will see shrinker->batch or SHRINK_BATCH objects scanned every time. I don't know if this would cause any issues, just something to ponder. If nobody thinks this is a problem, you ca

Re: [PATCH v2] x86/nmi: Fix some races in NMI uaccess

2018-08-29 Thread Rik van Riel
he wrong memory. > > Fix it by adding a new nmi_uaccess_okay() helper and checking it in > copy_from_user_nmi() and in __copy_from_user_nmi()'s callers. > > Cc: sta...@vger.kernel.org > Cc: Peter Zijlstra > Cc: Nadav Amit > Signed-off-by: Andy Lutomirski Reviewed-by:

Re: [PATCH v2] x86/nmi: Fix some races in NMI uaccess

2018-08-29 Thread Rik van Riel
On Wed, 2018-08-29 at 08:36 -0700, Andy Lutomirski wrote: > On Wed, Aug 29, 2018 at 8:17 AM, Rik van Riel > wrote: > > On Tue, 2018-08-28 at 20:46 -0700, Andy Lutomirski wrote: > > > On Tue, Aug 28, 2018 at 10:56 AM, Rik van Riel > > > wrote: > > &g

Re: [PATCH v2] x86/nmi: Fix some races in NMI uaccess

2018-08-29 Thread Rik van Riel
On Tue, 2018-08-28 at 20:46 -0700, Andy Lutomirski wrote: > On Tue, Aug 28, 2018 at 10:56 AM, Rik van Riel > wrote: > > On Mon, 27 Aug 2018 16:04:16 -0700 > > Andy Lutomirski wrote: > > > > > The 0day bot is still chewing on this, but I've tested it a bit >

[PATCH v2] x86/nmi: Fix some races in NMI uaccess

2018-08-28 Thread Rik van Riel
r.kernel.org Cc: Peter Zijlstra Cc: Nadav Amit Signed-off-by: Rik van Riel Signed-off-by: Andy Lutomirski --- arch/x86/events/core.c | 2 +- arch/x86/include/asm/tlbflush.h | 15 +++ arch/x86/lib/usercopy.c | 5 + arch/x86/mm/tlb.c | 9 +++

Re: [PATCH] x86/nmi: Fix some races in NMI uaccess

2018-08-28 Thread Rik van Riel
On Mon, 2018-08-27 at 19:10 -0700, Andy Lutomirski wrote: > On Mon, Aug 27, 2018 at 6:31 PM, Rik van Riel > wrote: > > > What is special about this path wrt nmi_uaccess_ok that is > > not also true for the need_flush branch right above it? > > > > What am I

Re: [PATCH] x86/nmi: Fix some races in NMI uaccess

2018-08-27 Thread Rik van Riel
On Mon, 2018-08-27 at 16:04 -0700, Andy Lutomirski wrote: > +++ b/arch/x86/mm/tlb.c > @@ -345,6 +345,9 @@ void switch_mm_irqs_off(struct mm_struct *prev, > struct mm_struct *next, >*/ > trace_tlb_flush_rcuidle(TLB_FLUSH_ON_TASK_SWITCH, > TLB_FLUSH_ALL); > }

Re: [PATCH 3/4] mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE

2018-08-27 Thread Rik van Riel
On Mon, 2018-08-27 at 18:04 +1000, Nicholas Piggin wrote: > It could do that. It requires a tlbie that matches the page size, > so it means 3 sizes. I think possibly even that would be better > than current code, but we could do better if we had a few specific > fields in there. Would it cause a

Re: [PATCH 1/4] x86/mm/tlb: Revert the recent lazy TLB patches

2018-08-22 Thread Rik van Riel
On Wed, 2018-08-22 at 14:37 -0700, Linus Torvalds wrote: > On Wed, Aug 22, 2018 at 8:46 AM Peter Zijlstra > wrote: > > > > Revert [..] in order to simplify the TLB invalidate fixes for x86. > > We'll try again later. > > Rik, I assume I should take your earlier "yeah, I can try later" as > an >

Re: [PATCH 2/7] x86,tlb: leave lazy TLB mode at page table free time

2018-08-15 Thread Rik van Riel
On Wed, 2018-08-15 at 18:54 -0700, Andy Lutomirski wrote: > On Mon, Jul 16, 2018 at 12:03 PM, Rik van Riel > wrote: > Hi Rik- > > I was looking through this, and I see: > > > -static void tlb_remove_table_one(void *table) > > +static void tlb_remove_table_one(

Re: [PATCH 11/11] mm,sched: conditionally skip lazy TLB mm refcounting

2018-08-03 Thread Rik van Riel
On Fri, 2018-08-03 at 19:25 +0200, Peter Zijlstra wrote: > On Fri, Aug 03, 2018 at 12:40:48PM -0400, Rik van Riel wrote: > > On Fri, 2018-08-03 at 17:56 +0200, Peter Zijlstra wrote: > > > > > > Why can't we skip the ->active_mm swizzle and keep ->active_mm

Re: [PATCH 11/11] mm,sched: conditionally skip lazy TLB mm refcounting

2018-08-03 Thread Rik van Riel
On Fri, 2018-08-03 at 17:56 +0200, Peter Zijlstra wrote: > On Wed, Aug 01, 2018 at 06:02:55AM -0400, Rik van Riel wrote: > > Conditionally skip lazy TLB mm refcounting. When an architecture > > has > > CONFIG_ARCH_NO_ACTIVE_MM_REFCOUNTING enabled, an mm that is used in >

[PATCH 07/11] x86,mm: remove leave_mm cpu argument

2018-08-01 Thread Rik van Riel
The function leave_mm does not use its cpu argument, but always works on the CPU where it is called. Change the argument to a void *, so leave_mm can be called directly from smp_call_function_mask, and stop looking up the CPU number in current leave_mm callers. Signed-off-by: Rik van Riel

[PATCH 01/11] x86,tlb: clarify memory barrier in switch_mm_irqs_off

2018-08-01 Thread Rik van Riel
Clarify exactly what the memory barrier synchronizes with. Suggested-by: Peter Zijlstra Signed-off-by: Rik van Riel Reviewed-by: Andy Lutomirski --- arch/x86/mm/tlb.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index

[PATCH 11/11] mm,sched: conditionally skip lazy TLB mm refcounting

2018-08-01 Thread Rik van Riel
is about to start using. Signed-off-by: Rik van Riel --- arch/x86/mm/tlb.c | 5 +++-- fs/exec.c | 2 +- include/linux/sched/mm.h | 25 + kernel/sched/core.c | 29 + mm/mmu_context.c | 21

[PATCH 10/11] x86,tlb: really leave mm on shootdown

2018-08-01 Thread Rik van Riel
in lazy TLB mode. Signed-off-by: Rik van Riel --- arch/x86/mm/tlb.c | 4 1 file changed, 4 insertions(+) diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index 7b1add904396..425cb9fa2640 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -140,6 +140,8 @@ void leave_mm(void *dummy

[PATCH 04/11] x86,mm: use on_each_cpu_cond for TLB flushes

2018-08-01 Thread Rik van Riel
Instead of open coding bitmap magic, use on_each_cpu_cond to determine which CPUs to send TLB flush IPIs to. This might be a little bit slower than examining the bitmaps, but it should be a lot easier to maintain in the long run. Suggested-by: Peter Zijlstra Signed-off-by: Rik van Riel

[PATCH 06/11] mm,x86: skip cr4 and ldt reload when mm stays the same

2018-08-01 Thread Rik van Riel
to the same mm after a flush. Suggested-by: Andy Lutomirski Signed-off-by: Rik van Riel Acked-by: Andy Lutomirski --- arch/x86/mm/tlb.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index 671cc66df801..149fb64e4bf4 100644 --- a/arch/x86/mm

[PATCH 09/11] mm,x86: shoot down lazy TLB references at exit_mmap time

2018-08-01 Thread Rik van Riel
Shooting down lazy TLB references to an mm at exit_mmap time ensures that no users of the mm_struct will be left anywhere in the system, allowing it to be torn down and freed immediately. Signed-off-by: Rik van Riel Suggested-by: Andy Lutomirski Suggested-by: Peter Zijlstra --- arch/x86

[PATCH v2 0/11] x86,tlb,mm: more lazy TLB cleanups & optimizations

2018-08-01 Thread Rik van Riel
This patch series implements the cleanups suggested by Peter and Andy, removes lazy TLB mm refcounting on x86, and shows how other architectures could implement that same optimization. The previous patch series already seems to have removed most of the cache line contention I was seeing at

[PATCH 08/11] arch,mm: add config variable to skip lazy TLB mm refcounting

2018-08-01 Thread Rik van Riel
Add a config variable indicating that this architecture does not require lazy TLB mm refcounting, because lazy TLB mms get shot down instantly at exit_mmap time. Signed-off-by: Rik van Riel --- arch/Kconfig | 4 1 file changed, 4 insertions(+) diff --git a/arch/Kconfig b/arch/Kconfig
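Such an opt-in architecture capability is typically expressed as a bare Kconfig symbol that architectures select. The symbol name and help text below are illustrative guesses at the shape of the change, not necessarily what the patch adds:

```
config ARCH_NO_LAZY_MM_REFCOUNTING
	bool
	help
	  The architecture shoots down all lazy TLB references to an mm
	  at exit_mmap time, so it does not need to refcount lazy TLB
	  mms on context switch.
```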

[PATCH 03/11] smp,cpumask: introduce on_each_cpu_cond_mask

2018-08-01 Thread Rik van Riel
Introduce a variant of on_each_cpu_cond that iterates only over the CPUs in a cpumask, in order to avoid making callbacks for every single CPU in the system when we only need to test a subset. Signed-off-by: Rik van Riel --- include/linux/smp.h | 4 kernel/smp.c| 17

[PATCH 02/11] smp: use __cpumask_set_cpu in on_each_cpu_cond

2018-08-01 Thread Rik van Riel
The code in on_each_cpu_cond sets CPUs in a locally allocated bitmask, which should never be used by other CPUs simultaneously. There is no need to use locked memory accesses to set the bits in this bitmap. Switch to __cpumask_set_cpu. Suggested-by: Peter Zijlstra Signed-off-by: Rik van Riel

[PATCH 05/11] mm,tlb: turn dummy defines into inline functions

2018-08-01 Thread Rik van Riel
Turn the dummy tlb_flush_remove_tables* defines into inline functions, in order to get compiler type checking, etc. Suggested-by: Peter Zijlstra Signed-off-by: Rik van Riel --- include/asm-generic/tlb.h | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/include/asm

Re: [PATCH v2 11/11] mm,sched: conditionally skip lazy TLB mm refcounting

2018-07-31 Thread Rik van Riel
On Tue, 2018-07-31 at 07:29 -0700, Andy Lutomirski wrote: > > On Jul 31, 2018, at 2:12 AM, Peter Zijlstra > > wrote: > > > > > On Mon, Jul 30, 2018 at 09:05:55PM -0400, Rik van Riel wrote: > > > > On Mon, 2018-07-30 at 18:26 +0200, Peter Zijlstra wrote: …
