Re: [PATCH 4/7] x86,tlb: make lazy TLB mode lazier

2018-07-20 Thread Peter Zijlstra
On Fri, Jul 20, 2018 at 11:32:39AM +0200, Peter Zijlstra wrote:
> +	if (!next->mm) {				// to kernel
> +		enter_lazy_tlb(prev->active_mm, next);
> +
> +#ifdef ARCH_NO_ACTIVE_MM
> +		next->active_mm = prev->active_mm;
> +		if (prev->m

Re: [PATCH 4/7] x86,tlb: make lazy TLB mode lazier

2018-07-20 Thread Vitaly Kuznetsov
Peter Zijlstra writes:
> On Fri, Jul 20, 2018 at 10:02:10AM +0200, Vitaly Kuznetsov wrote:
>> Andy Lutomirski writes:
>>
>> > [I added PeterZ and Vitaly -- can you see any way in which this would
>> > break something obscure? I don't.]
>>
>> Thanks for CCing me,
>>
>> I don't see how this ca

Re: [PATCH 4/7] x86,tlb: make lazy TLB mode lazier

2018-07-20 Thread Peter Zijlstra
On Fri, Jul 20, 2018 at 10:02:10AM +0200, Vitaly Kuznetsov wrote:
> Andy Lutomirski writes:
>
> > [I added PeterZ and Vitaly -- can you see any way in which this would
> > break something obscure? I don't.]
>
> Thanks for CCing me,
>
> I don't see how this can break things either. At first gla

Re: [PATCH 4/7] x86,tlb: make lazy TLB mode lazier

2018-07-20 Thread Peter Zijlstra
On Thu, Jul 19, 2018 at 09:45:47AM -0700, Andy Lutomirski wrote:
> After some grepping, there are very few users. The
> only nontrivial ones are the ones in kernel/ and mm/mmu_context.c that
> are involved in the rather complicated dance of refcounting active_mm.

Something like so should work I s

Re: [PATCH 4/7] x86,tlb: make lazy TLB mode lazier

2018-07-20 Thread Peter Zijlstra
On Thu, Jul 19, 2018 at 10:04:09AM -0700, Andy Lutomirski wrote:
> I added some more arch maintainers. The idea here is that, on x86 at
> least, task->active_mm and all its refcounting is pure overhead. When
> a process exits, __mmput() gets called, but the core kernel has a
> longstanding "optim

Re: [PATCH 4/7] x86,tlb: make lazy TLB mode lazier

2018-07-20 Thread Vitaly Kuznetsov
Andy Lutomirski writes:
> [I added PeterZ and Vitaly -- can you see any way in which this would
> break something obscure? I don't.]

Thanks for CCing me,

I don't see how this can break things either. At first glance, however, I'm afraid we can add performance penalty to virtualized guests whic

Re: [PATCH 4/7] x86,tlb: make lazy TLB mode lazier

2018-07-19 Thread Andy Lutomirski
On Thu, Jul 19, 2018 at 10:15 AM, Rik van Riel wrote:
>
> Given that CPUs in lazy TLB mode stay part of the mm_cpumask,
> that WARN_ON seems misplaced. You are right though, that the
> mm_cpumask alone should provide enough information for us to
> avoid a need for both tsk->active_mm and the ref

Re: [PATCH 4/7] x86,tlb: make lazy TLB mode lazier

2018-07-19 Thread Andy Lutomirski
[I added PeterZ and Vitaly -- can you see any way in which this would break something obscure? I don't.]

On Thu, Jul 19, 2018 at 7:14 AM, Rik van Riel wrote:
> I guess we can skip both switch_ldt and load_mm_cr4 if real_prev equals
> next?

Yes, AFAICS.

>
> On to the lazy TLB mm_struct refcount

Re: [PATCH 4/7] x86,tlb: make lazy TLB mode lazier

2018-07-18 Thread Andy Lutomirski
> On Jul 18, 2018, at 10:58 AM, Rik van Riel wrote:
>
>> On Jul 17, 2018, at 4:04 PM, Andy Lutomirski wrote:
>>
>> I think you've introduced a minor-ish performance regression due to
>> changing the old (admittedly terribly documented) control flow a bit.
>> Before, if real_prev ==

Re: [PATCH 4/7] x86,tlb: make lazy TLB mode lazier

2018-07-18 Thread Rik van Riel
> On Jul 17, 2018, at 4:04 PM, Andy Lutomirski wrote:
>
> I think you've introduced a minor-ish performance regression due to
> changing the old (admittedly terribly documented) control flow a bit.
> Before, if real_prev == next, we would skip:
>
> load_mm_cr4(next);
> switch_ldt(real_prev

Re: [PATCH 4/7] x86,tlb: make lazy TLB mode lazier

2018-07-17 Thread Andy Lutomirski
> On Jul 17, 2018, at 12:05 PM, Rik van Riel wrote:
>
>> On Jul 17, 2018, at 5:29 PM, Andy Lutomirski wrote:
>>
>> On Tue, Jul 17, 2018 at 1:16 PM, Rik van Riel wrote:
>>> Can I skip both the cr4 and LDT switches when the TLB contents
>>> are no longer valid and got reloaded?
>>>
>>>

Re: [PATCH 4/7] x86,tlb: make lazy TLB mode lazier

2018-07-17 Thread Rik van Riel
> On Jul 17, 2018, at 5:29 PM, Andy Lutomirski wrote:
>
> On Tue, Jul 17, 2018 at 1:16 PM, Rik van Riel wrote:
>> Can I skip both the cr4 and LDT switches when the TLB contents
>> are no longer valid and got reloaded?
>>
>> If the TLB contents are still valid, either because we never went
>>

Re: [PATCH 4/7] x86,tlb: make lazy TLB mode lazier

2018-07-17 Thread Andy Lutomirski
On Tue, Jul 17, 2018 at 1:16 PM, Rik van Riel wrote:
> Can I skip both the cr4 and LDT switches when the TLB contents
> are no longer valid and got reloaded?
>
> If the TLB contents are still valid, either because we never went
> into lazy TLB mode, or because no invalidates happened while
> we we

Re: [PATCH 4/7] x86,tlb: make lazy TLB mode lazier

2018-07-17 Thread Andy Lutomirski
On Mon, Jul 16, 2018 at 12:03 PM, Rik van Riel wrote:
> Lazy TLB mode can result in an idle CPU being woken up by a TLB flush,
> when all it really needs to do is reload %CR3 at the next context switch,
> assuming no page table pages got freed.
>
> Memory ordering is used to prevent race condition

[PATCH 4/7] x86,tlb: make lazy TLB mode lazier

2018-07-16 Thread Rik van Riel
Lazy TLB mode can result in an idle CPU being woken up by a TLB flush, when all it really needs to do is reload %CR3 at the next context switch, assuming no page table pages got freed. Memory ordering is used to prevent race conditions between switch_mm_irqs_off, which checks whether .tlb_gen chan

[PATCH 4/7] x86,tlb: make lazy TLB mode lazier

2018-07-10 Thread Rik van Riel
Lazy TLB mode can result in an idle CPU being woken up by a TLB flush, when all it really needs to do is reload %CR3 at the next context switch, assuming no page table pages got freed. Memory ordering is used to prevent race conditions between switch_mm_irqs_off, which checks whether .tlb_gen chan

[PATCH 4/7] x86,tlb: make lazy TLB mode lazier

2018-07-06 Thread Rik van Riel
Lazy TLB mode can result in an idle CPU being woken up by a TLB flush, when all it really needs to do is reload %CR3 at the next context switch, assuming no page table pages got freed. Memory ordering is used to prevent race conditions between switch_mm_irqs_off, which checks whether .tlb_gen chan

Re: [PATCH 4/7] x86,tlb: make lazy TLB mode lazier

2018-06-29 Thread Rik van Riel
On Fri, 2018-06-29 at 10:05 -0700, Dave Hansen wrote:
> On 06/29/2018 07:29 AM, Rik van Riel wrote:
> > +	/*
> > +	 * If the CPU is not in lazy TLB mode, we are just switching
> > +	 * from one thread in a process to another thread in the same
> > +

Re: [PATCH 4/7] x86,tlb: make lazy TLB mode lazier

2018-06-29 Thread Dave Hansen
On 06/29/2018 07:29 AM, Rik van Riel wrote:
> +	/*
> +	 * If the CPU is not in lazy TLB mode, we are just switching
> +	 * from one thread in a process to another thread in the same
> +	 * process. No TLB flush required.
> +	 */
> +

[PATCH 4/7] x86,tlb: make lazy TLB mode lazier

2018-06-29 Thread Rik van Riel
Lazy TLB mode can result in an idle CPU being woken up by a TLB flush, when all it really needs to do is reload %CR3 at the next context switch, assuming no page table pages got freed. Memory ordering is used to prevent race conditions between switch_mm_irqs_off, which checks whether .tlb_gen chan

Re: [PATCH 4/7] x86,tlb: make lazy TLB mode lazier

2018-06-22 Thread Rik van Riel
On Fri, 2018-06-22 at 10:05 -0700, Dave Hansen wrote:
> On 06/20/2018 12:56 PM, Rik van Riel wrote:
> > This patch deals with that issue by introducing a third TLB state,
> > TLBSTATE_FLUSH, which causes %CR3 to be reloaded at the next context
> > switch.
>
> With PCIDs, do we need to be a bit

Re: [PATCH 4/7] x86,tlb: make lazy TLB mode lazier

2018-06-22 Thread Dave Hansen
On 06/20/2018 12:56 PM, Rik van Riel wrote:
> This patch deals with that issue by introducing a third TLB state,
> TLBSTATE_FLUSH, which causes %CR3 to be reloaded at the next context
> switch.

With PCIDs, do we need to be a bit more explicit about what kind of %CR3 reload we have? Because, with

Re: [PATCH 4/7] x86,tlb: make lazy TLB mode lazier

2018-06-22 Thread Andy Lutomirski
On Fri, Jun 22, 2018 at 8:15 AM Rik van Riel wrote:
>
> On Fri, 2018-06-22 at 08:04 -0700, Andy Lutomirski wrote:
> > On Wed, Jun 20, 2018 at 12:57 PM Rik van Riel wrote:
> > >
> > > Lazy TLB mode can result in an idle CPU being woken up by a TLB flush,
> > > when all it really needs to

Re: [PATCH 4/7] x86,tlb: make lazy TLB mode lazier

2018-06-22 Thread Rik van Riel
On Fri, 2018-06-22 at 08:04 -0700, Andy Lutomirski wrote:
> On Wed, Jun 20, 2018 at 12:57 PM Rik van Riel wrote:
> >
> > Lazy TLB mode can result in an idle CPU being woken up by a TLB flush,
> > when all it really needs to do is reload %CR3 at the next context switch,
> > assuming no p

Re: [PATCH 4/7] x86,tlb: make lazy TLB mode lazier

2018-06-22 Thread Andy Lutomirski
On Wed, Jun 20, 2018 at 12:57 PM Rik van Riel wrote:
>
> Lazy TLB mode can result in an idle CPU being woken up by a TLB flush,
> when all it really needs to do is reload %CR3 at the next context switch,
> assuming no page table pages got freed.
>
> This patch deals with that issue by introducing

[PATCH 4/7] x86,tlb: make lazy TLB mode lazier

2018-06-20 Thread Rik van Riel
Lazy TLB mode can result in an idle CPU being woken up by a TLB flush, when all it really needs to do is reload %CR3 at the next context switch, assuming no page table pages got freed. This patch deals with that issue by introducing a third TLB state, TLBSTATE_FLUSH, which causes %CR3 to be reload