Re: [PATCH v6] powerpc/64s: reimplement book3s idle code in C

2019-02-19 Thread Paul Mackerras
On Tue, Feb 19, 2019 at 02:13:51PM +1000, Nicholas Piggin wrote:
> Paul Mackerras's on February 18, 2019 9:06 am:
> > On Sat, Oct 13, 2018 at 10:04:09PM +1000, Nicholas Piggin wrote:
> >> Reimplement Book3S idle code in C, moving POWER7/8/9 implementation
> >> speific HV idle code to the powernv platform code.
> >> 
> > 
> > [...]
> > 
> >> @@ -2760,21 +2744,47 @@ BEGIN_FTR_SECTION
> >>li  r4, LPCR_PECE_HVEE@higher
> >>sldir4, r4, 32
> >>or  r5, r5, r4
> >> -END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
> >> +FTR_SECTION_ELSE
> >> +  li  r3, PNV_THREAD_NAP
> >> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_ARCH_300)
> >>mtspr   SPRN_LPCR,r5
> >>isync
> >> -  li  r0, 0
> >> -  std r0, HSTATE_SCRATCH0(r13)
> >> -  ptesync
> >> -  ld  r0, HSTATE_SCRATCH0(r13)
> >> -1:cmpdr0, r0
> >> -  bne 1b
> >> +
> >> +  mr  r0, r1
> >> +  ld  r1, PACAEMERGSP(r13)
> >> +  subir1, r1, STACK_FRAME_OVERHEAD
> >> +  std r0, 0(r1)
> >> +  ld  r0, PACAR1(r13)
> >> +  std r0, 8(r1)
> > 
> > This bit seems wrong to me.  If this is a secondary thread on POWER8,
> > we were already on the emergency stack, and now we've reset r1 back to
> > the top of the emergency stack and we're overwriting it.
> 
> I'll have to find some time to take another look at this stuff. The KVM
> stuff was a bit hasty.
> 
> > I wonder why you didn't see secondary threads going off into lala land
> > in your tests?
> 
> It must have been because I wasn't testing the guest SMT properly 
> because I did get it to break trivially sometime after posting this 
> patch out. So we were on the emergency stack here, that should make
> things easier, that may be what's wrong.

In fact I don't see why you need to load up a new stack here at all;
you could just use whatever stack we're currently on AFAICS.

Paul.


Re: [PATCH v6] powerpc/64s: reimplement book3s idle code in C

2019-02-18 Thread Nicholas Piggin
Paul Mackerras's on February 18, 2019 9:06 am:
> On Sat, Oct 13, 2018 at 10:04:09PM +1000, Nicholas Piggin wrote:
>> Reimplement Book3S idle code in C, moving POWER7/8/9 implementation
>> speific HV idle code to the powernv platform code.
>> 
> 
> [...]
> 
>> @@ -2760,21 +2744,47 @@ BEGIN_FTR_SECTION
>>  li  r4, LPCR_PECE_HVEE@higher
>>  sldir4, r4, 32
>>  or  r5, r5, r4
>> -END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
>> +FTR_SECTION_ELSE
>> +li  r3, PNV_THREAD_NAP
>> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_ARCH_300)
>>  mtspr   SPRN_LPCR,r5
>>  isync
>> -li  r0, 0
>> -std r0, HSTATE_SCRATCH0(r13)
>> -ptesync
>> -ld  r0, HSTATE_SCRATCH0(r13)
>> -1:  cmpdr0, r0
>> -bne 1b
>> +
>> +mr  r0, r1
>> +ld  r1, PACAEMERGSP(r13)
>> +subir1, r1, STACK_FRAME_OVERHEAD
>> +std r0, 0(r1)
>> +ld  r0, PACAR1(r13)
>> +std r0, 8(r1)
> 
> This bit seems wrong to me.  If this is a secondary thread on POWER8,
> we were already on the emergency stack, and now we've reset r1 back to
> the top of the emergency stack and we're overwriting it.

I'll have to find some time to take another look at this stuff. The KVM
stuff was a bit hasty.

> I wonder why you didn't see secondary threads going off into lala land
> in your tests?

It must have been because I wasn't testing the guest SMT properly 
because I did get it to break trivially sometime after posting this 
patch out. So we were on the emergency stack here, that should make
things easier, that may be what's wrong.

Thanks,
Nick



Re: [PATCH v6] powerpc/64s: reimplement book3s idle code in C

2019-02-17 Thread Paul Mackerras
On Sat, Oct 13, 2018 at 10:04:09PM +1000, Nicholas Piggin wrote:
> Reimplement Book3S idle code in C, moving POWER7/8/9 implementation
> speific HV idle code to the powernv platform code.
> 

[...]

> @@ -2760,21 +2744,47 @@ BEGIN_FTR_SECTION
>   li  r4, LPCR_PECE_HVEE@higher
>   sldir4, r4, 32
>   or  r5, r5, r4
> -END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
> +FTR_SECTION_ELSE
> + li  r3, PNV_THREAD_NAP
> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_ARCH_300)
>   mtspr   SPRN_LPCR,r5
>   isync
> - li  r0, 0
> - std r0, HSTATE_SCRATCH0(r13)
> - ptesync
> - ld  r0, HSTATE_SCRATCH0(r13)
> -1:   cmpdr0, r0
> - bne 1b
> +
> + mr  r0, r1
> + ld  r1, PACAEMERGSP(r13)
> + subir1, r1, STACK_FRAME_OVERHEAD
> + std r0, 0(r1)
> + ld  r0, PACAR1(r13)
> + std r0, 8(r1)

This bit seems wrong to me.  If this is a secondary thread on POWER8,
we were already on the emergency stack, and now we've reset r1 back to
the top of the emergency stack and we're overwriting it.

I wonder why you didn't see secondary threads going off into lala land
in your tests?

Paul.


[PATCH v6] powerpc/64s: reimplement book3s idle code in C

2018-10-13 Thread Nicholas Piggin
Reimplement Book3S idle code in C, moving POWER7/8/9 implementation
speific HV idle code to the powernv platform code.

Book3S assembly stubs are kept in common code and used only to save
the stack frame and non-volatile GPRs before executing architected
idle instructions, and restoring the stack and reloading GPRs then
returning to C after waking from idle.

The complex logic dealing with threads and subcores, locking, SPRs,
HMIs, timebase resync, etc., is all done in C which makes it more
maintainable.

This is not a strict translation to C code, there are some
significant differences:

- Idle wakeup no longer uses the ->cpu_restore call to reinit SPRs,
  but saves and restores them itself.

- The optimisation where EC=ESL=0 idle modes did not have to save GPRs
  or change MSR is restored, because it's now simple to do. ESL=1
  sleeps that do not lose GPRs can use this optimization too.

- KVM secondary entry and cede is now more of a call/return style
  rather than branchy. nap_state_lost is not required because KVM
  always returns via NVGPR restoring path.

Reviewed-by: Gautham R. Shenoy 
Signed-off-by: Nicholas Piggin 
---
Notes:
- P9 restores some of the PMU SPRs, but not others, and P8 only zeroes
  them. There are improvmets to be made to SPR save restore policies and
  documentation, but this first pass tries to keep things as they were.

Left to do:
- Test actual POWER7 hardware.

- KVM could use more review, it's pretty tricky. Not sure if what
  I'm doing with the emergency stack is kosher. But it runs pretty fine
  here with a POWER8 SMP+SMT guest. Possible to streamline
  KVM cede code now that idle function saves nv gprs for us?

Since RFC v1:
- Now tested and working with POWER9 hash and radix.
- KVM support added. This took a bit of work to untangle and might
  still have some issues, but POWER9 seems to work including hash on
  radix with dependent threads mode.
- This snowballed a bit because of KVM and other details making it
  not feasible to leave POWER7/8 code alone. That's only half done
  at the moment.
- So far this trades about 800 lines of asm for 500 of C. With POWER7/8
  support done it might be another hundred or so lines of C.

Since RFC v2:
- Fixed deep state SLB reloading
- Now tested and working with POWER8.
- Accounted for most feedback.

Since RFC v3:
- Rebased to powerpc merge + idle state bugfix
- Split SLB flush/restore code out and shared with MCE code (pseries
  MCE patches can also use).
- More testing on POWER8 including KVM with secondaries.
- Performance testing looks good. EC=ESL=0 is about 5% faster, other
  stop states look a little faster too.
- Adjusted SPR saving to handler POWER7, haven't tested it.

Since v1:
- More review comments from Gautham.
- Rename isa3_ to isa300_ prefix.
- Tinkered with some comments, copyright notice, changelog.
- Cede and regular idle do not go via KVM secondary wakeup code path,
  so hwthread_state stores and barriers can be simplified, and some
  KVM code paths simplified a little.

Since v2:
- Rebase, SLB reload patch has been merged.
- More testing. Tested machine check idle wakeup path with mambo stepping
  through instructions.

Since v3:
- Build fixes caught by CI

Since v4:
- PSSCR test PLS rather than RL (Akshay)

Since v5:
- Fix TB loss test to use PLS instead of RL as well
- Rename hv_loss variable to spr_loss to better describe its usage
- Clamp the SPR loss level to shallower of SPR loss or TB loss in case
  future CPU has that behaviour (P8 type behaviour).
- Added a few more comments.
- Re-tested on P8, P9 and KVM.
---
 arch/powerpc/include/asm/cpuidle.h   |   19 +-
 arch/powerpc/include/asm/paca.h  |   40 +-
 arch/powerpc/include/asm/processor.h |9 +-
 arch/powerpc/include/asm/reg.h   |8 +-
 arch/powerpc/kernel/asm-offsets.c|   18 -
 arch/powerpc/kernel/dt_cpu_ftrs.c|   21 +-
 arch/powerpc/kernel/exceptions-64s.S |   17 +-
 arch/powerpc/kernel/idle_book3s.S| 1012 +++---
 arch/powerpc/kernel/setup-common.c   |4 +-
 arch/powerpc/kvm/book3s_hv_rmhandlers.S  |   86 +-
 arch/powerpc/platforms/powernv/idle.c|  850 ++
 arch/powerpc/platforms/powernv/subcore.c |2 +-
 arch/powerpc/xmon/xmon.c |   24 +-
 13 files changed, 904 insertions(+), 1206 deletions(-)

diff --git a/arch/powerpc/include/asm/cpuidle.h 
b/arch/powerpc/include/asm/cpuidle.h
index 43e5f31fe64d..9844b3ded187 100644
--- a/arch/powerpc/include/asm/cpuidle.h
+++ b/arch/powerpc/include/asm/cpuidle.h
@@ -27,10 +27,11 @@
  * the THREAD_WINKLE_BITS are set, which indicate which threads have not
  * yet woken from the winkle state.
  */
-#define PNV_CORE_IDLE_LOCK_BIT 0x1000
+#define NR_PNV_CORE_IDLE_LOCK_BIT  28
+#define PNV_CORE_IDLE_LOCK_BIT (1ULL << 
NR_PNV_CORE_IDLE_LOCK_BIT)
 
+#define PNV_CORE_IDLE_WINKLE_COUNT_SHIFT   16
 #define PNV_CORE_IDLE_WINKLE_COUNT 0x0001