Re: [PATCH v6] powerpc/64s: reimplement book3s idle code in C
On Tue, Feb 19, 2019 at 02:13:51PM +1000, Nicholas Piggin wrote: > Paul Mackerras's on February 18, 2019 9:06 am: > > On Sat, Oct 13, 2018 at 10:04:09PM +1000, Nicholas Piggin wrote: > >> Reimplement Book3S idle code in C, moving POWER7/8/9 implementation > >> speific HV idle code to the powernv platform code. > >> > > > > [...] > > > >> @@ -2760,21 +2744,47 @@ BEGIN_FTR_SECTION > >>li r4, LPCR_PECE_HVEE@higher > >>sldir4, r4, 32 > >>or r5, r5, r4 > >> -END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300) > >> +FTR_SECTION_ELSE > >> + li r3, PNV_THREAD_NAP > >> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_ARCH_300) > >>mtspr SPRN_LPCR,r5 > >>isync > >> - li r0, 0 > >> - std r0, HSTATE_SCRATCH0(r13) > >> - ptesync > >> - ld r0, HSTATE_SCRATCH0(r13) > >> -1:cmpdr0, r0 > >> - bne 1b > >> + > >> + mr r0, r1 > >> + ld r1, PACAEMERGSP(r13) > >> + subir1, r1, STACK_FRAME_OVERHEAD > >> + std r0, 0(r1) > >> + ld r0, PACAR1(r13) > >> + std r0, 8(r1) > > > > This bit seems wrong to me. If this is a secondary thread on POWER8, > > we were already on the emergency stack, and now we've reset r1 back to > > the top of the emergency stack and we're overwriting it. > > I'll have to find some time to take another look at this stuff. The KVM > stuff was a bit hasty. > > > I wonder why you didn't see secondary threads going off into lala land > > in your tests? > > It must have been because I wasn't testing the guest SMT properly > because I did get it to break trivially sometime after posting this > patch out. So we were on the emergency stack here, that should make > things easier, that may be what's wrong. In fact I don't see why you need to load up a new stack here at all; you could just use whatever stack we're currently on AFAICS. Paul.
Re: [PATCH v6] powerpc/64s: reimplement book3s idle code in C
Paul Mackerras's on February 18, 2019 9:06 am: > On Sat, Oct 13, 2018 at 10:04:09PM +1000, Nicholas Piggin wrote: >> Reimplement Book3S idle code in C, moving POWER7/8/9 implementation >> speific HV idle code to the powernv platform code. >> > > [...] > >> @@ -2760,21 +2744,47 @@ BEGIN_FTR_SECTION >> li r4, LPCR_PECE_HVEE@higher >> sldir4, r4, 32 >> or r5, r5, r4 >> -END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300) >> +FTR_SECTION_ELSE >> +li r3, PNV_THREAD_NAP >> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_ARCH_300) >> mtspr SPRN_LPCR,r5 >> isync >> -li r0, 0 >> -std r0, HSTATE_SCRATCH0(r13) >> -ptesync >> -ld r0, HSTATE_SCRATCH0(r13) >> -1: cmpdr0, r0 >> -bne 1b >> + >> +mr r0, r1 >> +ld r1, PACAEMERGSP(r13) >> +subir1, r1, STACK_FRAME_OVERHEAD >> +std r0, 0(r1) >> +ld r0, PACAR1(r13) >> +std r0, 8(r1) > > This bit seems wrong to me. If this is a secondary thread on POWER8, > we were already on the emergency stack, and now we've reset r1 back to > the top of the emergency stack and we're overwriting it. I'll have to find some time to take another look at this stuff. The KVM stuff was a bit hasty. > I wonder why you didn't see secondary threads going off into lala land > in your tests? It must have been because I wasn't testing the guest SMT properly because I did get it to break trivially sometime after posting this patch out. So we were on the emergency stack here, that should make things easier, that may be what's wrong. Thanks, Nick
Re: [PATCH v6] powerpc/64s: reimplement book3s idle code in C
On Sat, Oct 13, 2018 at 10:04:09PM +1000, Nicholas Piggin wrote: > Reimplement Book3S idle code in C, moving POWER7/8/9 implementation > speific HV idle code to the powernv platform code. > [...] > @@ -2760,21 +2744,47 @@ BEGIN_FTR_SECTION > li r4, LPCR_PECE_HVEE@higher > sldir4, r4, 32 > or r5, r5, r4 > -END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300) > +FTR_SECTION_ELSE > + li r3, PNV_THREAD_NAP > +ALT_FTR_SECTION_END_IFSET(CPU_FTR_ARCH_300) > mtspr SPRN_LPCR,r5 > isync > - li r0, 0 > - std r0, HSTATE_SCRATCH0(r13) > - ptesync > - ld r0, HSTATE_SCRATCH0(r13) > -1: cmpdr0, r0 > - bne 1b > + > + mr r0, r1 > + ld r1, PACAEMERGSP(r13) > + subir1, r1, STACK_FRAME_OVERHEAD > + std r0, 0(r1) > + ld r0, PACAR1(r13) > + std r0, 8(r1) This bit seems wrong to me. If this is a secondary thread on POWER8, we were already on the emergency stack, and now we've reset r1 back to the top of the emergency stack and we're overwriting it. I wonder why you didn't see secondary threads going off into lala land in your tests? Paul.
[PATCH v6] powerpc/64s: reimplement book3s idle code in C
Reimplement Book3S idle code in C, moving POWER7/8/9 implementation speific HV idle code to the powernv platform code. Book3S assembly stubs are kept in common code and used only to save the stack frame and non-volatile GPRs before executing architected idle instructions, and restoring the stack and reloading GPRs then returning to C after waking from idle. The complex logic dealing with threads and subcores, locking, SPRs, HMIs, timebase resync, etc., is all done in C which makes it more maintainable. This is not a strict translation to C code, there are some significant differences: - Idle wakeup no longer uses the ->cpu_restore call to reinit SPRs, but saves and restores them itself. - The optimisation where EC=ESL=0 idle modes did not have to save GPRs or change MSR is restored, because it's now simple to do. ESL=1 sleeps that do not lose GPRs can use this optimization too. - KVM secondary entry and cede is now more of a call/return style rather than branchy. nap_state_lost is not required because KVM always returns via NVGPR restoring path. Reviewed-by: Gautham R. Shenoy Signed-off-by: Nicholas Piggin --- Notes: - P9 restores some of the PMU SPRs, but not others, and P8 only zeroes them. There are improvmets to be made to SPR save restore policies and documentation, but this first pass tries to keep things as they were. Left to do: - Test actual POWER7 hardware. - KVM could use more review, it's pretty tricky. Not sure if what I'm doing with the emergency stack is kosher. But it runs pretty fine here with a POWER8 SMP+SMT guest. Possible to streamline KVM cede code now that idle function saves nv gprs for us? Since RFC v1: - Now tested and working with POWER9 hash and radix. - KVM support added. This took a bit of work to untangle and might still have some issues, but POWER9 seems to work including hash on radix with dependent threads mode. - This snowballed a bit because of KVM and other details making it not feasible to leave POWER7/8 code alone. That's only half done at the moment. - So far this trades about 800 lines of asm for 500 of C. With POWER7/8 support done it might be another hundred or so lines of C. Since RFC v2: - Fixed deep state SLB reloading - Now tested and working with POWER8. - Accounted for most feedback. Since RFC v3: - Rebased to powerpc merge + idle state bugfix - Split SLB flush/restore code out and shared with MCE code (pseries MCE patches can also use). - More testing on POWER8 including KVM with secondaries. - Performance testing looks good. EC=ESL=0 is about 5% faster, other stop states look a little faster too. - Adjusted SPR saving to handler POWER7, haven't tested it. Since v1: - More review comments from Gautham. - Rename isa3_ to isa300_ prefix. - Tinkered with some comments, copyright notice, changelog. - Cede and regular idle do not go via KVM secondary wakeup code path, so hwthread_state stores and barriers can be simplified, and some KVM code paths simplified a little. Since v2: - Rebase, SLB reload patch has been merged. - More testing. Tested machine check idle wakeup path with mambo stepping through instructions. Since v3: - Build fixes caught by CI Since v4: - PSSCR test PLS rather than RL (Akshay) Since v5: - Fix TB loss test to use PLS instead of RL as well - Rename hv_loss variable to spr_loss to better describe its usage - Clamp the SPR loss level to shallower of SPR loss or TB loss in case future CPU has that behaviour (P8 type behaviour). - Added a few more comments. - Re-tested on P8, P9 and KVM. --- arch/powerpc/include/asm/cpuidle.h | 19 +- arch/powerpc/include/asm/paca.h | 40 +- arch/powerpc/include/asm/processor.h |9 +- arch/powerpc/include/asm/reg.h |8 +- arch/powerpc/kernel/asm-offsets.c| 18 - arch/powerpc/kernel/dt_cpu_ftrs.c| 21 +- arch/powerpc/kernel/exceptions-64s.S | 17 +- arch/powerpc/kernel/idle_book3s.S| 1012 +++--- arch/powerpc/kernel/setup-common.c |4 +- arch/powerpc/kvm/book3s_hv_rmhandlers.S | 86 +- arch/powerpc/platforms/powernv/idle.c| 850 ++ arch/powerpc/platforms/powernv/subcore.c |2 +- arch/powerpc/xmon/xmon.c | 24 +- 13 files changed, 904 insertions(+), 1206 deletions(-) diff --git a/arch/powerpc/include/asm/cpuidle.h b/arch/powerpc/include/asm/cpuidle.h index 43e5f31fe64d..9844b3ded187 100644 --- a/arch/powerpc/include/asm/cpuidle.h +++ b/arch/powerpc/include/asm/cpuidle.h @@ -27,10 +27,11 @@ * the THREAD_WINKLE_BITS are set, which indicate which threads have not * yet woken from the winkle state. */ -#define PNV_CORE_IDLE_LOCK_BIT 0x1000 +#define NR_PNV_CORE_IDLE_LOCK_BIT 28 +#define PNV_CORE_IDLE_LOCK_BIT (1ULL << NR_PNV_CORE_IDLE_LOCK_BIT) +#define PNV_CORE_IDLE_WINKLE_COUNT_SHIFT 16 #define PNV_CORE_IDLE_WINKLE_COUNT 0x0001