Re: [PATCH v4] powerpc/64s: reimplement book3s idle code in C

2018-09-27 Thread Nicholas Piggin
On Wed, 26 Sep 2018 19:39:14 +0530
Akshay Adiga  wrote:

> On Fri, Sep 14, 2018 at 11:52:40AM +1000, Nicholas Piggin wrote:
> > +
> > +   /*
> > +* On POWER9, SRR1 bits do not match exactly as expected.
> > +* SRR1_WS_GPRLOSS (10b) can also result in SPR loss, so
> > +* always test PSSCR if there is any state loss.
> > +*/
> > +   if (likely((psscr & PSSCR_RL_MASK) < pnv_first_hv_loss_level)) {  
> 
> Shouldn't we check PLS field to see if the cpu/core woke up from hv loss ?
> 
> Currently, a cpu requested stop4 (RL=4) and exited from a shallower state
> (PLS=2), SPR's are unecessarily restored.
> 
> We can do something like : 
> 
> #define   PSSCR_PLS_SHIFT 60
> if (likely((psscr & PSSCR_PLS) >> PSSCR_PLS_SHIFT) < pnv_first_hv_loss_level)


Ah, that corresponds with the following existing code?

/*
 * POWER ISA 3. Use PSSCR to determine if we
 * are waking up from deep idle state
 */
LOAD_REG_ADDRBASE(r5,pnv_first_deep_stop_state)
ld  r4,ADDROFF(pnv_first_deep_stop_state)(r5)

/*
 * 0-3 bits correspond to Power-Saving Level Status
 * which indicates the idle state we are waking up from
 */
mfspr   r5, SPRN_PSSCR
rldicl  r5,r5,4,60
li  r0, 0   /* clear requested_psscr to say we're awake */
std r0, PACA_REQ_PSSCR(r13)
cmpdcr4,r5,r4
bge cr4,pnv_wakeup_tb_loss /* returns to caller */

Yes I didn't get that right, good catch.

Thanks,
Nick


Re: [PATCH v4] powerpc/64s: reimplement book3s idle code in C

2018-09-26 Thread Akshay Adiga
On Fri, Sep 14, 2018 at 11:52:40AM +1000, Nicholas Piggin wrote:
> +
> + /*
> +  * On POWER9, SRR1 bits do not match exactly as expected.
> +  * SRR1_WS_GPRLOSS (10b) can also result in SPR loss, so
> +  * always test PSSCR if there is any state loss.
> +  */
> + if (likely((psscr & PSSCR_RL_MASK) < pnv_first_hv_loss_level)) {

Shouldn't we check PLS field to see if the cpu/core woke up from hv loss ?

Currently, a cpu requested stop4 (RL=4) and exited from a shallower state
(PLS=2), SPR's are unecessarily restored.

We can do something like : 

#define PSSCR_PLS_SHIFT 60
if (likely((psscr & PSSCR_PLS) >> PSSCR_PLS_SHIFT) < pnv_first_hv_loss_level)
> + if (sprs_saved)
> + atomic_stop_thread_idle();
> + goto out;
> + }
> +
> + /* HV state loss */
> + BUG_ON(!sprs_saved);
> +
> + atomic_lock_thread_idle();
> +
> + if ((*state & ((1 << threads_per_core) - 1)) != 0)
> + goto core_woken;
> +
> + /* Per-core SPRs */
> + mtspr(SPRN_PTCR,sprs.ptcr);
> + mtspr(SPRN_RPR, sprs.rpr);
> + mtspr(SPRN_TSCR,sprs.tscr);
> + mtspr(SPRN_LDBAR,   sprs.ldbar);
> + mtspr(SPRN_AMOR,sprs.amor);
> +
> + if ((psscr & PSSCR_RL_MASK) >= pnv_first_tb_loss_level) {
> + /* TB loss */
> + if (opal_resync_timebase() != OPAL_SUCCESS)
> + BUG();
> + }




[PATCH v4] powerpc/64s: reimplement book3s idle code in C

2018-09-13 Thread Nicholas Piggin
Reimplement Book3S idle code in C, moving POWER7/8/9 implementation
speific HV idle code to the powernv platform code.

Book3S assembly stubs are kept in common code and used only to save
the stack frame and non-volatile GPRs before executing architected
idle instructions, and restoring the stack and reloading GPRs then
returning to C after waking from idle.

The complex logic dealing with threads and subcores, locking, SPRs,
HMIs, timebase resync, etc., is all done in C which makes it more
maintainable.

This is not a strict translation to C code, there are some
significant differences:

- Idle wakeup no longer uses the ->cpu_restore call to reinit SPRs,
  but saves and restores them itself.

- The optimisation where EC=ESL=0 idle modes did not have to save GPRs
  or change MSR is restored, because it's now simple to do. ESL=1
  sleeps that do not lose GPRs can use this optimization too.

- KVM secondary entry and cede is now more of a call/return style
  rather than branchy. nap_state_lost is not required because KVM
  always returns via NVGPR restoring path.

Reviewed-by: Gautham R. Shenoy 
Signed-off-by: Nicholas Piggin 

- P9 restores some of the PMU SPRs, but not others, and P8 only zeroes
  them. There are improvmets to be made to SPR save restore policies and
  documentation, but this first pass tries to keep things as they were.

Left to do:
- Test actual POWER7 hardware.

- KVM could use more review, it's pretty tricky. Not sure if what
  I'm doing with the emergency stack is kosher. But it runs pretty fine
  here with a POWER8 SMP+SMT guest. Possible to streamline
  KVM cede code now that idle function saves nv gprs for us?

Since RFC v1:
- Now tested and working with POWER9 hash and radix.
- KVM support added. This took a bit of work to untangle and might
  still have some issues, but POWER9 seems to work including hash on
  radix with dependent threads mode.
- This snowballed a bit because of KVM and other details making it
  not feasible to leave POWER7/8 code alone. That's only half done
  at the moment.
- So far this trades about 800 lines of asm for 500 of C. With POWER7/8
  support done it might be another hundred or so lines of C.

Since RFC v2:
- Fixed deep state SLB reloading
- Now tested and working with POWER8.
- Accounted for most feedback.

Since RFC v3:
- Rebased to powerpc merge + idle state bugfix
- Split SLB flush/restore code out and shared with MCE code (pseries
  MCE patches can also use).
- More testing on POWER8 including KVM with secondaries.
- Performance testing looks good. EC=ESL=0 is about 5% faster, other
  stop states look a little faster too.
- Adjusted SPR saving to handler POWER7, haven't tested it.

Since submission v1:
- More review comments from Gautham.
- Rename isa3_ to isa300_ prefix.
- Tinkered with some comments, copyright notice, changelog.
- Cede and regular idle do not go via KVM secondary wakeup code path,
  so hwthread_state stores and barriers can be simplified, and some
  KVM code paths simplified a little.

Since v2:
- Rebase, SLB reload patch has been merged.
- More testing. Tested machine check idle wakeup path with mambo stepping
  through instructions.

Since v3:
- Build fixes caught by CI
---
 arch/powerpc/include/asm/cpuidle.h   |   19 +-
 arch/powerpc/include/asm/paca.h  |   38 +-
 arch/powerpc/include/asm/processor.h |9 +-
 arch/powerpc/include/asm/reg.h   |7 +-
 arch/powerpc/kernel/asm-offsets.c|   18 -
 arch/powerpc/kernel/dt_cpu_ftrs.c|   21 +-
 arch/powerpc/kernel/exceptions-64s.S |   17 +-
 arch/powerpc/kernel/idle_book3s.S| 1012 +++---
 arch/powerpc/kernel/setup-common.c   |4 +-
 arch/powerpc/kvm/book3s_hv_rmhandlers.S  |   86 +-
 arch/powerpc/platforms/powernv/idle.c|  828 ++
 arch/powerpc/platforms/powernv/subcore.c |2 +-
 arch/powerpc/xmon/xmon.c |   24 +-
 13 files changed, 880 insertions(+), 1205 deletions(-)

diff --git a/arch/powerpc/include/asm/cpuidle.h 
b/arch/powerpc/include/asm/cpuidle.h
index 43e5f31fe64d..9844b3ded187 100644
--- a/arch/powerpc/include/asm/cpuidle.h
+++ b/arch/powerpc/include/asm/cpuidle.h
@@ -27,10 +27,11 @@
  * the THREAD_WINKLE_BITS are set, which indicate which threads have not
  * yet woken from the winkle state.
  */
-#define PNV_CORE_IDLE_LOCK_BIT 0x1000
+#define NR_PNV_CORE_IDLE_LOCK_BIT  28
+#define PNV_CORE_IDLE_LOCK_BIT (1ULL << 
NR_PNV_CORE_IDLE_LOCK_BIT)
 
+#define PNV_CORE_IDLE_WINKLE_COUNT_SHIFT   16
 #define PNV_CORE_IDLE_WINKLE_COUNT 0x0001
-#define PNV_CORE_IDLE_WINKLE_COUNT_ALL_BIT 0x0008
 #define PNV_CORE_IDLE_WINKLE_COUNT_BITS0x000F
 #define PNV_CORE_IDLE_THREAD_WINKLE_BITS_SHIFT 8
 #define PNV_CORE_IDLE_THREAD_WINKLE_BITS   0xFF00
@@ -68,16 +69,6 @@
 #define ERR_DEEP_STATE_ESL_MISMATCH-2
 
 #ifndef __ASSEMBLY__
-/* Additional SPRs that need to be