Nicholas Piggin's on June 20, 2019 3:14 pm: > The idle wake up code in the system reset interrupt is not very > optimal. There are two requirements: perform idle wake up quickly; > and save everything including CFAR for non-idle interrupts, with > no performance requirement. > > The problem with placing the idle test in the middle of the handler > and using the normal handler code to save CFAR, is that it's quite > costly (e.g., mfcfar is serialising, speculative workarounds get > applied, SRR1 has to be reloaded, etc). It also prevents the standard > interrupt handler boilerplate being used. > > This pain can be avoided by using a dedicated idle interrupt handler > at the start of the interrupt handler, which restores all registers > back to the way they were in case it was not an idle wake up. CFAR > is preserved without saving it before the non-idle case by making that > the fall-through, and idle is a taken branch. > > Performance seems to be in the noise, but possibly around 0.5% faster, > the executed instructions certainly look better. The bigger benefit is > being able to drop in standard interrupt handlers after the idle code, > which helps with subsequent cleanup and consolidation. > > Signed-off-by: Nicholas Piggin <npig...@gmail.com> > --- > arch/powerpc/kernel/exceptions-64s.S | 89 ++++++++++++++-------------- > 1 file changed, 44 insertions(+), 45 deletions(-) > > diff --git a/arch/powerpc/kernel/exceptions-64s.S > b/arch/powerpc/kernel/exceptions-64s.S > index e0492912ea79..f582ae30f3f7 100644 > --- a/arch/powerpc/kernel/exceptions-64s.S > +++ b/arch/powerpc/kernel/exceptions-64s.S > @@ -241,7 +241,7 @@ END_FTR_SECTION_NESTED(ftr,ftr,943) > * load KBASE for a slight optimisation. > */ > #define BRANCH_TO_C000(reg, label) \ > - __LOAD_HANDLER(reg, label); \ > + __LOAD_FAR_HANDLER(reg, label); \ > mtctr reg; \ > bctr > > @@ -784,16 +784,6 @@ EXC_VIRT_NONE(0x4000, 0x100) > > > EXC_REAL_BEGIN(system_reset, 0x100, 0x100) > - SET_SCRATCH0(r13) > - EXCEPTION_PROLOG_0 PACA_EXNMI > - > - /* This is EXCEPTION_PROLOG_1 with the idle feature section added */ > - OPT_SAVE_REG_TO_PACA(PACA_EXNMI+EX_PPR, r9, CPU_FTR_HAS_PPR) > - OPT_SAVE_REG_TO_PACA(PACA_EXNMI+EX_CFAR, r10, CPU_FTR_CFAR) > - INTERRUPT_TO_KERNEL > - SAVE_CTR(r10, PACA_EXNMI) > - mfcr r9 > - > #ifdef CONFIG_PPC_P7_NAP > /* > * If running native on arch 2.06 or later, check if we are waking up > @@ -801,45 +791,67 @@ EXC_REAL_BEGIN(system_reset, 0x100, 0x100) > * bits 46:47. A non-0 value indicates that we are coming from a power > * saving state. The idle wakeup handler initially runs in real mode, > * but we branch to the 0xc000... address so we can turn on relocation > - * with mtmsr. > + * with mtmsrd later, after SPRs are restored. > + * > + * Careful to minimise cost for the fast path (idle wakeup) while > + * also avoiding clobbering CFAR for the non-idle case. Once we know > + * it is an idle wake, volatiles don't matter, which is why we use > + * those here, and then re-do the entry in case of non-idle (without > + * branching for the non-idle case, to keep CFAR). > */ > BEGIN_FTR_SECTION > - mfspr r10,SPRN_SRR1 > - rlwinm. r10,r10,47-31,30,31 > - beq- 1f > - cmpwi cr1,r10,2 > + SET_SCRATCH0(r13) > + GET_PACA(r13) > + std r3,PACA_EXNMI+0*8(r13) > + std r4,PACA_EXNMI+1*8(r13) > + std r5,PACA_EXNMI+2*8(r13) > mfspr r3,SPRN_SRR1 > - bltlr cr1 /* no state loss, return to idle caller */ > - BRANCH_TO_C000(r10, system_reset_idle_common) > -1: > + mfocrf r4,0x80 > + rlwinm. r5,r3,47-31,30,31 > + bne+ system_reset_idle_wake > + /* Not powersave wakeup. Restore regs for regular interrupt handler. */ > + mtocrf 0x80,r4 > + ld r12,PACA_EXNMI+0*8(r13) > + ld r4,PACA_EXNMI+1*8(r13) > + ld r5,PACA_EXNMI+2*8(r13) > + GET_SCRATCH0(r13)
For the love of... that should be 'ld r3', not 'ld r12', sorry. Thanks, Nick