On Tue, 6 Dec 2016 17:09:07 +1100
Paul Mackerras <pau...@ozlabs.org> wrote:

> On Thu, Dec 01, 2016 at 06:18:10PM +1100, Nicholas Piggin wrote:
> > Change the calling convention to put the trap number together with
> > CR in two halves of r12, which frees up HSTATE_SCRATCH2 in the HV
> > handler, and r9 free.  
> 
> Cute idea!  Some comments below...
> 
> > The 64-bit PR handler entry translates the calling convention back
> > to match the previous call convention (i.e., shared with 32-bit), for
> > simplicity.
> > 
> > Signed-off-by: Nicholas Piggin <npig...@gmail.com>
> > ---
> >  arch/powerpc/include/asm/exception-64s.h | 28 +++++++++++++++-------------
> >  arch/powerpc/kvm/book3s_hv_rmhandlers.S  | 15 +++++++--------
> >  arch/powerpc/kvm/book3s_segment.S        | 27 ++++++++++++++++++++-------
> >  3 files changed, 42 insertions(+), 28 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/exception-64s.h 
> > b/arch/powerpc/include/asm/exception-64s.h
> > index 9a3eee6..bc8fc45 100644
> > --- a/arch/powerpc/include/asm/exception-64s.h
> > +++ b/arch/powerpc/include/asm/exception-64s.h
> > @@ -233,7 +233,7 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
> >  
> >  #endif
> >  
> > -#define __KVM_HANDLER_PROLOG(area, n)                                      
> > \
> > +#define __KVM_HANDLER(area, h, n)                                  \
> >     BEGIN_FTR_SECTION_NESTED(947)                                   \
> >     ld      r10,area+EX_CFAR(r13);                                  \
> >     std     r10,HSTATE_CFAR(r13);                                   \
> > @@ -243,30 +243,32 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
> >     std     r10,HSTATE_PPR(r13);                                    \
> >     END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948);    \
> >     ld      r10,area+EX_R10(r13);                                   \
> > -   stw     r9,HSTATE_SCRATCH1(r13);                                \
> > -   ld      r9,area+EX_R9(r13);                                     \
> >     std     r12,HSTATE_SCRATCH0(r13);                               \
> > -
> > -#define __KVM_HANDLER(area, h, n)                                  \
> > -   __KVM_HANDLER_PROLOG(area, n)                                   \
> > -   li      r12,n;                                                  \
> > +   li      r12,(n);                                                \
> > +   sldi    r12,r12,32;                                             \
> > +   or      r12,r12,r9;                                             \  
> 
> Did you consider doing it the other way around, i.e. with r12
> containing (cr << 32) | trap?  That would save 1 instruction in each
> handler:

When I tinkered with it I thought it came out slightly nicer this way, but
your suggested versions seem to prove me wrong. I can change it if you'd
like.

> 
> +     sldi    r12,r9,32;                                              \
> +     ori     r12,r12,(n);                                            \
> 
> > +   ld      r9,area+EX_R9(r13);                                     \
> > +   std     r9,HSTATE_SCRATCH1(r13);                                \  
> 
> Why not put this std in kvmppc_interrupt[_hv] rather than in each
> handler?

Patch 3/3 uses r9 to load the ctr when CONFIG_RELOCATABLE is turned on, so
this resulted in the smaller difference between the two cases. I agree it's
not ideal when config relocatable is off.

[snip]

Thanks,
Nick

Reply via email to