Re: [PATCH] powerpc: cmp -> cmpd for 64-bit
* Segher Boessenkool [2016-10-12 08:26:48]:

> On Wed, Oct 12, 2016 at 02:05:19PM +1100, Michael Ellerman wrote:
> > Segher Boessenkool writes:
> >
> > [snip]
> >
> > > --- a/arch/powerpc/include/asm/cpuidle.h
> > > +++ b/arch/powerpc/include/asm/cpuidle.h
> > > @@ -26,7 +26,7 @@ extern u64 pnv_first_deep_stop_state;
> >
> > #define IDLE_STATE_ENTER_SEQ(IDLE_INST)                     \
> >     /* Magic NAP/SLEEP/WINKLE mode enter sequence */        \
> > >     std     r0,0(r1);                                     \
> > >     ptesync;                                              \
> > >     ld      r0,0(r1);                                     \
> > > -1:  cmp     cr0,r0,r0;                                    \
> > > +1:  cmpd    cr0,r0,r0;                                    \
> > >     bne     1b;                                           \
> > >     IDLE_INST;                                            \
> > >     b       .
> >
> > What's this one doing, is it a bug? I can't really tell without knowing
> > what the magic sequence is meant to do.

This one is the recommended idle state entry sequence described in the
ISA. We need to ensure the context is fully saved, and also create a
register dependency using a cmp and a loop that will ideally never be
taken. This gets the thread (pipeline) ready to start losing state when
the idle instruction is executed.

ISA 2.07 Section: 3.3.2.1 Entering and Exiting Power-Saving Mode

> It looks like it is making sure the ptesync is done.  The ld/cmp/bne
> is the usual to make sure the ld is done, and in std/ptesync/ld the ld
> won't be done before the ptesync is done.
>
> The cmp always compares equal, of course, so both cmpw and cmpd would
> work fine here.  cmpd looks better after ld ;-)

Yes :) cmpd or cmpw would give the same result as far as this code
sequence is concerned. I agree that cmpd is more appropriate here.

--Vaidy
Re: [PATCH] powerpc: cmp -> cmpd for 64-bit
On Wed, Oct 12, 2016 at 02:05:19PM +1100, Michael Ellerman wrote:
> Segher Boessenkool writes:
>
> > PowerPC's "cmp" instruction has four operands.  Normally people write
> > "cmpw" or "cmpd" for the second cmp operand 0 or 1.  But, frequently
> > people forget, and write "cmp" with just three operands.
> >
> > With older binutils this is silently accepted as if this was "cmpw",
> > while often "cmpd" is wanted.  With newer binutils GAS will complain
> > about this for 64-bit code.  For 32-bit code it still silently assumes
> > "cmpw" is what is meant.
>
> Thanks.
>
> Anton already sent a fix for the two vdso ones, which were real bugs,
> and that's now in Linus' tree.

Ah cool.  You'll just need the one then (and many more for book4e, but
I cannot really handle that, other people can do that a lot better).

> > --- a/arch/powerpc/include/asm/cpuidle.h
> > +++ b/arch/powerpc/include/asm/cpuidle.h
> > @@ -26,7 +26,7 @@ extern u64 pnv_first_deep_stop_state;
>
> #define IDLE_STATE_ENTER_SEQ(IDLE_INST)                     \
>     /* Magic NAP/SLEEP/WINKLE mode enter sequence */        \
> >     std     r0,0(r1);                                     \
> >     ptesync;                                              \
> >     ld      r0,0(r1);                                     \
> > -1:  cmp     cr0,r0,r0;                                    \
> > +1:  cmpd    cr0,r0,r0;                                    \
> >     bne     1b;                                           \
> >     IDLE_INST;                                            \
> >     b       .
>
> What's this one doing, is it a bug? I can't really tell without knowing
> what the magic sequence is meant to do.

It looks like it is making sure the ptesync is done.  The ld/cmp/bne
is the usual to make sure the ld is done, and in std/ptesync/ld the ld
won't be done before the ptesync is done.

The cmp always compares equal, of course, so both cmpw and cmpd would
work fine here.  cmpd looks better after ld ;-)


Segher
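
[Illustrative aside, not part of the original message.] The point that the
compare "always compares equal" at either width can be sketched in C. This
is a minimal model, assuming the ISA behaviour that cmp with L=0 (cmpw)
compares the sign-extended low 32 bits of each register while L=1 (cmpd)
compares all 64 bits. The names cmp_model and cr_result are hypothetical
and exist only for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Outcome of a compare, standing in for the LT/GT/EQ bits of a CR field. */
enum cr_result { CR_LT, CR_GT, CR_EQ };

/*
 * Hypothetical model of "cmp BF,L,RA,RB" (signed compare).
 *   l == 0: cmpw, compares the low 32 bits, sign-extended
 *   l == 1: cmpd, compares the full 64-bit registers
 * The unsigned-to-signed casts assume the usual two's complement
 * conversion that mainstream compilers implement.
 */
static enum cr_result cmp_model(int l, uint64_t ra, uint64_t rb)
{
    assert(l == 0 || l == 1);

    int64_t a = l ? (int64_t)ra : (int64_t)(int32_t)(uint32_t)ra;
    int64_t b = l ? (int64_t)rb : (int64_t)(int32_t)(uint32_t)rb;

    if (a < b)
        return CR_LT;
    if (a > b)
        return CR_GT;
    return CR_EQ;
}
```

For any x, cmp_model(0, x, x) and cmp_model(1, x, x) both yield CR_EQ, which
is why the bne in the idle sequence falls through with either mnemonic. The
two widths only disagree when the operands differ in their upper 32 bits.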
Re: [PATCH] powerpc: cmp -> cmpd for 64-bit
Segher Boessenkool writes:

> PowerPC's "cmp" instruction has four operands.  Normally people write
> "cmpw" or "cmpd" for the second cmp operand 0 or 1.  But, frequently
> people forget, and write "cmp" with just three operands.
>
> With older binutils this is silently accepted as if this was "cmpw",
> while often "cmpd" is wanted.  With newer binutils GAS will complain
> about this for 64-bit code.  For 32-bit code it still silently assumes
> "cmpw" is what is meant.

Thanks.

Anton already sent a fix for the two vdso ones, which were real bugs,
and that's now in Linus' tree.

> diff --git a/arch/powerpc/include/asm/cpuidle.h b/arch/powerpc/include/asm/cpuidle.h
> index 01b8a13..3919332 100644
> --- a/arch/powerpc/include/asm/cpuidle.h
> +++ b/arch/powerpc/include/asm/cpuidle.h
> @@ -26,7 +26,7 @@ extern u64 pnv_first_deep_stop_state;

#define IDLE_STATE_ENTER_SEQ(IDLE_INST)                     \
    /* Magic NAP/SLEEP/WINKLE mode enter sequence */        \
>     std     r0,0(r1);                                     \
>     ptesync;                                              \
>     ld      r0,0(r1);                                     \
> -1:  cmp     cr0,r0,r0;                                    \
> +1:  cmpd    cr0,r0,r0;                                    \
>     bne     1b;                                           \
>     IDLE_INST;                                            \
>     b       .

What's this one doing, is it a bug? I can't really tell without knowing
what the magic sequence is meant to do.

Mahesh, Vaidy?

cheers
[PATCH] powerpc: cmp -> cmpd for 64-bit
PowerPC's "cmp" instruction has four operands.  Normally people write
"cmpw" or "cmpd" for the second cmp operand 0 or 1.  But, frequently
people forget, and write "cmp" with just three operands.

With older binutils this is silently accepted as if this was "cmpw",
while often "cmpd" is wanted.  With newer binutils GAS will complain
about this for 64-bit code.  For 32-bit code it still silently assumes
"cmpw" is what is meant.

Signed-off-by: Segher Boessenkool
---
 arch/powerpc/include/asm/cpuidle.h        | 2 +-
 arch/powerpc/kernel/vdso64/datapage.S     | 2 +-
 arch/powerpc/kernel/vdso64/gettimeofday.S | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/cpuidle.h b/arch/powerpc/include/asm/cpuidle.h
index 01b8a13..3919332 100644
--- a/arch/powerpc/include/asm/cpuidle.h
+++ b/arch/powerpc/include/asm/cpuidle.h
@@ -26,7 +26,7 @@ extern u64 pnv_first_deep_stop_state;
     std     r0,0(r1);                                     \
     ptesync;                                              \
     ld      r0,0(r1);                                     \
-1:  cmp     cr0,r0,r0;                                    \
+1:  cmpd    cr0,r0,r0;                                    \
     bne     1b;                                           \
     IDLE_INST;                                            \
     b       .
diff --git a/arch/powerpc/kernel/vdso64/datapage.S b/arch/powerpc/kernel/vdso64/datapage.S
index 184a6ba..abf17fe 100644
--- a/arch/powerpc/kernel/vdso64/datapage.S
+++ b/arch/powerpc/kernel/vdso64/datapage.S
@@ -59,7 +59,7 @@ V_FUNCTION_BEGIN(__kernel_get_syscall_map)
     bl      V_LOCAL_FUNC(__get_datapage)
     mtlr    r12
     addi    r3,r3,CFG_SYSCALL_MAP64
-    cmpli   cr0,r4,0
+    cmpldi  cr0,r4,0
     crclr   cr0*4+so
     beqlr
     li      r0,NR_syscalls
diff --git a/arch/powerpc/kernel/vdso64/gettimeofday.S b/arch/powerpc/kernel/vdso64/gettimeofday.S
index a76b4af..3820213 100644
--- a/arch/powerpc/kernel/vdso64/gettimeofday.S
+++ b/arch/powerpc/kernel/vdso64/gettimeofday.S
@@ -145,7 +145,7 @@ V_FUNCTION_BEGIN(__kernel_clock_getres)
     bne     cr0,99f
 
     li      r3,0
-    cmpli   cr0,r4,0
+    cmpldi  cr0,r4,0
     crclr   cr0*4+so
     beqlr
     lis     r5,CLOCK_REALTIME_RES@h
-- 
1.9.3
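
[Illustrative aside, not part of the original patch.] The two vdso hunks are
where the operand width can change the result. The sketch below is a rough C
model of the unsigned compare-with-immediate, assuming that r4 holds a full
64-bit value (presumably a user pointer) and that a three-operand cmpli is
assembled as the 32-bit form, as the description says older binutils did.
The helper name cmpli_is_eq is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the EQ result of "cmpli BF,L,RA,UI"
 * (unsigned compare against a zero-extended 16-bit immediate).
 *   l == 0: cmplwi, only the low 32 bits of RA take part
 *   l == 1: cmpldi, all 64 bits of RA take part
 */
static bool cmpli_is_eq(int l, uint64_t ra, uint16_t ui)
{
    assert(l == 0 || l == 1);

    uint64_t a = l ? ra : (uint64_t)(uint32_t)ra;

    return a == (uint64_t)ui;
}
```

With ra = 0x0000000100000000, cmpli_is_eq(0, ra, 0) is true while
cmpli_is_eq(1, ra, 0) is false. Under the assumptions above, the 32-bit form
would treat such a non-zero value as zero and the following beqlr would
return early, which the cmpldi form avoids.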