Re: [RFC PATCH v2 12/13] powerpc/kernel: Do not unconditionally save non-volatile registers on system call

2020-04-06 Thread Nicholas Piggin
Christophe Leroy's message of April 7, 2020 4:18 am:
> 
> 
> On 06/04/2020 at 03:25, Nicholas Piggin wrote:
>> Christophe Leroy's message of April 6, 2020 3:44 am:
>>> Before : 347 cycles on null_syscall
>>> After  : 327 cycles on null_syscall
>> 
>> The problem I had doing this is that signal delivery wants full regs,
>> and you don't know if you have a signal pending ahead of time if you
>> have interrupts enabled.
>> 
>> I began to try bailing out back to asm to save nvgprs and call again.
>> I think that can be made to work, but it is more complication in asm,
>> and I soon found that 64s CPUs don't care about NVGPRs too much so it's
>> nice to get rid of the !fullregs state.
> 
> I tried a new way in v3, please have a look. I split 
> syscall_exit_prepare() in 3 parts and the result is unexpected: it is 
> better than before the series (307 cycles now versus 311 cycles with 
> full ASM syscall entry/exit).

Great! Well I don't really see a problem with how you changed the C code 
around. I'll have to look at the assembly but I don't think it would 
have caused a problem for 64s.
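For context, the kind of split Christophe describes could be shaped roughly as below. This is a hypothetical user-space sketch only: the function names, the flag mask, and the "work" loop body are all invented for illustration and are not taken from the actual v3 patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical three-way split of syscall exit preparation.
 * All names and values here are invented for illustration. */

#define TIF_WORK_MASK 0x7u   /* placeholder for the exit-work flags */

/* Part 1: with IRQs still disabled, decide whether any exit work exists. */
static bool exit_prepare_begin(uint32_t ti_flags)
{
	return (ti_flags & TIF_WORK_MASK) != 0;
}

/* Part 2: with IRQs enabled, loop until all work flags are handled. */
static uint32_t exit_prepare_loop(uint32_t ti_flags)
{
	while (ti_flags & TIF_WORK_MASK)
		ti_flags &= ~TIF_WORK_MASK;  /* stand-in for real work */
	return ti_flags;
}

/* Part 3: IRQs disabled again; final flags for the return to user. */
static uint32_t exit_prepare_end(uint32_t ti_flags)
{
	return ti_flags;
}
```

The attraction of such a split is that the common case only ever runs part 1 and part 3, so the compiler can keep the fast path short and branch-free.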

Thanks,
Nick


Re: [RFC PATCH v2 12/13] powerpc/kernel: Do not unconditionally save non-volatile registers on system call

2020-04-06 Thread Christophe Leroy




On 06/04/2020 at 03:25, Nicholas Piggin wrote:
> Christophe Leroy's message of April 6, 2020 3:44 am:
>> Before : 347 cycles on null_syscall
>> After  : 327 cycles on null_syscall
> 
> The problem I had doing this is that signal delivery wants full regs,
> and you don't know if you have a signal pending ahead of time if you
> have interrupts enabled.
> 
> I began to try bailing out back to asm to save nvgprs and call again.
> I think that can be made to work, but it is more complication in asm,
> and I soon found that 64s CPUs don't care about NVGPRs too much so it's
> nice to get rid of the !fullregs state.

I tried a new way in v3, please have a look. I split
syscall_exit_prepare() in 3 parts and the result is unexpected: it is
better than before the series (307 cycles now versus 311 cycles with
full ASM syscall entry/exit).

> Possibly another approach would be to leave interrupts disabled for the
> case where you have no work to do. You could create a small
> syscall_exit_prepare_nowork fastpath for that case for 32-bit, perhaps?
> 
> Thanks,
> Nick

Christophe


Re: [RFC PATCH v2 12/13] powerpc/kernel: Do not unconditionally save non-volatile registers on system call

2020-04-05 Thread Nicholas Piggin
Christophe Leroy's message of April 6, 2020 3:44 am:
> Before : 347 cycles on null_syscall
> After  : 327 cycles on null_syscall

The problem I had doing this is that signal delivery wants full regs,
and you don't know if you have a signal pending ahead of time if you
have interrupts enabled.

I began to try bailing out back to asm to save nvgprs and call again.
I think that can be made to work, but it is more complication in asm,
and I soon found that 64s CPUs don't care about NVGPRs too much so it's
nice to get rid of the !fullregs state.
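The "!fullregs state" Nick refers to can be sketched as a toy user-space model (this is not kernel code; the struct and helper names are made up): on ppc32, bit 0 of the saved trap number records that the non-volatile GPRs have not yet been saved into the frame, and the asm save_nvgprs helper clears that bit once they are.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy model (invented names, not kernel code) of the ppc32 full-regs
 * marker: trap-number bit 0 set means the NVGPRs are still unsaved. */

struct toy_regs {
	uint32_t trap;        /* trap number; bit 0 set => NVGPRs missing */
	uint32_t nvgpr[18];   /* stand-in for r14..r31 */
};

#define TOY_FULL_REGS(r) (((r)->trap & 1) == 0)

/* Mirrors the asm save_nvgprs helper: a no-op when regs are already full. */
static void toy_save_nvgprs(struct toy_regs *r, const uint32_t *live)
{
	if (TOY_FULL_REGS(r))
		return;                                   /* beqlr */
	memcpy(r->nvgpr, live, sizeof(r->nvgpr));         /* SAVE_NVGPRS */
	r->trap &= ~1u;                                   /* clear the marker */
}
```

Because the helper is idempotent, any slow path (tracing, signals) can call it unconditionally, and the fast path never pays the 18-register save.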

Possibly another approach would be to leave interrupts disabled for the
case where you have no work to do. You could create a small
syscall_exit_prepare_nowork fastpath for that case for 32-bit, perhaps?
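The suggested fast path might look roughly like the sketch below; the names and flag values are invented for illustration and are not from any actual patch. The point is that when no exit work is pending, interrupts can stay disabled and the enable/work/disable sequence is skipped entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Sketch of a "nowork" fast path; all names/values are placeholders. */

#define TIF_SIGPENDING    (1u << 0)
#define TIF_NEED_RESCHED  (1u << 1)
#define TIF_EXIT_WORK     (TIF_SIGPENDING | TIF_NEED_RESCHED)

static unsigned int irq_toggles;  /* counts IRQ enable/disable events */

/* Fast path: nothing pending, IRQs are never re-enabled. */
static bool syscall_exit_prepare_nowork(uint32_t ti_flags)
{
	return (ti_flags & TIF_EXIT_WORK) == 0;
}

/* Slow path: enable IRQs, handle work until the flags clear, disable. */
static uint32_t syscall_exit_prepare_slow(uint32_t ti_flags)
{
	irq_toggles++;                       /* enable IRQs for the work */
	while (ti_flags & TIF_EXIT_WORK)
		ti_flags &= ~TIF_EXIT_WORK;  /* stand-in for signal/resched */
	irq_toggles++;                       /* disable again before return */
	return ti_flags;
}
```

On a null_syscall microbenchmark the fast path would dominate, which is why shaving the IRQ toggle and the work loop out of it shows up directly in the cycle counts quoted above.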

Thanks,
Nick


[RFC PATCH v2 12/13] powerpc/kernel: Do not unconditionally save non volatile registers on system call

2020-04-05 Thread Christophe Leroy
Before : 347 cycles on null_syscall
After  : 327 cycles on null_syscall

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/entry_32.S | 15 +++
 arch/powerpc/kernel/head_32.h  |  3 +--
 arch/powerpc/kernel/syscall.c  |  2 +-
 3 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index 103f5158bc44..b5113593e57f 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -315,13 +315,28 @@ stack_ovf:
RFI
 #endif
 
+save_nvgprs:
+   lwz r11, _TRAP(r1)
+   andi.   r12, r11, 1
+   rlwinm  r11, r11, 0, ~1
+   beqlr
+   SAVE_NVGPRS(r1)
+   stw r11, _TRAP(r1)
+   blr
+
.globl  transfer_to_syscall
 transfer_to_syscall:
+   lwz r10, TI_FLAGS(r2)
mr  r9, r0
+   andi.   r10, r10, _TIF_SYSCALL_DOTRACE
	addi	r10, r1, STACK_FRAME_OVERHEAD
+   bnel-   save_nvgprs
bl  system_call_exception
 ret_from_syscall:
+   lwz r9, TI_FLAGS(r2)
	addi	r4, r1, STACK_FRAME_OVERHEAD
+	andi.	r0, r9, _TIF_SYSCALL_DOTRACE | _TIF_SINGLESTEP | _TIF_USER_WORK_MASK
+   bnel-   save_nvgprs
bl  syscall_exit_prepare
lwz r2, _CCR(r1)
lwz r4, _NIP(r1)
diff --git a/arch/powerpc/kernel/head_32.h b/arch/powerpc/kernel/head_32.h
index c301d666a3e5..1cc9a67cb42c 100644
--- a/arch/powerpc/kernel/head_32.h
+++ b/arch/powerpc/kernel/head_32.h
@@ -174,13 +174,12 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_HPTE_TABLE)
stw r2,GPR2(r11)
	addi	r10,r10,STACK_FRAME_REGS_MARKER@l
stw r9,_MSR(r11)
-   li  r2, \trapno
+   li  r2, \trapno + 1
stw r10,8(r11)
stw r2,_TRAP(r11)
SAVE_GPR(0, r11)
SAVE_4GPRS(3, r11)
SAVE_2GPRS(7, r11)
-   SAVE_NVGPRS(r11)
	addi	r11,r1,STACK_FRAME_OVERHEAD
	addi	r2,r12,-THREAD
stw r11,PT_REGS(r12)
diff --git a/arch/powerpc/kernel/syscall.c b/arch/powerpc/kernel/syscall.c
index 630c423e089a..34fd66fd11a2 100644
--- a/arch/powerpc/kernel/syscall.c
+++ b/arch/powerpc/kernel/syscall.c
@@ -39,7 +39,7 @@ notrace long system_call_exception(long r3, long r4, long r5,
if (!IS_ENABLED(CONFIG_PPC_BOOK3E))
BUG_ON(!(regs->msr & MSR_RI));
BUG_ON(IS_ENABLED(CONFIG_PPC64) && !(regs->msr & MSR_PR));
-   BUG_ON(!FULL_REGS(regs));
+   BUG_ON(IS_ENABLED(CONFIG_PPC64) && !FULL_REGS(regs));
 #ifdef CONFIG_PPC64
BUG_ON(regs->softe != IRQS_ENABLED);
 #endif
-- 
2.25.0