On 25/03/2023 at 14:06, Nicholas Piggin wrote:
> The _switch stack frame setup are substantially the same, so are the
> comments. The difference in how the stack and current are switched,
> and other hardware and software housekeeping is done is moved into
> macros.
> 
> Signed-off-by: Nicholas Piggin <npig...@gmail.com>
> ---
> These patches are mostly just shuffling code around. Better? Worse?

I find it nice, at least for the PPC32 part.

For PPC32 the generated code is almost the same, with only a little
reordering at the start of the function.

Before the change I have:

00000238 <_switch>:
  238:  94 21 ff 30     stwu    r1,-208(r1)
  23c:  7c 08 02 a6     mflr    r0
  240:  90 01 00 d4     stw     r0,212(r1)
  244:  91 a1 00 44     stw     r13,68(r1)
...
  28c:  93 e1 00 8c     stw     r31,140(r1)
  290:  90 01 00 90     stw     r0,144(r1)
  294:  7d 40 00 26     mfcr    r10
  298:  91 41 00 a8     stw     r10,168(r1)
  29c:  90 23 00 00     stw     r1,0(r3)
  2a0:  3c 04 40 00     addis   r0,r4,16384
  2a4:  7c 13 43 a6     mtsprg  3,r0
...

After the change I have:

00000000 <_switch>:
    0:  7c 08 02 a6     mflr    r0
    4:  90 01 00 04     stw     r0,4(r1)
    8:  94 21 ff 30     stwu    r1,-208(r1)
    c:  90 23 00 00     stw     r1,0(r3)
   10:  91 a1 00 44     stw     r13,68(r1)
...
   58:  93 e1 00 8c     stw     r31,140(r1)
   5c:  90 01 00 90     stw     r0,144(r1)
   60:  7c 00 00 26     mfcr    r0
   64:  90 01 00 a8     stw     r0,168(r1)
   68:  3c 04 40 00     addis   r0,r4,16384
   6c:  7c 13 43 a6     mtsprg  3,r0
...

Everything else is identical.
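
As a sanity check on the reordered prologue, the LR save should land in
the same slot in both versions: before the change it is stored at
212(r1) after the stwu has moved r1 down by 208, and after the change it
is stored at 4(r1) before the stwu. Below is a minimal sketch of that
arithmetic. The function names and the example stack pointer are
hypothetical, only the offsets come from the listings above.

```python
FRAME_SIZE = 208      # from "stwu r1,-208(r1)" in both listings
MASK32 = 0xFFFFFFFF   # PPC32 addresses wrap at 32 bits


def lr_slot_before(old_sp):
    """Old order: stwu first, then 'stw r0,212(r1)' relative to the new r1."""
    new_sp = (old_sp - FRAME_SIZE) & MASK32
    return (new_sp + 212) & MASK32


def lr_slot_after(old_sp):
    """New order: 'stw r0,4(r1)' relative to the old r1, then stwu."""
    return (old_sp + 4) & MASK32


if __name__ == "__main__":
    sp = 0xC1234560  # arbitrary example stack pointer (hypothetical value)
    print(hex(lr_slot_before(sp)), hex(lr_slot_after(sp)))
```

Both functions return the same address for any starting r1, since
212 - 208 = 4, so the reordering does not change where LR is saved.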

One thing I'm not sure about: r1 is now used right after the stwu that
updates it (the stw r1,0(r3) at offset c), which may introduce some
latency.

Christophe
