Before implementing Zcmp, I made some optimizations to save-restore and
restructured it:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=a5b2a3bff8152aa34408d8ce40add82f4d22ff87
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=60524be1e3929d83e15fceac6e2aa053c8a6fb20
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=a782346757c54a5a3cfb9f416a7ebe3554a617d7

Zcmp can then share the same stack allocation logic as save-restore:
pre-allocation by cm.push, then step 1 and step 2.

Please note that cm.push pushes ra, s0-s11 in the reverse order of what
save-restore does, so the .cfi directives in my patch have been adapted
accordingly. A discussion can be found here:
https://github.com/riscv/riscv-code-size-reduction/issues/182

A few weeks ago, Jiawei also posted Zcmp support in
https://gcc.gnu.org/pipermail/gcc-patches/2023-April/615287.html

  [PATCH 0/5] RISC-V: Support ZC* extensions.                                Jiawei
  [PATCH 1/5] RISC-V: Minimal support for ZC extensions.                     Jiawei
  [PATCH 2/5] RISC-V: Enable compressible features when use ZC* extensions.  Jiawei
  [PATCH 3/5] RISC-V: Add ZC* test for march args being passed.              Jiawei
  [PATCH 4/5] RISC-V: Add Zcmp extension supports.                           Jiawei
  [PATCH 5/5] RISC-V: Add ZCMP push/pop testcases.                           Jiawei

I tested his code and observed some issues in [PATCH 4/5], so I plan to
post my code as an alternative to Jiawei's [PATCH 4/5]. My Zcmp switch
code is almost the same as Jiawei's, so I avoid repeating it in my patch
series. Please pick up Jiawei's [PATCH 1/5] before picking up my patch
series.

Here are some comparisons. In each listing, the left side is REF (from
Jiawei) and the right side is from my patch.

1. REF fails to generate Zcmp insns.
TC: rv32e_zcmp.c

foo:                                                foo:
    addi sp,sp,-12                                      cm.push {ra}, -16
    sw ra,8(sp)
    call f1                                             call f1
    lw ra,8(sp)                                         cm.pop {ra}, 16
    addi sp,sp,12
    tail f2                                             tail f2

2. REF fails to restore regs.
TC: rv32i_zcmp.c

test_f0:                                            test_f0:
    cm.push {ra,s0},-32                                 cm.push {ra, s0}, -32
    fsw fs0,12(sp)                                      fsw fs0,12(sp)
    call my_getchar                                     call my_getchar
    mv s0,a0                                            mv s0,a0
    call getf                                           call getf
    fmv.s fs0,fa0                                       fmv.s fs0,fa0
    call my_getchar                                     call my_getchar
    fcvt.s.w fa5,s0                                     fcvt.s.w fa5,s0
    fcvt.s.w fa4,a0                                     fcvt.s.w fa4,a0
    fadd.s fa0,fa5,fs0                                  fadd.s fa0,fa5,fs0
    flw fs0,-20(sp) //issue in restoring fs0            flw fs0,12(sp)
    fadd.s fa0,fa0,fa4                                  fadd.s fa0,fa0,fa4
    fcvt.w.s a0,fa0,rtz                                 fcvt.w.s a0,fa0,rtz
    cm.popret {ra,s0},32                                cm.popret {ra, s0}, 32

3. REF accesses an incorrect address for an incoming parameter.
TC: zcmp_stack_alignment.c

fool_rv32e:                                         fool_rv32e:
    cm.push {ra,s0-s1},-32                              cm.push {ra, s0-s1}, -32
    mv s0,a0                                            sw a1,12(sp)
    sw a1,12(sp)                                        mv s0,a0
    sw a2,8(sp)                                         sw a2,8(sp)
    sw a3,4(sp)                                         sw a3,4(sp)
    sw a4,0(sp)                                         sw a4,0(sp)
    mv s1,a5                                            mv s1,a5
    call bar                                            call bar
    lw a1,12(sp)                                        lw a1,12(sp)
    lw a2,8(sp)                                         lw a2,8(sp)
    lw a3,4(sp)                                         lw a3,4(sp)
    lw a4,0(sp)                                         lw a4,0(sp)
    add a0,s0,a1                                        add a0,s0,a1
    add a2,a0,a2                                        add a2,a0,a2
    add a3,a2,a3                                        add a3,a2,a3
    lw a0,28(sp) //issue in accessing incoming para     lw a0,32(sp)
    add a4,a3,a4                                        add a4,a3,a4
    add a4,a4,s1                                        add a4,a4,s1
    add a0,a4,a0                                        add a0,a4,a0
    cm.popret {ra,s0-s1},32                             cm.popret {ra, s0-s1}, 32

Fei Gao (2):
  [RISC-V] disable shrink-wrap-separate if zcmp enabled.
  [RISC-V] support cm.push cm.pop cm.popret in zcmp

 gcc/config/riscv/predicates.md                |   6 +
 gcc/config/riscv/riscv-protos.h               |   3 +
 gcc/config/riscv/riscv.cc                     | 403 ++++++++++++++++--
 gcc/config/riscv/riscv.h                      |  26 ++
 gcc/config/riscv/riscv.md                     |   7 +
 gcc/config/riscv/zc.md                        |  55 +++
 gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c   | 239 +++++++++++
 gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c   | 239 +++++++++++
 .../gcc.target/riscv/zcmp_stack_alignment.c   |  23 +
 9 files changed, 960 insertions(+), 41 deletions(-)
 create mode 100644 gcc/config/riscv/zc.md
 create mode 100644 gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zcmp_stack_alignment.c

-- 
2.17.1
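
For anyone who wants to sanity-check the offsets in the three comparisons, the RV32 frame arithmetic can be sketched in C as below. This is only an illustration based on my reading of the Zcmp specification, not code from riscv.cc or from either patch series. All function and macro names are hypothetical, and the slot layout (ra at the lowest saved address after cm.push, ra at the highest for save-restore) is stated as an assumption in the comments.

```c
#include <assert.h>

/* All names below are hypothetical illustrations; none of them come from
   riscv.cc or from the patches discussed in this thread.  RV32 only.  */

#define ZCMP_RV32_REG_BYTES 4

/* Base stack adjustment for a register list with n_regs entries
   (ra counts as one): the save area rounded up to 16 bytes.
   A list of 12 registers ({ra, s0-s10}) is not encodable.  */
static inline int zcmp_rv32_stack_adj_base(int n_regs)
{
    assert(n_regs >= 1 && n_regs <= 13 && n_regs != 12);
    return (n_regs * ZCMP_RV32_REG_BYTES + 15) / 16 * 16;
}

/* Total adjustment: the base plus 0..3 extra 16-byte blocks (spimm).  */
static inline int zcmp_rv32_stack_adj(int n_regs, int spimm)
{
    assert(spimm >= 0 && spimm <= 3);
    return zcmp_rv32_stack_adj_base(n_regs) + 16 * spimm;
}

/* sp-relative offset, after cm.push, of list entry idx (0 = ra, 1 = s0, ...).
   Assumption: cm.push places the last register of the list at the top of
   the save area, so ra ends up at the lowest saved address.  */
static inline int zcmp_rv32_push_slot(int stack_adj, int n_regs, int idx)
{
    assert(idx >= 0 && idx < n_regs);
    return stack_adj - ZCMP_RV32_REG_BYTES * (n_regs - idx);
}

/* Same entry under the save-restore layout, assuming ra sits at the top.  */
static inline int save_restore_rv32_slot(int stack_adj, int idx)
{
    assert(idx >= 0);
    return stack_adj - ZCMP_RV32_REG_BYTES * (idx + 1);
}

/* sp-relative offset of the first stack-passed incoming argument once the
   whole frame (stack_adj bytes) has been allocated.  */
static inline int incoming_arg_offset(int stack_adj)
{
    return stack_adj;
}
```

Under these assumptions, zcmp_rv32_stack_adj(1, 0) is 16, which matches cm.push {ra}, -16 in case 1, and zcmp_rv32_stack_adj(2, 1) is 32, which matches cm.push {ra, s0}, -32 in case 2. For case 3, incoming_arg_offset(32) is 32, consistent with lw a0,32(sp) on the right-hand side. The two slot helpers show why the .cfi offsets differ between cm.push and save-restore for the same register list.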