On 5/13/24 20:36, Jeff Law wrote:


On 5/13/24 6:54 PM, Patrick O'Neill wrote:

On 5/13/24 13:28, Jeff Law wrote:


On 5/13/24 12:49 PM, Vineet Gupta wrote:
If the constant used for stack offset can be expressed as sum of two S12 values, the constant need not be materialized (in a reg) and instead the
two S12 bits can be added to instructions involved with frame pointer.
This avoids burning a register and more importantly can often get down
to be 2 insn vs. 3.

The prev patches to generally avoid LUI based const materialization didn't
fix this PR and need this directed fix in funcion prologue/epilogue
expansion.

This fix doesn't move the neddle for SPEC, at all, but it is still a
win considering gcc generates one insn fewer than llvm for the test ;-)

    gcc-13.1 release   |      gcc 230823     | |
                       |    g6619b3d4c15c    |   This patch | clang/llvm --------------------------------------------------------------------------------- li      t0,-4096     | li    t0,-4096      | addi sp,sp,-2048 | addi sp,sp,-2048 addi    t0,t0,2016   | addi  t0,t0,2032    | add sp,sp,-16   | addi sp,sp,-32 li      a4,4096      | add   sp,sp,t0      | add a5,sp,a0    | add a1,sp,16 add     sp,sp,t0     | addi  a5,sp,-2032   | sb zero,0(a5)  | add a0,a0,a1 li      a5,-4096     | add   a0,a5,a0      | addi sp,sp,2032  | sb zero,0(a0) addi    a4,a4,-2032  | li    t0, 4096      | addi sp,sp,32    | addi sp,sp,2032 add     a4,a4,a5     | sb    zero,2032(a0) | ret               | addi sp,sp,48
addi    a5,sp,16     | addi  t0,t0,-2032 |                   | ret
add     a5,a4,a5     | add   sp,sp,t0      |
add     a0,a5,a0     | ret                 |
li      t0,4096      |
sd      a5,8(sp)     |
sb      zero,2032(a0)|
addi    t0,t0,-2016  |
add     sp,sp,t0     |
ret                  |

gcc/ChangeLog:
    PR target/105733
    * config/riscv/riscv.h: New macros for with aligned offsets.
    * config/riscv/riscv.cc (riscv_split_sum_of_two_s12): New
    function to split a sum of two s12 values into constituents.
    (riscv_expand_prologue): Handle offset being sum of two S12.
    (riscv_expand_epilogue): Ditto.
    * config/riscv/riscv-protos.h (riscv_split_sum_of_two_s12): New.

gcc/testsuite/ChangeLog:
    * gcc.target/riscv/pr105733.c: New Test.
    * gcc.target/riscv/rvv/autovec/vls/spill-1.c: Adjust to not
    expect LUI 4096.
    * gcc.target/riscv/rvv/autovec/vls/spill-2.c: Ditto.
    * gcc.target/riscv/rvv/autovec/vls/spill-3.c: Ditto.
    * gcc.target/riscv/rvv/autovec/vls/spill-4.c: Ditto.
    * gcc.target/riscv/rvv/autovec/vls/spill-5.c: Ditto.
    * gcc.target/riscv/rvv/autovec/vls/spill-6.c: Ditto.
    * gcc.target/riscv/rvv/autovec/vls/spill-7.c: Ditto.



@@ -8074,14 +8111,26 @@ riscv_expand_epilogue (int style)
      }
        else
      {
-      if (!SMALL_OPERAND (adjust_offset.to_constant ()))
+      HOST_WIDE_INT adj_off_value = adjust_offset.to_constant ();
+      if (SMALL_OPERAND (adj_off_value))
+        {
+          adjust = GEN_INT (adj_off_value);
+        }
+      else if (SUM_OF_TWO_S12_ALGN (adj_off_value))
+        {
+          HOST_WIDE_INT base, off;
+          riscv_split_sum_of_two_s12 (adj_off_value, &base, &off);
+          insn = gen_add3_insn (stack_pointer_rtx, hard_frame_pointer_rtx,
+                    GEN_INT (base));
+          RTX_FRAME_RELATED_P (insn) = 1;
+          adjust = GEN_INT (off);
+        }
So this was the hunk that we identified internally as causing problems with libgomp's testsuite.  We never fully chased it down as this hunk didn't seem terribly important performance wise -- we just set it aside.  The thing is it looked basically correct to me.  So the failure was certainly unexpected, but it was consistent.

So I think the question is whether or not the CI system runs the libgomp testsuite, particularly in the rv64 linux configuration. If it does, and it passes, then we're good. I'm still finding my way around the configuration, so I don't know if the CI system Edwin & Patrick have built tests libgomp or not.

I poked around the .sum files in pre/postcommit and we do run tests like:

PASS: c-c++-common/gomp/affinity-2.c  (test for errors, line 45)
I was able to find the summary info:

Tests that now fail, but worked before (15 tests):
libgomp: libgomp.fortran/simd7.f90   -O0  execution test
libgomp: libgomp.fortran/task2.f90   -O0  execution test
libgomp: libgomp.fortran/vla2.f90   -O0  execution test
libgomp: libgomp.fortran/vla3.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
libgomp: libgomp.fortran/vla3.f90   -O3 -g  execution test
libgomp: libgomp.fortran/vla4.f90   -O1  execution test
libgomp: libgomp.fortran/vla4.f90   -O2  execution test
libgomp: libgomp.fortran/vla4.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
libgomp: libgomp.fortran/vla4.f90   -O3 -g  execution test
libgomp: libgomp.fortran/vla4.f90   -Os  execution test
libgomp: libgomp.fortran/vla5.f90   -O1  execution test
libgomp: libgomp.fortran/vla5.f90   -O2  execution test
libgomp: libgomp.fortran/vla5.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
libgomp: libgomp.fortran/vla5.f90   -O3 -g  execution test
libgomp: libgomp.fortran/vla5.f90   -Os  execution test

So if you could check on those, it'd be appreciated.

I checked rv64gcv linux and those do not currently run in CI.

Patrick

Reply via email to