From: Pan Li <pan2...@intel.com>

This patch series introduces the combine of vec_dup + vmacc.vv into
vmacc.vx, based on the cost value of GR2VR.  The late-combine pass
performs the combine if the cost of GR2VR is zero, and rejects it if
the cost is non-zero, such as 1, 2, 15 in the tests.
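
For illustration only, the kind of source loop this combine targets can
be sketched in C as below.  The function name vmacc_scalar_i32 is
hypothetical and is not taken from the test headers in this series; the
sketch assumes the auto-vectorizer broadcasts the scalar rs1 with a
vec_duplicate outside the loop and uses a multiply-accumulate inside
it, which is the pattern the combine folds into vmacc.vx.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example, not part of the patch series.
   Element-wise multiply-accumulate with a loop-invariant scalar:
     vd[i] = rs1 * vs2[i] + vd[i]
   which matches the semantics of vmacc.vx vd, rs1, vs2.  */
void
vmacc_scalar_i32 (int32_t *vd, const int32_t *vs2, int32_t rs1, size_t n)
{
  for (size_t i = 0; i < n; i++)
    vd[i] += rs1 * vs2[i];
}
```

Whether the vector loop ends up with vmv.v.x + vmacc.vv or with a
single vmacc.vx then depends on the GR2VR cost, as described above.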

From:
|   ...
|   vmv.v.x
| L1:
|   vmacc.vv
|   J L1
|   ...

To:
|   ...
| L1:
|   vmacc.vx
|   J L1
|   ...

The following test suites passed for this patch series.
* The rv64gcv full regression test.

Pan Li (4):
  RISC-V: Combine vec_duplicate + vmacc.vv to vmacc.vx on GR2VR cost
  RISC-V: Add test for vec_duplicate + vmacc.vv signed combine with GR2VR cost 0, 1 and 15
  RISC-V: Add test for vec_duplicate + vmacc.vv unsigned combine with GR2VR cost 0, 1 and 15
  RISC-V: Adjust the asm check after enable vmacc.vx combine

 gcc/config/riscv/autovec-opt.md               |  22 +
 gcc/config/riscv/vector.md                    |  48 +++
 .../riscv/rvv/autovec/vx_vf/vx-1-i16.c        |   3 +
 .../riscv/rvv/autovec/vx_vf/vx-1-i32.c        |   3 +
 .../riscv/rvv/autovec/vx_vf/vx-1-i64.c        |   3 +
 .../riscv/rvv/autovec/vx_vf/vx-1-i8.c         |   3 +
 .../riscv/rvv/autovec/vx_vf/vx-1-u16.c        |   3 +
 .../riscv/rvv/autovec/vx_vf/vx-1-u32.c        |   3 +
 .../riscv/rvv/autovec/vx_vf/vx-1-u64.c        |   3 +
 .../riscv/rvv/autovec/vx_vf/vx-1-u8.c         |   3 +
 .../riscv/rvv/autovec/vx_vf/vx-2-i16.c        |   3 +
 .../riscv/rvv/autovec/vx_vf/vx-2-i32.c        |   3 +
 .../riscv/rvv/autovec/vx_vf/vx-2-i64.c        |   3 +
 .../riscv/rvv/autovec/vx_vf/vx-2-i8.c         |   3 +
 .../riscv/rvv/autovec/vx_vf/vx-2-u16.c        |   3 +
 .../riscv/rvv/autovec/vx_vf/vx-2-u32.c        |   3 +
 .../riscv/rvv/autovec/vx_vf/vx-2-u64.c        |   3 +
 .../riscv/rvv/autovec/vx_vf/vx-2-u8.c         |   3 +
 .../riscv/rvv/autovec/vx_vf/vx-3-i16.c        |   3 +
 .../riscv/rvv/autovec/vx_vf/vx-3-i32.c        |   3 +
 .../riscv/rvv/autovec/vx_vf/vx-3-i64.c        |   3 +
 .../riscv/rvv/autovec/vx_vf/vx-3-i8.c         |   3 +
 .../riscv/rvv/autovec/vx_vf/vx-3-u16.c        |   3 +
 .../riscv/rvv/autovec/vx_vf/vx-3-u32.c        |   3 +
 .../riscv/rvv/autovec/vx_vf/vx-3-u64.c        |   3 +
 .../riscv/rvv/autovec/vx_vf/vx-3-u8.c         |   3 +
 .../riscv/rvv/autovec/vx_vf/vx_ternary.h      |  35 ++
 .../riscv/rvv/autovec/vx_vf/vx_ternary_data.h | 377 ++++++++++++++++++
 .../riscv/rvv/autovec/vx_vf/vx_ternary_run.h  |  26 +
 .../rvv/autovec/vx_vf/vx_vmacc-run-1-i16.c    |  16 +
 .../rvv/autovec/vx_vf/vx_vmacc-run-1-i32.c    |  16 +
 .../rvv/autovec/vx_vf/vx_vmacc-run-1-i64.c    |  16 +
 .../rvv/autovec/vx_vf/vx_vmacc-run-1-i8.c     |  16 +
 .../rvv/autovec/vx_vf/vx_vmacc-run-1-u16.c    |  16 +
 .../rvv/autovec/vx_vf/vx_vmacc-run-1-u32.c    |  16 +
 .../rvv/autovec/vx_vf/vx_vmacc-run-1-u64.c    |  16 +
 .../rvv/autovec/vx_vf/vx_vmacc-run-1-u8.c     |  16 +
 .../riscv/rvv/base/ternop_vx_constraint-4.c   |   8 +-
 .../riscv/rvv/base/ternop_vx_constraint-5.c   |   8 +-
 .../riscv/rvv/base/ternop_vx_constraint-6.c   |   8 +-
 .../riscv/rvv/base/ternop_vx_constraint-7.c   |   8 +-
 41 files changed, 724 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_ternary.h
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_ternary_data.h
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_ternary_run.h
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vmacc-run-1-i16.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vmacc-run-1-i32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vmacc-run-1-i64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vmacc-run-1-i8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vmacc-run-1-u16.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vmacc-run-1-u32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vmacc-run-1-u64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vmacc-run-1-u8.c

--
2.43.0