On 5/6/24 2:40 PM, Christoph Müllner wrote:
The combiner attempts to optimize a zero-extension of a logical right shift
using zero_extract. We already utilize this optimization for those cases
that result in a single instructions.  Let's add a insn_and_split
pattern that also matches the generic case, where we can emit an
optimized sequence of a slli/srli.

Tested with SPEC CPU 2017 (rv64gc).

        PR 111501

gcc/ChangeLog:

        * config/riscv/riscv.md (*lshr<GPR:mode>3_zero_extend_4): New
        pattern for zero-extraction.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/pr111501.c: New test.
        * gcc.target/riscv/zero-extend-rshift-32.c: New test.
        * gcc.target/riscv/zero-extend-rshift-64.c: New test.
        * gcc.target/riscv/zero-extend-rshift.c: New test.
So I had Lyut looking in this space as well. Mostly because there's a desire to avoid the srl+and approach and instead represent this stuff as shifts (which are fusible in our uarch). SO I've already got some state...



Signed-off-by: Christoph Müllner <christoph.muell...@vrull.eu>
---
  gcc/config/riscv/riscv.md                     |  30 +++++
  gcc/testsuite/gcc.target/riscv/pr111501.c     |  32 +++++
  .../gcc.target/riscv/zero-extend-rshift-32.c  |  37 ++++++
  .../gcc.target/riscv/zero-extend-rshift-64.c  |  63 ++++++++++
  .../gcc.target/riscv/zero-extend-rshift.c     | 119 ++++++++++++++++++
  5 files changed, 281 insertions(+)
  create mode 100644 gcc/testsuite/gcc.target/riscv/pr111501.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/zero-extend-rshift-32.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/zero-extend-rshift-64.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/zero-extend-rshift.c

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index d4676507b45..80cbecb78e8 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -2792,6 +2792,36 @@ (define_insn "*lshrsi3_zero_extend_3"
    [(set_attr "type" "shift")
     (set_attr "mode" "SI")])
+;; Canonical form for a zero-extend of a logical right shift.
+;; Special cases are handled above.
+;; Skip for single-bit extraction (Zbs/XTheadBs) and th.extu (XTheadBb)
+(define_insn_and_split "*lshr<GPR:mode>3_zero_extend_4"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+        (zero_extract:GPR
+       (match_operand:GPR 1 "register_operand" " r")
+       (match_operand     2 "const_int_operand")
+       (match_operand     3 "const_int_operand")))
+   (clobber (match_scratch:GPR  4 "=&r"))]
+  "!((TARGET_ZBS || TARGET_XTHEADBS) && (INTVAL (operands[2]) == 1))
+   && !TARGET_XTHEADBB"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 4)
+     (ashift:GPR (match_dup 1) (match_dup 2)))
+   (set (match_dup 0)
+     (lshiftrt:GPR (match_dup 4) (match_dup 3)))]
Consider adding support for signed extractions as well. You just need an iterator across zero_extract/sign_extract and suitable selection of arithmetic vs logical right shift step.

A nit on the condition. Bring the && INTVAL (operands[2]) == 1 down to a new line like you've gone with !TARGET_XTHEADBB.

You also want to make sure the condition rejects the cases handled by this pattern (or merge your pattern with this one):

;; Canonical form for a zero-extend of a logical right shift.
(define_insn "*lshrsi3_zero_extend_2" [(set (match_operand:DI 0 "register_operand" "=r")
        (zero_extract:DI (match_operand:DI  1 "register_operand" " r")
                         (match_operand     2 "const_int_operand")
                         (match_operand     3 "const_int_operand")))]
  "(TARGET_64BIT && (INTVAL (operands[3]) > 0)
    && (INTVAL (operands[2]) + INTVAL (operands[3]) == 32))"
{
  return "srliw\t%0,%1,%3";
}
  [(set_attr "type" "shift")
   (set_attr "mode" "SI")])

So generally going the right direction.  But needs another iteration.

Jeff

Reply via email to