On Mon, May 6, 2024 at 11:43 PM Vineet Gupta <vine...@rivosinc.com> wrote: > > > > On 5/6/24 13:40, Christoph Müllner wrote: > > The combiner attempts to optimize a zero-extension of a logical right shift > > using zero_extract. We already utilize this optimization for those cases > > that result in a single instructions. Let's add a insn_and_split > > pattern that also matches the generic case, where we can emit an > > optimized sequence of a slli/srli. > > > > ... > > > > diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md > > index d4676507b45..80cbecb78e8 100644 > > --- a/gcc/config/riscv/riscv.md > > +++ b/gcc/config/riscv/riscv.md > > @@ -2792,6 +2792,36 @@ (define_insn "*lshrsi3_zero_extend_3" > > [(set_attr "type" "shift") > > (set_attr "mode" "SI")]) > > > > +;; Canonical form for a zero-extend of a logical right shift. > > +;; Special cases are handled above. > > +;; Skip for single-bit extraction (Zbs/XTheadBs) and th.extu (XTheadBb) > > Dumb question: Why not for Zbs: Zb[abs] is going to be very common going > fwd and will end up being unused. > > > +(define_insn_and_split "*lshr<GPR:mode>3_zero_extend_4" > > + [(set (match_operand:GPR 0 "register_operand" "=r") > > + (zero_extract:GPR > > + (match_operand:GPR 1 "register_operand" " r") > > + (match_operand 2 "const_int_operand") > > + (match_operand 3 "const_int_operand"))) > > + (clobber (match_scratch:GPR 4 "=&r"))] > > + "!((TARGET_ZBS || TARGET_XTHEADBS) && (INTVAL (operands[2]) == 1)) > > + && !TARGET_XTHEADBB" > > + "#" > > + "&& reload_completed" > > + [(set (match_dup 4) > > + (ashift:GPR (match_dup 1) (match_dup 2))) > > + (set (match_dup 0) > > + (lshiftrt:GPR (match_dup 4) (match_dup 3)))] > > +{ > > + int regbits = GET_MODE_BITSIZE (GET_MODE (operands[0])).to_constant (); > > + int sizebits = INTVAL (operands[2]); > > + int startbits = INTVAL (operands[3]); > > + int lshamt = regbits - sizebits - startbits; > > + int rshamt = lshamt + startbits; > > + operands[2] = GEN_INT (lshamt); > > + operands[3] = GEN_INT (rshamt); > > +} > > + [(set_attr "type" "shift") > > + (set_attr "mode" "<GPR:MODE>")]) > > + > > ;; Handle AND with 2^N-1 for N from 12 to XLEN. This can be split into > > ;; two logical shifts. Otherwise it requires 3 instructions: lui, > > ;; xor/addi/srli, and. > > diff --git a/gcc/testsuite/gcc.target/riscv/pr111501.c > > b/gcc/testsuite/gcc.target/riscv/pr111501.c > > new file mode 100644 > > index 00000000000..9355be242e7 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/riscv/pr111501.c > > @@ -0,0 +1,32 @@ > > +/* { dg-do compile } */ > > +/* { dg-require-effective-target rv64 } */ > > +/* { dg-options "-march=rv64gc" { target { rv64 } } } */ > > +/* { dg-skip-if "" { *-*-* } {"-O0" "-Os" "-Og" "-Oz" "-flto" } } */ > > +/* { dg-final { check-function-bodies "**" "" } } */ > > Is function body check really needed: isn't count of srli and slli each > sufficient ? > Last year we saw a lot of false failures due to unrelated scheduling > changes as such tripping these up.
I've dropped the check-function-bodies in the v2. Thanks! > > > +/* { dg-allow-blank-lines-in-output 1 } */ > > + > > +/* > > +**do_shift: > > +** ... > > +** slli\ta[0-9],a[0-9],16 > > +** srli\ta[0-9],a[0-9],48 > > +** ... > > +*/ > > +unsigned int > > +do_shift(unsigned long csum) > > +{ > > + return (unsigned short)(csum >> 32); > > +} > > + > > +/* > > +**do_shift2: > > +** ... > > +** slli\ta[0-9],a[0-9],16 > > +** srli\ta[0-9],a[0-9],48 > > +** ... > > +*/ > > +unsigned int > > +do_shift2(unsigned long csum) > > +{ > > + return (csum << 16) >> 48; > > +} > > diff --git a/gcc/testsuite/gcc.target/riscv/zero-extend-rshift-32.c > > b/gcc/testsuite/gcc.target/riscv/zero-extend-rshift-32.c > > new file mode 100644 > > index 00000000000..2824d6fe074 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/riscv/zero-extend-rshift-32.c > > @@ -0,0 +1,37 @@ > > +/* { dg-do compile } */ > > +/* { dg-require-effective-target rv32 } */ > > +/* { dg-options "-march=rv32gc" } */ > > +/* { dg-skip-if "" { *-*-* } {"-O0" "-Os" "-Og" "-Oz" "-flto" } } */ > > +/* { dg-final { check-function-bodies "**" "" } } */ > > Same as above, counts where possible. > > -Vineet >