On Mon, May 6, 2024 at 11:24 PM Jeff Law <jeffreya...@gmail.com> wrote:
>
>
>
> On 5/6/24 2:40 PM, Christoph Müllner wrote:
> > The combiner attempts to optimize a zero-extension of a logical right shift
> > using zero_extract. We already utilize this optimization for those cases
> > that result in a single instructions.  Let's add a insn_and_split
> > pattern that also matches the generic case, where we can emit an
> > optimized sequence of a slli/srli.
> >
> > Tested with SPEC CPU 2017 (rv64gc).
> >
> >       PR 111501
> >
> > gcc/ChangeLog:
> >
> >       * config/riscv/riscv.md (*lshr<GPR:mode>3_zero_extend_4): New
> >       pattern for zero-extraction.
> >
> > gcc/testsuite/ChangeLog:
> >
> >       * gcc.target/riscv/pr111501.c: New test.
> >       * gcc.target/riscv/zero-extend-rshift-32.c: New test.
> >       * gcc.target/riscv/zero-extend-rshift-64.c: New test.
> >       * gcc.target/riscv/zero-extend-rshift.c: New test.
> So I had Lyut looking in this space as well.  Mostly because there's a
> desire to avoid the srl+and approach and instead represent this stuff as
> shifts (which are fusible in our uarch).  SO I've already got some state...
>
>
> >
> > Signed-off-by: Christoph Müllner <christoph.muell...@vrull.eu>
> > ---
> >   gcc/config/riscv/riscv.md                     |  30 +++++
> >   gcc/testsuite/gcc.target/riscv/pr111501.c     |  32 +++++
> >   .../gcc.target/riscv/zero-extend-rshift-32.c  |  37 ++++++
> >   .../gcc.target/riscv/zero-extend-rshift-64.c  |  63 ++++++++++
> >   .../gcc.target/riscv/zero-extend-rshift.c     | 119 ++++++++++++++++++
> >   5 files changed, 281 insertions(+)
> >   create mode 100644 gcc/testsuite/gcc.target/riscv/pr111501.c
> >   create mode 100644 gcc/testsuite/gcc.target/riscv/zero-extend-rshift-32.c
> >   create mode 100644 gcc/testsuite/gcc.target/riscv/zero-extend-rshift-64.c
> >   create mode 100644 gcc/testsuite/gcc.target/riscv/zero-extend-rshift.c
> >
> > diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> > index d4676507b45..80cbecb78e8 100644
> > --- a/gcc/config/riscv/riscv.md
> > +++ b/gcc/config/riscv/riscv.md
> > @@ -2792,6 +2792,36 @@ (define_insn "*lshrsi3_zero_extend_3"
> >     [(set_attr "type" "shift")
> >      (set_attr "mode" "SI")])
> >
> > +;; Canonical form for a zero-extend of a logical right shift.
> > +;; Special cases are handled above.
> > +;; Skip for single-bit extraction (Zbs/XTheadBs) and th.extu (XTheadBb)
> > +(define_insn_and_split "*lshr<GPR:mode>3_zero_extend_4"
> > +  [(set (match_operand:GPR 0 "register_operand" "=r")
> > +      (zero_extract:GPR
> > +       (match_operand:GPR 1 "register_operand" " r")
> > +       (match_operand     2 "const_int_operand")
> > +       (match_operand     3 "const_int_operand")))
> > +   (clobber (match_scratch:GPR  4 "=&r"))]
> > +  "!((TARGET_ZBS || TARGET_XTHEADBS) && (INTVAL (operands[2]) == 1))
> > +   && !TARGET_XTHEADBB"
> > +  "#"
> > +  "&& reload_completed"
> > +  [(set (match_dup 4)
> > +     (ashift:GPR (match_dup 1) (match_dup 2)))
> > +   (set (match_dup 0)
> > +     (lshiftrt:GPR (match_dup 4) (match_dup 3)))]
> Consider adding support for signed extractions as well.  You just need
> an iterator across zero_extract/sign_extract and suitable selection of
> arithmetic vs logical right shift step.

The sign-extension/extraction code was worse than the
zero-extension/extraction code.
So, I ended up doing some initial work for addressing corner cases first, before
converting this pattern using an any_extract iterator for the v2
(already on the list).

>
> A nit on the condition.   Bring the && INTVAL (operands[2]) == 1 down to
> a new line like you've gone with !TARGET_XTHEADBB.
>
> You also want to make sure the condition rejects the cases handled by
> this pattern (or merge your pattern with this one):

I kept the pattern, but added sign_extract support.

>
> > ;; Canonical form for a zero-extend of a logical right shift.
> > (define_insn "*lshrsi3_zero_extend_2"
> >   [(set (match_operand:DI                   0 "register_operand" "=r")
> >         (zero_extract:DI (match_operand:DI  1 "register_operand" " r")
> >                          (match_operand     2 "const_int_operand")
> >                          (match_operand     3 "const_int_operand")))]
> >   "(TARGET_64BIT && (INTVAL (operands[3]) > 0)
> >     && (INTVAL (operands[2]) + INTVAL (operands[3]) == 32))"
> > {
> >   return "srliw\t%0,%1,%3";
> > }
> >   [(set_attr "type" "shift")
> >    (set_attr "mode" "SI")])
>
> So generally going the right direction.  But needs another iteration.

Thanks for the review!

>
> Jeff
>

Reply via email to