Re: [PATCH v2 4/4] RISC-V: Cover sign-extensions in lshr3_zero_extend_4

2024-05-08 Thread Christoph Müllner
On Wed, May 8, 2024 at 3:48 PM Jeff Law  wrote:
>
>
>
> On 5/8/24 1:36 AM, Christoph Müllner wrote:
> > The lshr3_zero_extend_4 pattern targets bit extraction
> > with zero-extension. This pattern represents the canonical form
> > of zero-extensions of a logical right shift.
> >
> > The same optimization can be applied to sign-extensions.
> > Given that the two optimizations are so similar, this patch converts
> > the existing one to cover the sign-extension case as well.
> >
> > gcc/ChangeLog:
> >
> >   * config/riscv/iterators.md (ashiftrt): New code attribute
> >   'extract_shift' and adding extractions to optab.
> >   * config/riscv/riscv.md (*lshr3_zero_extend_4): Rename to...
> >   (*3):...this and add support for
> >   sign-extensions.
> >
> > gcc/testsuite/ChangeLog:
> >
> >   * gcc.target/riscv/extend-shift-helpers.h: Add helpers for
> >   sign-extension.
> >   * gcc.target/riscv/sign-extend-rshift-32.c: New test.
> >   * gcc.target/riscv/sign-extend-rshift-64.c: New test.
> >   * gcc.target/riscv/sign-extend-rshift.c: New test.
> Oh, I see, you handled the special case with this patch.  Ignore my
> comment on 3/4.  3/4 is fine, as is this patch.

Oh, yes, I forgot to add this to 3/4.

Thanks!

>
> Thanks!
>
> jeff


Re: [PATCH] RISC-V: Add zero_extract support for rv64gc

2024-05-08 Thread Christoph Müllner
On Mon, May 6, 2024 at 11:43 PM Vineet Gupta  wrote:
>
>
>
> On 5/6/24 13:40, Christoph Müllner wrote:
> > The combiner attempts to optimize a zero-extension of a logical right shift
> > using zero_extract. We already utilize this optimization for those cases
> > that result in a single instruction.  Let's add an insn_and_split
> > pattern that also matches the generic case, where we can emit an
> > optimized sequence of a slli/srli.
> >
> > ...
> >
> > diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> > index d4676507b45..80cbecb78e8 100644
> > --- a/gcc/config/riscv/riscv.md
> > +++ b/gcc/config/riscv/riscv.md
> > @@ -2792,6 +2792,36 @@ (define_insn "*lshrsi3_zero_extend_3"
> >[(set_attr "type" "shift")
> > (set_attr "mode" "SI")])
> >
> > +;; Canonical form for a zero-extend of a logical right shift.
> > +;; Special cases are handled above.
> > +;; Skip for single-bit extraction (Zbs/XTheadBs) and th.extu (XTheadBb)
>
> Dumb question: Why not for Zbs: Zb[abs] is going to be very common going
> fwd and will end up being unused.
>
> > +(define_insn_and_split "*lshr3_zero_extend_4"
> > +  [(set (match_operand:GPR 0 "register_operand" "=r")
> > +  (zero_extract:GPR
> > +   (match_operand:GPR 1 "register_operand" " r")
> > +   (match_operand 2 "const_int_operand")
> > +   (match_operand 3 "const_int_operand")))
> > +   (clobber (match_scratch:GPR  4 "="))]
> > +  "!((TARGET_ZBS || TARGET_XTHEADBS) && (INTVAL (operands[2]) == 1))
> > +   && !TARGET_XTHEADBB"
> > +  "#"
> > +  "&& reload_completed"
> > +  [(set (match_dup 4)
> > + (ashift:GPR (match_dup 1) (match_dup 2)))
> > +   (set (match_dup 0)
> > + (lshiftrt:GPR (match_dup 4) (match_dup 3)))]
> > +{
> > +  int regbits = GET_MODE_BITSIZE (GET_MODE (operands[0])).to_constant ();
> > +  int sizebits = INTVAL (operands[2]);
> > +  int startbits = INTVAL (operands[3]);
> > +  int lshamt = regbits - sizebits - startbits;
> > +  int rshamt = lshamt + startbits;
> > +  operands[2] = GEN_INT (lshamt);
> > +  operands[3] = GEN_INT (rshamt);
> > +}
> > +  [(set_attr "type" "shift")
> > +   (set_attr "mode" "")])
> > +
> >  ;; Handle AND with 2^N-1 for N from 12 to XLEN.  This can be split into
> >  ;; two logical shifts.  Otherwise it requires 3 instructions: lui,
> >  ;; xor/addi/srli, and.
> > diff --git a/gcc/testsuite/gcc.target/riscv/pr111501.c b/gcc/testsuite/gcc.target/riscv/pr111501.c
> > new file mode 100644
> > index 000..9355be242e7
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/pr111501.c
> > @@ -0,0 +1,32 @@
> > +/* { dg-do compile } */
> > +/* { dg-require-effective-target rv64 } */
> > +/* { dg-options "-march=rv64gc" { target { rv64 } } } */
> > +/* { dg-skip-if "" { *-*-* } {"-O0" "-Os" "-Og" "-Oz" "-flto" } } */
> > +/* { dg-final { check-function-bodies "**" "" } } */
>
> Is the function body check really needed? Isn't a count of srli and slli
> each sufficient?
> Last year we saw a lot of false failures from unrelated scheduling
> changes tripping these checks up.

I've dropped the check-function-bodies in the v2.

Thanks!

>
> > +/* { dg-allow-blank-lines-in-output 1 } */
> > +
> > +/*
> > +**do_shift:
> > +**...
> > +**slli\ta[0-9],a[0-9],16
> > +**srli\ta[0-9],a[0-9],48
> > +**...
> > +*/
> > +unsigned int
> > +do_shift(unsigned long csum)
> > +{
> > +  return (unsigned short)(csum >> 32);
> > +}
> > +
> > +/*
> > +**do_shift2:
> > +**...
> > +**slli\ta[0-9],a[0-9],16
> > +**srli\ta[0-9],a[0-9],48
> > +**...
> > +*/
> > +unsigned int
> > +do_shift2(unsigned long csum)
> > +{
> > +  return (csum << 16) >> 48;
> > +}
> > diff --git a/gcc/testsuite/gcc.target/riscv/zero-extend-rshift-32.c b/gcc/testsuite/gcc.target/riscv/zero-extend-rshift-32.c
> > new file mode 100644
> > index 000..2824d6fe074
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/zero-extend-rshift-32.c
> > @@ -0,0 +1,37 @@
> > +/* { dg-do compile } */
> > +/* { dg-require-effective-target rv32 } */
> > +/* { dg-options "-march=rv32gc" } */
> > +/* { dg-skip-if "" { *-*-* } {"-O0" "-Os" "-Og" "-Oz" "-flto" } } */
> > +/* { dg-final { check-function-bodies "**" "" } } */
>
> Same as above, counts where possible.
>
> -Vineet
>


Re: [PATCH] RISC-V: Add zero_extract support for rv64gc

2024-05-08 Thread Christoph Müllner
On Mon, May 6, 2024 at 11:24 PM Jeff Law  wrote:
>
>
>
> On 5/6/24 2:40 PM, Christoph Müllner wrote:
> > The combiner attempts to optimize a zero-extension of a logical right shift
> > using zero_extract. We already utilize this optimization for those cases
> > that result in a single instruction.  Let's add an insn_and_split
> > pattern that also matches the generic case, where we can emit an
> > optimized sequence of a slli/srli.
> >
> > Tested with SPEC CPU 2017 (rv64gc).
> >
> >   PR 111501
> >
> > gcc/ChangeLog:
> >
> >   * config/riscv/riscv.md (*lshr3_zero_extend_4): New
> >   pattern for zero-extraction.
> >
> > gcc/testsuite/ChangeLog:
> >
> >   * gcc.target/riscv/pr111501.c: New test.
> >   * gcc.target/riscv/zero-extend-rshift-32.c: New test.
> >   * gcc.target/riscv/zero-extend-rshift-64.c: New test.
> >   * gcc.target/riscv/zero-extend-rshift.c: New test.
> So I had Lyut looking in this space as well.  Mostly because there's a
> desire to avoid the srl+and approach and instead represent this stuff as
> shifts (which are fusible in our uarch).  So I've already got some state...
>
>
> >
> > Signed-off-by: Christoph Müllner 
> > ---
> >   gcc/config/riscv/riscv.md |  30 +
> >   gcc/testsuite/gcc.target/riscv/pr111501.c |  32 +
> >   .../gcc.target/riscv/zero-extend-rshift-32.c  |  37 ++
> >   .../gcc.target/riscv/zero-extend-rshift-64.c  |  63 ++
> >   .../gcc.target/riscv/zero-extend-rshift.c | 119 ++
> >   5 files changed, 281 insertions(+)
> >   create mode 100644 gcc/testsuite/gcc.target/riscv/pr111501.c
> >   create mode 100644 gcc/testsuite/gcc.target/riscv/zero-extend-rshift-32.c
> >   create mode 100644 gcc/testsuite/gcc.target/riscv/zero-extend-rshift-64.c
> >   create mode 100644 gcc/testsuite/gcc.target/riscv/zero-extend-rshift.c
> >
> > diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> > index d4676507b45..80cbecb78e8 100644
> > --- a/gcc/config/riscv/riscv.md
> > +++ b/gcc/config/riscv/riscv.md
> > @@ -2792,6 +2792,36 @@ (define_insn "*lshrsi3_zero_extend_3"
> > [(set_attr "type" "shift")
> >  (set_attr "mode" "SI")])
> >
> > +;; Canonical form for a zero-extend of a logical right shift.
> > +;; Special cases are handled above.
> > +;; Skip for single-bit extraction (Zbs/XTheadBs) and th.extu (XTheadBb)
> > +(define_insn_and_split "*lshr3_zero_extend_4"
> > +  [(set (match_operand:GPR 0 "register_operand" "=r")
> > +  (zero_extract:GPR
> > +   (match_operand:GPR 1 "register_operand" " r")
> > +   (match_operand 2 "const_int_operand")
> > +   (match_operand 3 "const_int_operand")))
> > +   (clobber (match_scratch:GPR  4 "="))]
> > +  "!((TARGET_ZBS || TARGET_XTHEADBS) && (INTVAL (operands[2]) == 1))
> > +   && !TARGET_XTHEADBB"
> > +  "#"
> > +  "&& reload_completed"
> > +  [(set (match_dup 4)
> > + (ashift:GPR (match_dup 1) (match_dup 2)))
> > +   (set (match_dup 0)
> > + (lshiftrt:GPR (match_dup 4) (match_dup 3)))]
> Consider adding support for signed extractions as well.  You just need
> an iterator across zero_extract/sign_extract and suitable selection of
> arithmetic vs logical right shift step.

The sign-extension/extraction code was worse than the
zero-extension/extraction code.
So, I ended up doing some initial work for addressing corner cases first, before
converting this pattern using an any_extract iterator for the v2
(already on the list).

>
> A nit on the condition.   Bring the && INTVAL (operands[2]) == 1 down to
> a new line like you've gone with !TARGET_XTHEADBB.
>
> You also want to make sure the condition rejects the cases handled by
> this pattern (or merge your pattern with this one):

I kept the pattern, but added sign_extract support.

>
> > ;; Canonical form for a zero-extend of a logical right shift.
> > (define_insn "*lshrsi3_zero_extend_2"
> >   [(set (match_operand:DI   0 "register_operand" "=r")
> > (zero_extract:DI (match_operand:DI  1 "register_operand" " r")
> >  (match_operand 2 "const_int_operand")
> >  (match_operand 3 "const_int_operand")))]
> >   "(TARGET_64BIT && (INTVAL (operands[3]) > 0)
> > && (INTVAL (operands[2]) + INTVAL (operands[3]) == 32))"
> > {
> >   return "srliw\t%0,%1,%3";
> > }
> >   [(set_attr "type" "shift")
> >(set_attr "mode" "SI")])
>
> So generally going the right direction.  But needs another iteration.

Thanks for the review!

>
> Jeff
>


[PATCH v2 3/4] RISC-V: Add zero_extract support for rv64gc

2024-05-08 Thread Christoph Müllner
The combiner attempts to optimize a zero-extension of a logical right shift
using zero_extract. We already utilize this optimization for those cases
that result in a single instruction.  Let's add an insn_and_split
pattern that also matches the generic case, where we can emit an
optimized sequence of a slli/srli.
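
As an illustration of why a slli/srli pair is enough, here is a small C
model of the extraction on a 64-bit register. It is only a sketch: the
function names are made up for this example, and it assumes XLEN is 64
and that the field fits in the register (size >= 1, pos + size <= 64).
The shift amounts follow the same arithmetic as the split code in the
patch below (lshamt = regbits - sizebits - startbits,
rshamt = lshamt + startbits).

```c
#include <assert.h>
#include <stdint.h>

/* Reference form: shift right, then mask off SIZE bits.  */
uint64_t
zext_extract_mask (uint64_t x, unsigned size, unsigned pos)
{
  assert (size >= 1 && pos + size <= 64);
  uint64_t mask = size == 64 ? ~UINT64_C (0) : (UINT64_C (1) << size) - 1;
  return (x >> pos) & mask;
}

/* Two-shift form: move the field to the top of the register, then
   shift it back down so that zeros fill the upper bits.  */
uint64_t
zext_extract_shifts (uint64_t x, unsigned size, unsigned pos)
{
  assert (size >= 1 && pos + size <= 64);
  unsigned lshamt = 64 - size - pos;  /* slli amount */
  unsigned rshamt = lshamt + pos;     /* srli amount */
  return (x << lshamt) >> rshamt;
}
```

For size 16 and pos 32 this gives shift amounts 16 and 48, which are
the slli/srli immediates checked in pr111501.c.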

Tested with SPEC CPU 2017 (rv64gc).

PR 111501

gcc/ChangeLog:

* config/riscv/riscv.md (*lshr3_zero_extend_4): New
pattern for zero-extraction.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/extend-shift-helpers.h: New test.
* gcc.target/riscv/pr111501.c: New test.
* gcc.target/riscv/zero-extend-rshift-32.c: New test.
* gcc.target/riscv/zero-extend-rshift-64.c: New test.
* gcc.target/riscv/zero-extend-rshift.c: New test.

Signed-off-by: Christoph Müllner 
---
 gcc/config/riscv/riscv.md |  30 +
 .../gcc.target/riscv/extend-shift-helpers.h   |  26 
 gcc/testsuite/gcc.target/riscv/pr111501.c |  21 
 .../gcc.target/riscv/zero-extend-rshift-32.c  |  13 ++
 .../gcc.target/riscv/zero-extend-rshift-64.c  |  17 +++
 .../gcc.target/riscv/zero-extend-rshift.c | 115 ++
 6 files changed, 222 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/extend-shift-helpers.h
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr111501.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zero-extend-rshift-32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zero-extend-rshift-64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zero-extend-rshift.c

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index b7fc13e4e61..58bf7712277 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -2793,6 +2793,36 @@ (define_insn "*lshrsi3_zero_extend_3"
   [(set_attr "type" "shift")
(set_attr "mode" "SI")])
 
+;; Canonical form for a zero-extend of a logical right shift.
+;; Special cases are handled above.
+;; Skip for single-bit extraction (Zbs/XTheadBs) and th.extu (XTheadBb)
+(define_insn_and_split "*lshr3_zero_extend_4"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+(zero_extract:GPR
+   (match_operand:GPR 1 "register_operand" " r")
+   (match_operand 2 "const_int_operand")
+   (match_operand 3 "const_int_operand")))
+   (clobber (match_scratch:GPR  4 "="))]
+  "!((TARGET_ZBS || TARGET_XTHEADBS) && (INTVAL (operands[2]) == 1))
+   && !TARGET_XTHEADBB"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 4)
+ (ashift:GPR (match_dup 1) (match_dup 2)))
+   (set (match_dup 0)
+ (lshiftrt:GPR (match_dup 4) (match_dup 3)))]
+{
+  int regbits = GET_MODE_BITSIZE (GET_MODE (operands[0])).to_constant ();
+  int sizebits = INTVAL (operands[2]);
+  int startbits = INTVAL (operands[3]);
+  int lshamt = regbits - sizebits - startbits;
+  int rshamt = lshamt + startbits;
+  operands[2] = GEN_INT (lshamt);
+  operands[3] = GEN_INT (rshamt);
+}
+  [(set_attr "type" "shift")
+   (set_attr "mode" "")])
+
 ;; Handle AND with 2^N-1 for N from 12 to XLEN.  This can be split into
 ;; two logical shifts.  Otherwise it requires 3 instructions: lui,
 ;; xor/addi/srli, and.
diff --git a/gcc/testsuite/gcc.target/riscv/extend-shift-helpers.h b/gcc/testsuite/gcc.target/riscv/extend-shift-helpers.h
new file mode 100644
index 000..4853fe490d8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/extend-shift-helpers.h
@@ -0,0 +1,26 @@
+#ifndef EXTEND_SHIFT_HELPERS_H
+#define EXTEND_SHIFT_HELPERS_H
+
+#define RT_EXT_CT_RSHIFT_N_AT(RTS,RT,CTS,CT,N,ATS,AT)  \
+RTS RT \
+RTS##_##RT##_ext_##CTS##_##CT##_rshift_##N##_##ATS##_##AT(ATS AT v)\
+{  \
+return (CTS CT)(v >> N);   \
+}
+
+#define ULONG_EXT_USHORT_RSHIFT_N_ULONG(N) \
+   RT_EXT_CT_RSHIFT_N_AT(unsigned,long,unsigned,short,N,unsigned,long)
+
+#define ULONG_EXT_UINT_RSHIFT_N_ULONG(N) \
+   RT_EXT_CT_RSHIFT_N_AT(unsigned,long,unsigned,int,N,unsigned,long)
+
+#define UINT_EXT_USHORT_RSHIFT_N_UINT(N) \
+   RT_EXT_CT_RSHIFT_N_AT(unsigned,int,unsigned,short,N,unsigned,int)
+
+#define UINT_EXT_USHORT_RSHIFT_N_ULONG(N) \
+   RT_EXT_CT_RSHIFT_N_AT(unsigned,int,unsigned,short,N,unsigned,long)
+
+#define ULONG_EXT_USHORT_RSHIFT_N_UINT(N) \
+   RT_EXT_CT_RSHIFT_N_AT(unsigned,long,unsigned,short,N,unsigned,int)
+
+#endif /* EXTEND_SHIFT_HELPERS_H */
diff --git a/gcc/testsuite/gcc.target/riscv/pr111501.c b/gcc/testsuite/gcc.target/riscv/pr111501.c
new file mode 100644
index 000..db48c34ce9a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr111501.c

[PATCH v2 4/4] RISC-V: Cover sign-extensions in lshr3_zero_extend_4

2024-05-08 Thread Christoph Müllner
The lshr3_zero_extend_4 pattern targets bit extraction
with zero-extension. This pattern represents the canonical form
of zero-extensions of a logical right shift.

The same optimization can be applied to sign-extensions.
Given that the two optimizations are so similar, this patch converts
the existing one to cover the sign-extension case as well.
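
A small C model of the sign-extending variant may help to see why only
the second shift changes. This is a sketch with made-up function names,
assuming a 64-bit register and a field that fits in it. It also relies
on GCC's documented behaviour for converting out-of-range values to a
signed type and for right-shifting negative values (arithmetic shift);
ISO C leaves both implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Reference form: zero-extract the field, then sign-extend it with
   the xor/subtract trick.  */
int64_t
sext_extract_ref (uint64_t x, unsigned size, unsigned pos)
{
  assert (size >= 1 && pos + size <= 64);
  uint64_t mask = size == 64 ? ~UINT64_C (0) : (UINT64_C (1) << size) - 1;
  uint64_t field = (x >> pos) & mask;
  uint64_t sign = UINT64_C (1) << (size - 1);
  return (int64_t) ((field ^ sign) - sign);
}

/* Two-shift form: same left shift as for zero-extension, but the
   right shift is arithmetic (srai instead of srli), so the field's
   top bit is replicated into the upper bits.  */
int64_t
sext_extract_shifts (uint64_t x, unsigned size, unsigned pos)
{
  assert (size >= 1 && pos + size <= 64);
  unsigned lshamt = 64 - size - pos;  /* slli amount */
  unsigned rshamt = lshamt + pos;     /* srai amount */
  return (int64_t) (x << lshamt) >> rshamt;
}
```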

gcc/ChangeLog:

* config/riscv/iterators.md (ashiftrt): New code attribute
'extract_shift' and adding extractions to optab.
* config/riscv/riscv.md (*lshr3_zero_extend_4): Rename to...
(*3):...this and add support for
sign-extensions.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/extend-shift-helpers.h: Add helpers for
sign-extension.
* gcc.target/riscv/sign-extend-rshift-32.c: New test.
* gcc.target/riscv/sign-extend-rshift-64.c: New test.
* gcc.target/riscv/sign-extend-rshift.c: New test.

Signed-off-by: Christoph Müllner 
---
 gcc/config/riscv/iterators.md |   4 +
 gcc/config/riscv/riscv.md |  25 ++--
 .../gcc.target/riscv/extend-shift-helpers.h   |  20 +++
 .../gcc.target/riscv/sign-extend-rshift-32.c  |  17 +++
 .../gcc.target/riscv/sign-extend-rshift-64.c  |  17 +++
 .../gcc.target/riscv/sign-extend-rshift.c | 123 ++
 6 files changed, 198 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/sign-extend-rshift-32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/sign-extend-rshift-64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/sign-extend-rshift.c

diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
index c5ca01f382a..8a9d1986b4a 100644
--- a/gcc/config/riscv/iterators.md
+++ b/gcc/config/riscv/iterators.md
@@ -155,6 +155,8 @@ (define_code_iterator any_extend [sign_extend zero_extend])
 (define_code_iterator any_extract [sign_extract zero_extract])
 (define_code_attr extract_sidi_shift [(sign_extract "sraiw")
  (zero_extract "srliw")])
+(define_code_attr extract_shift [(sign_extract "ashiftrt")
+(zero_extract "lshiftrt")])
 
 ;; This code iterator allows the two right shift instructions to be
 ;; generated from the same template.
@@ -261,6 +263,8 @@ (define_code_attr optab [(ashift "ashl")
 (us_minus "ussub")
 (sign_extend "extend")
 (zero_extend "zero_extend")
+(sign_extract "extract")
+(zero_extract "zero_extract")
 (fix "fix_trunc")
 (unsigned_fix "fixuns_trunc")])
 
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 58bf7712277..620a1b3bd32 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -2793,24 +2793,33 @@ (define_insn "*lshrsi3_zero_extend_3"
   [(set_attr "type" "shift")
(set_attr "mode" "SI")])
 
-;; Canonical form for a zero-extend of a logical right shift.
-;; Special cases are handled above.
-;; Skip for single-bit extraction (Zbs/XTheadBs) and th.extu (XTheadBb)
-(define_insn_and_split "*lshr3_zero_extend_4"
+;; Canonical form for a extend of a logical shift right (sign/zero extraction).
+;; Special cases, that are ignored (handled elsewhere):
+;; * Single-bit extraction (Zbs/XTheadBs)
+;; * Single-bit extraction (Zicondops/XVentanaCondops)
+;; * Single-bit extraction (SFB)
+;; * Extraction instruction th.ext(u) (XTheadBb)
+;; * lshrsi3_extend_2 (see above)
+(define_insn_and_split "*3"
   [(set (match_operand:GPR 0 "register_operand" "=r")
-(zero_extract:GPR
+(any_extract:GPR
(match_operand:GPR 1 "register_operand" " r")
(match_operand 2 "const_int_operand")
(match_operand 3 "const_int_operand")))
(clobber (match_scratch:GPR  4 "="))]
-  "!((TARGET_ZBS || TARGET_XTHEADBS) && (INTVAL (operands[2]) == 1))
-   && !TARGET_XTHEADBB"
+  "!((TARGET_ZBS || TARGET_XTHEADBS || TARGET_ZICOND
+  || TARGET_XVENTANACONDOPS || TARGET_SFB_ALU)
+ && (INTVAL (operands[2]) == 1))
+   && !TARGET_XTHEADBB
+   && !(TARGET_64BIT
+&& (INTVAL (operands[3]) > 0)
+&& (INTVAL (operands[2]) + INTVAL (operands[3]) == 32))"
   "#"
   "&& reload_completed"
   [(set (match_dup 4)
  (ashift:GPR (match_dup 1) (match_dup 2)))
(set (match_dup 0)
- (lshiftrt:GPR (match_dup 4) (match_dup 3)))]
+ (:GPR (match_dup 4) (match_dup 3)))]
 {
   int regbits = GET_MODE_BITSIZE (GET_MODE (operands[0])).to_constant ();
   int sizebits = INTVAL (operands[2]);

[PATCH v2 2/4] RISC-V: Cover sign-extensions in lshrsi3_zero_extend_2

2024-05-08 Thread Christoph Müllner
The pattern lshrsi3_zero_extend_2 extracts the most significant bits of the
lower 32-bit word and zero-extends them back to DImode.
This is realized using srliw, which operates on the lower 32 bits of
the source register.

The same optimization can be applied to sign-extensions by emitting
sraiw instead of srliw.

Given that these two optimizations are so similar, this patch simply
converts the existing one to cover the sign-extension case as well.
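
To see why a single W-form shift is enough here, the following C sketch
models srliw and sraiw (the function names are invented for this
example). When size + pos == 32 and pos > 0, the field is exactly
bits [pos, 31] of the lower word, so shifting the 32-bit value right by
pos isolates it. The W-form then sign-extends the 32-bit result: for
srliw the top bit is already zero, which gives a zero-extension, and
for sraiw it is the field's sign bit. The model assumes GCC's
arithmetic right shift on negative signed values.

```c
#include <assert.h>
#include <stdint.h>

/* Model of srliw: logical right shift of the low 32 bits, result
   sign-extended from 32 to 64 bits.  */
int64_t
model_srliw (uint64_t x, unsigned shamt)
{
  assert (shamt < 32);
  return (int64_t) (int32_t) ((uint32_t) x >> shamt);
}

/* Model of sraiw: arithmetic right shift of the low 32 bits, result
   sign-extended from 32 to 64 bits.  */
int64_t
model_sraiw (uint64_t x, unsigned shamt)
{
  assert (shamt < 32);
  return (int64_t) ((int32_t) (uint32_t) x >> shamt);
}
```

With a shift amount of 24 this corresponds to an 8-bit field at
position 24, the case exercised by sub2 in sign-extend-1.c.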

gcc/ChangeLog:

* config/riscv/iterators.md (sraiw): New code iterator 'any_extract'.
New code attribute 'extract_sidi_shift'.
* config/riscv/riscv.md (*lshrsi3_zero_extend_2): Rename to...
(*lshrsi3_extend_2):...this and add support for sign-extensions.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sign-extend-1.c: Test sraiw 24 and sraiw 16.

Signed-off-by: Christoph Müllner 
---
 gcc/config/riscv/iterators.md  |  6 ++
 gcc/config/riscv/riscv.md  |  9 +
 gcc/testsuite/gcc.target/riscv/sign-extend-1.c | 14 ++
 3 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
index 32e1b140305..c5ca01f382a 100644
--- a/gcc/config/riscv/iterators.md
+++ b/gcc/config/riscv/iterators.md
@@ -150,6 +150,12 @@ (define_mode_attr slot12_offset [(SI "-52") (DI "-104")])
 ;; to use the same template.
 (define_code_iterator any_extend [sign_extend zero_extend])
 
+;; These code iterators allow unsigned and signed extraction to be generated
+;; from the same template.
+(define_code_iterator any_extract [sign_extract zero_extract])
+(define_code_attr extract_sidi_shift [(sign_extract "sraiw")
+ (zero_extract "srliw")])
+
 ;; This code iterator allows the two right shift instructions to be
 ;; generated from the same template.
 (define_code_iterator any_shiftrt [ashiftrt lshiftrt])
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 24558682eb8..b7fc13e4e61 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -2765,16 +2765,17 @@ (define_insn "*lshrsi3_zero_extend_1"
   [(set_attr "type" "shift")
(set_attr "mode" "SI")])
 
-;; Canonical form for a zero-extend of a logical right shift.
-(define_insn "*lshrsi3_zero_extend_2"
+;; Canonical form for a sign/zero-extend of a logical right shift.
+;; Special case: extract MSB bits of lower 32-bit word
+(define_insn "*lshrsi3_extend_2"
   [(set (match_operand:DI   0 "register_operand" "=r")
-   (zero_extract:DI (match_operand:DI  1 "register_operand" " r")
+   (any_extract:DI (match_operand:DI  1 "register_operand" " r")
 (match_operand 2 "const_int_operand")
 (match_operand 3 "const_int_operand")))]
   "(TARGET_64BIT && (INTVAL (operands[3]) > 0)
 && (INTVAL (operands[2]) + INTVAL (operands[3]) == 32))"
 {
-  return "srliw\t%0,%1,%3";
+  return "\t%0,%1,%3";
 }
   [(set_attr "type" "shift")
(set_attr "mode" "SI")])
diff --git a/gcc/testsuite/gcc.target/riscv/sign-extend-1.c b/gcc/testsuite/gcc.target/riscv/sign-extend-1.c
index e9056ec0d42..d8c18dd1aaa 100644
--- a/gcc/testsuite/gcc.target/riscv/sign-extend-1.c
+++ b/gcc/testsuite/gcc.target/riscv/sign-extend-1.c
@@ -9,6 +9,20 @@ foo1 (int i)
 }
 /* { dg-final { scan-assembler "sraiw\ta\[0-9\],a\[0-9\],31" } } */
 
+signed char
+sub2 (long i)
+{
+  return i >> 24;
+}
+/* { dg-final { scan-assembler "sraiw\ta\[0-9\],a\[0-9\],24" } } */
+
+signed short
+sub3 (long i)
+{
+  return i >> 16;
+}
+/* { dg-final { scan-assembler "sraiw\ta\[0-9\],a\[0-9\],16" } } */
+
 /* { dg-final { scan-assembler-not "srai\t" } } */
 /* { dg-final { scan-assembler-not "srli\t" } } */
 /* { dg-final { scan-assembler-not "srliw\t" } } */
-- 
2.44.0



[PATCH v2 1/4] RISC-V: Add test for sraiw-31 special case

2024-05-08 Thread Christoph Müllner
We already optimize a sign-extension of a right-shift by 31 in
si3_extend.  Let's add a test for that (similar to
zero-extend-1.c).

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sign-extend-1.c: New test.

Signed-off-by: Christoph Müllner 
---
 gcc/testsuite/gcc.target/riscv/sign-extend-1.c | 14 ++
 1 file changed, 14 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/sign-extend-1.c

diff --git a/gcc/testsuite/gcc.target/riscv/sign-extend-1.c b/gcc/testsuite/gcc.target/riscv/sign-extend-1.c
new file mode 100644
index 000..e9056ec0d42
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/sign-extend-1.c
@@ -0,0 +1,14 @@
+/* { dg-do compile { target { riscv64*-*-* } } } */
+/* { dg-options "-march=rv64gc -mabi=lp64" } */
+/* { dg-skip-if "" { *-*-* } {"-O0" "-Os" "-Og" "-Oz" "-flto" } } */
+
+signed long
+foo1 (int i)
+{
+  return i >> 31;
+}
+/* { dg-final { scan-assembler "sraiw\ta\[0-9\],a\[0-9\],31" } } */
+
+/* { dg-final { scan-assembler-not "srai\t" } } */
+/* { dg-final { scan-assembler-not "srli\t" } } */
+/* { dg-final { scan-assembler-not "srliw\t" } } */
-- 
2.44.0



[PATCH 2/2] RISC-V: Add cmpmemsi expansion

2024-05-07 Thread Christoph Müllner
GCC has a generic cmpmemsi expansion via the by-pieces framework,
which shows some room for target-specific optimizations.
E.g. for comparing two aligned memory blocks of 15 bytes
we get the following sequence:

my_mem_cmp_aligned_15:
    li      a4,0
    j       .L2
.L8:
    bgeu    a4,a7,.L7
.L2:
    add     a2,a0,a4
    add     a3,a1,a4
    lbu     a5,0(a2)
    lbu     a6,0(a3)
    addi    a4,a4,1
    li      a7,15       // missed hoisting
    subw    a5,a5,a6
    andi    a5,a5,0xff  // useless
    beq     a5,zero,.L8
    lbu     a0,0(a2)    // loading again!
    lbu     a5,0(a3)    // loading again!
    subw    a0,a0,a5
    ret
.L7:
    li      a0,0
    ret

Diff first byte: 15 insns
Diff second byte: 25 insns
No diff: 25 insns

Possible improvements:
* unroll the loop and use load-with-displacement to avoid offset increments
* load and compare multiple (aligned) bytes at once
* Use the bitmanip/strcmp result calculation (reverse words and
  synthesize (a2 >= a3) ? 1 : -1 in a branchless sequence)

When applying these improvements we get the following sequence:

my_mem_cmp_aligned_15:
    ld      a5,0(a0)
    ld      a4,0(a1)
    bne     a5,a4,.L2
    ld      a5,8(a0)
    ld      a4,8(a1)
    slli    a5,a5,8
    slli    a4,a4,8
    bne     a5,a4,.L2
    li      a0,0
.L3:
    sext.w  a0,a0
    ret
.L2:
    rev8    a5,a5
    rev8    a4,a4
    sltu    a5,a5,a4
    neg     a5,a5
    ori     a0,a5,1
    j       .L3

Diff first byte: 11 insns
Diff second byte: 16 insns
No diff: 11 insns
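
The result calculation at .L2 above (rev8, sltu, neg, ori) can be
modelled in C roughly as follows. This is a sketch, not the code the
patch emits: the function names are invented, and it assumes two 8-byte
chunks that were loaded on a little-endian target and are known to
differ. Reversing the bytes puts the first byte in memory into the most
significant position, so one unsigned comparison orders the chunks the
way memcmp would.

```c
#include <assert.h>
#include <stdint.h>

/* Byte-reverse a 64-bit value, as rev8 does on RV64.  */
static uint64_t
byte_reverse64 (uint64_t v)
{
  uint64_t r = 0;
  for (int i = 0; i < 8; i++)
    {
      r = (r << 8) | (v & 0xff);
      v >>= 8;
    }
  return r;
}

/* Return -1 if chunk A sorts before chunk B in memory order,
   otherwise 1.  A and B must differ.  */
int
memcmp_result_from_words (uint64_t a, uint64_t b)
{
  assert (a != b);
  uint64_t ra = byte_reverse64 (a);
  uint64_t rb = byte_reverse64 (b);
  uint64_t lt = ra < rb;             /* sltu: 1 if A sorts first */
  return (int) (-(int64_t) lt | 1);  /* neg + ori 1: -1 or 1 */
}
```

Comparing the unreversed words directly would give the wrong answer
whenever the first differing byte is not the most significant one,
which is why the byte reversal is needed on little-endian.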

This patch implements these improvements.

The tests consist of an execution test (similar to
gcc/testsuite/gcc.dg/torture/inline-mem-cmp-1.c) and a few tests
that test the expansion conditions (known length and alignment).

Similar to the cpymemsi expansion this patch does not introduce any
gating for the cmpmemsi expansion (on top of requiring the known length,
alignment and Zbb).

Bootstrapped and SPEC CPU 2017 tested.

gcc/ChangeLog:

* config/riscv/riscv-protos.h (riscv_expand_block_compare): New
prototype.
* config/riscv/riscv-string.cc (GEN_EMIT_HELPER2): New helper.
(do_load_from_addr): Add support for HI and SI/64 modes.
(emit_memcmp_scalar_load_and_compare): New helper to emit memcmp.
(emit_memcmp_scalar_result_calculation): Likewise.
(riscv_expand_block_compare_scalar): Likewise.
(riscv_expand_block_compare): New RISC-V expander for memory compare.
* config/riscv/riscv.md (cmpmemsi): New cmpmem expansion.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/cmpmemsi-1.c: New test.
* gcc.target/riscv/cmpmemsi-2.c: New test.
* gcc.target/riscv/cmpmemsi-3.c: New test.
* gcc.target/riscv/cmpmemsi.c: New test.

Signed-off-by: Christoph Müllner 
---
 gcc/config/riscv/riscv-protos.h |   1 +
 gcc/config/riscv/riscv-string.cc| 161 
 gcc/config/riscv/riscv.md   |  15 ++
 gcc/testsuite/gcc.target/riscv/cmpmemsi-1.c |   6 +
 gcc/testsuite/gcc.target/riscv/cmpmemsi-2.c |  42 +
 gcc/testsuite/gcc.target/riscv/cmpmemsi-3.c |  43 ++
 gcc/testsuite/gcc.target/riscv/cmpmemsi.c   |  22 +++
 7 files changed, 290 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/cmpmemsi-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cmpmemsi-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cmpmemsi-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cmpmemsi.c

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index e5aebf3fc3d..30ffe30be1d 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -188,6 +188,7 @@ rtl_opt_pass * make_pass_avlprop (gcc::context *ctxt);
 rtl_opt_pass * make_pass_vsetvl (gcc::context *ctxt);
 
 /* Routines implemented in riscv-string.c.  */
+extern bool riscv_expand_block_compare (rtx, rtx, rtx, rtx);
 extern bool riscv_expand_block_move (rtx, rtx, rtx);
 
 /* Information about one CPU we know about.  */
diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc
index b09b51d7526..9d4dc0cb827 100644
--- a/gcc/config/riscv/riscv-string.cc
+++ b/gcc/config/riscv/riscv-string.cc
@@ -86,6 +86,7 @@ GEN_EMIT_HELPER2(th_rev) /* do_th_rev2  */
 GEN_EMIT_HELPER2(th_tstnbz) /* do_th_tstnbz2  */
 GEN_EMIT_HELPER3(xor) /* do_xor3  */
 GEN_EMIT_HELPER2(zero_extendqi) /* do_zero_extendqi2  */
+GEN_EMIT_HELPER2(zero_extendhi) /* do_zero_extendhi2  */
 
 #undef GEN_EMIT_HELPER2
 #undef GEN_EMIT_HELPER3
@@ -109,6 +110,10 @@ do_load_from_addr (machine_mode mode, rtx dest, rtx addr_reg, rtx addr)
 
   if (mode == QImode)
 do_zero_extendqi2 (dest, mem);
+  else if (mode == HImode)
+do_zero_extendhi2 (dest, mem);
+  else if (mode == SImode && TARGET_64BIT)
+emit_insn (gen_zero_extendsidi2 (dest, mem));
   else if (mo

[PATCH 1/2] RISC-V: Add tests for cpymemsi expansion

2024-05-07 Thread Christoph Müllner
The cpymemsi expansion has been available for RISC-V since the initial port.
However, there are no tests to detect regressions.
This patch adds such tests.

Three of the tests target the expansion requirements (known length and
alignment). One test reuses an existing memcpy test from the by-pieces
framework (gcc/testsuite/gcc.dg/torture/inline-mem-cpy-1.c).

gcc/testsuite/ChangeLog:

* gcc.target/riscv/cpymemsi-1.c: New test.
* gcc.target/riscv/cpymemsi-2.c: New test.
* gcc.target/riscv/cpymemsi-3.c: New test.
* gcc.target/riscv/cpymemsi.c: New test.

Signed-off-by: Christoph Müllner 
---
 gcc/testsuite/gcc.target/riscv/cpymemsi-1.c |  9 +
 gcc/testsuite/gcc.target/riscv/cpymemsi-2.c | 42 
 gcc/testsuite/gcc.target/riscv/cpymemsi-3.c | 43 +
 gcc/testsuite/gcc.target/riscv/cpymemsi.c   | 22 +++
 4 files changed, 116 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/cpymemsi-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cpymemsi-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cpymemsi-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cpymemsi.c

diff --git a/gcc/testsuite/gcc.target/riscv/cpymemsi-1.c b/gcc/testsuite/gcc.target/riscv/cpymemsi-1.c
new file mode 100644
index 000..983b564ccaf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/cpymemsi-1.c
@@ -0,0 +1,9 @@
+/* { dg-do run } */
+/* { dg-options "-march=rv32gc -save-temps -g0 -fno-lto" { target { rv32 } } } */
+/* { dg-options "-march=rv64gc -save-temps -g0 -fno-lto" { target { rv64 } } } */
+/* { dg-additional-options "-DRUN_FRACTION=11" { target simulator } } */
+/* { dg-timeout-factor 2 } */
+
+#include "../../gcc.dg/memcmp-1.c"
+/* Yeah, this memcmp test exercises plenty of memcpy, more than any of the
+   memcpy tests.  */
diff --git a/gcc/testsuite/gcc.target/riscv/cpymemsi-2.c b/gcc/testsuite/gcc.target/riscv/cpymemsi-2.c
new file mode 100644
index 000..833d1c04487
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/cpymemsi-2.c
@@ -0,0 +1,42 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gc" { target { rv32 } } } */
+/* { dg-options "-march=rv64gc" { target { rv64 } } } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Os" "-Og" "-Oz" } } */
+
+#include 
+#define aligned32 __attribute__ ((aligned (32)))
+
+const char myconst15[] aligned32 = { 1, 2, 3, 4, 5, 6, 7,
+0, 1, 2, 3, 4, 5, 6, 7 };
+const char myconst23[] aligned32 = { 1, 2, 3, 4, 5, 6, 7,
+0, 1, 2, 3, 4, 5, 6, 7,
+0, 1, 2, 3, 4, 5, 6, 7 };
+const char myconst31[] aligned32 = { 1, 2, 3, 4, 5, 6, 7,
+0, 1, 2, 3, 4, 5, 6, 7,
+0, 1, 2, 3, 4, 5, 6, 7,
+0, 1, 2, 3, 4, 5, 6, 7 };
+
+/* No expansion (unknown alignment) */
+#define MY_MEM_CPY_N(N)\
+void my_mem_cpy_##N (char *b1, const char *b2) \
+{  \
+  __builtin_memcpy (b1, b2, N);\
+}
+
+/* No expansion (unknown alignment) */
+#define MY_MEM_CPY_CONST_N(N)  \
+void my_mem_cpy_const_##N (char *b1)   \
+{  \
+  __builtin_memcpy (b1, myconst##N, sizeof(myconst##N));\
+}
+
+MY_MEM_CPY_N(15)
+MY_MEM_CPY_CONST_N(15)
+
+MY_MEM_CPY_N(23)
+MY_MEM_CPY_CONST_N(23)
+
+MY_MEM_CPY_N(31)
+MY_MEM_CPY_CONST_N(31)
+
+/* { dg-final { scan-assembler-times "\t(call|tail)\tmemcpy" 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/cpymemsi-3.c b/gcc/testsuite/gcc.target/riscv/cpymemsi-3.c
new file mode 100644
index 000..803765195b2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/cpymemsi-3.c
@@ -0,0 +1,43 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gc" { target { rv32 } } } */
+/* { dg-options "-march=rv64gc" { target { rv64 } } } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Os" "-Og" "-Oz" } } */
+
+#include 
+#define aligned32 __attribute__ ((aligned (32)))
+
+const char myconst15[] aligned32 = { 1, 2, 3, 4, 5, 6, 7,
+0, 1, 2, 3, 4, 5, 6, 7 };
+const char myconst23[] aligned32 = { 1, 2, 3, 4, 5, 6, 7,
+0, 1, 2, 3, 4, 5, 6, 7,
+0, 1, 2, 3, 4, 5, 6, 7 };
+const char myconst31[] aligned32 = { 1, 2, 3, 4, 5, 6, 7,
+0, 1, 2, 3, 4, 5, 6, 7,
+0, 1, 2, 3, 4, 5, 6, 7,
+0, 1, 2, 3, 4, 5, 6, 7 };
+
+#define MY_MEM_CPY_ALIGNED_N(N)\
+void my_mem_cpy

[PATCH 3/3] RISC-V: Add memset-zero expansion to cbo.zero

2024-05-07 Thread Christoph Müllner
The Zicboz extension offers the cbo.zero instruction, which can be used
to zero a memory region corresponding to a cache block.
The Zic64b extension defines the cache block size to be 64 bytes.
If both extensions are available, it is possible to use cbo.zero
to clear memory, if the alignment and size constraints are met.
This patch implements this.
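
The size arithmetic described above can be sketched as follows. This is a
minimal illustration, not the code from the patch: the struct and function
names are hypothetical, the 64-byte block size is the one fixed by Zic64b,
and the upper limit on the number of emitted instructions is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only: describes how a clear of
   LENGTH bytes could be split when the destination is ALIGN-byte aligned.  */
struct clear_plan
{
  bool use_cbo_zero;   /* false: fall back to the generic expansion */
  size_t num_blocks;   /* number of cbo.zero instructions */
  size_t tail_bytes;   /* bytes left over for ordinary stores */
};

struct clear_plan
plan_block_clear (size_t length, size_t align)
{
  const size_t cbo_bytes = 64;   /* cache block size fixed by Zic64b */
  struct clear_plan plan = { false, 0, 0 };

  /* Too short or insufficiently aligned: cbo.zero cannot be used.  */
  if (length < cbo_bytes || align < cbo_bytes)
    return plan;

  plan.use_cbo_zero = true;
  plan.num_blocks = length / cbo_bytes;
  plan.tail_bytes = length % cbo_bytes;

  /* Sanity check: the blocks and the tail cover the whole region.  */
  assert (plan.num_blocks * cbo_bytes + plan.tail_bytes == length);
  return plan;
}
```

For example, a 200-byte clear of a 64-byte aligned buffer would use three
cbo.zero instructions and leave 8 bytes for ordinary stores.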

gcc/ChangeLog:

* config/riscv/riscv-protos.h (riscv_expand_block_clear): New prototype.
* config/riscv/riscv-string.cc (riscv_expand_block_clear_zicboz_zic64b):
New function to expand a block-clear with cbo.zero.
(riscv_expand_block_clear): New RISC-V block-clear expansion function.
* config/riscv/riscv.md (setmem): New setmem expansion.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/cmo-zicboz-zic64-1.c: New test.

Signed-off-by: Christoph Müllner 
---
 gcc/config/riscv/riscv-protos.h   |  1 +
 gcc/config/riscv/riscv-string.cc  | 59 +++
 gcc/config/riscv/riscv.md | 24 
 .../gcc.target/riscv/cmo-zicboz-zic64-1.c | 43 ++
 4 files changed, 127 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/cmo-zicboz-zic64-1.c

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index e5aebf3fc3d..255fd6a0de9 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -189,6 +189,7 @@ rtl_opt_pass * make_pass_vsetvl (gcc::context *ctxt);
 
 /* Routines implemented in riscv-string.c.  */
 extern bool riscv_expand_block_move (rtx, rtx, rtx);
+extern bool riscv_expand_block_clear (rtx, rtx);
 
 /* Information about one CPU we know about.  */
 struct riscv_cpu_info {
diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc
index b09b51d7526..cf92256bc4e 100644
--- a/gcc/config/riscv/riscv-string.cc
+++ b/gcc/config/riscv/riscv-string.cc
@@ -787,6 +787,65 @@ riscv_expand_block_move (rtx dest, rtx src, rtx length)
   return false;
 }
 
+/* Expand a block-clear instruction via cbo.zero instructions.  */
+
+static bool
+riscv_expand_block_clear_zicboz_zic64b (rtx dest, rtx length)
+{
+  unsigned HOST_WIDE_INT hwi_length;
+  unsigned HOST_WIDE_INT align;
+  const unsigned HOST_WIDE_INT cbo_bytes = 64;
+
+  gcc_assert (TARGET_ZICBOZ && TARGET_ZIC64B);
+
+  if (!CONST_INT_P (length))
+return false;
+
+  hwi_length = UINTVAL (length);
+  if (hwi_length < cbo_bytes)
+return false;
+
+  align = MEM_ALIGN (dest) / BITS_PER_UNIT;
+  if (align < cbo_bytes)
+return false;
+
+  /* We don't emit loops.  Instead apply move-bytes limitation.  */
+  unsigned HOST_WIDE_INT max_bytes = RISCV_MAX_MOVE_BYTES_STRAIGHT /
+ UNITS_PER_WORD * cbo_bytes;
+  if (hwi_length > max_bytes)
+return false;
+
+  unsigned HOST_WIDE_INT offset = 0;
+  while (offset + cbo_bytes <= hwi_length)
+{
+  rtx mem = adjust_address (dest, BLKmode, offset);
+  rtx addr = force_reg (Pmode, XEXP (mem, 0));
+  emit_insn (gen_riscv_zero_di (addr));
+  offset += cbo_bytes;
+}
+
+  if (offset < hwi_length)
+{
+  rtx mem = adjust_address (dest, BLKmode, offset);
+  clear_by_pieces (mem, hwi_length - offset, align);
+}
+
+  return true;
+}
+
+bool
+riscv_expand_block_clear (rtx dest, rtx length)
+{
+  /* Only use setmem-zero expansion for Zicboz + Zic64b.  */
+  if (!TARGET_ZICBOZ || !TARGET_ZIC64B)
+return false;
+
+  if (optimize_function_for_size_p (cfun))
+return false;
+
+  return riscv_expand_block_clear_zicboz_zic64b (dest, length);
+}
+
 /* --- Vector expanders --- */
 
 namespace riscv_vector {
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index d4676507b45..729c102812c 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -2598,6 +2598,30 @@ (define_expand "cpymem"
 FAIL;
 })
 
+;; Fill memory with constant byte.
+;; Argument 0 is the destination
+;; Argument 1 is the constant byte
+;; Argument 2 is the length
+;; Argument 3 is the alignment
+
+(define_expand "setmem"
+  [(parallel [(set (match_operand:BLK 0 "memory_operand")
+  (match_operand:QI 2 "const_int_operand"))
+ (use (match_operand:P 1 ""))
+ (use (match_operand:SI 3 "const_int_operand"))])]
+ ""
+ {
+  /* If value to set is not zero, use the library routine.  */
+  if (operands[2] != const0_rtx)
+FAIL;
+
+  if (riscv_expand_block_clear (operands[0], operands[1]))
+DONE;
+  else
+FAIL;
+})
+
+
 ;; Expand in-line code to clear the instruction cache between operand[0] and
 ;; operand[1].
 (define_expand "clear_cache"
diff --git a/gcc/testsuite/gcc.target/riscv/cmo-zicboz-zic64-1.c 
b/gcc/testsuite/gcc.target/riscv/cmo-zicboz-zic64-1.c
new file mode 100644
index 000..c2d79eb7ae6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/cmo-zicboz-zic64-1.c
@@ -0,0 +1,43 @@
+/* { dg-d

[PATCH 2/3] RISC-V: testsuite: Make cmo tests LTO safe

2024-05-07 Thread Christoph Müllner
Let's add '\t' to the instruction match pattern to avoid false positive
matches when compiling with -flto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/cmo-zicbom-1.c: Add \t to test pattern.
* gcc.target/riscv/cmo-zicbom-2.c: Likewise.
* gcc.target/riscv/cmo-zicbop-1.c: Likewise.
* gcc.target/riscv/cmo-zicbop-2.c: Likewise.
* gcc.target/riscv/cmo-zicboz-1.c: Likewise.
* gcc.target/riscv/cmo-zicboz-2.c: Likewise.

Signed-off-by: Christoph Müllner 
---
 gcc/testsuite/gcc.target/riscv/cmo-zicbom-1.c | 6 +++---
 gcc/testsuite/gcc.target/riscv/cmo-zicbom-2.c | 6 +++---
 gcc/testsuite/gcc.target/riscv/cmo-zicbop-1.c | 6 +++---
 gcc/testsuite/gcc.target/riscv/cmo-zicbop-2.c | 6 +++---
 gcc/testsuite/gcc.target/riscv/cmo-zicboz-1.c | 2 +-
 gcc/testsuite/gcc.target/riscv/cmo-zicboz-2.c | 2 +-
 6 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/cmo-zicbom-1.c 
b/gcc/testsuite/gcc.target/riscv/cmo-zicbom-1.c
index 6341f7874d3..02c38e201fa 100644
--- a/gcc/testsuite/gcc.target/riscv/cmo-zicbom-1.c
+++ b/gcc/testsuite/gcc.target/riscv/cmo-zicbom-1.c
@@ -24,6 +24,6 @@ void foo3()
 __builtin_riscv_zicbom_cbo_inval((void*)0x111);
 }
 
-/* { dg-final { scan-assembler-times "cbo.clean" 3 } } */
-/* { dg-final { scan-assembler-times "cbo.flush" 3 } } */
-/* { dg-final { scan-assembler-times "cbo.inval" 3 } } */
+/* { dg-final { scan-assembler-times "cbo.clean\t" 3 } } */
+/* { dg-final { scan-assembler-times "cbo.flush\t" 3 } } */
+/* { dg-final { scan-assembler-times "cbo.inval\t" 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/cmo-zicbom-2.c 
b/gcc/testsuite/gcc.target/riscv/cmo-zicbom-2.c
index a04f106c8b0..040b96952bc 100644
--- a/gcc/testsuite/gcc.target/riscv/cmo-zicbom-2.c
+++ b/gcc/testsuite/gcc.target/riscv/cmo-zicbom-2.c
@@ -24,6 +24,6 @@ void foo3()
 __builtin_riscv_zicbom_cbo_inval((void*)0x111);
 }
 
-/* { dg-final { scan-assembler-times "cbo.clean" 3 } } */
-/* { dg-final { scan-assembler-times "cbo.flush" 3 } } */
-/* { dg-final { scan-assembler-times "cbo.inval" 3 } } */
+/* { dg-final { scan-assembler-times "cbo.clean\t" 3 } } */
+/* { dg-final { scan-assembler-times "cbo.flush\t" 3 } } */
+/* { dg-final { scan-assembler-times "cbo.inval\t" 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/cmo-zicbop-1.c 
b/gcc/testsuite/gcc.target/riscv/cmo-zicbop-1.c
index c5d78c1763d..97181154d85 100644
--- a/gcc/testsuite/gcc.target/riscv/cmo-zicbop-1.c
+++ b/gcc/testsuite/gcc.target/riscv/cmo-zicbop-1.c
@@ -18,6 +18,6 @@ int foo1()
   return __builtin_riscv_zicbop_cbo_prefetchi(1);
 }
 
-/* { dg-final { scan-assembler-times "prefetch.i" 1 } } */
-/* { dg-final { scan-assembler-times "prefetch.r" 4 } } */
-/* { dg-final { scan-assembler-times "prefetch.w" 4 } } */
+/* { dg-final { scan-assembler-times "prefetch.i\t" 1 } } */
+/* { dg-final { scan-assembler-times "prefetch.r\t" 4 } } */
+/* { dg-final { scan-assembler-times "prefetch.w\t" 4 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/cmo-zicbop-2.c 
b/gcc/testsuite/gcc.target/riscv/cmo-zicbop-2.c
index 6576365b39c..4871a97b21a 100644
--- a/gcc/testsuite/gcc.target/riscv/cmo-zicbop-2.c
+++ b/gcc/testsuite/gcc.target/riscv/cmo-zicbop-2.c
@@ -18,6 +18,6 @@ int foo1()
   return __builtin_riscv_zicbop_cbo_prefetchi(1);
 }
 
-/* { dg-final { scan-assembler-times "prefetch.i" 1 } } */
-/* { dg-final { scan-assembler-times "prefetch.r" 4 } } */
-/* { dg-final { scan-assembler-times "prefetch.w" 4 } } */ 
+/* { dg-final { scan-assembler-times "prefetch.i\t" 1 } } */
+/* { dg-final { scan-assembler-times "prefetch.r\t" 4 } } */
+/* { dg-final { scan-assembler-times "prefetch.w\t" 4 } } */ 
diff --git a/gcc/testsuite/gcc.target/riscv/cmo-zicboz-1.c 
b/gcc/testsuite/gcc.target/riscv/cmo-zicboz-1.c
index 5eb78ab94b5..63b8782bf89 100644
--- a/gcc/testsuite/gcc.target/riscv/cmo-zicboz-1.c
+++ b/gcc/testsuite/gcc.target/riscv/cmo-zicboz-1.c
@@ -10,4 +10,4 @@ void foo1()
 __builtin_riscv_zicboz_cbo_zero((void*)0x121);
 }
 
-/* { dg-final { scan-assembler-times "cbo.zero" 3 } } */ 
+/* { dg-final { scan-assembler-times "cbo.zero\t" 3 } } */ 
diff --git a/gcc/testsuite/gcc.target/riscv/cmo-zicboz-2.c 
b/gcc/testsuite/gcc.target/riscv/cmo-zicboz-2.c
index fdc9c719669..cc3bd505ec0 100644
--- a/gcc/testsuite/gcc.target/riscv/cmo-zicboz-2.c
+++ b/gcc/testsuite/gcc.target/riscv/cmo-zicboz-2.c
@@ -10,4 +10,4 @@ void foo1()
 __builtin_riscv_zicboz_cbo_zero((void*)0x121);
 }
 
-/* { dg-final { scan-assembler-times "cbo.zero" 3 } } */ 
+/* { dg-final { scan-assembler-times "cbo.zero\t" 3 } } */ 
-- 
2.44.0



[PATCH 1/3] expr: Export clear_by_pieces()

2024-05-07 Thread Christoph Müllner
Make clear_by_pieces() available to other parts of the compiler,
similar to store_by_pieces().

gcc/ChangeLog:

* expr.cc (clear_by_pieces): Remove static from clear_by_pieces.
* expr.h (clear_by_pieces): Add prototype for clear_by_pieces.

Signed-off-by: Christoph Müllner 
---
 gcc/expr.cc | 6 +-
 gcc/expr.h  | 5 +
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/gcc/expr.cc b/gcc/expr.cc
index d4414e242cb..eaf86d3d842 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -85,7 +85,6 @@ static void emit_block_move_via_sized_loop (rtx, rtx, rtx, 
unsigned, unsigned);
 static void emit_block_move_via_oriented_loop (rtx, rtx, rtx, unsigned, 
unsigned);
 static rtx emit_block_cmp_via_loop (rtx, rtx, rtx, tree, rtx, bool,
unsigned, unsigned);
-static void clear_by_pieces (rtx, unsigned HOST_WIDE_INT, unsigned int);
 static rtx_insn *compress_float_constant (rtx, rtx);
 static rtx get_subtarget (rtx);
 static rtx store_field (rtx, poly_int64, poly_int64, poly_uint64, poly_uint64,
@@ -1832,10 +1831,7 @@ store_by_pieces (rtx to, unsigned HOST_WIDE_INT len,
 return to;
 }
 
-/* Generate several move instructions to clear LEN bytes of block TO.  (A MEM
-   rtx with BLKmode).  ALIGN is maximum alignment we can assume.  */
-
-static void
+void
 clear_by_pieces (rtx to, unsigned HOST_WIDE_INT len, unsigned int align)
 {
   if (len == 0)
diff --git a/gcc/expr.h b/gcc/expr.h
index 64956f63029..75181584108 100644
--- a/gcc/expr.h
+++ b/gcc/expr.h
@@ -245,6 +245,11 @@ extern bool can_store_by_pieces (unsigned HOST_WIDE_INT,
 extern rtx store_by_pieces (rtx, unsigned HOST_WIDE_INT, by_pieces_constfn,
void *, unsigned int, bool, memop_ret);
 
+/* Generate several move instructions to clear LEN bytes of block TO.  (A MEM
+   rtx with BLKmode).  ALIGN is maximum alignment we can assume.  */
+
+extern void clear_by_pieces (rtx, unsigned HOST_WIDE_INT, unsigned int);
+
 /* If can_store_by_pieces passes for worst-case values near MAX_LEN, call
store_by_pieces within conditionals so as to handle variable LEN 
efficiently,
storing VAL, if non-NULL_RTX, or valc instead.  */
-- 
2.44.0



[PATCH 0/3] RISC-V: Add memset-zero expansion with Zicboz+Zic64b

2024-05-07 Thread Christoph Müllner
I've mentioned this patchset a few weeks ago in the RISC-V call.
Sending it now, as the release is out.

Christoph Müllner (3):
  expr: Export clear_by_pieces()
  RISC-V: testsuite: Make cmo tests LTO safe
  RISC-V: Add memset-zero expansion to cbo.zero

 gcc/config/riscv/riscv-protos.h   |  1 +
 gcc/config/riscv/riscv-string.cc  | 59 +++
 gcc/config/riscv/riscv.md | 24 
 gcc/expr.cc   |  6 +-
 gcc/expr.h|  5 ++
 gcc/testsuite/gcc.target/riscv/cmo-zicbom-1.c |  6 +-
 gcc/testsuite/gcc.target/riscv/cmo-zicbom-2.c |  6 +-
 gcc/testsuite/gcc.target/riscv/cmo-zicbop-1.c |  6 +-
 gcc/testsuite/gcc.target/riscv/cmo-zicbop-2.c |  6 +-
 gcc/testsuite/gcc.target/riscv/cmo-zicboz-1.c |  2 +-
 gcc/testsuite/gcc.target/riscv/cmo-zicboz-2.c |  2 +-
 .../gcc.target/riscv/cmo-zicboz-zic64-1.c | 43 ++
 12 files changed, 147 insertions(+), 19 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/cmo-zicboz-zic64-1.c

-- 
2.44.0



[PATCH 4/4] RISC-V: Allow by-pieces to do overlapping accesses in block_move_straight

2024-05-07 Thread Christoph Müllner
The current implementation of riscv_block_move_straight() emits a couple
of loads/stores with maximum width (e.g. 8-byte for RV64).
The remainder is handed over to move_by_pieces().
The by-pieces framework utilizes target hooks to decide about the emitted
instructions (e.g. unaligned accesses or overlapping accesses).

Since the current implementation will always request less than XLEN bytes
to be handled by the by-pieces infrastructure, it is impossible that
overlapping memory accesses can ever be emitted (the by-pieces code does
not know of any previous instructions that were emitted by the backend).

This patch changes the implementation of riscv_block_move_straight()
so that it utilizes the by-pieces framework if the remaining data
is less than 2*XLEN bytes, which is sufficient to enable overlapping
memory accesses (if the requirements for them are met).

The changes in the expansion can be seen in the adjustments of the
cpymem-NN-ooo test cases. The changes in the cpymem-NN tests are
caused by the different instruction ordering of the code emitted
by the by-pieces infrastructure, which emits alternating load/store
sequences.
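
The effect of overlapping accesses on the access offsets can be sketched
as follows. This is a simplified model rather than GCC code: the function
name is hypothetical, and it always covers a short tail with one full-width
access, whereas the by-pieces code may pick a narrower access when one fits
the tail exactly.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model, not GCC code: fill OFFSETS (capacity MAX) with the
   start offsets of WORD-byte accesses that together cover LENGTH bytes.
   A tail shorter than WORD is covered by one extra access that ends
   exactly at LENGTH and therefore overlaps the previous access.
   Returns the number of accesses.  */
size_t
overlapping_offsets (size_t length, size_t word, size_t *offsets, size_t max)
{
  assert (word > 0 && length >= word);

  size_t n = 0;
  size_t offset = 0;

  /* Full-width accesses that fit entirely inside the block.  */
  for (; offset + word <= length && n < max; offset += word)
    offsets[n++] = offset;

  /* Leftover bytes: copy part of the previous word again instead of
     emitting several narrower accesses.  */
  if (offset < length && n < max)
    offsets[n++] = length - word;

  return n;
}
```

With 4-byte words, a 15-byte copy yields offsets 0, 4, 8 and 11, and a
27-byte copy ends with offsets 20 and 23, which matches the offsets that
the adjusted cpymem-32-ooo expectations check for.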

gcc/ChangeLog:

* config/riscv/riscv-string.cc (riscv_block_move_straight):
Hand over up to 2xXLEN bytes to move_by_pieces().

gcc/testsuite/ChangeLog:

* gcc.target/riscv/cpymem-32-ooo.c: Adjustments for overlapping
access.
* gcc.target/riscv/cpymem-32.c: Adjustments for code emitted by
by-pieces.
* gcc.target/riscv/cpymem-64-ooo.c: Adjustments for overlapping
access.
* gcc.target/riscv/cpymem-64.c: Adjustments for code emitted by
by-pieces.

Signed-off-by: Christoph Müllner 
---
 gcc/config/riscv/riscv-string.cc   |  6 +++---
 gcc/testsuite/gcc.target/riscv/cpymem-32-ooo.c | 16 
 gcc/testsuite/gcc.target/riscv/cpymem-32.c | 10 --
 gcc/testsuite/gcc.target/riscv/cpymem-64-ooo.c |  8 
 gcc/testsuite/gcc.target/riscv/cpymem-64.c |  9 +++--
 5 files changed, 22 insertions(+), 27 deletions(-)

diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc
index 8fc082f..38cf60eb9cf 100644
--- a/gcc/config/riscv/riscv-string.cc
+++ b/gcc/config/riscv/riscv-string.cc
@@ -630,18 +630,18 @@ riscv_block_move_straight (rtx dest, rtx src, unsigned 
HOST_WIDE_INT length,
   delta = bits / BITS_PER_UNIT;
 
   /* Allocate a buffer for the temporary registers.  */
-  regs = XALLOCAVEC (rtx, length / delta);
+  regs = XALLOCAVEC (rtx, length / delta - 1);
 
   /* Load as many BITS-sized chunks as possible.  Use a normal load if
  the source has enough alignment, otherwise use left/right pairs.  */
-  for (offset = 0, i = 0; offset + delta <= length; offset += delta, i++)
+  for (offset = 0, i = 0; offset + 2 * delta <= length; offset += delta, i++)
 {
   regs[i] = gen_reg_rtx (mode);
   riscv_emit_move (regs[i], adjust_address (src, mode, offset));
 }
 
   /* Copy the chunks to the destination.  */
-  for (offset = 0, i = 0; offset + delta <= length; offset += delta, i++)
+  for (offset = 0, i = 0; offset + 2 * delta <= length; offset += delta, i++)
 riscv_emit_move (adjust_address (dest, mode, offset), regs[i]);
 
   /* Mop up any left-over bytes.  */
diff --git a/gcc/testsuite/gcc.target/riscv/cpymem-32-ooo.c 
b/gcc/testsuite/gcc.target/riscv/cpymem-32-ooo.c
index 947d58c30fa..2a48567353a 100644
--- a/gcc/testsuite/gcc.target/riscv/cpymem-32-ooo.c
+++ b/gcc/testsuite/gcc.target/riscv/cpymem-32-ooo.c
@@ -91,8 +91,8 @@ COPY_ALIGNED_N(11)
 **...
 **sw\t[at][0-9],0\([at][0-9]\)
 **...
-**lbu\t[at][0-9],14\([at][0-9]\)
-**sb\t[at][0-9],14\([at][0-9]\)
+**lw\t[at][0-9],11\([at][0-9]\)
+**sw\t[at][0-9],11\([at][0-9]\)
 **...
 */
 COPY_N(15)
@@ -104,8 +104,8 @@ COPY_N(15)
 **...
 **sw\t[at][0-9],0\([at][0-9]\)
 **...
-**lbu\t[at][0-9],14\([at][0-9]\)
-**sb\t[at][0-9],14\([at][0-9]\)
+**lw\t[at][0-9],11\([at][0-9]\)
+**sw\t[at][0-9],11\([at][0-9]\)
 **...
 */
 COPY_ALIGNED_N(15)
@@ -117,8 +117,8 @@ COPY_ALIGNED_N(15)
 **...
 **sw\t[at][0-9],20\([at][0-9]\)
 **...
-**lbu\t[at][0-9],26\([at][0-9]\)
-**sb\t[at][0-9],26\([at][0-9]\)
+**lw\t[at][0-9],23\([at][0-9]\)
+**sw\t[at][0-9],23\([at][0-9]\)
 **...
 */
 COPY_N(27)
@@ -130,8 +130,8 @@ COPY_N(27)
 **...
 **sw\t[at][0-9],20\([at][0-9]\)
 **...
-**lbu\t[at][0-9],26\([at][0-9]\)
-**sb\t[at][0-9],26\([at][0-9]\)
+**lw\t[at][0-9],23\([at][0-9]\)
+**sw\t[at][0-9],23\([at][0-9]\)
 **...
 */
 COPY_ALIGNED_N(27)
diff --git a/gcc/testsuite/gcc.target/riscv/cpymem-32.c 
b/gcc/testsuite/gcc.target/riscv/cpymem-32.c
index 44ba14a1d51..2030a39ca97 100644
--- a/gcc/testsuite/gcc.target/riscv/cpymem-32.c
+++ b/gcc/testsuite/gcc.target/riscv/cpymem-32.c
@@ -24,10 +24,10 @@ void copy_aligned_##N (void *to

[PATCH 3/4] RISC-V: tune: Add setting for overlapping mem ops to tuning struct

2024-05-07 Thread Christoph Müllner
This patch adds the field overlap_op_by_pieces to the struct
riscv_tune_param, which is used by the TARGET_OVERLAP_OP_BY_PIECES_P()
hook. This hook is used by the by-pieces infrastructure to decide
if overlapping memory accesses should be emitted.

The new property is set to false in all tune structs except for
generic-ooo.
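
How a per-CPU flag feeds such a hook can be sketched as follows. The
struct, the two records and the function are hypothetical stand-ins for
illustration; the values are placeholders and not the real tuning tables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative subset of a tuning record; not the GCC definition.  */
struct tune_sketch
{
  const char *name;
  bool slow_unaligned_access;
  bool overlap_op_by_pieces;
};

/* Placeholder values chosen for the example only.  */
const struct tune_sketch in_order_tune = { "in-order", true, false };
const struct tune_sketch ooo_tune = { "out-of-order", false, true };

/* Models the hook: report whether overlapping accesses are wanted
   for the given tuning record.  */
bool
overlap_op_by_pieces_p (const struct tune_sketch *tune)
{
  assert (tune != NULL);
  return tune->overlap_op_by_pieces;
}
```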

The changes in the expansion can be seen in the adjustments of the
cpymem test cases. These tests also reveal a limitation in the
RISC-V cpymem expansion that prevents this optimization as only
by-pieces cpymem expansions emit overlapping memory accesses.

gcc/ChangeLog:

* config/riscv/riscv.cc (struct riscv_tune_param): New field
overlap_op_by_pieces.
(riscv_overlap_op_by_pieces): New function.
(TARGET_OVERLAP_OP_BY_PIECES_P): Connect to
riscv_overlap_op_by_pieces.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/cpymem-32-ooo.c: Adjust for overlapping
access.
* gcc.target/riscv/cpymem-64-ooo.c: Likewise.

Signed-off-by: Christoph Müllner 
---
 gcc/config/riscv/riscv.cc | 20 +++
 .../gcc.target/riscv/cpymem-32-ooo.c  | 20 +--
 .../gcc.target/riscv/cpymem-64-ooo.c  | 33 +++
 3 files changed, 40 insertions(+), 33 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 44945d47fd6..793ec3155b9 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -286,6 +286,7 @@ struct riscv_tune_param
   unsigned short memory_cost;
   unsigned short fmv_cost;
   bool slow_unaligned_access;
+  bool overlap_op_by_pieces;
   bool use_divmod_expansion;
   unsigned int fusible_ops;
   const struct cpu_vector_cost *vec_costs;
@@ -425,6 +426,7 @@ static const struct riscv_tune_param rocket_tune_info = {
   5,   /* memory_cost */
   8,   /* fmv_cost */
   true,/* 
slow_unaligned_access */
+  false,   /* overlap_op_by_pieces */
   false,   /* use_divmod_expansion */
   RISCV_FUSE_NOTHING,   /* fusible_ops */
   NULL,/* vector cost */
@@ -442,6 +444,7 @@ static const struct riscv_tune_param sifive_7_tune_info = {
   3,   /* memory_cost */
   8,   /* fmv_cost */
   true,/* 
slow_unaligned_access */
+  false,   /* overlap_op_by_pieces */
   false,   /* use_divmod_expansion */
   RISCV_FUSE_NOTHING,   /* fusible_ops */
   NULL,/* vector cost */
@@ -459,6 +462,7 @@ static const struct riscv_tune_param sifive_p400_tune_info 
= {
   3,   /* memory_cost */
   4,   /* fmv_cost */
   true,/* 
slow_unaligned_access */
+  false,   /* overlap_op_by_pieces */
   false,   /* use_divmod_expansion */
   RISCV_FUSE_LUI_ADDI | RISCV_FUSE_AUIPC_ADDI,  /* fusible_ops */
   _vector_cost,/* vector cost */
@@ -476,6 +480,7 @@ static const struct riscv_tune_param sifive_p600_tune_info 
= {
   3,   /* memory_cost */
   4,   /* fmv_cost */
   true,/* 
slow_unaligned_access */
+  false,   /* overlap_op_by_pieces */
   false,   /* use_divmod_expansion */
   RISCV_FUSE_LUI_ADDI | RISCV_FUSE_AUIPC_ADDI,  /* fusible_ops */
   _vector_cost,/* vector cost */
@@ -493,6 +498,7 @@ static const struct riscv_tune_param thead_c906_tune_info = 
{
   5,/* memory_cost */
   8,   /* fmv_cost */
   false,/* slow_unaligned_access */
+  false,   /* overlap_op_by_pieces */
   false,   /* use_divmod_expansion */
   RISCV_FUSE_NOTHING,   /* fusible_ops */
   NULL,/* vector cost */
@@ -510,6 +516,7 @@ static const struct riscv_tune_param 
xiangshan_nanhu_tune_info = {
   3,   /* memory_cost */
   3,   /* fmv_cost */
   true,/* 
slow_unaligned_access */
+  false,   /* overlap_op_by_pieces */
   false,   /* use_divmod_expansion */
   RISCV_FUSE_ZEXTW

[PATCH 2/4] RISC-V: Allow unaligned accesses in cpymemsi expansion

2024-05-07 Thread Christoph Müllner
The RISC-V cpymemsi expansion is called whenever the by-pieces
infrastructure will not take care of the builtin expansion.
The code emitted by the by-pieces infrastructure may include
unaligned accesses if riscv_slow_unaligned_access_p is false.

The RISC-V cpymemsi expansion is handled via riscv_expand_block_move().
The current implementation of this function does not check
riscv_slow_unaligned_access_p and never emits unaligned accesses.

Since by-pieces emits unaligned accesses, it is reasonable to implement
the same behaviour in the cpymemsi expansion. And that's what this patch
is doing.

The patch checks riscv_slow_unaligned_access_p at the entry and sets
the allowed alignment accordingly. This alignment is then propagated
down to the routines that emit the actual instructions.

The changes introduced by this patch can be seen in the adjustments
of the cpymem tests.
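
The alignment selection described above can be sketched as follows. This
is an assumption about the overall logic, not the code from the patch: the
function name and parameters are hypothetical, and only the final clamping
mirrors the MAX/MIN expression visible in the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sketch: return the access width in bits that a block
   copy may use, given the known alignment of both operands.  */
unsigned
copy_access_bits (unsigned src_align_bits, unsigned dest_align_bits,
                  unsigned bits_per_word, bool slow_unaligned_access)
{
  const unsigned bits_per_unit = 8;
  assert (bits_per_word >= bits_per_unit);

  /* Fast unaligned access: treat both operands as word aligned.  */
  unsigned align = bits_per_word;

  /* Slow unaligned access: respect the weaker of the two alignments.  */
  if (slow_unaligned_access)
    align = src_align_bits < dest_align_bits ? src_align_bits
                                             : dest_align_bits;

  /* Clamp to the range [bits_per_unit, bits_per_word].  */
  if (align > bits_per_word)
    align = bits_per_word;
  if (align < bits_per_unit)
    align = bits_per_unit;
  return align;
}
```

Under this model, byte-aligned operands on RV32 are copied with 32-bit
accesses when unaligned access is fast and with 8-bit accesses otherwise.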

gcc/ChangeLog:

* config/riscv/riscv-string.cc (riscv_block_move_straight): Add
parameter align.
(riscv_adjust_block_mem): Replace parameter length by align.
(riscv_block_move_loop): Add parameter align.
(riscv_expand_block_move_scalar): Set alignment properly if the
target has fast unaligned access.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/cpymem-32-ooo.c: Adjust for unaligned access.
* gcc.target/riscv/cpymem-64-ooo.c: Likewise.

Signed-off-by: Christoph Müllner 
---
 gcc/config/riscv/riscv-string.cc  | 53 +++
 .../gcc.target/riscv/cpymem-32-ooo.c  | 20 +--
 .../gcc.target/riscv/cpymem-64-ooo.c  | 14 -
 3 files changed, 59 insertions(+), 28 deletions(-)

diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc
index b09b51d7526..8fc082f 100644
--- a/gcc/config/riscv/riscv-string.cc
+++ b/gcc/config/riscv/riscv-string.cc
@@ -610,11 +610,13 @@ riscv_expand_strlen (rtx result, rtx src, rtx 
search_char, rtx align)
   return false;
 }
 
-/* Emit straight-line code to move LENGTH bytes from SRC to DEST.
+/* Emit straight-line code to move LENGTH bytes from SRC to DEST
+   with accesses that are ALIGN bytes aligned.
Assume that the areas do not overlap.  */
 
 static void
-riscv_block_move_straight (rtx dest, rtx src, unsigned HOST_WIDE_INT length)
+riscv_block_move_straight (rtx dest, rtx src, unsigned HOST_WIDE_INT length,
+  unsigned HOST_WIDE_INT align)
 {
   unsigned HOST_WIDE_INT offset, delta;
   unsigned HOST_WIDE_INT bits;
@@ -622,8 +624,7 @@ riscv_block_move_straight (rtx dest, rtx src, unsigned 
HOST_WIDE_INT length)
   enum machine_mode mode;
   rtx *regs;
 
-  bits = MAX (BITS_PER_UNIT,
- MIN (BITS_PER_WORD, MIN (MEM_ALIGN (src), MEM_ALIGN (dest))));
+  bits = MAX (BITS_PER_UNIT, MIN (BITS_PER_WORD, align));
 
   mode = mode_for_size (bits, MODE_INT, 0).require ();
   delta = bits / BITS_PER_UNIT;
@@ -648,21 +649,20 @@ riscv_block_move_straight (rtx dest, rtx src, unsigned 
HOST_WIDE_INT length)
 {
   src = adjust_address (src, BLKmode, offset);
   dest = adjust_address (dest, BLKmode, offset);
-  move_by_pieces (dest, src, length - offset,
- MIN (MEM_ALIGN (src), MEM_ALIGN (dest)), RETURN_BEGIN);
+  move_by_pieces (dest, src, length - offset, align, RETURN_BEGIN);
 }
 }
 
 /* Helper function for doing a loop-based block operation on memory
-   reference MEM.  Each iteration of the loop will operate on LENGTH
-   bytes of MEM.
+   reference MEM.
 
Create a new base register for use within the loop and point it to
the start of MEM.  Create a new memory reference that uses this
-   register.  Store them in *LOOP_REG and *LOOP_MEM respectively.  */
+   register and has an alignment of ALIGN.  Store them in *LOOP_REG
+   and *LOOP_MEM respectively.  */
 
 static void
-riscv_adjust_block_mem (rtx mem, unsigned HOST_WIDE_INT length,
+riscv_adjust_block_mem (rtx mem, unsigned HOST_WIDE_INT align,
rtx *loop_reg, rtx *loop_mem)
 {
   *loop_reg = copy_addr_to_reg (XEXP (mem, 0));
@@ -670,15 +670,17 @@ riscv_adjust_block_mem (rtx mem, unsigned HOST_WIDE_INT 
length,
   /* Although the new mem does not refer to a known location,
  it does keep up to LENGTH bytes of alignment.  */
   *loop_mem = change_address (mem, BLKmode, *loop_reg);
-  set_mem_align (*loop_mem, MIN (MEM_ALIGN (mem), length * BITS_PER_UNIT));
+  set_mem_align (*loop_mem, align);
 }
 
 /* Move LENGTH bytes from SRC to DEST using a loop that moves BYTES_PER_ITER
-   bytes at a time.  LENGTH must be at least BYTES_PER_ITER.  Assume that
-   the memory regions do not overlap.  */
+   bytes at a time.  LENGTH must be at least BYTES_PER_ITER.  The alignment
+   of the access can be set by ALIGN.  Assume that the memory regions do not
+   overlap.  */
 
 static void
 riscv_block_move_loop (rtx dest, rtx src, unsigned HOST_WIDE_INT length,
+  unsigned HOST_WIDE_INT align

[PATCH 1/4] RISC-V: Add test cases for cpymem expansion

2024-05-07 Thread Christoph Müllner
We have two mechanisms in the RISC-V backend that expand the
cpymem pattern: a) by-pieces, b) riscv_expand_block_move()
in riscv-string.cc. The by-pieces framework has higher priority
and emits a sequence of up to 15 instructions
(see use_by_pieces_infrastructure_p() for more details).

As a rule-of-thumb, by-pieces emits alternating load/store sequences
and the cpymem expansion in the backend emits a sequence of loads
followed by a sequence of stores.
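
The two instruction orderings from the rule of thumb can be illustrated
with a small sketch. It only prints symbolic "L<n>"/"S<n>" markers for
loads and stores; the function names are hypothetical and nothing here is
taken from GCC.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Append "<op><index> " to BUF, which holds SIZE bytes in total.  */
static void
append_op (char *buf, size_t size, char op, unsigned index)
{
  size_t used = strlen (buf);
  assert (used < size);
  snprintf (buf + used, size - used, "%c%u ", op, index);
}

/* By-pieces style: each load is followed directly by its store.  */
void
order_alternating (unsigned chunks, char *buf, size_t size)
{
  buf[0] = '\0';
  for (unsigned i = 0; i < chunks; i++)
    {
      append_op (buf, size, 'L', i);
      append_op (buf, size, 'S', i);
    }
}

/* Backend style: all loads first, then all stores.  */
void
order_grouped (unsigned chunks, char *buf, size_t size)
{
  buf[0] = '\0';
  for (unsigned i = 0; i < chunks; i++)
    append_op (buf, size, 'L', i);
  for (unsigned i = 0; i < chunks; i++)
    append_op (buf, size, 'S', i);
}
```

For two chunks, the first function produces "L0 S0 L1 S1 " and the second
"L0 L1 S0 S1 ".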

Let's add some test cases to document the current behaviour
and to have tests to identify regressions.

Signed-off-by: Christoph Müllner 

gcc/testsuite/ChangeLog:

* gcc.target/riscv/cpymem-32-ooo.c: New test.
* gcc.target/riscv/cpymem-32.c: New test.
* gcc.target/riscv/cpymem-64-ooo.c: New test.
* gcc.target/riscv/cpymem-64.c: New test.
---
 .../gcc.target/riscv/cpymem-32-ooo.c  | 131 +
 gcc/testsuite/gcc.target/riscv/cpymem-32.c| 138 ++
 .../gcc.target/riscv/cpymem-64-ooo.c  | 129 
 gcc/testsuite/gcc.target/riscv/cpymem-64.c| 138 ++
 4 files changed, 536 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/cpymem-32-ooo.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cpymem-32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cpymem-64-ooo.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cpymem-64.c

diff --git a/gcc/testsuite/gcc.target/riscv/cpymem-32-ooo.c 
b/gcc/testsuite/gcc.target/riscv/cpymem-32-ooo.c
new file mode 100644
index 000..33fb9891d82
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/cpymem-32-ooo.c
@@ -0,0 +1,131 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target rv32 } */
+/* { dg-options "-march=rv32gc -mabi=ilp32d -mtune=generic-ooo" } */
+/* { dg-skip-if "" { *-*-* } {"-O0" "-Os" "-Og" "-Oz" "-flto" } } */
+/* { dg-final { check-function-bodies "**" "" } } */
+/* { dg-allow-blank-lines-in-output 1 } */
+
+#define COPY_N(N)  \
+void copy_##N (void *to, void *from)   \
+{  \
+  __builtin_memcpy (to, from, N);  \
+}
+
+#define COPY_ALIGNED_N(N)  \
+void copy_aligned_##N (void *to, void *from)   \
+{  \
+  to = __builtin_assume_aligned(to, sizeof(long)); \
+  from = __builtin_assume_aligned(from, sizeof(long)); \
+  __builtin_memcpy (to, from, N);  \
+}
+
+/*
+**copy_7:
+**...
+**lw\t[at][0-9],0\([at][0-9]\)
+**sw\t[at][0-9],0\([at][0-9]\)
+**...
+**lbu\t[at][0-9],6\([at][0-9]\)
+**sb\t[at][0-9],6\([at][0-9]\)
+**...
+*/
+COPY_N(7)
+
+/*
+**copy_aligned_7:
+**...
+**lw\t[at][0-9],0\([at][0-9]\)
+**sw\t[at][0-9],0\([at][0-9]\)
+**...
+**lbu\t[at][0-9],6\([at][0-9]\)
+**sb\t[at][0-9],6\([at][0-9]\)
+**...
+*/
+COPY_ALIGNED_N(7)
+
+/*
+**copy_8:
+**...
+**lw\ta[0-9],0\(a[0-9]\)
+**sw\ta[0-9],0\(a[0-9]\)
+**...
+*/
+COPY_N(8)
+
+/*
+**copy_aligned_8:
+**...
+**lw\ta[0-9],0\(a[0-9]\)
+**sw\ta[0-9],0\(a[0-9]\)
+**...
+*/
+COPY_ALIGNED_N(8)
+
+/*
+**copy_11:
+**...
+**lbu\t[at][0-9],0\([at][0-9]\)
+**...
+**lbu\t[at][0-9],10\([at][0-9]\)
+**...
+**sb\t[at][0-9],0\([at][0-9]\)
+**...
+**sb\t[at][0-9],10\([at][0-9]\)
+**...
+*/
+COPY_N(11)
+
+/*
+**copy_aligned_11:
+**...
+**lw\t[at][0-9],0\([at][0-9]\)
+**...
+**sw\t[at][0-9],0\([at][0-9]\)
+**...
+**lbu\t[at][0-9],10\([at][0-9]\)
+**sb\t[at][0-9],10\([at][0-9]\)
+**...
+*/
+COPY_ALIGNED_N(11)
+
+/*
+**copy_15:
+**...
+**(call|tail)\tmemcpy
+**...
+*/
+COPY_N(15)
+
+/*
+**copy_aligned_15:
+**...
+**lw\t[at][0-9],0\([at][0-9]\)
+**...
+**sw\t[at][0-9],0\([at][0-9]\)
+**...
+**lbu\t[at][0-9],14\([at][0-9]\)
+**sb\t[at][0-9],14\([at][0-9]\)
+**...
+*/
+COPY_ALIGNED_N(15)
+
+/*
+**copy_27:
+**...
+**(call|tail)\tmemcpy
+**...
+*/
+COPY_N(27)
+
+/*
+**copy_aligned_27:
+**...
+**lw\t[at][0-9],20\([at][0-9]\)
+**...
+**sw\t[at][0-9],20\([at][0-9]\)
+**...
+**lbu\t[at][0-9],26\([at][0-9]\)
+**sb\t[at][0-9],26\([at][0-9]\)
+**...
+*/
+COPY_ALIGNED_N(27)
diff --git a/gcc/testsuite/gcc.target/riscv/cpymem-32.c 
b/gcc/testsuite/gcc.target/riscv/cpymem-32.c
new file mode 100644
index 000..44ba14a1d51
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/cpymem-32.c
@@ -0,0 +1,138 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target rv32 } */
+/* { dg-options "-march=rv32gc -mabi=ilp32d -mtune=rocket" } */
+/* { dg-skip-if "" { *-*-* } {"-O0" "-Os" "-Og" "-Oz" "-flto" } } */
+/* { dg-final { check-function-bo

[PATCH 0/4] RISC-V: Enhance unaligned/overlapping codegen

2024-05-07 Thread Christoph Müllner
I've mentioned some improvements for unaligned and overlapping code
generation in the RISC-V call a few weeks ago.  Sending these patches
now, as the release is out.

Christoph Müllner (4):
  RISC-V: Add test cases for cpymem expansion
  RISC-V: Allow unaligned accesses in cpymemsi expansion
  RISC-V: tune: Add setting for overlapping mem ops to tuning struct
  RISC-V: Allow by-pieces to do overlapping accesses in
block_move_straight

 gcc/config/riscv/riscv-string.cc  |  59 +---
 gcc/config/riscv/riscv.cc |  20 +++
 .../gcc.target/riscv/cpymem-32-ooo.c  | 137 ++
 gcc/testsuite/gcc.target/riscv/cpymem-32.c| 136 +
 .../gcc.target/riscv/cpymem-64-ooo.c  | 130 +
 gcc/testsuite/gcc.target/riscv/cpymem-64.c| 135 +
 6 files changed, 593 insertions(+), 24 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/cpymem-32-ooo.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cpymem-32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cpymem-64-ooo.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cpymem-64.c

-- 
2.44.0



[PATCH] RISC-V: Add zero_extract support for rv64gc

2024-05-06 Thread Christoph Müllner
The combiner attempts to optimize a zero-extension of a logical right shift
using zero_extract. We already utilize this optimization for those cases
that result in a single instruction.  Let's add an insn_and_split
pattern that also matches the generic case, where we can emit an
optimized sequence of a slli/srli.
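
The slli/srli idea can be sketched in plain C for a 64-bit register. This
is an illustration under stated assumptions, not the pattern itself: the
function names are hypothetical, the field is given as a bit count and a
start position counted from the least significant bit, and the second
function is the usual shift-and-mask form used only for comparison.

```c
#include <assert.h>
#include <stdint.h>

/* Extract SIZE bits starting at bit POS using a left shift followed
   by a logical right shift (the slli/srli sequence).  */
uint64_t
zero_extract_shifts (uint64_t x, unsigned size, unsigned pos)
{
  const unsigned regbits = 64;
  assert (size > 0 && size <= regbits && pos <= regbits - size);

  unsigned lshamt = regbits - size - pos;  /* push the field to the top */
  unsigned rshamt = regbits - size;        /* bring it down, zero-filled */
  return (x << lshamt) >> rshamt;
}

/* Reference form: shift right, then mask.  */
uint64_t
zero_extract_mask (uint64_t x, unsigned size, unsigned pos)
{
  assert (size > 0 && size <= 64 && pos <= 64 - size);

  uint64_t mask = size == 64 ? UINT64_MAX : (UINT64_C (1) << size) - 1;
  return (x >> pos) & mask;
}
```

Extracting 8 bits at position 8 from 0xABCD1234 gives 0x12 with either
function; the shift form needs no mask constant, which is the point of
the two-instruction sequence.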

Tested with SPEC CPU 2017 (rv64gc).

PR 111501

gcc/ChangeLog:

* config/riscv/riscv.md (*lshr<GPR:mode>3_zero_extend_4): New
pattern for zero-extraction.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/pr111501.c: New test.
* gcc.target/riscv/zero-extend-rshift-32.c: New test.
* gcc.target/riscv/zero-extend-rshift-64.c: New test.
* gcc.target/riscv/zero-extend-rshift.c: New test.

Signed-off-by: Christoph Müllner 
---
 gcc/config/riscv/riscv.md |  30 +
 gcc/testsuite/gcc.target/riscv/pr111501.c |  32 +
 .../gcc.target/riscv/zero-extend-rshift-32.c  |  37 ++
 .../gcc.target/riscv/zero-extend-rshift-64.c  |  63 ++
 .../gcc.target/riscv/zero-extend-rshift.c | 119 ++
 5 files changed, 281 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr111501.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zero-extend-rshift-32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zero-extend-rshift-64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zero-extend-rshift.c

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index d4676507b45..80cbecb78e8 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -2792,6 +2792,36 @@ (define_insn "*lshrsi3_zero_extend_3"
   [(set_attr "type" "shift")
(set_attr "mode" "SI")])
 
+;; Canonical form for a zero-extend of a logical right shift.
+;; Special cases are handled above.
+;; Skip for single-bit extraction (Zbs/XTheadBs) and th.extu (XTheadBb)
+(define_insn_and_split "*lshr<GPR:mode>3_zero_extend_4"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+(zero_extract:GPR
+   (match_operand:GPR 1 "register_operand" " r")
+   (match_operand 2 "const_int_operand")
+   (match_operand 3 "const_int_operand")))
+   (clobber (match_scratch:GPR  4 "=&r"))]
+  "!((TARGET_ZBS || TARGET_XTHEADBS) && (INTVAL (operands[2]) == 1))
+   && !TARGET_XTHEADBB"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 4)
+ (ashift:GPR (match_dup 1) (match_dup 2)))
+   (set (match_dup 0)
+ (lshiftrt:GPR (match_dup 4) (match_dup 3)))]
+{
+  int regbits = GET_MODE_BITSIZE (GET_MODE (operands[0])).to_constant ();
+  int sizebits = INTVAL (operands[2]);
+  int startbits = INTVAL (operands[3]);
+  int lshamt = regbits - sizebits - startbits;
+  int rshamt = lshamt + startbits;
+  operands[2] = GEN_INT (lshamt);
+  operands[3] = GEN_INT (rshamt);
+}
+  [(set_attr "type" "shift")
+   (set_attr "mode" "<GPR:MODE>")])
+
 ;; Handle AND with 2^N-1 for N from 12 to XLEN.  This can be split into
 ;; two logical shifts.  Otherwise it requires 3 instructions: lui,
 ;; xor/addi/srli, and.
diff --git a/gcc/testsuite/gcc.target/riscv/pr111501.c 
b/gcc/testsuite/gcc.target/riscv/pr111501.c
new file mode 100644
index 000..9355be242e7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr111501.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target rv64 } */
+/* { dg-options "-march=rv64gc" { target { rv64 } } } */
+/* { dg-skip-if "" { *-*-* } {"-O0" "-Os" "-Og" "-Oz" "-flto" } } */
+/* { dg-final { check-function-bodies "**" "" } } */
+/* { dg-allow-blank-lines-in-output 1 } */
+
+/*
+**do_shift:
+**...
+**slli\ta[0-9],a[0-9],16
+**srli\ta[0-9],a[0-9],48
+**...
+*/
+unsigned int
+do_shift(unsigned long csum)
+{
+  return (unsigned short)(csum >> 32);
+}
+
+/*
+**do_shift2:
+**...
+**slli\ta[0-9],a[0-9],16
+**srli\ta[0-9],a[0-9],48
+**...
+*/
+unsigned int
+do_shift2(unsigned long csum)
+{
+  return (csum << 16) >> 48;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/zero-extend-rshift-32.c 
b/gcc/testsuite/gcc.target/riscv/zero-extend-rshift-32.c
new file mode 100644
index 000..2824d6fe074
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zero-extend-rshift-32.c
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target rv32 } */
+/* { dg-options "-march=rv32gc" } */
+/* { dg-skip-if "" { *-*-* } {"-O0" "-Os" "-Og" "-Oz" "-flto" } } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#define URT_ZE_UCT_RSHIFT_N_UAT(RT,CT,N,AT)\
+unsigned RT u##RT##_ze_u##CT##_rshift_##N##_u##AT(unsigned AT v) 

Re: [PATCH] RISC-V: Fix parsing of Zic* extensions

2024-04-29 Thread Christoph Müllner
On Mon, Apr 29, 2024 at 5:58 AM Kito Cheng  wrote:
>
> OK for trunk, and my understanding is that flag isn't really used in
> code gen yet, so it's not necessary to port to GCC 14 branch?

Pushed to master.

Since the riscv_zi_subext masks of the affected extensions are applied
to higher bits of riscv_zicmo_subext (beyond Zicbom, Zicbop, Zicboz),
this triggers indeed no issue in GCC 14 (because the TARGET_ZI* macros
are not used). Therefore, no backport to GCC 14.

Thanks!

>
> On Mon, Apr 29, 2024 at 7:05 AM Christoph Müllner
>  wrote:
> >
> > The extension parsing table entries for a range of Zic* extensions
> > does not match the mask definition in riscv.opt.
> > This results in broken TARGET_ZIC* macros, because the values of
> > riscv_zi_subext and riscv_zicmo_subext are set wrong.
> >
> > This patch fixes this by moving Zic64b into riscv_zicmo_subext
> > and all other affected Zic* extensions to riscv_zi_subext.
> >
> > gcc/ChangeLog:
> >
> > * common/config/riscv/riscv-common.cc: Move ziccamoa, ziccif,
> > zicclsm, and ziccrse into riscv_zi_subext.
> > * config/riscv/riscv.opt: Define MASK_ZIC64B for
> > > riscv_zicmo_subext.
> >
> > Signed-off-by: Christoph Müllner 
> > ---
> >  gcc/common/config/riscv/riscv-common.cc | 8 
> >  gcc/config/riscv/riscv.opt  | 4 ++--
> >  2 files changed, 6 insertions(+), 6 deletions(-)
> >
> > diff --git a/gcc/common/config/riscv/riscv-common.cc 
> > b/gcc/common/config/riscv/riscv-common.cc
> > index 43b7549e3ec..8cc0e727737 100644
> > --- a/gcc/common/config/riscv/riscv-common.cc
> > +++ b/gcc/common/config/riscv/riscv-common.cc
> > @@ -1638,15 +1638,15 @@ static const riscv_ext_flag_table_t 
> > riscv_ext_flag_table[] =
> >
> >{"zihintntl", _options::x_riscv_zi_subext, MASK_ZIHINTNTL},
> >{"zihintpause", _options::x_riscv_zi_subext, MASK_ZIHINTPAUSE},
> > +  {"ziccamoa", _options::x_riscv_zi_subext, MASK_ZICCAMOA},
> > +  {"ziccif", _options::x_riscv_zi_subext, MASK_ZICCIF},
> > +  {"zicclsm", _options::x_riscv_zi_subext, MASK_ZICCLSM},
> > +  {"ziccrse", _options::x_riscv_zi_subext, MASK_ZICCRSE},
> >
> >{"zicboz", _options::x_riscv_zicmo_subext, MASK_ZICBOZ},
> >{"zicbom", _options::x_riscv_zicmo_subext, MASK_ZICBOM},
> >{"zicbop", _options::x_riscv_zicmo_subext, MASK_ZICBOP},
> >{"zic64b", _options::x_riscv_zicmo_subext, MASK_ZIC64B},
> > -  {"ziccamoa", _options::x_riscv_zicmo_subext, MASK_ZICCAMOA},
> > -  {"ziccif", _options::x_riscv_zicmo_subext, MASK_ZICCIF},
> > -  {"zicclsm", _options::x_riscv_zicmo_subext, MASK_ZICCLSM},
> > -  {"ziccrse", _options::x_riscv_zicmo_subext, MASK_ZICCRSE},
> >
> >{"zve32x",   _options::x_target_flags, MASK_VECTOR},
> >{"zve32f",   _options::x_target_flags, MASK_VECTOR},
> > diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
> > index b14888e9816..ee824756381 100644
> > --- a/gcc/config/riscv/riscv.opt
> > +++ b/gcc/config/riscv/riscv.opt
> > @@ -237,8 +237,6 @@ Mask(ZIHINTPAUSE) Var(riscv_zi_subext)
> >
> >  Mask(ZICOND)  Var(riscv_zi_subext)
> >
> > -Mask(ZIC64B)  Var(riscv_zi_subext)
> > -
> >  Mask(ZICCAMOA)Var(riscv_zi_subext)
> >
> >  Mask(ZICCIF)  Var(riscv_zi_subext)
> > @@ -390,6 +388,8 @@ Mask(ZICBOM) Var(riscv_zicmo_subext)
> >
> >  Mask(ZICBOP) Var(riscv_zicmo_subext)
> >
> > +Mask(ZIC64B) Var(riscv_zicmo_subext)
> > +
> >  TargetVariable
> >  int riscv_zf_subext
> >
> > --
> > 2.44.0
> >


[PATCH] RISC-V: Fix parsing of Zic* extensions

2024-04-28 Thread Christoph Müllner
The extension parsing table entries for a range of Zic* extensions
does not match the mask definition in riscv.opt.
This results in broken TARGET_ZIC* macros, because the values of
riscv_zi_subext and riscv_zicmo_subext are set wrong.

This patch fixes this by moving Zic64b into riscv_zicmo_subext
and all other affected Zic* extensions to riscv_zi_subext.
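
The failure mode can be sketched in plain C.  Everything below is a
simplified stand-in with made-up names (`struct options`, `enable_ext`,
the bit position of the mask); it is not the actual GCC option
machinery, only a model of a table row that sets a mask in a different
variable than the one the test macro reads.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the two option variables.  */
struct options
{
  int zi_subext;
  int zicmo_subext;
};

/* Illustrative bit position; the mask definition ties it to zi_subext.  */
#define MASK_ZICCIF (1 << 5)
#define HAS_ZICCIF(o) (((o)->zi_subext & MASK_ZICCIF) != 0)

/* One row of a simplified extension table.  */
struct ext_flag
{
  const char *name;
  size_t var_offset;  /* which variable receives the mask */
  int mask;
};

/* Row that disagrees with the mask definition (the bug).  */
static const struct ext_flag broken_table[] = {
  { "ziccif", offsetof (struct options, zicmo_subext), MASK_ZICCIF },
};

/* Row that agrees with the mask definition (the fix).  */
static const struct ext_flag fixed_table[] = {
  { "ziccif", offsetof (struct options, zi_subext), MASK_ZICCIF },
};

/* Set the mask of every row named NAME in the variable the row names.  */
static void
enable_ext (struct options *o, const struct ext_flag *table, size_t n,
            const char *name)
{
  assert (o != NULL && table != NULL && name != NULL);

  for (size_t i = 0; i < n; i++)
    if (strcmp (table[i].name, name) == 0)
      {
        int *var = (int *) ((char *) o + table[i].var_offset);
        *var |= table[i].mask;
      }
}
```

With the broken row the bit lands in `zicmo_subext`, so `HAS_ZICCIF`
stays false even though the extension was requested; with the fixed row
it becomes true.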

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: Move ziccamoa, ziccif,
zicclsm, and ziccrse into riscv_zi_subext.
* config/riscv/riscv.opt: Define MASK_ZIC64B for
riscv_zicmo_subext.

Signed-off-by: Christoph Müllner 
---
 gcc/common/config/riscv/riscv-common.cc | 8 
 gcc/config/riscv/riscv.opt  | 4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/gcc/common/config/riscv/riscv-common.cc 
b/gcc/common/config/riscv/riscv-common.cc
index 43b7549e3ec..8cc0e727737 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -1638,15 +1638,15 @@ static const riscv_ext_flag_table_t 
riscv_ext_flag_table[] =
 
   {"zihintntl", _options::x_riscv_zi_subext, MASK_ZIHINTNTL},
   {"zihintpause", _options::x_riscv_zi_subext, MASK_ZIHINTPAUSE},
+  {"ziccamoa", _options::x_riscv_zi_subext, MASK_ZICCAMOA},
+  {"ziccif", _options::x_riscv_zi_subext, MASK_ZICCIF},
+  {"zicclsm", _options::x_riscv_zi_subext, MASK_ZICCLSM},
+  {"ziccrse", _options::x_riscv_zi_subext, MASK_ZICCRSE},
 
   {"zicboz", _options::x_riscv_zicmo_subext, MASK_ZICBOZ},
   {"zicbom", _options::x_riscv_zicmo_subext, MASK_ZICBOM},
   {"zicbop", _options::x_riscv_zicmo_subext, MASK_ZICBOP},
   {"zic64b", _options::x_riscv_zicmo_subext, MASK_ZIC64B},
-  {"ziccamoa", _options::x_riscv_zicmo_subext, MASK_ZICCAMOA},
-  {"ziccif", _options::x_riscv_zicmo_subext, MASK_ZICCIF},
-  {"zicclsm", _options::x_riscv_zicmo_subext, MASK_ZICCLSM},
-  {"ziccrse", _options::x_riscv_zicmo_subext, MASK_ZICCRSE},
 
   {"zve32x",   _options::x_target_flags, MASK_VECTOR},
   {"zve32f",   _options::x_target_flags, MASK_VECTOR},
diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
index b14888e9816..ee824756381 100644
--- a/gcc/config/riscv/riscv.opt
+++ b/gcc/config/riscv/riscv.opt
@@ -237,8 +237,6 @@ Mask(ZIHINTPAUSE) Var(riscv_zi_subext)
 
 Mask(ZICOND)  Var(riscv_zi_subext)
 
-Mask(ZIC64B)  Var(riscv_zi_subext)
-
 Mask(ZICCAMOA)Var(riscv_zi_subext)
 
 Mask(ZICCIF)  Var(riscv_zi_subext)
@@ -390,6 +388,8 @@ Mask(ZICBOM) Var(riscv_zicmo_subext)
 
 Mask(ZICBOP) Var(riscv_zicmo_subext)
 
+Mask(ZIC64B) Var(riscv_zicmo_subext)
+
 TargetVariable
 int riscv_zf_subext
 
-- 
2.44.0



Re: [PATCH] RISC-V: Don't add fractional LMUL types to V_VLS for XTheadVector

2024-03-22 Thread Christoph Müllner
On Fri, Mar 22, 2024 at 2:18 AM juzhe.zh...@rivai.ai
 wrote:
>
> LGTM.

Pushed.
Thanks!

>
> 
> juzhe.zh...@rivai.ai
>
>
> From: Christoph Müllner
> Date: 2024-03-22 07:45
> To: gcc-patches; Kito Cheng; Palmer Dabbelt; Andrew Waterman; Philipp 
> Tomsich; Camel Coder; Bruce Hoult; Juzhe-Zhong; Jun Sha; Xianmiao Qu; Jin Ma
> CC: Christoph Müllner
> Subject: [PATCH] RISC-V: Don't add fractional LMUL types to V_VLS for 
> XTheadVector
> The expansion of `memset` (via expand_builtin_memset_args())
> uses clear_by_pieces() and store_by_pieces() to avoid calls
> to the C runtime. To check if a type can be used for that purpose
> the function by_pieces_mode_supported_p() tests if a `mov` and
> a `vec_duplicate` INSN can be expanded by the backend.
>
> The `vec_duplicate` expansion takes arguments of type `V_VLS`.
> The `mov` expansions take arguments of type `V`, `VB`, `VT`,
> `VLS_AVL_IMM`, and `VLS_AVL_REG`. Some of these types (in fact
> not types but type iterators) include fractional LMUL types.
> E.g. `V_VLS` includes `V`, which includes `VI`, which includes
> `RVVMF2QI`.
>
> This results in an attempt to use fractional LMUL-types for
> the `memset` expansion resulting in an ICE for XTheadVector,
> because that extension cannot handle fractional LMULs.
>
> This patch addresses this issue by splitting the definition
> of the `VI` mode iterator into `VI_NOFRAC` (without fractional
> LMUL types) and `VI_FRAC` (only fractional LMUL types).
> Further, it defines `V_VLS` such that `VI_FRAC` types are only
> included if XTheadVector is not enabled.
>
> The effect is demonstrated by a new test case that shows
> that the by-pieces framework now emits `sb` instructions
> instead of triggering an ICE.
>
> Signed-off-by: Christoph Müllner 
>
> PR 114194
>
> gcc/ChangeLog:
>
> * config/riscv/vector-iterators.md: Split VI into VI_FRAC and VI_NOFRAC.
> Only include VI_NOFRAC in V_VLS without TARGET_XTHEADVECTOR.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/xtheadvector/pr114194.c: New test.
>
> Signed-off-by: Christoph Müllner 
> ---
> gcc/config/riscv/vector-iterators.md  | 19 +--
> .../riscv/rvv/xtheadvector/pr114194.c | 56 +++
> 2 files changed, 69 insertions(+), 6 deletions(-)
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
>
> diff --git a/gcc/config/riscv/vector-iterators.md 
> b/gcc/config/riscv/vector-iterators.md
> index c2ea7e8b10a..a24e1bf078f 100644
> --- a/gcc/config/riscv/vector-iterators.md
> +++ b/gcc/config/riscv/vector-iterators.md
> @@ -108,17 +108,24 @@ (define_c_enum "unspecv" [
>UNSPECV_FRM_RESTORE_EXIT
> ])
> -(define_mode_iterator VI [
> -  RVVM8QI RVVM4QI RVVM2QI RVVM1QI RVVMF2QI RVVMF4QI (RVVMF8QI 
> "TARGET_MIN_VLEN > 32")
> -
> -  RVVM8HI RVVM4HI RVVM2HI RVVM1HI RVVMF2HI (RVVMF4HI "TARGET_MIN_VLEN > 32")
> -
> -  RVVM8SI RVVM4SI RVVM2SI RVVM1SI (RVVMF2SI "TARGET_MIN_VLEN > 32")
> +;; Subset of VI with fractional LMUL types
> +(define_mode_iterator VI_FRAC [
> +  RVVMF2QI RVVMF4QI (RVVMF8QI "TARGET_MIN_VLEN > 32")
> +  RVVMF2HI (RVVMF4HI "TARGET_MIN_VLEN > 32")
> +  (RVVMF2SI "TARGET_MIN_VLEN > 32")
> +])
> +;; Subset of VI with non-fractional LMUL types
> +(define_mode_iterator VI_NOFRAC [
> +  RVVM8QI RVVM4QI RVVM2QI RVVM1QI
> +  RVVM8HI RVVM4HI RVVM2HI RVVM1HI
> +  RVVM8SI RVVM4SI RVVM2SI RVVM1SI
>(RVVM8DI "TARGET_VECTOR_ELEN_64") (RVVM4DI "TARGET_VECTOR_ELEN_64")
>(RVVM2DI "TARGET_VECTOR_ELEN_64") (RVVM1DI "TARGET_VECTOR_ELEN_64")
> ])
> +(define_mode_iterator VI [ VI_NOFRAC (VI_FRAC "!TARGET_XTHEADVECTOR") ])
> +
> ;; This iterator is the same as above but with TARGET_VECTOR_ELEN_FP_16
> ;; changed to TARGET_ZVFH.  TARGET_VECTOR_ELEN_FP_16 is also true for
> ;; TARGET_ZVFHMIN while we actually want to disable all instructions apart
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
> new file mode 100644
> index 000..fc2d1349425
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
> @@ -0,0 +1,56 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gc_xtheadvector" { target { rv32 } } } */
> +/* { dg-options "-march=rv64gc_xtheadvector" { target { rv64 } } } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> +
> +/*
> +** foo0_1:
> +** sb\tzero,0([a-x0-9]+)
> +** ret
> +*/
> +void foo0_1 (void *p)
> +{
> + 

Re: [PATCH] RISC-V: Don't add fractional LMUL types to V_VLS for XTheadVector

2024-03-22 Thread Christoph Müllner
On Fri, Mar 22, 2024 at 4:43 AM Bruce Hoult  wrote:
>
> > The effect is demonstrated by a new test case that shows
> that the by-pieces framework now emits `sb` instructions
> instead of triggering an ICE
>
> So these small memset() now don't use RVV at all if xtheadvector is enabled?

Yes, but not directly.
The patch just prevents fractional LMUL modes from being considered
for XTheadVector.
That's necessary because further lowering memory moves with a
fractional LMUL mode
cannot be done for XTheadVector (that's the reason for the ICE).

> I don't have evidence whether the use of RVV (whether V or
> xtheadvector) for these memsets is a win or not, but the treatment
> should probably be consistent.
>
> I don't know why RVV 1.0 uses a fractional LMUL at all here. It would
> work perfectly well with LMUL=1 and just setting vl to the appropriate
> length (which is always less than 16 bytes). Use of fractional LMUL
> doesn't save any resources.

The compiler can consider fractional LMUL values for expansion for RVV,
but that does not mean it will be used in the emitted instruction sequence.
Details like cost model and data alignment also matter.

During testing, I observed that RVV and XTheadVector will both emit sequences
of 'sd' for short memsets with known length, known data to set,
and unknown alignment of the data to be written.
However, I have not exhaustively tested all possible tuning parameters,
as my primary goal was to eliminate the reason for the ICE with XTheadVector.

>
> On Fri, Mar 22, 2024 at 12:46 PM Christoph Müllner
>  wrote:
> >
> > The expansion of `memset` (via expand_builtin_memset_args())
> > uses clear_by_pieces() and store_by_pieces() to avoid calls
> > to the C runtime. To check if a type can be used for that purpose
> > the function by_pieces_mode_supported_p() tests if a `mov` and
> > a `vec_duplicate` INSN can be expanded by the backend.
> >
> > The `vec_duplicate` expansion takes arguments of type `V_VLS`.
> > The `mov` expansions take arguments of type `V`, `VB`, `VT`,
> > `VLS_AVL_IMM`, and `VLS_AVL_REG`. Some of these types (in fact
> > not types but type iterators) include fractional LMUL types.
> > E.g. `V_VLS` includes `V`, which includes `VI`, which includes
> > `RVVMF2QI`.
> >
> > This results in an attempt to use fractional LMUL-types for
> > the `memset` expansion resulting in an ICE for XTheadVector,
> > because that extension cannot handle fractional LMULs.
> >
> > This patch addresses this issue by splitting the definition
> > of the `VI` mode iterator into `VI_NOFRAC` (without fractional
> > LMUL types) and `VI_FRAC` (only fractional LMUL types).
> > Further, it defines `V_VLS` such that `VI_FRAC` types are only
> > included if XTheadVector is not enabled.
> >
> > The effect is demonstrated by a new test case that shows
> > that the by-pieces framework now emits `sb` instructions
> > instead of triggering an ICE.
> >
> > Signed-off-by: Christoph Müllner 
> >
> > PR 114194
> >
> > gcc/ChangeLog:
> >
> > * config/riscv/vector-iterators.md: Split VI into VI_FRAC and 
> > VI_NOFRAC.
> > Only include VI_NOFRAC in V_VLS without TARGET_XTHEADVECTOR.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/riscv/rvv/xtheadvector/pr114194.c: New test.
> >
> > Signed-off-by: Christoph Müllner 
> > ---
> >  gcc/config/riscv/vector-iterators.md  | 19 +--
> >  .../riscv/rvv/xtheadvector/pr114194.c | 56 +++
> >  2 files changed, 69 insertions(+), 6 deletions(-)
> >  create mode 100644 
> > gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
> >
> > diff --git a/gcc/config/riscv/vector-iterators.md 
> > b/gcc/config/riscv/vector-iterators.md
> > index c2ea7e8b10a..a24e1bf078f 100644
> > --- a/gcc/config/riscv/vector-iterators.md
> > +++ b/gcc/config/riscv/vector-iterators.md
> > @@ -108,17 +108,24 @@ (define_c_enum "unspecv" [
> >UNSPECV_FRM_RESTORE_EXIT
> >  ])
> >
> > -(define_mode_iterator VI [
> > -  RVVM8QI RVVM4QI RVVM2QI RVVM1QI RVVMF2QI RVVMF4QI (RVVMF8QI 
> > "TARGET_MIN_VLEN > 32")
> > -
> > -  RVVM8HI RVVM4HI RVVM2HI RVVM1HI RVVMF2HI (RVVMF4HI "TARGET_MIN_VLEN > 
> > 32")
> > -
> > -  RVVM8SI RVVM4SI RVVM2SI RVVM1SI (RVVMF2SI "TARGET_MIN_VLEN > 32")
> > +;; Subset of VI with fractional LMUL types
> > +(define_mode_iterator VI_FRAC [
> > +  RVVMF2QI RVVMF4QI (RVVMF8QI "TARGET_MIN_VLEN > 32")
> > > +  RVVMF2HI (RVVMF4HI "TARGET_MIN_VLEN > 32")

[PATCH] RISC-V: Don't add fractional LMUL types to V_VLS for XTheadVector

2024-03-21 Thread Christoph Müllner
The expansion of `memset` (via expand_builtin_memset_args())
uses clear_by_pieces() and store_by_pieces() to avoid calls
to the C runtime. To check if a type can be used for that purpose
the function by_pieces_mode_supported_p() tests if a `mov` and
a `vec_duplicate` INSN can be expanded by the backend.

The `vec_duplicate` expansion takes arguments of type `V_VLS`.
The `mov` expansions take arguments of type `V`, `VB`, `VT`,
`VLS_AVL_IMM`, and `VLS_AVL_REG`. Some of these types (in fact
not types but type iterators) include fractional LMUL types.
E.g. `V_VLS` includes `V`, which includes `VI`, which includes
`RVVMF2QI`.

This results in an attempt to use fractional LMUL-types for
the `memset` expansion resulting in an ICE for XTheadVector,
because that extension cannot handle fractional LMULs.

This patch addresses this issue by splitting the definition
of the `VI` mode iterator into `VI_NOFRAC` (without fractional
LMUL types) and `VI_FRAC` (only fractional LMUL types).
Further, it defines `V_VLS` such that `VI_FRAC` types are only
included if XTheadVector is not enabled.
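
The effect of the split can be modelled as a membership test.  The
sketch below is plain C with made-up names (`enum vmode`, `in_vi`) and
only four of the modes; it is not how machine-description iterators are
implemented, just a way to read the new `VI` definition.

```c
#include <assert.h>
#include <stdbool.h>

/* A small subset of the modes, for illustration only.  */
enum vmode { RVVM1QI, RVVM2QI, RVVMF2QI, RVVMF4QI };

/* Modes with LMUL >= 1.  */
static bool
in_vi_nofrac (enum vmode m)
{
  return m == RVVM1QI || m == RVVM2QI;
}

/* Modes with fractional LMUL.  */
static bool
in_vi_frac (enum vmode m)
{
  return m == RVVMF2QI || m == RVVMF4QI;
}

/* VI = VI_NOFRAC plus VI_FRAC, the latter only without XTheadVector.  */
static bool
in_vi (enum vmode m, bool xtheadvector)
{
  assert ((int) m >= 0 && (int) m <= (int) RVVMF4QI);
  return in_vi_nofrac (m) || (in_vi_frac (m) && !xtheadvector);
}
```

Every iterator that includes `VI` inherits this condition, so the
fractional LMUL modes are no longer offered to the by-pieces code when
XTheadVector is enabled.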

The effect is demonstrated by a new test case that shows
that the by-pieces framework now emits `sb` instructions
instead of triggering an ICE.

Signed-off-by: Christoph Müllner 

PR 114194

gcc/ChangeLog:

* config/riscv/vector-iterators.md: Split VI into VI_FRAC and VI_NOFRAC.
Only include VI_NOFRAC in V_VLS without TARGET_XTHEADVECTOR.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/xtheadvector/pr114194.c: New test.

Signed-off-by: Christoph Müllner 
---
 gcc/config/riscv/vector-iterators.md  | 19 +--
 .../riscv/rvv/xtheadvector/pr114194.c | 56 +++
 2 files changed, 69 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c

diff --git a/gcc/config/riscv/vector-iterators.md 
b/gcc/config/riscv/vector-iterators.md
index c2ea7e8b10a..a24e1bf078f 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -108,17 +108,24 @@ (define_c_enum "unspecv" [
   UNSPECV_FRM_RESTORE_EXIT
 ])
 
-(define_mode_iterator VI [
-  RVVM8QI RVVM4QI RVVM2QI RVVM1QI RVVMF2QI RVVMF4QI (RVVMF8QI "TARGET_MIN_VLEN 
> 32")
-
-  RVVM8HI RVVM4HI RVVM2HI RVVM1HI RVVMF2HI (RVVMF4HI "TARGET_MIN_VLEN > 32")
-
-  RVVM8SI RVVM4SI RVVM2SI RVVM1SI (RVVMF2SI "TARGET_MIN_VLEN > 32")
+;; Subset of VI with fractional LMUL types
+(define_mode_iterator VI_FRAC [
+  RVVMF2QI RVVMF4QI (RVVMF8QI "TARGET_MIN_VLEN > 32")
+  RVVMF2HI (RVVMF4HI "TARGET_MIN_VLEN > 32")
+  (RVVMF2SI "TARGET_MIN_VLEN > 32")
+])
 
+;; Subset of VI with non-fractional LMUL types
+(define_mode_iterator VI_NOFRAC [
+  RVVM8QI RVVM4QI RVVM2QI RVVM1QI
+  RVVM8HI RVVM4HI RVVM2HI RVVM1HI
+  RVVM8SI RVVM4SI RVVM2SI RVVM1SI
   (RVVM8DI "TARGET_VECTOR_ELEN_64") (RVVM4DI "TARGET_VECTOR_ELEN_64")
   (RVVM2DI "TARGET_VECTOR_ELEN_64") (RVVM1DI "TARGET_VECTOR_ELEN_64")
 ])
 
+(define_mode_iterator VI [ VI_NOFRAC (VI_FRAC "!TARGET_XTHEADVECTOR") ])
+
 ;; This iterator is the same as above but with TARGET_VECTOR_ELEN_FP_16
 ;; changed to TARGET_ZVFH.  TARGET_VECTOR_ELEN_FP_16 is also true for
 ;; TARGET_ZVFHMIN while we actually want to disable all instructions apart
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c 
b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
new file mode 100644
index 000..fc2d1349425
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr114194.c
@@ -0,0 +1,56 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gc_xtheadvector" { target { rv32 } } } */
+/* { dg-options "-march=rv64gc_xtheadvector" { target { rv64 } } } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+/*
+** foo0_1:
+** sb\tzero,0([a-x0-9]+)
+** ret
+*/
+void foo0_1 (void *p)
+{
+  __builtin_memset (p, 0, 1);
+}
+
+/*
+** foo0_7:
+** sb\tzero,0([a-x0-9]+)
+** sb\tzero,1([a-x0-9]+)
+** sb\tzero,2([a-x0-9]+)
+** sb\tzero,3([a-x0-9]+)
+** sb\tzero,4([a-x0-9]+)
+** sb\tzero,5([a-x0-9]+)
+** sb\tzero,6([a-x0-9]+)
+** ret
+*/
+void foo0_7 (void *p)
+{
+  __builtin_memset (p, 0, 7);
+}
+
+/*
+** foo1_1:
+** li\t[a-x0-9]+,1
+** sb\t[a-x0-9]+,0([a-x0-9]+)
+** ret
+*/
+void foo1_1 (void *p)
+{
+  __builtin_memset (p, 1, 1);
+}
+
+/*
+** foo1_5:
+** li\t[a-x0-9]+,1
+** sb\t[a-x0-9]+,0([a-x0-9]+)
+** sb\t[a-x0-9]+,1([a-x0-9]+)
+** sb\t[a-x0-9]+,2([a-x0-9]+)
+** sb\t[a-x0-9]+,3([a-x0-9]+)
+** sb\t[a-x0-9]+,4([a-x0-9]+)
+** ret
+*/
+void foo1_5 (void *p)
+{
+  __builtin_memset (p, 1, 5);
+}
-- 
2.44.0



Re: [PATCH 02/11] riscv: xtheadmempair: Fix CFA reg notes

2024-03-17 Thread Christoph Müllner
On Fri, Apr 28, 2023 at 8:12 AM Christoph Muellner
 wrote:
>
> From: Christoph Müllner 
>
> The current implementation triggers an assertion in
> dwarf2out_frame_debug_cfa_offset() under certain circumstances.
> The standard code uses REG_FRAME_RELATED_EXPR notes instead
> of REG_CFA_OFFSET notes when saving registers on the stack.
> So let's do this as well.
>
> gcc/ChangeLog:
>
> * config/riscv/thead.cc (th_mempair_save_regs):
> Emit REG_FRAME_RELATED_EXPR notes in prologue.
>
> Signed-off-by: Christoph Müllner 

This patch applies cleanly on GCC 13.
Ok to backport to GCC 13 (fixing PR114160)?

> ---
>  gcc/config/riscv/thead.cc | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/config/riscv/thead.cc b/gcc/config/riscv/thead.cc
> index 75203805310..d7e3cf80d9b 100644
> --- a/gcc/config/riscv/thead.cc
> +++ b/gcc/config/riscv/thead.cc
> @@ -368,8 +368,12 @@ th_mempair_save_regs (rtx operands[4])
>rtx set2 = gen_rtx_SET (operands[2], operands[3]);
>rtx insn = emit_insn (gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, set1, 
> set2)));
>RTX_FRAME_RELATED_P (insn) = 1;
> -  add_reg_note (insn, REG_CFA_OFFSET, copy_rtx (set1));
> -  add_reg_note (insn, REG_CFA_OFFSET, copy_rtx (set2));
> +
> +  REG_NOTES (insn) = alloc_EXPR_LIST (REG_FRAME_RELATED_EXPR,
> + copy_rtx (set1), REG_NOTES (insn));
> +
> +  REG_NOTES (insn) = alloc_EXPR_LIST (REG_FRAME_RELATED_EXPR,
> + copy_rtx (set2), REG_NOTES (insn));
>  }
>
>  /* Similar like riscv_restore_reg, but restores two registers from memory
> --
> 2.40.1
>


Re: [PATCH] RISC-V: Add new option -march=help to print all supported extensions

2024-02-15 Thread Christoph Müllner
On Thu, Feb 15, 2024 at 10:56 AM Kito Cheng  wrote:
>
> The output of -march=help is like below:
>
> ```
> All available -march extensions for RISC-V:
> NameVersion
> i   2.0, 2.1
> e   2.0
> m   2.0
> a   2.0, 2.1
> f   2.0, 2.2
> d   2.0, 2.2
> ...
> ```
>
> Also support -print-supported-extensions and --print-supported-extensions for
> clang compatibility.

If I remember correctly, then this feature was requested several times
in the past.
Thanks for working on this!

Reviewed-by: Christoph Müllner 

I have done a quick feature test (no bootstrapping, no check for
compiler warnings) as well.
Below you find all supported RISC-V extensions in today's master branch:

All available -march extensions for RISC-V:
NameVersion
i   2.0, 2.1
e   2.0
m   2.0
a   2.0, 2.1
f   2.0, 2.2
d   2.0, 2.2
c   2.0
v   1.0
h   1.0
zic64b  1.0
zicbom  1.0
zicbop  1.0
zicboz  1.0
ziccamoa1.0
ziccif  1.0
zicclsm 1.0
ziccrse 1.0
zicntr  2.0
zicond  1.0
zicsr   2.0
zifencei2.0
zihintntl   1.0
zihintpause 2.0
zihpm   2.0
zmmul   1.0
za128rs 1.0
za64rs  1.0
zawrs   1.0
zfa 1.0
zfh 1.0
zfhmin  1.0
zfinx   1.0
zdinx   1.0
zca 1.0
zcb 1.0
zcd 1.0
zce 1.0
zcf 1.0
zcmp1.0
zcmt1.0
zba 1.0
zbb 1.0
zbc 1.0
zbkb1.0
zbkc1.0
zbkc1.0
zbkx1.0
zbs 1.0
zk  1.0
zkn 1.0
zknd1.0
zkne1.0
zknh1.0
zkr 1.0
zks 1.0
zksed   1.0
zksh1.0
zkt 1.0
ztso1.0
zvbb1.0
zvbc1.0
zve32f  1.0
zve32x  1.0
zve64d  1.0
zve64f  1.0
zve64x  1.0
zvfbfmin1.0
zvfh1.0
zvfhmin 1.0
zvkb1.0
zvkg1.0
zvkn1.0
zvknc   1.0
zvkned  1.0
zvkng   1.0
zvknha  1.0
zvknhb  1.0
zvks1.0
zvksc   1.0
zvksed  1.0
zvksg   1.0
zvksh   1.0
zvkt1.0
zvl1024b1.0
zvl128b 1.0
zvl16384b   1.0
zvl2048b1.0
zvl256b 1.0
zvl32768b   1.0
zvl32b  1.0
zvl4096b1.0
zvl512b 1.0
zvl64b  1.0
zvl65536b   1.0
zvl8192b1.0
zhinx   1.0
zhinxmin1.0
smaia   1.0
smepmp  1.0
smstateen   1.0
ssaia   1.0
sscofpmf1.0
ssstateen   1.0
sstc1.0
svinval 1.0
svnapot 1.0
svpbmt  1.0
xcvalu  1.0
xcvelw  1.0
xcvmac  1.0
xcvsimd 

Re: [PATCH v2] RISC-V: THEAD: Fix improper immediate value for MODIFY_DISP instruction on 32-bit systems.

2024-02-05 Thread Christoph Müllner
On Mon, Feb 5, 2024 at 3:56 PM Jeff Law  wrote:
>
>
>
> On 2/5/24 05:00, Christoph Müllner wrote:
> > On Sat, Feb 3, 2024 at 2:11 PM Andreas Schwab 
> > wrote:
> >>
> >> On Jan 30 2024, Christoph Müllner wrote:
> >>
> >>> retested
> >>
> >> Nope.
> >
> > Sorry for this. I tested for no regressions in the test suite with a
> > cross-build and QEMU and did not do a Werror bootstrap build. I'll
> > provide a fix for this later today (also breaking the line as it is
> > longer than needed).
> Right.  And that's pretty standard given the state of the RISC-V
> platforms.  We've got a platform here that can bootstrap in a reasonable
> amount of time, but I haven't set that up in the CI system yet.
>
> Until such systems are common, these niggling issues are bound to show up.
>
> It's just whitespace around the HOST_WIDE_INT_PRINT_DEC and wrapping the
> long line, right?  I've got that in my tree that's bootstrapping now.  I
> don't mind committing it later today.  But if you get to it before my
> bootstrap is done, feel free to commit as pre-approved.

Pushed:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=184978cd74f962712e813030d58edc109ad9a92d

>
> jeff


Re: [PATCH v2] RISC-V: THEAD: Fix improper immediate value for MODIFY_DISP instruction on 32-bit systems.

2024-02-05 Thread Christoph Müllner
On Sat, Feb 3, 2024 at 2:11 PM Andreas Schwab  wrote:
>
> On Jan 30 2024, Christoph Müllner wrote:
>
> > retested
>
> Nope.

Sorry for this.
I tested for no regressions in the test suite with a cross-build and
QEMU and did not do a Werror bootstrap build.
I'll provide a fix for this later today (also breaking the line as it
is longer than needed).
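
For reference, the same pitfall can be reproduced with the standard C
format macros.  This is a sketch with a hypothetical helper
`format_wb_address`, using `PRId64` from <inttypes.h> as a stand-in for
HOST_WIDE_INT_PRINT_DEC; the real code prints
INTVAL (addr.offset) >> addr.shift rather than a plain argument.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Format "(reg),offset,shift" with a 64-bit offset in a way that is
   correct on both 32-bit and 64-bit hosts.  Returns what snprintf
   returns.  */
static int
format_wb_address (char *buf, size_t len, const char *reg,
                   int64_t offset, unsigned shift)
{
  assert (buf != NULL && reg != NULL);

  /* Note the spaces around PRId64.  Adjacent string literals are
     concatenated either way in C, but C++11 parses "..."MACRO as a
     literal with a user-defined suffix, hence -Wliteral-suffix.  */
  return snprintf (buf, len, "(%s),%" PRId64 ",%u", reg, offset, shift);
}
```

Using a width-specific macro avoids the original %ld problem, and the
whitespace between the literal and the macro avoids the warning that
broke the -Werror bootstrap.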


>
> ../../gcc/config/riscv/thead.cc:1144:22: error: invalid suffix on literal; 
> C++11 requires a space between literal and string macro 
> [-Werror=literal-suffix]
>  1144 |   fprintf (file, "(%s),"HOST_WIDE_INT_PRINT_DEC",%u", 
> reg_names[REGNO (addr.reg)],
>   |  ^
> cc1plus: all warnings being treated as errors
> make[3]: *** [../../gcc/config/riscv/t-riscv:127: thead.o] Error 1
>
> --
> Andreas Schwab, sch...@linux-m68k.org
> GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
> "And now for something completely different."


[PATCH] riscv: Move UNSPEC_XTHEAD* from unspecv to unspec

2024-01-30 Thread Christoph Müllner
The UNSPEC_XTHEAD* macros ended up in the unspecv enum,
which broke gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c.
The INSNs expect these unspecs to be not volatile.
Further, there is no reason to have them defined volatile.
So let's simply move the macros into the unspec enum.

With this patch we have again 0 fails in riscv.exp.

gcc/ChangeLog:

* config/riscv/riscv.md: Move UNSPEC_XTHEADFMV* to unspec enum.

Signed-off-by: Christoph Müllner 
---
 gcc/config/riscv/riscv.md | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index edcaec4a786..b320ad0210e 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -89,6 +89,10 @@ (define_c_enum "unspec" [
 
   ;; Workaround for HFmode without hardware extension
   UNSPEC_FMV_SFP16_X
+
+  ;; XTheadFmv moves
+  UNSPEC_XTHEADFMV
+  UNSPEC_XTHEADFMV_HW
 ])
 
 (define_c_enum "unspecv" [
@@ -127,10 +131,6 @@ (define_c_enum "unspecv" [
   ;; Zihintpause unspec
   UNSPECV_PAUSE
 
-  ;; XTheadFmv unspec
-  UNSPEC_XTHEADFMV
-  UNSPEC_XTHEADFMV_HW
-
   ;; XTheadInt unspec
   UNSPECV_XTHEADINT_PUSH
   UNSPECV_XTHEADINT_POP
-- 
2.43.0



Re: [PATCH v2] RISC-V: THEAD: Fix improper immediate value for MODIFY_DISP instruction on 32-bit systems.

2024-01-29 Thread Christoph Müllner
On Mon, Jan 29, 2024 at 1:32 PM Kito Cheng  wrote:
>
> LGTM

I've rebased, retested (rv64+rv32) and merged this patch.

Thanks!

>
> Jin Ma  於 2024年1月29日 週一 17:57 寫道:
>>
>> When using '%ld' to print a 'long long int' variable, 'fprintf' will
>> produce messy output on a 32-bit system, resulting in an incorrect
>> instruction being generated, such as 'th.lwib a1,(a0),-16,4294967295'. And the
>> following error occurred during compilation:
>>
>> Assembler messages:
>> Error: improper immediate value (18446744073709551615)
>>
>> gcc/ChangeLog:
>>
>> * config/riscv/thead.cc (th_print_operand_address): Change %ld
>> to %lld.
>> ---
>>  gcc/config/riscv/thead.cc | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/gcc/config/riscv/thead.cc b/gcc/config/riscv/thead.cc
>> index 2955bc5f8a9..e4b8c37bc28 100644
>> --- a/gcc/config/riscv/thead.cc
>> +++ b/gcc/config/riscv/thead.cc
>> @@ -1141,7 +1141,7 @@ th_print_operand_address (FILE *file, machine_mode 
>> mode, rtx x)
>>return true;
>>
>>  case ADDRESS_REG_WB:
>> -  fprintf (file, "(%s),%ld,%u", reg_names[REGNO (addr.reg)],
>> +  fprintf (file, "(%s),"HOST_WIDE_INT_PRINT_DEC",%u", reg_names[REGNO 
>> (addr.reg)],
>>INTVAL (addr.offset) >> addr.shift, addr.shift);
>> return true;
>>
>> --
>> 2.17.1
>>
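
For readers following along, here is a minimal sketch of the underlying
issue, assuming the offset is a 64-bit integer as described above. The
function names are hypothetical, and PRId64 merely stands in for the role
that HOST_WIDE_INT_PRINT_DEC plays in the patch. The second function only
illustrates one plausible way a 32-bit host could end up with the output
quoted in the commit message; it is not taken from GCC.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format "(reg),offset,shift" with a conversion specifier that matches
   a 64-bit offset on every host.  PRId64 is used here as a stand-in for
   a width-correct macro such as HOST_WIDE_INT_PRINT_DEC.  */
int
format_wb_address (char *buf, size_t size, const char *reg,
                   int64_t offset, unsigned shift)
{
  return snprintf (buf, size, "(%s),%" PRId64 ",%u", reg, offset, shift);
}

/* Illustration only (an assumption, not GCC code): mimic a 32-bit host
   where the 64-bit offset is consumed as two 32-bit slots, the low half
   as a signed value and the high half as an unsigned value.  */
int
format_wb_address_misread (char *buf, size_t size, const char *reg,
                           int64_t offset)
{
  uint32_t low = (uint32_t) offset;
  uint32_t high = (uint32_t) ((uint64_t) offset >> 32);
  return snprintf (buf, size, "(%s),%" PRId32 ",%" PRIu32, reg,
                   (int32_t) low, high);
}
```

With an offset of -16, the misread variant produces "(a0),-16,4294967295",
which has the same shape as the bad operand shown in the commit message,
while the width-correct variant keeps the offset and the shift apart.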


Re: [PATCH v5] RISC-V: Support XTheadVector extension

2024-01-18 Thread Christoph Müllner
On Fri, Jan 12, 2024 at 4:18 AM Jun Sha (Joshua)
 wrote:
>
> This patch series presents the GCC implementation of the XTheadVector
> extension [1].
>
> [1] https://github.com/T-head-Semi/thead-extension-spec/
>
> For some vector patterns that cannot be avoided, we use
> "!TARGET_XTHEADVECTOR" to disable them in order not to
> generate instructions that xtheadvector does not support,
> causing 10 changes in vector.md.
>
> For the th. prefix issue, we use current_output_insn and
> the ASM_OUTPUT_OPCODE hook instead of directly modifying
> patterns in vector.md.
>
> We have run the GCC test suite and can confirm that there
> are no regressions.
>
> Furthermore, we have run the tests in
> https://github.com/riscv-non-isa/rvv-intrinsic-doc/tree/main/examples,
> and all the tests passed.
>
> Co-authored-by: Jin Ma 
> Co-authored-by: Xianmiao Qu 
> Co-authored-by: Christoph Müllner 
>
> [PATCH v4] RISC-V: Introduce XTheadVector as a subset of V1.0.0
> [PATCH v5] RISC-V: Adds the prefix "th." for the instructions of XTheadVector
> [PATCH v6] RISC-V: Handle differences between XTheadvector and Vector
> [PATCH v6] RISC-V: Add support for xtheadvector-specific intrinsics
> [PATCH v6] RISC-V: Fix register overlap issue for some xtheadvector 
> instructions
> [PATCH v5] RISC-V: Rewrite some instructions using ASM targethook

All patches of this series got either "LGTM" or "OK":
* https://gcc.gnu.org/pipermail/gcc-patches/2024-January/643339.html
* https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642798.html
* https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642799.html
* https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642800.html
* https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642801.html
* https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642802.html

As mentioned earlier, I have rebased the patches, retested them locally and
(after ensuring there are no regressions) pushed them.

To all involved people: thank you very much!
A special 'thank you' goes to Juzhe, who did a great job in reviewing
the patches and providing suggestions to get the code into shape!


Re: [PATCH v4 0/3] RISC-V: Add intrinsics for Bitmanip and Scalar Crypto extensions

2024-01-15 Thread Christoph Müllner
On Mon, Jan 15, 2024 at 4:35 PM Kito Cheng  wrote:
>
> Ok :)

I've re-created changelog entries in commit messages (commit hook
rejected the commits)
and pushed.

Thanks,
Christoph


>
>
> On Mon, Jan 15, 2024 at 11:17 PM Christoph Müllner wrote:
>>
>> On Mon, Jan 15, 2024 at 9:35 AM Liao Shihua  wrote:
>> >
>> > Update v3 -> v4:
>> >   1.Typo fix.
>> >   2.Only test *intrinsic-32 on rv32 and *intrinsic-64 on rv64.
>> >   3.Update Copyright year to 2024.
>>
>> Thanks for fixing the rv32/rv64 issues!
>> I've tested this series: no regressions and all new tests pass.
>> I've also reviewed this series again, and I think it is ready.
>> I can push once a maintainer approves (e.g. Kito or Jeff).
>>
>> Thanks for working on this!
>>
>> >
>> > Update v2 -> v3:
>> >   1. Change pattern mode from X to GPR in orcb, clmul, and brev8.
>> >   2. Add emulated testsuite.
>> >   3. Removed duplicate testsuite between built-in and intrinsic.
>> >   4. Typo fix.
>> >
>> > Update v1 -> v2:
>> >   1. Rename *_intrinsic-* to *_intrinsic-XLEN.
>> >   2. Typo fix.
>> >   3. Intrinsics with immediate arguments will use macros at O0.
>> >
>> > It's a little patch that just provides a mapping from the RV intrinsics
>> > to the builtin names within GCC.
>> >
>> > Liao Shihua (3):
>> >   RISC-V: Remove the Scalar Bitmanip and Crypto Built-In function
>> > testsuites
>> >   RISC-V: Add C intrinsic for Scalar Crypto Extension
>> >   RISC-V: Add C intrinsic for Scalar Bitmanip Extension
>> >
>> >  gcc/config.gcc|   2 +-
>> >  gcc/config/riscv/bitmanip.md  |  10 +-
>> >  gcc/config/riscv/crypto.md|   4 +-
>> >  gcc/config/riscv/riscv-builtins.cc|  22 ++
>> >  gcc/config/riscv/riscv-cmo.def|  12 +-
>> >  gcc/config/riscv/riscv-ftypes.def |   2 +
>> >  gcc/config/riscv/riscv-scalar-crypto.def  |  22 +-
>> >  gcc/config/riscv/riscv_bitmanip.h | 297 +
>> >  gcc/config/riscv/riscv_crypto.h   | 309 ++
>> >  .../riscv/scalar_bitmanip_intrinsic-32.c  |  97 ++
>> >  .../scalar_bitmanip_intrinsic-64-emulated.c   |  33 ++
>> >  .../riscv/scalar_bitmanip_intrinsic-64.c  | 115 +++
>> >  .../riscv/scalar_crypto_intrinsic-32.c| 115 +++
>> >  .../riscv/scalar_crypto_intrinsic-64.c| 123 +++
>> >  .../gcc.target/riscv/zbb_32_bswap-1.c |  11 -
>> >  gcc/testsuite/gcc.target/riscv/zbb_bswap-1.c  |  11 -
>> >  gcc/testsuite/gcc.target/riscv/zbb_bswap-2.c  |  12 -
>> >  .../riscv/{zbb_32_bswap-2.c => zbb_bswap16.c} |   3 +-
>> >  gcc/testsuite/gcc.target/riscv/zbbw.c |  26 --
>> >  gcc/testsuite/gcc.target/riscv/zbc32.c|  23 --
>> >  gcc/testsuite/gcc.target/riscv/zbc64.c|  23 --
>> >  gcc/testsuite/gcc.target/riscv/zbkb32.c   |  18 -
>> >  gcc/testsuite/gcc.target/riscv/zbkb64.c   |   5 -
>> >  gcc/testsuite/gcc.target/riscv/zbkc32.c   |  17 -
>> >  gcc/testsuite/gcc.target/riscv/zbkc64.c   |  17 -
>> >  gcc/testsuite/gcc.target/riscv/zbkx32.c   |  18 -
>> >  gcc/testsuite/gcc.target/riscv/zbkx64.c   |  18 -
>> >  gcc/testsuite/gcc.target/riscv/zknd32-2.c |  28 --
>> >  gcc/testsuite/gcc.target/riscv/zknd64-2.c |  42 ---
>> >  gcc/testsuite/gcc.target/riscv/zkne32-2.c |  28 --
>> >  gcc/testsuite/gcc.target/riscv/zkne64-2.c |  34 --
>> >  .../gcc.target/riscv/zknh-sha256-32.c |  10 -
>> >  .../gcc.target/riscv/zknh-sha256-64.c |  28 --
>> >  .../gcc.target/riscv/zknh-sha512-32.c |  42 ---
>> >  .../gcc.target/riscv/zknh-sha512-64.c |  31 --
>> >  gcc/testsuite/gcc.target/riscv/zksed32-2.c|  29 --
>> >  gcc/testsuite/gcc.target/riscv/zksed64-2.c|  29 --
>> >  gcc/testsuite/gcc.target/riscv/zksh32.c   |  19 --
>> >  gcc/testsuite/gcc.target/riscv/zksh64.c   |  19 --
>> >  39 files changed, 1149 insertions(+), 555 deletions(-)
>> >  create mode 100644 gcc/config/riscv/riscv_bitmanip.h
>> >  create mode 100644 gcc/config/riscv/riscv_crypto.h
>> >  create mode 100644 
>> > gcc/testsuite/gcc.target/riscv/scalar_bitmanip_intrinsic-32.c
>> >  create mode 100644 
>> > gcc/testsuite/gcc.target/riscv/scalar_bitmanip

Re: [PATCH v4 0/3] RISC-V: Add intrinsics for Bitmanip and Scalar Crypto extensions

2024-01-15 Thread Christoph Müllner
On Mon, Jan 15, 2024 at 9:35 AM Liao Shihua  wrote:
>
> Update v3 -> v4:
>   1.Typo fix.
>   2.Only test *intrinsic-32 on rv32 and *intrinsic-64 on rv64.
>   3.Update Copyright year to 2024.

Thanks for fixing the rv32/rv64 issues!
I've tested this series: no regressions and all new tests pass.
I've also reviewed this series again, and I think it is ready.
I can push once a maintainer approves (e.g. Kito or Jeff).

Thanks for working on this!

>
> Update v2 -> v3:
>   1. Change pattern mode from X to GPR in orcb, clmul, and brev8.
>   2. Add emulated testsuite.
>   3. Removed duplicate testsuite between built-in and intrinsic.
>   4. Typo fix.
>
> Update v1 -> v2:
>   1. Rename *_intrinsic-* to *_intrinsic-XLEN.
>   2. Typo fix.
>   3. Intrinsics with immediate arguments will use macros at O0.
>
> It's a little patch that just provides a mapping from the RV intrinsics
> to the builtin names within GCC.
>
> Liao Shihua (3):
>   RISC-V: Remove the Scalar Bitmanip and Crypto Built-In function
> testsuites
>   RISC-V: Add C intrinsic for Scalar Crypto Extension
>   RISC-V: Add C intrinsic for Scalar Bitmanip Extension
>
>  gcc/config.gcc|   2 +-
>  gcc/config/riscv/bitmanip.md  |  10 +-
>  gcc/config/riscv/crypto.md|   4 +-
>  gcc/config/riscv/riscv-builtins.cc|  22 ++
>  gcc/config/riscv/riscv-cmo.def|  12 +-
>  gcc/config/riscv/riscv-ftypes.def |   2 +
>  gcc/config/riscv/riscv-scalar-crypto.def  |  22 +-
>  gcc/config/riscv/riscv_bitmanip.h | 297 +
>  gcc/config/riscv/riscv_crypto.h   | 309 ++
>  .../riscv/scalar_bitmanip_intrinsic-32.c  |  97 ++
>  .../scalar_bitmanip_intrinsic-64-emulated.c   |  33 ++
>  .../riscv/scalar_bitmanip_intrinsic-64.c  | 115 +++
>  .../riscv/scalar_crypto_intrinsic-32.c| 115 +++
>  .../riscv/scalar_crypto_intrinsic-64.c| 123 +++
>  .../gcc.target/riscv/zbb_32_bswap-1.c |  11 -
>  gcc/testsuite/gcc.target/riscv/zbb_bswap-1.c  |  11 -
>  gcc/testsuite/gcc.target/riscv/zbb_bswap-2.c  |  12 -
>  .../riscv/{zbb_32_bswap-2.c => zbb_bswap16.c} |   3 +-
>  gcc/testsuite/gcc.target/riscv/zbbw.c |  26 --
>  gcc/testsuite/gcc.target/riscv/zbc32.c|  23 --
>  gcc/testsuite/gcc.target/riscv/zbc64.c|  23 --
>  gcc/testsuite/gcc.target/riscv/zbkb32.c   |  18 -
>  gcc/testsuite/gcc.target/riscv/zbkb64.c   |   5 -
>  gcc/testsuite/gcc.target/riscv/zbkc32.c   |  17 -
>  gcc/testsuite/gcc.target/riscv/zbkc64.c   |  17 -
>  gcc/testsuite/gcc.target/riscv/zbkx32.c   |  18 -
>  gcc/testsuite/gcc.target/riscv/zbkx64.c   |  18 -
>  gcc/testsuite/gcc.target/riscv/zknd32-2.c |  28 --
>  gcc/testsuite/gcc.target/riscv/zknd64-2.c |  42 ---
>  gcc/testsuite/gcc.target/riscv/zkne32-2.c |  28 --
>  gcc/testsuite/gcc.target/riscv/zkne64-2.c |  34 --
>  .../gcc.target/riscv/zknh-sha256-32.c |  10 -
>  .../gcc.target/riscv/zknh-sha256-64.c |  28 --
>  .../gcc.target/riscv/zknh-sha512-32.c |  42 ---
>  .../gcc.target/riscv/zknh-sha512-64.c |  31 --
>  gcc/testsuite/gcc.target/riscv/zksed32-2.c|  29 --
>  gcc/testsuite/gcc.target/riscv/zksed64-2.c|  29 --
>  gcc/testsuite/gcc.target/riscv/zksh32.c   |  19 --
>  gcc/testsuite/gcc.target/riscv/zksh64.c   |  19 --
>  39 files changed, 1149 insertions(+), 555 deletions(-)
>  create mode 100644 gcc/config/riscv/riscv_bitmanip.h
>  create mode 100644 gcc/config/riscv/riscv_crypto.h
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/scalar_bitmanip_intrinsic-32.c
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/scalar_bitmanip_intrinsic-64-emulated.c
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/scalar_bitmanip_intrinsic-64.c
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/scalar_crypto_intrinsic-32.c
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/scalar_crypto_intrinsic-64.c
>  delete mode 100644 gcc/testsuite/gcc.target/riscv/zbb_32_bswap-1.c
>  delete mode 100644 gcc/testsuite/gcc.target/riscv/zbb_bswap-1.c
>  delete mode 100644 gcc/testsuite/gcc.target/riscv/zbb_bswap-2.c
>  rename gcc/testsuite/gcc.target/riscv/{zbb_32_bswap-2.c => zbb_bswap16.c} 
> (59%)
>  delete mode 100644 gcc/testsuite/gcc.target/riscv/zbbw.c
>  delete mode 100644 gcc/testsuite/gcc.target/riscv/zbc32.c
>  delete mode 100644 gcc/testsuite/gcc.target/riscv/zbc64.c
>  delete mode 100644 gcc/testsuite/gcc.target/riscv/zbkc32.c
>  delete mode 100644 gcc/testsuite/gcc.target/riscv/zbkc64.c
>  delete mode 100644 gcc/testsuite/gcc.target/riscv/zbkx32.c
>  delete mode 100644 gcc/testsuite/gcc.target/riscv/zbkx64.c
>  delete mode 100644 gcc/testsuite/gcc.target/riscv/zknd32-2.c
>  delete mode 100644 gcc/testsuite/gcc.target/riscv/zknd64-2.c
>  delete mode 100644 gcc/testsuite/gcc.target/riscv/zkne32-2.c
>  delete mode 100644 
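
To illustrate what "a mapping from the RV intrinsics to the builtin names"
can look like, here is a rough sketch. Both function names are
hypothetical and are not the names used in riscv_bitmanip.h. The
"builtin" is a portable stand-in written under the assumption that orc.b
sets every result byte to 0xff when the matching input byte has any bit
set, and to 0x00 otherwise.

```c
#include <stdint.h>

/* Portable stand-in for a compiler builtin (hypothetical name).
   Assumed orc.b semantics: each result byte is 0xff if the
   corresponding input byte is non-zero, and 0x00 otherwise.  */
static inline uint32_t
example_builtin_orc_b_32 (uint32_t x)
{
  uint32_t result = 0;
  for (int byte = 0; byte < 4; byte++)
    {
      uint32_t mask = (uint32_t) 0xff << (8 * byte);
      if (x & mask)
        result |= mask;
    }
  return result;
}

/* Intrinsic-style wrapper (hypothetical name): a thin inline function
   whose only job is to forward to the builtin.  */
static inline uint32_t
example_orc_b_32 (uint32_t x)
{
  return example_builtin_orc_b_32 (x);
}
```

A header built this way adds no logic of its own, so the existing builtin
patterns remain the single place where instruction selection happens.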

Re: [PATCH] RISC-V: THEAD: Fix ICE caused by split optimizations for XTheadFMemIdx.

2024-01-11 Thread Christoph Müllner
On Thu, Jan 11, 2024 at 4:36 PM Kito Cheng  wrote:
>
> LGTM
>
> On Thu, Jan 11, 2024 at 7:23 PM Jin Ma  wrote:
> >
> > Due to the premature split optimizations for XTheadFMemIdx, GPR
> > is allocated when reload allocates registers, resulting in the
> > following insn.

LGTM.
This was most likely not detected so far, because it only affects rv32
(test focus on xthead(f)memidx was rv64).
I've rebased, retested (no regressions) and pushed.

Thanks,
Christoph

> >
> > (insn 66 21 64 5 (set (reg:DF 14 a4 [orig:136  ] [136])
> > (mem:DF (plus:SI (reg/f:SI 15 a5 [141])
> > (ashift:SI (reg/v:SI 10 a0 [orig:137 i ] [137])
> > (const_int 3 [0x3]))) [0  S8 A64])) 218 
> > {*movdf_hardfloat_rv32}
> >  (nil))
> >
> > We currently do not support adjustments to th_m_mir/th_m_miu, so this
> > triggers an ICE. It is therefore recommended to place the split
> > optimizations after reload, to ensure an FPR is used once registers are
> > allocated.
> >
> > gcc/ChangeLog:
> >
> > * config/riscv/thead.md: Add limits for splits.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/riscv/xtheadfmemidx-medany.c: New test.
> > ---
> >  gcc/config/riscv/thead.md | 22 ---
> >  .../gcc.target/riscv/xtheadfmemidx-medany.c   | 38 +++
> >  2 files changed, 54 insertions(+), 6 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadfmemidx-medany.c
> >
> > diff --git a/gcc/config/riscv/thead.md b/gcc/config/riscv/thead.md
> > index e370774d518..5c7d4beb1b6 100644
> > --- a/gcc/config/riscv/thead.md
> > +++ b/gcc/config/riscv/thead.md
> > @@ -933,14 +933,17 @@ (define_insn_and_split "*th_fmemidx_I_a"
> > && pow2p_hwi (INTVAL (operands[2]))
> > && IN_RANGE (exact_log2 (INTVAL (operands[2])), 1, 3)"
> >"#"
> > -  "&& 1"
> > +  "&& reload_completed"
> >[(set (match_dup 0)
> >  (mem:TH_M_NOEXTF (plus:X
> >(match_dup 3)
> >(ashift:X (match_dup 1) (match_dup 2)]
> >{ operands[2] = GEN_INT (exact_log2 (INTVAL (operands [2])));
> >}
> > -)
> > +  [(set_attr "move_type" "fpload")
> > +   (set_attr "mode" "")
> > +   (set_attr "type" "fmove")
> > +   (set (attr "length") (const_int 16))])
> >
> >  (define_insn_and_split "*th_fmemidx_I_c"
> >[(set (mem:TH_M_ANYF (plus:X
> > @@ -977,7 +980,7 @@ (define_insn_and_split "*th_fmemidx_US_a"
> > && CONST_INT_P (operands[3])
> > && (INTVAL (operands[3]) >> exact_log2 (INTVAL (operands[2]))) == 
> > 0x"
> >"#"
> > -  "&& 1"
> > +  "&& reload_completed"
> >[(set (match_dup 0)
> >  (mem:TH_M_NOEXTF (plus:DI
> >(match_dup 4)
> > @@ -985,7 +988,10 @@ (define_insn_and_split "*th_fmemidx_US_a"
> >{ operands[1] = gen_lowpart (SImode, operands[1]);
> >  operands[2] = GEN_INT (exact_log2 (INTVAL (operands [2])));
> >}
> > -)
> > +  [(set_attr "move_type" "fpload")
> > +   (set_attr "mode" "")
> > +   (set_attr "type" "fmove")
> > +   (set (attr "length") (const_int 16))])
> >
> >  (define_insn_and_split "*th_fmemidx_US_c"
> >[(set (mem:TH_M_ANYF (plus:DI
> > @@ -1020,12 +1026,16 @@ (define_insn_and_split "*th_fmemidx_UZ_a"
> >"TARGET_64BIT && TARGET_XTHEADMEMIDX && TARGET_XTHEADFMEMIDX
> > && (!HARD_REGISTER_NUM_P (REGNO (operands[0])) || HARDFP_REG_P (REGNO 
> > (operands[0])))"
> >"#"
> > -  "&& 1"
> > +  "&& reload_completed"
> >[(set (match_dup 0)
> >  (mem:TH_M_NOEXTF (plus:DI
> >(match_dup 2)
> >(zero_extend:DI (match_dup 1)]
> > -)
> > +  ""
> > +  [(set_attr "move_type" "fpload")
> > +   (set_attr "mode" "")
> > +   (set_attr "type" "fmove")
> > +   (set (attr "length") (const_int 16))])
> >
> >  (define_insn_and_split "*th_fmemidx_UZ_c"
> >[(set (mem:TH_M_ANYF (plus:DI
> > diff --git a/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-medany.c 
> > b/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-medany.c
> > new file mode 100644
> > index 000..0c8060d0632
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-medany.c
> > @@ -0,0 +1,38 @@
> > +/* { dg-do compile } */
> > +/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-O3" "-Og" "-Os" "-Oz"} } */
> > +/* { dg-options "-march=rv32gc_xtheadfmemidx_xtheadfmv_xtheadmemidx 
> > -mabi=ilp32d -mcmodel=medany -O2" } */
> > +
> > +typedef union {
> > +  double v;
> > +  unsigned w;
> > +} my_t;
> > +
> > +double z;
> > +
> > +double foo (int i, int j)
> > +{
> > +
> > +  if (j)
> > +{
> > +  switch (i)
> > +   {
> > +   case 0:
> > + return 1;
> > +   case 1:
> > + return 0;
> > +   case 2:
> > + return 3.0;
> > +   }
> > +}
> > +
> > +  if (i == 1)
> > +{
> > +  my_t u;
> > +  u.v = z;
> > +  u.w = 1;
> > +  z = u.v;
> > +}
> > +  return z;
> > +}
> > +
> > +/* { dg-final { scan-assembler-times {\mth\.flrd\M} 1 } } */
> > --
> > 2.17.1
> >
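
As a side note, the pattern conditions quoted above (pow2p_hwi,
exact_log2, IN_RANGE with bounds 1 and 3) amount to turning a power-of-two
multiplier into a shift amount. Below is a minimal sketch of that
arithmetic; the helper names are hypothetical stand-ins and are not GCC's
own functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if X is a power of two (stand-in for a check like pow2p_hwi).  */
static bool
is_pow2 (uint64_t x)
{
  return x != 0 && (x & (x - 1)) == 0;
}

/* Return log2 (X) if X is a power of two, else -1 (stand-in for a
   helper like exact_log2).  */
static int
exact_log2_u64 (uint64_t x)
{
  if (!is_pow2 (x))
    return -1;
  int n = 0;
  while (x >>= 1)
    n++;
  return n;
}

/* Convert a multiplier into a shift amount, accepting only shifts in
   the range 1..3 as in the quoted pattern condition.  Returns -1 when
   the multiplier does not qualify.  */
int
scale_to_shift (uint64_t scale)
{
  int shift = exact_log2_u64 (scale);
  return (shift >= 1 && shift <= 3) ? shift : -1;
}

/* Address of the form base + (index << shift), matching the
   (plus (reg) (ashift (reg) (const_int N))) shape in the insn dump.  */
uint64_t
indexed_address (uint64_t base, uint64_t index, int shift)
{
  return base + (index << shift);
}
```

For example, a multiplier of 8 maps to a shift of 3, which is the
(const_int 3) seen in the insn quoted in the commit message, while
multipliers of 1 or 16 are rejected.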


Re: [PATCH v2] RISC-V: T-HEAD: Add support for the XTheadInt ISA extension

2024-01-10 Thread Christoph Müllner
On Tue, Jan 9, 2024 at 6:59 PM Jeff Law  wrote:
>
>
>
> On 11/17/23 00:33, Jin Ma wrote:
> > The XTheadInt ISA extension provides interrupt acceleration
> > instructions as defined in the T-Head specification:
> > * th.ipush
> > * th.ipop
> >
> > Ref:
> > https://github.com/T-head-Semi/thead-extension-spec/releases/download/2.3.0/xthead-2023-11-10-2.3.0.pdf
> >
> > gcc/ChangeLog:
> >
> >   * config/riscv/riscv-protos.h (th_int_get_mask): New prototype.
> >   (th_int_get_save_adjustment): Likewise.
> >   (th_int_adjust_cfi_prologue): Likewise.
> >   * config/riscv/riscv.cc (TH_INT_INTERRUPT): New macro.
> >   (riscv_expand_prologue): Add the processing of XTheadInt.
> >   (riscv_expand_epilogue): Likewise.
> >   * config/riscv/riscv.md: New unspec.
> >   * config/riscv/thead.cc (BITSET_P): New macro.
> >   * config/riscv/thead.md (th_int_push): New pattern.
> >   (th_int_pop): New pattern.
> >
> > gcc/testsuite/ChangeLog:
> >
> >   * gcc.target/riscv/xtheadint-push-pop.c: New test.
> Thanks for the ping earlier today.  I've looked at this patch repeatedly
> over the last few weeks, but never enough to give it a full review.
>
>
> > diff --git a/gcc/config/riscv/thead.md b/gcc/config/riscv/thead.md
> > index 2babfafb23c..4d6e16c0edc 100644
> > --- a/gcc/config/riscv/thead.md
> > +++ b/gcc/config/riscv/thead.md
>
> > +(define_insn "th_int_pop"
> > +  [(unspec_volatile [(const_int 0)] UNSPECV_XTHEADINT_POP)
> > +   (clobber (reg:SI RETURN_ADDR_REGNUM))
> > +   (clobber (reg:SI T0_REGNUM))
> > +   (clobber (reg:SI T1_REGNUM))
> > +   (clobber (reg:SI T2_REGNUM))
> > +   (clobber (reg:SI A0_REGNUM))
> > +   (clobber (reg:SI A1_REGNUM))
> > +   (clobber (reg:SI A2_REGNUM))
> > +   (clobber (reg:SI A3_REGNUM))
> > +   (clobber (reg:SI A4_REGNUM))
> > +   (clobber (reg:SI A5_REGNUM))
> > +   (clobber (reg:SI A6_REGNUM))
> > +   (clobber (reg:SI A7_REGNUM))
> > +   (clobber (reg:SI T3_REGNUM))
> > +   (clobber (reg:SI T4_REGNUM))
> > +   (clobber (reg:SI T5_REGNUM))
> > +   (clobber (reg:SI T6_REGNUM))
> > +   (return)]
> > +  "TARGET_XTHEADINT && !TARGET_64BIT"
> > +  "th.ipop"
> > +  [(set_attr "type"  "ret")
> > +   (set_attr "mode"  "SI")])
> I probably would have gone with a load type since it's the loads that are
> most likely to interact with existing code in the pipeline.  But I doubt it
> really matters in practice.
>
>
> OK for the trunk.  Thanks for your patience.

I've retested this locally (no regressions), completed the ChangeLog
in the commit message and committed.

Thanks,
Christoph


Re: [PATCH V3 0/3] RISC-V: Add intrinsics for Bitmanip and Scalar Crypto extensions

2024-01-09 Thread Christoph Müllner
The tests still fail.

gcc: Unexpected fails for rv64gc lp64d medlow
FAIL: gcc.target/riscv/scalar_bitmanip_intrinsic-32.c   -O0  (test for
excess errors)
FAIL: gcc.target/riscv/scalar_bitmanip_intrinsic-32.c   -O1  (test for
excess errors)
FAIL: gcc.target/riscv/scalar_bitmanip_intrinsic-32.c   -O2  (test for
excess errors)
FAIL: gcc.target/riscv/scalar_bitmanip_intrinsic-32.c   -Os  (test for
excess errors)
FAIL: gcc.target/riscv/scalar_bitmanip_intrinsic-32.c  -Oz  (test for
excess errors)
FAIL: gcc.target/riscv/scalar_crypto_intrinsic-32.c   -O0  (test for
excess errors)
FAIL: gcc.target/riscv/scalar_crypto_intrinsic-32.c   -O1  (test for
excess errors)
FAIL: gcc.target/riscv/scalar_crypto_intrinsic-32.c   -O2  (test for
excess errors)
FAIL: gcc.target/riscv/scalar_crypto_intrinsic-32.c   -Os  (test for
excess errors)
FAIL: gcc.target/riscv/scalar_crypto_intrinsic-32.c  -Oz  (test for
excess errors)
FAIL: gcc.target/riscv/scalar_crypto_intrinsic-64.c   -O0  (test for
excess errors)
FAIL: gcc.target/riscv/scalar_crypto_intrinsic-64.c   -O1  (test for
excess errors)
FAIL: gcc.target/riscv/scalar_crypto_intrinsic-64.c   -O2  (test for
excess errors)
FAIL: gcc.target/riscv/scalar_crypto_intrinsic-64.c   -Os  (test for
excess errors)
FAIL: gcc.target/riscv/scalar_crypto_intrinsic-64.c  -Oz  (test for
excess errors)

Note that this is not only an rv32/rv64 issue, because the -64.c tests fail as well.

gcc: Unexpected fails for rv32gc ilp32d medlow
FAIL: gcc.target/riscv/scalar_bitmanip_intrinsic-64-emulated.c   -O1
(test for excess errors)
FAIL: gcc.target/riscv/scalar_bitmanip_intrinsic-64-emulated.c   -O2
(test for excess errors)
FAIL: gcc.target/riscv/scalar_bitmanip_intrinsic-64-emulated.c   -Os
(test for excess errors)
FAIL: gcc.target/riscv/scalar_bitmanip_intrinsic-64-emulated.c  -Oz
(test for excess errors)
FAIL: gcc.target/riscv/scalar_bitmanip_intrinsic-64.c   -O0  (test for
excess errors)
FAIL: gcc.target/riscv/scalar_bitmanip_intrinsic-64.c   -O1  (test for
excess errors)
FAIL: gcc.target/riscv/scalar_bitmanip_intrinsic-64.c   -O2  (test for
excess errors)
FAIL: gcc.target/riscv/scalar_bitmanip_intrinsic-64.c   -Os  (test for
excess errors)
FAIL: gcc.target/riscv/scalar_bitmanip_intrinsic-64.c  -Oz  (test for
excess errors)
FAIL: gcc.target/riscv/scalar_crypto_intrinsic-64.c   -O0  (test for
excess errors)
FAIL: gcc.target/riscv/scalar_crypto_intrinsic-64.c   -O1  (test for
excess errors)
FAIL: gcc.target/riscv/scalar_crypto_intrinsic-64.c   -O2  (test for
excess errors)
FAIL: gcc.target/riscv/scalar_crypto_intrinsic-64.c   -Os  (test for
excess errors)
FAIL: gcc.target/riscv/scalar_crypto_intrinsic-64.c  -Oz  (test for
excess errors)




On Tue, Dec 26, 2023 at 6:47 AM Liao Shihua  wrote:
>
> Update v2 -> v3:
>   1. Change pattern mode from X to GPR in orcb, clmul, and brev8.
>   2. Add emulated testsuite.
>   3. Removed duplicate testsuite between built-in and intrinsic.
>   4. Typo fix.
>
> Update v1 -> v2:
>   1. Rename *_intrinsic-* to *_intrinsic-XLEN.
>   2. Typo fix.
>   3. Intrinsics with immediate arguments will use macros at O0.
>
> It's a little patch that just provides a mapping from the RV intrinsics
> to the builtin names within GCC.
>
> Liao Shihua (3):
>   RISC-V: Remove the Scalar Bitmanip and Crypto Built-In function
> testsuites
>   RISC-V: Add C intrinsic for Scalar Crypto Extension
>   RISC-V: Add C intrinsic for Scalar Bitmanip Extension
>
>  gcc/config.gcc|   2 +-
>  gcc/config/riscv/bitmanip.md  |  10 +-
>  gcc/config/riscv/crypto.md|   4 +-
>  gcc/config/riscv/riscv-builtins.cc|  22 ++
>  gcc/config/riscv/riscv-cmo.def|  12 +-
>  gcc/config/riscv/riscv-ftypes.def |   2 +
>  gcc/config/riscv/riscv-scalar-crypto.def  |  22 +-
>  gcc/config/riscv/riscv_bitmanip.h | 297 +
>  gcc/config/riscv/riscv_crypto.h   | 309 ++
>  .../riscv/scalar_bitmanip_intrinsic-32.c  |  96 ++
>  .../scalar_bitmanip_intrinsic-64-emulated.c   |  32 ++
>  .../riscv/scalar_bitmanip_intrinsic-64.c  | 114 +++
>  .../riscv/scalar_crypto_intrinsic-32.c| 114 +++
>  .../riscv/scalar_crypto_intrinsic-64.c| 122 +++
>  gcc/testsuite/gcc.target/riscv/zbbw.c |  26 --
>  gcc/testsuite/gcc.target/riscv/zbc32.c|  23 --
>  gcc/testsuite/gcc.target/riscv/zbc64.c|  23 --
>  gcc/testsuite/gcc.target/riscv/zbkb32.c   |  18 -
>  gcc/testsuite/gcc.target/riscv/zbkb64.c   |   5 -
>  gcc/testsuite/gcc.target/riscv/zbkc32.c   |  17 -
>  gcc/testsuite/gcc.target/riscv/zbkc64.c   |  17 -
>  gcc/testsuite/gcc.target/riscv/zbkx32.c   |  18 -
>  gcc/testsuite/gcc.target/riscv/zbkx64.c   |  18 -
>  gcc/testsuite/gcc.target/riscv/zknd32-2.c |  28 --
>  gcc/testsuite/gcc.target/riscv/zknd64-2.c |  42 ---
>  gcc/testsuite/gcc.target/riscv/zkne32-2.c |  28 --
>  

Re: Re: [PATCH v4] RISC-V: Adds the prefix "th." for the instructions of XTheadVector.

2024-01-04 Thread Christoph Müllner
On Thu, Jan 4, 2024 at 10:18 AM juzhe.zh...@rivai.ai
 wrote:
>
> \ No newline at end of file
>
> Each file needs newline.
>
> I am not able to review arch stuff. This needs kito.
>
> Besides, Andrew Pinski wants us to defer theadvector to GCC-15.

Maybe I misread this (sorry if so), but I thought that was answered by Kito here:
  https://gcc.gnu.org/pipermail/gcc-patches/2024-January/641723.html




>
> I have no strong opinion here.
>
> 
> juzhe.zh...@rivai.ai
>
>
> From: joshua
> Sent: 2024-01-04 17:15
> To: 钟居哲; Jeff Law; gcc-patches
> CC: jim.wilson.gcc; palmer; andrew; philipp.tomsich; Christoph Müllner;
> jinma; Cooper Qu
> Subject: Re:Re: [PATCH v4] RISC-V: Adds the prefix "th." for the instructions of
> XTheadVector.
> Hi Juzhe,
>
> So is the following patch that this patch relies on OK to commit?
> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/641533.html
>
> Joshua
>
>
>
>
> --
> From: 钟居哲
> Sent: Tuesday, January 2, 2024 06:57
> To: Jeff Law;
> "cooper.joshua";
> "gcc-patches"
> CC: "jim.wilson.gcc"; palmer;
> andrew; "philipp.tomsich";
> "Christoph Müllner";
> jinma; Cooper Qu
> Subject: Re: Re: [PATCH v4] RISC-V: Adds the prefix "th." for the instructions of
> XTheadVector.
>
>
> This is OK from my side.
> But before committing this patch, I think we need this patch first:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/641533.html
>
>
> I will be back to work so I will take a look at other patches today.
> juzhe.zh...@rivai.ai
>
>
> From: Jeff Law
> Date: 2024-01-01 01:43
> To: Jun Sha (Joshua); gcc-patches
> CC: jim.wilson.gcc; palmer; andrew; philipp.tomsich; christoph.muellner; 
> juzhe.zhong; Jin Ma; Xianmiao Qu
> Subject: Re: [PATCH v4] RISC-V: Adds the prefix "th." for the instructions of 
> XTheadVector.
>
>
>
> On 12/28/23 21:19, Jun Sha (Joshua) wrote:
> > This patch adds th. prefix to all XTheadVector instructions by
> > implementing new assembly output functions. We only check the
> > prefix is 'v', so that no extra attribute is needed.
> >
> > gcc/ChangeLog:
> >
> > * config/riscv/riscv-protos.h (riscv_asm_output_opcode):
> > New function to add assembler insn code prefix/suffix.
> > * config/riscv/riscv.cc (riscv_asm_output_opcode): Likewise.
> > * config/riscv/riscv.h (ASM_OUTPUT_OPCODE): Likewise.
> >
> > Co-authored-by: Jin Ma 
> > Co-authored-by: Xianmiao Qu 
> > Co-authored-by: Christoph Müllner 
> > ---
> >   gcc/config/riscv/riscv-protos.h|  1 +
> >   gcc/config/riscv/riscv.cc  | 14 ++
> >   gcc/config/riscv/riscv.h   |  4 
> >   .../gcc.target/riscv/rvv/xtheadvector/prefix.c | 12 
> >   4 files changed, 31 insertions(+)
> >   create mode 100644 
> > gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/prefix.c
> >
> > diff --git a/gcc/config/riscv/riscv-protos.h 
> > b/gcc/config/riscv/riscv-protos.h
> > index 31049ef7523..5ea54b45703 100644
> > --- a/gcc/config/riscv/riscv-protos.h
> > +++ b/gcc/config/riscv/riscv-protos.h
> > @@ -102,6 +102,7 @@ struct riscv_address_info {
> >   };
> >
> >   /* Routines implemented in riscv.cc.  */
> > +extern const char *riscv_asm_output_opcode (FILE *asm_out_file, const char 
> > *p);
> >   extern enum riscv_symbol_type riscv_classify_symbolic_expression (rtx);
> >   extern bool riscv_symbolic_constant_p (rtx, enum riscv_symbol_type *);
> >   extern int riscv_float_const_rtx_index_for_fli (rtx);
> > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> > index 0d1cbc5cb5f..ea1d59d9cf2 100644
> > --- a/gcc/config/riscv/riscv.cc
> > +++ b/gcc/config/riscv/riscv.cc
> > @@ -5636,6 +5636,20 @@ riscv_get_v_regno_alignment (machine_mode mode)
> > return lmul;
> >   }
> >
> > +/* Define ASM_OUTPUT_OPCODE to do anything special before
> > +   emitting an opcode.  */
> > +const char *
> > +riscv_asm_output_opcode (FILE *asm_out_file, const char *p)
> > +{
> > +  /* We need to add th. prefix to all the xtheadvector
> > + insturctions here.*/
> > +  if (TARGET_XTHEADVECTOR && current_output_insn != NULL_RTX &&
> > +  p[0] == 'v')
> > +fputs ("th.", asm_out_file);
> > +
> > +  return p;
> Just a formatting nit. The GNU standards break lines before the
> operator, not after.  So
>if (TARGET_XTHEADVECTOR
>&& current_output_insn != NULL
>&& p[0] == 'v')
>
> Note that current_output_insn is "extern rtx_insn *", so use NULL, not
> NULL_RTX.
>
> Neither of these nits require a new version for review.  Just fix them.
>
> If Juzhe is fine with this, so am I.  We can refine it if necessary later.
>
> jeff
>
>
>
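
For anyone skimming the thread, here is a small sketch of the prefixing
idea discussed above, written with the line breaks before the operators
as Jeff suggests. It is an assumption-laden illustration and not the
committed hook: the function names and the boolean flag are hypothetical,
and the current_output_insn check from the patch is left out.

```c
#include <stdbool.h>
#include <stdio.h>

/* Return "th." when the (hypothetical) xtheadvector flag is set and
   the opcode starts with 'v'; otherwise return an empty string.  */
const char *
opcode_prefix (const char *opcode, bool xtheadvector)
{
  if (xtheadvector
      && opcode != NULL
      && opcode[0] == 'v')
    return "th.";
  return "";
}

/* Write the possibly prefixed opcode into BUF.  The opcode text itself
   is left untouched, as in the patch, which only emits a prefix.  */
int
format_opcode (char *buf, size_t size, const char *opcode,
               bool xtheadvector)
{
  return snprintf (buf, size, "%s%s",
                   opcode_prefix (opcode, xtheadvector), opcode);
}
```

Checking only the first character keeps the sketch free of per-instruction
tables, which matches the "no extra attribute is needed" remark in the
commit message.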


Re: [PATCH v4] RISC-V: Change csr_operand into vector_length_operand for vsetvl patterns.

2024-01-02 Thread Christoph Müllner
On Tue, Jan 2, 2024 at 2:35 AM juzhe.zh...@rivai.ai
 wrote:
>
> LGTM, assuming you have passed the regression tests.

Committed.
I've rebased this patch, validated that there are no regressions with the patch,
and reworded the commit message a bit before that.

>
>
> 
> juzhe.zh...@rivai.ai
>
>
> From: Jun Sha (Joshua)
> Date: 2023-12-29 12:10
> To: gcc-patches
> CC: jim.wilson.gcc; palmer; andrew; philipp.tomsich; jeffreyalaw; 
> christoph.muellner; juzhe.zhong; Jun Sha (Joshua); Jin Ma; Xianmiao Qu
> Subject: [PATCH v4] RISC-V: Change csr_operand into vector_length_operand for 
> vsetvl patterns.
> This patch uses vector_length_operand instead of csr_operand for
> vsetvl patterns, so that changes for vector will not affect scalar
> patterns using csr_operand in riscv.md.
>
> gcc/ChangeLog:
>
> * config/riscv/vector.md:
> Use vector_length_operand for vsetvl patterns.
>
> Co-authored-by: Jin Ma 
> Co-authored-by: Xianmiao Qu 
> Co-authored-by: Christoph Müllner 
> ---
> gcc/config/riscv/vector.md | 8 
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
> index f607d768b26..b5a9055cdc4 100644
> --- a/gcc/config/riscv/vector.md
> +++ b/gcc/config/riscv/vector.md
> @@ -1496,7 +1496,7 @@
> (define_insn "@vsetvl"
>[(set (match_operand:P 0 "register_operand" "=r")
> - (unspec:P [(match_operand:P 1 "csr_operand" "rK")
> + (unspec:P [(match_operand:P 1 "vector_length_operand" "rK")
>(match_operand 2 "const_int_operand" "i")
>(match_operand 3 "const_int_operand" "i")
>(match_operand 4 "const_int_operand" "i")
> @@ -1542,7 +1542,7 @@
> ;; in vsetvl instruction pattern.
> (define_insn "@vsetvl_discard_result"
>[(set (reg:SI VL_REGNUM)
> - (unspec:SI [(match_operand:P 0 "csr_operand" "rK")
> + (unspec:SI [(match_operand:P 0 "vector_length_operand" "rK")
> (match_operand 1 "const_int_operand" "i")
> (match_operand 2 "const_int_operand" "i")] UNSPEC_VSETVL))
> (set (reg:SI VTYPE_REGNUM)
> @@ -1564,7 +1564,7 @@
> ;; such pattern can allow us gain benefits of these optimizations.
> (define_insn_and_split "@vsetvl_no_side_effects"
>[(set (match_operand:P 0 "register_operand" "=r")
> - (unspec:P [(match_operand:P 1 "csr_operand" "rK")
> + (unspec:P [(match_operand:P 1 "vector_length_operand" "rK")
>(match_operand 2 "const_int_operand" "i")
>(match_operand 3 "const_int_operand" "i")
>(match_operand 4 "const_int_operand" "i")
> @@ -1608,7 +1608,7 @@
>[(set (match_operand:DI 0 "register_operand")
>  (sign_extend:DI
>(subreg:SI
> - (unspec:DI [(match_operand:P 1 "csr_operand")
> + (unspec:DI [(match_operand:P 1 "vector_length_operand")
> (match_operand 2 "const_int_operand")
> (match_operand 3 "const_int_operand")
> (match_operand 4 "const_int_operand")
> --
> 2.17.1
>
>


Re: [PR target/110201] Fix operand types for various scalar crypto insns

2023-12-14 Thread Christoph Müllner
On Fri, Dec 15, 2023 at 12:36 AM Jeff Law  wrote:
>
>
>
> On 12/14/23 02:46, Christoph Müllner wrote:
> > On Tue, Jun 20, 2023 at 12:34 AM Jeff Law via Gcc-patches
> >  wrote:
> >>
> >>
> >> A handful of the scalar crypto instructions are supposed to take a
> >> constant integer argument 0..3 inclusive.  A suitable constraint was
> >> created and used for this purpose (D03), but the operand's predicate is
> >> "register_operand".  That's just wrong.
> >>
> >> This patch adds a new predicate "const_0_3_operand" and fixes the
> >> relevant insns to use it.  One could argue the constraint is redundant
> >> now (and you'd be correct).  I wouldn't lose sleep if someone wanted
> >> that removed, in which case I'll spin up a V2.
> >>
> >> The testsuite was broken in a way that made it consistent with the
> >> compiler, so the tests passed, when they really should have been issuing
> >> errors all along.
> >>
> >> This patch adjusts the existing tests so that they all expect a
> >> diagnostic on the invalid operand usage (including out of range
> >> constants).  It adds new tests with proper constants, testing the
> >> extremes of valid values.
> >>
> >> OK for the trunk, or should we remove the D03 constraint?
> >
> > Reviewed-by: Christoph Muellner 
> >
> > The patch does not apply cleanly anymore, because there were some
> > small changes in crypto.md.
> Here's an update to that old patch that also takes care of the pattern
> where we allow 0..10 inclusive, but not registers.
>
> Regression tested on rv64gc without new failures.  It'll need a
> ChangeLog when approved, but that's easy to adjust.

Looks good and tests pass for rv64gc and rv32gc.

Reviewed-by: Christoph Muellner 
Tested-by: Christoph Muellner 


Re: [PATCH] RISC-V: fix scalar crypto pattern

2023-12-14 Thread Christoph Müllner
On Thu, Dec 14, 2023 at 1:40 AM Jeff Law  wrote:
> On 12/13/23 02:03, Christoph Müllner wrote:
> > On Wed, Dec 13, 2023 at 9:22 AM Liao Shihua  wrote:
> >>
> >> In Scalar Crypto Built-In functions, some require immediate parameters,
> >> But register_operand are incorrectly used in the pattern.
> >>
> >> E.g.:
> >> __builtin_riscv_aes64ks1i(rs1,1)
> >> Before:
> >>li a5,1
> >>aes64ks1i a0,a0,a5
> >>
> >>Assembler messages:
> >>Error: instruction aes64ks1i requires absolute expression
> >>
> >> After:
> >>aes64ks1i a0,a0,1
> >
> > Looks good to me (also tested with rv32 and rv64).
> > (I was actually surprised that the D03 constraint was not sufficient)
> >
> > Reviewed-by: Christoph Muellner 
> > Tested-by: Christoph Muellner 
> >
> > Nit: I would prefer to separate arguments with a comma followed by a space.
> > Even if the existing code was not written like that.
> > > E.g. __builtin_riscv_sm4ed(rs1,rs2,1); -> __builtin_riscv_sm4ed(rs1, rs2, 1);
> >
> > I propose to remove the builtin tests for scalar crypto and scalar bitmanip
> > as part of the patchset that adds the intrinsic tests (no value in
> > duplicated tests).
> >
> >> gcc/ChangeLog:
> >>
> >>  * config/riscv/crypto.md: Use immediate_operand instead of 
> >> register_operand.
> You should mention the actual patterns changed.
>
> I would strongly recommend adding some tests that out of range cases are
> rejected (out of range constants as well as a variable for that last
> argument).  I did that in my patch from June to fix this problem (which
> was never acked/reviewed).

Sorry, I was not aware of this patch.
Since Jeff's patch was here first and also includes more tests, I
propose to move forward with his patch (but I'm not a maintainer!).
Therefore, I've reviewed Jeff's patch and replied to his email.

FWIW: Jeff's patch can be found here:
  https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622233.html


Re: [PR target/110201] Fix operand types for various scalar crypto insns

2023-12-14 Thread Christoph Müllner
On Tue, Jun 20, 2023 at 12:34 AM Jeff Law via Gcc-patches
 wrote:
>
>
> A handful of the scalar crypto instructions are supposed to take a
> constant integer argument 0..3 inclusive.  A suitable constraint was
> created and used for this purpose (D03), but the operand's predicate is
> "register_operand".  That's just wrong.
>
> This patch adds a new predicate "const_0_3_operand" and fixes the
> relevant insns to use it.  One could argue the constraint is redundant
> now (and you'd be correct).  I wouldn't lose sleep if someone wanted
> that removed, in which case I'll spin up a V2.
>
> The testsuite was broken in a way that made it consistent with the
> compiler, so the tests passed, when they really should have been issuing
> errors all along.
>
> This patch adjusts the existing tests so that they all expect a
> diagnostic on the invalid operand usage (including out of range
> constants).  It adds new tests with proper constants, testing the
> extremes of valid values.
>
> OK for the trunk, or should we remove the D03 constraint?

Reviewed-by: Christoph Muellner 

The patch does not apply cleanly anymore, because there were some
small changes in crypto.md.


Re: [PATCH] RISC-V: fix scalar crypto pattern

2023-12-13 Thread Christoph Müllner
On Wed, Dec 13, 2023 at 9:22 AM Liao Shihua  wrote:
>
> In Scalar Crypto Built-In functions, some require immediate parameters,
> But register_operand are incorrectly used in the pattern.
>
> E.g.:
>__builtin_riscv_aes64ks1i(rs1,1)
>Before:
>   li a5,1
>   aes64ks1i a0,a0,a5
>
>   Assembler messages:
>   Error: instruction aes64ks1i requires absolute expression
>
>After:
>   aes64ks1i a0,a0,1

Looks good to me (also tested with rv32 and rv64).
(I was actually surprised that the D03 constraint was not sufficient)

Reviewed-by: Christoph Muellner 
Tested-by: Christoph Muellner 

Nit: I would prefer to separate arguments with a comma followed by a space.
Even if the existing code was not written like that.
E.g. __builtin_riscv_sm4ed(rs1,rs2,1); -> __builtin_riscv_sm4ed(rs1, rs2, 1);

I propose to remove the builtin tests for scalar crypto and scalar bitmanip
as part of the patchset that adds the intrinsic tests (no value in
duplicated tests).

> gcc/ChangeLog:
>
> * config/riscv/crypto.md: Use immediate_operand instead of 
> register_operand.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/zknd32.c: Use immediate instead of parameter.
> * gcc.target/riscv/zknd64.c: Ditto.
> * gcc.target/riscv/zkne32.c: Ditto.
> * gcc.target/riscv/zkne64.c: Ditto.
> * gcc.target/riscv/zksed32.c: Ditto.
> * gcc.target/riscv/zksed64.c: Ditto.
>
> ---
>  gcc/config/riscv/crypto.md   | 16 
>  gcc/testsuite/gcc.target/riscv/zknd32.c  |  8 
>  gcc/testsuite/gcc.target/riscv/zknd64.c  |  4 ++--
>  gcc/testsuite/gcc.target/riscv/zkne32.c  |  8 
>  gcc/testsuite/gcc.target/riscv/zkne64.c  |  4 ++--
>  gcc/testsuite/gcc.target/riscv/zksed32.c |  8 
>  gcc/testsuite/gcc.target/riscv/zksed64.c |  8 
>  7 files changed, 28 insertions(+), 28 deletions(-)
>
> diff --git a/gcc/config/riscv/crypto.md b/gcc/config/riscv/crypto.md
> index 03a1d03397d..c45f12e421f 100644
> --- a/gcc/config/riscv/crypto.md
> +++ b/gcc/config/riscv/crypto.md
> @@ -148,7 +148,7 @@
>[(set (match_operand:SI 0 "register_operand" "=r")
>  (unspec:SI [(match_operand:SI 1 "register_operand" "r")
> (match_operand:SI 2 "register_operand" "r")
> -   (match_operand:SI 3 "register_operand" "D03")]
> +   (match_operand:SI 3 "immediate_operand" "D03")]
> UNSPEC_AES_DSI))]
>"TARGET_ZKND && !TARGET_64BIT"
>"aes32dsi\t%0,%1,%2,%3"
> @@ -158,7 +158,7 @@
>[(set (match_operand:SI 0 "register_operand" "=r")
>  (unspec:SI [(match_operand:SI 1 "register_operand" "r")
> (match_operand:SI 2 "register_operand" "r")
> -   (match_operand:SI 3 "register_operand" "D03")]
> +   (match_operand:SI 3 "immediate_operand" "D03")]
> UNSPEC_AES_DSMI))]
>"TARGET_ZKND && !TARGET_64BIT"
>"aes32dsmi\t%0,%1,%2,%3"
> @@ -193,7 +193,7 @@
>  (define_insn "riscv_aes64ks1i"
>[(set (match_operand:DI 0 "register_operand" "=r")
>  (unspec:DI [(match_operand:DI 1 "register_operand" "r")
> -   (match_operand:SI 2 "register_operand" "DsA")]
> +   (match_operand:SI 2 "immediate_operand" "DsA")]
> UNSPEC_AES_KS1I))]
>"(TARGET_ZKND || TARGET_ZKNE) && TARGET_64BIT"
>"aes64ks1i\t%0,%1,%2"
> @@ -214,7 +214,7 @@
>[(set (match_operand:SI 0 "register_operand" "=r")
>  (unspec:SI [(match_operand:SI 1 "register_operand" "r")
> (match_operand:SI 2 "register_operand" "r")
> -   (match_operand:SI 3 "register_operand" "D03")]
> +   (match_operand:SI 3 "immediate_operand" "D03")]
> UNSPEC_AES_ESI))]
>"TARGET_ZKNE && !TARGET_64BIT"
>"aes32esi\t%0,%1,%2,%3"
> @@ -224,7 +224,7 @@
>[(set (match_operand:SI 0 "register_operand" "=r")
>  (unspec:SI [(match_operand:SI 1 "register_operand" "r")
> (match_operand:SI 2 "register_operand" "r")
> -   (match_operand:SI 3 "register_operand" "D03")]
> +   (match_operand:SI 3 "immediate_operand" "D03")]
> UNSPEC_AES_ESMI))]
>"TARGET_ZKNE && !TARGET_64BIT"
>"aes32esmi\t%0,%1,%2,%3"
> @@ -431,7 +431,7 @@
>[(set (match_operand:SI 0 "register_operand" "=r")
>  (unspec:SI [(match_operand:SI 1 "register_operand" "r")
> (match_operand:SI 2 "register_operand" "r")
> -   (match_operand:SI 3 "register_operand" "D03")]
> +   (match_operand:SI 3 "immediate_operand" "D03")]
> SM4_OP))]
>"TARGET_ZKSED && !TARGET_64BIT"
>"\t%0,%1,%2,%3"
> @@ -442,7 +442,7 @@
>  (sign_extend:DI
>   (unspec:SI [(match_operand:SI 1 "register_operand" "r")
>  (match_operand:SI 2 "register_operand" "r")
> - 

Re: [PATCH v2] RISC-V: Supports RISC-V Profiles in '-march' option.

2023-12-12 Thread Christoph Müllner
On Tue, Dec 12, 2023 at 1:08 PM Jiawei  wrote:
>
> Supports RISC-V profiles[1] in -march option.
>
> Default input set the profile is before other formal extensions.
>
> V2: Fixes some format errors and adds code comments for parse function
> Thanks for Jeff Law's review and comments.
>
> [1]https://github.com/riscv/riscv-profiles/blob/main/profiles.adoc
>
> gcc/ChangeLog:
>
> * common/config/riscv/riscv-common.cc (struct riscv_profiles):
>   New struct.
> (riscv_subset_list::parse_profiles): New function.
> (riscv_subset_list::parse): New table.
> * config/riscv/riscv-subset.h: New protype.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/arch-31.c: New test.
> * gcc.target/riscv/arch-32.c: New test.
> * gcc.target/riscv/arch-33.c: New test.
> * gcc.target/riscv/arch-34.c: New test.

For the positive tests (-31.c and -33.c) it would be great to test if
the enabled extension's test macros are set.
Something like this would do:
#if (!(defined __riscv_zicsr) || \
  !(defined __riscv_...))
#error "Feature macros not defined"
#endif

Also, positive tests for RVI20U32 and RVI20U64 would be nice.

>
> ---
>  gcc/common/config/riscv/riscv-common.cc  | 83 +++-
>  gcc/config/riscv/riscv-subset.h  |  2 +
>  gcc/testsuite/gcc.target/riscv/arch-31.c |  5 ++
>  gcc/testsuite/gcc.target/riscv/arch-32.c |  5 ++
>  gcc/testsuite/gcc.target/riscv/arch-33.c |  5 ++
>  gcc/testsuite/gcc.target/riscv/arch-34.c |  7 ++
>  6 files changed, 106 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/arch-31.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/arch-32.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/arch-33.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/arch-34.c
>
> diff --git a/gcc/common/config/riscv/riscv-common.cc 
> b/gcc/common/config/riscv/riscv-common.cc
> index 4d5a2f874a2..8b674a4a280 100644
> --- a/gcc/common/config/riscv/riscv-common.cc
> +++ b/gcc/common/config/riscv/riscv-common.cc
> @@ -195,6 +195,12 @@ struct riscv_ext_version
>int minor_version;
>  };
>
> +struct riscv_profiles
> +{
> +  const char *profile_name;
> +  const char *profile_string;
> +};
> +
>  /* All standard extensions defined in all supported ISA spec.  */
>  static const struct riscv_ext_version riscv_ext_version_table[] =
>  {
> @@ -379,6 +385,42 @@ static const struct riscv_ext_version 
> riscv_combine_info[] =
>{NULL, ISA_SPEC_CLASS_NONE, 0, 0}
>  };
>
> +/* This table records the mapping form RISC-V Profiles into march string.  */
> +static const riscv_profiles riscv_profiles_table[] =
> +{
> +  /* RVI20U only contains the base extesnion 'i' as mandatory extension.  */
> +  {"RVI20U64", "rv64i"},
> +  {"RVI20U32", "rv32i"},
> +
> +  /* RVA20U contains the 'i,m,a,f,d,c,zicsr' as mandatory extensions.
> + Currently we don't have zicntr,ziccif,ziccrse,ziccamoa,
> + zicclsm,za128rs yet.   */
> +  {"RVA20U64", "rv64imafdc_zicsr"},
> +
> +  /* RVA20S64 mandatory include all the extensions in RVA20U64 and
> + additonal 'zifencei' as mandatory extensions.
> + Notes that ss1p11, svbare, sv39, svade, sscptr, ssvecd, sstvala should
> + control by binutils.  */
> +  {"RVA20S64", "rv64imafdc_zicsr_zifencei"},
> +
> +  /* RVA22U contains the 'i,m,a,f,d,c,zicsr,zihintpause,zba,zbb,zbs,
> + zicbom,zicbop,zicboz,zfhmin,zkt' as mandatory extensions.
> + Currently we don't have zicntr,zihpm,ziccif,ziccrse,ziccamoa,
> + zicclsm,zic64b,za64rs yet.  */

I would prefer that we implement the missing extensions that start
with 'z' as "dummy" extensions.
I.e., they (currently?) don't affect code generation, but they will be
passed on to the assembler and will become part of the
Tag_RISCV_arch string.

I admit that such "dummy" extensions may not be preferred by
maintainers, but we already have a precedent with Zkt.

I consider an incomplete expansion of a profile as misleading.
And later changes to complete the expansion could be called out as
"breaking changes".

> +  {"RVA22U64", "rv64imafdc_zicsr_zihintpause_zba_zbb_zbs"
>   \
> +   "_zicbom_zicbop_zicboz_zfhmin_zkt"},
> +
> +  /* RVA22S64 mandatory include all the extensions in RVA22U64 and
> + additonal 'zifencei,svpbmt,svinval' as mandatory extensions.
> + Notes that ss1p12, svbare, sv39, svade, sscptr, ssvecd, sstvala,
> + scounterenw extentions should control by binutils.  */

Typo: extentions -> extensions

I want to challenge the implementation of RVA22S64 support
(or in general all S-mode and M-mode profile support) in toolchains:
* Adding 's*'/'m*' extensions as dummy extensions won't have much use
* Having an incomplete expansion is misleading (see above)
* I doubt that RVA22S64 would find many users
Therefore, I would not add support for S-mode and M-mode profiles.

> +  {"RVA22S64","rv64imafdc_zicsr_zifencei_zihintpause"
>   \
> +   

Re: [PATCH V2 0/2] RISC-V: Add intrinsics for Bitmanip and Scalar Crypto extensions

2023-12-07 Thread Christoph Müllner
On Thu, Dec 7, 2023 at 11:18 AM Liao Shihua  wrote:
>
> In accordance with the suggestions of Christoph Müllner, the following
> amendments are made.
>
> Update v1 -> v2:
>   1. Rename *_intrinsic-* to *_intrinsic-XLEN.
>   2. Typo fix.
>   3. Intrinsics with immediate arguments will use macros at -O0.
>
> It's a little patch that just provides a mapping from the RV intrinsics to
> the builtin names within GCC.

Thanks for the update!

I think this patchset was not properly tested as I see the tests failing.

$ /opt/riscv-mainline/bin/riscv64-unknown-linux-gnu-gcc
-march=rv64gc_zbb_zbc_zbkb_zbkc_zbkx -mabi=lp64d
/home/cm/src/gcc/riscv-mainline/gcc/testsuite/gcc.target/riscv/scalar_bitmanip_intrinsic-64.c
In file included from
/home/cm/src/gcc/riscv-mainline/gcc/testsuite/gcc.target/riscv/scalar_bitmanip_intrinsic-64.c:5:
/opt/riscv-mainline/lib/gcc/riscv64-unknown-linux-gnu/14.0.0/include/riscv_bitmanip.h:
In function '__riscv_orc_b_32':
/opt/riscv-mainline/lib/gcc/riscv64-unknown-linux-gnu/14.0.0/include/riscv_bitmanip.h:61:10:
error: implicit declaration of function '__builtin_riscv_orc_b_32';
did you mean '__builtin_riscv_orc_b_64'?
[-Wimplicit-function-declaration]
   61 |   return __builtin_riscv_orc_b_32 (x);
  |  ^~~~
  |  __builtin_riscv_orc_b_64

The spec says: Emulated with rev8+sext.w on RV64.
But I think this is a bug in the spec and should be "orc.b + sext.w".
Still, you need to handle that somehow.
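
To illustrate why "orc.b + sext.w" looks like the plausible emulation, here
is a small portable C model of the orc.b semantics. This is only a sketch:
the helper names model_orc_b_64 and model_orc_b_32_on_rv64 are made up for
this mail and are not part of GCC or the intrinsics spec.

```c
#include <stdint.h>

/* Reference model of orc.b on a 64-bit register: every byte that
   contains at least one set bit becomes 0xff, zero bytes stay 0x00.  */
static uint64_t
model_orc_b_64 (uint64_t x)
{
  uint64_t result = 0;
  for (int i = 0; i < 64; i += 8)
    if ((x >> i) & 0xff)
      result |= (uint64_t) 0xff << i;
  return result;
}

/* Hypothetical RV64 emulation of the 32-bit variant: run the 64-bit
   operation and keep only the low word (the part sext.w reads).
   orc.b works byte by byte, so the upper half of the input cannot
   influence the low 32 bits of the result.  */
static uint32_t
model_orc_b_32_on_rv64 (uint32_t x)
{
  return (uint32_t) model_orc_b_64 ((uint64_t) x);
}
```

Since the operation is byte-wise, no byte swap (rev8) is needed to get the
32-bit result on RV64; truncating/sign-extending the low word is enough.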

$ /opt/riscv-mainline/bin/riscv64-unknown-linux-gnu-gcc
-march=rv64gc_zknd_zkne_zknh_zksed_zksh -mabi=lp64 -mabi=lp64d
/home/cm/src/gcc/riscv-mainline/gcc/testsuite/gcc.target/riscv/scalar_crypto_intrinsic-64.c
/tmp/ccynQLn2.s: Assembler messages:
/tmp/ccynQLn2.s:127: Error: instruction aes64ks1i requires absolute expression
/tmp/ccynQLn2.s:593: Error: instruction sm4ed requires absolute expression
/tmp/ccynQLn2.s:633: Error: instruction sm4ks requires absolute expression

The absolute expression means that you cannot use a variable but must
use an immediate.
E.g.:
uint64_t foo4(uint64_t rs1)
{
return __riscv_aes64ks1i(rs1, 3);
}
Here the 3 will be encoded into the instruction.

There are probably more issues, but I stopped investigating after these two.

Also, there are some missing spaces to separate arguments. E.g.:
  return __riscv_aes64ks1i(rs1,rnum);
...should be...
  return __riscv_aes64ks1i(rs1, rnum);

Please make sure to test these patches for RV32 and RV64 before
sending a new revision.
If you run into issues that you can't resolve, then just reach out.

BR
Christoph

>
>
> Liao Shihua (2):
>   Add C intrinsics of Scalar Crypto Extension
>   Add C intrinsics of Bitmanip Extension
>
>  gcc/config.gcc|   2 +-
>  gcc/config/riscv/riscv-builtins.cc|  22 ++
>  gcc/config/riscv/riscv-ftypes.def |   2 +
>  gcc/config/riscv/riscv-scalar-crypto.def  |  18 +
>  gcc/config/riscv/riscv_bitmanip.h | 297 +
>  gcc/config/riscv/riscv_crypto.h   | 309 ++
>  .../riscv/scalar_bitmanip_intrinsic-32.c  |  97 ++
>  .../riscv/scalar_bitmanip_intrinsic-64.c  | 115 +++
>  .../riscv/scalar_crypto_intrinsic-32.c| 115 +++
>  .../riscv/scalar_crypto_intrinsic-64.c| 122 +++
>  10 files changed, 1098 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/config/riscv/riscv_bitmanip.h
>  create mode 100644 gcc/config/riscv/riscv_crypto.h
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/scalar_bitmanip_intrinsic-32.c
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/scalar_bitmanip_intrinsic-64.c
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/scalar_crypto_intrinsic-32.c
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/scalar_crypto_intrinsic-64.c
>
> --
> 2.34.1
>


[PATCH] RISC-V: xtheadmemidx: Document inline asm issue with memory constraint

2023-12-05 Thread Christoph Müllner
The XTheadMemIdx support relies on the fact that memory operands that
can be expressed by XTheadMemIdx instructions will only appear as
operands of such instructions.  For internal instruction generation
this is guaranteed by the implementation.  However, in case of inline
assembly, this guarantee is not given and we cannot differentiate
these two cases when printing the operand:

  asm volatile ("sd %1,%0" : "=m"(*tmp) : "r"(val));
  asm volatile ("th.srd %1,%0" : "=m"(*tmp) : "r"(val));

If XTheadMemIdx is enabled, then the address will be printed as if an
XTheadMemIdx instruction is emitted, which is obviously wrong in the
first case.

There might be solutions to handle this (e.g. using TARGET_MEM_CONSTRAINT
or extending the mnemonics to accept the standard operands for
XTheadMemIdx instructions), but let's document this behavior for now
as a known issue by adding xfail tests until we have an acceptable fix.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/xtheadmemidx-inline-asm-1.c: New test.

Reported-by: Jin Ma 
Signed-off-by: Christoph Müllner 
---
 .../riscv/xtheadmemidx-inline-asm-1.c | 26 +++
 1 file changed, 26 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadmemidx-inline-asm-1.c

diff --git a/gcc/testsuite/gcc.target/riscv/xtheadmemidx-inline-asm-1.c 
b/gcc/testsuite/gcc.target/riscv/xtheadmemidx-inline-asm-1.c
new file mode 100644
index 000..da52433feb7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/xtheadmemidx-inline-asm-1.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Og" } } */
+/* { dg-options "-march=rv64gc_xtheadmemidx" } */
+
+/* XTheadMemIdx support is implemented such that reg+reg addressing mode
+   loads/stores are preferred over standard loads/stores.
+   If this order changed using inline assembly, the result will be invalid
+   instructions.  This test serves the purpose of documenting this
+   limitation until a solution is available.  */
+
+void foo (void *p, unsigned long off, unsigned long val)
+{
+  unsigned long *tmp = (unsigned long*)(p + off);
+  asm volatile ("sd%1,%0" : "=m"(*tmp) : "r"(val));
+}
+
+void bar (void *p, unsigned long off, unsigned long val)
+{
+  unsigned long *tmp = (unsigned long*)(p + off);
+  asm volatile ("th.srd%1,%0" : "=m"(*tmp) : "r"(val));
+}
+
+/* { dg-final { scan-assembler "sd\t\[a-z\]\[0-9\]+,0\\(\[a-z\]\[0-9\]+\\)" { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-not "sd\t\[a-z\]\[0-9\]+,\[a-z\]\[0-9\]+,\[a-z\]\[0-9\]+,0" { xfail *-*-* } } } */
+/* { dg-final { scan-assembler "th\.srd\t\[a-z\]\[0-9\]+,\[a-z\]\[0-9\]+,\[a-z\]\[0-9\]+,0" } } */
+/* { dg-final { scan-assembler-not "th\.srd\t\[a-z\]\[0-9\]+,0\\(\[a-z\]\[0-9\]+\\)" } } */
-- 
2.43.0



[PATCH] RISC-V: xtheadfmemidx: Disable if xtheadmemidx is not available

2023-12-05 Thread Christoph Müllner
XTheadMemIdx provides register-register offsets for GP register
loads/stores.  XTheadFMemIdx does the same for FP registers.

We've observed an issue with XTheadFMemIdx-only builds, where FP
registers have been promoted to GP registers:

(insn 26 22 51 (set (reg:DF 15 a5 [orig:136  ] [136])
(mem/u:DF (plus:DI (reg/f:DI 15 a5 [141])
(reg:DI 10 a0 [144])) [1 CSWTCH.2[_10]+0 S8 A64])) 217 
{*movdf_hardfloat_rv64}
 (expr_list:REG_DEAD (reg:DI 10 a0 [144])
(nil)))

This results in the following assembler error:
  Assembler messages:
  Error: unrecognized opcode `th.lrd a5,a5,a0,0', extension `xtheadmemidx' 
required

There seems to be a (reasonable) assumption that addressing modes
for FP registers are compatible with those of GP registers.

We already ran into a similar issue during development of the
XTheadFMemIdx support patch, where we could trace the issue down to
the optimization splitters.  Back then we simply disabled them in case
XTheadMemIdx is not available.  But as it turned out, that was not
enough.

To ensure we won't see such issues anymore, let's make the support
for XTheadFMemIdx depend on XTheadMemIdx.  I.e., if only XTheadFMemIdx
is available, then no instructions of this extension will be emitted.

While this looks a bit drastic at first glance, it is the best practical
solution since XTheadFMemIdx without XTheadMemIdx does not exist in real
hardware and would be an odd thing to do.

gcc/ChangeLog:

* config/riscv/thead.cc (th_memidx_classify_address_index):
Require TARGET_XTHEADMEMIDX for FP modes.
* config/riscv/thead.md: Require TARGET_XTHEADMEMIDX for all
XTheadFMemIdx pattern.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/xtheadfmemidx-without-xtheadmemidx.c: New test.

Reported-by: Jin Ma 
Signed-off-by: Christoph Müllner 
---
 gcc/config/riscv/thead.cc |  3 +-
 gcc/config/riscv/thead.md | 19 -
 .../xtheadfmemidx-without-xtheadmemidx.c  | 39 +++
 3 files changed, 51 insertions(+), 10 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/xtheadfmemidx-without-xtheadmemidx.c

diff --git a/gcc/config/riscv/thead.cc b/gcc/config/riscv/thead.cc
index bd9af7ecd60..20353995931 100644
--- a/gcc/config/riscv/thead.cc
+++ b/gcc/config/riscv/thead.cc
@@ -603,7 +603,8 @@ th_memidx_classify_address_index (struct riscv_address_info 
*info, rtx x,
 {
   /* Ensure that the mode is supported.  */
   if (!(TARGET_XTHEADMEMIDX && is_memidx_mode (mode))
-  && !(TARGET_XTHEADFMEMIDX && is_fmemidx_mode (mode)))
+  && !(TARGET_XTHEADMEMIDX
+  && TARGET_XTHEADFMEMIDX && is_fmemidx_mode (mode)))
 return false;
 
   if (GET_CODE (x) != PLUS)
diff --git a/gcc/config/riscv/thead.md b/gcc/config/riscv/thead.md
index 2babfafb23c..186ca468875 100644
--- a/gcc/config/riscv/thead.md
+++ b/gcc/config/riscv/thead.md
@@ -822,11 +822,19 @@ (define_insn_and_split "*th_memidx_UZ_c"
 )
 
 ;; XTheadFMemIdx
+;; Note, that we might get GP registers in FP-mode (reg:DF a2)
+;; which cannot be handled by the XTheadFMemIdx instructions.
+;; This might even happend after register allocation.
+;; We could implement splitters that undo the combiner results
+;; if "after_reload && !HARDFP_REG_P (operands[0])", but this
+;; raises even more questions (e.g. split into what?).
+;; So let's solve this by simply requiring XTheadMemIdx
+;; which provides the necessary instructions to cover this case.
 
 (define_insn "*th_fmemidx_movsf_hardfloat"
   [(set (match_operand:SF 0 "nonimmediate_operand" "=f,th_m_mir,f,th_m_miu")
(match_operand:SF 1 "move_operand" " th_m_mir,f,th_m_miu,f"))]
-  "TARGET_HARD_FLOAT && TARGET_XTHEADFMEMIDX
+  "TARGET_HARD_FLOAT && TARGET_XTHEADFMEMIDX && TARGET_XTHEADMEMIDX
&& (register_operand (operands[0], SFmode)
|| reg_or_0_operand (operands[1], SFmode))"
   { return riscv_output_move (operands[0], operands[1]); }
@@ -837,6 +845,7 @@ (define_insn "*th_fmemidx_movdf_hardfloat_rv64"
   [(set (match_operand:DF 0 "nonimmediate_operand" "=f,th_m_mir,f,th_m_miu")
(match_operand:DF 1 "move_operand" " th_m_mir,f,th_m_miu,f"))]
   "TARGET_64BIT && TARGET_DOUBLE_FLOAT && TARGET_XTHEADFMEMIDX
+   && TARGET_XTHEADMEMIDX
&& (register_operand (operands[0], DFmode)
|| reg_or_0_operand (operands[1], DFmode))"
   { return riscv_output_move (operands[0], operands[1]); }
@@ -845,14 +854,6 @@ (define_insn "*th_fmemidx_movdf_hardfloat_rv64"
 
 ;; XTheadFMemIdx optimizations
 ;; Similar like XTheadMemIdx optimizations, but less cases.
-;; Note, that we might get GP registers in FP-mode (reg:DF a2)
-;; which cannot be handled by the XTh

Re: [PATCH 0/2] RISC-V: Add intrinsics for Bitmanip and Scalar Crypto extensions.

2023-12-05 Thread Christoph Müllner
On Tue, Dec 5, 2023 at 1:05 PM Liao Shihua  wrote:
>
>
> It's a little patch that just provides a mapping from the RV intrinsics to
> the builtin names within GCC.

Thanks for working on this!

I checked with ./contrib/check_GNU_style, which found two issues:
* Trailing whitespace (most likely caused by the CRLF line terminators)
* There should be exactly one space between function name and parenthesis.

Also, I'm not fond of the fact that the comments in the "#endif"
lines don't match the "#if".
E.g., we have "#if defined(__riscv_zksed)" and "#endif // ZKSED".
I would expect that, in this case, we have "#endif // __riscv_zksed".

In my comments for v1 I asked about the reasons for using macros vs
inline functions.
Craig Topper answered this by clarifying that immediates won't be
propagated at -O0, because inlining will be disabled (this can be
tested via __OPTIMIZE__).
He also linked to existing code that shows how this is handled:
  https://github.com/gcc-mirror/gcc/blob/master/gcc/config/i386/xmmintrin.h#L52
We should follow this pattern for intrinsics that have immediate
arguments (others are fine):
#ifdef __OPTIMIZE__
  // inline function
#else
  // macro
#endif
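
A minimal sketch of that pattern for an intrinsic with an immediate
argument could look like the following. Note the assumptions:
demo_builtin_ks1i is a made-up stand-in for a real builtin (its arithmetic
is a placeholder) so that the sketch compiles on any host, and
demo_aes64ks1i is not the name used in the patch.

```c
#include <stdint.h>

/* Hypothetical stand-in for a builtin that requires an immediate as
   its second argument; the arithmetic is a placeholder only.  */
static inline uint64_t
demo_builtin_ks1i (uint64_t rs1, int rnum)
{
  return rs1 ^ (uint64_t) rnum;
}

#ifdef __OPTIMIZE__
/* With optimization enabled the wrapper is expected to be inlined, so
   a constant argument is still a constant when it reaches the
   builtin.  */
static inline uint64_t
demo_aes64ks1i (uint64_t rs1, const int rnum)
{
  return demo_builtin_ks1i (rs1, rnum);
}
#else
/* At -O0 the constant is not propagated through a function call, so
   a macro forwards it textually to the builtin.  */
#define demo_aes64ks1i(rs1, rnum) demo_builtin_ks1i ((rs1), (rnum))
#endif
```

Both variants are used in the same way, e.g. demo_aes64ks1i (rs1, 3), so
callers don't need to care which one is active.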

Regarding the tests, I think the dg-skip-if for "-g" and "-flto" is fine.
But the dg-options lines seem to be too restrictive (e.g. "-O0" is skipped).
The following should be enough for RV64 tests:
  /* { dg-options "-march=rv64gc_zknd_zkne_zknh_zksed_zksh -mabi=lp64" } */

Nit: I don't mind if we have a "-32" and a "-64" test file, but I would
rename them accordingly ("-1.c" -> "-32.c" and "-2.c" -> "-64.c").
Alternatively, if you prefer to merge both files into one, this can be
done using the following dg-options in one file:
  /* { dg-options "-march=rv64gc_zknd_zkne_zknh_zksed_zksh" { target { rv64 } } } */
  /* { dg-options "-march=rv32gc_zknd_zkne_zknh_zksed_zksh" { target { rv32 } } } */
The scan-assembler lines can then be adjusted with the "{ target { rvNN } }" as well:
  /* { dg-final { scan-assembler-times "aes64ds\t" 1 { target { rv64 } } } } */
And the tests can be separated for RV32/RV64 using
  #if __riscv_xlen == 64
  // tests for RV64 only
  #else
  // tests for RV32 only
  #endif // __riscv_xlen == 64

BR
Christoph


>
> Liao Shihua (2):
>   Add C intrinsics of Scalar Crypto Extension
>   Add C intrinsics of Bitmanip Extension
>
>  gcc/config.gcc|   2 +-
>  gcc/config/riscv/riscv-builtins.cc|  22 ++
>  gcc/config/riscv/riscv-types.def |   2 +
>  gcc/config/riscv/riscv-scalar-crypto.def  |  18 ++
>  gcc/config/riscv/riscv_bitmanip.h | 297 ++
>  gcc/config/riscv/riscv_crypto.h   | 280 +
>  .../riscv/scalar_bitmanip_intrinsic-1.c   |  97 ++
>  .../riscv/scalar_bitmanip_intrinsic-2.c   | 115 +++
>  .../riscv/scalar_crypto_intrinsic-1.c | 115 +++
>  .../riscv/scalar_crypto_intrinsic-2.c | 122 +++
>  10 files changed, 1069 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/config/riscv/riscv_bitmanip.h
>  create mode 100644 gcc/config/riscv/riscv_crypto.h
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/scalar_bitmanip_intrinsic-1.c
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/scalar_bitmanip_intrinsic-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/scalar_crypto_intrinsic-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/scalar_crypto_intrinsic-2.c
>
> --
> 2.34.1
>


Re: [PATCH] RISC-V: Document optimization parameter riscv-strcmp-inline-limit

2023-12-04 Thread Christoph Müllner
On Mon, Dec 4, 2023 at 4:46 AM Kito Cheng  wrote:
>
> Wait, I got this on my machine?
>
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/doc/invoke.texi:29774: misplaced }
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/doc/invoke.texi:29786: misplaced }

@{n} should be @var{n}.
I was too optimistic and sent the patch before the build finished (or
in this case failed).
Sorry for that.

I have sent a v2 that builds fine:
  https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639142.html

>
>
> On Mon, Dec 4, 2023 at 10:43 AM Kito Cheng  wrote:
> >
> > LGTM
> >
> > On Sun, Dec 3, 2023 at 5:16 AM Christoph Müllner 
> >  wrote:
> >>
> >> This patch documents the optimization parameter
> >> riscv-strcmp-inline-limit, which can be used to tweak the behaviour
> >> of -minline-strcmp and -minline-strncmp.
> >>
> >> gcc/ChangeLog:
> >>
> >> PR target/112650
> >> * doc/invoke.texi: Document riscv-strcmp-inline-limit.
> >>
> >> Signed-off-by: Christoph Müllner 
> >> ---
> >>  gcc/doc/invoke.texi | 8 
> >>  1 file changed, 8 insertions(+)
> >>
> >> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> >> index 2fab4c5d71f..ba2d843b484 100644
> >> --- a/gcc/doc/invoke.texi
> >> +++ b/gcc/doc/invoke.texi
> >> @@ -29846,6 +29846,10 @@ Inlining will only be done if the strings are 
> >> properly aligned
> >>  and instructions for accelerated processing are available.
> >>  The default is to not inline strcmp calls.
> >>
> >> +The @option{--param riscv-strcmp-inline-limit=@{n}} parameter controls
> >> +the maximum number of bytes compared by the inlined code.
> >> +The default value is 64.
> >> +
> >>  @opindex minline-strncmp
> >>  @item -minline-strncmp
> >>  @itemx -mno-inline-strncmp
> >> @@ -29854,6 +29858,10 @@ Inlining will only be done if the strings are 
> >> properly aligned
> >>  and instructions for accelerated processing are available.
> >>  The default is to not inline strncmp calls.
> >>
> >> +The @option{--param riscv-strcmp-inline-limit=@{n}} parameter controls
> >> +the maximum number of bytes compared by the inlined code.
> >> +The default value is 64.
> >> +
> >>  @opindex mshorten-memrefs
> >>  @item -mshorten-memrefs
> >>  @itemx -mno-shorten-memrefs
> >> --
> >> 2.41.0
> >>


[PATCH v2] RISC-V: Document optimization parameter riscv-strcmp-inline-limit

2023-12-04 Thread Christoph Müllner
This patch documents the optimization parameter
riscv-strcmp-inline-limit, which can be used to tweak the behaviour
of -minline-strcmp and -minline-strncmp.

gcc/ChangeLog:

PR target/112650
* doc/invoke.texi: Document riscv-strcmp-inline-limit.

Signed-off-by: Christoph Müllner 
---
 gcc/doc/invoke.texi | 8 
 1 file changed, 8 insertions(+)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 6fe63b5f999..2b51ff304f6 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -29846,6 +29846,10 @@ Inlining will only be done if the strings are properly 
aligned
 and instructions for accelerated processing are available.
 The default is to not inline strcmp calls.
 
+The @option{--param riscv-strcmp-inline-limit=@var{n}} parameter controls
+the maximum number of bytes compared by the inlined code.
+The default value is 64.
+
 @opindex minline-strncmp
 @item -minline-strncmp
 @itemx -mno-inline-strncmp
@@ -29854,6 +29858,10 @@ Inlining will only be done if the strings are properly 
aligned
 and instructions for accelerated processing are available.
 The default is to not inline strncmp calls.
 
+The @option{--param riscv-strcmp-inline-limit=@var{n}} parameter controls
+the maximum number of bytes compared by the inlined code.
+The default value is 64.
+
 @opindex mshorten-memrefs
 @item -mshorten-memrefs
 @itemx -mno-shorten-memrefs
-- 
2.43.0



Re: [PATCH] RISC-V: Check if zcd conflicts with zcmt and zcmp

2023-12-04 Thread Christoph Müllner
On Mon, Dec 4, 2023 at 8:48 AM Kito Cheng  wrote:

LGTM

I've double-checked this in the Zc-1.0.4-3.pdf:
* Zcmp is incompatible with Zcd
* Zcmp depends on Zca
* Zcmt is incompatible with Zcd
* Zcmt depends on Zca and Zicsr

The implies-relations are already implemented.
This patch enforces the incompatibility-relations.
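
As a rough illustration of the incompatibility check, here is a
self-contained sketch that works on the raw -march string. It is only a
model: GCC uses riscv_subset_list::lookup on the parsed subset list, whereas
the helper has_multi_letter_ext below is hypothetical, ignores version
suffixes, and only looks at underscore-separated multi-letter extensions.
Unlike the patch, it reports only the first conflict it finds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return 1 if the multi-letter extension EXT appears
   as an underscore-separated token in MARCH.  */
static int
has_multi_letter_ext (const char *march, const char *ext)
{
  size_t len = strlen (ext);
  const char *p = strchr (march, '_');
  while (p != NULL)
    {
      p++;  /* Skip the underscore.  */
      if (strncmp (p, ext, len) == 0 && (p[len] == '_' || p[len] == '\0'))
        return 1;
      p = strchr (p, '_');
    }
  return 0;
}

/* Return a diagnostic string if MARCH combines zcd with zcmt or zcmp,
   or NULL if there is no such conflict.  */
const char *
zcd_conflict (const char *march)
{
  assert (march != NULL);
  if (!has_multi_letter_ext (march, "zcd"))
    return NULL;
  if (has_multi_letter_ext (march, "zcmt"))
    return "zcd conflicts with zcmt";
  if (has_multi_letter_ext (march, "zcmp"))
    return "zcd conflicts with zcmp";
  return NULL;
}
```

The two -march strings from the new tests (arch-29.c and arch-30.c) are the
inputs this sketch is meant to reject.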

>
> gcc/ChangeLog:
>
> * common/config/riscv/riscv-common.cc
> (riscv_subset_list::check_conflict_ext): Check if zcd conflicts
> with zcmt and zcmp.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/arch-29.c: New test.
> * gcc.target/riscv/arch-30.c: New test.
> ---
>  gcc/common/config/riscv/riscv-common.cc  | 8 
>  gcc/testsuite/gcc.target/riscv/arch-29.c | 7 +++
>  gcc/testsuite/gcc.target/riscv/arch-30.c | 7 +++
>  3 files changed, 22 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/arch-29.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/arch-30.c
>
> diff --git a/gcc/common/config/riscv/riscv-common.cc 
> b/gcc/common/config/riscv/riscv-common.cc
> index aecb342b164..bfb41827f7a 100644
> --- a/gcc/common/config/riscv/riscv-common.cc
> +++ b/gcc/common/config/riscv/riscv-common.cc
> @@ -1230,6 +1230,14 @@ riscv_subset_list::check_conflict_ext ()
>/* 'H' hypervisor extension requires base ISA with 32 registers.  */
>if (lookup ("e") && lookup ("h"))
>  error_at (m_loc, "%<-march=%s%>: h extension requires i extension", 
> m_arch);
> +
> +  if (lookup ("zcd"))
> +{
> +  if (lookup ("zcmt"))
> +   error_at (m_loc, "%<-march=%s%>: zcd conflicts with zcmt", m_arch);
> +  if (lookup ("zcmp"))
> +   error_at (m_loc, "%<-march=%s%>: zcd conflicts with zcmp", m_arch);
> +}
>  }
>
>  /* Parsing function for multi-letter extensions.
> diff --git a/gcc/testsuite/gcc.target/riscv/arch-29.c 
> b/gcc/testsuite/gcc.target/riscv/arch-29.c
> new file mode 100644
> index 000..f8281275878
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/arch-29.c
> @@ -0,0 +1,7 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64id_zcd_zcmt -mabi=lp64d" } */
> +int foo()
> +{
> +}
> +
> +/* { dg-error "zcd conflicts with zcmt" "" { target *-*-* } 0 } */
> diff --git a/gcc/testsuite/gcc.target/riscv/arch-30.c 
> b/gcc/testsuite/gcc.target/riscv/arch-30.c
> new file mode 100644
> index 000..3e67ea0bb06
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/arch-30.c
> @@ -0,0 +1,7 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64id_zcd_zcmp -mabi=lp64d" } */
> +int foo()
> +{
> +}
> +
> +/* { dg-error "zcd conflicts with zcmp" "" { target *-*-* } 0 } */
> --
> 2.40.1
>


[PATCH] RISC-V: Document optimization parameter riscv-strcmp-inline-limit

2023-12-02 Thread Christoph Müllner
This patch documents the optimization parameter
riscv-strcmp-inline-limit, which can be used to tweak the behaviour
of -minline-strcmp and -minline-strncmp.

gcc/ChangeLog:

PR target/112650
* doc/invoke.texi: Document riscv-strcmp-inline-limit.

Signed-off-by: Christoph Müllner 
---
 gcc/doc/invoke.texi | 8 
 1 file changed, 8 insertions(+)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 2fab4c5d71f..ba2d843b484 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -29846,6 +29846,10 @@ Inlining will only be done if the strings are properly 
aligned
 and instructions for accelerated processing are available.
 The default is to not inline strcmp calls.
 
+The @option{--param riscv-strcmp-inline-limit=@{n}} parameter controls
+the maximum number of bytes compared by the inlined code.
+The default value is 64.
+
 @opindex minline-strncmp
 @item -minline-strncmp
 @itemx -mno-inline-strncmp
@@ -29854,6 +29858,10 @@ Inlining will only be done if the strings are properly 
aligned
 and instructions for accelerated processing are available.
 The default is to not inline strncmp calls.
 
+The @option{--param riscv-strcmp-inline-limit=@{n}} parameter controls
+the maximum number of bytes compared by the inlined code.
+The default value is 64.
+
 @opindex mshorten-memrefs
 @item -mshorten-memrefs
 @itemx -mno-shorten-memrefs
-- 
2.41.0



Re: [RFC PATCH] RISC-V: Remove f{r,s}flags builtins

2023-11-29 Thread Christoph Müllner
On Wed, Nov 29, 2023 at 8:24 PM Patrick O'Neill  wrote:
>
> Hi Christoph,
>
> The precommit-ci is seeing a large number of ICE segmentation faults as a 
> result of this patch:
> https://github.com/ewlu/gcc-precommit-ci/issues/796#issuecomment-1831853523
>
> The failures aren't in riscv.exp testsuite files so that's likely why you 
> didn't run into them in your testing.

Oh, I see.
Then keeping things like they are is probably the best idea.
Sorry for the noise!

BR
Christoph

>
> Debug log:
>
> /home/runner/work/gcc-precommit-ci/gcc-precommit-ci/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.dg/c11-atomic-2.c:110:3:
>  internal compiler error: Segmentation fault
> 0x133afb3 crash_signal
> ../../../gcc/gcc/toplev.cc:316
> 0x1678d1f contains_struct_check(tree_node*, tree_node_structure_enum, char 
> const*, int, char const*)
> ../../../gcc/gcc/tree.h:3747
> 0x1678d1f build_call_expr_loc_array(unsigned int, tree_node*, int, 
> tree_node**)
> ../../../gcc/gcc/tree.cc:10815
> 0x1679043 build_call_expr(tree_node*, int, ...)
> ../../../gcc/gcc/tree.cc:10865
> 0x17f816e riscv_atomic_assign_expand_fenv(tree_node**, tree_node**, 
> tree_node**)
> ../../../gcc/gcc/config/riscv/riscv-builtins.cc:420
> 0xc5209b build_atomic_assign
> ../../../gcc/gcc/c/c-typeck.cc:4289
> 0xc60a47 build_modify_expr(unsigned int, tree_node*, tree_node*, tree_code, 
> unsigned int, tree_node*, tree_node*)
> ../../../gcc/gcc/c/c-typeck.cc:6406
> 0xc85a61 c_parser_expr_no_commas
> ../../../gcc/gcc/c/c-parser.cc:9112
> 0xc85db1 c_parser_expression
> ../../../gcc/gcc/c/c-parser.cc:12725
> 0xc862bb c_parser_expression_conv
> ../../../gcc/gcc/c/c-parser.cc:12765
> 0xca3607 c_parser_statement_after_labels
> ../../../gcc/gcc/c/c-parser.cc:7755
> 0xc9f27e c_parser_compound_statement_nostart
> ../../../gcc/gcc/c/c-parser.cc:7242
> 0xc9f804 c_parser_compound_statement
> ../../../gcc/gcc/c/c-parser.cc:6527
> 0xca359c c_parser_statement_after_labels
> ../../../gcc/gcc/c/c-parser.cc:7590
> 0xca5713 c_parser_statement
> ../../../gcc/gcc/c/c-parser.cc:7561
> 0xca5713 c_parser_c99_block_statement
> ../../../gcc/gcc/c/c-parser.cc:7820
> 0xca6a2c c_parser_do_statement
> ../../../gcc/gcc/c/c-parser.cc:8194
> 0xca3d51 c_parser_statement_after_labels
> ../../../gcc/gcc/c/c-parser.cc:7605
> 0xc9f27e c_parser_compound_statement_nostart
> ../../../gcc/gcc/c/c-parser.cc:7242
> 0xc9f804 c_parser_compound_statement
> ../../../gcc/gcc/c/c-parser.cc:6527
> Please submit a full bug report, with preprocessed source (by using 
> -freport-bug).
> Please include the complete backtrace with any bug report.
> See <https://gcc.gnu.org/bugs/> for instructions.
> compiler exited with status 1
> FAIL: gcc.dg/c11-atomic-2.c (internal compiler error: Segmentation fault)
>
> Let me know if you need any additional info/investigation from me.
>
> Thanks,
> Patrick
>
> On 11/29/23 03:49, Christoph Muellner wrote:
>
> From: Christoph Müllner 
>
> We have two builtins which are undocumented and have no known users.
> Further, they don't exist in LLVM (so they are not portable).
> This means they are in an unclear state of being supported or not.
> Let's remove them to get them out of this undecided state.
>
> A discussion about making these builtins available in all
> compilers was held many years ago with the decision to
> not document them in the RISC-V C API documentation:
>   https://github.com/riscv-non-isa/riscv-c-api-doc/pull/3
>
> This is an RFC patch as this breaks existing code that uses
> these builtins, even if we don't know if such code exists.
>
> An alternative to this patch would be to document them
> in gcc/doc/extend.texi (like has been done with __builtin_riscv_pause)
> and put them into a supported state.
>
> This patch removes two tests for these builtins.
> A test of this patch did not trigger any regressions in riscv.exp.
>
> Signed-off-by: Christoph Müllner 
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-builtins.cc: Remove the builtins
> __builtin_riscv_frflags and __builtin_riscv_fsflags.
>
> gcc/testsuite/ChangeLog:
>
> * g++.target/riscv/frflags.C: Removed.
> * gcc.target/riscv/fsflags.c: Removed.
> ---
>  gcc/config/riscv/riscv-builtins.cc   |  2 --
>  gcc/testsuite/g++.target/riscv/frflags.C |  7 ---
>  gcc/testsuite/gcc.target/riscv/fsflags.c | 16 
>  3 files changed, 25 deletions(-)
>  delete mode 100644 gcc/testsuite/g++.target/riscv/frflags.C
>  delete mode 100644 gcc/testsuite/gcc.target/riscv/fsflags.c
>
> diff --git a/gcc/config/riscv/riscv-builtins.cc 
> b/gcc/config/riscv/riscv-builtins.cc
> index fc3976f3ba1..1655492b246 100644
> --- a/gcc/config/riscv/riscv-builtins.cc
> +++ b/gcc

Re: [PATCH] Add C intrinsics for scalar crypto extension

2023-11-29 Thread Christoph Müllner
On Wed, Nov 29, 2023 at 5:49 PM Liao Shihua  wrote:
>
>
> On 2023/11/29 23:03, Christoph Müllner wrote:
>
> On Mon, Nov 27, 2023 at 9:36 AM Liao Shihua  wrote:
>
> This patch adds C intrinsics for the scalar crypto extension.
> Because the riscv-c-api
> (https://github.com/riscv-non-isa/riscv-c-api-doc/pull/44/files) includes
> the zbkb/zbkc/zbkx intrinsics in the bit manipulation extension, this patch
> only supports the zkn*/zks* intrinsics.
> Thanks for working on this!
> Looking forward to seeing the second patch (covering bitmanip) soon as well!
> A couple of comments can be found below.
>
>
> Thanks for your comments, Christoph. Typos will be corrected in the next 
> patch.
>
> The implementation of the intrinsics follows the implementation in LLVM.
> (It does look a little strange.)
>
> I will unify the implementation method in the next patch.
>
>
>
> gcc/ChangeLog:
>
> * config.gcc: Add riscv_crypto.h
> * config/riscv/riscv_crypto.h: New file.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/zknd32.c: Use intrinsics instead of builtins.
> * gcc.target/riscv/zknd64.c: Likewise.
> * gcc.target/riscv/zkne32.c: Likewise.
> * gcc.target/riscv/zkne64.c: Likewise.
> * gcc.target/riscv/zknh-sha256-32.c: Likewise.
> * gcc.target/riscv/zknh-sha256-64.c: Likewise.
> * gcc.target/riscv/zknh-sha512-32.c: Likewise.
> * gcc.target/riscv/zknh-sha512-64.c: Likewise.
> * gcc.target/riscv/zksed32.c: Likewise.
> * gcc.target/riscv/zksed64.c: Likewise.
> * gcc.target/riscv/zksh32.c: Likewise.
> * gcc.target/riscv/zksh64.c: Likewise.
>
> ---
>  gcc/config.gcc|   2 +-
>  gcc/config/riscv/riscv_crypto.h   | 219 ++
>  gcc/testsuite/gcc.target/riscv/zknd32.c   |   6 +-
>  gcc/testsuite/gcc.target/riscv/zknd64.c   |  12 +-
>  gcc/testsuite/gcc.target/riscv/zkne32.c   |   6 +-
>  gcc/testsuite/gcc.target/riscv/zkne64.c   |  10 +-
>  .../gcc.target/riscv/zknh-sha256-32.c |  22 +-
>  .../gcc.target/riscv/zknh-sha256-64.c |  10 +-
>  .../gcc.target/riscv/zknh-sha512-32.c |  14 +-
>  .../gcc.target/riscv/zknh-sha512-64.c |  10 +-
>  gcc/testsuite/gcc.target/riscv/zksed32.c  |   6 +-
>  gcc/testsuite/gcc.target/riscv/zksed64.c  |   6 +-
>  gcc/testsuite/gcc.target/riscv/zksh32.c   |   6 +-
>  gcc/testsuite/gcc.target/riscv/zksh64.c   |   6 +-
>  14 files changed, 288 insertions(+), 47 deletions(-)
>  create mode 100644 gcc/config/riscv/riscv_crypto.h
>
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index b88591b6fd8..d67fe8b6a6f 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -548,7 +548,7 @@ riscv*)
> extra_objs="${extra_objs} riscv-vector-builtins.o 
> riscv-vector-builtins-shapes.o riscv-vector-builtins-bases.o"
> extra_objs="${extra_objs} thead.o riscv-target-attr.o"
> d_target_objs="riscv-d.o"
> -   extra_headers="riscv_vector.h"
> +   extra_headers="riscv_vector.h riscv_crypto.h"
> target_gtfiles="$target_gtfiles 
> \$(srcdir)/config/riscv/riscv-vector-builtins.cc"
> target_gtfiles="$target_gtfiles 
> \$(srcdir)/config/riscv/riscv-vector-builtins.h"
> ;;
> diff --git a/gcc/config/riscv/riscv_crypto.h b/gcc/config/riscv/riscv_crypto.h
> new file mode 100644
> index 000..149c1132e10
> --- /dev/null
> +++ b/gcc/config/riscv/riscv_crypto.h
> @@ -0,0 +1,219 @@
> +/* RISC-V 'K' Extension intrinsics include file.
> +   Copyright (C) 2023 Free Software Foundation, Inc.
> +
> +   This file is part of GCC.
> +
> +   GCC is free software; you can redistribute it and/or modify it
> +   under the terms of the GNU General Public License as published
> +   by the Free Software Foundation; either version 3, or (at your
> +   option) any later version.
> +
> +   GCC is distributed in the hope that it will be useful, but WITHOUT
> +   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
> +   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
> +   License for more details.
> +
> +   Under Section 7 of GPL version 3, you are granted additional
> +   permissions described in the GCC Runtime Library Exception, version
> +   3.1, as published by the Free Software Foundation.
> +
> +   You should have received a copy of the GNU General Public License and
> +   a copy of the GCC Runtime Library Exception along with this program;
> +   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
> +   <http:

Re: [PATCH] Add C intrinsics for scalar crypto extension

2023-11-29 Thread Christoph Müllner
On Mon, Nov 27, 2023 at 9:36 AM Liao Shihua  wrote:
>
> This patch adds C intrinsics for the scalar crypto extension.
> Because the riscv-c-api
> (https://github.com/riscv-non-isa/riscv-c-api-doc/pull/44/files) includes
> the zbkb/zbkc/zbkx intrinsics in the bit manipulation extension, this patch
> only supports the zkn*/zks* intrinsics.

Thanks for working on this!
Looking forward to seeing the second patch (covering bitmanip) soon as well!
A couple of comments can be found below.

>
> gcc/ChangeLog:
>
> * config.gcc: Add riscv_crypto.h
> * config/riscv/riscv_crypto.h: New file.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/zknd32.c: Use intrinsics instead of builtins.
> * gcc.target/riscv/zknd64.c: Likewise.
> * gcc.target/riscv/zkne32.c: Likewise.
> * gcc.target/riscv/zkne64.c: Likewise.
> * gcc.target/riscv/zknh-sha256-32.c: Likewise.
> * gcc.target/riscv/zknh-sha256-64.c: Likewise.
> * gcc.target/riscv/zknh-sha512-32.c: Likewise.
> * gcc.target/riscv/zknh-sha512-64.c: Likewise.
> * gcc.target/riscv/zksed32.c: Likewise.
> * gcc.target/riscv/zksed64.c: Likewise.
> * gcc.target/riscv/zksh32.c: Likewise.
> * gcc.target/riscv/zksh64.c: Likewise.
>
> ---
>  gcc/config.gcc|   2 +-
>  gcc/config/riscv/riscv_crypto.h   | 219 ++
>  gcc/testsuite/gcc.target/riscv/zknd32.c   |   6 +-
>  gcc/testsuite/gcc.target/riscv/zknd64.c   |  12 +-
>  gcc/testsuite/gcc.target/riscv/zkne32.c   |   6 +-
>  gcc/testsuite/gcc.target/riscv/zkne64.c   |  10 +-
>  .../gcc.target/riscv/zknh-sha256-32.c |  22 +-
>  .../gcc.target/riscv/zknh-sha256-64.c |  10 +-
>  .../gcc.target/riscv/zknh-sha512-32.c |  14 +-
>  .../gcc.target/riscv/zknh-sha512-64.c |  10 +-
>  gcc/testsuite/gcc.target/riscv/zksed32.c  |   6 +-
>  gcc/testsuite/gcc.target/riscv/zksed64.c  |   6 +-
>  gcc/testsuite/gcc.target/riscv/zksh32.c   |   6 +-
>  gcc/testsuite/gcc.target/riscv/zksh64.c   |   6 +-
>  14 files changed, 288 insertions(+), 47 deletions(-)
>  create mode 100644 gcc/config/riscv/riscv_crypto.h
>
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index b88591b6fd8..d67fe8b6a6f 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -548,7 +548,7 @@ riscv*)
> extra_objs="${extra_objs} riscv-vector-builtins.o 
> riscv-vector-builtins-shapes.o riscv-vector-builtins-bases.o"
> extra_objs="${extra_objs} thead.o riscv-target-attr.o"
> d_target_objs="riscv-d.o"
> -   extra_headers="riscv_vector.h"
> +   extra_headers="riscv_vector.h riscv_crypto.h"
> target_gtfiles="$target_gtfiles 
> \$(srcdir)/config/riscv/riscv-vector-builtins.cc"
> target_gtfiles="$target_gtfiles 
> \$(srcdir)/config/riscv/riscv-vector-builtins.h"
> ;;
> diff --git a/gcc/config/riscv/riscv_crypto.h b/gcc/config/riscv/riscv_crypto.h
> new file mode 100644
> index 000..149c1132e10
> --- /dev/null
> +++ b/gcc/config/riscv/riscv_crypto.h
> @@ -0,0 +1,219 @@
> +/* RISC-V 'K' Extension intrinsics include file.
> +   Copyright (C) 2023 Free Software Foundation, Inc.
> +
> +   This file is part of GCC.
> +
> +   GCC is free software; you can redistribute it and/or modify it
> +   under the terms of the GNU General Public License as published
> +   by the Free Software Foundation; either version 3, or (at your
> +   option) any later version.
> +
> +   GCC is distributed in the hope that it will be useful, but WITHOUT
> +   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
> +   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
> +   License for more details.
> +
> +   Under Section 7 of GPL version 3, you are granted additional
> +   permissions described in the GCC Runtime Library Exception, version
> +   3.1, as published by the Free Software Foundation.
> +
> +   You should have received a copy of the GNU General Public License and
> +   a copy of the GCC Runtime Library Exception along with this program;
> +   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
> +   .  */
> +
> +#ifndef __RISCV_CRYPTO_H
> +#define __RISCV_CRYPTO_H
> +
> +#include 
> +
> +#if defined (__cplusplus)
> +extern "C" {
> +#endif
> +
> +#if defined(__riscv_zknd)
> +#if __riscv_xlen == 32
> +#define __riscv_aes32dsi(x, y, bs) __builtin_riscv_aes32dsi(x, y, bs)
> +#define __riscv_aes32dsmi(x, y, bs) __builtin_riscv_aes32dsmi(x, y, bs)
> +#endif
> +
> +#if __riscv_xlen == 64
> +static __inline__ uint64_t __attribute__ ((__always_inline__, __nodebug__))
> +__riscv_aes64ds (uint64_t __x, uint64_t __y)
> +{
> +  return __builtin_riscv_aes64ds (__x, __y);
> +}

I don't understand why some intrinsic functions are implemented as
macros that expand to builtins
and some are implemented as static inline wrappers around builtins.
Is there a particular reason that this 

Re: RISC-V: Support XTheadVector extensions

2023-11-22 Thread Christoph Müllner
On Wed, Nov 22, 2023 at 11:48 PM Kito Cheng  wrote:
>
> I am less worried about thead vector combined with other zv extensions;
> instead, we should reject those combinations altogether.
>
> My reason is that thead vector is a transitional product: there won't be
> further new products based on it for much longer. Also, in theory it is not
> compatible with any other zv extension: the zv extensions require at least
> zve32x, which is a subset of v1p0, and I don't think it's valid to use thead
> vector as a replacement for that required extension. Instead, it should just
> introduce another thead vector extension.

The "transitional products" argument is probably enough to add this restriction,
so we will add this to the first patch of the series.

Further, we'll implement approach 1 (emitting no "th." prefix for
instructions in vector.md)
with an additional patch on top, which implements the ASM_OUTPUT_OPCODE hook
(with a comment that clarifies why "ptr[0] == 'v'" is sufficient there).
So the decision about this can be postponed and we can focus on the rest
of the patchset as Jeff suggested.
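
For readers unfamiliar with the hook, here is a minimal sketch of the effect
we have in mind. It is a hypothetical model, not the patch itself: the real
ASM_OUTPUT_OPCODE macro receives the output stream and a pointer to the
opcode, while the function name, the buffer-based interface and the
xtheadvector flag below are assumptions made so that the example is
self-contained.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model of the hook's effect: write OPCODE into BUF, adding
   the "th." vendor prefix when XTheadVector is enabled and the mnemonic
   starts with 'v'.  Checking only the first character relies on the
   assumption that every mnemonic starting with 'v' is a vector
   instruction shared with XTheadVector.  */
int
output_opcode (char *buf, size_t size, const char *opcode, int xtheadvector)
{
  assert (buf != NULL && opcode != NULL);
  const char *prefix = (xtheadvector && opcode[0] == 'v') ? "th." : "";
  return snprintf (buf, size, "%s%s", prefix, opcode);
}
```

The first-character test is exactly the part that would misfire if
XTheadVector were ever combined with another extension whose mnemonics start
with "v", which is why rejecting such combinations goes hand in hand with
this approach.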

Thanks for the inputs!

>
>
>
> On Thu, Nov 23, 2023 at 06:27, Jeff Law wrote:
>>
>>
>>
>> On 11/22/23 07:24, Christoph Müllner wrote:
>> > On Wed, Nov 22, 2023 at 2:52 PM 钟居哲  wrote:
>> >>
>> >> I am totally OK with approving theadvector for GCC-14 before stage 3
>> >> closes, as long as it doesn't touch the current RVV code too much and
>> >> binutils supports theadvector.
>> >>
>> >> I have provided a draft approach:
>> >> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637349.html
>> >> which, it turns out, doesn't need to change any code in vector.md.
>> >> I strongly suggest following this draft. I can actively review
>> >> theadvector during stage 3,
>> >> and hopefully help you land theadvector in GCC-14.
>> >
>> > I see now two approaches:
>> > 1) Let GCC emit RVV instructions for XTheadVector for instructions
>> > that are in both
>> > 2) Use the ASM_OUTPUT_OPCODE hook to output "th." for these instructions
>> >
>> > No doubt, the ASM_OUTPUT_OPCODE hook approach is better than our
>> > format-string approach, but would 1) not be the even better
>> > solution? It would also mean, that not a single test case is required
>> > for these overlapping instructions (only a few tests that ensure that
>> > we don't emit RVV instructions that are not available in
>> > XTheadVector). Besides that, letting GCC emit RVV instructions for
>> > XTheadVector is a very clever idea, because it fully utilizes the
>> > fact that both extensions overlap to a huge degree.
>> >
>> > The ASM_OUTPUT_OPCODE approach could lead to an issue if we enable
>> > XTheadVector
>> > with any other vector extension, say Zvfoo. In this case the Zvfoo
>> > instructions will all be prefixed as well with "th.". I know that it
>> > is not likely to run into this problem (such a machine does not exist
>> > in real hardware), but it is possible to trigger this issue easily
>> > and approach 1) would not have this potential issue.
>> I'm not a big fan of the ASM_OUTPUT_OPCODE approach.  While it is
>> simple, I worry a bit about it from a long term maintenance standpoint.
>> As you note we could well end up at some point with an extension that
>> has a mnemonic starting with "v" that would blow up.  But I certainly
>> see the appeal of such a simple test to support thead vector.
>>
>> Given there are at least 3 approaches that can fix that problem (%^,
>> assembler dialect or ASM_OUTPUT_OPCODE), maybe we could set that
>> discussion aside in the immediate term and see if there are other issues
>> that are potentially more substantial.
>>
>>
>>
>>
>> --
>>
>>
>>
>> More generally, I think I need to soften my prior statement about
>> deferring this to gcc-15.  This code was submitted in time for the
>> gcc-14 deadline, so it should be evaluated just like we do anything else
>> that makes the deadline.  There are various criteria we use to evaluate
>> if something should get integrated and we should just work through this
>> series like we always do and not treat it specially in any way.
>>
>>
>> jeff


Re: Re: RISC-V: Support XTheadVector extensions

2023-11-22 Thread Christoph Müllner
On Wed, Nov 22, 2023 at 2:52 PM 钟居哲  wrote:
>
> I am totally OK with approving theadvector for GCC-14 before stage 3 closes,
> as long as it doesn't touch the current RVV code too much and binutils
> supports theadvector.
>
> I have provided a draft approach:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637349.html
> which, it turns out, doesn't need to change any code in vector.md.
> I strongly suggest following this draft. I can actively review theadvector
> during stage 3,
> and hopefully help you land theadvector in GCC-14.

I see now two approaches:
1) Let GCC emit RVV instructions for XTheadVector for instructions
that are in both
2) Use the ASM_OUTPUT_OPCODE hook to output "th." for these instructions

No doubt, the ASM_OUTPUT_OPCODE hook approach is better than
our format-string approach, but would 1) not be the even better solution?
It would also mean, that not a single test case is required for these
overlapping instructions (only a few tests that ensure that we don't emit
RVV instructions that are not available in XTheadVector).
Besides that, letting GCC emit RVV instructions for XTheadVector is a
very clever idea,
because it fully utilizes the fact that both extensions overlap to a
huge degree.

The ASM_OUTPUT_OPCODE approach could lead to an issue if we enable XTheadVector
with any other vector extension, say Zvfoo. In this case the Zvfoo
instructions will
all be prefixed as well with "th.". I know that it is not likely to
run into this problem
(such a machine does not exist in real hardware), but it is possible
to trigger this
issue easily and approach 1) would not have this potential issue.

Thanks,
Christoph


>
> Thanks.
>
> ________
> juzhe.zh...@rivai.ai
>
>
> From: Christoph Müllner
> Date: 2023-11-22 18:07
> To: juzhe.zh...@rivai.ai
> CC: gcc-patches; kito.cheng; Kito.cheng; cooper.joshua; Robin Dapp; 
> jeffreyalaw; Philipp Tomsich; Cooper Qu; Jin Ma; Nelson Chu
> Subject: Re: RISC-V: Support XTheadVector extensions
> Hi Juzhe,
>
> Sorry for the late reply, but I was not on CC, so I missed this email.
>
> On Fri, Nov 17, 2023 at 2:41 PM juzhe.zh...@rivai.ai
>  wrote:
> >
> > OK, I just read the theadvector extension specification:
> >
> > https://github.com/T-head-Semi/thead-extension-spec/blob/master/xtheadvector.adoc
> >
> > Theadvector is not a custom extension, just a uarch that disables part of
> > the RVV 1.0 extension.
> > Theadvector can be considered a subextension of the 'V' extension that
> > disables some of the instructions and adds some new thead vector target
> > load/store instructions (this is another story).
> >
> > So, to disable the instructions that theadvector doesn't support,
> > you don't need to touch that much code.
> >
> > Here is a much simpler approach (I think it definitely works):
> > 1. Don't change any code in vector.md and keep GCC generating ASM with the
> > "th." prefix.
> > 2. Add !TARGET_THEADVECTOR in vector-iterator.md to disable the modes you
> > don't want.
> > For example, theadvector doesn't support fractional vectors.
> >
> > Then it's pretty simple:
> >
> > RVVMF2SI "TARGET_VECTOR && !TARGET_THEADVECTOR".
> >
> > 3. Remove all the tests you added in this patch.
> > 4. You can add theadvector-specific loads/stores; for example, the th.vlb
> > instructions are allowed.
> > 5. Modify binutils and make th.vmulh.vv a pseudo instruction for
> > vmulh.vv.
> > 6. With the compile option "-S" you will then still see "vmulh.vv" in the
> > ASM, but with objdump you will see th.vmulh.vv.
>
> Yes, all these points sound reasonable, to minimize the patchset size.
> I believe in point 1 you meant "without th. prefix".
>
> I've added Jin Ma (who is the main author of the Binutils patchset) so
> he is also aware
> of the proposal to use pseudo instructions to avoid duplication in Binutils.
>
> Thank you very much!
> Christoph
>
>
> >
> > After this change, you can send V2, then I can continue to review on GCC-15.
> >
> > Thanks.
> >
> > 
> > juzhe.zh...@rivai.ai
> >
> >
> > From: juzhe.zh...@rivai.ai
> > Date: 2023-11-17 19:39
> > To: gcc-patches
> > CC: kito.cheng; kito.cheng; cooper.joshua; Robin Dapp; jeffreyalaw
> > Subject: RISC-V: Support XTheadVector extensions
> > 90% of the theadvector extension reuses the current RVV 1.0 instruction
> > patterns; just change the ASM. For example:
> >
> > @@ -2923,7 +2923,7 @@ (define_insn "*pred_mulh_scalar"
> >   (match_operand:VFULLI_D 3 "regi

Re: RISC-V: Support XTheadVector extensions

2023-11-22 Thread Christoph Müllner
Hi Juzhe,

Sorry for the late reply, but I was not on CC, so I missed this email.

On Fri, Nov 17, 2023 at 2:41 PM juzhe.zh...@rivai.ai
 wrote:
>
> OK, I just read the theadvector extension specification:
>
> https://github.com/T-head-Semi/thead-extension-spec/blob/master/xtheadvector.adoc
>
> Theadvector is not a custom extension, just a uarch that disables part of the
> RVV 1.0 extension.
> Theadvector can be considered a subextension of the 'V' extension that
> disables some of the instructions and adds some new thead vector target
> load/store instructions (this is another story).
>
> So, to disable the instructions that theadvector doesn't support,
> you don't need to touch that much code.
>
> Here is a much simpler approach (I think it definitely works):
> 1. Don't change any code in vector.md and keep GCC generating ASM with the
> "th." prefix.
> 2. Add !TARGET_THEADVECTOR in vector-iterator.md to disable the modes you
> don't want.
> For example, theadvector doesn't support fractional vectors.
>
> Then it's pretty simple:
>
> RVVMF2SI "TARGET_VECTOR && !TARGET_THEADVECTOR".
>
> 3. Remove all the tests you added in this patch.
> 4. You can add theadvector-specific loads/stores; for example, the th.vlb
> instructions are allowed.
> 5. Modify binutils and make th.vmulh.vv a pseudo instruction for vmulh.vv.
> 6. With the compile option "-S" you will then still see "vmulh.vv" in the
> ASM, but with objdump you will see th.vmulh.vv.

Yes, all these points sound reasonable, to minimize the patchset size.
I believe in point 1 you meant "without th. prefix".
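
Point 2 of the quoted list (disabling fractional-LMUL modes via the iterator
condition) can be modelled with a small predicate. This is a hypothetical
sketch and not code from vector-iterators.md: the struct, the function name
and the flag parameters are assumptions made for illustration; only the
condition "TARGET_VECTOR && !TARGET_THEADVECTOR" and the mode name RVVMF2SI
come from the proposal.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a vector mode: its name and its LMUL as the
   fraction LMUL_NUM / LMUL_DEN.  */
struct vec_mode
{
  const char *name;
  int lmul_num;
  int lmul_den;
};

/* Model of an iterator condition such as
   "TARGET_VECTOR && !TARGET_THEADVECTOR" attached to a fractional-LMUL
   mode: such modes are enabled only when XTheadVector is off, while
   whole-register modes just need the vector target.  */
int
mode_enabled_p (const struct vec_mode *m, int target_vector,
                int target_xtheadvector)
{
  assert (m != NULL && m->lmul_den > 0);
  if (!target_vector)
    return 0;
  int fractional = m->lmul_num < m->lmul_den;
  return !(fractional && target_xtheadvector);
}
```

With this model, a mode such as RVVMF2SI (LMUL = 1/2) drops out as soon as
XTheadVector is selected, without touching any pattern in vector.md.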

I've added Jin Ma (who is the main author of the Binutils patchset) so
he is also aware
of the proposal to use pseudo instructions to avoid duplication in Binutils.

Thank you very much!
Christoph


>
> After this change, you can send V2, then I can continue to review on GCC-15.
>
> Thanks.
>
> 
> juzhe.zh...@rivai.ai
>
>
> From: juzhe.zh...@rivai.ai
> Date: 2023-11-17 19:39
> To: gcc-patches
> CC: kito.cheng; kito.cheng; cooper.joshua; Robin Dapp; jeffreyalaw
> Subject: RISC-V: Support XTheadVector extensions
> 90% of the theadvector extension reuses the current RVV 1.0 instruction
> patterns; just change the ASM. For example:
>
> @@ -2923,7 +2923,7 @@ (define_insn "*pred_mulh_scalar"
>   (match_operand:VFULLI_D 3 "register_operand"  "vr,vr, vr, vr")] VMULH)
>(match_operand:VFULLI_D 2 "vector_merge_operand" "vu, 0, vu,  0")))]
>"TARGET_VECTOR"
> -  "vmulh.vx\t%0,%3,%z4%p1"
> +  "%^vmulh.vx\t%0,%3,%z4%p1"
>[(set_attr "type" "vimul")
> (set_attr "mode" "")])
>
> +  if (letter == '^')
> +{
> +  if (TARGET_XTHEADVECTOR)
> + fputs ("th.", file);
> +  return;
> +}
>
>
> For almost all patterns, you simply prepend "th." to the ASM,
> e.g. change "vmulh.vv" -> "th.vmulh.vv".
>
> Almost all theadvector instructions are not new features; they are the same
> as in RVV 1.0.
> Why invent an ISA that doesn't include any features that RVV 1.0 lacks?
>
> I am not explicitly objecting to this patch, but I would like to know the
> reason.
>
> Btw, stage 1 will close soon.  So I will review this patch on GCC-15 as long 
> as all other RISC-V maintainers agree.
>
>
> 
> juzhe.zh...@rivai.ai


Re: RISC-V: Support XTheadVector extensions

2023-11-18 Thread Christoph Müllner
On Fri, Nov 17, 2023 at 12:40 PM juzhe.zh...@rivai.ai
 wrote:
>
> 90% of the theadvector extension reuses the current RVV 1.0 instruction
> patterns; just change the ASM. For example:
>
> @@ -2923,7 +2923,7 @@ (define_insn "*pred_mulh_scalar"
>   (match_operand:VFULLI_D 3 "register_operand"  "vr,vr, vr, vr")] VMULH)
>(match_operand:VFULLI_D 2 "vector_merge_operand" "vu, 0, vu,  0")))]
>"TARGET_VECTOR"
> -  "vmulh.vx\t%0,%3,%z4%p1"
> +  "%^vmulh.vx\t%0,%3,%z4%p1"
>[(set_attr "type" "vimul")
> (set_attr "mode" "")])
>
> +  if (letter == '^')
> +{
> +  if (TARGET_XTHEADVECTOR)
> + fputs ("th.", file);
> +  return;
> +}
>
>
> For almost all patterns, you simply prepend "th." to the ASM,
> e.g. change "vmulh.vv" -> "th.vmulh.vv".
>
> Almost all theadvector instructions are not new features; they are the same
> as in RVV 1.0.
> Why invent an ISA that doesn't include any features that RVV 1.0 lacks?
>
> I am not explicitly objecting to this patch, but I would like to know the
> reason.

Palmer already outlined the reason why this has been implemented in HW.
I want to add some comments on the specification and the design of the
SW support.

We have to face the fact here, that T-Head implemented the 0.7.1 draft
version of RVV (plus a VLENB CSR).
I don't want to waste time and discuss who is to blame for that (this
has been done elsewhere in enough detail).
Also, there are mechanisms now in place to avoid that something like
this happens again.

When we call this extension "RVV-0.7.1-draft" (plus VLENB), then we
are facing arguments that
claim that a RVV "draft" cannot be treated as a ratified extension.
Further, there are arguments
that only one RVV extension exists (the one that was ratified).
Therefore, T-Head's vector extension was
several times described as a "custom-extension", which is
"non-conforming" (uses encoding space
of standard extension). Of course, this hides the fact that the
extension is identical to RVV 1.0 to a high degree.
Anyway, I don't think that these arguments and descriptions are wrong.

So, in order to avoid pointless discussions about what it is, and why
it is what it is, we simply accepted this description and gave the
extension the name XTheadVector.
In RISC-V, vendor instructions and CSRs need to have vendor prefixes ("th.").
The specification can be found (together with all other XThead*
extensions) here:
  https://github.com/T-head-Semi/thead-extension-spec/blob/master/xtheadvector.adoc
Some further details about this specification that are worth mentioning:
* Fractional LMUL values are not supported.
* Zvlsseg was an extension in RVV 0.7.1, but is part of RVV 1.0.
  Since T-Head has these instructions as well, we followed the RVV 1.0
  idea and made these instructions mandatory for XTheadVector (i.e.
  avoiding the introduction of useless extensions).
* Zvamo was an extension in RVV 0.7.1, which was dropped in RVV 1.0.
  Since T-Head has these instructions as well, we defined XTheadZvamo.

So the result is that we have a custom extension, which uses the RVI
encoding space and which "by accident" has a huge overlap with RVV 1.0.
We are all fine with this, as long as this is our ticket to get the
extension supported upstream (in the sense that everyone's opinions are
respected and a solution is found which will not trigger useless
discussions about things that happened a long time ago).

The implementation follows this idea: it is a vendor extension and is
kept as separate as possible from standard extensions. However, avoiding
duplication was one of our most important goals, so we came up with
reusing the overlapping functionality by just adding the instruction
prefixes.
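
To illustrate the prefix idea, here is a minimal standalone sketch in C.
It is not the GCC implementation: expand_template and target_xtheadvector
are hypothetical stand-ins for the operand printer and TARGET_XTHEADVECTOR,
and only the "%^" marker from the quoted hunk is modelled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for TARGET_XTHEADVECTOR.  */
static bool target_xtheadvector;

/* Copy TMPL into BUF.  Each "%^" marker becomes "th." when the vendor
   extension is enabled and is dropped otherwise.  Everything else is
   copied unchanged (other '%' operands are not interpreted here).  */
static void
expand_template (const char *tmpl, char *buf, size_t size)
{
  size_t n = 0;

  /* Keep room for a 3-character prefix plus the terminating NUL.  */
  for (const char *p = tmpl; *p != '\0' && n + 4 < size; p++)
    {
      if (p[0] == '%' && p[1] == '^')
        {
          if (target_xtheadvector)
            {
              memcpy (buf + n, "th.", 3);
              n += 3;
            }
          p++;  /* Skip the '^'.  */
        }
      else
        buf[n++] = *p;
    }
  buf[n] = '\0';
}
```

With the flag cleared, "%^vmulh.vx" expands to "vmulh.vx"; with the flag
set, the same template yields "th.vmulh.vx", so one pattern can serve both
extensions.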

For the intrinsics API, we use a more user-friendly (pragmatic) approach:
* state the set of supported RVV intrinsic functions
* state the missing support of fractional LMUL values
* list the extension-specific intrinsic functions for the additional
  load/store functionality

I hope this gives a good overview of our reasoning.
Let me know if you have further questions.

BR
Christoph


>
> Btw, stage 1 will close soon.  So I will review this patch on GCC-15 as long 
> as all other RISC-V maintainers agree.
>
>
> 
> juzhe.zh...@rivai.ai


Re: [PATCH v2] RISC-V: Implement target attribute

2023-11-15 Thread Christoph Müllner
On Tue, Nov 14, 2023 at 3:15 PM Kito Cheng  wrote:
>
> The target attribute, which was proposed in [1], allows the user
> to specify local settings on a per-function basis.
>
> The syntax of the target attribute is `__attribute__((target("<ATTR-STRING>")))`.
>
> The syntax of `<ATTR-STRING>` is described below:
> ```
> ATTR-STRING := ATTR-STRING ';' ATTR
>  | ATTR
>
> ATTR:= ARCH-ATTR
>  | CPU-ATTR
>  | TUNE-ATTR
>
> ARCH-ATTR   := 'arch=' EXTENSIONS-OR-FULLARCH
>
> EXTENSIONS-OR-FULLARCH := <EXTENSIONS>
> | <FULLARCHSTR>
>
> EXTENSIONS := <EXTENSION> ',' <EXTENSIONS>
> | <EXTENSION>
>
> FULLARCHSTR:= 
>
> EXTENSION  := <OP> <EXTENSION-NAME> <VERSION>
>
> OP := '+'
>
> VERSION:= [0-9]+ 'p' [0-9]+
> | [1-9][0-9]*
> |
>
> EXTENSION-NAME := Naming rule is defined in RISC-V ISA manual
>
> CPU-ATTR:= 'cpu=' 
> TUNE-ATTR   := 'tune=' 
> ```
>
> Changes since v1:
> - Use std::unique_ptr rather than alloca to prevent memory issue.
> - Error rather than warning when attribute duplicated.
>
> [1] https://github.com/riscv-non-isa/riscv-c-api-doc/pull/35
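
As a reading aid for the VERSION rule in the grammar above, here is a
small sketch in C that checks a standalone version string against the
three alternatives (major 'p' minor, a plain number without a leading
zero, or empty). The function name valid_version is hypothetical and
this is not the parser from the patch; it only looks at the VERSION
part in isolation.

```c
#include <ctype.h>
#include <stdbool.h>

/* Return true if S matches
     [0-9]+ 'p' [0-9]+  |  [1-9][0-9]*  |  <empty>.  */
static bool
valid_version (const char *s)
{
  /* Third alternative: the version may be omitted.  */
  if (*s == '\0')
    return true;

  const char *p = s;
  while (isdigit ((unsigned char) *p))
    p++;

  /* Both remaining alternatives start with at least one digit.  */
  if (p == s)
    return false;

  /* Second alternative: digits only, without a leading zero.  */
  if (*p == '\0')
    return s[0] != '0';

  /* First alternative: digits, 'p', digits.  */
  if (*p != 'p')
    return false;
  p++;

  const char *minor = p;
  while (isdigit ((unsigned char) *p))
    p++;
  return p != minor && *p == '\0';
}
```

Under this reading, "2p0" and "0p7" are accepted, while "2p" and "02" are
rejected.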

I've reviewed with a focus on the utilized backend hooks and macros.

Reviewed-by: Christoph Müllner 

Note that the changelog below contains quite a few empty entries.

>
> gcc/ChangeLog:
>
> * config.gcc (riscv): Add riscv-target-attr.o.
> * config/riscv/riscv-protos.h (riscv_declare_function_size) New.
> (riscv_option_valid_attribute_p): New.
> (riscv_override_options_internal): New.
> (struct riscv_tune_info): New.
> (riscv_parse_tune): New.
> * config/riscv/riscv-target-attr.cc
> (class riscv_target_attr_parser): New.
> (struct riscv_attribute_info): New.
> (riscv_attributes): New.
> (riscv_target_attr_parser::parse_arch):
> (riscv_target_attr_parser::handle_arch):
> (riscv_target_attr_parser::handle_cpu):
> (riscv_target_attr_parser::handle_tune):
> (riscv_target_attr_parser::update_settings):
> (riscv_process_one_target_attr):
> (num_occurences_in_str):
> (riscv_process_target_attr):
> (riscv_option_valid_attribute_p):
> * config/riscv/riscv.cc: Include target-globals.h and
> riscv-subset.h.
> (struct riscv_tune_info): Move to riscv-protos.h.
> (get_tune_str):
> (riscv_parse_tune):
> (riscv_declare_function_size):
> (riscv_option_override): Build target_option_default_node and
> target_option_current_node.
> (riscv_save_restore_target_globals):
> (riscv_option_restore):
> (riscv_previous_fndecl):
> (riscv_set_current_function): Apply the target attribute.
> (TARGET_OPTION_RESTORE): Define.
> (TARGET_OPTION_VALID_ATTRIBUTE_P): Ditto.
> * config/riscv/riscv.h (SWITCHABLE_TARGET): Define to 1.
> (ASM_DECLARE_FUNCTION_SIZE) Define.
> * config/riscv/riscv.opt (mtune=): Add Save attribute.
> (mcpu=): Ditto.
> (mcmodel=): Ditto.
> * config/riscv/t-riscv: Add build rule for riscv-target-attr.o
> * doc/extend.texi: Add doc for target attribute.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/target-attr-01.c: New.
> * gcc.target/riscv/target-attr-02.c: Ditto.
> * gcc.target/riscv/target-attr-03.c: Ditto.
> * gcc.target/riscv/target-attr-04.c: Ditto.
> * gcc.target/riscv/target-attr-05.c: Ditto.
> * gcc.target/riscv/target-attr-06.c: Ditto.
> * gcc.target/riscv/target-attr-07.c: Ditto.
> * gcc.target/riscv/target-attr-bad-01.c: Ditto.
> * gcc.target/riscv/target-attr-bad-02.c: Ditto.
> * gcc.target/riscv/target-attr-bad-03.c: Ditto.
> * gcc.target/riscv/target-attr-bad-04.c: Ditto.
> * gcc.target/riscv/target-attr-bad-05.c: Ditto.
> * gcc.target/riscv/target-attr-bad-06.c: Ditto.
> * gcc.target/riscv/target-attr-bad-07.c: Ditto.
> * gcc.target/riscv/target-attr-bad-08.c: Ditto.
> * gcc.target/riscv/target-attr-bad-09.c: Ditto.
> * gcc.target/riscv/target-attr-bad-10.c: Ditto.
> ---
>  gcc/config.gcc|   2 +-
>  gcc/config/riscv/riscv-protos.h   |  21 +
>  gcc/config/riscv/riscv-target-attr.cc | 395 ++
>  gcc/config/riscv/riscv.cc | 192 +++--
>  gcc/config/riscv/riscv.h  |   6 +
>  gcc/config/riscv/riscv.opt|   6 +-
>  gcc/con

Re: [PATCH] RISC-V: Save/restore ra register correctly [PR112478]

2023-11-15 Thread Christoph Müllner
On Tue, Nov 14, 2023 at 3:15 PM Kito Cheng  wrote:
>
> We now treat ra as a fixed register, but we still need to save/restore it
> in the prologue/epilogue if it has been used.

So before 71f906498ada9 $ra was neither a fixed nor a used register.
Therefore, riscv_save_reg_p returned true in the first test (not global reg,
not used_or_fixed, and ever_live_p).
After this commit, this does not happen anymore, because the test for
not used_or_fixed fails and we don't test for ever_live_p in the following.
And this patch restores this behavior.
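
The reasoning can be modelled with a few booleans. The following C
sketch is a simplified model and not the code in riscv.cc: struct
reg_state, generic_save_test and save_ra_p are hypothetical names, and
"used_or_fixed" collapses the call-used and fixed properties into one
flag, as in the description above.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of what the backend knows about ra.  */
struct reg_state
{
  bool global;         /* Declared as a global register variable.  */
  bool used_or_fixed;  /* Call-used or fixed register.  */
  bool ever_live;      /* Used in the function, e.g. via an asm clobber.  */
};

/* The first test: not global, not used-or-fixed, and ever live.  */
static bool
generic_save_test (const struct reg_state *r)
{
  return !r->global && !r->used_or_fixed && r->ever_live;
}

/* Decide whether ra gets saved.  WITH_PATCH models the additional
   ever-live check that the patch adds for the return address register.  */
static bool
save_ra_p (const struct reg_state *ra, bool with_patch)
{
  if (generic_save_test (ra))
    return true;
  if (with_patch && ra->ever_live)
    return true;
  return false;
}
```

In this model, a live ra that is not fixed is saved by the first test;
once ra is fixed, the first test fails and only the added check makes
save_ra_p return true again.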

Reviewed-by: Christoph Müllner 
Tested-by: Christoph Müllner 

>
> gcc/ChangeLog:
>
> PR target/112478
> * config/riscv/riscv.cc (riscv_save_return_addr_reg_p): Check ra
> is ever lived.
>
> gcc/testsuite/gcc/ChangeLog:
>
> PR target/112478
> * riscv/pr112478.c: New.
> ---
>  gcc/config/riscv/riscv.cc | 4 
>  gcc/testsuite/gcc.target/riscv/pr112478.c | 8 
>  2 files changed, 12 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr112478.c
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index ecee7eb4727..f09c4066903 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -5802,6 +5802,10 @@ riscv_save_return_addr_reg_p (void)
>if (riscv_far_jump_used_p ())
>  return true;
>
> +  /* We need to save it if anyone has used that.  */
> +  if (df_regs_ever_live_p (RETURN_ADDR_REGNUM))
> +return true;
> +
>/* Need not to use ra for leaf when frame pointer is turned off by
>   option whatever the omit-leaf-frame's value.  */
>if (frame_pointer_needed && crtl->is_leaf
> diff --git a/gcc/testsuite/gcc.target/riscv/pr112478.c 
> b/gcc/testsuite/gcc.target/riscv/pr112478.c
> new file mode 100644
> index 000..0bbde20b71b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/pr112478.c
> @@ -0,0 +1,8 @@
> +/* { dg-do compile } */
> +/* { dg-options "-ffat-lto-objects" } */
> +
> +void foo() {
> +asm volatile("# " : ::"ra");
> +}
> +
> +/* { dg-final { scan-assembler "s(w|d)\[ \t\]*ra" } } */
> --
> 2.40.1
>


Re: [PATCH] riscv: thead: Add support for the XTheadInt ISA extension

2023-11-10 Thread Christoph Müllner
On Tue, Nov 7, 2023 at 4:04 AM Jin Ma  wrote:
>
> The XTheadInt ISA extension provides the following T-Head-specific
> instructions for accelerating interrupt handling:
>
> * th.ipush
> * th.ipop

Overall, it looks ok to me.
There are just a few small issues to clean up (see below).


>
> gcc/ChangeLog:
>
> * config/riscv/riscv-protos.h (th_int_get_mask): New prototype.
> (th_int_get_save_adjustment): Likewise.
> (th_int_adjust_cfi_prologue): Likewise.
> * config/riscv/riscv.cc (TH_INT_INTERRUPT): New macro.
> (riscv_expand_prologue): Add the processing of XTheadInt.
> (riscv_expand_epilogue): Likewise.
> * config/riscv/riscv.md: New unspec.
> * config/riscv/thead.cc (BITSET_P): New macro.
> * config/riscv/thead.md (th_int_push): New pattern.
> (th_int_pop): New pattern.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/xtheadint-push-pop.c: New test.
> ---
>  gcc/config/riscv/riscv-protos.h   |  3 +
>  gcc/config/riscv/riscv.cc | 58 +-
>  gcc/config/riscv/riscv.md |  4 +
>  gcc/config/riscv/thead.cc | 78 +++
>  gcc/config/riscv/thead.md | 67 
>  .../gcc.target/riscv/xtheadint-push-pop.c | 36 +
>  6 files changed, 245 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadint-push-pop.c
>
> diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
> index 85d4f6ed9ea..05d1fc2b3a0 100644
> --- a/gcc/config/riscv/riscv-protos.h
> +++ b/gcc/config/riscv/riscv-protos.h
> @@ -627,6 +627,9 @@ extern void th_mempair_prepare_save_restore_operands 
> (rtx[4], bool,
>   int, HOST_WIDE_INT,
>   int, HOST_WIDE_INT);
>  extern void th_mempair_save_restore_regs (rtx[4], bool, machine_mode);
> +extern unsigned int th_int_get_mask(unsigned int);

Space between function name and parenthesis.

> +extern unsigned int th_int_get_save_adjustment();

Space between function name and parenthesis.
An empty parameter list should be written as "(void)".
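
For reference, a short sketch of the expected formatting in C. The
function example_get_save_adjustment and its return value are
hypothetical and only show the style: one space before the opening
parenthesis, and "(void)" for an empty parameter list (before C23, "()"
in a declaration leaves the parameters unspecified).

```c
/* Hypothetical prototype, formatted in GNU style.  */
extern unsigned int example_get_save_adjustment (void);

unsigned int
example_get_save_adjustment (void)
{
  return 16;  /* Placeholder value, not taken from the patch.  */
}
```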

> +extern rtx th_int_adjust_cfi_prologue (unsigned int);
>  #ifdef RTX_CODE
>  extern const char*
>  th_mempair_output_move (rtx[4], bool, machine_mode, RTX_CODE);
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 08ff05dcc3f..c623101b05e 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -101,6 +101,16 @@ along with GCC; see the file COPYING3.  If not see
>  /* True the mode switching has static frm, or false.  */
>  #define STATIC_FRM_P(c) ((c)->machine->mode_sw_info.static_frm_p)
>
> +/* True if we can use the instructions in the XTheadInt extension
> +   to handle interrupts, or false.  */
> +#define TH_INT_INTERRUPT(c)\
> +  (TARGET_XTHEADINT\
> +   /* The XTheadInt extension only supports rv32.  */  \
> +   && !TARGET_64BIT\
> +   && (c)->machine->interrupt_handler_p\
> +   /* This instruction can be executed in M-mode only.*/   \

Dot, space, space, end of comment.

Maybe better:
/* The XTheadInt instructions can only be executed in M-mode.  */

> +   && (c)->machine->interrupt_mode == MACHINE_MODE)
> +
>  /* Information about a function's frame layout.  */
>  struct GTY(())  riscv_frame_info {
>/* The size of the frame in bytes.  */
> @@ -6703,6 +6713,7 @@ riscv_expand_prologue (void)
>unsigned fmask = frame->fmask;
>int spimm, multi_push_additional, stack_adj;
>rtx insn, dwarf = NULL_RTX;
> +  unsigned th_int_mask = 0;
>
>if (flag_stack_usage_info)
>  current_function_static_stack_size = constant_lower_bound 
> (remaining_size);
> @@ -6771,6 +6782,28 @@ riscv_expand_prologue (void)
>REG_NOTES (insn) = dwarf;
>  }
>
> +  th_int_mask = th_int_get_mask(frame->mask);

There should be exactly one space between function name and parenthesis.

> +  if (th_int_mask && TH_INT_INTERRUPT (cfun))
> +{
> +  frame->mask &= ~th_int_mask;
> +
> +  /* RISCV_PROLOGUE_TEMP may be used to handle some CSR for
> +interrupts, such as fcsr. */

Dot, space, space, end of comment.

> +  if ((TARGET_HARD_FLOAT  && frame->fmask)
> + || (TARGET_ZFINX && frame->mask))
> +   frame->mask |= (1 << RISCV_PROLOGUE_TEMP_REGNUM);
> +
> +  unsigned save_adjustment = th_int_get_save_adjustment ();
> +  frame->gp_sp_offset -= save_adjustment;
> +  remaining_size -= save_adjustment;
> +
> +  insn = emit_insn (gen_th_int_push ());
> +
> +  rtx dwarf = th_int_adjust_cfi_prologue (th_int_mask);
> +  RTX_FRAME_RELATED_P (insn) = 1;
> +  REG_NOTES (insn) = dwarf;
> +}
> +
>/* Save the GP, FP registers.  */
>if 

Re: [PATCH] RISC-V: Fix bug that XTheadMemPair extension caused fcsr not to be saved and restored before and after interrupt.

2023-11-10 Thread Christoph Müllner
On Fri, Nov 10, 2023 at 2:20 PM Kito Cheng  wrote:
>
> LGTM

Committed after shortening the commit message's heading.

>
> On Fri, Nov 10, 2023 at 8:55 PM Christoph Müllner wrote:
>>
>> On Fri, Nov 10, 2023 at 8:14 AM Jin Ma  wrote:
>> >
>> > The t0 register is used as a temporary register for interrupts, so it needs
>> > special treatment. It is necessary to avoid using "th.ldd" in the interrupt
>> > program to stop the subsequent operation of the t0 register, so they need 
>> > to
>> > exchange positions in the function "riscv_for_each_saved_reg".
>>
>> RISCV_PROLOGUE_TEMP_REGNUM indeed needs special treatment
>> in case of ISRs and fcsr. This patch just moves the TARGET_XTHEADMEMPAIR
>> block after the ISR/fcsr block.
>>
>> Reviewed-by: Christoph Müllner 
>>
>> >
>> > gcc/ChangeLog:
>> >
>> > * config/riscv/riscv.cc (riscv_for_each_saved_reg): Place the 
>> > interrupt
>> > operation before the XTheadMemPair.
>> > ---
>> >  gcc/config/riscv/riscv.cc | 56 +--
>> >  .../riscv/xtheadmempair-interrupt-fcsr.c  | 18 ++
>> >  2 files changed, 46 insertions(+), 28 deletions(-)
>> >  create mode 100644 
>> > gcc/testsuite/gcc.target/riscv/xtheadmempair-interrupt-fcsr.c
>> >
>> > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
>> > index e25692b86fc..fa2d4d4b779 100644
>> > --- a/gcc/config/riscv/riscv.cc
>> > +++ b/gcc/config/riscv/riscv.cc
>> > @@ -6346,6 +6346,34 @@ riscv_for_each_saved_reg (poly_int64 sp_offset, 
>> > riscv_save_restore_fn fn,
>> >   && riscv_is_eh_return_data_register (regno))
>> > continue;
>> >
>> > +  /* In an interrupt function, save and restore some necessary CSRs 
>> > in the stack
>> > +to avoid changes in CSRs.  */
>> > +  if (regno == RISCV_PROLOGUE_TEMP_REGNUM
>> > + && cfun->machine->interrupt_handler_p
>> > + && ((TARGET_HARD_FLOAT  && cfun->machine->frame.fmask)
>> > + || (TARGET_ZFINX
>> > + && (cfun->machine->frame.mask & ~(1 << 
>> > RISCV_PROLOGUE_TEMP_REGNUM)
>> > +   {
>> > + unsigned int fcsr_size = GET_MODE_SIZE (SImode);
>> > + if (!epilogue)
>> > +   {
>> > + riscv_save_restore_reg (word_mode, regno, offset, fn);
>> > + offset -= fcsr_size;
>> > + emit_insn (gen_riscv_frcsr (RISCV_PROLOGUE_TEMP (SImode)));
>> > + riscv_save_restore_reg (SImode, RISCV_PROLOGUE_TEMP_REGNUM,
>> > + offset, riscv_save_reg);
>> > +   }
>> > + else
>> > +   {
>> > + riscv_save_restore_reg (SImode, RISCV_PROLOGUE_TEMP_REGNUM,
>> > + offset - fcsr_size, 
>> > riscv_restore_reg);
>> > + emit_insn (gen_riscv_fscsr (RISCV_PROLOGUE_TEMP (SImode)));
>> > + riscv_save_restore_reg (word_mode, regno, offset, fn);
>> > + offset -= fcsr_size;
>> > +   }
>> > + continue;
>> > +   }
>> > +
>> >if (TARGET_XTHEADMEMPAIR)
>> > {
>> >   /* Get the next reg/offset pair.  */
>> > @@ -6376,34 +6404,6 @@ riscv_for_each_saved_reg (poly_int64 sp_offset, 
>> > riscv_save_restore_fn fn,
>> > }
>> > }
>> >
>> > -  /* In an interrupt function, save and restore some necessary CSRs 
>> > in the stack
>> > -to avoid changes in CSRs.  */
>> > -  if (regno == RISCV_PROLOGUE_TEMP_REGNUM
>> > - && cfun->machine->interrupt_handler_p
>> > - && ((TARGET_HARD_FLOAT  && cfun->machine->frame.fmask)
>> > - || (TARGET_ZFINX
>> > - && (cfun->machine->frame.mask & ~(1 << 
>> > RISCV_PROLOGUE_TEMP_REGNUM)
>> > -   {
>> > - unsigned int fcsr_size = GET_MODE_SIZE (SImode);
>> > - if (!epilogue)
>> > -   {
>> > - riscv_save_restore_reg (word_mode, regno, offset, fn);
>> > - offset -= fcsr_size;
>> > - emit_insn (gen_riscv_frcsr (RISCV_PROLOG

Re: [PATCH] RISC-V: Fix bug that XTheadMemPair extension caused fcsr not to be saved and restored before and after interrupt.

2023-11-10 Thread Christoph Müllner
On Fri, Nov 10, 2023 at 8:14 AM Jin Ma  wrote:
>
> The t0 register is used as a temporary register for interrupts, so it needs
> special treatment. It is necessary to avoid using "th.ldd" in the interrupt
> program to stop the subsequent operation of the t0 register, so they need to
> exchange positions in the function "riscv_for_each_saved_reg".

RISCV_PROLOGUE_TEMP_REGNUM indeed needs special treatment
in case of ISRs and fcsr. This patch just moves the TARGET_XTHEADMEMPAIR
block after the ISR/fcsr block.
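
The effect of the reordering can be sketched as a small decision
function in C. This is a simplified model, not the loop in
riscv_for_each_saved_reg: classify_save, the enum and the boolean
parameters are hypothetical, and TEMP_REGNUM assumes that
RISCV_PROLOGUE_TEMP_REGNUM is t0 (x5).

```c
#include <stdbool.h>

enum save_action
{
  SAVE_WITH_FCSR,  /* Save the temporary register, then fcsr through it.  */
  SAVE_PAIR,       /* Use an XTheadMemPair load/store for two registers.  */
  SAVE_SINGLE      /* Plain single-register save/restore.  */
};

/* Assumption: the prologue temporary register is t0 (x5).  */
#define TEMP_REGNUM 5

/* Decide how REGNO is handled.  The ISR/fcsr case is checked first, so
   the temporary register is never folded into a register pair.  */
static enum save_action
classify_save (unsigned int regno, bool isr_needs_fcsr,
               bool have_mempair, bool next_reg_pairable)
{
  if (regno == TEMP_REGNUM && isr_needs_fcsr)
    return SAVE_WITH_FCSR;

  if (have_mempair && next_reg_pairable)
    return SAVE_PAIR;

  return SAVE_SINGLE;
}
```

With the checks in the opposite order, the first call in the usage
below would have returned SAVE_PAIR, which is the situation the patch
avoids.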

Reviewed-by: Christoph Müllner 

>
> gcc/ChangeLog:
>
> * config/riscv/riscv.cc (riscv_for_each_saved_reg): Place the 
> interrupt
> operation before the XTheadMemPair.
> ---
>  gcc/config/riscv/riscv.cc | 56 +--
>  .../riscv/xtheadmempair-interrupt-fcsr.c  | 18 ++
>  2 files changed, 46 insertions(+), 28 deletions(-)
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/xtheadmempair-interrupt-fcsr.c
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index e25692b86fc..fa2d4d4b779 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -6346,6 +6346,34 @@ riscv_for_each_saved_reg (poly_int64 sp_offset, 
> riscv_save_restore_fn fn,
>   && riscv_is_eh_return_data_register (regno))
> continue;
>
> +  /* In an interrupt function, save and restore some necessary CSRs in 
> the stack
> +to avoid changes in CSRs.  */
> +  if (regno == RISCV_PROLOGUE_TEMP_REGNUM
> + && cfun->machine->interrupt_handler_p
> + && ((TARGET_HARD_FLOAT  && cfun->machine->frame.fmask)
> + || (TARGET_ZFINX
> + && (cfun->machine->frame.mask & ~(1 << 
> RISCV_PROLOGUE_TEMP_REGNUM)
> +   {
> + unsigned int fcsr_size = GET_MODE_SIZE (SImode);
> + if (!epilogue)
> +   {
> + riscv_save_restore_reg (word_mode, regno, offset, fn);
> + offset -= fcsr_size;
> + emit_insn (gen_riscv_frcsr (RISCV_PROLOGUE_TEMP (SImode)));
> + riscv_save_restore_reg (SImode, RISCV_PROLOGUE_TEMP_REGNUM,
> + offset, riscv_save_reg);
> +   }
> + else
> +   {
> + riscv_save_restore_reg (SImode, RISCV_PROLOGUE_TEMP_REGNUM,
> + offset - fcsr_size, riscv_restore_reg);
> + emit_insn (gen_riscv_fscsr (RISCV_PROLOGUE_TEMP (SImode)));
> + riscv_save_restore_reg (word_mode, regno, offset, fn);
> + offset -= fcsr_size;
> +   }
> + continue;
> +   }
> +
>if (TARGET_XTHEADMEMPAIR)
> {
>   /* Get the next reg/offset pair.  */
> @@ -6376,34 +6404,6 @@ riscv_for_each_saved_reg (poly_int64 sp_offset, 
> riscv_save_restore_fn fn,
> }
> }
>
> -  /* In an interrupt function, save and restore some necessary CSRs in 
> the stack
> -to avoid changes in CSRs.  */
> -  if (regno == RISCV_PROLOGUE_TEMP_REGNUM
> - && cfun->machine->interrupt_handler_p
> - && ((TARGET_HARD_FLOAT  && cfun->machine->frame.fmask)
> - || (TARGET_ZFINX
> - && (cfun->machine->frame.mask & ~(1 << 
> RISCV_PROLOGUE_TEMP_REGNUM)
> -   {
> - unsigned int fcsr_size = GET_MODE_SIZE (SImode);
> - if (!epilogue)
> -   {
> - riscv_save_restore_reg (word_mode, regno, offset, fn);
> - offset -= fcsr_size;
> - emit_insn (gen_riscv_frcsr (RISCV_PROLOGUE_TEMP (SImode)));
> - riscv_save_restore_reg (SImode, RISCV_PROLOGUE_TEMP_REGNUM,
> - offset, riscv_save_reg);
> -   }
> - else
> -   {
> - riscv_save_restore_reg (SImode, RISCV_PROLOGUE_TEMP_REGNUM,
> - offset - fcsr_size, riscv_restore_reg);
> - emit_insn (gen_riscv_fscsr (RISCV_PROLOGUE_TEMP (SImode)));
> - riscv_save_restore_reg (word_mode, regno, offset, fn);
> - offset -= fcsr_size;
> -   }
> - continue;
> -   }
> -
>riscv_save_restore_reg (word_mode, regno, offset, fn);
>  }
>
> diff --git a/gcc/testsuite/gcc.target/riscv/xtheadmempair-interrupt-fcsr.c 
> b/gcc/testsuite/gcc.target/riscv/xtheadmempair-interrupt-fcsr.c
> new file mode 100644
> index 000..d06f05f5c7c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/xtheadme

Re: [PATCH] RISC-V: Fix the illegal operands for the XTheadMemidx extension.

2023-11-09 Thread Christoph Müllner
On Thu, Nov 9, 2023 at 8:40 AM Jin Ma  wrote:
>
> The pattern "*extend2_bitmanip" and
> "*zero_extendhi2_bitmanip" in bitmanip.md are similar
> to the pattern "*th_memidx_bb_extendqi2" and
> "*th_memidx_bb_zero_extendhi2" in thead.md, which will
> cause the wrong instruction to be generated and report the
> following error in binutils:
> Assembler messages:
> Error: illegal operands `lb a5,(a0),1,0'
>
> In fact, the correct instruction is "th.lbia a5,(a0),1,0".

LGTM.
The zbb_xtheadmemidx combination was not part of the test matrix.
We only had xtheadbb_xtheadmemidx there.

Thanks!

>
> gcc/ChangeLog:
>
> * config/riscv/bitmanip.md: Avoid the conflict between
> zbb and xtheadmemidx in patterns.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/xtheadfmemidx-uindex-zbb.c: New test.
> ---
>  gcc/config/riscv/bitmanip.md  |  4 +--
>  .../riscv/xtheadfmemidx-uindex-zbb.c  | 30 +++
>  2 files changed, 32 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadfmemidx-uindex-zbb.c
>
> diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
> index a9c8275fca7..878395c3ffa 100644
> --- a/gcc/config/riscv/bitmanip.md
> +++ b/gcc/config/riscv/bitmanip.md
> @@ -290,7 +290,7 @@ (define_insn "*di2"
>  (define_insn "*zero_extendhi2_bitmanip"
>[(set (match_operand:GPR 0 "register_operand" "=r,r")
>  (zero_extend:GPR (match_operand:HI 1 "nonimmediate_operand" "r,m")))]
> -  "TARGET_ZBB"
> +  "TARGET_ZBB  && !TARGET_XTHEADMEMIDX"
>"@
> zext.h\t%0,%1
> lhu\t%0,%1"
> @@ -301,7 +301,7 @@ (define_insn "*extend2_bitmanip"
>[(set (match_operand:SUPERQI   0 "register_operand" "=r,r")
> (sign_extend:SUPERQI
> (match_operand:SHORT 1 "nonimmediate_operand" " r,m")))]
> -  "TARGET_ZBB"
> +  "TARGET_ZBB && !TARGET_XTHEADMEMIDX"
>"@
> sext.\t%0,%1
> l\t%0,%1"
> diff --git a/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-uindex-zbb.c 
> b/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-uindex-zbb.c
> new file mode 100644
> index 000..a05bc220cba
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-uindex-zbb.c
> @@ -0,0 +1,30 @@
> +/* { dg-do compile } */
> +/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Og" } } */
> +/* { dg-options "-march=rv64gc_zbb_xtheadmemidx -mabi=lp64d" { target { rv64 
> } } } */
> +/* { dg-options "-march=rv32imafc_zbb_xtheadmemidx -mabi=ilp32f" { target { 
> rv32 } } } */
> +
> +const unsigned char *
> +read_uleb128(const unsigned char *p, unsigned long *val)
> +{
> +  unsigned int shift = 0;
> +  unsigned char byte;
> +  unsigned long result;
> +
> +  result = 0;
> +  do
> +  {
> +byte = *p++;
> +result |= ((unsigned long)byte & 0x7f) << shift;
> +shift += 7;
> +  } while (byte & 0x80);
> +
> +  *val = result;
> +  return p;
> +}
> +
> +void test(const unsigned char *p, unsigned long utmp)
> +{
> +  p = read_uleb128(p, &utmp);
> +}
> +
> +/* { dg-final { scan-assembler-not {\mlb\ta[0-9],\(a[0-9]\),1,0\M} } } */
>
> base-commit: 04d8a47608dcae7f61805e3566e3a1571b574405
> --
> 2.17.1
>


Re: [PATCH] minimal support for xtheadv

2023-11-09 Thread Christoph Müllner
On Thu, Nov 9, 2023 at 8:39 AM Kito Cheng  wrote:
>
> Hi Yi Xuan:
>
> This patch is trivial, and generally LGTM, but I would require putting
> the spec into https://github.com/riscv-non-isa/riscv-toolchain-conventions
> before merging this, also don't forget include "RISC-V:" in the title,
> it would be easier to track during the RISC-V GCC sync meeting :)
>
> And I am a little bit confused by the author's info? is it from you or
> "XYenChi "? or oriachi...@gmail.com is also your
> mail address?
>
> cc Christoph since I believe you may know more about that process.
> cc JoJo since you are T-head folk :P

Hi Yi Xuan and Kito,

I was not aware that CAS is working on getting T-Head's Vector
extension supported.
My biggest concern with this patch is that "XTheadV" does not have a
specification.

T-Head and VRULL are currently working on support patches for T-Head's
Vector extension implementation. We've named the extension XTheadVector.
Supporting XTheadVector means addressing a range of issues (e.g.
defining a formal ISA vendor extension specification, extension
discovery, addressing implementation details, differences among
available cores, intrinsics, ...).
We've already made good progress on that and expect to publish first
results soon.

BR
Christoph

>
>
> On Wed, Nov 8, 2023 at 9:13 PM  wrote:
> >
> > From: XYenChi 
> >
> > This patch is for support xtheadv.
> >
> > gcc/ChangeLog:
> >
> > 2023-11-08  Chen Yixuan  
> >
> > * common/config/riscv/riscv-common.cc: Add xthead minimal support.
> >
> > gcc/config/ChangeLog:
> >
> > 2023-11-08  Chen Yixuan  
> >
> > * riscv/riscv.opt: Add xthead minimal support.
> > ---
> >  gcc/common/config/riscv/riscv-common.cc | 2 ++
> >  gcc/config/riscv/riscv.opt  | 2 ++
> >  2 files changed, 4 insertions(+)
> >
> > diff --git a/gcc/common/config/riscv/riscv-common.cc 
> > b/gcc/common/config/riscv/riscv-common.cc
> > index 526dbb7603b..d5ea0ee9b70 100644
> > --- a/gcc/common/config/riscv/riscv-common.cc
> > +++ b/gcc/common/config/riscv/riscv-common.cc
> > @@ -325,6 +325,7 @@ static const struct riscv_ext_version 
> > riscv_ext_version_table[] =
> >{"xtheadmemidx", ISA_SPEC_CLASS_NONE, 1, 0},
> >{"xtheadmempair", ISA_SPEC_CLASS_NONE, 1, 0},
> >{"xtheadsync", ISA_SPEC_CLASS_NONE, 1, 0},
> > +  {"xtheadv",ISA_SPEC_CLASS_NONE, 0, 7},
> >
> >{"xventanacondops", ISA_SPEC_CLASS_NONE, 1, 0},
> >
> > @@ -1680,6 +1681,7 @@ static const riscv_ext_flag_table_t 
> > riscv_ext_flag_table[] =
> >{"xtheadmemidx",  &gcc_options::x_riscv_xthead_subext, 
> > MASK_XTHEADMEMIDX},
> >{"xtheadmempair", &gcc_options::x_riscv_xthead_subext, 
> > MASK_XTHEADMEMPAIR},
> >{"xtheadsync",&gcc_options::x_riscv_xthead_subext, MASK_XTHEADSYNC},
> > +  {"xtheadv",   &gcc_options::x_riscv_xthead_subext, MASK_XTHEADV},
> >
> >{"xventanacondops", &gcc_options::x_riscv_xventana_subext, 
> > MASK_XVENTANACONDOPS},
> >
> > diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
> > index 70d78151cee..2bbdf680fa2 100644
> > --- a/gcc/config/riscv/riscv.opt
> > +++ b/gcc/config/riscv/riscv.opt
> > @@ -438,6 +438,8 @@ Mask(XTHEADMEMPAIR) Var(riscv_xthead_subext)
> >
> >  Mask(XTHEADSYNC)Var(riscv_xthead_subext)
> >
> > +Mask(XTHEADV)   Var(riscv_xthead_subext)
> > +
> >  TargetVariable
> >  int riscv_xventana_subext
> >
> > --
> > 2.42.0
> >


Re: [PATCH] RISC-V: Use stdint-gcc.h in rvv testsuite

2023-11-07 Thread Christoph Müllner
On Tue, Nov 7, 2023 at 11:16 AM Kito Cheng  wrote:
>
> LGTM, but title is little bit misleading, it's not really related to rvv, 
> change to either RISC-V or T-head is fine, anyway, you can commit without 
> send v2 :)

Fixed and pushed.

Thanks!

>
> On Tue, Nov 7, 2023 at 5:45 PM Christoph Muellner wrote:
>>
>> From: Christoph Müllner 
>>
>> stdint.h can be replaced with stdint-gcc.h to resolve some missing
>> system headers in non-multilib installations.
>>
>> gcc/testsuite/ChangeLog:
>>
>> * gcc.target/riscv/xtheadmemidx-helpers.h:
>> Replace stdint.h with stdint-gcc.h.
>>
>> Signed-off-by: Christoph Müllner 
>> ---
>>  gcc/testsuite/gcc.target/riscv/xtheadmemidx-helpers.h | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/gcc/testsuite/gcc.target/riscv/xtheadmemidx-helpers.h 
>> b/gcc/testsuite/gcc.target/riscv/xtheadmemidx-helpers.h
>> index a97f08c5cc1..9d8ce124a93 100644
>> --- a/gcc/testsuite/gcc.target/riscv/xtheadmemidx-helpers.h
>> +++ b/gcc/testsuite/gcc.target/riscv/xtheadmemidx-helpers.h
>> @@ -1,7 +1,7 @@
>>  #ifndef XTHEADMEMIDX_HELPERS_H
>>  #define XTHEADMEMIDX_HELPERS_H
>>
>> -#include 
>> -#include <stdint.h>
>> +#include <stdint-gcc.h>
>>  #define intX_t long
>>  #define uintX_t unsigned long
>> --
>> 2.41.0
>>


Re: [PATCH] RISC-V: Add ABI requirement for XTheadFMemIdx tests

2023-11-07 Thread Christoph Müllner
On Tue, Nov 7, 2023 at 2:19 AM Kito Cheng  wrote:
>
> LGTM, and maybe change stdint.h to stdint-gcc.h in
> xtheadmemidx-helpers.h? that could make it more portable on multi-lib
> testing.

Can be found here:
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635508.html

Thanks!

>
> On Tue, Nov 7, 2023 at 3:44 AM Christoph Muellner
>  wrote:
> >
> > From: Christoph Müllner 
> >
> > The XTheadFMemIdx tests set the required ABI for RV32, but not
> > for RV64, which has the effect that the tests are expected to
> > succeed for RV64/LP64.  Let's set the ABI to LP64D in these
> > tests to clarify the requirements.
> >
> > Signed-off-by: Christoph Müllner 
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/riscv/xtheadfmemidx-index-update.c: Add ABI.
> > * gcc.target/riscv/xtheadfmemidx-index-xtheadbb-update.c: Likewise.
> > * gcc.target/riscv/xtheadfmemidx-index-xtheadbb.c: Likewise.
> > * gcc.target/riscv/xtheadfmemidx-index.c: Likewise.
> > * gcc.target/riscv/xtheadfmemidx-uindex-update.c: Likewise.
> > * gcc.target/riscv/xtheadfmemidx-uindex-xtheadbb-update.c: Likewise.
> > * gcc.target/riscv/xtheadfmemidx-uindex-xtheadbb.c: Likewise.
> > * gcc.target/riscv/xtheadfmemidx-uindex.c: Likewise.
> > ---
> >  gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-update.c | 2 +-
> >  .../gcc.target/riscv/xtheadfmemidx-index-xtheadbb-update.c  | 2 +-
> >  gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-xtheadbb.c   | 2 +-
> >  gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index.c| 2 +-
> >  gcc/testsuite/gcc.target/riscv/xtheadfmemidx-uindex-update.c| 2 +-
> >  .../gcc.target/riscv/xtheadfmemidx-uindex-xtheadbb-update.c | 2 +-
> >  gcc/testsuite/gcc.target/riscv/xtheadfmemidx-uindex-xtheadbb.c  | 2 +-
> >  gcc/testsuite/gcc.target/riscv/xtheadfmemidx-uindex.c   | 2 +-
> >  8 files changed, 8 insertions(+), 8 deletions(-)
> >
> > diff --git a/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-update.c 
> > b/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-update.c
> > index 24bbb63d174..cb86b8ad296 100644
> > --- a/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-update.c
> > +++ b/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-update.c
> > @@ -1,6 +1,6 @@
> >  /* { dg-do compile } */
> >  /* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Og" } } */
> > -/* { dg-options "-march=rv64gc_xtheadmemidx_xtheadfmemidx" { target { rv64 
> > } } } */
> > +/* { dg-options "-march=rv64gc_xtheadmemidx_xtheadfmemidx -mabi=lp64d" { 
> > target { rv64 } } } */
> >  /* { dg-options "-march=rv32imafc_xtheadmemidx_xtheadfmemidx -mabi=ilp32f" 
> > { target { rv32 } } } */
> >
> >  #include "xtheadmemidx-helpers.h"
> > diff --git 
> > a/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-xtheadbb-update.c 
> > b/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-xtheadbb-update.c
> > index 3b931a4b980..cc3f6219c05 100644
> > --- a/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-xtheadbb-update.c
> > +++ b/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-xtheadbb-update.c
> > @@ -1,6 +1,6 @@
> >  /* { dg-do compile } */
> >  /* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Og" } } */
> > -/* { dg-options "-march=rv64gc_xtheadbb_xtheadmemidx_xtheadfmemidx" { 
> > target { rv64 } } } */
> > +/* { dg-options "-march=rv64gc_xtheadbb_xtheadmemidx_xtheadfmemidx 
> > -mabi=lp64d" { target { rv64 } } } */
> >  /* { dg-options "-march=rv32imafc_xtheadbb_xtheadmemidx_xtheadfmemidx 
> > -mabi=ilp32f" { target { rv32 } } } */
> >
> >  #include "xtheadmemidx-helpers.h"
> > diff --git a/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-xtheadbb.c 
> > b/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-xtheadbb.c
> > index 48858605c24..8ee98c87469 100644
> > --- a/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-xtheadbb.c
> > +++ b/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-xtheadbb.c
> > @@ -1,6 +1,6 @@
> >  /* { dg-do compile } */
> >  /* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Og" } } */
> > -/* { dg-options "-march=rv64gc_xtheadbb_xtheadmemidx_xtheadfmemidx" { 
> > target { rv64 } } } */
> > +/* { dg-options "-march=rv64gc_xtheadbb_xtheadmemidx_xtheadfmemidx 
> > -mabi=lp64d" { target { rv64 } } } */
> >  /* { dg-options "-m

Re: [RE] [7/7] riscv: Add basic extension support for XTheadFmv and XTheadInt

2023-11-02 Thread Christoph Müllner
On Thu, Nov 2, 2023, 08:32 Jin Ma  wrote:

> Hi, I see that XTheadInt is not implemented in the compiler. Is there any
> plan here?
> If there is no patch for it, can I try to implement it with you?
>

Yes, sounds good.
Let me know if you have any questions.
We don't have any plans to work on this at the moment.

BR
Christoph




> Thanks
>
> Jin
>


Re: [PATCH v2 2/2] riscv: thead: Add support for the XTheadFMemIdx ISA extension

2023-10-31 Thread Christoph Müllner
On Sun, Oct 29, 2023 at 11:25 PM Jeff Law  wrote:
>
>
>
> On 10/20/23 03:53, Christoph Muellner wrote:
> > From: Christoph Müllner 
> >
> > The XTheadFMemIdx ISA extension provides additional load and store
> > instructions for floating-point registers with new addressing modes.
> >
> > The following memory access types are supported:
> > * load/store: [w,d] (single-precision FP, double-precision FP)
> >
> > The following addressing modes are supported:
> > * register offset with additional immediate offset (4 instructions):
> >flr, fsr
> > * zero-extended register offset with additional immediate offset
> >(4 instructions): flur, fsur
> >
> > These addressing modes are also part of the similar XTheadMemIdx
> > ISA extension support, whose code is reused and extended to support
> > floating-point registers.
> >
> > One challenge that this patch needs to solve is GP registers in FP mode
> > (e.g. "(reg:DF a2)"), which cannot be handled by the XTheadFMemIdx
> > instructions. Such registers are the result of independent
> > optimizations, which can happen after register allocation.
> > This patch uses a simple but efficient method to address this:
> > make the XTheadFMemIdx optimizations depend on XTheadMemIdx.
> > This allows using the instructions from XTheadMemIdx for
> > such registers.
> Or alternately define secondary reloads so that you can get a scratch
> register to reload the address into a GPR.  Your call on whether or not
> to try to implement that.  I guess it largely depends on how likely it
> is you'll have one extension defined, but not the other.

I started doing this, but concluded that it is not worth the effort,
given that all cores that implement one extension also support the other.


> > The added tests ensure that this feature won't regress without notice.
> > Testing: GCC regression test suite and SPEC CPU 2017 intrate (base).
> >
> > Signed-off-by: Christoph Müllner 
> >
> > gcc/ChangeLog:
> >
> >   * config/riscv/riscv.cc (riscv_index_reg_class):
> >   Return GR_REGS for XTheadFMemIdx.
> >   (riscv_regno_ok_for_index_p): Add support for XTheadFMemIdx.
> >   * config/riscv/riscv.h (HARDFP_REG_P): New macro.
> >   * config/riscv/thead.cc (is_fmemidx_mode): New function.
> >   (th_memidx_classify_address_index): Add support for XTheadFMemIdx.
> >   (th_fmemidx_output_index): New function.
> >   (th_output_move): Add support for XTheadFMemIdx.
> >   * config/riscv/thead.md (TH_M_ANYF): New mode iterator.
> >   (TH_M_NOEXTF): Likewise.
> >   (*th_fmemidx_movsf_hardfloat): New INSN.
> >   (*th_fmemidx_movdf_hardfloat_rv64): Likewise.
> >   (*th_fmemidx_I_a): Likewise.
> >   (*th_fmemidx_I_c): Likewise.
> >   (*th_fmemidx_US_a): Likewise.
> >   (*th_fmemidx_US_c): Likewise.
> >   (*th_fmemidx_UZ_a): Likewise.
> >   (*th_fmemidx_UZ_c): Likewise.
> >
> > gcc/testsuite/ChangeLog:
> >
> >   * gcc.target/riscv/xtheadfmemidx-index-update.c: New test.
> >   * gcc.target/riscv/xtheadfmemidx-index-xtheadbb-update.c: New test.
> >   * gcc.target/riscv/xtheadfmemidx-index-xtheadbb.c: New test.
> >   * gcc.target/riscv/xtheadfmemidx-index.c: New test.
> >   * gcc.target/riscv/xtheadfmemidx-uindex-update.c: New test.
> >   * gcc.target/riscv/xtheadfmemidx-uindex-xtheadbb-update.c: New test.
> >   * gcc.target/riscv/xtheadfmemidx-uindex-xtheadbb.c: New test.
> >   * gcc.target/riscv/xtheadfmemidx-uindex.c: New test.
> > ---
> Same note as with the prior patch WRT wrapping assembly instructions
> when using scan-assembler.

Will do.

>
>
>
> > diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
> > index eb162abcb92..1e9813b4f39 100644
> > --- a/gcc/config/riscv/riscv.h
> > +++ b/gcc/config/riscv/riscv.h
> > @@ -372,6 +372,8 @@ ASM_MISA_SPEC
> > ((unsigned int) ((int) (REGNO) - GP_REG_FIRST) < GP_REG_NUM)
> >   #define FP_REG_P(REGNO)  \
> > ((unsigned int) ((int) (REGNO) - FP_REG_FIRST) < FP_REG_NUM)
> > +#define HARDFP_REG_P(REGNO)  \
> > +  ((REGNO) >= FP_REG_FIRST && (REGNO) <= FP_REG_LAST)
> >   #define V_REG_P(REGNO)  \
> > ((unsigned int) ((int) (REGNO) - V_REG_FIRST) < V_REG_NUM)
> >   #define VL_REG_P(REGNO) ((REGNO) == VL_REGNUM)
>
> > @@ -755,6 +768,40 @@ th_memidx_output_index (rtx x, machine_mode mode, bool 
> > load)
> > return buf;
> >   }
> >
> > +/* Provide a buffer for a th.flX/th.fluX

Re: [PATCH v2 1/2] riscv: thead: Add support for the XTheadMemIdx ISA extension

2023-10-31 Thread Christoph Müllner
On Sun, Oct 29, 2023 at 10:44 PM Jeff Law  wrote:
>
>
>
> On 10/20/23 03:53, Christoph Muellner wrote:
> > From: Christoph Müllner 
> >
> > The XTheadMemIdx ISA extension provides additional load and store
> > instructions with new addressing modes.
> >
> > The following memory access types are supported:
> > * load: b,bu,h,hu,w,wu,d
> > * store: b,h,w,d
> >
> > The following addressing modes are supported:
> > * immediate offset with PRE_MODIFY or POST_MODIFY (22 instructions):
> >l.ia, l.ib, s.ia, s.ib
> > * register offset with additional immediate offset (11 instructions):
> >lr, sr
> > * zero-extended register offset with additional immediate offset
> >(11 instructions): lur, sur
> >
> > The RISC-V base ISA does not support index registers, so the changes
> > are kept separate from the RISC-V standard support as much as possible.
> >
> > To combine the shift/multiply instructions into the memory access
> > instructions, this patch comes with a few insn_and_split optimizations
> > that allow the combiner to do this task.
> >
> > Handling the different cases of extensions results in a couple of INSNs
> > that look redundant at first glance, but they are just the equivalent
> > of what we already have for Zbb. The only difference is that
> > we have many more load instructions.
> >
> > We already have a constraint with the name 'th_f_fmv', therefore,
> > the new constraints follow this pattern and have the same length
> > as required ('th_m_mia', 'th_m_mib', 'th_m_mir', 'th_m_miu').
> >
> > The added tests ensure that this feature won't regress without notice.
> > Testing: GCC regression test suite, GCC bootstrap build, and
> > SPEC CPU 2017 intrate (base) on C920.
> >
> > Signed-off-by: Christoph Müllner 
> >
> > gcc/ChangeLog:
> >
> >   * config/riscv/constraints.md (th_m_mia): New constraint.
> >   (th_m_mib): Likewise.
> >   (th_m_mir): Likewise.
> >   (th_m_miu): Likewise.
> >   * config/riscv/riscv-protos.h (enum riscv_address_type):
> >   Add new address types ADDRESS_REG_REG, ADDRESS_REG_UREG,
> >   and ADDRESS_REG_WB and their documentation.
> >   (struct riscv_address_info): Add new field 'shift' and
> >   document the field usage for the new address types.
> >   (riscv_valid_base_register_p): New prototype.
> >   (th_memidx_legitimate_modify_p): Likewise.
> >   (th_memidx_legitimate_index_p): Likewise.
> >   (th_classify_address): Likewise.
> >   (th_output_move): Likewise.
> >   (th_print_operand_address): Likewise.
> >   * config/riscv/riscv.cc (riscv_index_reg_class):
> >   Return GR_REGS for XTheadMemIdx.
> >   (riscv_regno_ok_for_index_p): Add support for XTheadMemIdx.
> >   (riscv_classify_address): Call th_classify_address() on top.
> >   (riscv_output_move): Call th_output_move() on top.
> >   (riscv_print_operand_address): Call th_print_operand_address()
> >   on top.
> >   * config/riscv/riscv.h (HAVE_POST_MODIFY_DISP): New macro.
> >   (HAVE_PRE_MODIFY_DISP): Likewise.
> >   * config/riscv/riscv.md (zero_extendqi2): Disable
> >   for XTheadMemIdx.
> >   (*zero_extendqi2_internal): Convert to expand,
> >   create INSN with same name and disable it for XTheadMemIdx.
> >   (extendsidi2): Likewise.
> >   (*extendsidi2_internal): Disable for XTheadMemIdx.
> >   * config/riscv/thead.cc (valid_signed_immediate): New helper
> >   function.
> >   (th_memidx_classify_address_modify): New function.
> >   (th_memidx_legitimate_modify_p): Likewise.
> >   (th_memidx_output_modify): Likewise.
> >   (is_memidx_mode): Likewise.
> >   (th_memidx_classify_address_index): Likewise.
> >   (th_memidx_legitimate_index_p): Likewise.
> >   (th_memidx_output_index): Likewise.
> >   (th_classify_address): Likewise.
> >   (th_output_move): Likewise.
> >   (th_print_operand_address): Likewise.
> >   * config/riscv/thead.md (*th_memidx_operand): New splitter.
> >   (*th_memidx_zero_extendqi2): New INSN.
> >   (*th_memidx_extendsidi2): Likewise.
> >   (*th_memidx_zero_extendsidi2): Likewise.
> >   (*th_memidx_zero_extendhi2): Likewise.
> >   (*th_memidx_extend2): Likewise.
> >   (*th_memidx_bb_zero_extendsidi2): Likewise.
> >   (*th_memidx_bb_zero_extendhi2): Likewise.
> >   (*th_memidx_bb_extendhi2): Likewise.
> >   (*th_me

Re: [PATCH v2 0/2] riscv: Adding support for XTHead(F)MemIdx

2023-10-20 Thread Christoph Müllner
On Fri, Oct 20, 2023 at 4:33 PM Jeff Law  wrote:
>
>
>
> On 10/20/23 03:53, Christoph Muellner wrote:
> > From: Christoph Müllner 
> >
> > These two patches add support for the XTheadMemIdx
> > and XTheadFMemIdx ISA extensions, which support additional
> > addressing modes. The extensions are implemented in a range
> > of T-Head cores (e.g. C906, C910, C920) and have been available
> > on the market for quite some time.
> >
> > The ISA spec can be found here:
> >https://github.com/T-head-Semi/thead-extension-spec
> >
> > An initial version of these patches was sent a while ago.
> > Jeff Law suggested using INSNs instead of peepholes to let
> > the combiner do the optimization.  This is the major change
> > that these patches have seen.
> Did you happen to do any before/after testing?  And if so, did using the
> combiner help with discovery of these cases?  I would expect it to have
> done so, but it's always nice to have a confirmation.

I had no doubt this would be equal or better, therefore I did not plan
to do that.
However, measuring this is not that hard, so I just did the exercise
of forward-porting the peephole-based patchset (and all tiny fixes
that the v2 has).
I then built xalancbmk_r/peak (randomly selected) with both compilers and
compared the number of indexed loads and indexed stores in the binary:

v1: 3982 indexed loads / 2447 indexed stores
v2: 4110 indexed loads (+3.2%) / 2476 indexed stores (+1.2%)

So your suggestion indeed helps to discover additional cases.
Thanks again for that!
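
For anyone who wants to re-check the percentages, the relative change
can be recomputed from the raw counts. This is only a small sketch; the
helper names percent_increase and print_comparison are made up for
illustration and are not part of the patchset:

```c
#include <stdio.h>

/* Relative increase from BEFORE to AFTER, in percent.
   Hypothetical helper, not part of the patchset.  */
double
percent_increase (unsigned before, unsigned after)
{
  return ((double) after - (double) before) * 100.0 / (double) before;
}

/* Print the v1 -> v2 comparison using the counts quoted in this mail.  */
void
print_comparison (void)
{
  printf ("indexed loads:  %+.1f%%\n", percent_increase (3982, 4110));
  printf ("indexed stores: %+.1f%%\n", percent_increase (2447, 2476));
}
```

Calling print_comparison () reports the same rounded values as above
(128 extra indexed loads, 29 extra indexed stores).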

BR
Christoph


Re: [PATCH] RISC-V: Make xtheadcondmov-indirect tests robust against instruction reordering

2023-10-10 Thread Christoph Müllner
On Tue, Oct 10, 2023 at 5:08 AM Kito Cheng  wrote:
>
> I guess you may also want to clean up those bodies for 
> "check-function-bodies"?

I kept the comments on purpose, so that I have a basis for the
expected instruction counts.
But of course, there is no need to follow the format.
Would the following format change of the comments be ok?

-/*
-** ConEmv_imm_imm_reg:
-** addi    a[0-9]+,a[0-9]+,-1000
-** li  a[0-9]+,10
-** th\.mvnez   a[0-9]+,a[0-9]+,a[0-9]+
-** ret
-*/
+/* addi aX, aX, -1000
+   li aX, 10
+   th.mvnez aX, aX, aX  */

BR
Christoph

>
> On Mon, Oct 9, 2023 at 3:47 PM Christoph Muellner
>  wrote:
> >
> > From: Christoph Müllner 
> >
> > Fixes: c1bc7513b1d7 ("RISC-V: const: hide mvconst splitter from IRA")
> >
> > A recent change broke the xtheadcondmov-indirect tests, because the order of
> > emitted instructions changed. Since the test is too strict when testing for
> > a fixed instruction order, let's change the tests to simply count
> > instructions, as is done for similar tests.
> >
> > Reported-by: Patrick O'Neill 
> > Signed-off-by: Christoph Müllner 
> >
> > gcc/testsuite/ChangeLog:
> >
> >     * gcc.target/riscv/xtheadcondmov-indirect.c: Make robust against
> > instruction reordering.
> >
> > Signed-off-by: Christoph Müllner 
> > ---
> >  .../gcc.target/riscv/xtheadcondmov-indirect.c | 11 ---
> >  1 file changed, 8 insertions(+), 3 deletions(-)
> >
> > diff --git a/gcc/testsuite/gcc.target/riscv/xtheadcondmov-indirect.c 
> > b/gcc/testsuite/gcc.target/riscv/xtheadcondmov-indirect.c
> > index c3253ba5239..eba1b86137b 100644
> > --- a/gcc/testsuite/gcc.target/riscv/xtheadcondmov-indirect.c
> > +++ b/gcc/testsuite/gcc.target/riscv/xtheadcondmov-indirect.c
> > @@ -1,8 +1,7 @@
> >  /* { dg-do compile } */
> > -/* { dg-options "-march=rv32gc_xtheadcondmov -fno-sched-pressure" { target 
> > { rv32 } } } */
> > -/* { dg-options "-march=rv64gc_xtheadcondmov -fno-sched-pressure" { target 
> > { rv64 } } } */
> > +/* { dg-options "-march=rv32gc_xtheadcondmov" { target { rv32 } } } */
> > +/* { dg-options "-march=rv64gc_xtheadcondmov" { target { rv64 } } } */
> >  /* { dg-skip-if "" { *-*-* } {"-O0" "-Os" "-Og" "-Oz" "-flto" } } */
> > -/* { dg-final { check-function-bodies "**" "" } } */
> >
> >  /*
> >  ** ConEmv_imm_imm_reg:
> > @@ -116,3 +115,9 @@ int ConNmv_reg_reg_reg(int x, int y, int z, int n)
> >  return z;
> >return n;
> >  }
> > +
> > +/* { dg-final { scan-assembler-times "addi\t" 5 } } */
> > +/* { dg-final { scan-assembler-times "li\t" 4 } } */
> > +/* { dg-final { scan-assembler-times "sub\t" 4 } } */
> > +/* { dg-final { scan-assembler-times "th.mveqz\t" 4 } } */
> > +/* { dg-final { scan-assembler-times "th.mvnez\t" 4 } } */
> > --
> > 2.41.0
> >


Re: xthead regression with [COMMITTED] RISC-V: const: hide mvconst splitter from IRA

2023-10-09 Thread Christoph Müllner
On Mon, Oct 9, 2023 at 10:48 PM Vineet Gupta  wrote:
>
> On 10/9/23 13:46, Christoph Müllner wrote:
> > Given that this causes repeated issues, I think that a fall-back to
> > counting occurrences is the right thing to do. I can do that if that's ok.
>
> Thanks Christoph.

Tested patch on list:
  https://gcc.gnu.org/pipermail/gcc-patches/2023-October/632393.html

>
> -Vineet


Re: xthead regression with [COMMITTED] RISC-V: const: hide mvconst splitter from IRA

2023-10-09 Thread Christoph Müllner
On Mon, Oct 9, 2023 at 10:36 PM Vineet Gupta  wrote:
>
> Hi Christoph,
>
> On 10/9/23 12:06, Patrick O'Neill wrote:
> >
> > Hi Vineet,
> >
> > We're seeing a regression on all riscv targets after this patch:
> >
> > FAIL: gcc.target/riscv/xtheadcondmov-indirect.c -O2
> > check-function-bodies ConNmv_imm_imm_reg
> > FAIL: gcc.target/riscv/xtheadcondmov-indirect.c -O3 -g
> > check-function-bodies ConNmv_imm_imm_reg
> >
> > Debug log output:
> > body: \taddi a[0-9]+,a[0-9]+,-1000+
> > \tli a[0-9]+,9998336+
> > \taddi a[0-9]+,a[0-9]+,1664+
> > \tth.mveqz a[0-9]+,a[0-9]+,a[0-9]+
> > \tret
> >
> > against: li a5,9998336
> > addi a4,a0,-1000
> > addi a0,a5,1664
> > th.mveqz a0,a1,a4
> > ret
> >
> > https://github.com/patrick-rivos/gcc-postcommit-ci/issues/8
> > https://github.com/ewlu/riscv-gnu-toolchain/issues/286
> >
>
> It seems that with my patch, exactly the same instructions are emitted in a
> different order (for -O2/-O3), tripping up the test results; the order
> differs from, say, -O1 for the exact same build.
>
> -O2 w/ patch
> ConNmv_imm_imm_reg:
>  li a5,9998336
>  addi a4,a0,-1000
>  addi a0,a5,1664
>  th.mveqz a0,a1,a4
>  ret
>
> -O1 w/ patch
> ConNmv_imm_imm_reg:
>  addi a4,a0,-1000
>  li a5,9998336
>  addi a0,a5,1664
>  th.mveqz a0,a1,a4
>  ret
>
> I'm not sure if there is an easy way to handle that.
> Is there a real reason for testing the full sequences verbatim, or is
> testing the number of occurrences of th.mv{eqz,nez} enough?

I did not write the test cases, I just merged two non-functional test files
into one that works without changing the actual test approach.

Given that this causes repeated issues, I think that a fall-back to counting
occurrences is the right thing to do.

I can do that if that's ok.
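
To illustrate why counting is robust against reordering, here is a
minimal sketch in plain C. It is not DejaGnu code and not how
scan-assembler-times is implemented; count_occurrences is a hypothetical
helper that only mimics the idea of counting a mnemonic in the emitted
assembly:

```c
#include <stddef.h>
#include <string.h>

/* Count non-overlapping occurrences of NEEDLE in HAYSTACK.
   Hypothetical helper for illustration only.  */
size_t
count_occurrences (const char *haystack, const char *needle)
{
  size_t count = 0;
  size_t len = strlen (needle);

  /* An empty needle would match everywhere; treat it as no match.  */
  if (len == 0)
    return 0;

  for (const char *p = strstr (haystack, needle); p != NULL;
       p = strstr (p + len, needle))
    count++;

  return count;
}
```

Applied to the -O1 and -O2 sequences quoted above, the counts for
"addi\t", "li\t" and "th.mveqz\t" are identical, although a verbatim
comparison of the two function bodies fails.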

BR
Christoph



> It seems Jeff recently added -fno-sched-pressure to avoid similar issues
> but that apparently is no longer sufficient.
>
> Thx,
> -Vineet
>
> > Thanks,
> > Patrick
> >
> > On 10/6/23 11:22, Vineet Gupta wrote:
> >> Vlad recently introduced a new gate @ira_in_progress, similar to
> >> counterparts @{reload,lra}_in_progress.
> >>
> >> Use this to hide the constant synthesis splitter from being recog* ()
> >> by IRA register equivalence logic which is eager to undo the splits,
> >> generating worse code for constants (and sometimes no code at all).
> >>
> >> See PR/109279 (large constant), PR/110748 (const -0.0) ...
> >>
> >> Granted the IRA logic is subsided with -fsched-pressure which is now
> >> enabled for RISC-V backend, the gate makes this future-proof in
> >> addition to helping with -O1 etc.
> >>
> >> This fixes 1 additional test
> >>
> >>                = Summary of gcc testsuite =
> >>                             | # of unexpected case / # of unique unexpected case
> >>                             |          gcc |          g++ |     gfortran |
> >>
> >>    rv32imac/  ilp32/ medlow |  416 /   103 |   13 /     6 |   67 /    12 |
> >>  rv32imafdc/ ilp32d/ medlow |  416 /   103 |   13 /     6 |   24 /     4 |
> >>    rv64imac/   lp64/ medlow |  417 /   104 |    9 /     3 |   67 /    12 |
> >>  rv64imafdc/  lp64d/ medlow |  416 /   103 |    5 /     2 |    6 /     1 |
> >>
> >> Also similar to v1, this doesn't move RISC-V SPEC scores at all.
> >>
> >> gcc/ChangeLog:
> >>  * config/riscv/riscv.md (mvconst_internal): Add !ira_in_progress.
> >>
> >> Suggested-by: Jeff Law
> >> Signed-off-by: Vineet Gupta
> >> ---
> >>   gcc/config/riscv/riscv.md | 9 ++---
> >>   1 file changed, 6 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> >> index 1ebe8f92284d..da84b9357bd3 100644
> >> --- a/gcc/config/riscv/riscv.md
> >> +++ b/gcc/config/riscv/riscv.md
> >> @@ -1997,13 +1997,16 @@
> >>
> >>   ;; Pretend to have the ability to load complex const_int in order to get
> >>   ;; better code generation around them.
> >> -;;
> >>   ;; But avoid constants that are special cased elsewhere.
> >> +;;
> >> +;; Hide it from IRA register equiv recog* () to elide potential undoing 
> >> of split
> >> +;;
> >>   (define_insn_and_split "*mvconst_internal"
> >> [(set (match_operand:GPR 0 "register_operand" "=r")
> >>   (match_operand:GPR 1 "splittable_const_int_operand" "i"))]
> >> -  "!(p2m1_shift_operand (operands[1], mode)
> >> - || high_mask_shift_operand (operands[1], mode))"
> >> +  "!ira_in_progress
> >> +   && !(p2m1_shift_operand (operands[1], mode)
> >> +|| high_mask_shift_operand (operands[1], mode))"
> >> "#"
> >> "&& 1"
> >> [(const_int 0)]
>


Re: [PATCH] RISC-V: THead: Fix missing CFI directives for th.sdd in prologue.

2023-10-04 Thread Christoph Müllner
On Wed, Oct 4, 2023 at 9:49 AM Xianmiao Qu  wrote:
>
> From: quxm 
>
> When generating CFI directives for the store-pair instruction,
> if we add two parallel REG_FRAME_RELATED_EXPR expr_lists like
>   (expr_list:REG_FRAME_RELATED_EXPR (set (mem/c:DI (plus:DI (reg/f:DI 2 sp)
> (const_int 8 [0x8])) [1  S8 A64])
> (reg:DI 1 ra))
>   (expr_list:REG_FRAME_RELATED_EXPR (set (mem/c:DI (reg/f:DI 2 sp) [1  S8 
> A64])
> (reg:DI 8 s0))
> only the first expr_list will be recognized by the dwarf2out_frame_debug
> function. So, here we generate a SEQUENCE expression of REG_FRAME_RELATED_EXPR,
> which includes two sub-expressions of RTX_FRAME_RELATED_P. Then the
> dwarf2out_frame_debug_expr function will iterate through all the
> sub-expressions and generate the corresponding CFI directives.
>
> gcc/
> * config/riscv/thead.cc (th_mempair_save_regs): Fix missing CFI
> directives for store-pair instruction.
>
> gcc/testsuite/
> * gcc.target/riscv/xtheadmempair-4.c: New test.

LGTM, I've also tested it.

Reviewed-by: Christoph Müllner 
Tested-by: Christoph Müllner 

Thanks!

> ---
>  gcc/config/riscv/thead.cc | 11 +++
>  .../gcc.target/riscv/xtheadmempair-4.c| 29 +++
>  2 files changed, 35 insertions(+), 5 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadmempair-4.c
>
> diff --git a/gcc/config/riscv/thead.cc b/gcc/config/riscv/thead.cc
> index 507c912bc39..be0cd7c1276 100644
> --- a/gcc/config/riscv/thead.cc
> +++ b/gcc/config/riscv/thead.cc
> @@ -366,14 +366,15 @@ th_mempair_save_regs (rtx operands[4])
>  {
>rtx set1 = gen_rtx_SET (operands[0], operands[1]);
>rtx set2 = gen_rtx_SET (operands[2], operands[3]);
> +  rtx dwarf = gen_rtx_SEQUENCE (VOIDmode, rtvec_alloc (2));
>rtx insn = emit_insn (gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, set1, 
> set2)));
>RTX_FRAME_RELATED_P (insn) = 1;
>
> -  REG_NOTES (insn) = alloc_EXPR_LIST (REG_FRAME_RELATED_EXPR,
> - copy_rtx (set1), REG_NOTES (insn));
> -
> -  REG_NOTES (insn) = alloc_EXPR_LIST (REG_FRAME_RELATED_EXPR,
> - copy_rtx (set2), REG_NOTES (insn));
> +  XVECEXP (dwarf, 0, 0) = copy_rtx (set1);
> +  XVECEXP (dwarf, 0, 1) = copy_rtx (set2);
> +  RTX_FRAME_RELATED_P (XVECEXP (dwarf, 0, 0)) = 1;
> +  RTX_FRAME_RELATED_P (XVECEXP (dwarf, 0, 1)) = 1;
> +  add_reg_note (insn, REG_FRAME_RELATED_EXPR, dwarf);
>  }
>
>  /* Similar like riscv_restore_reg, but restores two registers from memory
> diff --git a/gcc/testsuite/gcc.target/riscv/xtheadmempair-4.c 
> b/gcc/testsuite/gcc.target/riscv/xtheadmempair-4.c
> new file mode 100644
> index 000..9aef4e15f8d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/xtheadmempair-4.c
> @@ -0,0 +1,29 @@
> +/* { dg-do compile } */
> +/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-g" "-Oz" "-Os" "-flto" } } */
> +/* { dg-options "-march=rv64gc_xtheadmempair -mtune=thead-c906 
> -funwind-tables" { target { rv64 } } } */
> +/* { dg-options "-march=rv32gc_xtheadmempair -mtune=thead-c906 
> -funwind-tables" { target { rv32 } } } */
> +
> +extern void bar (void);
> +
> +void foo (void)
> +{
> +  asm volatile (";my clobber list"
> +   : : : "s0");
> +  bar ();
> +  asm volatile (";my clobber list"
> +   : : : "s0");
> +}
> +
> +/* { dg-final { scan-assembler-times "th.sdd\t" 1 { target { rv64 } } } } */
> +/* { dg-final { scan-assembler ".cfi_offset 8, -16" { target { rv64 } } } } 
> */
> +/* { dg-final { scan-assembler ".cfi_offset 1, -8" { target { rv64 } } } } */
> +
> +/* { dg-final { scan-assembler-times "th.swd\t" 1 { target { rv32 } } } } */
> +/* { dg-final { scan-assembler ".cfi_offset 8, -8" { target { rv32 } } } } */
> +/* { dg-final { scan-assembler ".cfi_offset 1, -4" { target { rv32 } } } } */
> +
> +/* { dg-final { scan-assembler ".cfi_restore 1" } } */
> +/* { dg-final { scan-assembler ".cfi_restore 8" } } */
> +
> +/* { dg-final { scan-assembler-times "th.ldd\t" 1 { target { rv64 } } } } */
> +/* { dg-final { scan-assembler-times "th.lwd\t" 1 { target { rv32 } } } } */
> --
> 2.17.1
>


Re: [gcc r14-2455] riscv: Prepare backend for index registers

2023-07-17 Thread Christoph Müllner
On Mon, Jul 17, 2023 at 9:44 AM Andreas Schwab  wrote:
>
> On Jul 17 2023, Christoph Müllner wrote:
>
> > My host compiler is: gcc version 13.1.1 20230614 (Red Hat 13.1.1-4) (GCC)
>
> Too old.

Ok understood.

Thanks,
Christoph

>
> --
> Andreas Schwab, sch...@linux-m68k.org
> GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
> "And now for something completely different."


Re: [gcc r14-2455] riscv: Prepare backend for index registers

2023-07-17 Thread Christoph Müllner
On Mon, Jul 17, 2023 at 9:31 AM Andrew Pinski  wrote:
>
> On Sun, Jul 16, 2023 at 11:49 PM Christoph Müllner
>  wrote:
> >
> > On Fri, Jul 14, 2023 at 12:28 PM Andreas Schwab  
> > wrote:
> > >
> > > Why didn't you test that?
> >
> > Thanks for reporting, and sorry for introducing this warning.
> >
> > I test all patches before sending them.
> > In the case of RISC-V backend patches, I build a 2-stage
> > cross-toolchain and run all regression tests for RV32 and RV64 (using
> > QEMU).
> > Testing is done with and without patches applied to identify regressions.
> >
> > The build process shows a lot of warnings. Therefore I did not
> > investigate finding a way to use -Werror.
> > This means that looking for compiler warnings is a manual step, and I
> > might miss one while scrolling through the logs.
>
> If you are building a cross compiler, and want to clean up warnings,
> first build a native compiler and then build the cross using that.

Ok, will adjust my workflow accordingly.

> Also maybe it is finding a way to do native bootstraps on riscv to do
> testing of patches rather than doing just cross builds when testing
> backend patches.
> Especially when I have found the GCC testsuite but the bootstrap is
> more likely to find backend issues and such.

Yes, using the patch-under-testing to build a toolchain can identify
issues that the testsuite can't find. I did that a couple of times in a
QEMU environment, but I prefer the cross-toolchain approach because
it is faster. For patches that have a bigger impact, I test the toolchain
with SPEC CPU 2017.

Thanks,
Christoph

>
> Thanks,
> Andrew
>
> >
> > Sorry for the inconvenience,
> > Christoph
> >
> >
> > >
> > > ../../gcc/config/riscv/riscv.cc: In function 'int 
> > > riscv_regno_ok_for_index_p(int)':
> > > ../../gcc/config/riscv/riscv.cc:864:33: error: unused parameter 'regno' 
> > > [-Werror=unused-parameter]
> > >   864 | riscv_regno_ok_for_index_p (int regno)
> > >   | ^
> > > cc1plus: all warnings being treated as errors
> > > make[3]: *** [Makefile:2499: riscv.o] Error 1
> > >
> > > --
> > > Andreas Schwab, sch...@linux-m68k.org
> > > GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
> > > "And now for something completely different."


Re: [gcc r14-2455] riscv: Prepare backend for index registers

2023-07-17 Thread Christoph Müllner
On Mon, Jul 17, 2023 at 9:24 AM Andreas Schwab  wrote:
>
> On Jul 17 2023, Christoph Müllner wrote:
>
> > The build process shows a lot of warnings.
>
> Then you are using a bad compiler.  The build is 100% -Werror clean.

My host compiler is: gcc version 13.1.1 20230614 (Red Hat 13.1.1-4) (GCC)

Some examples:

> /home/cm/src/gcc/riscv-mainline/gcc/text-art/table.cc: In member function 
> ‘int text_art::table_geometry::table_x_to_canvas_x(int) const’:
> /home/cm/src/gcc/riscv-mainline/gcc/text-art/table.cc:561:15: warning: 
> comparison of integer expressions of different signedness: ‘int’ and 
> ‘std::vector::size_type’ {aka ‘long unsigned int’} [-Wsign-compare]
>   561 |   if (table_x == m_col_start_x.size ())
>   |   ^~~~

> /home/cm/src/gcc/riscv-mainline/gcc/text-art/table.cc: In function ‘void 
> selftest::test_spans_3()’:
> /home/cm/src/gcc/riscv-mainline/gcc/text-art/table.cc:947:62: warning: 
> unquoted keyword ‘char’ in format [-Wformat-diag]
>  947 |  "'buf' 
> (char[%i])",
>   |  ^~~~

> /home/cm/src/gcc/riscv-mainline/gcc/gengtype-lex.l: In function ‘int 
> yylex(const char**)’:
> gengtype-lex.cc:356:15: warning: this statement may fall through 
> [-Wimplicit-fallthrough=]
>   356 |  */
>   |  ~~   ^

> In file included from /home/cm/src/gcc/riscv-mainline/libgcc/unwind-dw2.c:410:
> ./md-unwind-support.h: In function 'riscv_fallback_frame_state':
> ./md-unwind-support.h:67:6: warning: assignment to 'struct sigcontext *' from 
> incompatible pointer type 'mcontext_t *' [-Wincompatible-pointer-types]
>   67 |   sc = _->uc.uc_mcontext;
>  |  ^

> /home/cm/src/gcc/riscv-mainline/libgcc/config/riscv/atomic.c:36:8: warning: 
> conflicting types for built-in function '__sync_fetch_and_add_1'; expected 
> 'unsigned char(volatile void *, unsigned char)' 
> [-Wbuiltin-declaration-mismatch]
>36 |   type __sync_fetch_and_ ## opname ## _ ## size (type *p, type v) 
>   \
>   |^

Please let me know if I am doing something wrong.

BR
Christoph


>
> --
> Andreas Schwab, sch...@linux-m68k.org
> GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
> "And now for something completely different."


Re: [gcc r14-2455] riscv: Prepare backend for index registers

2023-07-17 Thread Christoph Müllner
On Fri, Jul 14, 2023 at 12:28 PM Andreas Schwab  wrote:
>
> Why didn't you test that?

Thanks for reporting, and sorry for introducing this warning.

I test all patches before sending them.
In the case of RISC-V backend patches, I build a 2-stage
cross-toolchain and run all regression tests for RV32 and RV64 (using
QEMU).
Testing is done with and without patches applied to identify regressions.

The build process shows a lot of warnings. Therefore I did not
investigate finding a way to use -Werror.
This means that looking for compiler warnings is a manual step, and I
might miss one while scrolling through the logs.

Sorry for the inconvenience,
Christoph


>
> ../../gcc/config/riscv/riscv.cc: In function 'int 
> riscv_regno_ok_for_index_p(int)':
> ../../gcc/config/riscv/riscv.cc:864:33: error: unused parameter 'regno' 
> [-Werror=unused-parameter]
>   864 | riscv_regno_ok_for_index_p (int regno)
>   | ^
> cc1plus: all warnings being treated as errors
> make[3]: *** [Makefile:2499: riscv.o] Error 1
>
> --
> Andreas Schwab, sch...@linux-m68k.org
> GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
> "And now for something completely different."


Re: [PATCH] riscv: thead: Fix failing XTheadCondMov tests (indirect-rv[32|64])

2023-07-12 Thread Christoph Müllner
On Wed, Jul 12, 2023 at 4:05 AM Jeff Law  wrote:
>
>
>
> On 7/10/23 22:44, Christoph Muellner wrote:
> > From: Christoph Müllner 
> >
> > Recently, two identical XTheadCondMov tests have been added, which both 
> > fail.
> > Let's fix that by changing the following:
> > * Merge both files into one (no need for separate tests for rv32 and rv64)
> > * Drop unrelated attribute check test (we already test for `th.mveqz`
> >and `th.mvnez` instructions, so there is little additional value)
> > * Fix the pattern to allow matching
> >
> > gcc/testsuite/ChangeLog:
> >
> >   * gcc.target/riscv/xtheadcondmov-indirect-rv32.c: Moved to...
> >   * gcc.target/riscv/xtheadcondmov-indirect.c: ...here.
> >   * gcc.target/riscv/xtheadcondmov-indirect-rv64.c: Removed.
> I thought this stuff got fixed recently.  Certainly happy to see the
> files merged though.  Here's what I got from the July 4 run:

I have the following with a GCC master from today
(a454325bea77a0dd79415480d48233a7c296bc0a):

FAIL: gcc.target/riscv/xtheadcondmov-indirect-rv32.c   -O2
scan-assembler .attribute arch,
"rv32i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_xtheadcondmov1p0"
FAIL: gcc.target/riscv/xtheadcondmov-indirect-rv64.c   -O2
scan-assembler .attribute arch,
"rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_xtheadcondmov1p0"

With this patch the fails are gone.

BR
Christoph

>
> > UNSUPPORTED: gcc.target/riscv/xtheadcondmov-indirect-rv32.c   -O0
> > UNSUPPORTED: gcc.target/riscv/xtheadcondmov-indirect-rv32.c   -O1
> > PASS: gcc.target/riscv/xtheadcondmov-indirect-rv32.c   -O2  (test for 
> > excess errors)
> > PASS: gcc.target/riscv/xtheadcondmov-indirect-rv32.c   -O2   
> > check-function-bodies ConEmv_imm_imm_reg
> > PASS: gcc.target/riscv/xtheadcondmov-indirect-rv32.c   -O2   
> > check-function-bodies ConEmv_imm_reg_reg
> > PASS: gcc.target/riscv/xtheadcondmov-indirect-rv32.c   -O2   
> > check-function-bodies ConEmv_reg_imm_reg
> > PASS: gcc.target/riscv/xtheadcondmov-indirect-rv32.c   -O2   
> > check-function-bodies ConEmv_reg_reg_reg
> > PASS: gcc.target/riscv/xtheadcondmov-indirect-rv32.c   -O2   
> > check-function-bodies ConNmv_imm_imm_reg
> > PASS: gcc.target/riscv/xtheadcondmov-indirect-rv32.c   -O2   
> > check-function-bodies ConNmv_imm_reg_reg
> > PASS: gcc.target/riscv/xtheadcondmov-indirect-rv32.c   -O2   
> > check-function-bodies ConNmv_reg_imm_reg
> > PASS: gcc.target/riscv/xtheadcondmov-indirect-rv32.c   -O2   
> > check-function-bodies ConNmv_reg_reg_reg
> > PASS: gcc.target/riscv/xtheadcondmov-indirect-rv32.c   -O2   scan-assembler 
> > .attribute arch, 
> > "rv32i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_xtheadcondmov1p0"
> > UNSUPPORTED: gcc.target/riscv/xtheadcondmov-indirect-rv32.c   -O3 -g
> > UNSUPPORTED: gcc.target/riscv/xtheadcondmov-indirect-rv32.c   -Os
> > UNSUPPORTED: gcc.target/riscv/xtheadcondmov-indirect-rv32.c   -O2 -flto 
> > -fno-use-linker-plugin -flto-partition=none
> > UNSUPPORTED: gcc.target/riscv/xtheadcondmov-indirect-rv32.c   -O2 -flto 
> > -fuse-linker-plugin -fno-fat-lto-objects
> > UNSUPPORTED: gcc.target/riscv/xtheadcondmov-indirect-rv64.c   -O0
> > UNSUPPORTED: gcc.target/riscv/xtheadcondmov-indirect-rv64.c   -O1
> > PASS: gcc.target/riscv/xtheadcondmov-indirect-rv64.c   -O2  (test for 
> > excess errors)
> > PASS: gcc.target/riscv/xtheadcondmov-indirect-rv64.c   -O2   
> > check-function-bodies ConEmv_imm_imm_reg
> > PASS: gcc.target/riscv/xtheadcondmov-indirect-rv64.c   -O2   
> > check-function-bodies ConEmv_imm_reg_reg
> > PASS: gcc.target/riscv/xtheadcondmov-indirect-rv64.c   -O2   
> > check-function-bodies ConEmv_reg_imm_reg
> > PASS: gcc.target/riscv/xtheadcondmov-indirect-rv64.c   -O2   
> > check-function-bodies ConEmv_reg_reg_reg
> > PASS: gcc.target/riscv/xtheadcondmov-indirect-rv64.c   -O2   
> > check-function-bodies ConNmv_imm_imm_reg
> > PASS: gcc.target/riscv/xtheadcondmov-indirect-rv64.c   -O2   
> > check-function-bodies ConNmv_imm_reg_reg
> > PASS: gcc.target/riscv/xtheadcondmov-indirect-rv64.c   -O2   
> > check-function-bodies ConNmv_reg_imm_reg
> > PASS: gcc.target/riscv/xtheadcondmov-indirect-rv64.c   -O2   
> > check-function-bodies ConNmv_reg_reg_reg
> > PASS: gcc.target/riscv/xtheadcondmov-indirect-rv64.c   -O2   scan-assembler 
> > .attribute arch, 
> > "rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_xtheadcondmov1p0"
> > UNSUPPORTED: gcc.target/riscv/xtheadcondmov-indirect-rv64.c   -O3 -g
> > UNSUPPORTED: gcc.target/riscv/xtheadcondmov-indirect-rv64.c   -Os
> > UNSUPPORTED: gcc.target/riscv/xtheadcondmov-indirect-rv64.c   -O2 -flto 
> > -fno-use-linker-plugin -flto-partition=none
> > UNSUPPORTED: gcc.target/riscv/xtheadcondmov-indirect-rv64.c   -O2 -flto 
> > -fuse-linker-plugin -fno-fat-lto-objects
>
>
> jeff


Re: [PATCH 1/1] riscv: thead: Fix ICE when enable XTheadMemPair ISA extension.

2023-07-11 Thread Christoph Müllner
Hi Kito,

I take some of the blame because I have sent a series
that consisted of fixes followed by new features.

You have ack'ed patches 1-9 from the series.
The last two patches (for XTheadMemIdx and XTheadFMemIdx) were
later reviewed by Jeff and need a bit of rework and more testing.

If it helps, you can find patches 1-9 rebased and retested here:
  https://github.com/cmuellner/gcc/tree/riscv-thead-improvements

I have also sent out a fix for two failing T-Head tests earlier today:
  https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624049.html
It would be great if you could look at that and push that as well, if it is ok.

Thanks,
Christoph



On Tue, Jul 11, 2023 at 5:51 PM Kito Cheng  wrote:
>
> Hi Christoph:
>
> Oops, I thought Philipp would push those patches. Are there any other
> patches that got approved but not committed? I can help push those
> patches tomorrow.
>
> On Tue, Jul 11, 2023 at 11:42 PM Christoph Müllner
>  wrote:
> >
> > Hi Cooper,
> >
> > I addressed this in April this year.
> > It even got an "ok", but nobody pushed it:
> >   https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616972.html
> >
> > BR
> > Christoph
> >
> > On Tue, Jul 11, 2023 at 5:39 PM Xianmiao Qu  
> > wrote:
> > >
> > > > The frame-related load/store instructions should not be
> > > > scheduled between each other, and the REG_FRAME_RELATED_EXPR
> > > > expression note should be added to those instructions
> > > > to prevent this.
> > > > This bug causes an ICE during GCC bootstrap, and it also causes an ICE
> > > > in the simplified case mempair-4.c, where compilation fails with:
> > > during RTL pass: dwarf2
> > > theadmempair-4.c:20:1: internal compiler error: in 
> > > dwarf2out_frame_debug_cfa_offset, at dwarf2cfi.cc:1376
> > > 0xa8c017 dwarf2out_frame_debug_cfa_offset
> > > ../../../gcc/gcc/dwarf2cfi.cc:1376
> > > 0xa8c017 dwarf2out_frame_debug
> > > ../../../gcc/gcc/dwarf2cfi.cc:2285
> > > 0xa8c017 scan_insn_after
> > > ../../../gcc/gcc/dwarf2cfi.cc:2726
> > > 0xa8cc97 scan_trace
> > > ../../../gcc/gcc/dwarf2cfi.cc:2893
> > > 0xa8d84d create_cfi_notes
> > > ../../../gcc/gcc/dwarf2cfi.cc:2933
> > > 0xa8d84d execute_dwarf2_frame
> > > ../../../gcc/gcc/dwarf2cfi.cc:3309
> > > 0xa8d84d execute
> > > ../../../gcc/gcc/dwarf2cfi.cc:3799
> > >
> > > gcc/ChangeLog:
> > >
> > > * config/riscv/thead.cc (th_mempair_save_regs): Add
> > > > REG_FRAME_RELATED_EXPR note for mempair instructions.
> > >
> > > gcc/testsuite/ChangeLog:
> > > * gcc.target/riscv/xtheadmempair-4.c: New test.
> > > ---
> > >  gcc/config/riscv/thead.cc |  6 +++--
> > >  .../gcc.target/riscv/xtheadmempair-4.c| 26 +++
> > >  2 files changed, 30 insertions(+), 2 deletions(-)
> > >  create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadmempair-4.c
> > >
> > > diff --git a/gcc/config/riscv/thead.cc b/gcc/config/riscv/thead.cc
> > > index 75203805310..2df709226f9 100644
> > > --- a/gcc/config/riscv/thead.cc
> > > +++ b/gcc/config/riscv/thead.cc
> > > @@ -366,10 +366,12 @@ th_mempair_save_regs (rtx operands[4])
> > >  {
> > >rtx set1 = gen_rtx_SET (operands[0], operands[1]);
> > >rtx set2 = gen_rtx_SET (operands[2], operands[3]);
> > > +  rtx dwarf = gen_rtx_SEQUENCE (VOIDmode, rtvec_alloc (2));
> > >rtx insn = emit_insn (gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, set1, 
> > > set2)));
> > >RTX_FRAME_RELATED_P (insn) = 1;
> > > -  add_reg_note (insn, REG_CFA_OFFSET, copy_rtx (set1));
> > > -  add_reg_note (insn, REG_CFA_OFFSET, copy_rtx (set2));
> > > +  XVECEXP (dwarf, 0, 0) = copy_rtx (set1);
> > > +  XVECEXP (dwarf, 0, 1) = copy_rtx (set2);
> > > +  add_reg_note (insn, REG_FRAME_RELATED_EXPR, dwarf);
> > >  }
> > >
> > >  /* Similar like riscv_restore_reg, but restores two registers from memory
> > > diff --git a/gcc/testsuite/gcc.target/riscv/xtheadmempair-4.c 
> > > b/gcc/testsuite/gcc.target/riscv/xtheadmempair-4.c
> > > new file mode 100644
> > > index 000..d653f056ef4
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.target/riscv/xtheadmempair-4.c
> > > @@ -0,0 +1,26 @@
> > > +/* { dg-do compile } */
> > > > +/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-g" "-Oz" "-Os"

Re: [PATCH 1/1] riscv: thead: Fix ICE when enable XTheadMemPair ISA extension.

2023-07-11 Thread Christoph Müllner
Hi Cooper,

I addressed this in April this year.
It even got an "ok", but nobody pushed it:
  https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616972.html

BR
Christoph

On Tue, Jul 11, 2023 at 5:39 PM Xianmiao Qu  wrote:
>
> The frame-related load/store instructions should not be
> scheduled between each other, and the REG_FRAME_RELATED_EXPR
> expression note should be added to those instructions
> to prevent this.
> This bug causes an ICE during GCC bootstrap, and it also causes an ICE
> in the simplified case mempair-4.c, where compilation fails with:
> during RTL pass: dwarf2
> theadmempair-4.c:20:1: internal compiler error: in 
> dwarf2out_frame_debug_cfa_offset, at dwarf2cfi.cc:1376
> 0xa8c017 dwarf2out_frame_debug_cfa_offset
> ../../../gcc/gcc/dwarf2cfi.cc:1376
> 0xa8c017 dwarf2out_frame_debug
> ../../../gcc/gcc/dwarf2cfi.cc:2285
> 0xa8c017 scan_insn_after
> ../../../gcc/gcc/dwarf2cfi.cc:2726
> 0xa8cc97 scan_trace
> ../../../gcc/gcc/dwarf2cfi.cc:2893
> 0xa8d84d create_cfi_notes
> ../../../gcc/gcc/dwarf2cfi.cc:2933
> 0xa8d84d execute_dwarf2_frame
> ../../../gcc/gcc/dwarf2cfi.cc:3309
> 0xa8d84d execute
> ../../../gcc/gcc/dwarf2cfi.cc:3799
>
> gcc/ChangeLog:
>
> * config/riscv/thead.cc (th_mempair_save_regs): Add
> REG_FRAME_RELATED_EXPR note for mempair instructions.
>
> gcc/testsuite/ChangeLog:
> * gcc.target/riscv/xtheadmempair-4.c: New test.
> ---
>  gcc/config/riscv/thead.cc |  6 +++--
>  .../gcc.target/riscv/xtheadmempair-4.c| 26 +++
>  2 files changed, 30 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadmempair-4.c
>
> diff --git a/gcc/config/riscv/thead.cc b/gcc/config/riscv/thead.cc
> index 75203805310..2df709226f9 100644
> --- a/gcc/config/riscv/thead.cc
> +++ b/gcc/config/riscv/thead.cc
> @@ -366,10 +366,12 @@ th_mempair_save_regs (rtx operands[4])
>  {
>rtx set1 = gen_rtx_SET (operands[0], operands[1]);
>rtx set2 = gen_rtx_SET (operands[2], operands[3]);
> +  rtx dwarf = gen_rtx_SEQUENCE (VOIDmode, rtvec_alloc (2));
>rtx insn = emit_insn (gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, set1, 
> set2)));
>RTX_FRAME_RELATED_P (insn) = 1;
> -  add_reg_note (insn, REG_CFA_OFFSET, copy_rtx (set1));
> -  add_reg_note (insn, REG_CFA_OFFSET, copy_rtx (set2));
> +  XVECEXP (dwarf, 0, 0) = copy_rtx (set1);
> +  XVECEXP (dwarf, 0, 1) = copy_rtx (set2);
> +  add_reg_note (insn, REG_FRAME_RELATED_EXPR, dwarf);
>  }
>
>  /* Similar like riscv_restore_reg, but restores two registers from memory
> diff --git a/gcc/testsuite/gcc.target/riscv/xtheadmempair-4.c 
> b/gcc/testsuite/gcc.target/riscv/xtheadmempair-4.c
> new file mode 100644
> index 000..d653f056ef4
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/xtheadmempair-4.c
> @@ -0,0 +1,26 @@
> +/* { dg-do compile } */
> +/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-g" "-Oz" "-Os" "-flto" } } */
> +/* { dg-options "-march=rv64gc_xtheadmempair -O2 -g -mtune=thead-c906" { 
> target { rv64 } } } */
> +/* { dg-options "-march=rv32gc_xtheadmempair -O2 -g -mtune=thead-c906" { 
> target { rv32 } } } */
> +
> +void a();
> +void b(char *);
> +void m_fn1(int);
> +int e;
> +
> +int foo(int ee, int f, int g) {
> +  char *h = (char *)__builtin_alloca(1);
> +  b(h);
> +  b("");
> +  int i = ee;
> +  e = g;
> +  m_fn1(f);
> +  a();
> +  e = i;
> +}
> +
> +/* { dg-final { scan-assembler-times "th.ldd\t" 3 { target { rv64 } } } } */
> +/* { dg-final { scan-assembler-times "th.sdd\t" 3 { target { rv64 } } } } */
> +
> +/* { dg-final { scan-assembler-times "th.lwd\t" 3 { target { rv32 } } } } */
> +/* { dg-final { scan-assembler-times "th.swd\t" 3 { target { rv32 } } } } */
> --
> 2.17.1
>


Re: [PATCH 10/11] riscv: thead: Add support for the XTheadMemIdx ISA extension

2023-07-06 Thread Christoph Müllner
On Thu, Jun 29, 2023 at 4:09 PM Jeff Law  wrote:
>
>
>
> On 6/29/23 01:39, Christoph Müllner wrote:
> > On Wed, Jun 28, 2023 at 8:23 PM Jeff Law  wrote:
> >>
> >>
> >>
> >> On 6/28/23 06:39, Christoph Müllner wrote:
> >>
> >>>>> +;; XTheadMemIdx overview:
> >>>>> +;; All peephole passes attempt to improve the operand utilization of
> >>>>> +;; XTheadMemIdx instructions, where one sign or zero extended
> >>>>> +;; register-index-operand can be shifted left by a 2-bit immediate.
> >>>>> +;;
> >>>>> +;; The basic idea is the following optimization:
> >>>>> +;; (set (reg 0) (op (reg 1) (imm 2)))
> >>>>> +;; (set (reg 3) (mem (plus (reg 0) (reg 4))))
> >>>>> +;; ==>
> >>>>> +;; (set (reg 3) (mem (plus (reg 4) (op2 (reg 1) (imm 2)))))
> >>>>> +;; This optimization is only valid if (reg 0) has no further uses.
> >>>> Couldn't this be done by combine if you created define_insn patterns
> >>>> rather than define_peephole2 patterns?  Similarly for the other cases
> >>>> handled here.
> >>>
> >>> I was inspired by XTheadMemPair, which merges two memory accesses
> >>> into a mem-pair instruction (and which got inspiration from
> >>> gcc/config/aarch64/aarch64-ldpstp.md).
> >> Right.  I'm pretty familiar with those.  They cover a different case,
> >> specifically the two insns being optimized don't have a true data
> >> dependency between them.  ie, the first instruction does not produce a
> >> result used in the second insn.
> >>
> >>
> >> In the case above there is a data dependency on reg0.  ie, the first
> >> instruction generates a result used in the second instruction.  combine
> >> is usually the best place to handle the data dependency case.
> >
> > Ok, understood.
> >
> > It is a bit of a special case here, because the peephole is restricted
> > to those cases, where reg0 is not used elsewhere (peep2_reg_dead_p()).
> > I have not seen how to do this for combiner optimizations.
> If the value is used elsewhere, then the combiner will generate a
> parallel with two sets.  If the value dies, then the combiner generates
> the one set.  ie given
>
> (set (t) (op0 (a) (b)))
> (set (r) (op1 (c) (t)))
>
> If "t" is dead, then combine will present you with:
>
> (set (r) (op1 (c) (op0 (a) (b))))
>
> If "t" is used elsewhere, then combine will present you with:
>
> (parallel
>    [(set (r) (op1 (c) (op0 (a) (b))))
>     (set (t) (op0 (a) (b)))])
>
> Which makes perfect sense if you think about it for a while.  If you
> still need "t", then the first sequence simply isn't valid as it doesn't
> preserve that side effect.  Hence it tries to produce a sequence with
> the combined operation, but with the side effect of the first statement
> included as well.

Thanks for this!
Of course I was "lucky" and ran into the issue that the patterns did not match,
because of unexpected MULT insns where ASHIFTs were expected.
But after reading enough of combine.cc I understood that this is on purpose
(for addresses) and I have to adjust my INSNs accordingly.

I've changed the patches for XTheadMemIdx and XTheadFMemIdx and will
send out a new series.

Thanks,
Christoph
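
As an illustration of the two shapes combine can present (a single set
when the temporary dies, a parallel when it is still needed), here is a
minimal C sketch. The function names are made up for this example and are
not part of the patch. Whether an XTheadMemIdx indexed load is actually
selected depends on the compiler version and flags, which this sketch
does not verify.

```c
#include <stddef.h>

/* The scaled index is consumed only by the address computation, so the
   temporary holding it is dead after the load.  This is the case where
   combine can offer a single set with the scaling folded into the
   address (seen as a MULT rather than an ASHIFT inside the address).  */
long load_only(const long *base, unsigned long idx)
{
  return base[idx];
}

/* Here the scaled offset is also stored through 'off', so the temporary
   is still live after the load.  A combined pattern would have to keep
   the side effect, which is what the parallel with two sets expresses.  */
long load_and_keep(const long *base, unsigned long idx, unsigned long *off)
{
  unsigned long scaled = idx * sizeof(long); /* a shift by 3 on rv64 */
  *off = scaled;
  return *(const long *)((const char *)base + scaled);
}
```

Both functions compute the same address arithmetic; only the liveness of
the intermediate value differs.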


Re: [PATCH 10/11] riscv: thead: Add support for the XTheadMemIdx ISA extension

2023-06-29 Thread Christoph Müllner
On Wed, Jun 28, 2023 at 8:23 PM Jeff Law  wrote:
>
>
>
> On 6/28/23 06:39, Christoph Müllner wrote:
>
> >>> +;; XTheadMemIdx overview:
> >>> +;; All peephole passes attempt to improve the operand utilization of
> >>> +;; XTheadMemIdx instructions, where one sign or zero extended
> >>> +;; register-index-operand can be shifted left by a 2-bit immediate.
> >>> +;;
> >>> +;; The basic idea is the following optimization:
> >>> +;; (set (reg 0) (op (reg 1) (imm 2)))
> >>> +;; (set (reg 3) (mem (plus (reg 0) (reg 4))))
> >>> +;; ==>
> >>> +;; (set (reg 3) (mem (plus (reg 4) (op2 (reg 1) (imm 2)))))
> >>> +;; This optimization is only valid if (reg 0) has no further uses.
> >> Couldn't this be done by combine if you created define_insn patterns
> >> rather than define_peephole2 patterns?  Similarly for the other cases
> >> handled here.
> >
> > I was inspired by XTheadMemPair, which merges two memory accesses
> > into a mem-pair instruction (and which got inspiration from
> > gcc/config/aarch64/aarch64-ldpstp.md).
> Right.  I'm pretty familiar with those.  They cover a different case,
> specifically the two insns being optimized don't have a true data
> dependency between them.  ie, the first instruction does not produce a
> result used in the second insn.
>
>
> In the case above there is a data dependency on reg0.  ie, the first
> instruction generates a result used in the second instruction.  combine
> is usually the best place to handle the data dependency case.

Ok, understood.

It is a bit of a special case here, because the peephole is restricted
to those cases, where reg0 is not used elsewhere (peep2_reg_dead_p()).
I have not seen how to do this for combiner optimizations.

I found sh_remove_reg_dead_or_unused_notes(), which tests for reg notes
on a given rtx_insn. In our case we have a pattern that matches two insns,
where we have to test if one operand (reg0) is dead or unused after the second
insn. The first insn can be accessed with "curr_insn", but I did not see how to
access the second matching insn. Any ideas or hints?

Thanks,
Christoph



>
>
> >
> > I don't see the benefit of using combine or peephole, but I can change
> > if necessary. At least for the provided test cases, the implementation
> > works quite well.
> Peepholes require the instructions to be consecutive in the stream while
> combine relies on data dependence links and can thus find these
> opportunities even when the two insns we care about are separated by
> unrelated other insns.
>
>
> Jeff
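
The difference between adjacency-based and dependence-based matching can
be sketched with a toy model. This is not GCC's RTL or its real data
structures; struct toy_insn and both helper functions are hypothetical
and exist only to show why an unrelated instruction between producer and
consumer defeats a peephole-style match but not a def-use search.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy instruction: dest = op(src1, src2).  Registers are plain ints and
   -1 marks an unused operand.  Purely a model, not GCC's RTL.  */
struct toy_insn {
  int dest;
  int src1;
  int src2;
};

static bool uses_reg(const struct toy_insn *insn, int reg)
{
  return insn->src1 == reg || insn->src2 == reg;
}

/* Peephole-style: the consumer must be the very next instruction.
   Returns the consumer's index, or -1 if there is no adjacent consumer.  */
int find_user_adjacent(const struct toy_insn *insns, size_t n, size_t def)
{
  if (def + 1 < n && uses_reg(&insns[def + 1], insns[def].dest))
    return (int)(def + 1);
  return -1;
}

/* Combine-style: follow the data dependence to the first later user,
   skipping unrelated instructions.  Gives up if the register is
   redefined before any use.  Returns the user's index or -1.  */
int find_user_by_dependence(const struct toy_insn *insns, size_t n, size_t def)
{
  int reg = insns[def].dest;
  for (size_t i = def + 1; i < n; i++) {
    if (uses_reg(&insns[i], reg))
      return (int)i;
    if (insns[i].dest == reg)
      return -1;
  }
  return -1;
}
```

With a sequence such as "r10 = op(r1)", "r11 = op(r5, r6)",
"r3 = op(r10, r4)", the adjacent search finds nothing for the first
instruction, while the dependence search reaches the third one.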


Re: [PATCH 11/11] riscv: thead: Add support for the XTheadFMemIdx ISA extension

2023-06-28 Thread Christoph Müllner
On Sat, Jun 10, 2023 at 7:54 PM Jeff Law  wrote:
>
>
>
> On 4/28/23 00:23, Christoph Muellner wrote:
> > From: Christoph Müllner 
> >
> > The XTheadFMemIdx ISA extension provides additional load and store
> > instructions for floating-point registers with new addressing modes.
> >
> > The following memory access types are supported:
> > * ftype = [w,d] (single-precision, double-precision)
> >
> > The following addressing modes are supported:
> > * register offset with additional immediate offset (4 instructions):
> >flr, fsr
> > * zero-extended register offset with additional immediate offset
> >(4 instructions): flur, fsur
> >
> > These addressing modes are also part of the similar XTheadMemIdx
> > ISA extension support, whose code is reused and extended to support
> > floating-point registers.
> >
> > gcc/ChangeLog:
> >
> >   * config/riscv/riscv.cc (riscv_index_reg_class): Also allow
> >   for XTheadFMemIdx.
> >   (riscv_regno_ok_for_index_p): Likewise.
> >   * config/riscv/thead-peephole.md (TARGET_64BIT):
> >   Generalize peepholes for XTheadFMemIdx.
> >   * config/riscv/thead.cc (is_fmemidx_mode): New function.
> >   (th_memidx_classify_address_index): Add support for
> >   XTheadFMemIdx.
> >   (th_fmemidx_output_index): New function.
> >   (th_output_move): Add support for XTheadFMemIdx.
> >   * config/riscv/thead.md (*th_fmemidx_movsf_hardfloat): New INSN.
> >   (*th_fmemidx_movdf_hardfloat_rv64): Likewise.
> >
> > gcc/testsuite/ChangeLog:
> >
> >   * gcc.target/riscv/xtheadmemidx-helpers.h: Add helpers for
> > XTheadFMemIdx.
> >   * gcc.target/riscv/xtheadfmemidx-index-update.c: New test.
> >   * gcc.target/riscv/xtheadfmemidx-index-xtheadbb-update.c: New test.
> >   * gcc.target/riscv/xtheadfmemidx-index-xtheadbb.c: New test.
> >   * gcc.target/riscv/xtheadfmemidx-index.c: New test.
> >   * gcc.target/riscv/xtheadfmemidx-uindex-update.c: New test.
> >   * gcc.target/riscv/xtheadfmemidx-uindex-xtheadbb-update.c: New test.
> >   * gcc.target/riscv/xtheadfmemidx-uindex-xtheadbb.c: New test.
> >   * gcc.target/riscv/xtheadfmemidx-uindex.c: New test.
> Same core questions/comments as in patch #10 of this series.

The basic support for this extension is already merged.

The documentation can be found here:
  https://github.com/T-head-Semi/thead-extension-spec/tree/master

The extension's name and a link to the documentation have also been
registered here:
  https://github.com/riscv-non-isa/riscv-toolchain-conventions#list-of-vendor-extensions

The XTheadFMemIdx extension is part of the T-Head C906 and C910 SoCs.
The C906 was launched in October 2021.

Thanks,
Christoph

>
> jeff
>
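
For readers unfamiliar with the two addressing modes listed in the patch
description, the following C functions show the kind of source code they
are aimed at. This is a sketch only: the function names are invented
here, and the mapping to the flr/fsr and flur/fsur instruction forms is
the intent described in the commit message, not something this example
checks.

```c
/* Signed register index: a candidate for the register-offset form
   (flr for loads, fsr for stores).  */
double load_double_reg(const double *base, long idx)
{
  return base[idx];
}

/* Unsigned 32-bit index that must be zero-extended on rv64: a candidate
   for the zero-extended register-offset form (flur for loads).  */
double load_double_ureg(const double *base, unsigned int idx)
{
  return base[idx];
}

/* Single-precision store through a register index (fsr form).  */
void store_float_reg(float *base, long idx, float value)
{
  base[idx] = value;
}
```

In each case the element size (8 bytes for double, 4 for float) is the
shift amount that the addressing mode is expected to absorb.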


Re: [PATCH 10/11] riscv: thead: Add support for the XTheadMemIdx ISA extension

2023-06-28 Thread Christoph Müllner
On Sat, Jun 10, 2023 at 7:53 PM Jeff Law  wrote:
>
>
>
> On 4/28/23 00:23, Christoph Muellner wrote:
> > From: Christoph Müllner 
> >
> > The XTheadMemIdx ISA extension provides additional load and store
> > instructions with new addressing modes.
> >
> > The following memory access types are supported:
> > * ltype = [b,bu,h,hu,w,wu,d]
> > * stype = [b,h,w,d]
> >
> > The following addressing modes are supported:
> > * immediate offset with PRE_MODIFY or POST_MODIFY (22 instructions):
> >l.ia, l.ib, s.ia, s.ib
> > * register offset with additional immediate offset (11 instructions):
> >lr, sr
> > * zero-extended register offset with additional immediate offset
> >(11 instructions): lur, sur
> >
> > The RISC-V base ISA does not support index registers, so the changes
> > are kept separate from the RISC-V standard support.
> >
> > Similar to other extensions (Zbb, XTheadBb), this patch needs to
> > prevent the conversion of sign-extensions/zero-extensions into
> > shift instructions. The case of the zero-extended register offset
> > addressing mode is handled by a new peephole pass.
> >
> > Handling the different cases of extensions results in a couple of INSNs
> > that look redundant on first view, but they are just the equivalent
> > of what we already have for Zbb as well. The only difference is that
> > we have many more load instructions.
> >
> > To fully utilize the capabilities of the instructions, there are
> > a few new peephole passes which fold shift amounts into the RTX
> > if possible. The added tests ensure that this feature won't
> > regress without notice.
> >
> > We already have a constraint with the name 'th_f_fmv', therefore,
> > the new constraints follow this pattern and have the same length
> > as required ('th_m_mia', 'th_m_mib', 'th_m_mir', 'th_m_miu').
> >
> > gcc/ChangeLog:
> >
> >   * config/riscv/constraints.md (th_m_mia): New constraint.
> >   (th_m_mib): Likewise.
> >   (th_m_mir): Likewise.
> >   (th_m_miu): Likewise.
> >   * config/riscv/riscv-protos.h (enum riscv_address_type):
> >   Add new address types ADDRESS_REG_REG, ADDRESS_REG_UREG,
> >   and ADDRESS_REG_WB and their documentation.
> >   (struct riscv_address_info): Add new field 'shift' and
> >   document the field usage for the new address types.
> >   (riscv_valid_base_register_p): New prototype.
> >   (th_memidx_legitimate_modify_p): Likewise.
> >   (th_memidx_legitimate_index_p): Likewise.
> >   (th_classify_address): Likewise.
> >   (th_output_move): Likewise.
> >   (th_print_operand_address): Likewise.
> >   * config/riscv/riscv.cc (riscv_index_reg_class):
> >   Return GR_REGS for XTheadMemIdx.
> >   (riscv_regno_ok_for_index_p): Add support for XTheadMemIdx.
> >   (riscv_classify_address): Call th_classify_address() on top.
> >   (riscv_output_move): Call th_output_move() on top.
> >   (riscv_print_operand_address): Call th_print_operand_address()
> >   on top.
> >   * config/riscv/riscv.h (HAVE_POST_MODIFY_DISP): New macro.
> >   (HAVE_PRE_MODIFY_DISP): Likewise.
> >   * config/riscv/riscv.md (zero_extendqi2): Disable
> >   for XTheadMemIdx.
> >   (*zero_extendqi2_internal): Convert to expand,
> >   create INSN with same name and disable it for XTheadMemIdx.
> >   (extendsidi2): Likewise.
> >   (*extendsidi2_internal): Disable for XTheadMemIdx.
> >   * config/riscv/thead-peephole.md: Add helper peephole passes.
> >   * config/riscv/thead.cc (valid_signed_immediate): New helper
> >   function.
> >   (th_memidx_classify_address_modify): New function.
> >   (th_memidx_legitimate_modify_p): Likewise.
> >   (th_memidx_output_modify): Likewise.
> >   (is_memidx_mode): Likewise.
> >   (th_memidx_classify_address_index): Likewise.
> >   (th_memidx_legitimate_index_p): Likewise.
> >   (th_memidx_output_index): Likewise.
> >   (th_classify_address): Likewise.
> >   (th_output_move): Likewise.
> >   (th_print_operand_address): Likewise.
> >   * config/riscv/thead.md (*th_memidx_mov2):
> >   New INSN.
> >   (*th_memidx_zero_extendqi2): Likewise.
> >   (*th_memidx_extendsidi2): Likewise
> >   (*th_memidx_zero_extendsidi2): Likewise.
> >   (*th_memidx_zero_extendhi2): Likewise.
> >   (*th_memidx_extend2): Likewise
> >   (*th_memidx_bb_zero_extendsidi

Re: [PATCH v8] RISC-V: Add the 'zfa' extension, version 0.2.

2023-05-05 Thread Christoph Müllner
On Fri, May 5, 2023 at 5:13 PM Palmer Dabbelt  wrote:
>
> On Fri, 05 May 2023 08:04:53 PDT (-0700), christoph.muell...@vrull.eu wrote:
> > What I forgot to mention:
> > Zfa is frozen and in public review:
> >   https://groups.google.com/a/groups.riscv.org/g/isa-dev/c/SED4ntBkabg
>
> Thanks, I'd also forgotten to send that out ;).
>
> I think the only blocker here on the specification side is the assembly
> format for FLI?  It looks like the feedback on
> <https://github.com/riscv-non-isa/riscv-asm-manual/pull/85> has been
> pretty minor so far.  It'd be nice to have the docs lined up before
> we merge, but we could always just call it a GNU extension -- we've
> already got a lot of that in assembler land, so I don't think it's that
> big of a deal.

I also don't think that we need to wait for that PR to land.

Nelson already gave his ok on the Binutils v4 (but after ratification,
not freeze):
  https://sourceware.org/pipermail/binutils/2023-April/127027.html

FWIW, I have meanwhile sent out a v5 for Binutils as well (a few
changes were requested).
And the v5 has been rebased and retested as well.

>
> >
> > On Fri, May 5, 2023 at 5:03 PM Christoph Müllner
> >  wrote:
> >>
> >> On Wed, Apr 19, 2023 at 11:58 AM Jin Ma  wrote:
> >> >
> >> > This patch adds the 'Zfa' extension for riscv, which is based on:
> >> >   https://github.com/riscv/riscv-isa-manual/commits/zfb
> >> >   
> >> > https://github.com/riscv/riscv-isa-manual/commit/1f038182810727f5feca311072e630d6baac51da
> >> >
> >> > The binutils-gdb for 'Zfa' extension:
> >> >   https://github.com/a4lg/binutils-gdb/commits/riscv-zfa
> >> >
> >> > What needs special explanation is:
> >> > 1, The immediate of the instructions FLI.H/S/D is represented in the
> >> >   assembly as a floating-point value, with scientific notation when
> >> >   rs1 is 1 or 2, and decimal numbers for the rest.
> >> >
> >> >   Related llvm link:
> >> > https://reviews.llvm.org/D145645
> >> >   Related discussion link:
> >> > https://github.com/riscv/riscv-isa-manual/issues/980
> >> >
> >> > 2, According to riscv-spec, "The FCVTMOD.W.D instruction was added
> >> >   principally to accelerate the processing of JavaScript Numbers.",
> >> >   so it seems that no implementation is required.
> >> >
> >> > 3, The instructions FMINM and FMAXM correspond to the C23 library
> >> >   functions fminimum and fmaximum. Therefore, this patch has simply
> >> >   implemented the patterns fminm3 and fmaxm3 to prepare for later.
> >> >
> >> > gcc/ChangeLog:
> >> >
> >> > * common/config/riscv/riscv-common.cc: Add zfa extension version.
> >> > * config/riscv/constraints.md (Zf): Constrain the floating point 
> >> > number that the
> >> > instructions FLI.H/S/D can load.
> >> > ((TARGET_XTHEADFMV || TARGET_ZFA) ? FP_REGS : NO_REGS): enable 
> >> > FMVP.D.X and FMVH.X.D.
> >> > * config/riscv/iterators.md (ceil): New.
> >> > * config/riscv/riscv-protos.h 
> >> > (riscv_float_const_rtx_index_for_fli): New.
> >> > * config/riscv/riscv.cc (find_index_in_array): New.
> >> > (riscv_float_const_rtx_index_for_fli): Get the index of the 
> >> > floating-point number that
> >> > the instructions FLI.H/S/D can mov.
> >> > (riscv_cannot_force_const_mem): If instruction FLI.H/S/D can be 
> >> > used, memory is not applicable.
> >> > (riscv_const_insns): The cost of FLI.H/S/D is 3.
> >> > (riscv_legitimize_const_move): Likewise.
> >> > (riscv_split_64bit_move_p): If instruction FLI.H/S/D can be 
> >> > used, no split is required.
> >> > (riscv_output_move): Output the mov instructions in zfa 
> >> > extension.
> >> > (riscv_print_operand): Output the floating-point value of the 
> >> > FLI.H/S/D immediate in assembly
> >> > (riscv_secondary_memory_needed): Likewise.
> >> > * config/riscv/riscv.h (GP_REG_RTX_P): New.
> >> > * config/riscv/riscv.md (fminm3): New.
> >> >
> >> > gcc/testsuite/ChangeLog:
> >> >
> >> > * gcc

Re: [PATCH v8] RISC-V: Add the 'zfa' extension, version 0.2.

2023-05-05 Thread Christoph Müllner
What I forgot to mention:
Zfa is frozen and in public review:
  https://groups.google.com/a/groups.riscv.org/g/isa-dev/c/SED4ntBkabg

On Fri, May 5, 2023 at 5:03 PM Christoph Müllner
 wrote:
>
> On Wed, Apr 19, 2023 at 11:58 AM Jin Ma  wrote:
> >
> > This patch adds the 'Zfa' extension for riscv, which is based on:
> >   https://github.com/riscv/riscv-isa-manual/commits/zfb
> >   
> > https://github.com/riscv/riscv-isa-manual/commit/1f038182810727f5feca311072e630d6baac51da
> >
> > The binutils-gdb for 'Zfa' extension:
> >   https://github.com/a4lg/binutils-gdb/commits/riscv-zfa
> >
> > What needs special explanation is:
> > 1, The immediate of the instructions FLI.H/S/D is represented in the
> >   assembly as a floating-point value, with scientific notation when
> >   rs1 is 1 or 2, and decimal numbers for the rest.
> >
> >   Related llvm link:
> > https://reviews.llvm.org/D145645
> >   Related discussion link:
> > https://github.com/riscv/riscv-isa-manual/issues/980
> >
> > 2, According to riscv-spec, "The FCVTMOD.W.D instruction was added
> >   principally to accelerate the processing of JavaScript Numbers.",
> >   so it seems that no implementation is required.
> >
> > 3, The instructions FMINM and FMAXM correspond to the C23 library
> >   functions fminimum and fmaximum. Therefore, this patch has simply
> >   implemented the patterns fminm3 and fmaxm3 to prepare for later.
> >
> > gcc/ChangeLog:
> >
> > * common/config/riscv/riscv-common.cc: Add zfa extension version.
> > * config/riscv/constraints.md (Zf): Constrain the floating point 
> > number that the
> > instructions FLI.H/S/D can load.
> > ((TARGET_XTHEADFMV || TARGET_ZFA) ? FP_REGS : NO_REGS): enable 
> > FMVP.D.X and FMVH.X.D.
> > * config/riscv/iterators.md (ceil): New.
> > * config/riscv/riscv-protos.h 
> > (riscv_float_const_rtx_index_for_fli): New.
> > * config/riscv/riscv.cc (find_index_in_array): New.
> > (riscv_float_const_rtx_index_for_fli): Get the index of the 
> > floating-point number that
> > the instructions FLI.H/S/D can mov.
> > (riscv_cannot_force_const_mem): If instruction FLI.H/S/D can be 
> > used, memory is not applicable.
> > (riscv_const_insns): The cost of FLI.H/S/D is 3.
> > (riscv_legitimize_const_move): Likewise.
> > (riscv_split_64bit_move_p): If instruction FLI.H/S/D can be used, 
> > no split is required.
> > (riscv_output_move): Output the mov instructions in zfa extension.
> > (riscv_print_operand): Output the floating-point value of the 
> > FLI.H/S/D immediate in assembly
> > (riscv_secondary_memory_needed): Likewise.
> > * config/riscv/riscv.h (GP_REG_RTX_P): New.
> > * config/riscv/riscv.md (fminm3): New.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/riscv/zfa-fleq-fltq-rv32.c: New test.
> > * gcc.target/riscv/zfa-fleq-fltq.c: New test.
> > * gcc.target/riscv/zfa-fli-rv32.c: New test.
> > * gcc.target/riscv/zfa-fli-zfh-rv32.c: New test.
> > * gcc.target/riscv/zfa-fli-zfh.c: New test.
> > * gcc.target/riscv/zfa-fli.c: New test.
> > * gcc.target/riscv/zfa-fmovh-fmovp-rv32.c: New test.
> > * gcc.target/riscv/zfa-fround-rv32.c: New test.
> > * gcc.target/riscv/zfa-fround.c: New test.
> > ---
> >  gcc/common/config/riscv/riscv-common.cc   |   4 +
> >  gcc/config/riscv/constraints.md   |  11 +-
> >  gcc/config/riscv/iterators.md |   5 +
> >  gcc/config/riscv/riscv-opts.h |   3 +
> >  gcc/config/riscv/riscv-protos.h   |   1 +
> >  gcc/config/riscv/riscv.cc | 168 +-
> >  gcc/config/riscv/riscv.h  |   1 +
> >  gcc/config/riscv/riscv.md | 112 +---
> >  .../gcc.target/riscv/zfa-fleq-fltq-rv32.c |  19 ++
> >  .../gcc.target/riscv/zfa-fleq-fltq.c  |  19 ++
> >  gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c |  79 
> >  .../gcc.target/riscv/zfa-fli-zfh-rv32.c   |  41 +
> >  gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c  |  41 +
> >  gcc/testsuite/gcc.target/riscv/zfa-fli.c  |  79 
> >  .../gcc.target/riscv/zfa-fmovh-fmovp-rv32.c   |  10 ++
> >  .../gcc.target/riscv/zfa-fround-rv32.c|  42 +
> >  gcc/testsuite/gcc.targ

Re: [PATCH v8] RISC-V: Add the 'zfa' extension, version 0.2.

2023-05-05 Thread Christoph Müllner
t/riscv/zfa-fround-rv32.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fround.c
>
> diff --git a/gcc/common/config/riscv/riscv-common.cc 
> b/gcc/common/config/riscv/riscv-common.cc
> index 309a52def75..f9fce6bcc38 100644
> --- a/gcc/common/config/riscv/riscv-common.cc
> +++ b/gcc/common/config/riscv/riscv-common.cc
> @@ -217,6 +217,8 @@ static const struct riscv_ext_version 
> riscv_ext_version_table[] =
>{"zfh",   ISA_SPEC_CLASS_NONE, 1, 0},
>{"zfhmin",ISA_SPEC_CLASS_NONE, 1, 0},
>
> +  {"zfa", ISA_SPEC_CLASS_NONE, 0, 2},
> +
>{"zmmul", ISA_SPEC_CLASS_NONE, 1, 0},
>
>{"svinval", ISA_SPEC_CLASS_NONE, 1, 0},
> @@ -1260,6 +1262,8 @@ static const riscv_ext_flag_table_t 
> riscv_ext_flag_table[] =
>{"zfhmin",_options::x_riscv_zf_subext, MASK_ZFHMIN},
>{"zfh",   _options::x_riscv_zf_subext, MASK_ZFH},
>
> +  {"zfa",   _options::x_riscv_zf_subext, MASK_ZFA},
> +
>{"zmmul", _options::x_riscv_zm_subext, MASK_ZMMUL},
>
>{"svinval", _options::x_riscv_sv_subext, MASK_SVINVAL},
> diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
> index c448e6b37e9..62d9094f966 100644
> --- a/gcc/config/riscv/constraints.md
> +++ b/gcc/config/riscv/constraints.md
> @@ -118,6 +118,13 @@ (define_constraint "T"
>(and (match_operand 0 "move_operand")
> (match_test "CONSTANT_P (op)")))
>
> +;; Zfa constraints.
> +
> +(define_constraint "Zf"
> +  "A floating point number that can be loaded using instruction `fli` in 
> zfa."
> +  (and (match_code "const_double")
> +   (match_test "(riscv_float_const_rtx_index_for_fli (op) != -1)")))
> +
>  ;; Vector constraints.
>
>  (define_register_constraint "vr" "TARGET_VECTOR ? V_REGS : NO_REGS"
> @@ -183,8 +190,8 @@ (define_memory_constraint "Wdm"
>
>  ;; Vendor ISA extension constraints.
>
> -(define_register_constraint "th_f_fmv" "TARGET_XTHEADFMV ? FP_REGS : NO_REGS"
> +(define_register_constraint "th_f_fmv" "(TARGET_XTHEADFMV || TARGET_ZFA) ? 
> FP_REGS : NO_REGS"
>"A floating-point register for XTheadFmv.")
>
> -(define_register_constraint "th_r_fmv" "TARGET_XTHEADFMV ? GR_REGS : NO_REGS"
> +(define_register_constraint "th_r_fmv" "(TARGET_XTHEADFMV || TARGET_ZFA) ? 
> GR_REGS : NO_REGS"
>"An integer register for XTheadFmv.")

These are vendor extension constraints with the prefix "th_".
I would avoid using them in code that targets standard extensions.

I see two options here:
a) Create two new constraints at the top of the file, e.g.:
- "F" - "A floating-point register (no fall-back for Zfinx)" and
- "rF" - "An integer register in case FP registers are available".
b) Move these two constraints to the top of the file and rename them
(and adjust movdf_hardfloat_rv32 accordingly).

I would prefer b), and I would even go so far as to do this in a
separate commit that comes before the Zfa support patch.


I've applied the patch on top of today's master (with --3way) and
successfully tested it:
Tested-by: Christoph Müllner 

> diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
> index 9b767038452..c81b08e3cc5 100644
> --- a/gcc/config/riscv/iterators.md
> +++ b/gcc/config/riscv/iterators.md
> @@ -288,3 +288,8 @@ (define_int_iterator QUIET_COMPARISON [UNSPEC_FLT_QUIET 
> UNSPEC_FLE_QUIET])
>  (define_int_attr quiet_pattern [(UNSPEC_FLT_QUIET "lt") (UNSPEC_FLE_QUIET 
> "le")])
>  (define_int_attr QUIET_PATTERN [(UNSPEC_FLT_QUIET "LT") (UNSPEC_FLE_QUIET 
> "LE")])
>
> +(define_int_iterator ROUND [UNSPEC_ROUND UNSPEC_FLOOR UNSPEC_CEIL 
> UNSPEC_BTRUNC UNSPEC_ROUNDEVEN UNSPEC_NEARBYINT])
> +(define_int_attr round_pattern [(UNSPEC_ROUND "round") (UNSPEC_FLOOR 
> "floor") (UNSPEC_CEIL "ceil")
> +   (UNSPEC_BTRUNC "btrunc") (UNSPEC_ROUNDEVEN 
> "roundeven") (UNSPEC_NEARBYINT "nearbyint")])
> +(define_int_attr round_rm [(UNSPEC_ROUND "rmm") (UNSPEC_FLOOR "rdn") 
> (UNSPEC_CEIL "rup")
> +  (UNSPEC_BTRUNC "rtz") (UNSPEC_ROUNDEVEN "rne") 
> (UNSPEC_NEARBYINT "dyn")])
> \ No newline at end of file
> diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
> index cf0cd669be4..87b72efd12e 100644
> --- a/gcc/config/riscv/riscv-opts.h
> +++ b/gcc/config/riscv/riscv-opts.

Re: [PATCH v6] RISC-V: Add support for experimental zfa extension.

2023-04-13 Thread Christoph Müllner
On Fri, Mar 10, 2023 at 1:41 PM Jin Ma via Gcc-patches
 wrote:
>
> This patch adds the 'Zfa' extension for riscv, which is based on:
>  
> https://github.com/riscv/riscv-isa-manual/commit/d74d99e22d5f68832f70982d867614e2149a3bd7
> latest 'Zfa' change on the master branch of the RISC-V ISA Manual as
> of this writing.
>
> The Wiki Page (details):
>  https://github.com/a4lg/binutils-gdb/wiki/riscv_zfa
>
> The binutils-gdb for 'Zfa' extension:
>  https://sourceware.org/pipermail/binutils/2022-September/122938.html
>
> Implementation of zfa extension on LLVM:
>   https://reviews.llvm.org/rGc0947dc44109252fcc0f68a542fc6ef250d4d3a9
>
> There are three points that need to be discussed here.
> 1. According to riscv-spec, "The FCVTMOD.W.D instruction was added 
> principally to
>   accelerate the processing of JavaScript Numbers.", so it seems that no 
> implementation
>   is required in the compiler.
> 2. The FROUND and FROUNDNX instructions in this patch use related functions in 
> the math
>   library, such as round, floor, ceil, etc. Since there is no interface for 
> half-precision in
>   the math library, the instructions FROUND.H and FROUNDNX.H have not been 
> implemented for
>   the time being. Is it necessary to add a built-in interface belonging to 
> riscv such as
>  __builtin_roundhf or __builtin_roundf16 to generate half floating point 
> instructions?
> 3. As far as I know, FMINM and FMAXM instructions correspond to C23 library 
> function fminimum
>   and fmaximum. Therefore, I have not dealt with such instructions for the 
> time being, but have
>   simply implemented the pattern of fminm3 and fmaxm3. Is 
> it necessary to
>   add a built-in interface belonging to riscv such as __builtin_fminm to 
> generate half
>   floating-point instructions?


I have rebased and tested this patch.
Here are my observations (with fixes below at the actual code):
* There is a compiler warning because of a missing "fallthrough" comment
* There are merge conflicts with current master
* The constant operand of the fli instruction uses the constant index
in the rs1-field, but not the constant in hex FP literal form
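
To illustrate the last point, here is a minimal C sketch of the difference
between the index that ends up in the rs1 field and the constant that the
assembly operand should spell out. The table is only an illustrative subset
with made-up positions, and fli_table, fli_index_for and fli_operand_for are
hypothetical names, not the patch's riscv_float_const_rtx_index_for_fli:

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative subset of loadable constants.  This is NOT the full Zfa
   table, and the positions are not the spec's rs1 encodings.  */
static const double fli_table[] = { 0.25, 0.5, 1.0, 2.0 };

/* Return the table index of VALUE, or -1 if it cannot be loaded.  */
static int
fli_index_for (double value)
{
  for (size_t i = 0; i < sizeof fli_table / sizeof fli_table[0]; i++)
    if (fli_table[i] == value)
      return (int) i;
  return -1;
}

/* Return the constant that the assembly operand should spell out
   (for example printed with "%a") for a valid table index.  */
static double
fli_operand_for (int index)
{
  assert (index >= 0
          && (size_t) index < sizeof fli_table / sizeof fli_table[0]);
  return fli_table[index];
}
```

The index is only useful for the encoding; the operand printed in the
assembly output would be the value returned by fli_operand_for.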

A patch that addresses these issues can also be found here:
  https://github.com/cmuellner/gcc/tree/riscv-zfa

Additionally I observe the following failing test cases with this patch applied:

=== gcc: Unexpected fails for rv64gc lp64d medlow ===
FAIL: gcc.target/riscv/zero-scratch-regs-3.c   -O0  (internal compiler
error: Segmentation fault)
FAIL: gcc.target/riscv/zero-scratch-regs-3.c   -O0  (test for excess errors)
FAIL: gcc.target/riscv/zero-scratch-regs-3.c   -O1  (internal compiler
error: Segmentation fault)
FAIL: gcc.target/riscv/zero-scratch-regs-3.c   -O1  (test for excess errors)
FAIL: gcc.target/riscv/zero-scratch-regs-3.c   -O2  (internal compiler
error: Segmentation fault)
FAIL: gcc.target/riscv/zero-scratch-regs-3.c   -O2  (test for excess errors)
FAIL: gcc.target/riscv/zero-scratch-regs-3.c   -O2 -flto
-fno-use-linker-plugin -flto-partition=none  (internal compiler error:
Segmentation fault)
FAIL: gcc.target/riscv/zero-scratch-regs-3.c   -O2 -flto
-fno-use-linker-plugin -flto-partition=none  (test for excess errors)
FAIL: gcc.target/riscv/zero-scratch-regs-3.c   -O2 -flto
-fuse-linker-plugin -fno-fat-lto-objects  (internal compiler error:
Segmentation fault)
FAIL: gcc.target/riscv/zero-scratch-regs-3.c   -O2 -flto
-fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
FAIL: gcc.target/riscv/zero-scratch-regs-3.c   -O3 -g  (internal
compiler error: Segmentation fault)
FAIL: gcc.target/riscv/zero-scratch-regs-3.c   -O3 -g  (test for excess errors)
FAIL: gcc.target/riscv/zero-scratch-regs-3.c   -Os  (internal compiler
error: Segmentation fault)
FAIL: gcc.target/riscv/zero-scratch-regs-3.c   -Os  (test for excess errors)
FAIL: gcc.target/riscv/zero-scratch-regs-3.c  -Og -g  (internal
compiler error: Segmentation fault)
FAIL: gcc.target/riscv/zero-scratch-regs-3.c  -Og -g  (test for excess errors)
FAIL: gcc.target/riscv/zero-scratch-regs-3.c  -Oz  (internal compiler
error: Segmentation fault)
FAIL: gcc.target/riscv/zero-scratch-regs-3.c  -Oz  (test for excess errors)

I have not analysed these ICEs so far.


>
> gcc/ChangeLog:
>
> * common/config/riscv/riscv-common.cc: Add zfa extension.
> * config/riscv/constraints.md (Zf): Constrain the floating point 
> number that the FLI instruction can load.
> * config/riscv/iterators.md (round_pattern): New.
> * config/riscv/predicates.md: Predicate the floating point number 
> that the FLI instruction can load.
> * config/riscv/riscv-opts.h (MASK_ZFA): New.
> (TARGET_ZFA): New.
> * config/riscv/riscv-protos.h (riscv_float_const_rtx_index_for_fli): 
> Get the index of the
>   floating-point number that the FLI instruction can load.
> * config/riscv/riscv.cc (find_index_in_array): New.
> (riscv_float_const_rtx_index_for_fli): New.
> 

Re: [PATCH v3 04/11] riscv: thead: Add support for the XTheadBs ISA extension

2023-02-28 Thread Christoph Müllner
On Wed, Mar 1, 2023 at 1:19 AM Hans-Peter Nilsson  wrote:
>
>
>
> On Tue, 28 Feb 2023, Christoph Müllner wrote:
>
> > On Sun, Feb 26, 2023 at 12:42 AM Hans-Peter Nilsson  
> > wrote:
> > >
> > > On Fri, 24 Feb 2023, Christoph Muellner wrote:
> > > > diff --git a/gcc/config/riscv/thead.md b/gcc/config/riscv/thead.md
> > > > index 158e9124c3a..2c684885850 100644
> > > > --- a/gcc/config/riscv/thead.md
> > > > +++ b/gcc/config/riscv/thead.md
> > > > @@ -29,3 +29,14 @@ (define_insn "*th_addsl"
> > > >"th.addsl\t%0,%3,%1,%2"
> > > >[(set_attr "type" "bitmanip")
> > > > (set_attr "mode" "")])
> > > > +
> > > > +;; XTheadBs
> > > > +
> > > > +(define_insn "*th_tst"
> > > > +  [(set (match_operand:X 0 "register_operand" "=r")
> > > > + (zero_extract:X (match_operand:X 1 "register_operand" "r")
> > > > + (const_int 1)
> > > > + (match_operand 2 "immediate_operand" "i")))]
> > >
> > > (Here and same elsewhere.)
> > >
> > > You're unlikely to get other constant operands in that pattern,
> > > but FWIW, the actual matching pair for just CONST_INT is
> > > "const_int_operand" for the predicate and "n" for the
> > > constraint.  Using the right predicate and constraint will also
> > > help the generated part of recog be a few nanoseconds faster. ;)
> >
> > Thank you for that comment!
> > I think what you mean would look like this:
> >
> > (define_insn "*th_tst"
> >   [(set (match_operand:X 0 "register_operand" "=r")
> > (zero_extract:X (match_operand:X 1 "register_operand" "r")
> > (match_operand 3 "const_int_operand" "n")
> > (match_operand 2 "immediate_operand" "i")))]
>
> No; misunderstanding.  Keep the (const_int 1) but replace
> (match_operand 2 "immediate_operand" "i") with
> (match_operand 2 "const_int_operand" "n")

Ah, yes, this makes sense!

Thanks!


>
> brgds, H-P


Re: [PATCH v3 04/11] riscv: thead: Add support for the XTheadBs ISA extension

2023-02-28 Thread Christoph Müllner
On Sun, Feb 26, 2023 at 12:42 AM Hans-Peter Nilsson  wrote:
>
> On Fri, 24 Feb 2023, Christoph Muellner wrote:
> > diff --git a/gcc/config/riscv/thead.md b/gcc/config/riscv/thead.md
> > index 158e9124c3a..2c684885850 100644
> > --- a/gcc/config/riscv/thead.md
> > +++ b/gcc/config/riscv/thead.md
> > @@ -29,3 +29,14 @@ (define_insn "*th_addsl"
> >"th.addsl\t%0,%3,%1,%2"
> >[(set_attr "type" "bitmanip")
> > (set_attr "mode" "")])
> > +
> > +;; XTheadBs
> > +
> > +(define_insn "*th_tst"
> > +  [(set (match_operand:X 0 "register_operand" "=r")
> > + (zero_extract:X (match_operand:X 1 "register_operand" "r")
> > + (const_int 1)
> > + (match_operand 2 "immediate_operand" "i")))]
>
> (Here and same elsewhere.)
>
> You're unlikely to get other constant operands in that pattern,
> but FWIW, the actual matching pair for just CONST_INT is
> "const_int_operand" for the predicate and "n" for the
> constraint.  Using the right predicate and constraint will also
> help the generated part of recog be a few nanoseconds faster. ;)

Thank you for that comment!
I think what you mean would look like this:

(define_insn "*th_tst"
  [(set (match_operand:X 0 "register_operand" "=r")
(zero_extract:X (match_operand:X 1 "register_operand" "r")
(match_operand 3 "const_int_operand" "n")
(match_operand 2 "immediate_operand" "i")))]
  "TARGET_XTHEADBS && UINTVAL (operands[2]) < GET_MODE_BITSIZE (mode)
   && UINTVAL (operands[3]) == 1"
  "th.tst\t%0,%1,%2"
  [(set_attr "type" "bitmanip")])

So while we have a more generic form in the pattern, the condition needs
to check that the operand is equal to 1.
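
As a rough illustration, the condition and the one-bit extraction can be
modelled in plain C as below. This is only a sketch under the assumption that
the pattern describes a single-bit zero_extract; tst_condition_ok and tst_bit
are hypothetical helper names and not part of the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of the insn condition: the extracted field must be exactly one
   bit wide and the bit position must lie inside the register.  */
static bool
tst_condition_ok (uint64_t width, uint64_t pos, unsigned mode_bits)
{
  return width == 1 && pos < mode_bits;
}

/* Model of a one-bit zero_extract: bit POS of X, zero-extended.  */
static uint64_t
tst_bit (uint64_t x, unsigned pos)
{
  assert (pos < 64);
  return (x >> pos) & 1;
}
```

With (const_int 1) in the pattern, the width check is done by the matcher
itself and only the position check remains in the condition.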

I can change this in the patch (I don't have strong opinions about
this and I do care about the nanosecond).
However, I think this goes beyond this patchset, because a single
git grep shows many examples of "const_int " matches in GCC's backends.
Examples can be found in gcc/config/riscv/bitmanip.md,
gcc/config/aarch64/aarch64.md,...
So it feels like changing the patch to use const_int_operand would go
against common practice.

@Kito: Any preferences about this?

Thanks,
Christoph


Re: [PATCH v3 00/11] RISC-V: Add XThead* extension support

2023-02-24 Thread Christoph Müllner
On Fri, Feb 24, 2023 at 9:09 AM Kito Cheng  wrote:
>
> Hi Christoph:
>
> OK for trunk for the 1~8, feel free to commit 1~8 after you address
> those minor comments, and could you also prepare release notes for
> those extensions?

I addressed the comment regarding XTheadBs.
But I have not done anything regarding XTheadB* and Zb*.

Release notes patch can be found here:
  https://gcc.gnu.org/pipermail/gcc-patches/2023-February/612763.html

> And 9~11 needs to take a few more rounds of review and test.

I've seen the comments regarding patch 10 and 11.
We will try to clean this up asap.

In the patch for XTheadMemPair there was this nasty typo in one of the tests.
Is there anything else that is needed?
I believe that patch is in better shape than the last two patches
and it is much less invasive.
Furthermore, similar code can be found in other backends.

Thanks,
Christoph

>
>
>
>
> On Fri, Feb 24, 2023 at 1:52 PM Christoph Muellner
>  wrote:
> >
> > From: Christoph Müllner 
> >
> > This series introduces support for the T-Head specific RISC-V ISA extensions
> > which are available e.g. on the T-Head XuanTie C906.
> >
> > The ISA spec can be found here:
> >   https://github.com/T-head-Semi/thead-extension-spec
> >
> > This series adds support for the following XThead* extensions:
> > * XTheadBa
> > * XTheadBb
> > * XTheadBs
> > * XTheadCmo
> > * XTheadCondMov
> > * XTheadFMemIdx
> > * XTheadFmv
> > * XTheadInt
> > * XTheadMac
> > * XTheadMemIdx
> > * XTheadMemPair
> > * XTheadSync
> >
> > All extensions are properly integrated and the included tests
> > demonstrate the improvements of the generated code.
> >
> > The series also introduces support for "-mcpu=thead-c906", which also
> > enables all available XThead* ISA extensions of the T-Head C906.
> >
> > All patches have been tested and don't introduce regressions for RV32 or 
> > RV64.
> > The patches have also been tested with SPEC CPU2017 on QEMU and real HW
> > (D1 board).
> >
> > Support patches for these extensions for Binutils, QEMU, and LLVM have
> > already been merged in the corresponding upstream projects.
> >
> > Changes in v3:
> > - Bugfix in XTheadBa
> > - Rewrite of XTheadMemPair
> > - Inclusion of XTheadMemIdx and XTheadFMemIdx
> >
> > Christoph Müllner (9):
> >   riscv: Add basic XThead* vendor extension support
> >   riscv: riscv-cores.def: Add T-Head XuanTie C906
> >   riscv: thead: Add support for the XTheadBa ISA extension
> >   riscv: thead: Add support for the XTheadBs ISA extension
> >   riscv: thead: Add support for the XTheadBb ISA extension
> >   riscv: thead: Add support for the XTheadCondMov ISA extensions
> >   riscv: thead: Add support for the XTheadMac ISA extension
> >   riscv: thead: Add support for the XTheadFmv ISA extension
> >   riscv: thead: Add support for the XTheadMemPair ISA extension
> >
> > moiz.hussain (2):
> >   riscv: thead: Add support for the XTheadMemIdx ISA extension
> >   riscv: thead: Add support for the XTheadFMemIdx ISA extension
> >
> >  gcc/common/config/riscv/riscv-common.cc   |   26 +
> >  gcc/config/riscv/bitmanip.md  |   52 +-
> >  gcc/config/riscv/constraints.md   |   43 +
> >  gcc/config/riscv/iterators.md |4 +
> >  gcc/config/riscv/peephole.md  |   56 +
> >  gcc/config/riscv/riscv-cores.def  |4 +
> >  gcc/config/riscv/riscv-opts.h |   29 +
> >  gcc/config/riscv/riscv-protos.h   |   28 +-
> >  gcc/config/riscv/riscv.cc | 1090 +++--
> >  gcc/config/riscv/riscv.h  |8 +-
> >  gcc/config/riscv/riscv.md |  169 ++-
> >  gcc/config/riscv/riscv.opt|3 +
> >  gcc/config/riscv/thead.md |  351 ++
> >  .../gcc.target/riscv/mcpu-thead-c906.c|   28 +
> >  .../gcc.target/riscv/xtheadba-addsl.c |   55 +
> >  gcc/testsuite/gcc.target/riscv/xtheadba.c |   14 +
> >  gcc/testsuite/gcc.target/riscv/xtheadbb-ext.c |   20 +
> >  .../gcc.target/riscv/xtheadbb-extu-2.c|   22 +
> >  .../gcc.target/riscv/xtheadbb-extu.c  |   22 +
> >  gcc/testsuite/gcc.target/riscv/xtheadbb-ff1.c |   18 +
> >  gcc/testsuite/gcc.target/riscv/xtheadbb-rev.c |   45 +
> >  .../gcc.target/riscv/xtheadbb-srri.c  |   21 +
> >  gcc/testsuite/gcc.target/riscv/xtheadbb.c |   14 +
> >  gcc/testsuite/gcc.target/riscv/xtheadbs-tst.c

Re: [PATCH v3 03/11] riscv: thead: Add support for the XTheadBa ISA extension

2023-02-24 Thread Christoph Müllner
On Fri, Feb 24, 2023 at 11:05 AM Christoph Müllner
 wrote:
>
> On Fri, Feb 24, 2023 at 10:54 AM Kito Cheng  wrote:
> >
> > My impression is that md patterns will use first-match patterns? so
> > the zba will get higher priority than xtheadba if both patterns are
> > matched?
>
> Yes, I was just about to write this.
>
> /opt/riscv-thead/bin/riscv64-unknown-linux-gnu-gcc -O2
> -march=rv64gc_zba_xtheadba -mtune=thead-c906 -S
> ./gcc/testsuite/gcc.target/riscv/xtheadba-addsl.c
>
> The resulting xtheadba-addsl.s file has:
> .attribute arch, 
> "rv64i2p0_m2p0_a2p0_f2p0_d2p0_c2p0_zba1p0_xtheadba1p0"
> [...]
> sh1add  a0,a1,a0
>
> So the standard extension will be preferred over the custom extension.

I tested now with all of them (RV32 and RV64):

/opt/riscv-thead/bin/riscv64-unknown-linux-gnu-gcc -O2
-march=rv64gc_zba_xtheadba -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadba-addsl.c
/opt/riscv-thead/bin/riscv64-unknown-linux-gnu-gcc -O2
-march=rv64gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-ext.c
/opt/riscv-thead/bin/riscv64-unknown-linux-gnu-gcc -O2
-march=rv64gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-extu.c
/opt/riscv-thead/bin/riscv64-unknown-linux-gnu-gcc -O2
-march=rv64gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-extu-2.c
/opt/riscv-thead/bin/riscv64-unknown-linux-gnu-gcc -O2
-march=rv64gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-ff1.c
/opt/riscv-thead/bin/riscv64-unknown-linux-gnu-gcc -O2
-march=rv64gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-rev.c
/opt/riscv-thead/bin/riscv64-unknown-linux-gnu-gcc -O2
-march=rv64gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-srri.c
/opt/riscv-thead/bin/riscv64-unknown-linux-gnu-gcc -O2
-march=rv64gc_zbb_xtheadbs -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbs-tst.c

/opt/riscv-thead32/bin/riscv32-unknown-linux-gnu-gcc -O2
-march=rv32gc_zba_xtheadba -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadba-addsl.c
/opt/riscv-thead32/bin/riscv32-unknown-linux-gnu-gcc -O2
-march=rv32gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-ext.c
/opt/riscv-thead32/bin/riscv32-unknown-linux-gnu-gcc -O2
-march=rv32gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-extu.c
/opt/riscv-thead32/bin/riscv32-unknown-linux-gnu-gcc -O2
-march=rv32gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-extu-2.c
/opt/riscv-thead32/bin/riscv32-unknown-linux-gnu-gcc -O2
-march=rv32gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-ff1.c
/opt/riscv-thead32/bin/riscv32-unknown-linux-gnu-gcc -O2
-march=rv32gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-rev.c
/opt/riscv-thead32/bin/riscv32-unknown-linux-gnu-gcc -O2
-march=rv32gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-srri.c
/opt/riscv-thead32/bin/riscv32-unknown-linux-gnu-gcc -O2
-march=rv32gc_zbb_xtheadbs -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbs-tst.c

All behave ok (also when dropping the xtheadb* from the -march).

Is it ok to leave this as is?

Thanks,
Christoph

>
>
> >
> > On Fri, Feb 24, 2023 at 2:52 PM Andrew Pinski via Gcc-patches
> >  wrote:
> > >
> > > On Thu, Feb 23, 2023 at 9:55 PM Christoph Muellner
> > >  wrote:
> > > >
> > > > From: Christoph Müllner 
> > > >
> > > > This patch adds support for the XTheadBa ISA extension.
> > > > The new INSN pattern is defined in a new file to separate
> > > > this vendor extension from the standard extensions.
> > >
> > > > How does this interact with doing -march=rv32gc_xtheadba_zba ?
> > > > Seems like it might be better to handle that case correctly. I suspect
> > > > all these XTheadB* extensions have a similar problem too.
> > >
> > > Thanks,
> > > Andrew Pinski
> > >
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > >     * config/riscv/riscv.md: Include thead.md
> > > > * config/riscv/thead.md: New file.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > > * gcc.target/riscv/xtheadba-addsl.c: New test.
> > > >
> > > > Changes in v3:
> > > > - Fix operand order for th.addsl.
> > > >
> > > > Signed-off-by: Christoph Müllner 
> > > > ---
> > > >  gcc/config/riscv/riscv.md |  1 +
> > > >  gcc/config/riscv/thead.md  

Re: [PATCH v3 04/11] riscv: thead: Add support for the XTheadBs ISA extension

2023-02-24 Thread Christoph Müllner
On Fri, Feb 24, 2023 at 8:37 AM Kito Cheng  wrote:
>
> > diff --git a/gcc/config/riscv/thead.md b/gcc/config/riscv/thead.md
> > index 158e9124c3a..2c684885850 100644
> > --- a/gcc/config/riscv/thead.md
> > +++ b/gcc/config/riscv/thead.md
> > @@ -29,3 +29,14 @@ (define_insn "*th_addsl"
> >"th.addsl\t%0,%3,%1,%2"
> >[(set_attr "type" "bitmanip")
> > (set_attr "mode" "")])
> > +
> > +;; XTheadBs
> > +
> > +(define_insn "*th_tst"
> > +  [(set (match_operand:X 0 "register_operand" "=r")
> > +   (zero_extract:X (match_operand:X 1 "register_operand" "r")
> > +   (const_int 1)
> > +   (match_operand 2 "immediate_operand" "i")))]
> > +  "TARGET_XTHEADBS"
>
> Add range check like *bexti pattern?
>
> TARGET_XTHEADBS && UINTVAL (operands[2]) < GET_MODE_BITSIZE (mode)

Ok.

Thanks,
Christoph

>
> > +  "th.tst\t%0,%1,%2"
> > +  [(set_attr "type" "bitmanip")])


Re: [PATCH v3 03/11] riscv: thead: Add support for the XTheadBa ISA extension

2023-02-24 Thread Christoph Müllner
On Fri, Feb 24, 2023 at 10:54 AM Kito Cheng  wrote:
>
> My impression is that md patterns will use first-match patterns? so
> the zba will get higher priority than xtheadba if both patterns are
> matched?

Yes, I was just about to write this.

/opt/riscv-thead/bin/riscv64-unknown-linux-gnu-gcc -O2
-march=rv64gc_zba_xtheadba -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadba-addsl.c

The resulting xtheadba-addsl.s file has:
.attribute arch, "rv64i2p0_m2p0_a2p0_f2p0_d2p0_c2p0_zba1p0_xtheadba1p0"
[...]
sh1add  a0,a1,a0

So the standard extension will be preferred over the custom extension.
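
The first-match behaviour can be sketched in C as follows. This is only a
simplified model, assuming that the Zba patterns come earlier in the pattern
order than the thead.md ones; struct pattern and first_match are hypothetical
names and do not correspond to anything in recog:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct pattern
{
  const char *mnemonic; /* instruction emitted on a match */
  bool enabled;         /* stands in for the insn condition */
};

/* Return the mnemonic of the first enabled pattern, or NULL if no
   pattern is enabled.  Earlier entries take priority.  */
static const char *
first_match (const struct pattern *pats, size_t n)
{
  assert (pats != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    if (pats[i].enabled)
      return pats[i].mnemonic;
  return NULL;
}
```

With both entries enabled, the earlier one (sh1add in the test above) wins;
th.addsl is only chosen if the earlier entry is disabled.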


>
> On Fri, Feb 24, 2023 at 2:52 PM Andrew Pinski via Gcc-patches
>  wrote:
> >
> > On Thu, Feb 23, 2023 at 9:55 PM Christoph Muellner
> >  wrote:
> > >
> > > From: Christoph Müllner 
> > >
> > > This patch adds support for the XTheadBa ISA extension.
> > > The new INSN pattern is defined in a new file to separate
> > > this vendor extension from the standard extensions.
> >
> > How does this interact with doing -march=rv32gc_xtheadba_zba ?
> > Seems like it might be better to handle that case correctly. I suspect
> > all these XTheadB* extensions have a similar problem too.
> >
> > Thanks,
> > Andrew Pinski
> >
> > >
> > > gcc/ChangeLog:
> > >
> > > * config/riscv/riscv.md: Include thead.md
> > > * config/riscv/thead.md: New file.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > * gcc.target/riscv/xtheadba-addsl.c: New test.
> > >
> > > Changes in v3:
> > > - Fix operand order for th.addsl.
> > >
> > > Signed-off-by: Christoph Müllner 
> > > ---
> > >  gcc/config/riscv/riscv.md |  1 +
> > >  gcc/config/riscv/thead.md | 31 +++
> > >  .../gcc.target/riscv/xtheadba-addsl.c | 55 +++
> > >  3 files changed, 87 insertions(+)
> > >  create mode 100644 gcc/config/riscv/thead.md
> > >  create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadba-addsl.c
> > >
> > > diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> > > index 05924e9bbf1..d6c2265e9d4 100644
> > > --- a/gcc/config/riscv/riscv.md
> > > +++ b/gcc/config/riscv/riscv.md
> > > @@ -3093,4 +3093,5 @@ (define_insn "riscv_prefetchi_"
> > >  (include "pic.md")
> > >  (include "generic.md")
> > >  (include "sifive-7.md")
> > > +(include "thead.md")
> > >  (include "vector.md")
> > > diff --git a/gcc/config/riscv/thead.md b/gcc/config/riscv/thead.md
> > > new file mode 100644
> > > index 000..158e9124c3a
> > > --- /dev/null
> > > +++ b/gcc/config/riscv/thead.md
> > > @@ -0,0 +1,31 @@
> > > +;; Machine description for T-Head vendor extensions
> > > +;; Copyright (C) 2021-2022 Free Software Foundation, Inc.
> > > +
> > > +;; This file is part of GCC.
> > > +
> > > +;; GCC is free software; you can redistribute it and/or modify
> > > +;; it under the terms of the GNU General Public License as published by
> > > +;; the Free Software Foundation; either version 3, or (at your option)
> > > +;; any later version.
> > > +
> > > +;; GCC is distributed in the hope that it will be useful,
> > > +;; but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > > +;; GNU General Public License for more details.
> > > +
> > > +;; You should have received a copy of the GNU General Public License
> > > +;; along with GCC; see the file COPYING3.  If not see
> > > +;; <http://www.gnu.org/licenses/>.
> > > +
> > > +;; XTheadBa
> > > +
> > > +(define_insn "*th_addsl"
> > > +  [(set (match_operand:X 0 "register_operand" "=r")
> > > +   (plus:X (ashift:X (match_operand:X 1 "register_operand" "r")
> > > + (match_operand:QI 2 "immediate_operand" "I"))
> > > +   (match_operand:X 3 "register_operand" "r")))]
> > > +  "TARGET_XTHEADBA
> > > +   && (INTVAL (operands[2]) >= 0) && (INTVAL (operands[2]) <= 3)"
> > > +  "th.addsl\t%0,%3,%1,%2"
> > > +  [(set_attr "type" "bitmanip")

Re: [PATCH v3 09/11] riscv: thead: Add support for the XTheadMemPair ISA extension

2023-02-24 Thread Christoph Müllner
On Fri, Feb 24, 2023 at 10:01 AM Kito Cheng  wrote:
>
> Got one fail:
>
> FAIL: gcc.target/riscv/xtheadmempair-1.c   -O2   scan-assembler-times
> th.luwd\t 4
>
> It should scan lwud rather than luwd?

Yes, this should be th.lwud.
Must have been introduced after testing.

I also ran the whole patchset again with RV32 and RV64.
This should be the only issue of this kind in the series.
Sorry for that!


Re: [PATCH v2 02/11] riscv: Restructure callee-saved register save/restore code

2022-12-19 Thread Christoph Müllner
On Mon, Dec 19, 2022 at 7:30 AM Kito Cheng  wrote:

> just one more nit: Use INVALID_REGNUM as sentinel value for
> riscv_next_saved_reg, otherwise LGTM, and feel free to commit that
> separately :)
>

Would this change below be ok?

@@ -5540,7 +5540,7 @@ riscv_next_saved_reg (unsigned int regno, unsigned
int limit,
   if (inc)
 regno++;

-  while (regno <= limit)
+  while (regno <= limit && regno != INVALID_REGNUM)
 {
   if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
{

Thanks,
Christoph



>
> On Mon, Dec 19, 2022 at 9:08 AM Christoph Muellner
>  wrote:
> >
> > From: Christoph Müllner 
> >
> > This patch restructures the loop over the GP registers
> > which saves/restores them as part of the prologue/epilogue.
> > No functional change is intended by this patch, but it
> > offers the possibility to use load-pair/store-pair instructions.
> >
> > gcc/ChangeLog:
> >
> > * config/riscv/riscv.cc (riscv_next_saved_reg): New function.
> > (riscv_is_eh_return_data_register): New function.
> > (riscv_for_each_saved_reg): Restructure loop.
> >
> > Signed-off-by: Christoph Müllner 
> > ---
> >  gcc/config/riscv/riscv.cc | 94 +++
> >  1 file changed, 66 insertions(+), 28 deletions(-)
> >
> > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> > index 6dd2ab2d11e..a8d5e1dac7f 100644
> > --- a/gcc/config/riscv/riscv.cc
> > +++ b/gcc/config/riscv/riscv.cc
> > @@ -4835,6 +4835,49 @@ riscv_save_restore_reg (machine_mode mode, int
> regno,
> >fn (gen_rtx_REG (mode, regno), mem);
> >  }
> >
> > +/* Return the next register up from REGNO up to LIMIT for the callee
> > +   to save or restore.  OFFSET will be adjusted accordingly.
> > +   If INC is set, then REGNO will be incremented first.  */
> > +
> > +static unsigned int
> > +riscv_next_saved_reg (unsigned int regno, unsigned int limit,
> > + HOST_WIDE_INT *offset, bool inc = true)
> > +{
> > +  if (inc)
> > +regno++;
> > +
> > +  while (regno <= limit)
> > +{
> > +  if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
> > +   {
> > + *offset = *offset - UNITS_PER_WORD;
> > + break;
> > +   }
> > +
> > +  regno++;
> > +}
> > +  return regno;
> > +}
> > +
> > +/* Return TRUE if provided REGNO is eh return data register.  */
> > +
> > +static bool
> > +riscv_is_eh_return_data_register (unsigned int regno)
> > +{
> > +  unsigned int i, regnum;
> > +
> > +  if (!crtl->calls_eh_return)
> > +return false;
> > +
> > +  for (i = 0; (regnum = EH_RETURN_DATA_REGNO (i)) != INVALID_REGNUM;
> i++)
> > +if (regno == regnum)
> > +  {
> > +   return true;
> > +  }
> > +
> > +  return false;
> > +}
> > +
> >  /* Call FN for each register that is saved by the current function.
> > SP_OFFSET is the offset of the current stack pointer from the start
> > of the frame.  */
> > @@ -4844,36 +4887,31 @@ riscv_for_each_saved_reg (poly_int64 sp_offset,
> riscv_save_restore_fn fn,
> >   bool epilogue, bool maybe_eh_return)
> >  {
> >HOST_WIDE_INT offset;
> > +  unsigned int regno;
> > +  unsigned int start = GP_REG_FIRST;
> > +  unsigned int limit = GP_REG_LAST;
> >
> >/* Save the link register and s-registers. */
> > -  offset = (cfun->machine->frame.gp_sp_offset - sp_offset).to_constant
> ();
> > -  for (unsigned int regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++)
> > -if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
> > -  {
> > -   bool handle_reg =
> !cfun->machine->reg_is_wrapped_separately[regno];
> > -
> > -   /* If this is a normal return in a function that calls the
> eh_return
> > -  builtin, then do not restore the eh return data registers as
> that
> > -  would clobber the return value.  But we do still need to save
> them
> > -  in the prologue, and restore them for an exception return, so
> we
> > -  need special handling here.  */
> > -   if (epilogue && !maybe_eh_return && crtl->calls_eh_return)
> > - {
> > -   unsigned int i, regnum;
> > -
> > -   for (i = 0; (regnum = EH_RETURN_DATA_REGNO (i)) !=
> INVALID_REGNUM;

Re: [PATCH v2 02/11] riscv: Restructure callee-saved register save/restore code

2022-12-19 Thread Christoph Müllner
On Mon, Dec 19, 2022 at 10:26 AM Kito Cheng  wrote:

> Something like this:
>
> static unsigned int
> riscv_next_saved_reg (unsigned int regno, unsigned int limit,
>  HOST_WIDE_INT *offset, bool inc = true)
> {
>   if (inc)
> regno++;
>
>   while (regno <= limit)
> {
>   if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
>{
>  *offset = *offset - UNITS_PER_WORD;
>  break;
>}
>
>   regno++;
> }
>   if (regno > limit)
> return INVALID_REGNUM;
>   else
> return regno;
> }
> ...
>
>   for (regno = riscv_next_saved_reg (start, limit, &offset, false);
>regno != INVALID_REGNUM;
>regno = riscv_next_saved_reg (regno, limit, &offset))
> {
> ...
>
>
Ok, I see.
I changed it as follows (it will be retested before committing):

@@ -5531,7 +5531,8 @@ riscv_save_restore_reg (machine_mode mode, int regno,

 /* Return the next register up from REGNO up to LIMIT for the callee
to save or restore.  OFFSET will be adjusted accordingly.
-   If INC is set, then REGNO will be incremented first.  */
+   If INC is set, then REGNO will be incremented first.
+   Returns INVALID_REGNUM if there is no such next register.  */

 static unsigned int
 riscv_next_saved_reg (unsigned int regno, unsigned int limit,
@@ -5545,12 +5546,12 @@ riscv_next_saved_reg (unsigned int regno, unsigned int limit,
       if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
        {
          *offset = *offset - UNITS_PER_WORD;
-         break;
+         return regno;
        }

       regno++;
     }
-  return regno;
+  return INVALID_REGNUM;
 }

 /* Return TRUE if provided REGNO is eh return data register.  */
@@ -5589,7 +5590,7 @@ riscv_for_each_saved_reg (poly_int64 sp_offset, riscv_save_restore_fn fn,
   offset = (cfun->machine->frame.gp_sp_offset - sp_offset).to_constant ()
           + UNITS_PER_WORD;
   for (regno = riscv_next_saved_reg (start, limit, &offset, false);
-       regno <= limit;
+       regno != INVALID_REGNUM;
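
For readers without a GCC tree at hand, the sentinel-based iteration can be
sketched as a standalone C file. This is only an illustration under stated
assumptions, not the code from riscv.cc: the plain `mask` parameter stands in
for cfun->machine->frame.mask, `long` stands in for HOST_WIDE_INT, the
values of INVALID_REGNUM and UNITS_PER_WORD are local stand-ins, and
collect_saved_regs is a hypothetical helper that exists only to show the
loop shape.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for GCC internals (assumed values, for illustration).  */
#define INVALID_REGNUM (~(unsigned int) 0)
#define UNITS_PER_WORD 8

/* Return the next register from REGNO up to LIMIT whose bit is set in MASK,
   or INVALID_REGNUM if there is no such register.  OFFSET is decremented by
   one word on a hit.  If INC is nonzero, REGNO is incremented first.  */
static unsigned int
next_saved_reg (unsigned int mask, unsigned int regno, unsigned int limit,
                long *offset, int inc)
{
  assert (limit < 32);  /* MASK only has 32 bits in this sketch.  */

  if (inc)
    regno++;

  while (regno <= limit)
    {
      if (mask & (1u << regno))
        {
          *offset -= UNITS_PER_WORD;
          return regno;
        }
      regno++;
    }
  return INVALID_REGNUM;
}

/* Hypothetical helper: walk all saved registers in START..LIMIT, store them
   in OUT, and return how many were found.  The loop terminates on the
   sentinel instead of comparing against LIMIT.  */
static size_t
collect_saved_regs (unsigned int mask, unsigned int start, unsigned int limit,
                    unsigned int *out, long *offset)
{
  size_t n = 0;

  for (unsigned int regno = next_saved_reg (mask, start, limit, offset, 0);
       regno != INVALID_REGNUM;
       regno = next_saved_reg (mask, regno, limit, offset, 1))
    out[n++] = regno;

  return n;
}
```

Returning from inside the loop means a register equal to LIMIT is still
reported, and the caller never has to compare against LIMIT itself.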

Thanks!



> On Mon, Dec 19, 2022 at 5:21 PM Christoph Müllner
>  wrote:
> >
> >
> >
> > On Mon, Dec 19, 2022 at 7:30 AM Kito Cheng 
> wrote:
> >>
> >> just one more nit: Use INVALID_REGNUM as sentinel value for
> >> riscv_next_saved_reg, otherwise LGTM, and feel free to commit that
> >> separately :)
> >
> >
> > Would this change below be ok?
> >
> > @@ -5540,7 +5540,7 @@ riscv_next_saved_reg (unsigned int regno, unsigned
> int limit,
> >if (inc)
> >  regno++;
> >
> > -  while (regno <= limit)
> > +  while (regno <= limit && regno != INVALID_REGNUM)
> >  {
> >if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
> > {
> >
> > Thanks,
> > Christoph
> >
> >
> >>
> >>
> >> On Mon, Dec 19, 2022 at 9:08 AM Christoph Muellner
> >>  wrote:
> >> >
> >> > From: Christoph Müllner 
> >> >
> >> > This patch restructures the loop over the GP registers
> >> > which saves/restores them as part of the prologue/epilogue.
> >> > No functional change is intended by this patch, but it
> >> > offers the possibility to use load-pair/store-pair instructions.
> >> >
> >> > gcc/ChangeLog:
> >> >
> >> > * config/riscv/riscv.cc (riscv_next_saved_reg): New function.
> >> > (riscv_is_eh_return_data_register): New function.
> >> > (riscv_for_each_saved_reg): Restructure loop.
> >> >
> >> > Signed-off-by: Christoph Müllner 
> >> > ---
> >> >  gcc/config/riscv/riscv.cc | 94 +++
> >> >  1 file changed, 66 insertions(+), 28 deletions(-)
> >> >
> >> > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> >> > index 6dd2ab2d11e..a8d5e1dac7f 100644
> >> > --- a/gcc/config/riscv/riscv.cc
> >> > +++ b/gcc/config/riscv/riscv.cc
> >> > @@ -4835,6 +4835,49 @@ riscv_save_restore_reg (machine_mode mode, int regno,
> >> >fn (gen_rtx_REG (mode, regno), mem);
> >> >  }
> >> >
> >> > +/* Return the next register up from REGNO up to LIMIT for the callee
> >> > +   to save or restore.  OFFSET will be adjusted accordingly.
> >> > +   If INC is set, then REGNO will be incremented first.  */
> >> > +
> >> > +static unsigned int
> >> > +riscv_next_saved_reg (unsigned int regno, unsigned int limit,
> >

Re: [PATCH] RISC-V: Add support for AIA ISA extensions (Ssaia and Smaia)

2022-11-27 Thread Christoph Müllner
On Fri, Nov 18, 2022 at 10:08 AM Christoph Müllner
 wrote:
>
>
>
> On Fri, Nov 18, 2022 at 6:09 AM Palmer Dabbelt  wrote:
>>
>> On Thu, 17 Nov 2022 18:12:23 PST (-0800), christoph.muell...@vrull.eu wrote:
>> > From: Christoph Müllner 
>> >
>> > This patch adds support for the two AIA ISA extensions Ssaia and Smaia.
>> > They are not relevant for the compiler, but the assembler might want
>> > to validate the CSRs. Therefore, all this patch does is recognize the
>> > extension names and emit a feature macro (incl. a test).
>>
>> This is pretty far in the weeds, but the AIA PDF says
>>
>> extension Smaia encompasses all added CSRs and all modifications to
>> interrupt response behavior that the AIA specifies for a hart, over
>> all privilege levels
>>
>> but only a subset of AIA has been frozen.  I think that's fine, assuming
>> we're decoupling ourselves from the ISA strings (and thus extension
>> names).  We just need to document it somewhere -- presumably invoke, but
>> that doesn't document anything else yet so we don't really have a
>> pattern to match.
>
>
> Thanks for highlighting this!
> We could model this such that Smaia implies Ssaia.
> Since the tool's interpretation of these extensions is "availability of
> extension's CSRs", this should work.
> But it is mostly irrelevant for GCC, as Binutils does the CSR checking, and
> we need to model it there.
>
> I see what you mean with the "subset of AIA has been frozen".
> I would expect that the draft chapters ("Duo-PLIC" and "IOMMU Support") will
> introduce new CSRs in the future. They might get included in separate
> extensions, be available only if another extension is enabled (like the
> hypervisor CSRs), or they will be put into the existing Smaia and Ssaia
> extensions.
> The last case is problematic, as it would change the behavior of the CSR
> checker.
> We could therefore document that the CSR checker strictly follows the latest
> specs and that changing behavior is possible for that reason.
> Not perfect, but reasonable and a method to permanently solve the recurring
> CSR discussions.

Palmer, since you have not responded for 9 days,
I tried to guess what you want to have documented and made a change in
invoke.texi:
https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607326.html

The Binutils patch landed already:
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=ac8df5a1921904b3928429e696ad8b40c612f829
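
To make the "Smaia implies Ssaia" idea from the quoted discussion above more
concrete, here is a rough standalone C sketch. It is not how GCC or Binutils
implement this: the table, the struct, and ext_enabled_p are hypothetical
names made up for illustration, and only a single level of implication is
resolved.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical implication table: enabling EXT also enables IMPLIED.  */
struct ext_implication
{
  const char *ext;
  const char *implied;
};

static const struct ext_implication implications[] = {
  {"smaia", "ssaia"},
};

/* Return nonzero if NAME is among the N explicitly ENABLED extensions, or
   is directly implied by one of them.  Chains of implications are not
   followed in this sketch.  */
static int
ext_enabled_p (const char *name, const char *const *enabled, size_t n)
{
  assert (name != NULL);

  for (size_t i = 0; i < n; i++)
    {
      if (strcmp (enabled[i], name) == 0)
        return 1;

      for (size_t j = 0; j < sizeof implications / sizeof implications[0]; j++)
        if (strcmp (implications[j].ext, enabled[i]) == 0
            && strcmp (implications[j].implied, name) == 0)
          return 1;
    }
  return 0;
}
```

With such a table, a CSR checker asked about an Ssaia CSR would accept it
when only "smaia" was given, while "ssaia" alone would not enable Smaia.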

>
>
>
>
>>
>>
>> > Signed-off-by: Christoph Müllner 
>> > ---
>> >  gcc/common/config/riscv/riscv-common.cc |  2 ++
>> >  gcc/testsuite/gcc.target/riscv/smaia.c  | 13 +
>> >  gcc/testsuite/gcc.target/riscv/ssaia.c  | 13 +
>> >  3 files changed, 28 insertions(+)
>> >  create mode 100644 gcc/testsuite/gcc.target/riscv/smaia.c
>> >  create mode 100644 gcc/testsuite/gcc.target/riscv/ssaia.c
>> >
>> > diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
>> > index 4b7f777c103..674eded07b7 100644
>> > --- a/gcc/common/config/riscv/riscv-common.cc
>> > +++ b/gcc/common/config/riscv/riscv-common.cc
>> > @@ -219,6 +219,8 @@ static const struct riscv_ext_version riscv_ext_version_table[] =
>> >
>> >{"zmmul", ISA_SPEC_CLASS_NONE, 1, 0},
>> >
>> > +  {"smaia", ISA_SPEC_CLASS_NONE, 1, 0},
>> > +  {"ssaia", ISA_SPEC_CLASS_NONE, 1, 0},
>> >{"svinval", ISA_SPEC_CLASS_NONE, 1, 0},
>> >{"svnapot", ISA_SPEC_CLASS_NONE, 1, 0},
>> >
>> > diff --git a/gcc/testsuite/gcc.target/riscv/smaia.c b/gcc/testsuite/gcc.target/riscv/smaia.c
>> > new file mode 100644
>> > index 000..9ca80236245
>> > --- /dev/null
>> > +++ b/gcc/testsuite/gcc.target/riscv/smaia.c
>> > @@ -0,0 +1,13 @@
>> > +/* { dg-do compile } */
>> > +/* { dg-options "-march=rv64gc_smaia" { target { rv64 } } } */
>> > +/* { dg-options "-march=rv32gc_smaia" { target { rv32 } } } */
>> > +
>> > +#ifndef __riscv_smaia
>> > +#error Feature macro not defined
>> > +#endif
>> > +
>> > +int
>> > +foo (int a)
>> > +{
>> > +  return a;
>> > +}
>> > diff --git a/gcc/testsuite/gcc.target/riscv/ssaia.c b/gcc/testsuite/gcc.target/riscv/ssaia.c
>> > new file mode 100644
>> > index 000..b20e0eb10f5
>> > --- /dev/null
>> > +++ b/gcc/testsuite/gcc.target/riscv/ssaia.c
>> > @@ -0,0 +1,13 @@
>> > +/* { dg-do compile } */
>> > +/* { dg-options "-march=rv64gc_ssaia" { target { rv64 } } } */
>> > +/* { dg-options "-march=rv32gc_ssaia" { target { rv32 } } } */
>> > +
>> > +#ifndef __riscv_ssaia
>> > +#error Feature macro not defined
>> > +#endif
>> > +
>> > +int
>> > +foo (int a)
>> > +{
>> > +  return a;
>> > +}

