[PATCH v3] vect: Verify that GET_MODE_NUNITS is greater than one for vect_grouped_store_supported

2023-04-17 Thread Kevin Lee
This patch guards the gcc_assert (multiple_p (m_full_nelts, m_npatterns))
that is reached from vec_perm_indices indices (sel, 2, nelt) for VNx1 vectors.

Based on the feedback from Richard Biener and Richard Sandiford, the
check now uses multiple_p (nelt, 2) instead of using maybe_lt to compare
nelt with the minimum size 2.
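
To illustrate the difference between the two checks, here is a minimal
sketch in plain C. It assumes a simplified two-coefficient model of
poly_uint64 in which the value is c0 + c1 * x for a run-time x >= 0.
struct poly2 and the poly2_* helpers are hypothetical names used only for
illustration; they are not GCC's poly-int API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a two-coefficient poly_uint64: the value is
   c0 + c1 * x, where x >= 0 is only known at run time.  */
struct poly2
{
  uint64_t c0;
  uint64_t c1;
};

/* Models multiple_p (a, b) for a constant b: true only if the value is a
   multiple of b for every x, i.e. both coefficients are multiples of b.  */
static bool
poly2_multiple_p (struct poly2 a, uint64_t b)
{
  assert (b != 0);
  return a.c0 % b == 0 && a.c1 % b == 0;
}

/* Models maybe_lt (a, b) for a constant b: true if the value can be less
   than b for some x.  With unsigned coefficients the minimum is at x = 0.  */
static bool
poly2_maybe_lt (struct poly2 a, uint64_t b)
{
  return a.c0 < b;
}
```

Under this model, nelt = {1, 1} (VNx1DI) fails the multiple-of-2 check,
while a hypothetical {3, 3} would not be caught by maybe_lt (nelt, 2) even
though it still cannot be divided exactly by 2.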

Bootstrapped and tested on x86_64-pc-linux-gnu. Would this be OK for trunk?

Patch V1: https://gcc.gnu.org/pipermail/gcc-patches/2023-March/614463.html
Patch V2: https://gcc.gnu.org/pipermail/gcc-patches/2023-March/614700.html
 
Kevin Lee 
gcc/ChangeLog:

* tree-vect-data-refs.cc (vect_grouped_store_supported): Return false
if nelt is not a multiple of 2.
---
 gcc/tree-vect-data-refs.cc | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
index 8daf7bd7dd3..df393ba723d 100644
--- a/gcc/tree-vect-data-refs.cc
+++ b/gcc/tree-vect-data-refs.cc
@@ -5399,6 +5399,8 @@ vect_grouped_store_supported (tree vectype, unsigned 
HOST_WIDE_INT count)
  poly_uint64 nelt = GET_MODE_NUNITS (mode);
 
  /* The encoding has 2 interleaved stepped patterns.  */
+  if (!multiple_p (nelt, 2))
+    return false;
  vec_perm_builder sel (nelt, 2, 3);
  sel.quick_grow (6);
  for (i = 0; i < 3; i++)
-- 
2.25.1



Re: [PATCH v2][RFC] vect: Verify that GET_MODE_NUNITS is greater than one for vect_grouped_store_supported

2023-04-12 Thread Kevin Lee
Thank you for the feedback Richard and Richard.

> Note the calls are guarded with
>
>   && ! known_eq (TYPE_VECTOR_SUBPARTS (vectype), 1U)

Yes, I believe nelt.is_constant() wouldn't be necessary. I didn't realize
the call was guarded by this condition.

> But I think the better check for location above is:
>
>if (!multiple_p (nelt, 2))
> return false;
>
> which then guards the assert in the later exact_div (nelt, 2).

I agree that this check is better than using maybe_lt because it guards
both exact_div (nelt, 2) and vec_perm_builder sel (nelt, 2, 3).
I'll update the patch accordingly, then build, test and resubmit it. Thank
you!

Sincerely,


Ping: [PATCH v2][RFC] vect: Verify that GET_MODE_NUNITS is greater than one for vect_grouped_store_supported

2023-04-06 Thread Kevin Lee
May I ping this patch?
https://gcc.gnu.org/pipermail/gcc-patches/2023-March/614700.html
Any suggestions and comments would be appreciated. Thank you!

Sincerely,
Kevin Lee


[PATCH v2][RFC] vect: Verify that GET_MODE_NUNITS is greater than one for vect_grouped_store_supported

2023-03-27 Thread Kevin Lee
This patch is a proper fix that replaces the previous patch:
https://gcc.gnu.org/pipermail/gcc-patches/2023-March/614463.html 
vect_grouped_store_supported checks whether count is a power of 2, but
does not check the value of GET_MODE_NUNITS.
This patch handles the RISC-V case where the mode is VNx1DI, for which
nelt is {1, 1}.
It was tested on RISC-V and x86_64-linux-gnu. Is this the correct way to
handle vectors with fewer than 2 elements?

---
 gcc/tree-vect-data-refs.cc | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
index 8daf7bd7dd3..04ad12f7d04 100644
--- a/gcc/tree-vect-data-refs.cc
+++ b/gcc/tree-vect-data-refs.cc
@@ -5399,6 +5399,8 @@ vect_grouped_store_supported (tree vectype, unsigned 
HOST_WIDE_INT count)
  poly_uint64 nelt = GET_MODE_NUNITS (mode);
 
  /* The encoding has 2 interleaved stepped patterns.  */
+  if (!nelt.is_constant () && maybe_lt (nelt, (unsigned int) 2))
+    return false;
  vec_perm_builder sel (nelt, 2, 3);
  sel.quick_grow (6);
  for (i = 0; i < 3; i++)
-- 
2.25.1



[RFC][Patch] vect: verify that nelt is greater than one

2023-03-22 Thread Kevin Lee
This patch is related to 
https://gcc.gnu.org/pipermail/gcc-patches/2023-March/613977.html 
and is aimed at GCC 14. Since the RISC-V target has vector modes (e.g.
VNx1DImode) whose nelt can be smaller than 2, npat has to match nelt so
that a valid vec_perm_indices can be created.

I tested it on x86_64-linux-gnu and it caused no new failures, but I am not
sure whether total_elem should also be used in the rest of the function.
Should there be additional changes in vect_grouped_store_supported? Thank you!
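
For reference, the interleave selectors that the function checks can be
sketched for a fixed element count. This is a simplified model in plain C
with a constant nelt. build_interleave is a hypothetical helper written
for this note; the real code only encodes the first few elements through
vec_perm_builder and works on poly_uint64 values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Fill SEL (NELT entries) with the two-input permutation that interleaves
   one half of vectors A and B.  Indices 0 .. NELT-1 select from A and
   NELT .. 2*NELT-1 select from B.  HIGH picks the upper half.
   Returns false when NELT cannot be split into two halves.  */
static bool
build_interleave (unsigned nelt, bool high, unsigned *sel)
{
  assert (sel != NULL);
  if (nelt % 2 != 0)
    return false;

  /* Offset of the half being interleaved, like exact_div (nelt, 2).  */
  unsigned base = high ? nelt / 2 : 0;
  for (unsigned i = 0; i < nelt / 2; i++)
    {
      sel[2 * i] = base + i;            /* element from A */
      sel[2 * i + 1] = base + i + nelt; /* matching element from B */
    }
  return true;
}
```

For nelt = 4 this gives {0, 4, 1, 5} for the low half and {2, 6, 3, 7} for
the high half. With nelt = 1 there are no two halves to interleave, so the
sketch rejects that case.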

gcc/ChangeLog:
Kevin Lee 
* tree-vect-data-refs.cc (vect_grouped_store_supported): Check
if the nelt is greater than one.
---
 gcc/tree-vect-data-refs.cc | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
index 8daf7bd7dd3..9c09cc973d0 100644
--- a/gcc/tree-vect-data-refs.cc
+++ b/gcc/tree-vect-data-refs.cc
@@ -5399,17 +5399,20 @@ vect_grouped_store_supported (tree vectype, unsigned 
HOST_WIDE_INT count)
  poly_uint64 nelt = GET_MODE_NUNITS (mode);
 
  /* The encoding has 2 interleaved stepped patterns.  */
- vec_perm_builder sel (nelt, 2, 3);
- sel.quick_grow (6);
+
+unsigned int npat = known_gt(nelt, (unsigned int) 1) ? 2 : 1;
+unsigned int total_elem = npat * 3;
+ vec_perm_builder sel (nelt, npat, 3);
+ sel.quick_grow (total_elem);
  for (i = 0; i < 3; i++)
{
- sel[i * 2] = i;
- sel[i * 2 + 1] = i + nelt;
+ sel[i * npat] = i;
+ sel[i * npat + 1] = i + nelt;
}
  vec_perm_indices indices (sel, 2, nelt);
  if (can_vec_perm_const_p (mode, mode, indices))
{
- for (i = 0; i < 6; i++)
+ for (i = 0; i < total_elem; i++)
sel[i] += exact_div (nelt, 2);
  indices.new_vector (sel, 2, nelt);
  if (can_vec_perm_const_p (mode, mode, indices))
-- 
2.25.1



[PATCH v3] RISC-V missing __builtin_lceil and __builtin_lfloor

2022-11-16 Thread Kevin Lee
The condition of the l<rint_pattern><ANYF:mode><GPR:mode>2 insn has been
modified based on the thread in
https://gcc.gnu.org/pipermail/gcc-patches/2022-November/605481.html. The
lfloor-lceil-inexact test checks for "call" instead of using
scan-assembler-not "fcvt.l.s/d" because of
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107723.

Is this patch good for commit?
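
The intent of the new insn condition can be sketched as a small predicate
in plain C. This is only a model under assumptions: the enum and function
names are hypothetical, the target check (TARGET_HARD_FLOAT ||
TARGET_ZFINX) is left out, and the first term is assumed to be the
rint_allow_inexact attribute added in iterators.md.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the four unspecs in the RINT iterator.  */
enum rint_kind
{
  RINT_LRINT,
  RINT_LROUND,
  RINT_LCEIL,
  RINT_LFLOOR
};

/* Mirrors the rint_allow_inexact attribute values in the patch:
   "1" for lrint, "0" for lround, lceil and lfloor.  */
static bool
rint_allow_inexact (enum rint_kind kind)
{
  assert (kind <= RINT_LFLOOR);
  return kind == RINT_LRINT;
}

/* True if fcvt may be emitted instead of a library call.  */
static bool
can_use_fcvt (enum rint_kind kind, bool fp_int_builtin_inexact,
              bool trapping_math)
{
  return (rint_allow_inexact (kind)
          || fp_int_builtin_inexact
          || !trapping_math);
}
```

With -fno-fp-int-builtin-inexact and trapping math enabled, the predicate
is false for lceil and lfloor, which is the configuration where
lfloor-lceil-inexact.c expects library calls.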

gcc/ChangeLog:

Michael Collison 
        Kevin Lee 
* config/riscv/iterators.md (RINT): Additional iterators.
(rint_pattern): Additional attributes.
(rint_rm): Ditto.
* config/riscv/riscv.md (UNSPEC_LCEIL): New unspec.
(UNSPEC_LFLOOR): Ditto.
(l<rint_pattern><ANYF:mode><GPR:mode>2): Additional conditions.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/lfloor-lceil-inexact.c: New test.
* gcc.target/riscv/lfloor-lceil.c: New test.
---
 gcc/config/riscv/iterators.md | 10 ++-
 gcc/config/riscv/riscv.md |  8 +-
 .../gcc.target/riscv/lfloor-lceil-inexact.c   | 78 ++
 gcc/testsuite/gcc.target/riscv/lfloor-lceil.c | 79 +++
 4 files changed, 171 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/lfloor-lceil-inexact.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/lfloor-lceil.c

diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
index 50380ecfac9..c5adcb08421 100644
--- a/gcc/config/riscv/iterators.md
+++ b/gcc/config/riscv/iterators.md
@@ -233,9 +233,13 @@ (define_code_attr bitmanip_insn [(smin "min")
 ;; ---
 
 ;; Iterator and attributes for floating-point rounding instructions.
-(define_int_iterator RINT [UNSPEC_LRINT UNSPEC_LROUND])
-(define_int_attr rint_pattern [(UNSPEC_LRINT "rint") (UNSPEC_LROUND "round")])
-(define_int_attr rint_rm [(UNSPEC_LRINT "dyn") (UNSPEC_LROUND "rmm")])
+(define_int_iterator RINT [UNSPEC_LRINT UNSPEC_LROUND UNSPEC_LCEIL 
UNSPEC_LFLOOR])
+(define_int_attr rint_pattern [(UNSPEC_LRINT "rint") (UNSPEC_LROUND "round")
+ (UNSPEC_LCEIL "ceil") (UNSPEC_LFLOOR "floor")])
+(define_int_attr rint_rm [(UNSPEC_LRINT "dyn") (UNSPEC_LROUND "rmm")
+(UNSPEC_LCEIL "rup") (UNSPEC_LFLOOR "rdn")])
+(define_int_attr rint_allow_inexact [(UNSPEC_LRINT "1") (UNSPEC_LROUND "0")
+(UNSPEC_LCEIL "0") (UNSPEC_LFLOOR "0")])
 
 ;; Iterator and attributes for quiet comparisons.
 (define_int_iterator QUIET_COMPARISON [UNSPEC_FLT_QUIET UNSPEC_FLE_QUIET])
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 798f7370a08..5074f8e 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -60,6 +60,9 @@ (define_c_enum "unspec" [
   UNSPEC_FMIN
   UNSPEC_FMAX
 
+  UNSPEC_LCEIL
+  UNSPEC_LFLOOR
+
   ;; Stack tie
   UNSPEC_TIE
 ])
@@ -1552,7 +1555,10 @@ (define_insn "l<rint_pattern><ANYF:mode><GPR:mode>2"
(unspec:GPR
[(match_operand:ANYF 1 "register_operand" " f")]
RINT))]
-  "TARGET_HARD_FLOAT || TARGET_ZFINX"
+  "(TARGET_HARD_FLOAT || TARGET_ZFINX) &&
+(<rint_allow_inexact>
+  || flag_fp_int_builtin_inexact
+  || !flag_trapping_math)"
   "fcvt.<GPR:ifmt>.<ANYF:fmt> %0,%1,<rint_rm>"
   [(set_attr "type" "fcvt")
(set_attr "mode" "<ANYF:MODE>")])
diff --git a/gcc/testsuite/gcc.target/riscv/lfloor-lceil-inexact.c 
b/gcc/testsuite/gcc.target/riscv/lfloor-lceil-inexact.c
new file mode 100644
index 000..3b37df20d0e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/lfloor-lceil-inexact.c
@@ -0,0 +1,78 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc -mabi=lp64d -fno-fp-int-builtin-inexact" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
+
+int
+ceil1(float i)
+{
+  return __builtin_lceil(i);
+}
+
+long
+ceil2(float i)
+{
+  return __builtin_lceil(i);
+}
+
+long long
+ceil3(float i)
+{
+  return __builtin_lceil(i);
+}
+
+int
+ceil4(double i)
+{
+  return __builtin_lceil(i);
+}
+
+long
+ceil5(double i)
+{
+  return __builtin_lceil(i);
+}
+
+long long
+ceil6(double i)
+{
+  return __builtin_lceil(i);
+}
+
+int
+floor1(float i)
+{
+  return __builtin_lfloor(i);
+}
+
+long
+floor2(float i)
+{
+  return __builtin_lfloor(i);
+}
+
+long long
+floor3(float i)
+{
+  return __builtin_lfloor(i);
+}
+
+int
+floor4(double i)
+{
+  return __builtin_lfloor(i);
+}
+
+long
+floor5(double i)
+{
+  return __builtin_lfloor(i);
+}
+
+long long
+floor6(double i)
+{
+  return __builtin_lfloor(i);
+}
+
+/* { dg-final { scan-assembler-times "call" 12 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/lfloor-lceil.c 
b/gcc/testsuite/gcc.target/riscv/lfloor-lceil.c
new file mode 100644
index 000..4715de746fb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/lfloor-lceil.c
@@ -0,0 +1,79 @

[PATCH] RISC-V uninit-pred-9_b.c failure

2022-11-15 Thread Kevin Lee
The GIMPLE generated for riscv64 is currently identical to that generated
for powerpc64. It seems that the change in
r12-4790-4b3a325f07acebf47e82de227ce1d5ba62f5bcae also affected riscv64, as
it did powerpc64 and cris.

gcc/testsuite/ChangeLog:

* gcc.dg/uninit-pred-9_b.c: Xfail for riscv64.
---
 gcc/testsuite/gcc.dg/uninit-pred-9_b.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/uninit-pred-9_b.c 
b/gcc/testsuite/gcc.dg/uninit-pred-9_b.c
index 53c4a5399ea..843f5323713 100644
--- a/gcc/testsuite/gcc.dg/uninit-pred-9_b.c
+++ b/gcc/testsuite/gcc.dg/uninit-pred-9_b.c
@@ -17,7 +17,7 @@ int foo (int n, int l, int m, int r)
 
   if (l > 100)
 if ( (n <= 9) &&  (m < 100)  && (r < 19) )
-  blah(v); /* { dg-bogus "uninitialized" "bogus warning" { xfail 
powerpc64*-*-* cris-*-* } } */
+  blah(v); /* { dg-bogus "uninitialized" "bogus warning" { xfail 
powerpc64*-*-* cris-*-* riscv64*-*-* } } */
 
   if ( (n <= 8) &&  (m < 99)  && (r < 19) )
   blah(v); /* { dg-bogus "uninitialized" "pr101674" { xfail mmix-*-* } } */
-- 
2.25.1



Re: [PATCH v2] RISC-V missing __builtin_lceil and __builtin_lfloor

2022-11-11 Thread Kevin Lee
On Wed, Nov 9, 2022 at 1:49 AM Xi Ruoyao  wrote:
>
> On Mon, 2022-11-07 at 20:36 -0800, Kevin Lee wrote:
> I "shamelessly copied" your idea in
> https://gcc.gnu.org/pipermail/gcc-patches/2022-November/605456.html.
> During the review we found an issue.
>

> -fno-fp-int-builtin-inexact does not allow __builtin_ceil to raise
> inexact exception.  But fcvt.l.d may raise one.

Your solution of enabling the pattern only when -ffp-int-builtin-inexact
is in effect seems to be a good way to handle the issue. Thank you for the
example. I'll create a new patch based on the fix.

> --
> Xi Ruoyao 
> School of Aerospace Science and Technology, Xidian University


[PATCH v2] RISC-V missing __builtin_lceil and __builtin_lfloor

2022-11-07 Thread Kevin Lee
The patch in 
https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599755.html was
corrupted. I am resending a clean version as patch v2. Thank you! 

gcc/ChangeLog:

Michael Collison 
* config/riscv/iterators.md (RINT): Additional iterators.
(rint_pattern): Additional attributes.
(rint_rm): Ditto.
* config/riscv/riscv.md: New attributes.

gcc/testsuite/ChangeLog:

Kevin Lee 
* gcc.target/riscv/lfloor-lceil.c: New test.
---
 gcc/config/riscv/iterators.md |  8 +-
 gcc/config/riscv/riscv.md |  3 +
 gcc/testsuite/gcc.target/riscv/lfloor-lceil.c | 79 +++
 3 files changed, 87 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/lfloor-lceil.c

diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
index 50380ecfac9..3dd705eaf81 100644
--- a/gcc/config/riscv/iterators.md
+++ b/gcc/config/riscv/iterators.md
@@ -233,9 +233,11 @@ (define_code_attr bitmanip_insn [(smin "min")
 ;; ---
 
 ;; Iterator and attributes for floating-point rounding instructions.
-(define_int_iterator RINT [UNSPEC_LRINT UNSPEC_LROUND])
-(define_int_attr rint_pattern [(UNSPEC_LRINT "rint") (UNSPEC_LROUND "round")])
-(define_int_attr rint_rm [(UNSPEC_LRINT "dyn") (UNSPEC_LROUND "rmm")])
+(define_int_iterator RINT [UNSPEC_LRINT UNSPEC_LROUND UNSPEC_LCEIL 
UNSPEC_LFLOOR])
+(define_int_attr rint_pattern [(UNSPEC_LRINT "rint") (UNSPEC_LROUND "round")
+ (UNSPEC_LCEIL "ceil") (UNSPEC_LFLOOR "floor")])
+(define_int_attr rint_rm [(UNSPEC_LRINT "dyn") (UNSPEC_LROUND "rmm")
+(UNSPEC_LCEIL "rup") (UNSPEC_LFLOOR "rdn")])
 
 ;; Iterator and attributes for quiet comparisons.
 (define_int_iterator QUIET_COMPARISON [UNSPEC_FLT_QUIET UNSPEC_FLE_QUIET])
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 798f7370a08..07e72af8950 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -60,6 +60,9 @@ (define_c_enum "unspec" [
   UNSPEC_FMIN
   UNSPEC_FMAX
 
+  UNSPEC_LCEIL
+  UNSPEC_LFLOOR
+
   ;; Stack tie
   UNSPEC_TIE
 ])
diff --git a/gcc/testsuite/gcc.target/riscv/lfloor-lceil.c 
b/gcc/testsuite/gcc.target/riscv/lfloor-lceil.c
new file mode 100644
index 000..4715de746fb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/lfloor-lceil.c
@@ -0,0 +1,79 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc -mabi=lp64d" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
+
+int
+ceil1(float i)
+{
+  return __builtin_lceil(i);
+}
+
+long
+ceil2(float i)
+{
+  return __builtin_lceil(i);
+}
+
+long long
+ceil3(float i)
+{
+  return __builtin_lceil(i);
+}
+
+int
+ceil4(double i)
+{
+  return __builtin_lceil(i);
+}
+
+long
+ceil5(double i)
+{
+  return __builtin_lceil(i);
+}
+
+long long
+ceil6(double i)
+{
+  return __builtin_lceil(i);
+}
+
+int
+floor1(float i)
+{
+  return __builtin_lfloor(i);
+}
+
+long
+floor2(float i)
+{
+  return __builtin_lfloor(i);
+}
+
+long long
+floor3(float i)
+{
+  return __builtin_lfloor(i);
+}
+
+int
+floor4(double i)
+{
+  return __builtin_lfloor(i);
+}
+
+long
+floor5(double i)
+{
+  return __builtin_lfloor(i);
+}
+
+long long
+floor6(double i)
+{
+  return __builtin_lfloor(i);
+}
+
+/* { dg-final { scan-assembler-times "fcvt.l.s" 6 } } */
+/* { dg-final { scan-assembler-times "fcvt.l.d" 6 } } */
+/* { dg-final { scan-assembler-not "call" } } */
-- 
2.25.1



Re: [PATCH] RISC-V missing __builtin_lceil and __builtin_lfloor

2022-11-07 Thread Kevin Lee
> Kevin: Looks like this got corrupted, possibly from copy/paste into
> gmail.  I resurrect it, but there's a floating-point test failure in
> gfortran.  Looks like it predates this, but I'm trying to bisect it to
> at least have a root cause before just ignoring it.  I've got this
> floating around on a branch and hopefully that'll remind me to commit
> it after I sort that out.

Currently, the testsuite doesn't show additional failures. It seems
like the corrupted patch caused the issue. I will post the clean patch
as v2. Thank you for the review!


[PATCH v3] RISC-V modified add3 for large stack frame optimization [PR105733]

2022-11-03 Thread Kevin Lee
This patch is identical to 
https://gcc.gnu.org/pipermail/gcc-patches/2022-November/604814.html, but sent in 
the correct plain-text format. 

>The loop still seems a bit odd which may point to further improvements 
>that could be made to this patch.  Consider this fragment of the loop:

Thank you for the review, Jeff! I am currently looking into this issue
in a separate patch. I'll follow up with an improvement.
 
gcc/ChangeLog:
   Jim Wilson 
   Michael Collison 
   Kevin Lee 
   
* config/riscv/predicates.md (const_lui_operand): New Predicate.
(add_operand): Ditto.
(reg_or_const_int_operand): Ditto.
* config/riscv/riscv-protos.h (riscv_eliminable_reg): New 
   function.
* config/riscv/riscv-selftests.cc (calculate_x_in_sequence):
   Consider Parallel insns.
* config/riscv/riscv.cc (riscv_eliminable_reg): New function.
(riscv_adjust_libcall_cfi_prologue): Use gen_rtx_SET and
   gen_rtx_fmt_ee instead of gen_add3_insn.
(riscv_adjust_libcall_cfi_epilogue): Ditto.
* config/riscv/riscv.md (addsi3): Remove.
(add3): New instruction for large stack frame
   optimization.
(add3_internal): Ditto.
(adddi3): Remove.
(add3_internal2): New instruction for insns generated in
   the prologue and epilogue pass.
---
 gcc/config/riscv/predicates.md  | 13 +
 gcc/config/riscv/riscv-protos.h |  1 +
 gcc/config/riscv/riscv-selftests.cc |  3 ++
 gcc/config/riscv/riscv.cc   | 20 +--
 gcc/config/riscv/riscv.md   | 84 -
 5 files changed, 104 insertions(+), 17 deletions(-)

diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index c2ff41bb0fd..3149f7227ac 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -35,6 +35,14 @@
   (ior (match_operand 0 "arith_operand")
(match_operand 0 "lui_operand")))
 
+(define_predicate "const_lui_operand"
+  (and (match_code "const_int")
+   (match_test "(INTVAL (op) & 0xFFF) == 0 && INTVAL (op) != 0")))
+
+(define_predicate "add_operand"
+  (ior (match_operand 0 "arith_operand")
+   (match_operand 0 "const_lui_operand")))
+
 (define_predicate "const_csr_operand"
   (and (match_code "const_int")
(match_test "IN_RANGE (INTVAL (op), 0, 31)")))
@@ -59,6 +67,11 @@
   (ior (match_operand 0 "const_0_operand")
(match_operand 0 "register_operand")))
 
+;; For use in adds, when adding to an eliminable register.
+(define_predicate "reg_or_const_int_operand"
+  (ior (match_code "const_int")
+   (match_operand 0 "register_operand")))
+
 ;; Only use branch-on-bit sequences when the mask is not an ANDI immediate.
 (define_predicate "branch_on_bit_operand"
   (and (match_code "const_int")
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 5a718bb62b4..9348ac71956 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -63,6 +63,7 @@ extern void riscv_expand_conditional_move (rtx, rtx, rtx, 
rtx_code, rtx, rtx);
 extern rtx riscv_legitimize_call_address (rtx);
 extern void riscv_set_return_address (rtx, rtx);
 extern bool riscv_expand_block_move (rtx, rtx, rtx);
+extern bool riscv_eliminable_reg (rtx);
 extern rtx riscv_return_addr (int, rtx);
 extern poly_int64 riscv_initial_elimination_offset (int, int);
 extern void riscv_expand_prologue (void);
diff --git a/gcc/config/riscv/riscv-selftests.cc 
b/gcc/config/riscv/riscv-selftests.cc
index 636874ebc0f..50457db708e 100644
--- a/gcc/config/riscv/riscv-selftests.cc
+++ b/gcc/config/riscv/riscv-selftests.cc
@@ -116,6 +116,9 @@ calculate_x_in_sequence (rtx reg)
   rtx pat = PATTERN (insn);
   rtx dest = SET_DEST (pat);
 
+  if (GET_CODE (pat) == PARALLEL)
+   dest = SET_DEST (XVECEXP (pat, 0, 0));
+
   if (GET_CODE (pat) == CLOBBER)
continue;
 
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 32f9ef9ade9..de9344b37a3 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -4686,6 +4686,16 @@ riscv_initial_elimination_offset (int from, int to)
   return src - dest;
 }
 
+/* Return true if X is a register that will be eliminated later on.  */
+bool
+riscv_eliminable_reg (rtx x)
+{
+  return REG_P (x) && (REGNO (x) == FRAME_POINTER_REGNUM
+  || REGNO (x) == ARG_POINTER_REGNUM
+  || (REGNO (x) >= FIRST_VIRTUAL_REGISTER
+  && REGNO (x) <= LAST_VIRTUAL_REGISTER));
+}
+
 /* Implement RETURN_ADDR_RTX.  We do not support moving back to a
previous frame.  */
 
@@ -4887,8 +4897,9 @@ riscv_adjust_libcall_cfi_prologue ()
   }
 
   /* Debug info for adjust sp.  */
-  adjust_sp_rtx = gen_add

[PATCH v2] RISC-V modified add3 for large stack frame optimization [PR105733]

2022-11-01 Thread Kevin Lee
This is an updated version of the patch in
https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601824.html. Since
riscv-selftests.cc has been added, this version of the patch also updates
the logic in riscv-selftests.cc to handle PARALLEL insns.
  The patch has been tested with rv64imafdc / rv64imac / rv32imafdc /
rv32imac, and no additional failures were detected in the testsuite.

gcc/ChangeLog:
Jim Wilson 
Michael Collison 
Kevin Lee 
* config/riscv/predicates.md (const_lui_operand): New Predicate.
(add_operand): Ditto.
(reg_or_const_int_operand): Ditto.
* config/riscv/riscv-protos.h (riscv_eliminable_reg): New
function.
* config/riscv/riscv-selftests.cc (calculate_x_in_sequence):
Consider Parallel insns.
* config/riscv/riscv.cc (riscv_eliminable_reg): New function.
(riscv_adjust_libcall_cfi_prologue): Use gen_rtx_SET and
gen_rtx_fmt_ee instead of gen_add3_insn.
(riscv_adjust_libcall_cfi_epilogue): Ditto.
* config/riscv/riscv.md (addsi3): Remove.
(add3): New instruction for large stack frame
optimization.
(add3_internal): Ditto.
(adddi3): Remove.
(add3_internal2): New instruction for insns generated in
the prologue and epilogue pass.
---
gcc/config/riscv/predicates.md | 13 +
gcc/config/riscv/riscv-protos.h | 1 +
gcc/config/riscv/riscv-selftests.cc | 3 ++
gcc/config/riscv/riscv.cc | 20 +--
gcc/config/riscv/riscv.md | 84 -
5 files changed, 104 insertions(+), 17 deletions(-)

diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index c2ff41bb0fd..3149f7227ac 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -35,6 +35,14 @@
(ior (match_operand 0 "arith_operand")
(match_operand 0 "lui_operand")))
+(define_predicate "const_lui_operand"
+ (and (match_code "const_int")
+ (match_test "(INTVAL (op) & 0xFFF) == 0 && INTVAL (op) != 0")))
+
+(define_predicate "add_operand"
+ (ior (match_operand 0 "arith_operand")
+ (match_operand 0 "const_lui_operand")))
+
(define_predicate "const_csr_operand"
(and (match_code "const_int")
(match_test "IN_RANGE (INTVAL (op), 0, 31)")))
@@ -59,6 +67,11 @@
(ior (match_operand 0 "const_0_operand")
(match_operand 0 "register_operand")))
+;; For use in adds, when adding to an eliminable register.
+(define_predicate "reg_or_const_int_operand"
+ (ior (match_code "const_int")
+ (match_operand 0 "register_operand")))
+
;; Only use branch-on-bit sequences when the mask is not an ANDI immediate.
(define_predicate "branch_on_bit_operand"
(and (match_code "const_int")
diff --git a/gcc/config/riscv/riscv-protos.h
b/gcc/config/riscv/riscv-protos.h
index 5a718bb62b4..9348ac71956 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -63,6 +63,7 @@ extern void riscv_expand_conditional_move (rtx, rtx, rtx,
rtx_code, rtx, rtx);
extern rtx riscv_legitimize_call_address (rtx);
extern void riscv_set_return_address (rtx, rtx);
extern bool riscv_expand_block_move (rtx, rtx, rtx);
+extern bool riscv_eliminable_reg (rtx);
extern rtx riscv_return_addr (int, rtx);
extern poly_int64 riscv_initial_elimination_offset (int, int);
extern void riscv_expand_prologue (void);
diff --git a/gcc/config/riscv/riscv-selftests.cc
b/gcc/config/riscv/riscv-selftests.cc
index 636874ebc0f..50457db708e 100644
--- a/gcc/config/riscv/riscv-selftests.cc
+++ b/gcc/config/riscv/riscv-selftests.cc
@@ -116,6 +116,9 @@ calculate_x_in_sequence (rtx reg)
rtx pat = PATTERN (insn);
rtx dest = SET_DEST (pat);
+ if (GET_CODE (pat) == PARALLEL)
+ dest = SET_DEST (XVECEXP (pat, 0, 0));
+
if (GET_CODE (pat) == CLOBBER)
continue;
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 32f9ef9ade9..de9344b37a3 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -4686,6 +4686,16 @@ riscv_initial_elimination_offset (int from, int to)
return src - dest;
}
+/* Return true if X is a register that will be eliminated later on. */
+bool
+riscv_eliminable_reg (rtx x)
+{
+ return REG_P (x) && (REGNO (x) == FRAME_POINTER_REGNUM
+ || REGNO (x) == ARG_POINTER_REGNUM
+ || (REGNO (x) >= FIRST_VIRTUAL_REGISTER
+ && REGNO (x) <= LAST_VIRTUAL_REGISTER));
+}
+
/* Implement RETURN_ADDR_RTX. We do not support moving back to a
previous frame. */
@@ -4887,8 +4897,9 @@ riscv_adjust_libcall_cfi_prologue ()
}
/* Debug info for adjust sp. */
- adjust_sp_rtx = gen_add3_insn (stack_pointer_rtx,
- stack_pointer_rtx, GEN_INT (-saved_size));
+ adjust_sp_rtx = gen_rtx_SET (stack_pointer_rtx,
+ gen_rtx_fmt_ee (PLUS, GET_MODE (stack_pointer_rtx),
+ stack_pointer_rtx, GEN_INT (saved_size)));
dwarf = alloc_reg_note (REG_CFA_ADJUST_CFA, adjust_sp_rtx,
dwarf);
return dwarf;
@@ -4990,8 +5001,9 @@ riscv_adjust_libcall_cfi_epilogue ()
int saved_size = cfun->machine->frame.save_libcall_adjustment;
/* Debug 

Re: [PATCH] RISC-V missing __builtin_lceil and __builtin_lfloor

2022-10-02 Thread Kevin Lee
Thank you for the update Palmer. I'll certainly look into the corrupted
patch issue and the floating-point test failure in gfortran.

On Sun, Oct 2, 2022 at 1:42 PM Palmer Dabbelt  wrote:

> On Sat, 17 Sep 2022 14:16:13 PDT (-0700), Kito Cheng wrote:
> > LGTM, thanks, I guess I just missed this before
>
> No worries, I'd just stumbled on it looking through old stuff.
>
> Kevin: Looks like this got corrupted, possibly from copy/paste into
> gmail.  I resurrect it, but there's a floating-point test failure in
> gfortran.  Looks like it predates this, but I'm trying to bisect it to
> at least have a root cause before just ignoring it.  I've got this
> floating around on a branch and hopefully that'll remind me to commit
> it after I sort that out.
>
> >
> > Palmer Dabbelt  於 2022年9月17日 週六 23:07 寫道:
> >
> >> On Mon, 15 Aug 2022 17:44:35 PDT (-0700), kev...@rivosinc.com wrote:
> >> > Hello,
> >> > Currently, __builtin_lceil and __builtin_lfloor doesn't generate an
> >> > existing instruction fcvt, but rather calls ceil and floor from the
> >> > library. This patch adds the missing iterator and attributes for lceil
> >> and
> >> > lfloor to produce the optimized code.
> >> >  The test cases check the correct generation of the fcvt instruction
> for
> >> > float/double to int/long/long long. Passed the test in riscv-linux.
> >> > Could this patch be committed?
> >>
> >> Reviewed-by: Palmer Dabbelt 
> >> Acked-by: Palmer Dabbelt 
> >>
> >> Not sure if Kito had any comments for this one, but it looks good to me.
> >>
> >> > gcc/ChangeLog:
> >> >Michael Collison  
> >> > * config/riscv/riscv.md (RINT): Add iterator for lceil and
> >> lround.
> >> > (rint_pattern): Add ceil and floor.
> >> > (rint_rm): Add rup and rdn.
> >> >
> >> > gcc/testsuite/ChangeLog:
> >> > Kevin Lee  
> >> > * gcc.target/riscv/lfloor-lceil.c: New test.
> >> > ---
> >> >  gcc/config/riscv/riscv.md | 13 ++-
> >> >  gcc/testsuite/gcc.target/riscv/lfloor-lceil.c | 79
> +++
> >> >  2 files changed, 88 insertions(+), 4 deletions(-)
> >> >  create mode 100644 gcc/testsuite/gcc.target/riscv/lfloor-lceil.c
> >> >
> >> > diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> >> > index c6399b1389e..070004fa7fe 100644
> >> > --- a/gcc/config/riscv/riscv.md
> >> > +++ b/gcc/config/riscv/riscv.md
> >> > @@ -43,6 +43,9 @@ (define_c_enum "unspec" [
> >> >UNSPEC_LRINT
> >> >UNSPEC_LROUND
> >> >
> >> > +  UNSPEC_LCEIL
> >> > +  UNSPEC_LFLOOR
> >> > +
> >> >;; Stack tie
> >> >UNSPEC_TIE
> >> >  ])
> >> > @@ -345,10 +348,12 @@ (define_mode_attr UNITMODE [(SF "SF") (DF
> "DF")])
> >> >  ;; the controlling mode.
> >> >  (define_mode_attr HALFMODE [(DF "SI") (DI "SI") (TF "DI")])
> >> >
> >> > -;; Iterator and attributes for floating-point rounding instructions.
> >> > -(define_int_iterator RINT [UNSPEC_LRINT UNSPEC_LROUND])
> >> > -(define_int_attr rint_pattern [(UNSPEC_LRINT "rint") (UNSPEC_LROUND
> >> > "round")])
> >> > -(define_int_attr rint_rm [(UNSPEC_LRINT "dyn") (UNSPEC_LROUND
> "rmm")])
> >> > +;; Iterator and attributes for floating-point rounding instructions.f
> >> > +(define_int_iterator RINT [UNSPEC_LRINT UNSPEC_LROUND UNSPEC_LCEIL
> >> > UNSPEC_LFLOOR])
> >> > +(define_int_attr rint_pattern [(UNSPEC_LRINT "rint") (UNSPEC_LROUND
> >> > "round")
> >> > + (UNSPEC_LCEIL "ceil") (UNSPEC_LFLOOR
> >> > "floor")])
> >> > +(define_int_attr rint_rm [(UNSPEC_LRINT "dyn") (UNSPEC_LROUND "rmm")
> >> > +(UNSPEC_LCEIL "rup") (UNSPEC_LFLOOR "rdn")])
> >> >
> >> >  ;; Iterator and attributes for quiet comparisons.
> >> >  (define_int_iterator QUIET_COMPARISON [UNSPEC_FLT_QUIET
> >> UNSPEC_FLE_QUIET])
> >> > diff --git a/gcc/testsuite/gcc.target/riscv/lfloor-lceil.c
> >> > b/gcc/testsuite/gcc.target/riscv/lfloor-lceil.c
> >> > new file 

Re: [PATCH] RISC-V modified add3 for large stack frame optimization [PR105733]

2022-09-20 Thread Kevin Lee
The proposed patch only makes a difference if operand 1 is an
eliminable register and operand 2 is a splittable const int. Otherwise, it
follows the original add3 pattern.
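
To make "splittable const int" concrete, here is a minimal sketch in C. It assumes that "splittable" means a constant that fits neither a single addi immediate nor a single lui-style value, so it has to be built from a high part plus a 12-bit low part. The helper names (fits_addi_imm, is_lui_like, split_const) are hypothetical and exist only for this illustration; they are not GCC functions, and the sketch ignores the 32-bit range limit of lui.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helpers for illustration only; not GCC's own macros. */

/* True if VALUE fits the signed 12-bit immediate of addi. */
static bool fits_addi_imm(long value)
{
    return value >= -2048 && value <= 2047;
}

/* True if VALUE is a nonzero multiple of 4096.  This mirrors the test in
   the const_lui_operand predicate of the patch:
   (INTVAL (op) & 0xFFF) == 0 && INTVAL (op) != 0.  */
static bool is_lui_like(long value)
{
    return (value & 0xFFF) == 0 && value != 0;
}

/* Split VALUE into *HI + *LO, where *HI is a multiple of 4096 and *LO
   fits an addi immediate.  */
static void split_const(long value, long *hi, long *lo)
{
    long low = value & 0xFFF;   /* low 12 bits, in 0..4095 */
    if (low >= 2048)
        low -= 4096;            /* sign-extend to -2048..2047 */
    *lo = low;
    *hi = value - low;
    assert((*hi & 0xFFF) == 0 && fits_addi_imm(*lo));
}
```

For example, split_const (-40096, &hi, &lo) yields hi = -40960 and lo = 864, which matches the "li t0,-40960" / "addi t0,t0,864" pair in the After listing for saxpy below.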

Besides the example from PR105733 shown in the first post, the following code
#define BUF_SIZE 5012
void saxpy( float a )
{
  volatile float x[BUF_SIZE];
  volatile float y[BUF_SIZE];

  for (int i = 0; i < BUF_SIZE; ++i)
  y[i] = a*x[i] + y[i];
}
generates
Before:
saxpy:
li t0,-40960
li a2,40960
addi t0,t0,848
add sp,sp,t0
li a4,-40960
addi a3,a2,-864
add a3,a3,a4
addi a4,sp,16
add a4,a3,a4
sd a4,0(sp)
addi a3,a2,-864
li a4,-20480
add a3,a3,a4
addi a4,sp,16
add a4,a3,a4
li a2,4096
li a5,0
sd a4,8(sp)
addi a2,a2,916
.L2:
ld a4,8(sp)
ld a3,0(sp)
sh2add a4,a5,a4
sh2add a3,a5,a3
flw fa5,864(a3)
flw fa4,432(a4)
addiw a5,a5,1
fmadd.s fa5,fa5,fa0,fa4
fsw fa5,432(a4)
bne a5,a2,.L2
li t0,40960
addi t0,t0,-848
add sp,sp,t0
jr ra

After:
saxpy:
li t0,-40960
addi t0,t0,864
li a2,4096
add sp,sp,t0
li a5,0
addi a2,a2,916
.L2:
li a4,20480
addi a4,a4,-864
add a4,a4,sp
addi a3,sp,-864
sh2add a4,a5,a4
sh2add a3,a5,a3
flw fa5,864(a3)
flw fa4,432(a4)
addiw a5,a5,1
fmadd.s fa5,fa5,fa0,fa4
fsw fa5,432(a4)
bne a5,a2,.L2
li t0,40960
addi t0,t0,-864
add sp,sp,t0
jr ra

The number of instructions before .L2 is reduced from 19 to 6 after the
patch.
Moreover, the following example
#define limit 4096
void foo()
{
volatile int temp = 0;
volatile int buf[limit];
for(int i = 0; i < limit; ++i){
for(int j = 0; j < limit; ++j){
temp += buf[(i * 1234 + j) % limit];
}
}
}
generates
Before:
foo:
li t0,-16384
addi t0,t0,-32
li a4,16384
add sp,sp,t0
li a5,-16384
addi a4,a4,16
add a4,a4,a5
addi a5,sp,16
add a5,a4,a5
li a1,4096
sd a5,8(sp)
sw zero,-4(a5)
li a7,-4096
addi a0,a1,-1
li a6,5058560
.L2:
addw a5,a7,a1
.L3:
ld a3,8(sp)
and a4,a5,a0
addiw a5,a5,1
sh2add a4,a4,a3
lw a2,0(a4)
lw a4,-4(a3)
addw a4,a4,a2
ld a2,8(sp)
sw a4,-4(a2)
bne a5,a1,.L3
addiw a1,a5,1234
bne a1,a6,.L2
li t0,16384
addi t0,t0,32
add sp,sp,t0
jr ra

After:
foo:
li t0,-16384
addi t0,t0,-16
add sp,sp,t0
li a1,4096
sw zero,12(sp)
li a7,-4096
addi a0,a1,-1
li a6,5058560
.L2:
addw a5,a7,a1
.L3:
and a4,a5,a0
addi a3,sp,16
sh2add a4,a4,a3
lw a2,0(a4)
lw a4,12(sp)
addiw a5,a5,1
addw a4,a4,a2
sw a4,12(sp)
bne a5,a1,.L3
addiw a1,a5,1234
bne a1,a6,.L2
li t0,16384
addi t0,t0,16
add sp,sp,t0
jr ra

This example also shows that the number of instructions before .L2 is
reduced from 15 to 8 after the patch.
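
As a sanity check on the listings above, the stack adjustments can be recomputed from the sizes of the local variables. The sketch below assumes 4-byte float and int and the 16-byte stack alignment of the standard RISC-V ABIs; the helper names are hypothetical and used only here.

```c
#include <assert.h>

/* Round BYTES up to a 16-byte boundary (assumed stack alignment). */
static long align16(long bytes)
{
    return (bytes + 15) & ~15L;
}

/* Frame size implied by a prologue of the form
   "li t0,HI; addi t0,t0,LO; add sp,sp,t0".  */
static long frame_from_prologue(long hi, long lo)
{
    assert(hi + lo < 0);   /* the prologue grows the stack downwards */
    return -(hi + lo);
}

/* saxpy: two arrays of BUF_SIZE (5012) 4-byte floats. */
static long saxpy_locals(void)
{
    return 2L * 5012 * 4;
}

/* foo: one array of 4096 4-byte ints plus the 4-byte temp. */
static long foo_locals(void)
{
    return 4096L * 4 + 4;
}
```

With these helpers, align16 (saxpy_locals ()) is 40096, which equals frame_from_prologue (-40960, 864) from the After listing, while the Before listing allocates 40112 bytes. For foo the locals round up to 16400 bytes, matching the After prologue, against 16416 before. In both cases the 16 extra bytes before the patch are consistent with the pointer spills (the sd stores to 0(sp) and 8(sp)) that the patch removes.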

On Mon, Sep 19, 2022 at 3:16 PM Kito Cheng  wrote:

> Could you provide some data including code size and performance? add is
> a frequently used pattern, so we should be more careful when changing that.
>
> On Mon, Sep 19, 2022 at 18:07, Kevin Lee wrote:
>
>> Hello GCC,
>> Starting from Jim Wilson's patch in
>>
>> https://github.com/riscv-admin/riscv-code-speed-optimization/blob/main/projects/gcc-optimizations.adoc
>> for the large stack frame optimization problem, this augmented patch
>> generates fewer instructions for cases such as
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105733.
>> Original:
>> foo:
>> li t0,-4096
>> addi t0,t0,2016
>> li a4,4096
>> add sp,sp,t0
>> li a5,-4096
>> addi a4,a4,-2032
>> add a4,a4,a5
>> addi a5,sp,16
>> add a5,a4,a5
>> add a0,a5,a0
>> li t0,4096
>> sd a5,8(sp)
>> sb zero,2032(a0)
>> addi t0,t0,-2016
>> add sp,sp,t0
>> jr ra
>> After Patch:
>> foo:
>> li t0,-4096
>> addi t0,t0,2032
>> add sp,sp,t0
>> addi a5,sp,-2032
>> add a0,a5

[PATCH] RISC-V modified add3 for large stack frame optimization [PR105733]

2022-09-19 Thread Kevin Lee
Hello GCC,
Starting from Jim Wilson's patch in
https://github.com/riscv-admin/riscv-code-speed-optimization/blob/main/projects/gcc-optimizations.adoc
for the large stack frame optimization problem, this augmented patch
generates fewer instructions for cases such as
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105733.
Original:
foo:
li t0,-4096
addi t0,t0,2016
li a4,4096
add sp,sp,t0
li a5,-4096
addi a4,a4,-2032
add a4,a4,a5
addi a5,sp,16
add a5,a4,a5
add a0,a5,a0
li t0,4096
sd a5,8(sp)
sb zero,2032(a0)
addi t0,t0,-2016
add sp,sp,t0
jr ra
After Patch:
foo:
li t0,-4096
addi t0,t0,2032
add sp,sp,t0
addi a5,sp,-2032
add a0,a5,a0
li t0,4096
sb zero,2032(a0)
addi t0,t0,-2032
add sp,sp,t0
jr ra

               = Summary of gcc testsuite =
                        | # of unexpected case / # of unique unexpected case
                        |      gcc |      g++ | gfortran |
 rv64gc/  lp64d/ medlow |    4 / 4 |   13 / 4 |    0 / 0 |
No additional failures were created from the testsuite.

gcc/ChangeLog:
   Jim Wilson 
   Michael Collison 
   Kevin Lee 

* config/riscv/predicates.md (const_lui_operand): New predicate.
(add_operand): Ditto.
(reg_or_const_int_operand): Ditto.
* config/riscv/riscv-protos.h (riscv_eliminable_reg): New
function.
* config/riscv/riscv.cc (riscv_eliminable_reg): New function.
(riscv_adjust_libcall_cfi_prologue): Use gen_rtx_SET and
gen_rtx_fmt_ee instead of gen_add3_insn.
(riscv_adjust_libcall_cfi_epilogue): Ditto.
* config/riscv/riscv.md (addsi3): Remove.
(adddi3): Ditto.
(add3): New instruction for large stack frame optimization.
(add3_internal): Ditto.
(add3_internal2): New instruction for insns generated in
the prologue and epilogue pass.
---
 gcc/config/riscv/predicates.md  | 13 +
 gcc/config/riscv/riscv-protos.h |  1 +
 gcc/config/riscv/riscv.cc   | 20 ++--
 gcc/config/riscv/riscv.md   | 84 -
 4 files changed, 101 insertions(+), 17 deletions(-)

diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index 862e72b0983..b98bb5a9768 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -35,6 +35,14 @@ (define_predicate "sfb_alu_operand"
   (ior (match_operand 0 "arith_operand")
(match_operand 0 "lui_operand")))

+(define_predicate "const_lui_operand"
+  (and (match_code "const_int")
+   (match_test "(INTVAL (op) & 0xFFF) == 0 && INTVAL (op) != 0")))
+
+(define_predicate "add_operand"
+  (ior (match_operand 0 "arith_operand")
+   (match_operand 0 "const_lui_operand")))
+
 (define_predicate "const_csr_operand"
   (and (match_code "const_int")
(match_test "IN_RANGE (INTVAL (op), 0, 31)")))
@@ -59,6 +67,11 @@ (define_predicate "reg_or_0_operand"
   (ior (match_operand 0 "const_0_operand")
(match_operand 0 "register_operand")))

+;; For use in adds, when adding to an eliminable register.
+(define_predicate "reg_or_const_int_operand"
+  (ior (match_code "const_int")
+   (match_operand 0 "register_operand")))
+
 ;; Only use branch-on-bit sequences when the mask is not an ANDI immediate.
 (define_predicate "branch_on_bit_operand"
   (and (match_code "const_int")
diff --git a/gcc/config/riscv/riscv-protos.h
b/gcc/config/riscv/riscv-protos.h
index 649c5c977e1..8f0aa8114be 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -63,6 +63,7 @@ extern void riscv_expand_conditional_move (rtx, rtx, rtx,
rtx_code, rtx, rtx);
 extern rtx riscv_legitimize_call_address (rtx);
 extern void riscv_set_return_address (rtx, rtx);
 extern bool riscv_expand_block_move (rtx, rtx, rtx);
+extern bool riscv_eliminable_reg (rtx);
 extern rtx riscv_return_addr (int, rtx);
 extern poly_int64 riscv_initial_elimination_offset (int, int);
 extern void riscv_expand_prologue (void);
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 675d92c0961..b5577a4f366 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -4320,6 +4320,16 @@ riscv_initial_elimination_offset (int from, int to)
   return src - dest;
 }

+/* Return true if X is a register that will be eliminated later on.  */
+bool
+riscv_eliminable_reg (rtx x)
+{
+  return REG_P (x) && (REGNO (x) == FRAME_POINTER_REGNUM
+   || REGNO (x) == ARG_POINTER_REGNUM
+   || (REGNO (x) >= FIRST_VIRTUAL_REGISTER
+   && REGNO (x) <= LAST_VIRTUAL_REGISTER));
+}
+
 /* Implement RETURN_ADDR_RTX.  We do not support moving back to a
previous frame.  */

@@ -4521,8 +4531,9 @@ riscv_adjust_libcall_cfi_prologue ()
   }

   /* Debug info for adjust sp.  */
-  adjust_sp_rtx = gen_add3_insn (stack_pointer_rtx,
- stack_pointer_rtx, GEN_INT (-saved_size));
+  adjust_sp_rtx = gen_r

[PATCH] RISC-V missing __builtin_lceil and __builtin_lfloor

2022-08-15 Thread Kevin Lee
Hello,
Currently, __builtin_lceil and __builtin_lfloor don't generate the existing
fcvt instruction, but instead call ceil and floor from the library. This
patch adds the missing iterator and attributes for lceil and lfloor to
produce the optimized code.
The test cases check that the fcvt instruction is generated for
float/double to int/long/long long conversions. The tests passed on
riscv-linux.
Could this patch be committed?
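
For reference, the values the two builtins are expected to produce can be written out in plain C. The patch maps lceil to the rup rounding mode (round toward +infinity) and lfloor to rdn (round toward -infinity), so a single fcvt with that rounding mode gives the same result as rounding first and converting afterwards. The functions below are a hypothetical reference model, not part of the patch, and they assume the input is not NaN and fits in a long.

```c
#include <assert.h>

/* Reference model of __builtin_lceil: smallest long that is >= X.
   Assumes X is not NaN and is within the range of long.  */
static long lceil_ref(double x)
{
    assert(x == x);     /* reject NaN */
    long t = (long) x;  /* C conversion truncates toward zero */
    return ((double) t < x) ? t + 1 : t;
}

/* Reference model of __builtin_lfloor: largest long that is <= X.
   Same assumptions as lceil_ref.  */
static long lfloor_ref(double x)
{
    assert(x == x);     /* reject NaN */
    long t = (long) x;
    return ((double) t > x) ? t - 1 : t;
}
```

So lceil_ref (-2.1) is -2 and lfloor_ref (-2.1) is -3, while truncation alone would give -2 in both cases. That is why the conversion needs an explicit rounding mode and not the default round-toward-zero conversion.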

gcc/ChangeLog:
   Michael Collison  
* config/riscv/riscv.md (RINT): Add iterator for lceil and lfloor.
(rint_pattern): Add ceil and floor.
(rint_rm): Add rup and rdn.

gcc/testsuite/ChangeLog:
Kevin Lee  
* gcc.target/riscv/lfloor-lceil.c: New test.
---
 gcc/config/riscv/riscv.md | 13 ++-
 gcc/testsuite/gcc.target/riscv/lfloor-lceil.c | 79 +++
 2 files changed, 88 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/lfloor-lceil.c

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index c6399b1389e..070004fa7fe 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -43,6 +43,9 @@ (define_c_enum "unspec" [
   UNSPEC_LRINT
   UNSPEC_LROUND

+  UNSPEC_LCEIL
+  UNSPEC_LFLOOR
+
   ;; Stack tie
   UNSPEC_TIE
 ])
@@ -345,10 +348,12 @@ (define_mode_attr UNITMODE [(SF "SF") (DF "DF")])
 ;; the controlling mode.
 (define_mode_attr HALFMODE [(DF "SI") (DI "SI") (TF "DI")])

-;; Iterator and attributes for floating-point rounding instructions.
-(define_int_iterator RINT [UNSPEC_LRINT UNSPEC_LROUND])
-(define_int_attr rint_pattern [(UNSPEC_LRINT "rint") (UNSPEC_LROUND
"round")])
-(define_int_attr rint_rm [(UNSPEC_LRINT "dyn") (UNSPEC_LROUND "rmm")])
+;; Iterator and attributes for floating-point rounding instructions.
+(define_int_iterator RINT [UNSPEC_LRINT UNSPEC_LROUND UNSPEC_LCEIL
UNSPEC_LFLOOR])
+(define_int_attr rint_pattern [(UNSPEC_LRINT "rint") (UNSPEC_LROUND
"round")
+ (UNSPEC_LCEIL "ceil") (UNSPEC_LFLOOR
"floor")])
+(define_int_attr rint_rm [(UNSPEC_LRINT "dyn") (UNSPEC_LROUND "rmm")
+(UNSPEC_LCEIL "rup") (UNSPEC_LFLOOR "rdn")])

 ;; Iterator and attributes for quiet comparisons.
 (define_int_iterator QUIET_COMPARISON [UNSPEC_FLT_QUIET UNSPEC_FLE_QUIET])
diff --git a/gcc/testsuite/gcc.target/riscv/lfloor-lceil.c
b/gcc/testsuite/gcc.target/riscv/lfloor-lceil.c
new file mode 100644
index 000..4d81c12cefa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/lfloor-lceil.c
@@ -0,0 +1,79 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc -mabi=lp64d" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
+
+int
+ceil1(float i)
+{
+  return __builtin_lceil(i);
+}
+
+long
+ceil2(float i)
+{
+  return __builtin_lceil(i);
+}
+
+long long
+ceil3(float i)
+{
+  return __builtin_lceil(i);
+}
+
+int
+ceil4(double i)
+{
+  return __builtin_lceil(i);
+}
+
+long
+ceil5(double i)
+{
+  return __builtin_lceil(i);
+}
+
+long long
+ceil6(double i)
+{
+  return __builtin_lceil(i);
+}
+
+int
+floor1(float i)
+{
+  return __builtin_lfloor(i);
+}
+
+long
+floor2(float i)
+{
+  return __builtin_lfloor(i);
+}
+
+long long
+floor3(float i)
+{
+  return __builtin_lfloor(i);
+}
+
+int
+floor4(double i)
+{
+  return __builtin_lfloor(i);
+}
+
+long
+floor5(double i)
+{
+  return __builtin_lfloor(i);
+}
+
+long long
+floor6(double i)
+{
+  return __builtin_lfloor(i);
+}
+
+/* { dg-final { scan-assembler-times "fcvt.l.s" 6 } } */
+/* { dg-final { scan-assembler-times "fcvt.l.d" 6 } } */
+/* { dg-final { scan-assembler-not "call" } } */
-- 
2.25.1