[PATCH] MATCH: Remove redundant pattern for `(x | y) & ~x`

2023-08-27 Thread Andrew Pinski via Gcc-patches
After r14-2885-gb9237226fdc938, this pattern becomes
redundant as we match it using bitwise_inverted_equal_p.

There is already a testcase (gcc.dg/nand.c) for this pattern
and it still passes after the removal.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

PR tree-optimization/46
* match.pd (`(x | y) & ~x`, `(x & y) | ~x`): Remove
redundant pattern.
---
 gcc/match.pd | 8 
 1 file changed, 8 deletions(-)

diff --git a/gcc/match.pd b/gcc/match.pd
index fa598d5ca2e..0076392c522 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1556,14 +1556,6 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (bit_ior:c (bit_xor:s @0 @1) (bit_not:s (bit_ior:s @0 @1)))
  (bit_not (bit_and @0 @1)))
 
-/* (x | y) & ~x -> y & ~x */
-/* (x & y) | ~x -> y | ~x */
-(for bitop (bit_and bit_ior)
- rbitop (bit_ior bit_and)
- (simplify
-  (bitop:c (rbitop:c @0 @1) (bit_not@2 @0))
-  (bitop @1 @2)))
-
 /* (x & y) ^ (x | y) -> x ^ y */
 (simplify
  (bit_xor:c (bit_and @0 @1) (bit_ior @0 @1))
-- 
2.31.1



Re: [PATCH] RISC-V: Enable vec_init testsuite for RVV VLA vectorization

2023-08-27 Thread Kito Cheng via Gcc-patches
> @@ -11100,6 +11101,15 @@ proc check_vect_support_and_set_flags { } {
>  }
>  } elseif [istarget amdgcn-*-*] {
>  set dg-do-what-default run
> +} elseif [istarget riscv64-*-*] {
> +   if [check_effective_target_riscv_vector_hw] {
> +   lappend DEFAULT_VECTCFLAGS "--param" "riscv-autovec-preference=scalable"
> +   set dg-do-what-default run
> +   } else {
> +   lappend DEFAULT_VECTCFLAGS "-march=rv64gcv_zfh" "-mabi=lp64d"

I would suggest using `-march=rv64gcv` or `-march=rv64gcv_zvfh_zfh` instead.
I think zfh on its own is not meaningful here.


[PATCH v2] LoongArch: Enable '-free' starting at -O2.

2023-08-27 Thread Lulu Cheng
v1 -> v2:
1. Modify Changelog information format.

gcc/ChangeLog:

* common/config/loongarch/loongarch-common.cc:
Enable '-free' on O2 and above.
* doc/invoke.texi: Modify the description information
of the '-free' compilation option and add the LoongArch
description.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/sign-extend.c: New test.
---
 .../config/loongarch/loongarch-common.cc  |  1 +
 gcc/doc/invoke.texi   |  4 +--
 .../gcc.target/loongarch/sign-extend.c| 25 +++
 3 files changed, 28 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/sign-extend.c

diff --git a/gcc/common/config/loongarch/loongarch-common.cc 
b/gcc/common/config/loongarch/loongarch-common.cc
index fce32fa3f8d..c5ed37d27a6 100644
--- a/gcc/common/config/loongarch/loongarch-common.cc
+++ b/gcc/common/config/loongarch/loongarch-common.cc
@@ -35,6 +35,7 @@ static const struct default_options 
loongarch_option_optimization_table[] =
 {
   { OPT_LEVELS_ALL, OPT_fasynchronous_unwind_tables, NULL, 1 },
   { OPT_LEVELS_1_PLUS, OPT_fsection_anchors, NULL, 1 },
+  { OPT_LEVELS_2_PLUS, OPT_free, NULL, 1 },
   { OPT_LEVELS_NONE, 0, NULL, 0 }
 };
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index a32dabf0405..16aa92b5e86 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -12639,8 +12639,8 @@ Attempt to remove redundant extension instructions.  
This is especially
 helpful for the x86-64 architecture, which implicitly zero-extends in 64-bit
 registers after writing to their lower 32-bit half.
 
-Enabled for Alpha, AArch64, PowerPC, RISC-V, SPARC, h83000 and x86 at levels
-@option{-O2}, @option{-O3}, @option{-Os}.
+Enabled for Alpha, AArch64, LoongArch, PowerPC, RISC-V, SPARC, h83000 and x86 at
+levels @option{-O2}, @option{-O3}, @option{-Os}.
 
 @opindex fno-lifetime-dse
 @opindex flifetime-dse
diff --git a/gcc/testsuite/gcc.target/loongarch/sign-extend.c 
b/gcc/testsuite/gcc.target/loongarch/sign-extend.c
new file mode 100644
index 000..3f339d06bbd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/sign-extend.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options "-mabi=lp64d -O2" } */
+/* { dg-final { scan-assembler-times "slli.w" 1 } } */
+
+extern int PL_savestack_ix;
+extern int PL_regsize;
+extern int PL_savestack_max;
+void Perl_savestack_grow_cnt (int need);
+extern void Perl_croak (char *);
+
+int
+S_regcppush(int parenfloor)
+{
+  int retval = PL_savestack_ix;
+  int paren_elems_to_push = (PL_regsize - parenfloor) * 4;
+  int p;
+
+  if (paren_elems_to_push < 0)
+Perl_croak ("panic: paren_elems_to_push < 0");
+
+  if (PL_savestack_ix + (paren_elems_to_push + 6) > PL_savestack_max)
+Perl_savestack_grow_cnt (paren_elems_to_push + 6);
+
+  return retval;
+}
-- 
2.31.1



[PATCH v1] LoongArch: Enable '-free' starting at -O2.

2023-08-27 Thread Lulu Cheng
gcc/ChangeLog:

* common/config/loongarch/loongarch-common.cc:
Enable '-free' on O2 and above.
* doc/invoke.texi:
Modify the description information of the '-free'
compilation option and add the LoongArch description.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/sign-extend.c: New test.
---
 .../config/loongarch/loongarch-common.cc  |  1 +
 gcc/doc/invoke.texi   |  4 +--
 .../gcc.target/loongarch/sign-extend.c| 25 +++
 3 files changed, 28 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/sign-extend.c

diff --git a/gcc/common/config/loongarch/loongarch-common.cc 
b/gcc/common/config/loongarch/loongarch-common.cc
index fce32fa3f8d..c5ed37d27a6 100644
--- a/gcc/common/config/loongarch/loongarch-common.cc
+++ b/gcc/common/config/loongarch/loongarch-common.cc
@@ -35,6 +35,7 @@ static const struct default_options 
loongarch_option_optimization_table[] =
 {
   { OPT_LEVELS_ALL, OPT_fasynchronous_unwind_tables, NULL, 1 },
   { OPT_LEVELS_1_PLUS, OPT_fsection_anchors, NULL, 1 },
+  { OPT_LEVELS_2_PLUS, OPT_free, NULL, 1 },
   { OPT_LEVELS_NONE, 0, NULL, 0 }
 };
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index a32dabf0405..16aa92b5e86 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -12639,8 +12639,8 @@ Attempt to remove redundant extension instructions.  
This is especially
 helpful for the x86-64 architecture, which implicitly zero-extends in 64-bit
 registers after writing to their lower 32-bit half.
 
-Enabled for Alpha, AArch64, PowerPC, RISC-V, SPARC, h83000 and x86 at levels
-@option{-O2}, @option{-O3}, @option{-Os}.
+Enabled for Alpha, AArch64, LoongArch, PowerPC, RISC-V, SPARC, h83000 and x86 at
+levels @option{-O2}, @option{-O3}, @option{-Os}.
 
 @opindex fno-lifetime-dse
 @opindex flifetime-dse
diff --git a/gcc/testsuite/gcc.target/loongarch/sign-extend.c 
b/gcc/testsuite/gcc.target/loongarch/sign-extend.c
new file mode 100644
index 000..3f339d06bbd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/sign-extend.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options "-mabi=lp64d -O2" } */
+/* { dg-final { scan-assembler-times "slli.w" 1 } } */
+
+extern int PL_savestack_ix;
+extern int PL_regsize;
+extern int PL_savestack_max;
+void Perl_savestack_grow_cnt (int need);
+extern void Perl_croak (char *);
+
+int
+S_regcppush(int parenfloor)
+{
+  int retval = PL_savestack_ix;
+  int paren_elems_to_push = (PL_regsize - parenfloor) * 4;
+  int p;
+
+  if (paren_elems_to_push < 0)
+Perl_croak ("panic: paren_elems_to_push < 0");
+
+  if (PL_savestack_ix + (paren_elems_to_push + 6) > PL_savestack_max)
+Perl_savestack_grow_cnt (paren_elems_to_push + 6);
+
+  return retval;
+}
-- 
2.31.1



[PATCH] RISC-V: Enable vec_init testsuite for RVV VLA vectorization

2023-08-27 Thread Juzhe-Zhong
Hi, this patch enables vec_init for RVV VLA vectorization, since we now
support almost all RVV-related features in the middle-end loop vectorizer.

Test report:
FAIL: gcc.dg/vect/bb-slp-10.c -flto -ffat-lto-objects  scan-tree-dump slp2 
"unsupported unaligned access"
FAIL: gcc.dg/vect/bb-slp-10.c scan-tree-dump slp2 "unsupported unaligned access"
FAIL: gcc.dg/vect/bb-slp-70.c (test for excess errors)
FAIL: gcc.dg/vect/bb-slp-70.c -flto -ffat-lto-objects (test for excess errors)
FAIL: gcc.dg/vect/bb-slp-layout-17.c (test for excess errors)
FAIL: gcc.dg/vect/bb-slp-layout-17.c -flto -ffat-lto-objects (test for excess 
errors)
XPASS: gcc.dg/vect/no-scevccp-outer-12.c scan-tree-dump-times vect "OUTER LOOP 
VECTORIZED." 1
XPASS: gcc.dg/vect/no-scevccp-outer-16.c scan-tree-dump-times vect "OUTER LOOP 
VECTORIZED." 1
XPASS: gcc.dg/vect/no-scevccp-outer-17.c scan-tree-dump-times vect "OUTER LOOP 
VECTORIZED." 1
XPASS: gcc.dg/vect/no-scevccp-outer-19.c scan-tree-dump-times vect "OUTER LOOP 
VECTORIZED." 1
FAIL: gcc.dg/vect/no-scevccp-outer-2.c (internal compiler error: in 
anticipatable_occurrence_p, at config/riscv/riscv-vsetvl.cc:314)
FAIL: gcc.dg/vect/no-scevccp-outer-2.c (test for excess errors)
XPASS: gcc.dg/vect/no-scevccp-outer-21.c scan-tree-dump-times vect "OUTER LOOP 
VECTORIZED." 1
FAIL: gcc.dg/vect/no-scevccp-outer-7.c scan-tree-dump-times vect 
"vect_recog_widen_mult_pattern: detected" 1
XPASS: gcc.dg/vect/no-scevccp-outer-8.c scan-tree-dump-times vect "OUTER LOOP 
VECTORIZED." 1
FAIL: gcc.dg/vect/no-section-anchors-vect-31.c scan-tree-dump-times vect 
"Alignment of access forced using peeling" 2
FAIL: gcc.dg/vect/no-section-anchors-vect-64.c scan-tree-dump-times vect 
"Alignment of access forced using peeling" 2
FAIL: gcc.dg/vect/no-section-anchors-vect-69.c scan-tree-dump-times vect 
"vectorized 3 loops" 1
FAIL: gcc.dg/vect/no-vfa-vect-101.c scan-tree-dump-times vect "can't determine 
dependence" 1
FAIL: gcc.dg/vect/no-vfa-vect-102.c scan-tree-dump-times vect "possible 
dependence between data-refs" 1
FAIL: gcc.dg/vect/no-vfa-vect-102a.c scan-tree-dump-times vect "possible 
dependence between data-refs" 1
FAIL: gcc.dg/vect/no-vfa-vect-37.c scan-tree-dump-times vect "can't determine 
dependence" 2
FAIL: gcc.dg/vect/pr109025.c (internal compiler error: in 
anticipatable_occurrence_p, at config/riscv/riscv-vsetvl.cc:314)
FAIL: gcc.dg/vect/pr109025.c (test for excess errors)
FAIL: gcc.dg/vect/pr109025.c -flto -ffat-lto-objects (internal compiler error: 
in anticipatable_occurrence_p, at config/riscv/riscv-vsetvl.cc:314)
FAIL: gcc.dg/vect/pr109025.c -flto -ffat-lto-objects (test for excess errors)
FAIL: gcc.dg/vect/pr42604.c (internal compiler error: in 
anticipatable_occurrence_p, at config/riscv/riscv-vsetvl.cc:314)
FAIL: gcc.dg/vect/pr42604.c (test for excess errors)
FAIL: gcc.dg/vect/pr42604.c -flto -ffat-lto-objects (internal compiler error: 
in anticipatable_occurrence_p, at config/riscv/riscv-vsetvl.cc:314)
FAIL: gcc.dg/vect/pr42604.c -flto -ffat-lto-objects (test for excess errors)
FAIL: gcc.dg/vect/pr57705.c -flto -ffat-lto-objects  scan-tree-dump-times vect 
"vectorized 1 loop" 2
FAIL: gcc.dg/vect/pr57705.c scan-tree-dump-times vect "vectorized 1 loop" 2
FAIL: gcc.dg/vect/pr63341-1.c -flto -ffat-lto-objects execution test
FAIL: gcc.dg/vect/pr63341-1.c execution test
FAIL: gcc.dg/vect/pr63341-2.c -flto -ffat-lto-objects execution test
FAIL: gcc.dg/vect/pr63341-2.c execution test
FAIL: gcc.dg/vect/pr65310.c -flto -ffat-lto-objects  scan-tree-dump vect "can't 
force alignment"
FAIL: gcc.dg/vect/pr65310.c -flto -ffat-lto-objects  scan-tree-dump-not vect 
"misalign = 0"
FAIL: gcc.dg/vect/pr65310.c scan-tree-dump vect "can't force alignment"
FAIL: gcc.dg/vect/pr65310.c scan-tree-dump-not vect "misalign = 0"
FAIL: gcc.dg/vect/pr65518.c -flto -ffat-lto-objects  scan-tree-dump-times vect 
"vectorized 0 loops in function" 2
FAIL: gcc.dg/vect/pr65518.c scan-tree-dump-times vect "vectorized 0 loops in 
function" 2
FAIL: gcc.dg/vect/pr68445.c -flto -ffat-lto-objects  scan-tree-dump vect 
"vectorizing stmts using SLP"
FAIL: gcc.dg/vect/pr68445.c scan-tree-dump vect "vectorizing stmts using SLP"
FAIL: gcc.dg/vect/pr88598-1.c -flto -ffat-lto-objects  scan-tree-dump-not 
optimized "REDUC_PLUS"
FAIL: gcc.dg/vect/pr88598-1.c scan-tree-dump-not optimized "REDUC_PLUS"
FAIL: gcc.dg/vect/pr88598-2.c -flto -ffat-lto-objects  scan-tree-dump-not 
optimized "REDUC_PLUS"
FAIL: gcc.dg/vect/pr88598-2.c scan-tree-dump-not optimized "REDUC_PLUS"
FAIL: gcc.dg/vect/pr88598-3.c -flto -ffat-lto-objects  scan-tree-dump-not 
optimized "REDUC_PLUS"
FAIL: gcc.dg/vect/pr88598-3.c scan-tree-dump-not optimized "REDUC_PLUS"
FAIL: gcc.dg/vect/pr94994.c -flto -ffat-lto-objects execution test
FAIL: gcc.dg/vect/pr94994.c execution test
FAIL: gcc.dg/vect/pr97835.c -flto -ffat-lto-objects  scan-tree-dump vect 
"vectorizing stmts using SLP"
FAIL: gcc.dg/vect/pr97835.c scan-tree-dump vect "vectorizing stmts using SLP"
FAIL: 

[PATCH] rs6000: mark tieable between INT and FLOAT

2023-08-27 Thread Jiufu Guo via Gcc-patches
Hi,

For PowerPC, some INT modes and FLOAT modes can be marked as tieable,
for example: DI<->DF.
Note that SFmode is special: it is only tieable with itself.

I have reworked the previous patch to be more reasonable:
https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609504.html

Bootstrap and regtest pass on ppc64{,le}.
Is this ok for trunk?

BR,
Jeff (Jiufu)

gcc/ChangeLog:

* config/rs6000/rs6000.cc (rs6000_modes_tieable_p): Mark more tieable
modes.

gcc/testsuite/ChangeLog:

* g++.target/powerpc/pr102024.C: Updated.

---
 gcc/config/rs6000/rs6000.cc | 9 +
 gcc/testsuite/g++.target/powerpc/pr102024.C | 3 ++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 6ac3adcec6b..3cb0186089e 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -1968,6 +1968,15 @@ rs6000_modes_tieable_p (machine_mode mode1, machine_mode mode2)
   if (ALTIVEC_OR_VSX_VECTOR_MODE (mode2))
 return false;
 
+  /* SFmode format (IEEE DP) in register would not as required,
+ So SFmode is restrict here.  */
+  if (GET_MODE_CLASS (mode1) == MODE_FLOAT
+  && GET_MODE_CLASS (mode2) == MODE_INT)
+return GET_MODE_SIZE (mode1) == UNITS_PER_FP_WORD;
+  if (GET_MODE_CLASS (mode1) == MODE_INT
+  && GET_MODE_CLASS (mode2) == MODE_FLOAT)
+return GET_MODE_SIZE (mode2) == UNITS_PER_FP_WORD;
+
   if (SCALAR_FLOAT_MODE_P (mode1))
 return SCALAR_FLOAT_MODE_P (mode2);
   if (SCALAR_FLOAT_MODE_P (mode2))
diff --git a/gcc/testsuite/g++.target/powerpc/pr102024.C 
b/gcc/testsuite/g++.target/powerpc/pr102024.C
index 769585052b5..27d2dc5e80b 100644
--- a/gcc/testsuite/g++.target/powerpc/pr102024.C
+++ b/gcc/testsuite/g++.target/powerpc/pr102024.C
@@ -5,7 +5,8 @@
 // Test that a zero-width bit field in an otherwise homogeneous aggregate
 // generates a psabi warning and passes arguments in GPRs.
 
-// { dg-final { scan-assembler-times {\mstd\M} 4 } }
+// { dg-final { scan-assembler-times {\mmtvsrd\M} 4 { target has_arch_pwr8 } } }
+// { dg-final { scan-assembler-times {\mstd\M} 4 { target { ! has_arch_pwr8 } } } }
 
 struct a_thing
 {
-- 
2.17.1



Re: [pushed][PATCH v2] LoongArch: Remove redundant sign extension instructions caused by SLT instructions.

2023-08-27 Thread chenglulu

Pushed to r14-3511.

On 2023/8/25 at 5:31 PM, Lulu Cheng wrote:

v1 -> v2:
1. Modify description information


Since the SLT instruction does not distinguish between 64-bit and 32-bit
operations on the 64-bit LoongArch architecture, if an operand of slt is
SImode, the sign extension of that operand must be made explicit.

But in cases like the test case below, the sign extension is redundant:

extern int src1, src2, src3;

int
test (void)
{
  int data1 = src1 + src2;
  int data2 = src1 + src3;
  return data1 > data2 ? data1 : data2;
}
Assembly code before optimization:
...
add.w   $r4,$r4,$r14
add.w   $r13,$r13,$r14
slli.w  $r12,$r4,0
slli.w  $r14,$r13,0
slt $r12,$r12,$r14
masknez $r4,$r4,$r12
maskeqz $r12,$r13,$r12
or  $r4,$r4,$r12
slli.w  $r4,$r4,0
...

After optimization:
...
add.w   $r12,$r12,$r14
add.w   $r13,$r13,$r14
slt $r4,$r12,$r13
masknez $r12,$r12,$r4
maskeqz $r4,$r13,$r4
or  $r4,$r12,$r4
...

As in this test case, the two operands of SLT are produced by addition,
and add.w implicitly sign-extends its result, so neither operand of SLT
needs an explicit sign extension.

gcc/ChangeLog:

* config/loongarch/loongarch.cc (loongarch_expand_conditional_move):
Optimize the function implementation.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/slt-sign-extend.c: New test.
---
  gcc/config/loongarch/loongarch.cc | 53 +--
  .../gcc.target/loongarch/slt-sign-extend.c| 14 +
  2 files changed, 63 insertions(+), 4 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/loongarch/slt-sign-extend.c

diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index 86d58784113..1905599b9e8 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -4384,14 +4384,30 @@ loongarch_expand_conditional_move (rtx *operands)
enum rtx_code code = GET_CODE (operands[1]);
rtx op0 = XEXP (operands[1], 0);
rtx op1 = XEXP (operands[1], 1);
+  rtx op0_extend = op0;
+  rtx op1_extend = op1;
+
+  /* Record whether operands[2] and operands[3] modes are promoted to word_mode.  */
+  bool promote_p = false;
+  machine_mode mode = GET_MODE (operands[0]);
  
if (FLOAT_MODE_P (GET_MODE (op1)))

  loongarch_emit_float_compare (&code, &op0, &op1);
else
  {
+  if ((REGNO (op0) == REGNO (operands[2])
+  || (REGNO (op1) == REGNO (operands[3]) && (op1 != const0_rtx)))
+ && (GET_MODE_SIZE (GET_MODE (op0)) < word_mode))
+   {
+ mode = word_mode;
+ promote_p = true;
+   }
+
loongarch_extend_comparands (code, &op0, &op1);
  
op0 = force_reg (word_mode, op0);

+  op0_extend = op0;
+  op1_extend = force_reg (word_mode, op1);
  
if (code == EQ || code == NE)

{
@@ -4418,23 +4434,52 @@ loongarch_expand_conditional_move (rtx *operands)
&& register_operand (operands[2], VOIDmode)
&& register_operand (operands[3], VOIDmode))
  {
-  machine_mode mode = GET_MODE (operands[0]);
+  rtx op2 = operands[2];
+  rtx op3 = operands[3];
+
+  if (promote_p)
+   {
+ if (REGNO (XEXP (operands[1], 0)) == REGNO (operands[2]))
+   op2 = op0_extend;
+ else
+   {
+ loongarch_extend_comparands (code, &op2, &const0_rtx);
+ op2 = force_reg (mode, op2);
+   }
+
+ if (REGNO (XEXP (operands[1], 1)) == REGNO (operands[3]))
+   op3 = op1_extend;
+ else
+   {
+ loongarch_extend_comparands (code, &op3, &const0_rtx);
+ op3 = force_reg (mode, op3);
+   }
+   }
+
rtx temp = gen_reg_rtx (mode);
rtx temp2 = gen_reg_rtx (mode);
  
emit_insn (gen_rtx_SET (temp,

  gen_rtx_IF_THEN_ELSE (mode, cond,
-   operands[2], const0_rtx)));
+   op2, const0_rtx)));
  
/* Flip the test for the second operand.  */

cond = gen_rtx_fmt_ee ((code == EQ) ? NE : EQ, GET_MODE (op0), op0, op1);
  
emit_insn (gen_rtx_SET (temp2,

  gen_rtx_IF_THEN_ELSE (mode, cond,
-   operands[3], const0_rtx)));
+   op3, const0_rtx)));
  
/* Merge the two results, at least one is guaranteed to be zero.  */

-  emit_insn (gen_rtx_SET (operands[0], gen_rtx_IOR (mode, temp, temp2)));
+  if (promote_p)
+   {
+ rtx temp3 = gen_reg_rtx (mode);
+ emit_insn (gen_rtx_SET (temp3, gen_rtx_IOR (mode, temp, temp2)));
+ temp3 = gen_lowpart (GET_MODE (operands[0]), temp3);
+ 

Re: Re: [PATCH V2] RISC-V: Insert vsetivli zero, 0 for vmv.x.s/vfmv.f.s instructions satisfying REG_P(operand[1]) in -O0.

2023-08-27 Thread juzhe.zh...@rivai.ai
Thanks for taking care of this issue.
Ok to backport GCC-13.



juzhe.zh...@rivai.ai
 
From: Li Xu
Date: 2023-08-28 10:33
To: xuli1; gcc-patches
CC: kito.cheng; palmer; juzhe.zhong
Subject: Re: [PATCH V2] RISC-V: Insert vsetivli zero, 0 for vmv.x.s/vfmv.f.s 
instructions satisfying REG_P(operand[1]) in -O0.
This patch should be backported to releases/gcc-13 to address 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111076
 
--
Li Xu
>This issue happens because operand 1 of a scalar move can be
>REG_P (operand[1]) in the -O0 case, which causes the VSETVL pass to
>not insert the vsetvl instruction correctly, and the compiler crashes.
>
>Consider the following case:
>int16_t foo1 (void *base, size_t vl)
>{
>int16_t maxVal = __riscv_vmv_x_s_i16m1_i16 (__riscv_vle16_v_i16m1 (base, 
> vl));
>return maxVal;
>}
>
>Before this patch:
>bug.c:15:1: internal compiler error: Segmentation fault
>   15 | }
>  | ^
>0x145d723 crash_signal
>../.././riscv-gcc/gcc/toplev.cc:314
>0x22929dd const_csr_operand(rtx_def*, machine_mode)
>../.././riscv-gcc/gcc/config/riscv/predicates.md:44
>0x2292a21 csr_operand(rtx_def*, machine_mode)
>../.././riscv-gcc/gcc/config/riscv/predicates.md:46
>0x23dfbb0 recog_356
>../.././riscv-gcc/gcc/config/riscv/iterators.md:72
>0x23efecd recog(rtx_def*, rtx_insn*, int*)
>../.././riscv-gcc/gcc/config/riscv/iterators.md:89
>0xdddc15 recog_memoized(rtx_insn*)
>../.././riscv-gcc/gcc/recog.h:273
>
>After this patch:
> vsetivli zero,0,e16,m1,ta,ma
> vmv.x.s a5,v1
>
>gcc/ChangeLog:
>
>* config/riscv/riscv-vsetvl.cc (gen_vsetvl_pat): For vfmv.f.s/vmv.x.s 
> instruction replace null avl with (const_int 0).
>
>gcc/testsuite/ChangeLog:
>
>* gcc.target/riscv/rvv/base/scalar_move-10.c: New test.
>* gcc.target/riscv/rvv/base/scalar_move-11.c: New test.
>---
> gcc/config/riscv/riscv-vsetvl.cc  |  5 +++
> .../riscv/rvv/base/scalar_move-10.c   | 31 +++
> .../riscv/rvv/base/scalar_move-11.c   | 20 
> 3 files changed, 56 insertions(+)
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-10.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-11.c
>
>diff --git a/gcc/config/riscv/riscv-vsetvl.cc 
>b/gcc/config/riscv/riscv-vsetvl.cc
>index d4d6f336ef9..14ebae1f3f6 100644
>--- a/gcc/config/riscv/riscv-vsetvl.cc
>+++ b/gcc/config/riscv/riscv-vsetvl.cc
>@@ -618,6 +618,11 @@ static rtx
> gen_vsetvl_pat (enum vsetvl_type insn_type, const vl_vtype_info &info, rtx vl)
> {
>   rtx avl = info.get_avl ();
>+  /* if optimization == 0 and the instruction is vmv.x.s/vfmv.f.s,
>+ set the value of avl to (const_int 0) so that VSETVL PASS will
>+ insert vsetvl correctly.*/
>+  if (info.has_avl_no_reg ())
>+avl = GEN_INT (0);
>   rtx sew = gen_int_mode (info.get_sew (), Pmode);
>   rtx vlmul = gen_int_mode (info.get_vlmul (), Pmode);
>   rtx ta = gen_int_mode (info.get_ta (), Pmode);
>diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-10.c 
>b/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-10.c
>new file mode 100644
>index 000..9760d77fb22
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-10.c
>@@ -0,0 +1,31 @@
>+/* { dg-do compile } */
>+/* { dg-options "-march=rv64gcv -mabi=lp64d -O0" } */
>+/* { dg-final { check-function-bodies "**" "" } } */
>+
>+#include "riscv_vector.h"
>+
>+/*
>+** foo1:
>+** ...
>+** vsetivli\tzero,0,e16,m1,t[au],m[au]
>+** vmv.x.s\t[a-x0-9]+,v[0-9]+
>+** ...
>+*/
>+int16_t foo1 (void *base, size_t vl)
>+{
>+int16_t maxVal = __riscv_vmv_x_s_i16m1_i16 (__riscv_vle16_v_i16m1 (base, 
>vl));
>+return maxVal;
>+}
>+
>+/*
>+** foo2:
>+** ...
>+** vsetivli\tzero,0,e32,m1,t[au],m[au]
>+** vfmv.f.s\tf[a-x0-9]+,v[0-9]+
>+** ...
>+*/
>+float foo2 (void *base, size_t vl)
>+{
>+float maxVal = __riscv_vfmv_f_s_f32m1_f32 (__riscv_vle32_v_f32m1 (base, 
>vl));
>+return maxVal;
>+}
>diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-11.c 
>b/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-11.c
>new file mode 100644
>index 000..8036acd0a52
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-11.c
>@@ -0,0 +1,20 @@
>+/* { dg-do compile } */
>+/* { dg-options "-march=rv32gcv -mabi=ilp32d -O0" } */
>+/* { dg-final { check-function-bodies "**" "" } } */
>+
>+#include "riscv_vector.h"
>+
>+/*
>+** foo:
>+** ...
>+** vsetivli\tzero,0,e64,m4,t[au],m[au]
>+** vmv.x.s\t[a-x0-9]+,v[0-9]+
>+** vsetivli\tzero,0,e64,m4,t[au],m[au]
>+** vmv.x.s\t[a-x0-9]+,v[0-9]+
>+** ...
>+*/
>+int16_t foo (void *base, size_t vl)
>+{
>+int16_t maxVal = __riscv_vmv_x_s_i64m4_i64 (__riscv_vle64_v_i64m4 (base, 
>vl));
>+return maxVal;
>+}
>--
>2.17.1


Re: [PATCH V2] RISC-V: Insert vsetivli zero, 0 for vmv.x.s/vfmv.f.s instructions satisfying REG_P(operand[1]) in -O0.

2023-08-27 Thread Li Xu
This patch should be backported to releases/gcc-13 to address 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111076

--
Li Xu
>This issue happens because operand 1 of a scalar move can be
>REG_P (operand[1]) in the -O0 case, which causes the VSETVL pass to
>not insert the vsetvl instruction correctly, and the compiler crashes.
>
>Consider the following case:
>int16_t foo1 (void *base, size_t vl)
>{
>    int16_t maxVal = __riscv_vmv_x_s_i16m1_i16 (__riscv_vle16_v_i16m1 (base, 
>vl));
>    return maxVal;
>}
>
>Before this patch:
>bug.c:15:1: internal compiler error: Segmentation fault
>   15 | }
>  | ^
>0x145d723 crash_signal
>    ../.././riscv-gcc/gcc/toplev.cc:314
>0x22929dd const_csr_operand(rtx_def*, machine_mode)
>    ../.././riscv-gcc/gcc/config/riscv/predicates.md:44
>0x2292a21 csr_operand(rtx_def*, machine_mode)
>    ../.././riscv-gcc/gcc/config/riscv/predicates.md:46
>0x23dfbb0 recog_356
>    ../.././riscv-gcc/gcc/config/riscv/iterators.md:72
>0x23efecd recog(rtx_def*, rtx_insn*, int*)
>    ../.././riscv-gcc/gcc/config/riscv/iterators.md:89
>0xdddc15 recog_memoized(rtx_insn*)
>    ../.././riscv-gcc/gcc/recog.h:273
>
>After this patch:
>   vsetivli zero,0,e16,m1,ta,ma
>   vmv.x.s a5,v1
>
>gcc/ChangeLog:
>
>    * config/riscv/riscv-vsetvl.cc (gen_vsetvl_pat): For vfmv.f.s/vmv.x.s 
>instruction replace null avl with (const_int 0).
>
>gcc/testsuite/ChangeLog:
>
>    * gcc.target/riscv/rvv/base/scalar_move-10.c: New test.
>    * gcc.target/riscv/rvv/base/scalar_move-11.c: New test.
>---
> gcc/config/riscv/riscv-vsetvl.cc  |  5 +++
> .../riscv/rvv/base/scalar_move-10.c   | 31 +++
> .../riscv/rvv/base/scalar_move-11.c   | 20 
> 3 files changed, 56 insertions(+)
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-10.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-11.c
>
>diff --git a/gcc/config/riscv/riscv-vsetvl.cc 
>b/gcc/config/riscv/riscv-vsetvl.cc
>index d4d6f336ef9..14ebae1f3f6 100644
>--- a/gcc/config/riscv/riscv-vsetvl.cc
>+++ b/gcc/config/riscv/riscv-vsetvl.cc
>@@ -618,6 +618,11 @@ static rtx
> gen_vsetvl_pat (enum vsetvl_type insn_type, const vl_vtype_info &info, rtx vl)
> {
>   rtx avl = info.get_avl ();
>+  /* if optimization == 0 and the instruction is vmv.x.s/vfmv.f.s,
>+ set the value of avl to (const_int 0) so that VSETVL PASS will
>+ insert vsetvl correctly.*/
>+  if (info.has_avl_no_reg ())
>+    avl = GEN_INT (0);
>   rtx sew = gen_int_mode (info.get_sew (), Pmode);
>   rtx vlmul = gen_int_mode (info.get_vlmul (), Pmode);
>   rtx ta = gen_int_mode (info.get_ta (), Pmode);
>diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-10.c 
>b/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-10.c
>new file mode 100644
>index 000..9760d77fb22
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-10.c
>@@ -0,0 +1,31 @@
>+/* { dg-do compile } */
>+/* { dg-options "-march=rv64gcv -mabi=lp64d -O0" } */
>+/* { dg-final { check-function-bodies "**" "" } } */
>+
>+#include "riscv_vector.h"
>+
>+/*
>+** foo1:
>+** ...
>+** vsetivli\tzero,0,e16,m1,t[au],m[au]
>+** vmv.x.s\t[a-x0-9]+,v[0-9]+
>+** ...
>+*/
>+int16_t foo1 (void *base, size_t vl)
>+{
>+    int16_t maxVal = __riscv_vmv_x_s_i16m1_i16 (__riscv_vle16_v_i16m1 (base, 
>vl));
>+    return maxVal;
>+}
>+
>+/*
>+** foo2:
>+** ...
>+** vsetivli\tzero,0,e32,m1,t[au],m[au]
>+** vfmv.f.s\tf[a-x0-9]+,v[0-9]+
>+** ...
>+*/
>+float foo2 (void *base, size_t vl)
>+{
>+    float maxVal = __riscv_vfmv_f_s_f32m1_f32 (__riscv_vle32_v_f32m1 (base, 
>vl));
>+    return maxVal;
>+}
>diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-11.c 
>b/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-11.c
>new file mode 100644
>index 000..8036acd0a52
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-11.c
>@@ -0,0 +1,20 @@
>+/* { dg-do compile } */
>+/* { dg-options "-march=rv32gcv -mabi=ilp32d -O0" } */
>+/* { dg-final { check-function-bodies "**" "" } } */
>+
>+#include "riscv_vector.h"
>+
>+/*
>+** foo:
>+** ...
>+** vsetivli\tzero,0,e64,m4,t[au],m[au]
>+** vmv.x.s\t[a-x0-9]+,v[0-9]+
>+** vsetivli\tzero,0,e64,m4,t[au],m[au]
>+** vmv.x.s\t[a-x0-9]+,v[0-9]+
>+** ...
>+*/
>+int16_t foo (void *base, size_t vl)
>+{
>+    int16_t maxVal = __riscv_vmv_x_s_i64m4_i64 (__riscv_vle64_v_i64m4 (base, 
>vl));
>+    return maxVal;
>+}
>--
>2.17.1

Re: Re: [PATCH] RISC-V: Add initial pipeline description for an out-of-order core.

2023-08-27 Thread juzhe.zh...@rivai.ai
Ok.

I am not familiar with scheduling stuff but I hope you can fix those 2 issues.

I have no objection to this patch, and I would prefer that Jeff or Kito make
the decision on it.

Thanks.


juzhe.zh...@rivai.ai
 
From: Robin Dapp
Date: 2023-08-23 22:56
To: 钟居哲; gcc-patches; palmer; kito.cheng; Jeff Law
CC: rdapp.gcc
Subject: Re: [PATCH] RISC-V: Add initial pipeline description for an 
out-of-order core.
> Does this patch fix these 2 following PR:
> 108271 – Missed RVV cost model (gnu.org) 
> 
> 108412 – RISC-V: Negative optimization of GCSE && LOOP INVARIANTS (gnu.org) 
> 
> 
> If yes, plz append these 2 cases into testsuite and indicate those 2 PR are 
> fixed.
> So that we can close them.
 
The second one is fixed on my local branch, the first not yet because there
is more to it still.  The second one is more due to pressure-aware scheduling
and I'm going to add it to the commit as well as the PR to the commit once this
is verified.
 
Regards
Robin
 


Re: [PATCH V2] RISC-V: Fix error combine of pred_mov pattern

2023-08-27 Thread Lehua Ding

Hi, Jeff,

Pinging this patch, since 18 days have passed. Are there any remaining
problems with it after the last discussion? This is a bug fix that affects
correctness, so I hope you can take another look. Thank you very much.


The main open question seems to be why I added a force_reg, so I've
copied my answer from the V1 thread below.


>> As the above says, the code addresses the problem which produced
>> after addressing the combine problem.
> But combine doesn't run at -O0.  So something is inconsistent.  I
> certainly believe we need to avoid the mem->mem case, but that's
> independent of combine and affects all optimization levels.

I think it's the comment written here that is the problem. I plan to 
change it to this:

  /* Since there is no intrinsic where target is a mem operand, it
 should be converted to reg if it is a mem operand.  */

Best,
Lehua

On 2023/8/10 20:21, Lehua Ding wrote:

Hi,

This patch fixes PR110943, in which incorrect code is generated because
combine wrongly merges some pred_mov patterns. Consider this code:

```

void foo9 (void *base, void *out, size_t vl)
{
 int64_t scalar = *(int64_t*)(base + 100);
 vint64m2_t v = __riscv_vmv_v_x_i64m2 (0, 1);
 *(vint64m2_t*)out = v;
}
```

RTL before combine pass:

```
(insn 11 10 12 2 (set (reg/v:RVVM2DI 134 [ v ])
 (if_then_else:RVVM2DI (unspec:RVVMF32BI [
 (const_vector:RVVMF32BI repeat [
 (const_int 1 [0x1])
 ])
 (const_int 1 [0x1])
 (const_int 2 [0x2]) repeated x2
 (const_int 0 [0])
 (reg:SI 66 vl)
 (reg:SI 67 vtype)
 ] UNSPEC_VPREDICATE)
 (const_vector:RVVM2DI repeat [
 (const_int 0 [0])
 ])
 (unspec:RVVM2DI [
 (reg:SI 0 zero)
 ] UNSPEC_VUNDEF))) "/app/example.c":6:20 1089 {pred_movrvvm2di})
(insn 14 13 0 2 (set (mem:RVVM2DI (reg/v/f:DI 136 [ out ]) [1 MEM[(vint64m2_t *)out_4(D)]+0 S[32, 32] A128])
 (reg/v:RVVM2DI 134 [ v ])) "/app/example.c":7:23 717 {*movrvvm2di_whole})
```

RTL after combine pass:
```
(insn 14 13 0 2 (set (mem:RVVM2DI (reg:DI 138) [1 MEM[(vint64m2_t *)out_4(D)]+0 S[32, 32] A128])
 (if_then_else:RVVM2DI (unspec:RVVMF32BI [
 (const_vector:RVVMF32BI repeat [
 (const_int 1 [0x1])
 ])
 (const_int 1 [0x1])
 (const_int 2 [0x2]) repeated x2
 (const_int 0 [0])
 (reg:SI 66 vl)
 (reg:SI 67 vtype)
 ] UNSPEC_VPREDICATE)
 (const_vector:RVVM2DI repeat [
 (const_int 0 [0])
 ])
 (unspec:RVVM2DI [
 (reg:SI 0 zero)
 ] UNSPEC_VUNDEF))) "/app/example.c":7:23 1089 {pred_movrvvm2di})
```

This combination changes the semantics of insn 14. I have tightened the
condition of the @pred_mov pattern to be more restrictive. Is it OK for trunk?

Best,
Lehua

PR target/110943

gcc/ChangeLog:

* config/riscv/predicates.md (vector_const_int_or_double_0_operand):
  New.
* config/riscv/riscv-vector-builtins.cc 
(function_expander::function_expander):
  force_reg mem operand.
* config/riscv/vector.md (@pred_mov): Wrapper.
(*pred_mov): Remove imm -> reg pattern.
(*pred_broadcast_imm): Add imm -> reg pattern.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c: Update.
* gcc.target/riscv/rvv/base/pr110943.c: New test.

---
  gcc/config/riscv/predicates.md|  5 +
  gcc/config/riscv/riscv-vector-builtins.cc |  8 +-
  gcc/config/riscv/vector.md| 97 +++
  .../gcc.target/riscv/rvv/base/pr110943.c  | 33 +++
  .../riscv/rvv/base/zvfhmin-intrinsic.c| 10 +-
  5 files changed, 104 insertions(+), 49 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr110943.c

diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index 9db28c2def7..f2e406c718a 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -295,6 +295,11 @@
(ior (match_operand 0 "register_operand")
 (match_operand 0 "const_int_operand")))
  
+(define_predicate "vector_const_int_or_double_0_operand"

+  (and (match_code "const_vector")
+   (match_test "satisfies_constraint_vi (op)
+|| satisfies_constraint_Wc0 (op)")))
+
  (define_predicate "vector_move_operand"
(ior (match_operand 0 "nonimmediate_operand")
 (and (match_code "const_vector")
diff --git a/gcc/config/riscv/riscv-vector-builtins.cc 
b/gcc/config/riscv/riscv-vector-builtins.cc
index abab06c00ed..2da542585a8 100644
--- 

Re: [PATCH ver 3] rs6000, add overloaded DFP quantize support

2023-08-27 Thread Kewen.Lin via Gcc-patches
Hi Carl,

on 2023/8/25 03:53, Carl Love wrote:
> GCC maintainers:
> 
> Version 3, fixed the built-in instance names.  Missed removing the "n"
> the name.  Added the tighter constraints on the predicates for the
> define_insn.  Updated the wording for the built-ins in the
> documentation file.  Changed the test file name again.  Updated the
> ChangeLog file, added the PR target line.  Retested the patch on Power
> 10LE and Power 8 and Power 9.
> 
> Version 2, renamed the built-in instances.  Changed the name of the
> overloaded built-in.  Added the missing documentation for the new
> built-ins.  Fixed typos.  Changed name of the test.  Updated the
> effective target for the test.  Retested the patch on Power 10LE and
> Power 8 and Power 9.
> 
> The following patch adds four built-ins for the decimal floating point
> (DFP) quantize instructions on rs6000.  The built-ins are for 64-bit
> and 128-bit DFP operands.
> 
> The patch also adds a test case for the new builtins.
> 
> The Patch has been tested on Power 10LE and Power 9 LE/BE.
> 
> Please let me know if the patch is acceptable for mainline.  Thanks.
> 
>  Carl Love
> 
> 
> ---
> rs6000, add overloaded DFP quantize support
> 
> Add decimal floating point (DFP) quantize built-ins for both 64-bit DFP
> and 128-DFP operands.  In each case, there is an immediate version and a
> variable version of the built-in.  The RM value is a 2-bit constant int
> which specifies the rounding mode to use.  For the immediate versions of
> the built-in, the TE field is a 5-bit constant that specifies the value of
> the ideal exponent for the result.  The built-in specifications are:
> 
>   __Decimal64 builtin_dfp_quantize (_Decimal64, _Decimal64,
>   const int RM)
>   __Decimal64 builtin_dfp_quantize (const int TE, _Decimal64,
>   const int RM)
>   __Decimal128 builtin_dfp_quantize (_Decimal128, _Decimal128,
>const int RM)
>   __Decimal128 builtin_dfp_quantize (const int TE, _Decimal128,
>const int RM)
> 
> A testcase is added for the new built-in definitions.
> 
> gcc/ChangeLog:
>   * config/rs6000/dfp.md: New UNSPEC_DQUAN.

Nit: (UNSPEC_DQUAN): New unspec.

>   (dfp_dqua_, dfp_dqua_i): New define_insn.
>   * config/rs6000/rs6000-builtins.def (__builtin_dfp_dqua,
>   __builtin_dfp_dquai, __builtin_dfp_dquaq, __builtin_dfp_dquaqi):
>   New buit-in definitions.
>   * config/rs6000/rs6000-overload.def (__builtin_dfp_quantize): New
>   overloaded definition.
>   * doc/extend.texi: Add documentation for __builtin_dfp_quantize.
> 
> gcc/testsuite/
>   * gcc.target/powerpc/pr93448.c: New test case.
> 
>   PR target/93448
> ---
>  gcc/config/rs6000/dfp.md   |  25 ++-
>  gcc/config/rs6000/rs6000-builtins.def  |  15 ++
>  gcc/config/rs6000/rs6000-overload.def  |  10 ++
>  gcc/doc/extend.texi|  17 ++
>  gcc/testsuite/gcc.target/powerpc/pr93448.c | 200 +
>  5 files changed, 266 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr93448.c
> 
> diff --git a/gcc/config/rs6000/dfp.md b/gcc/config/rs6000/dfp.md
> index 5ed8a73ac51..052dc0946d3 100644
> --- a/gcc/config/rs6000/dfp.md
> +++ b/gcc/config/rs6000/dfp.md
> @@ -271,7 +271,8 @@ (define_c_enum "unspec"
> UNSPEC_DIEX
> UNSPEC_DSCLI
> UNSPEC_DTSTSFI
> -   UNSPEC_DSCRI])
> +   UNSPEC_DSCRI
> +   UNSPEC_DQUAN])
>  
>  (define_code_iterator DFP_TEST [eq lt gt unordered])
>  
> @@ -395,3 +396,25 @@ (define_insn "dfp_dscri_"
>"dscri %0,%1,%2"
>[(set_attr "type" "dfp")
> (set_attr "size" "")])
> +
> +(define_insn "dfp_dqua_"
> +  [(set (match_operand:DDTD 0 "gpc_reg_operand" "=d")
> +(unspec:DDTD [(match_operand:DDTD 1 "gpc_reg_operand" "d")
> +   (match_operand:DDTD 2 "gpc_reg_operand" "d")
> +   (match_operand:SI 3 "const_0_to_3_operand" "n")]
> + UNSPEC_DQUAN))]
> +  "TARGET_DFP"
> +  "dqua %0,%1,%2,%3"
> +  [(set_attr "type" "dfp")
> +   (set_attr "size" "")])
> +
> +(define_insn "dfp_dqua_i"

Sorry for nitpicking, but what I suggested previously was "dfp_dquai_"
instead of "dfp_dqua_i"; "dquai" matches the corresponding mnemonic, so it
reads better, while i expands to "idd" and "itd", which look odd to me.
Do you agree "dquai" is better?  If yes, the changelog and the related
expanders need to be updated as well.

The others look good to me, thanks!

BR,
Kewen

> +  [(set (match_operand:DDTD 0 "gpc_reg_operand" "=d")
> +(unspec:DDTD [(match_operand:SI 1 "s5bit_cint_operand" "n")
> +   (match_operand:DDTD 2 "gpc_reg_operand" "d")
> +   (match_operand:SI 3 "const_0_to_3_operand" "n")]
> + UNSPEC_DQUAN))]
> +  "TARGET_DFP"
> +  "dquai %1,%0,%2,%3"
> +  [(set_attr "type" "dfp")
> 

Re: [PATCH] rs6000: Disable PCREL for unsupported targets [PR111045]

2023-08-27 Thread Kewen.Lin via Gcc-patches
on 2023/8/26 06:04, Peter Bergner wrote:
> On 8/25/23 6:20 AM, Kewen.Lin wrote:
>> Assuming the current PCREL_SUPPORTED_BY_OS unchanged, when
>> PCREL_SUPPORTED_BY_OS is true, all its required conditions are
>> satisfied, it should be safe.  while PCREL_SUPPORTED_BY_OS is
>> false, it means the given subtarget doesn't support it, or one
>> or more of conditions below don't hold:
>>
>>  - TARGET_POWER10 
>>  - TARGET_PREFIXED
>>  - ELFv2_ABI_CHECK
>>  - TARGET_CMODEL == CMODEL_MEDIUM
> 
> The pcrel instructions are 64-bit/prefix instructions, so I think
> for PCREL_SUPPORTED_BY_OS, we want to keep the TARGET_POWER10 and
> the TARGET_PREFIXED checks.  Then we should have the checks for
> the OS that can support pcrel, in this case, only ELFv2_ABI_CHECK
> for now.  I think we should remove the TARGET_CMODEL check, since
> that isn't strictly required, it's a current code requirement for
> ELFv2, but could change in the future.  In fact, Mike has talked
> about adding pcrel support for ELFv2 and -mcmodel=large, so I think
> that should move moved out of PCREL_SUPPORTED_BY_OS and into the
> option override checks.

Thanks for clarifying this!  Yes, I know the pcrel support requires
TARGET_PREFIXED (and its required TARGET_POWER10), but IMHO these
TARGET_PREFIXED and TARGET_POWER10 are common for any subtargets
which support or will support pcrel, they can be checked in common
code, instead of being a part of PCREL_SUPPORTED_BY_OS.

We already have the code to check pcrel's need for prefixed and
prefixed's need for Power10; we can just move these checks after the
PCREL_SUPPORTED_BY_OS check.

Assuming we only have ELFv2_ABI_CHECK in PCREL_SUPPORTED_BY_OS, we
can have either TARGET_PCREL or !TARGET_PCREL after that check.
For the latter, it's fine and doesn't need any further checks.  For the
former, if it's implicit, for !TARGET_PREFIXED we will clear it silently;
while if it's explicit, for !TARGET_PREFIXED we will emit an error.
The TARGET_PREFIXED check already takes Power10 into account, so that
requirement is covered as well.
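
The ordering described above can be sketched as a small decision function.
This is only a toy model under stated assumptions: PcrelState,
PcrelOutcome and check_pcrel_after_os_check are hypothetical names, not
rs6000 back-end code, and the state is taken as it would be after the
PCREL_SUPPORTED_BY_OS check has already run.

```cpp
// Hypothetical model of the option state after the OS/ABI check.
struct PcrelState {
  bool pcrel;           // pcrel is still enabled at this point
  bool pcrel_explicit;  // the user asked for pcrel explicitly
  bool prefixed;        // prefixed is enabled (assumed to imply Power10)
};

enum class PcrelOutcome { Off, On, Error };

// Decide what happens to pcrel once the prefixed requirement is checked.
PcrelOutcome check_pcrel_after_os_check (const PcrelState &s)
{
  if (!s.pcrel)
    return PcrelOutcome::Off;    // nothing to check
  if (s.prefixed)
    return PcrelOutcome::On;     // requirement satisfied
  // pcrel without prefixed: an explicit request is an error,
  // an implicit one is cleared silently.
  return s.pcrel_explicit ? PcrelOutcome::Error : PcrelOutcome::Off;
}
```

For example, an implicitly enabled pcrel without prefixed yields
PcrelOutcome::Off, while the same state with an explicit request yields
PcrelOutcome::Error.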

> 
> 
> 
[snip ...]
> 
> 
>> btw, I was also expecting that we don't implicitly set
>> OPTION_MASK_PCREL any more for Power10, that is to remove
>> OPTION_MASK_PCREL from OTHER_POWER10_MASKS.
> 
> I'm on the fence about this one and would like to hear from Segher
> and Mike on what they think.  In some respect, pcrel is a Power10
> hardware feature, so that would seem to make sense to set the flag,
> but yeah, we only have one OS that currently supports it, so maybe
> not setting it makes sense?  Like I said, I think I need Segher and
> Mike to chime in with their thoughts.

Yeah, looking forward to their opinions.  IMHO, with the currently proposed
change, pcrel doesn't look like a pure Power10 hardware feature; it also
relies heavily on ABIs, which is why I thought it would be good not to turn
it on by default for Power10.

BR,
Kewen


[PATCH] RISC-V: Fix VSETVL test failures

2023-08-27 Thread Juzhe-Zhong
Committed.

Fix failures:
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-10.c   -O2   
scan-assembler-times 
add\\ta[0-7],a[0-7],a[0-7]\\s+\\.L[0-9][0-9][0-9]\\:\\s+vle32\\.v\\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\\s*\\([a-x0-9]+\\)
 1
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-10.c   -O2   
scan-assembler-times 
add\\ta[0-7],a[0-7],a[0-7]\\s+\\.L[0-9][0-9]\\:\\s+vle16\\.v\\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\\s*\\([a-x0-9]+\\)
 2
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-10.c   -O2   
scan-assembler-times 
add\\ta[0-7],a[0-7],a[0-7]\\s+\\.L[0-9][0-9]\\:\\s+vle8\\.v\\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\\s*\\([a-x0-9]+\\)
 3
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-10.c   -O2 -flto 
-fno-use-linker-plugin -flto-partition=none   scan-assembler-times 
add\\ta[0-7],a[0-7],a[0-7]\\s+\\.L[0-9][0-9][0-9]\\:\\s+vle32\\.v\\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\\s*\\([a-x0-9]+\\)
 1
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-10.c   -O2 -flto 
-fno-use-linker-plugin -flto-partition=none   scan-assembler-times 
add\\ta[0-7],a[0-7],a[0-7]\\s+\\.L[0-9][0-9]\\:\\s+vle16\\.v\\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\\s*\\([a-x0-9]+\\)
 2
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-10.c   -O2 -flto 
-fno-use-linker-plugin -flto-partition=none   scan-assembler-times 
add\\ta[0-7],a[0-7],a[0-7]\\s+\\.L[0-9][0-9]\\:\\s+vle8\\.v\\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\\s*\\([a-x0-9]+\\)
 3
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-10.c   -O2 -flto 
-fuse-linker-plugin -fno-fat-lto-objects   scan-assembler-times 
add\\ta[0-7],a[0-7],a[0-7]\\s+\\.L[0-9][0-9][0-9]\\:\\s+vle32\\.v\\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\\s*\\([a-x0-9]+\\)
 1
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-10.c   -O2 -flto 
-fuse-linker-plugin -fno-fat-lto-objects   scan-assembler-times 
add\\ta[0-7],a[0-7],a[0-7]\\s+\\.L[0-9][0-9]\\:\\s+vle16\\.v\\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\\s*\\([a-x0-9]+\\)
 2
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-10.c   -O2 -flto 
-fuse-linker-plugin -fno-fat-lto-objects   scan-assembler-times 
add\\ta[0-7],a[0-7],a[0-7]\\s+\\.L[0-9][0-9]\\:\\s+vle8\\.v\\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\\s*\\([a-x0-9]+\\)
 3
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-11.c   -O2   
scan-assembler-times 
add\\ta[0-7],a[0-7],a[0-7]\\s+\\.L[0-9][0-9]\\:\\s+vle32\\.v\\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\\s*\\([a-x0-9]+\\)
 1
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-11.c   -O2 -flto 
-fno-use-linker-plugin -flto-partition=none   scan-assembler-times 
add\\ta[0-7],a[0-7],a[0-7]\\s+\\.L[0-9][0-9]\\:\\s+vle32\\.v\\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\\s*\\([a-x0-9]+\\)
 1
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-11.c   -O2 -flto 
-fuse-linker-plugin -fno-fat-lto-objects   scan-assembler-times 
add\\ta[0-7],a[0-7],a[0-7]\\s+\\.L[0-9][0-9]\\:\\s+vle32\\.v\\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\\s*\\([a-x0-9]+\\)
 1
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-12.c   -O2   
scan-assembler-times 
add\\ta[0-7],a[0-7],a[0-7]\\s+\\.L[0-9][0-9][0-9]\\:\\s+vlm\\.v\\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\\s*\\([a-x0-9]+\\)
 2
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-12.c   -O2   
scan-assembler-times 
add\\ta[0-7],a[0-7],a[0-7]\\s+\\.L[0-9][0-9]\\:\\s+vlm\\.v\\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\\s*\\([a-x0-9]+\\)
 5
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-12.c   -O2 -flto 
-fno-use-linker-plugin -flto-partition=none   scan-assembler-times 
add\\ta[0-7],a[0-7],a[0-7]\\s+\\.L[0-9][0-9][0-9]\\:\\s+vlm\\.v\\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\\s*\\([a-x0-9]+\\)
 2
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-12.c   -O2 -flto 
-fno-use-linker-plugin -flto-partition=none   scan-assembler-times 
add\\ta[0-7],a[0-7],a[0-7]\\s+\\.L[0-9][0-9]\\:\\s+vlm\\.v\\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\\s*\\([a-x0-9]+\\)
 5
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-12.c   -O2 -flto 
-fuse-linker-plugin -fno-fat-lto-objects   scan-assembler-times 
add\\ta[0-7],a[0-7],a[0-7]\\s+\\.L[0-9][0-9][0-9]\\:\\s+vlm\\.v\\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\\s*\\([a-x0-9]+\\)
 2
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-12.c   -O2 -flto 
-fuse-linker-plugin -fno-fat-lto-objects   scan-assembler-times 
add\\ta[0-7],a[0-7],a[0-7]\\s+\\.L[0-9][0-9]\\:\\s+vlm\\.v\\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\\s*\\([a-x0-9]+\\)
 5
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-3.c   -O2   
scan-assembler-times 
add\\ta[0-7],a[0-7],a[0-7]\\s+\\.L[0-9][0-9]\\:\\s+vle32\\.v\\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\\s*\\([a-x0-9]+\\)
 1
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-3.c   -O2 -flto 
-fno-use-linker-plugin -flto-partition=none   scan-assembler-times 
add\\ta[0-7],a[0-7],a[0-7]\\s+\\.L[0-9][0-9]\\:\\s+vle32\\.v\\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\\s*\\([a-x0-9]+\\)
 1
FAIL: gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-3.c   -O2 -flto 
-fuse-linker-plugin -fno-fat-lto-objects   scan-assembler-times 
add\\ta[0-7],a[0-7],a[0-7]\\s+\\.L[0-9][0-9]\\:\\s+vle32\\.v\\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\\s*\\([a-x0-9]+\\)
 1
FAIL: 

Re: Re: [PATCH V2] RISC-V: Refactor Phase 3 (Demand fusion) of VSETVL PASS

2023-08-27 Thread juzhe.zh...@rivai.ai
Thanks Kito.
Addressed all comments and committed as V3:
https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628423.html 



juzhe.zh...@rivai.ai
 
From: Kito Cheng
Date: 2023-08-25 01:01
To: Juzhe-Zhong
CC: gcc-patches; kito.cheng; jeffreyalaw; rdapp.gcc
Subject: Re: [PATCH V2] RISC-V: Refactor Phase 3 (Demand fusion) of VSETVL PASS
>
>-  Phase 3 - Backward && forward demanded info propagation and fusion 
> across
>   blocks.
>
 
Need update comment here.
 
>-  Phase 6 - Propagate AVL between vsetvl instructions.
 
Need update comment here too.
 
> +/* Return true if the current VSETVL is dominated by preceding VSETVL.  */
> +static bool
> +vsetvl_dominated_by_p (const basic_block cfg_bb,
> +  const vector_insn_info ,
> +  const vector_insn_info , bool fuse_p)
 
"VSETVL1 is dominated by preceding VSETVL2." ?
And what's the definition of dominated here?
It doesn't seem to be "dominate" in the traditional sense?
 
 
> vector_insn_info::merge (const vector_insn_info _info,
> -enum merge_type type) const
> +enum merge_type type, int bb_index) const
 
I would suggest just splitting this into two functions, local_merge and
global_merge, and removing merge_type.
Generally I like generalizing such functions by arguments, but these
two are different enough after this change.
 
 
> +  /* Recompute the AVL source when bb_index*/
 
This sentence seems to be incomplete?
 
 
> + if (dest_block_info.probability > 
> src_block_info.probability)
> +   prob = dest_block_info.probability;
 
prob = std::max(dest_block_info.probability, src_block_info.probability);
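
A minimal sketch of why the two forms agree, assuming the part of the hunk
not quoted here assigns the source probability otherwise. Plain double is a
hypothetical stand-in for the probability type; whether the real type
provides the operator< that std::max needs is not checked here.

```cpp
#include <algorithm>

// The quoted form: keep src unless dest is strictly larger.
// (The else-side assignment is an assumption; the hunk only shows the if.)
double pick_probability_if (double dest, double src)
{
  double prob = src;
  if (dest > src)
    prob = dest;
  return prob;
}

// The suggested form.
double pick_probability_max (double dest, double src)
{
  return std::max (dest, src);
}
```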
 
> @@ -3720,6 +3138,8 @@ pass_vsetvl::compute_local_properties (void)
>for (const bb_info *bb : crtl->ssa->bbs ())
>  {
>unsigned int curr_bb_idx = bb->index ();
> +  if (curr_bb_idx == ENTRY_BLOCK || curr_bb_idx == EXIT_BLOCK)
> +   continue;
>const auto local_dem
> = m_vector_manager->vector_block_infos[curr_bb_idx].local_dem;
>const auto reaching_out
 
This small change looks like it could be a small early-exit optimization
for this loop and could be a separate patch? If so, please send it
separately; it is pre-approved :)
 
 
 
> + if (src_block_info.reaching_out.empty_p ())
> +   {
...
> + else if (src_block_info.reaching_out.dirty_p ())
 
Could you add more comments explaining each condition?
 
> +   {
> + rtx vl = NULL_RTX;
> + if (!reaching_out.get_avl_source ())
> +   {
> + gcc_assert (vsetvl_insn_p (reaching_out.get_insn ()->rtl ()));
> + vl = get_vl (reaching_out.get_insn ()->rtl ());
> +   }
> + else
> +   vl = reaching_out.get_avl_reg_rtx ();
> + new_pat = gen_vsetvl_pat (VSETVL_NORMAL, reaching_out, vl);
> +   }
 
need more comment here too
 
> +  edge eg;
> +  edge_iterator eg_iterator;
> +  FOR_EACH_EDGE (eg, eg_iterator, cfg_bb->succs)
> {
> - fprintf (dump_file,
> -  "\nInsert vsetvl insn %d at the end of :\n",
> -  INSN_UID (new_insn), cfg_bb->index);
> - print_rtl_single (dump_file, new_insn);
> + /* We should not get an abnormal edge here.  */
> + gcc_assert (!(eg->flags & EDGE_ABNORMAL));
> + if (m_vector_manager->vsetvl_dominated_by_all_preds_p (cfg_bb,
> +
> reaching_out))
> +   continue;
> +
 
Also need more comments here.
 


[PATCH] IFCOMBINE: Remove outer condition for two same conditionals

2023-08-27 Thread Andrew Pinski via Gcc-patches
This adds a simple case to remove an outer condition if the two inner
conditionals are the same and lead to the same location.
This can show up due to jump threading or inlining, or because someone
wrote code like this.

ifcombine-1.c shows the simple case that this is supposed to solve.
Even though PRE can handle some cases, ifcombine is earlier and even runs
at -O1.
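
At the source level, the effect can be sketched roughly as follows. This is
only an illustration modelled on the shape of the new test: f_before,
f_after and the bool parameter standing in for the inner comparison are
hypothetical, and the pass itself works on GIMPLE basic blocks rather than
on source code.

```cpp
int g () { return 1; }
int h () { return 2; }

// Both arms of the outer test contain the same inner test and
// branch to the same two destinations.
int f_before (int a, bool inner)
{
  if (a == 0)
    {
      if (inner) return h (); else return g ();
    }
  else
    {
      if (inner) return h (); else return g ();
    }
}

// With the outer condition removed, only the inner test remains.
int f_after (int, bool inner)
{
  return inner ? h () : g ();
}
```

The outer test on a never changes which of g or h is called, which is why
it can be dropped.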

Note in the case of the PR here, it comes from jump threading.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

PR tree-optimization/110891
* tree-ssa-ifcombine.cc (ifcombine_bb_same): New function.
(tree_ssa_ifcombine_bb): Call ifcombine_bb_same.

gcc/testsuite/ChangeLog:

PR tree-optimization/110891
* gcc.dg/tree-ssa/ifcombine-1.c: New test.
* gcc.dg/tree-ssa/pr110891-1.c: New test.
---
 gcc/testsuite/gcc.dg/tree-ssa/ifcombine-1.c |  27 ++
 gcc/testsuite/gcc.dg/tree-ssa/pr110891-1.c  |  53 +++
 gcc/tree-ssa-ifcombine.cc   | 100 
 3 files changed, 180 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ifcombine-1.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr110891-1.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifcombine-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ifcombine-1.c
new file mode 100644
index 000..02d08efef87
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifcombine-1.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-optimized -fdump-tree-ifcombine" } */
+
+int g();
+int h();
+
+int j, l;
+
+int f(int a, int *b)
+{
+if (a == 0)
+{
+if (b == ) goto L9; else goto L7;
+}
+else
+{
+if (b == ) goto L9; else goto L7;
+}
+L7: return g();
+L9: return h();
+}
+
+/* ifcombine can optimize away the outer most if here. */
+/* { dg-final { scan-tree-dump-times "optimized away the test from bb " 1 
"ifcombine" } } */
+/* We should have remove the outer if and one of the inner ifs; leaving us 
with one if. */
+/* { dg-final { scan-tree-dump-times "if " 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "goto " 3 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr110891-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr110891-1.c
new file mode 100644
index 000..320d8823077
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr110891-1.c
@@ -0,0 +1,53 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+void foo(void);
+static int a, c = 7, d, o, q;
+static int *b = , *f, *j = , *n = , *ae;
+static short e, m;
+static short *i = 
+static char r;
+void __assert_fail(char *, char *, int, const char *) 
__attribute__((__noreturn__));
+static const short g();
+static void h();
+static int *k(int) {
+(*i)++;
+*j ^= *b;
+return 
+}
+static void l(unsigned p) {
+int *aa = 
+h();
+o = 5 ^ g() && p;
+if (f ==  || f ==  || f == )
+;
+else {
+foo();
+__assert_fail("", "", 3, __PRETTY_FUNCTION__);
+}
+*aa ^= *n;
+if (*aa)
+if (!(((p) >= 0) && ((p) <= 0))) {
+__builtin_unreachable();
+}
+k(p);
+}
+static const short g() { return q; }
+static void h() {
+unsigned ag = c;
+d = ag > r ? ag : 0;
+ae = k(c);
+f = ae;
+if (ae ==  || ae ==  || ae == )
+;
+else
+__assert_fail("", "", 4, __PRETTY_FUNCTION__);
+}
+int main() {
+l(a);
+m || (*b |= 64);
+*b &= 5;
+}
+
+/* We should be able to optimize away foo. */
+/* { dg-final { scan-tree-dump-not "foo " "optimized" } } */
diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc
index 46b076804f4..f79545b9a0b 100644
--- a/gcc/tree-ssa-ifcombine.cc
+++ b/gcc/tree-ssa-ifcombine.cc
@@ -666,6 +666,103 @@ ifcombine_ifandif (basic_block inner_cond_bb, bool 
inner_inv,
   return false;
 }
 
+/* Function to remove an outer condition if two inner basic blocks have the 
same condition and both empty otherwise. */
+
+static bool
+ifcombine_bb_same (basic_block cond_bb, basic_block outer_cond_bb,
+  basic_block then_bb, basic_block else_bb)
+{
+  basic_block inner_cond_bbt = nullptr, inner_cond_bbf = nullptr;
+
+  /* See if the the outer condition is a condition. */
+  if (!recognize_if_then_else (outer_cond_bb, &inner_cond_bbt, 
&inner_cond_bbf))
+return false;
+  basic_block other_cond_bb;
+  if (cond_bb == inner_cond_bbt)
+other_cond_bb = inner_cond_bbf;
+  else
+other_cond_bb = inner_cond_bbt;
+
+  /* The other bb has to have a single predecessor too. */
+  if (!single_pred_p (other_cond_bb))
+return false;
+
+  /* Other inner conditional bb needs to go to the same then and else blocks. 
*/
+  if (!recognize_if_then_else (other_cond_bb, &then_bb, &else_bb))
+return false;
+
+  /* Both edges of both inner basic blocks need to have the same values for 
the incoming phi for both then and else basic blocks. */
+  if (!same_phi_args_p (cond_bb, other_cond_bb, 

[PATCH] fortran: Restore interface to its previous state on error [PR48776]

2023-08-27 Thread Mikael Morin via Gcc-patches
Hello,

this fixes an old error-recovery bug.
Tested on x86_64-pc-linux-gnu.

OK for master?

-- >8 --

Keep memory of the content of the current interface body being parsed
and restore it to its previous state if it has been modified at the time
a parse attempt fails.

This fixes memory errors and random segmentation faults caused by
dangling symbol pointers kept in interfaces' linked lists of symbols.
If a parsing attempt fails and symbols are freed, they should also be
removed from the current interface linked list.

As the list of symbols is a linked list, and parsing only adds new
symbols to the head of the list, all that is needed to track the
previous content of the list is a pointer to its previous head.
This adds such a pointer, and the restoration of the list of symbols
to that pointer on error.
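
The idea can be sketched on a toy list. This is an illustration only: Node,
push_front and drop_elements_before are hypothetical names, not the
gfortran data structures, and as in the patch the saved tail is assumed to
be either null or an element of the list.

```cpp
#include <cstddef>

struct Node
{
  int value;
  Node *next;
};

// New elements are only ever added at the head of the list.
void push_front (Node *&head, int value)
{
  head = new Node{value, head};
}

// Free every element in front of TAIL and make HEAD point to TAIL.
// TAIL must be null or an element of the list starting at HEAD.
// Returns the number of elements freed.
std::size_t drop_elements_before (Node *&head, Node *tail)
{
  std::size_t freed = 0;
  while (head != tail)
    {
      Node *next = head->next;
      delete head;
      head = next;
      ++freed;
    }
  return freed;
}
```

Saving the head before a parse attempt and calling drop_elements_before
with it after a failure removes exactly the elements added during that
attempt; passing a null tail frees the whole list.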

PR fortran/48776

gcc/fortran/ChangeLog:

* gfortran.h (gfc_drop_interface_elements_before): New prototype.
(gfc_current_interface_head): Return a reference to the pointer.
* interface.cc (gfc_current_interface_head): Ditto.
(free_interface_elements_until): New function, generalizing
gfc_free_interface.
(gfc_free_interface): Use free_interface_elements_until.
(gfc_drop_interface_elements_before): New function.
* parse.cc
(current_interface_ptr, previous_interface_head): New static variables.
(current_interface_valid_p, get_current_interface_ptr): New functions.
(decode_statement): Initialize previous_interface_head.
(reject_statement): Restore current interface pointer to point to
previous_interface_head.

gcc/testsuite/ChangeLog:

* gfortran.dg/interface_procedure_1.f90: New test.
---
 gcc/fortran/gfortran.h|  3 +-
 gcc/fortran/interface.cc  | 43 ---
 gcc/fortran/parse.cc  | 54 +++
 .../gfortran.dg/interface_procedure_1.f90 | 23 
 4 files changed, 115 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/interface_procedure_1.f90

diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index fd47000a88e..0fabe7badde 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -3824,6 +3824,7 @@ bool gfc_ref_dimen_size (gfc_array_ref *, int dimen, 
mpz_t *, mpz_t *);
 
 /* interface.cc -- FIXME: some of these should be in symbol.cc */
 void gfc_free_interface (gfc_interface *);
+void gfc_drop_interface_elements_before (gfc_interface **, gfc_interface *);
 bool gfc_compare_derived_types (gfc_symbol *, gfc_symbol *);
 bool gfc_compare_types (gfc_typespec *, gfc_typespec *);
 bool gfc_check_dummy_characteristics (gfc_symbol *, gfc_symbol *,
@@ -3843,7 +3844,7 @@ void gfc_free_formal_arglist (gfc_formal_arglist *);
 bool gfc_extend_assign (gfc_code *, gfc_namespace *);
 bool gfc_check_new_interface (gfc_interface *, gfc_symbol *, locus);
 bool gfc_add_interface (gfc_symbol *);
-gfc_interface *gfc_current_interface_head (void);
+gfc_interface *&gfc_current_interface_head (void);
 void gfc_set_current_interface_head (gfc_interface *);
 gfc_symtree* gfc_find_sym_in_symtree (gfc_symbol*);
 bool gfc_arglist_matches_symbol (gfc_actual_arglist**, gfc_symbol*);
diff --git a/gcc/fortran/interface.cc b/gcc/fortran/interface.cc
index ea82056e9e3..c01df0460d7 100644
--- a/gcc/fortran/interface.cc
+++ b/gcc/fortran/interface.cc
@@ -78,18 +78,47 @@ along with GCC; see the file COPYING3.  If not see
 gfc_interface_info current_interface;
 
 
+/* Free the leading members of the gfc_interface linked list given in INTR
+   up to the END element (exclusive: the END element is not freed).
+   If END is not nullptr, it is assumed that END is in the linked list starting
+   with INTR.  */
+
+static void
+free_interface_elements_until (gfc_interface *intr, gfc_interface *end)
+{
+  gfc_interface *next;
+
+  for (; intr != end; intr = next)
+{
+  next = intr->next;
+  free (intr);
+}
+}
+
+
 /* Free a singly linked list of gfc_interface structures.  */
 
 void
 gfc_free_interface (gfc_interface *intr)
 {
-  gfc_interface *next;
+  free_interface_elements_until (intr, nullptr);
+}
 
-  for (; intr; intr = next)
-{
-  next = intr->next;
-  free (intr);
-}
+
+/* Update the interface pointer given by IFC_PTR to make it point to TAIL.
+   It is expected that TAIL (if non-null) is in the list pointed to by
+   IFC_PTR, hence the tail of it.  The members of the list before TAIL are
+   freed before the pointer reassignment.  */
+
+void
+gfc_drop_interface_elements_before (gfc_interface **ifc_ptr,
+   gfc_interface *tail)
+{
+  if (ifc_ptr == nullptr)
+return;
+
+  free_interface_elements_until (*ifc_ptr, tail);
+  *ifc_ptr = tail;
 }
 
 
@@ -4953,7 +4982,7 @@ gfc_add_interface (gfc_symbol *new_sym)
 }
 
 
-gfc_interface *
+gfc_interface *&
 gfc_current_interface_head (void)
 {
   switch (current_interface.type)
diff 

[committed] RISC-V: Fix spill-11.c testsuite failure

2023-08-27 Thread Jeff Law


Jivan's work also results in using a different save/restore function for 
the spill-11 test.  So the expected output needs minor adjusting.


Committed to the trunk.

Jeff
commit 3745feb19ed072e0865b12a891d7dbf7ba12c337
Author: Jeff Law 
Date:   Sun Aug 27 13:00:13 2023 -0600

RISC-V: Fix spill-11.c testsuite failure

Jivan's work also results in using a different save/restore function for the
spill-11 test.  So the expected output needs minor adjusting

gcc/testsuite
* gcc.target/riscv/rvv/base/spill-11.c: Adjust expected output.

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/spill-11.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/spill-11.c
index 179be1c8c5b..484a2510885 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/spill-11.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/spill-11.c
@@ -9,7 +9,7 @@ void fn3 (char*);
 
 /*
 ** stack_save_restore_2:
-** call\tt0,__riscv_save_2
+** call\tt0,__riscv_save_1
 ** csrr\tt0,vlenb
 ** slli\tt1,t0,1
 ** sub\tsp,sp,t1
@@ -23,7 +23,7 @@ void fn3 (char*);
 ** li\tt0,8192
 ** addi\tt0,t0,-192
 ** add\tsp,sp,t0
-** tail\t__riscv_restore_2
+** tail\t__riscv_restore_1
 */
 int stack_save_restore_2 (float a1, float a2, float a3, float a4,
   float a5, float a6, float a7, float a8,


[committed] RISC-V: Fix spill-12 test

2023-08-27 Thread Jeff Law


Jivan's recent work on IRA results in more efficient code for this test. 
This adjusts the expected output for the removal of 5 instructions and 
conversion of an addi into a simple mv.
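
As a quick check of the arithmetic, the removed lines of the expected
output can be modelled step by step. This is a sketch only: old_sequence
and new_sequence are hypothetical helper names, and 64-bit signed integers
with modest sp values stand in for the registers.

```cpp
#include <cstdint>

// Value left in a0 by the lines removed from the expected output.
std::int64_t old_sequence (std::int64_t sp)
{
  std::int64_t a0 = -8192;  // li   a0,-8192
  a0 += 192;                // addi a0,a0,192
  std::int64_t a5 = 8192;   // li   a5,8192
  a5 += -192;               // addi a5,a5,-192
  a5 += a0;                 // add  a5,a5,a0  (the offsets cancel to 0)
  a0 = a5 + sp;             // add  a0,a5,sp
  return a0;
}

// Value left in a0 by the replacement line.
std::int64_t new_sequence (std::int64_t sp)
{
  return sp;                // mv   a0,sp
}
```

The two offsets, -8000 and 8000, cancel, so both sequences leave sp in a0.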


Pushed to the trunk,
Jeff
commit 6567837fd823a93f7f7948a73ff9dc1153592e8c
Author: Jeff Law 
Date:   Sun Aug 27 12:52:38 2023 -0600

RISC-V: Fix spill-12 test

Jivan's recent work on IRA results in more efficient code for this test.
This adjusts the expected output for the removal of 5 instructions and
conversion of an addi into a simple mv.

gcc/testsuite
* gcc.target/riscv/rvv/base/spill-12.c: Update expected output.

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/spill-12.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/spill-12.c
index de6e0604a3c..7e83cb7b7c1 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/spill-12.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/spill-12.c
@@ -15,12 +15,7 @@ void fn3 (char*);
 ** addi\tt0,t0,192
 ** add\tsp,sp,t0
 ** ...
-** li\ta0,-8192
-** addi\ta0,a0,192
-** li\ta5,8192
-** addi\ta5,a5,-192
-** add\ta5,a5,a0
-** add\ta0,a5,sp
+** mv\ta0,sp
 ** ...
 ** tail\t__riscv_restore_0
 */


[committed] RISC-V: Fix xtheadcondmov-indirect.c

2023-08-27 Thread Jeff Law


The pressure sensitive scheduling change perturbs the output ever so 
slightly for this test.  Seemed easiest to just turn that off rather 
than generalize the expected output enough to work across all the 
relevant optimization options.


Pushed to the trunk.

Jeff
commit b3b13fb1cbad6e5836dee947e85d2954bcacabed
Author: Jeff Law 
Date:   Sun Aug 27 12:38:30 2023 -0600

RISC-V: Fix xtheadcondmov-indirect.c

The pressure sensitive scheduling change perturbs the output ever so
slightly for this test.  Seemed easiest to just turn that off rather
than generalize the expected output enough to work across all the
relevant optimization options.

gcc/testsuite/
* gcc.target/riscv/xtheadcondmov-indirect.c: Turn off pressure
sensitive scheduling.

diff --git a/gcc/testsuite/gcc.target/riscv/xtheadcondmov-indirect.c 
b/gcc/testsuite/gcc.target/riscv/xtheadcondmov-indirect.c
index 8292999d0c7..c3253ba5239 100644
--- a/gcc/testsuite/gcc.target/riscv/xtheadcondmov-indirect.c
+++ b/gcc/testsuite/gcc.target/riscv/xtheadcondmov-indirect.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
-/* { dg-options "-march=rv32gc_xtheadcondmov" { target { rv32 } } } */
-/* { dg-options "-march=rv64gc_xtheadcondmov" { target { rv64 } } } */
+/* { dg-options "-march=rv32gc_xtheadcondmov -fno-sched-pressure" { target { 
rv32 } } } */
+/* { dg-options "-march=rv64gc_xtheadcondmov -fno-sched-pressure" { target { 
rv64 } } } */
 /* { dg-skip-if "" { *-*-* } {"-O0" "-Os" "-Og" "-Oz" "-flto" } } */
 /* { dg-final { check-function-bodies "**" "" } } */
 


[Patch/fortran] PR87477 [meta-bug] [F03] issues concerning the ASSOCIATE statement

2023-08-27 Thread Paul Richard Thomas via Gcc-patches
After two months on trunk, this has been backported:

Fortran: Fix some problems blocking associate meta-bug [PR87477]

2023-08-27  Paul Thomas  

gcc/fortran
PR fortran/87477
* parse.cc (parse_associate): Replace the existing evaluation
of the target rank with calls to gfc_resolve_ref and
gfc_expression_rank. Identify untyped target function results
with structure constructors by finding the appropriate derived
type.
* resolve.cc (resolve_symbol): Allow associate variables to be
assumed shape.

gcc/testsuite/
PR fortran/87477
* gfortran.dg/associate_54.f90 : Cope with extra error.

PR fortran/102109
* gfortran.dg/pr102109.f90 : New test.

PR fortran/102112
* gfortran.dg/pr102112.f90 : New test.

PR fortran/102190
* gfortran.dg/pr102190.f90 : New test.

PR fortran/102532
* gfortran.dg/pr102532.f90 : New test.

PR fortran/109948
* gfortran.dg/pr109948.f90 : New test.

PR fortran/99326
* gfortran.dg/pr99326.f90 : New test.

Regards

Paul