Re: [PATCH v6] LoongArch: Fixed a compilation failure with '%c' in inline assembly [PR107731].

2023-01-19 Thread chenglulu



On 2023/1/18 at 5:14 PM, Richard Sandiford wrote:

Lulu Cheng  writes:

Co-authored-by: Yang Yujie 

gcc/ChangeLog:

* config/loongarch/loongarch.cc (loongarch_classify_address):
Add processing for CONST_INT.
(loongarch_print_operand_reloc): Support operand modifier 'c'.
(loongarch_print_operand): Add handling of '%c'.
* doc/extend.texi: Add documentation for LoongArch operand modifiers,
and port the generic operand modifier information to this document.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/tst-asm-const.c: Moved to...
* gcc.target/loongarch/pr107731.c: ...here.
---
v2 -> v3:
1. Correct a clerical error.
2. Add documentation for LoongArch operand modifiers.

v3 -> v4:
Copy the description of "%c" "%n" "%a" "%l" from gccint.pdf to gcc.pdf.

v4 -> v5:
Move the operand modifiers description of "%c", "%n", "%a", "%l" to the top of
the x86Operandmodifiers section.

v5 -> v6:
Adjust the location of the added section in the document.
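
For readers who have not used the generic '%c' modifier that this patch enables on LoongArch, here is a minimal sketch. It assumes GNU as on an ELF target and a GCC-compatible compiler; the names demo_const, DEMO_VALUE, emit_demo_const and read_demo_const are made up for the illustration and do not come from the patch or its testcase. The asm statement places a constant in .rodata with a data directive, which only assembles if the operand is printed as a bare integer.

```cpp
// Sketch only: assumes GNU as, an ELF target, and a GCC-compatible compiler.
// All identifiers below are hypothetical and not part of the patch.
#define DEMO_VALUE 42

// Defined by the asm statement below, not by any C++ definition.
extern "C" const int demo_const;

// noinline/noclone/used keep exactly one copy of the asm statement,
// so the label demo_const is defined once.
__attribute__ ((noinline, noclone, used)) static void
emit_demo_const ()
{
  // %c0 prints the constant operand as a plain integer with no
  // target-specific immediate punctuation, so it is valid after .long.
  __asm__ volatile (".pushsection .rodata\n\t"
                    ".balign 4\n\t"
                    ".globl demo_const\n"
                    "demo_const:\n\t"
                    ".long %c0\n\t"
                    ".popsection"
                    : /* no outputs */
                    : "i" (DEMO_VALUE));
}

// Read back the value that the assembler stored.
int
read_demo_const ()
{
  return demo_const;
}
```

On x86 with AT&T syntax, a plain %0 with an "i" constraint would be printed as $42, which the assembler rejects after .long; that is the usual reason for reaching for '%c'.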

---
  gcc/config/loongarch/loongarch.cc | 14 +
  gcc/doc/extend.texi   | 51 +--
  .../loongarch/{tst-asm-const.c => pr107731.c} |  6 +--
  3 files changed, 64 insertions(+), 7 deletions(-)
  rename gcc/testsuite/gcc.target/loongarch/{tst-asm-const.c => pr107731.c} (78%)

diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index c6b03fcf2f9..cdf190b985e 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -2075,6 +2075,11 @@ loongarch_classify_address (struct loongarch_address_info *info, rtx x,
return (loongarch_valid_base_register_p (info->reg, mode, strict_p)
  && loongarch_valid_lo_sum_p (info->symbol_type, mode,
   info->offset));
+case CONST_INT:
+  /* Small-integer addresses don't occur very often, but they
+are legitimate if $r0 is a valid base register.  */
+  info->type = ADDRESS_CONST_INT;
+  return IMM12_OPERAND (INTVAL (x));
  
  default:

return false;
@@ -4933,6 +4938,7 @@ loongarch_print_operand_reloc (FILE *file, rtx op, bool hi64_part,
  
 'A'	Print a _DB suffix if the memory model requires a release.

 'b'Print the address of a memory operand, without offset.
+   'c'  Print an integer.
 'C'Print the integer branch condition for comparison OP.
 'd'Print CONST_INT OP in decimal.
 'F'Print the FPU branch condition for comparison OP.
@@ -4979,6 +4985,14 @@ loongarch_print_operand (FILE *file, rtx op, int letter)
 fputs ("_db", file);
break;
  
+case 'c':

+  if (CONST_INT_P (op))
+   fprintf (file, HOST_WIDE_INT_PRINT_DEC, INTVAL (op));
+  else
+   output_operand_lossage ("unsupported operand for code '%c'", letter);
+
+  break;
+
  case 'C':
loongarch_print_int_branch_condition (file, code, letter);
break;
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 1103e9936f7..6a5d9faf2f3 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -10402,8 +10402,10 @@ ensures that modifying @var{a} does not affect the address referenced by
  is undefined if @var{a} is modified before using @var{b}.
  
  @code{asm} supports operand modifiers on operands (for example @samp{%k2}

-instead of simply @samp{%2}). Typically these qualifiers are hardware
-dependent. The list of supported modifiers for x86 is found at
+instead of simply @samp{%2}). @ref{GenericOperandmodifiers,
+Generic Operand modifiers} lists the modifiers that are available
+on all targets.  Other modifiers are hardware dependent.
+For example, the list of supported modifiers for x86 is found at
  @ref{x86Operandmodifiers,x86 Operand modifiers}.
  
  If the C code that follows the @code{asm} makes no use of any of the output

@@ -10671,8 +10673,10 @@ optimizers may discard the @code{asm} statement as unneeded
  (see @ref{Volatile}).
  
  @code{asm} supports operand modifiers on operands (for example @samp{%k2}

-instead of simply @samp{%2}). Typically these qualifiers are hardware
-dependent. The list of supported modifiers for x86 is found at
+instead of simply @samp{%2}). @ref{GenericOperandmodifiers,
+Generic Operand modifiers} lists the modifiers that are available
+on all targets.  Other modifiers are hardware dependent.
+For example, the list of supported modifiers for x86 is found at
  @ref{x86Operandmodifiers,x86 Operand modifiers}.
  
  In this example using the fictitious @code{combine} instruction, the

@@ -11024,6 +11028,30 @@ lab:
  @}
  @end example
  
+@anchor{GenericOperandmodifiers}

+@subsubsection Generic Operand Modifiers
+@noindent
+The following table shows the modifiers supported by all targets and their effects:
+
+@multitable {Modifier} {Print the opcode suffix for the size of th} {Operand}

I guess this should be {Modifier} {Description} {...} too.  Maybe

Re: git out-of-order commit (was Re: [PATCH] Fortran: Remove unused declaration)

2023-01-19 Thread Bernhard Reutner-Fischer via Gcc-patches
On 19 January 2023 20:39:08 CET, Jason Merrill  wrote:
>On Sat, Nov 12, 2022 at 4:24 PM Harald Anlauf via Gcc-patches
> wrote:
>>
>> Am 12.11.22 um 22:05 schrieb Bernhard Reutner-Fischer via Gcc-patches:
>> > This function definition was removed years ago, remove its prototype.
>> >
>> > gcc/fortran/ChangeLog:
>> >
>> >   * gfortran.h (gfc_check_include): Remove declaration.
>> > ---
>> >   gcc/fortran/gfortran.h | 1 -
>> >   1 file changed, 1 deletion(-)
>> > ---
>> > Regtests cleanly, ok for trunk?
>> >
>> > diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
>> > index c4deec0d5b8..ce3ad61bb52 100644
>> > --- a/gcc/fortran/gfortran.h
>> > +++ b/gcc/fortran/gfortran.h
>> > @@ -3208,7 +3208,6 @@ int gfc_at_eof (void);
>> >   int gfc_at_bol (void);
>> >   int gfc_at_eol (void);
>> >   void gfc_advance_line (void);
>> > -int gfc_check_include (void);
>> >   int gfc_define_undef_line (void);
>> >
>> >   int gfc_wide_is_printable (gfc_char_t);
>>
>> OK, thanks.
>
>Somehow this was applied with a CommitDate in 2021, breaking scripts
>that assume monotonically increasing CommitDate.  Anyone know how that
>could have happened?

Sorry for that.
I think I cherry-picked this commit to master before pushing it, not 100% sure though.
What shall we do now?


[PATCH] RISC-V: Add vlse/vsse C/C++ API intrinsics support

2023-01-19 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/predicates.md (pmode_reg_or_0_operand): New predicate.
* config/riscv/riscv-vector-builtins-bases.cc (class loadstore): Add 
vlse/vsse intrinsic support.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def (vlse): Ditto.
(vsse): Ditto.
* config/riscv/riscv-vector-builtins.cc 
(function_expander::use_contiguous_load_insn): Ditto.
* config/riscv/vector.md (@pred_strided_load): Ditto.
(@pred_strided_store): Ditto.

---
 gcc/config/riscv/predicates.md|  4 +
 .../riscv/riscv-vector-builtins-bases.cc  | 26 +-
 .../riscv/riscv-vector-builtins-bases.h   |  2 +
 .../riscv/riscv-vector-builtins-functions.def |  2 +
 gcc/config/riscv/riscv-vector-builtins.cc | 33 ++-
 gcc/config/riscv/vector.md| 90 +--
 6 files changed, 143 insertions(+), 14 deletions(-)

diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index 5a5a49bf7c0..bae9cfa02dd 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -286,6 +286,10 @@
(match_test "GET_CODE (op) == UNSPEC
 && (XINT (op, 1) == UNSPEC_VUNDEF)"
 
+(define_special_predicate "pmode_reg_or_0_operand"
+  (ior (match_operand 0 "const_0_operand")
+   (match_operand 0 "pmode_register_operand")))
+
 ;; The scalar operand can be directly broadcast by RVV instructions.
 (define_predicate "direct_broadcast_operand"
   (ior (match_operand 0 "register_operand")
diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index 0da4797d272..17a1294cf85 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -84,8 +84,8 @@ public:
   }
 };
 
-/* Implements vle.v/vse.v/vlm.v/vsm.v codegen.  */
-template <bool STORE_P>
+/* Implements vle.v/vse.v/vlm.v/vsm.v/vlse.v/vsse.v codegen.  */
+template <bool STORE_P, bool STRIDED_P = false>
 class loadstore : public function_base
 {
   unsigned int call_properties (const function_instance &) const override
@@ -106,9 +106,23 @@ class loadstore : public function_base
   rtx expand (function_expander &e) const override
   {
 if (STORE_P)
-  return e.use_contiguous_store_insn (code_for_pred_store (e.vector_mode ()));
+  {
+   if (STRIDED_P)
+ return e.use_contiguous_store_insn (
+   code_for_pred_strided_store (e.vector_mode ()));
+   else
+ return e.use_contiguous_store_insn (
+   code_for_pred_store (e.vector_mode ()));
+  }
 else
-  return e.use_contiguous_load_insn (code_for_pred_mov (e.vector_mode ()));
+  {
+   if (STRIDED_P)
+ return e.use_contiguous_load_insn (
+   code_for_pred_strided_load (e.vector_mode ()));
+   else
+ return e.use_contiguous_load_insn (
+   code_for_pred_mov (e.vector_mode ()));
+  }
   }
 };
 
@@ -118,6 +132,8 @@ static CONSTEXPR const loadstore<false> vle_obj;
 static CONSTEXPR const loadstore<true> vse_obj;
 static CONSTEXPR const loadstore<false> vlm_obj;
 static CONSTEXPR const loadstore<true> vsm_obj;
+static CONSTEXPR const loadstore<false, true> vlse_obj;
+static CONSTEXPR const loadstore<true, true> vsse_obj;
 
 /* Declare the function base NAME, pointing it to an instance
of class _obj.  */
@@ -130,5 +146,7 @@ BASE (vle)
 BASE (vse)
 BASE (vlm)
 BASE (vsm)
+BASE (vlse)
+BASE (vsse)
 
 } // end namespace riscv_vector
diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.h b/gcc/config/riscv/riscv-vector-builtins-bases.h
index 28151a8d8d2..d8676e94b28 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.h
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.h
@@ -30,6 +30,8 @@ extern const function_base *const vle;
 extern const function_base *const vse;
 extern const function_base *const vlm;
 extern const function_base *const vsm;
+extern const function_base *const vlse;
+extern const function_base *const vsse;
 }
 
 } // end namespace riscv_vector
diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def b/gcc/config/riscv/riscv-vector-builtins-functions.def
index 63aa8fe32c8..348262928c8 100644
--- a/gcc/config/riscv/riscv-vector-builtins-functions.def
+++ b/gcc/config/riscv/riscv-vector-builtins-functions.def
@@ -44,5 +44,7 @@ DEF_RVV_FUNCTION (vle, loadstore, full_preds, all_v_scalar_const_ptr_ops)
 DEF_RVV_FUNCTION (vse, loadstore, none_m_preds, all_v_scalar_ptr_ops)
 DEF_RVV_FUNCTION (vlm, loadstore, none_preds, b_v_scalar_const_ptr_ops)
 DEF_RVV_FUNCTION (vsm, loadstore, none_preds, b_v_scalar_ptr_ops)
+DEF_RVV_FUNCTION (vlse, loadstore, full_preds, all_v_scalar_const_ptr_ptrdiff_ops)
+DEF_RVV_FUNCTION (vsse, loadstore, none_m_preds, all_v_scalar_ptr_ptrdiff_ops)
 
 #undef DEF_RVV_FUNCTION
diff --git a/gcc/config/riscv/riscv-vector-builtins.cc b/gcc/config/riscv/riscv-vector-builtins.cc
index f95fe0d58d5..b97a2c94550 

[PATCH] xtensa: Revise 89afb2e86fcb29c559b2957fdcbea0d01740c49b

2023-01-19 Thread Takayuki 'January June' Suwa via Gcc-patches
In the previously posted patch
"xtensa: Make complex hard register clobber elimination more robust and accurate",
the code that checks for insns referring to the [DS]Cmode hard register
after it is clobbered but before it is overwritten is incomplete.
Fortunately such insns are seldom emitted, so it didn't matter.

This patch fixes that for the sake of completeness.

gcc/ChangeLog:

* config/xtensa/xtensa.md:
Fix exit from loops detecting references before overwriting in the
split pattern.
---
 gcc/config/xtensa/xtensa.md | 72 +++--
 1 file changed, 37 insertions(+), 35 deletions(-)

diff --git a/gcc/config/xtensa/xtensa.md b/gcc/config/xtensa/xtensa.md
index 8432d7bcb..e26772413 100644
--- a/gcc/config/xtensa/xtensa.md
+++ b/gcc/config/xtensa/xtensa.md
@@ -2976,45 +2976,47 @@
 {
   auto_sbitmap bmp (FIRST_PSEUDO_REGISTER);
   rtx_insn *insn;
-  rtx reg = gen_rtx_REG (SImode, 0);
+  rtx reg = gen_rtx_REG (SImode, 0), dest;
+  unsigned int regno;
+  sbitmap_iterator iter;
   bitmap_set_range (bmp, REGNO (operands[0]), REG_NREGS (operands[0]));
   for (insn = next_nonnote_nondebug_insn_bb (curr_insn);
insn; insn = next_nonnote_nondebug_insn_bb (insn))
-{
-  sbitmap_iterator iter;
-  unsigned int regno;
-  if (NONJUMP_INSN_P (insn))
-   {
- EXECUTE_IF_SET_IN_BITMAP (bmp, 2, regno, iter)
-   {
- set_regno_raw (reg, regno, REG_NREGS (reg));
- if (reg_overlap_mentioned_p (reg, PATTERN (insn)))
-   break;
-   }
- if (GET_CODE (PATTERN (insn)) == SET)
-   {
- rtx x = SET_DEST (PATTERN (insn));
- if (REG_P (x) && HARD_REGISTER_P (x))
-   bitmap_clear_range (bmp, REGNO (x), REG_NREGS (x));
- else if (SUBREG_P (x) && HARD_REGISTER_P (SUBREG_REG (x)))
-   {
- struct subreg_info info;
- subreg_get_info (regno = REGNO (SUBREG_REG (x)),
-  GET_MODE (SUBREG_REG (x)),
-  SUBREG_BYTE (x), GET_MODE (x), &info);
- if (!info.representable_p)
-   break;
- bitmap_clear_range (bmp, regno + info.offset, info.nregs);
-   }
-   }
- if (bitmap_empty_p (bmp))
-   goto FALLTHRU;
-   }
-  else if (CALL_P (insn))
+if (NONJUMP_INSN_P (insn))
+  {
EXECUTE_IF_SET_IN_BITMAP (bmp, 2, regno, iter)
-if (call_used_or_fixed_reg_p (regno))
-  break;
-}
+ {
+   set_regno_raw (reg, regno, REG_NREGS (reg));
+   if (reg_referenced_p (reg, PATTERN (insn)))
+ goto ABORT;
+ }
+   if (GET_CODE (PATTERN (insn)) == SET
+   || GET_CODE (PATTERN (insn)) == CLOBBER)
+ {
+   dest = SET_DEST (PATTERN (insn));
+   if (REG_P (dest) && HARD_REGISTER_P (dest))
+ bitmap_clear_range (bmp, REGNO (dest), REG_NREGS (dest));
+   else if (SUBREG_P (dest)
+&& HARD_REGISTER_P (SUBREG_REG (dest)))
+ {
+   struct subreg_info info;
+   subreg_get_info (regno = REGNO (SUBREG_REG (dest)),
+GET_MODE (SUBREG_REG (dest)),
+SUBREG_BYTE (dest), GET_MODE (dest),
+&info);
+   if (!info.representable_p)
+ break;
+   bitmap_clear_range (bmp, regno + info.offset, info.nregs);
+ }
+ }
+   if (bitmap_empty_p (bmp))
+ goto FALLTHRU;
+  }
+else if (CALL_P (insn))
+  EXECUTE_IF_SET_IN_BITMAP (bmp, 2, regno, iter)
+   if (call_used_or_fixed_reg_p (regno))
+ goto ABORT;
+ABORT:
   FAIL;
 FALLTHRU:;
 })
-- 
2.30.2


[PATCH] RISC-V: Add vle/vse C++ overloaded API intrinsic testcases

2023-01-19 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/testsuite/ChangeLog:

* g++.target/riscv/rvv/base/vle-1.C: New test.
* g++.target/riscv/rvv/base/vle_tu-1.C: New test.
* g++.target/riscv/rvv/base/vle_tum-1.C: New test.
* g++.target/riscv/rvv/base/vle_tumu-1.C: New test.
* g++.target/riscv/rvv/base/vse-1.C: New test.

---
 .../g++.target/riscv/rvv/base/vle-1.C | 345 +
 .../g++.target/riscv/rvv/base/vle_tu-1.C  | 345 +
 .../g++.target/riscv/rvv/base/vle_tum-1.C | 345 +
 .../g++.target/riscv/rvv/base/vle_tumu-1.C| 345 +
 .../g++.target/riscv/rvv/base/vse-1.C | 685 ++
 5 files changed, 2065 insertions(+)
 create mode 100644 gcc/testsuite/g++.target/riscv/rvv/base/vle-1.C
 create mode 100644 gcc/testsuite/g++.target/riscv/rvv/base/vle_tu-1.C
 create mode 100644 gcc/testsuite/g++.target/riscv/rvv/base/vle_tum-1.C
 create mode 100644 gcc/testsuite/g++.target/riscv/rvv/base/vle_tumu-1.C
 create mode 100644 gcc/testsuite/g++.target/riscv/rvv/base/vse-1.C

diff --git a/gcc/testsuite/g++.target/riscv/rvv/base/vle-1.C b/gcc/testsuite/g++.target/riscv/rvv/base/vle-1.C
new file mode 100644
index 000..e06f62a8fb9
--- /dev/null
+++ b/gcc/testsuite/g++.target/riscv/rvv/base/vle-1.C
@@ -0,0 +1,345 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3 -fno-schedule-insns -fno-schedule-insns2" } */
+
+#include "riscv_vector.h"
+
+vint8mf8_t
+test___riscv_vle8(vbool64_t mask,int8_t* base,size_t vl)
+{
+  return __riscv_vle8(mask,base,vl);
+}
+
+vint8mf4_t
+test___riscv_vle8(vbool32_t mask,int8_t* base,size_t vl)
+{
+  return __riscv_vle8(mask,base,vl);
+}
+
+vint8mf2_t
+test___riscv_vle8(vbool16_t mask,int8_t* base,size_t vl)
+{
+  return __riscv_vle8(mask,base,vl);
+}
+
+vint8m1_t
+test___riscv_vle8(vbool8_t mask,int8_t* base,size_t vl)
+{
+  return __riscv_vle8(mask,base,vl);
+}
+
+vint8m2_t
+test___riscv_vle8(vbool4_t mask,int8_t* base,size_t vl)
+{
+  return __riscv_vle8(mask,base,vl);
+}
+
+vint8m4_t
+test___riscv_vle8(vbool2_t mask,int8_t* base,size_t vl)
+{
+  return __riscv_vle8(mask,base,vl);
+}
+
+vint8m8_t
+test___riscv_vle8(vbool1_t mask,int8_t* base,size_t vl)
+{
+  return __riscv_vle8(mask,base,vl);
+}
+
+vuint8mf8_t
+test___riscv_vle8(vbool64_t mask,uint8_t* base,size_t vl)
+{
+  return __riscv_vle8(mask,base,vl);
+}
+
+vuint8mf4_t
+test___riscv_vle8(vbool32_t mask,uint8_t* base,size_t vl)
+{
+  return __riscv_vle8(mask,base,vl);
+}
+
+vuint8mf2_t
+test___riscv_vle8(vbool16_t mask,uint8_t* base,size_t vl)
+{
+  return __riscv_vle8(mask,base,vl);
+}
+
+vuint8m1_t
+test___riscv_vle8(vbool8_t mask,uint8_t* base,size_t vl)
+{
+  return __riscv_vle8(mask,base,vl);
+}
+
+vuint8m2_t
+test___riscv_vle8(vbool4_t mask,uint8_t* base,size_t vl)
+{
+  return __riscv_vle8(mask,base,vl);
+}
+
+vuint8m4_t
+test___riscv_vle8(vbool2_t mask,uint8_t* base,size_t vl)
+{
+  return __riscv_vle8(mask,base,vl);
+}
+
+vuint8m8_t
+test___riscv_vle8(vbool1_t mask,uint8_t* base,size_t vl)
+{
+  return __riscv_vle8(mask,base,vl);
+}
+
+vint16mf4_t
+test___riscv_vle16(vbool64_t mask,int16_t* base,size_t vl)
+{
+  return __riscv_vle16(mask,base,vl);
+}
+
+vint16mf2_t
+test___riscv_vle16(vbool32_t mask,int16_t* base,size_t vl)
+{
+  return __riscv_vle16(mask,base,vl);
+}
+
+vint16m1_t
+test___riscv_vle16(vbool16_t mask,int16_t* base,size_t vl)
+{
+  return __riscv_vle16(mask,base,vl);
+}
+
+vint16m2_t
+test___riscv_vle16(vbool8_t mask,int16_t* base,size_t vl)
+{
+  return __riscv_vle16(mask,base,vl);
+}
+
+vint16m4_t
+test___riscv_vle16(vbool4_t mask,int16_t* base,size_t vl)
+{
+  return __riscv_vle16(mask,base,vl);
+}
+
+vint16m8_t
+test___riscv_vle16(vbool2_t mask,int16_t* base,size_t vl)
+{
+  return __riscv_vle16(mask,base,vl);
+}
+
+vuint16mf4_t
+test___riscv_vle16(vbool64_t mask,uint16_t* base,size_t vl)
+{
+  return __riscv_vle16(mask,base,vl);
+}
+
+vuint16mf2_t
+test___riscv_vle16(vbool32_t mask,uint16_t* base,size_t vl)
+{
+  return __riscv_vle16(mask,base,vl);
+}
+
+vuint16m1_t
+test___riscv_vle16(vbool16_t mask,uint16_t* base,size_t vl)
+{
+  return __riscv_vle16(mask,base,vl);
+}
+
+vuint16m2_t
+test___riscv_vle16(vbool8_t mask,uint16_t* base,size_t vl)
+{
+  return __riscv_vle16(mask,base,vl);
+}
+
+vuint16m4_t
+test___riscv_vle16(vbool4_t mask,uint16_t* base,size_t vl)
+{
+  return __riscv_vle16(mask,base,vl);
+}
+
+vuint16m8_t
+test___riscv_vle16(vbool2_t mask,uint16_t* base,size_t vl)
+{
+  return __riscv_vle16(mask,base,vl);
+}
+
+vint32mf2_t
+test___riscv_vle32(vbool64_t mask,int32_t* base,size_t vl)
+{
+  return __riscv_vle32(mask,base,vl);
+}
+
+vint32m1_t
+test___riscv_vle32(vbool32_t mask,int32_t* base,size_t vl)
+{
+  return __riscv_vle32(mask,base,vl);
+}
+
+vint32m2_t
+test___riscv_vle32(vbool16_t mask,int32_t* base,size_t vl)
+{
+  return __riscv_vle32(mask,base,vl);
+}
+
+vint32m4_t
+test___riscv_vle32(vbool8_t mask,int32_t* base,size_t vl)
+{
+  return __riscv_vle32(mask,base,vl);
+}
+

[PATCH] RISC-V: Fix vop_m overloaded C++ API name.

2023-01-19 Thread juzhe . zhong
From: Ju-Zhe Zhong 

According to https://github.com/riscv-non-isa/rvv-intrinsic-doc/tree/master/
the C++ overloaded API for "vop_m" intrinsics does not have an "_m" suffix.

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-shapes.cc (struct loadstore_def): 
Remove _m suffix for "vop_m" C++ overloaded API name.

---
 gcc/config/riscv/riscv-vector-builtins-shapes.cc | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/config/riscv/riscv-vector-builtins-shapes.cc b/gcc/config/riscv/riscv-vector-builtins-shapes.cc
index 76cf14a8cc4..56697f71cbd 100644
--- a/gcc/config/riscv/riscv-vector-builtins-shapes.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-shapes.cc
@@ -128,6 +128,10 @@ struct loadstore_def : public build_base
b.append_name (type_suffixes[instance.type.index].vector);
   }
 
+/* According to rvv-intrinsic-doc, it does not add "_m" suffix
+   for vop_m C++ overloaded API.  */
+if (overloaded_p && instance.pred == PRED_TYPE_m)
+  return b.finish_name ();
 b.append_name (predication_suffixes[instance.pred]);
 return b.finish_name ();
   }
-- 
2.36.3
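
The naming rule in this patch can be modelled outside the compiler. The following is a rough sketch, not the code in riscv-vector-builtins-shapes.cc: pred_kind, pred_suffix and intrinsic_name are hypothetical names, the suffix strings are illustrative, and the assumption that the type suffix is appended only for the non-overloaded form is inferred from the surrounding context rather than shown in the hunk.

```cpp
#include <string>

// Hypothetical predication kinds, for illustration only.
enum class pred_kind { none, m, tu, tum };

// Illustrative suffix strings for each predication kind.
static const char *
pred_suffix (pred_kind pred)
{
  switch (pred)
    {
    case pred_kind::m:   return "_m";
    case pred_kind::tu:  return "_tu";
    case pred_kind::tum: return "_tum";
    default:             return "";
    }
}

// Build an intrinsic name from its base, type suffix and predication.
// Assumption: only the non-overloaded form carries the type suffix.
std::string
intrinsic_name (const std::string &base, const std::string &type_suffix,
                pred_kind pred, bool overloaded_p)
{
  std::string name = "__riscv_" + base;
  if (!overloaded_p)
    name += type_suffix;
  // The rule added by the patch: the overloaded masked form has no "_m".
  if (overloaded_p && pred == pred_kind::m)
    return name;
  name += pred_suffix (pred);
  return name;
}
```

With this model, the overloaded masked load comes out as __riscv_vle8, while other predication kinds still keep their suffix in the overloaded form.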



[PATCH] c++: Quash bogus -Wunused-value with new [PR107797]

2023-01-19 Thread Marek Polacek via Gcc-patches
We shouldn't emit "right operand of comma operator has no effect"
when that comma operator was created by the compiler for "new int{}".
convert_to_void/COMPOUND_EXPR already checks warning_suppressed_p so
we can just suppress -Wunused-value.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

PR c++/107797

gcc/cp/ChangeLog:

* cvt.cc (ocp_convert): copy_warning when creating a new
COMPOUND_EXPR.
* init.cc (build_new_1): Suppress -Wunused-value on
compiler-generated COMPOUND_EXPRs.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wunused-value-1.C: New test.
---
 gcc/cp/cvt.cc   |  6 --
 gcc/cp/init.cc  |  2 ++
 gcc/testsuite/g++.dg/warn/Wunused-value-1.C | 12 
 3 files changed, 18 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/warn/Wunused-value-1.C

diff --git a/gcc/cp/cvt.cc b/gcc/cp/cvt.cc
index 0cbfd8060cb..17827d06a4a 100644
--- a/gcc/cp/cvt.cc
+++ b/gcc/cp/cvt.cc
@@ -711,8 +711,10 @@ ocp_convert (tree type, tree expr, int convtype, int flags,
return error_mark_node;
   if (e == TREE_OPERAND (expr, 1))
return expr;
-  return build2_loc (EXPR_LOCATION (expr), COMPOUND_EXPR, TREE_TYPE (e),
-TREE_OPERAND (expr, 0), e);
+  e = build2_loc (EXPR_LOCATION (expr), COMPOUND_EXPR, TREE_TYPE (e),
+ TREE_OPERAND (expr, 0), e);
+  copy_warning (e, expr);
+  return e;
 }
 
   complete_type (type);
diff --git a/gcc/cp/init.cc b/gcc/cp/init.cc
index f816c474cef..52e96fbe590 100644
--- a/gcc/cp/init.cc
+++ b/gcc/cp/init.cc
@@ -3800,6 +3800,8 @@ build_new_1 (vec<tree, va_gc> **placement, tree type, tree nelts,
   if (cookie_expr)
 rval = build2 (COMPOUND_EXPR, TREE_TYPE (rval), cookie_expr, rval);
 
+  suppress_warning (rval, OPT_Wunused_value);
+
   if (rval == data_addr && TREE_CODE (alloc_expr) == TARGET_EXPR)
 /* If we don't have an initializer or a cookie, strip the TARGET_EXPR
and return the call (which doesn't need to be adjusted).  */
diff --git a/gcc/testsuite/g++.dg/warn/Wunused-value-1.C b/gcc/testsuite/g++.dg/warn/Wunused-value-1.C
new file mode 100644
index 000..2ba5587fce0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wunused-value-1.C
@@ -0,0 +1,12 @@
+// PR c++/107797
+// { dg-do compile { target c++11 } }
+// { dg-options "-Wunused" }
+
+void
+g ()
+{
+  (long) new int{};
+  long(new int{});
+  (long) new int();
+  long(new int());
+}

base-commit: 86caab6c5d1e26e1c54c3dceacc873d6e27bfc09
-- 
2.39.0



[PATCH v3] c++: -Wdangling-reference with reference wrapper [PR107532]

2023-01-19 Thread Marek Polacek via Gcc-patches
On Thu, Jan 19, 2023 at 01:02:02PM -0500, Jason Merrill wrote:
> On 1/18/23 20:13, Marek Polacek wrote:
> > On Wed, Jan 18, 2023 at 04:07:59PM -0500, Jason Merrill wrote:
> > > On 1/18/23 12:52, Marek Polacek wrote:
> > > > Here, -Wdangling-reference triggers where it probably shouldn't, causing
> > > > some grief.  The code in question uses a reference wrapper with a member
> > > > function returning a reference to a subobject of a non-temporary object:
> > > > 
> > > > const Plane & meta = fm.planes().inner();
> > > > 
> > > > I've tried a few approaches, e.g., checking that the member function's
> > > > return type is the same as the type of the enclosing class (which is
> > > > the case for member functions returning *this), but that then breaks
> > > > Wdangling-reference4.C with std::optional.
> > > > 
> > > > So I figured that perhaps we want to look at the object we're invoking
> > > > the member function(s) on and see if that is a temporary, as in, don't
> > > > warn about
> > > > 
> > > > const Plane & meta = fm.planes().inner();
> > > > 
> > > > but do warn about
> > > > 
> > > > const Plane & meta = FrameMetadata().planes().inner();
> > > > 
> > > > It's ugly, but better than asking users to add #pragmas into their code.
> > > 
> > > Hmm, that doesn't seem right; the former is only OK because Ref is in fact a
> > > reference-like type.  If planes() returned a class that held data, we would
> > > want to warn.
> > 
> > Sure, it's always some kind of tradeoff with warnings :/.
> > > In this case, we might recognize the reference-like class because it has a
> > > reference member and a constructor taking the same reference type.
> > 
> > That occurred to me too, but then I found out that std::reference_wrapper
> > actually uses T*, not T&, as you say.  But here's a patch to do that
> > (I hope).
> > > That wouldn't help with std::reference_wrapper or std::ref_view because they
> > > have pointer members instead of references, but perhaps loosening the check
> > > to include that case would make sense?
> > 
> > Sorry, I don't understand what you mean by loosening the check.  I could
> > hardcode std::reference_wrapper and std::ref_view but I don't think that's
> > what you meant.
> 
> Indeed that's not what I meant, but as I was saying in our meeting I think
> it's worth doing; the compiler has various tweaks to handle specific
> standard-library classes better.
 
Okay, done in the patch below.  Except that I'm not including a test for
std::ranges::ref_view because I don't really know how that works.

> > Surely I cannot _not_ warn for any class that contains a T*.
> 
> I was thinking if a constructor takes a T& and the class has a T* that would
> be close enough, though this also wouldn't handle the standard library
> classes so the benefit is questionable.
> 
> > Here's the patch so that we have some actual code to discuss...  Thanks.
> > 
> > -- >8 --
> > Here, -Wdangling-reference triggers where it probably shouldn't, causing
> > some grief.  The code in question uses a reference wrapper with a member
> > function returning a reference to a subobject of a non-temporary object:
> > 
> >const Plane & meta = fm.planes().inner();
> > 
> > I've tried a few approaches, e.g., checking that the member function's
> > return type is the same as the type of the enclosing class (which is
> > the case for member functions returning *this), but that then breaks
> > Wdangling-reference4.C with std::optional.
> > 
> > Perhaps we want to look at the member function's enclosing class
> > to see if it's a reference wrapper class (meaning, has a reference
> > member and a constructor taking the same reference type) and don't
> > warn if so, supposing that the member function returns a reference
> > to a non-temporary object.
> > 
> > It's ugly, but better than asking users to add #pragmas into their code.
> > 
> > PR c++/107532
> > 
> > gcc/cp/ChangeLog:
> > 
> > * call.cc (do_warn_dangling_reference): Don't warn when the
> > member function comes from a reference wrapper class.
> 
> Let's factor the new code out into e.g. reference_like_class_p

Done.  Thanks,

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
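
To make the pattern under discussion concrete, here is a small sketch of a reference-like class in the sense used in this thread: a reference member plus a constructor taking the same reference type. Plane, FrameMetadata, planes() and inner() follow the names quoted above; PlaneList, PlanesRef and first_plane_id are made up for the example, and this is not the testcase from the patch.

```cpp
struct Plane
{
  int id;
};

struct PlaneList
{
  Plane first;
};

// Reference-like class: it owns no data, only a reference to a PlaneList,
// and its constructor takes that same reference type.
class PlanesRef
{
public:
  explicit PlanesRef (const PlaneList &list) : list_ (list) {}

  // Returns a reference into the referenced PlaneList, not into *this.
  const Plane &inner () const { return list_.first; }

private:
  const PlaneList &list_;
};

struct FrameMetadata
{
  PlaneList list;

  // Returns a temporary wrapper that refers back into this object.
  PlanesRef planes () const { return PlanesRef (list); }
};

int
first_plane_id (const FrameMetadata &fm)
{
  // planes() yields a temporary PlanesRef, but the reference returned by
  // inner() points into fm, which outlives the temporary.
  const Plane &meta = fm.planes ().inner ();
  return meta.id;
}
```

If PlanesRef instead stored a PlaneList by value, the same expression would bind meta to a subobject of the temporary, which is the case the warning is meant to catch.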

-- >8 --
Here, -Wdangling-reference triggers where it probably shouldn't, causing
some grief.  The code in question uses a reference wrapper with a member
function returning a reference to a subobject of a non-temporary object:

  const Plane & meta = fm.planes().inner();

I've tried a few approaches, e.g., checking that the member function's
return type is the same as the type of the enclosing class (which is
the case for member functions returning *this), but that then breaks
Wdangling-reference4.C with std::optional.

Perhaps we want to look at the member function's enclosing class
to see if it's a reference wrapper class (meaning, has a reference
member and a constructor taking the same reference type, or is
std::reference_wrapper or 

Re: [PATCH] c++: Fix up handling of references to anon union members in initializers [PR53932]

2023-01-19 Thread Andrew Pinski via Gcc-patches
On Thu, Jan 19, 2023 at 12:13 PM Jakub Jelinek via Gcc-patches
 wrote:
>
> Hi!
>
> For anonymous union members we create artificial VAR_DECLs which
> have DECL_VALUE_EXPR for the actual COMPONENT_REF.  That works
> just fine inside of functions (including global dynamic constructors),
> because during gimplification such VAR_DECLs are gimplified as
> their DECL_VALUE_EXPR.  This is also done during regimplification.
>
> But references to these artificial vars in DECL_INITIAL expressions
> aren't ever replaced by the DECL_VALUE_EXPRs, so we end up either
> with link failures like on the testcase below, or worse ICEs with
> LTO.
>
> The following patch fixes those during cp_fully_fold_init where we
> already walk all the trees (!data->genericize means that
> function rather than cp_fold_function).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

I noticed (static) structured bindings has a similar issue but is not
fixed by this because this checks to see if it is an anonymous union
decl.
I filed PR 108474 for that.

Thanks,
Andrew Pinski

>
> 2023-01-19  Jakub Jelinek  
>
> PR c++/53932
> * cp-gimplify.cc (cp_fold_r): During cp_fully_fold_init replace
> DECL_ANON_UNION_VAR_P VAR_DECLs with their corresponding
> DECL_VALUE_EXPR.
>
> * g++.dg/init/pr53932.C: New test.
>
> --- gcc/cp/cp-gimplify.cc.jj 2023-01-16 11:52:16.065734330 +0100
> +++ gcc/cp/cp-gimplify.cc   2023-01-19 18:13:54.592661735 +0100
> @@ -1010,6 +1010,16 @@ cp_fold_r (tree *stmt_p, int *walk_subtr
> }
>break;
>
> +case VAR_DECL:
> +  /* In initializers replace anon union artificial VAR_DECLs
> +with their DECL_VALUE_EXPRs, as nothing will do it later.  */
> +  if (DECL_ANON_UNION_VAR_P (stmt) && !data->genericize)
> +   {
> + *stmt_p = stmt = unshare_expr (DECL_VALUE_EXPR (stmt));
> + break;
> +   }
> +  break;
> +
>  default:
>break;
>  }
> --- gcc/testsuite/g++.dg/init/pr53932.C.jj  2023-01-19 18:22:24.837231192 +0100
> +++ gcc/testsuite/g++.dg/init/pr53932.C 2023-01-19 18:20:51.776586408 +0100
> @@ -0,0 +1,25 @@
> +// PR c++/53932
> +// { dg-do link }
> +
> +static union { int i; };
> +int &r = i;
> +int s = i;
> +int *t = &i;
> +
> +void
> +foo (int **p, int *q)
> +{
> +  static int &u = i;
> +  static int v = i;
> +  static int *w = &i;
> +  int &x = i;
> +  int y = i;
> +  int *z = &i;
> +  *p = &i;
> +  *q = i;
> +}

> +
> +int
> +main ()
> +{
> +}
>
> Jakub
>


Re: [PATCH] value-relation: Fix up relation_union [PR108447]

2023-01-19 Thread Andrew MacLeod via Gcc-patches



On 1/19/23 15:16, Jakub Jelinek wrote:

Hi!

While looking at the PR, I've noticed one row in rr_union_table
is wrong.  relation_union should be commutative, but due to that
bug it is not.  The following patch adds a self-test for that
property (fails without the first hunk) and fixes that line.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

The actual floating point relation problem isn't fixed by this patch
though.

2023-01-19  Jakub Jelinek  

PR tree-optimization/108447
* value-relation.cc (rr_union_table): Fix VREL_UNDEFINED row order.
(relation_tests): Add self-tests for relation_{intersect,union}
commutativity.
* selftest.h (relation_tests): Declare.
* function-tests.cc (test_ranges): Call it.
<...>



+relation_tests ()
+{
+  // Verify commutativity of relation_intersect and relation_union.
+  for (relation_kind r1 = VREL_VARYING; r1 < VREL_PE8;
+   r1 = relation_kind (r1 + 1))
+for (relation_kind r2 = VREL_VARYING; r2 < VREL_PE8;
+r2 = relation_kind (r2 + 1))
+  {
+   ASSERT_EQ (relation_intersect (r1, r2), relation_intersect (r2, r1));
+   ASSERT_EQ (relation_union (r1, r2), relation_union (r2, r1));
+  }
+}


Easy test, I like it.
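
As a standalone illustration of why such a test catches a bad table row, here is a small sketch. It does not use GCC's relation_kind or rr_union_table; the enum rel, the bitmask encoding over the outcomes {<, ==, >}, and the helpers make_union_table and commutative_p are assumptions made for the example.

```cpp
#include <array>
#include <cstddef>

// Hypothetical miniature relation set, NOT GCC's relation_kind.
// Each relation is a bitmask over the possible outcomes <, ==, >.
enum rel : unsigned
{
  R_UNDEFINED = 0,
  R_LT = 1,
  R_EQ = 2,
  R_LE = 3,
  R_GT = 4,
  R_NE = 5,
  R_GE = 6,
  R_VARYING = 7
};

constexpr std::size_t NUM_REL = 8;
using rel_table = std::array<std::array<rel, NUM_REL>, NUM_REL>;

// Build a lookup table for union; with this encoding union is bitwise OR.
rel_table
make_union_table ()
{
  rel_table t{};
  for (unsigned a = 0; a < NUM_REL; ++a)
    for (unsigned b = 0; b < NUM_REL; ++b)
      t[a][b] = static_cast<rel> (a | b);
  return t;
}

// Check that the table gives the same answer for (a, b) and (b, a).
bool
commutative_p (const rel_table &t)
{
  for (std::size_t a = 0; a < NUM_REL; ++a)
    for (std::size_t b = 0; b < NUM_REL; ++b)
      if (t[a][b] != t[b][a])
        return false;
  return true;
}
```

A single wrong entry in one row breaks the symmetry of the table, so the exhaustive pairwise check flags it without needing to know which entry is the correct one.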





Re: [PATCH] tree-optimization/104475 - bogus -Wstringop-overflow

2023-01-19 Thread Jason Merrill via Gcc-patches

On 1/18/23 03:06, Richard Biener wrote:

On Tue, 17 Jan 2023, Jason Merrill wrote:


On 12/7/22 06:25, Richard Biener wrote:

The following avoids a bogus -Wstringop-overflow diagnostic by
properly recognizing that >m_mutex cannot be nullptr in C++
even if m_mutex is at offset zero.  The frontend already diagnoses
a >m_mutex != nullptr comparison and the following transfers
this knowledge to the middle-end which sees >m_mutex as
simple pointer arithmetic.  The new ADDR_NONZERO flag on an
ADDR_EXPR is used to carry this information and it's checked in
the tree_expr_nonzero_p API which causes this to be folded early.

To avoid the bogus diagnostic this avoids separating the nullptr
path via jump-threading by eliminating the nullptr check.

I'd appreciate C++ folks picking this up and putting the flag on
the appropriate ADDR_EXPRs - I've tried to avoid putting it on
all of them and didn't try hard to mimic what -Waddress warns
on (the code is big, maybe some refactoring would help but also
not sure what exactly the C++ standard constraints are here).


This is allowed by the standard, at least after CWG2535, but we need to check
-fsanitize=null before asserting that the address is non-null. With that
elaboration, a flag on the ADDR_EXPR may not be a convenient way to express
the property?


Adding a flag on the ADDR_EXPR was mostly out of caution for other
languages that do not have this guarantee (it seems C has a similar
guarantee at least) and for the middle-end (accidentally) producing
such expressions.  That is, I intended to set the flag on ADDR_EXPRs
written by the user as opposed to those created artificially.

I noticed the &* contraction rule and wondered how to conservatively
enforce that - I suppose we'd rely on the frontend to never actually
produce the ADDR_EXPR here.


Makes sense.


That said, we could re-define GENERIC/GIMPLE here to the extent
that ADDR_EXPR of a COMPONENT_REF (or all handled components?)


Not ARRAY_REF, I think; in C++ &p[0] (i.e. p+0) seems well-formed for
null p, though any other index is undefined.



is never nullptr when the target specifies nullptr is not a valid
object address.  We currently already assert there's a valid
object for &p->x if x lives at non-zero offset, so the case we
fail to handle is specifically _only_ the one where the component is
at offset zero.  Note &p->x != (void *)4 isn't currently optimized
when x is at offset 4 even though *p would be at address zero
and -Waddress also doesn't diagnose this case - we could
canonicalize this to p != (void *)0 but then we cannot
treat this as false anymore because of the address-taking of a component.
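
To make the offset-zero case concrete, here is a small sketch. The struct `X` with a second member `j` and the helper names are assumptions for illustration. It only shows why the address of a first member is numerically the same as the object pointer, which is why the middle-end cannot tell the two apart without extra information.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct X { int i; int j; };

// For a valid object, the address of the member at offset zero has the
// same numeric value as the address of the object itself.
bool first_member_shares_address(X *p) {
  return reinterpret_cast<std::uintptr_t>(&p->i)
         == reinterpret_cast<std::uintptr_t>(p);
}

// A member at non-zero offset: its address is the object address plus
// the offset, so it cannot be zero when p points to a real object.
std::size_t offset_of_j() { return offsetof(X, j); }
```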


Any thoughts about where the -fsanitize=null check goes?


Richard.


Bootstrapped and tested on x86_64-unknown-linux-gnu.

Thanks,
Richard.

PR tree-optimization/104475
gcc/
  * tree-core.h: Document use of nothrow_flag on ADDR_EXPR.
  * tree.h (ADDR_NONZERO): New.
  * fold-const.cc (tree_single_nonzero_warnv_p): Check
  ADDR_NONZERO.

gcc/cp/
  * typeck.cc (cp_build_addr_expr_1): Set ADDR_NONZERO
  on the built address if it is of a COMPONENT_REF.

* g++.dg/opt/pr104475.C: New testcase.
---
   gcc/cp/typeck.cc                    |  3 +++
   gcc/fold-const.cc                   |  4 +++-
   gcc/testsuite/g++.dg/opt/pr104475.C | 12 ++++++++++++
   gcc/tree-core.h                     |  3 +++
   gcc/tree.h                          |  4 ++++
   5 files changed, 25 insertions(+), 1 deletion(-)
   create mode 100644 gcc/testsuite/g++.dg/opt/pr104475.C

diff --git a/gcc/cp/typeck.cc b/gcc/cp/typeck.cc
index 7dfe5acc67e..3563750803e 100644
--- a/gcc/cp/typeck.cc
+++ b/gcc/cp/typeck.cc
@@ -7232,6 +7232,9 @@ cp_build_addr_expr_1 (tree arg, bool strict_lvalue,
tsubst_flags_t complain)
 gcc_assert (same_type_ignoring_top_level_qualifiers_p
  (TREE_TYPE (object), decl_type_context (field)));
 val = build_address (arg);
+  if (TREE_CODE (val) == ADDR_EXPR
+ && TREE_CODE (TREE_OPERAND (val, 0)) == COMPONENT_REF)
+   ADDR_NONZERO (val) = 1;
   }
   
 if (TYPE_PTR_P (argtype)

diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index e80be8049e1..cdfe3f50ae3 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -15308,8 +15308,10 @@ tree_single_nonzero_warnv_p (tree t, bool
*strict_overflow_p)
   
   case ADDR_EXPR:

 {
-   tree base = TREE_OPERAND (t, 0);
+   if (ADDR_NONZERO (t))
+ return true;
   +tree base = TREE_OPERAND (t, 0);
if (!DECL_P (base))
  base = get_base_address (base);
   diff --git a/gcc/testsuite/g++.dg/opt/pr104475.C
b/gcc/testsuite/g++.dg/opt/pr104475.C
new file mode 100644
index 000..013c70302c6
--- /dev/null
+++ b/gcc/testsuite/g++.dg/opt/pr104475.C
@@ -0,0 +1,12 @@
+// { dg-do compile }
+// { dg-require-effective-target c++11 }
+// { dg-options "-O -Waddress -fdump-tree-original" }
+
+struct X { int i; };
+
+bool foo (struct X *p)
+{
+  return &p->i != nullptr; /* { dg-warning "never be 

Re: [PATCH] c++: Fix up handling of references to anon union members in initializers [PR53932]

2023-01-19 Thread Jason Merrill via Gcc-patches

On 1/19/23 15:13, Jakub Jelinek wrote:

Hi!

For anonymous union members we create artificial VAR_DECLs which
have DECL_VALUE_EXPR for the actual COMPONENT_REF.  That works
just fine inside of functions (including global dynamic constructors),
because during gimplification such VAR_DECLs are gimplified as
their DECL_VALUE_EXPR.  This is also done during regimplification.

But references to these artificial vars in DECL_INITIAL expressions
aren't ever replaced by the DECL_VALUE_EXPRs, so we end up either
with link failures like on the testcase below, or worse ICEs with
LTO.

The following patch fixes those during cp_fully_fold_init where we
already walk all the trees (!data->genericize means that
function rather than cp_fold_function).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


OK.


2023-01-19  Jakub Jelinek  

PR c++/53932
* cp-gimplify.cc (cp_fold_r): During cp_fully_fold_init replace
DECL_ANON_UNION_VAR_P VAR_DECLs with their corresponding
DECL_VALUE_EXPR.

* g++.dg/init/pr53932.C: New test.

--- gcc/cp/cp-gimplify.cc.jj  2023-01-16 11:52:16.065734330 +0100
+++ gcc/cp/cp-gimplify.cc   2023-01-19 18:13:54.592661735 +0100
@@ -1010,6 +1010,16 @@ cp_fold_r (tree *stmt_p, int *walk_subtr
}
break;
  
+case VAR_DECL:

+  /* In initializers replace anon union artificial VAR_DECLs
+with their DECL_VALUE_EXPRs, as nothing will do it later.  */
+  if (DECL_ANON_UNION_VAR_P (stmt) && !data->genericize)
+   {
+ *stmt_p = stmt = unshare_expr (DECL_VALUE_EXPR (stmt));
+ break;
+   }
+  break;
+
  default:
break;
  }
--- gcc/testsuite/g++.dg/init/pr53932.C.jj  2023-01-19 18:22:24.837231192 +0100
+++ gcc/testsuite/g++.dg/init/pr53932.C 2023-01-19 18:20:51.776586408 +0100
@@ -0,0 +1,25 @@
+// PR c++/53932
+// { dg-do link }
+
+static union { int i; };
+int &r = i;
+int s = i;
+int *t = &i;
+
+void
+foo (int **p, int *q)
+{
+  static int &u = i;
+  static int v = i;
+  static int *w = &i;
+  int &x = i;
+  int y = i;
+  int *z = &i;
+  *p = &i;
+  *q = i;
+}
+
+int
+main ()
+{
+}

Jakub





Re: git out-of-order commit (was Re: [PATCH] Fortran: Remove unused declaration)

2023-01-19 Thread Harald Anlauf via Gcc-patches

On 19.01.23 at 20:39, Jason Merrill via Gcc-patches wrote:

On Sat, Nov 12, 2022 at 4:24 PM Harald Anlauf via Gcc-patches
 wrote:


On 12.11.22 at 22:05, Bernhard Reutner-Fischer via Gcc-patches wrote:

This function definition was removed years ago; remove its prototype.

gcc/fortran/ChangeLog:

   * gfortran.h (gfc_check_include): Remove declaration.
---
   gcc/fortran/gfortran.h | 1 -
   1 file changed, 1 deletion(-)
---
Regtests cleanly, ok for trunk?

diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index c4deec0d5b8..ce3ad61bb52 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -3208,7 +3208,6 @@ int gfc_at_eof (void);
   int gfc_at_bol (void);
   int gfc_at_eol (void);
   void gfc_advance_line (void);
-int gfc_check_include (void);
   int gfc_define_undef_line (void);

   int gfc_wide_is_printable (gfc_char_t);


OK, thanks.


Somehow this was applied with a CommitDate in 2021, breaking scripts
that assume monotonically increasing CommitDate.  Anyone know how that
could have happened?


It is quite unusual that the CommitDate is before the AuthorDate:

% git show --pretty=fuller 7ce0cee77adf33397d0ba61e7445effd8a5d8fcc |
head -5
commit 7ce0cee77adf33397d0ba61e7445effd8a5d8fcc
Author: Bernhard Reutner-Fischer 
AuthorDate: Sat Nov 6 06:51:00 2021 +0100
Commit: Bernhard Reutner-Fischer 
CommitDate: Sat Nov 6 06:48:00 2021 +0100

Could this have prevented checks from working properly?

Harald


Jason






[PATCH] value-relation: Fix up relation_union [PR108447]

2023-01-19 Thread Jakub Jelinek via Gcc-patches
Hi!

While looking at the PR, I've noticed one row in rr_union_table
is wrong.  relation_union should be commutative, but due to that
bug is not.  The following patch adds a self-test for that
property (fails without the first hunk) and fixes that line.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

The actual floating point relation problem isn't fixed by this patch
though.

2023-01-19  Jakub Jelinek  

PR tree-optimization/108447
* value-relation.cc (rr_union_table): Fix VREL_UNDEFINED row order.
(relation_tests): Add self-tests for relation_{intersect,union}
commutativity.
* selftest.h (relation_tests): Declare.
* function-tests.cc (test_ranges): Call it.

--- gcc/value-relation.cc.jj  2023-01-19 18:33:47.296304283 +0100
+++ gcc/value-relation.cc   2023-01-19 18:41:24.280658434 +0100
@@ -115,7 +115,7 @@ relation_kind rr_union_table[VREL_LAST][
   { VREL_VARYING, VREL_VARYING, VREL_VARYING, VREL_VARYING, VREL_VARYING,
 VREL_VARYING, VREL_VARYING, VREL_VARYING },
 // VREL_UNDEFINED
-  { VREL_VARYING, VREL_LT, VREL_LE, VREL_GT, VREL_GE, VREL_UNDEFINED,
+  { VREL_VARYING, VREL_UNDEFINED, VREL_LT, VREL_LE, VREL_GT, VREL_GE,
 VREL_EQ, VREL_NE },
 // VREL_LT
   { VREL_VARYING, VREL_LT, VREL_LT, VREL_LE, VREL_NE, VREL_VARYING, VREL_LE,
@@ -1718,3 +1718,26 @@ equiv_relation_iterator::get_name (relat
 }
   return NULL_TREE;
 }
+
+#if CHECKING_P
+#include "selftest.h"
+
+namespace selftest
+{
+void
+relation_tests ()
+{
+  // Verify commutativity of relation_intersect and relation_union.
+  for (relation_kind r1 = VREL_VARYING; r1 < VREL_PE8;
+   r1 = relation_kind (r1 + 1))
+for (relation_kind r2 = VREL_VARYING; r2 < VREL_PE8;
+r2 = relation_kind (r2 + 1))
+  {
+   ASSERT_EQ (relation_intersect (r1, r2), relation_intersect (r2, r1));
+   ASSERT_EQ (relation_union (r1, r2), relation_union (r2, r1));
+  }
+}
+
+} // namespace selftest
+
+#endif // CHECKING_P
--- gcc/selftest.h.jj   2023-01-02 09:32:34.083116265 +0100
+++ gcc/selftest.h  2023-01-19 18:38:58.267781163 +0100
@@ -244,6 +244,7 @@ extern void predict_cc_tests ();
 extern void pretty_print_cc_tests ();
 extern void range_tests ();
 extern void range_op_tests ();
+extern void relation_tests ();
 extern void gimple_range_tests ();
 extern void read_rtl_function_cc_tests ();
 extern void rtl_tests_cc_tests ();
--- gcc/function-tests.cc.jj  2023-01-02 09:32:51.941858230 +0100
+++ gcc/function-tests.cc   2023-01-19 18:39:13.942553180 +0100
@@ -583,6 +583,7 @@ test_ranges ()
   push_cfun (fun);
   range_tests ();
   range_op_tests ();
+  relation_tests ();
 
   build_cfg (fndecl);
   convert_to_ssa (fndecl);

Jakub



[PATCH] c++: Fix up handling of references to anon union members in initializers [PR53932]

2023-01-19 Thread Jakub Jelinek via Gcc-patches
Hi!

For anonymous union members we create artificial VAR_DECLs which
have DECL_VALUE_EXPR for the actual COMPONENT_REF.  That works
just fine inside of functions (including global dynamic constructors),
because during gimplification such VAR_DECLs are gimplified as
their DECL_VALUE_EXPR.  This is also done during regimplification.

But references to these artificial vars in DECL_INITIAL expressions
aren't ever replaced by the DECL_VALUE_EXPRs, so we end up either
with link failures like on the testcase below, or worse ICEs with
LTO.

The following patch fixes those during cp_fully_fold_init where we
already walk all the trees (!data->genericize means that
function rather than cp_fold_function).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-01-19  Jakub Jelinek  

PR c++/53932
* cp-gimplify.cc (cp_fold_r): During cp_fully_fold_init replace
DECL_ANON_UNION_VAR_P VAR_DECLs with their corresponding
DECL_VALUE_EXPR.

* g++.dg/init/pr53932.C: New test.

--- gcc/cp/cp-gimplify.cc.jj  2023-01-16 11:52:16.065734330 +0100
+++ gcc/cp/cp-gimplify.cc   2023-01-19 18:13:54.592661735 +0100
@@ -1010,6 +1010,16 @@ cp_fold_r (tree *stmt_p, int *walk_subtr
}
   break;
 
+case VAR_DECL:
+  /* In initializers replace anon union artificial VAR_DECLs
+with their DECL_VALUE_EXPRs, as nothing will do it later.  */
+  if (DECL_ANON_UNION_VAR_P (stmt) && !data->genericize)
+   {
+ *stmt_p = stmt = unshare_expr (DECL_VALUE_EXPR (stmt));
+ break;
+   }
+  break;
+
 default:
   break;
 }
--- gcc/testsuite/g++.dg/init/pr53932.C.jj  2023-01-19 18:22:24.837231192 +0100
+++ gcc/testsuite/g++.dg/init/pr53932.C 2023-01-19 18:20:51.776586408 +0100
@@ -0,0 +1,25 @@
+// PR c++/53932
+// { dg-do link }
+
+static union { int i; };
+int &r = i;
+int s = i;
+int *t = &i;
+
+void
+foo (int **p, int *q)
+{
+  static int &u = i;
+  static int v = i;
+  static int *w = &i;
+  int &x = i;
+  int y = i;
+  int *z = &i;
+  *p = &i;
+  *q = i;
+}
+
+int
+main ()
+{
+}

Jakub
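
The substitution the patch performs can be pictured with a toy expression tree. The sketch below is not GCC code: `Node`, `clone`, and `replace_aliases` are hypothetical stand-ins for a tree node, for the role `unshare_expr` plays, and for the new `VAR_DECL` case in the fold walk. It assumes each alias variable records the expression it stands for.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

struct Node;
using NodePtr = std::shared_ptr<Node>;

// Toy expression node (hypothetical, far simpler than a GCC tree).
struct Node {
  std::string name;           // leaf spelling or operator name
  NodePtr value_expr;         // non-null only for alias variables
  std::vector<NodePtr> ops;   // operands, empty for leaves
};

NodePtr leaf(const std::string &n) {
  return std::make_shared<Node>(Node{n, nullptr, {}});
}

// Deep copy, so each use of an alias gets its own private expression.
NodePtr clone(const NodePtr &n) {
  auto c = std::make_shared<Node>(Node{n->name, n->value_expr, {}});
  for (const auto &op : n->ops)
    c->ops.push_back(clone(op));
  return c;
}

// Walk the tree and replace every alias node with a copy of the
// expression it stands for.
void replace_aliases(NodePtr &n) {
  if (n->value_expr)
    n = clone(n->value_expr);
  for (auto &op : n->ops)
    replace_aliases(op);
}

std::string print(const NodePtr &n) {
  if (n->ops.empty())
    return n->name;
  std::string s = n->name + "(";
  for (std::size_t k = 0; k < n->ops.size(); ++k)
    s += (k ? ", " : "") + print(n->ops[k]);
  return s + ")";
}
```

After the walk, an initializer that referred to the alias refers to the member access instead, so nothing downstream ever needs a symbol for the artificial variable.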



[PATCH] niter: Fix up unused var warning [PR108457]

2023-01-19 Thread Jakub Jelinek via Gcc-patches
Hi!

tree-ssa-loop-niter.cc (build_cltz_expr) gets an unused variable 'mode'
warning on some architectures where the C[LT]Z_DEFINED_VALUE_AT_ZERO
macro(s) don't use the first argument.  That includes the
defaults.h definitions:
#define CLZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE)  0
#define CTZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE)  0
Other uses of these macros avoid the problem by not introducing temporaries
that are used only as arguments to those macros; the following patch
does it the same way for consistency.  Plus some formatting fixes
while at it.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-01-19  Jakub Jelinek  

PR tree-optimization/108457
* tree-ssa-loop-niter.cc (build_cltz_expr): Use
SCALAR_INT_TYPE_MODE (utype) directly as C[LT]Z_DEFINED_VALUE_AT_ZERO
argument instead of a temporary.  Formatting fixes.

--- gcc/tree-ssa-loop-niter.cc.jj   2023-01-16 11:52:05.806885510 +0100
+++ gcc/tree-ssa-loop-niter.cc  2023-01-19 13:10:42.872595970 +0100
@@ -2252,16 +2252,16 @@ build_cltz_expr (tree src, bool leading,
   call = build_call_expr_internal_loc (UNKNOWN_LOCATION, ifn,
   integer_type_node, 1, src);
   int val;
-  scalar_int_mode mode = SCALAR_INT_TYPE_MODE (utype);
   int optab_defined_at_zero
-   = leading ? CLZ_DEFINED_VALUE_AT_ZERO (mode, val)
- : CTZ_DEFINED_VALUE_AT_ZERO (mode, val);
+   = (leading
+  ? CLZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (utype), val)
+  : CTZ_DEFINED_VALUE_AT_ZERO (SCALAR_INT_TYPE_MODE (utype), val));
   if (define_at_zero && !(optab_defined_at_zero == 2 && val == prec))
{
  tree is_zero = fold_build2 (NE_EXPR, boolean_type_node, src,
  build_zero_cst (TREE_TYPE (src)));
- call = fold_build3(COND_EXPR, integer_type_node, is_zero, call,
-build_int_cst (integer_type_node, prec));
+ call = fold_build3 (COND_EXPR, integer_type_node, is_zero, call,
+ build_int_cst (integer_type_node, prec));
}
 }
   else if (prec == 2 * lli_prec)
@@ -2275,22 +2275,22 @@ build_cltz_expr (tree src, bool leading,
   /* We count the zeroes in src1, and add the number in src2 when src1
 is 0.  */
   if (!leading)
-   std::swap(src1, src2);
+   std::swap (src1, src2);
   tree call1 = build_call_expr (fn, 1, src1);
   tree call2 = build_call_expr (fn, 1, src2);
   if (define_at_zero)
{
  tree is_zero2 = fold_build2 (NE_EXPR, boolean_type_node, src2,
   build_zero_cst (TREE_TYPE (src2)));
- call2 = fold_build3(COND_EXPR, integer_type_node, is_zero2, call2,
- build_int_cst (integer_type_node, lli_prec));
+ call2 = fold_build3 (COND_EXPR, integer_type_node, is_zero2, call2,
+  build_int_cst (integer_type_node, lli_prec));
}
   tree is_zero1 = fold_build2 (NE_EXPR, boolean_type_node, src1,
   build_zero_cst (TREE_TYPE (src1)));
-  call = fold_build3(COND_EXPR, integer_type_node, is_zero1, call1,
-fold_build2 (PLUS_EXPR, integer_type_node, call2,
- build_int_cst (integer_type_node,
-lli_prec)));
+  call = fold_build3 (COND_EXPR, integer_type_node, is_zero1, call1,
+ fold_build2 (PLUS_EXPR, integer_type_node, call2,
+  build_int_cst (integer_type_node,
+ lli_prec)));
 }
   else
 {
@@ -2302,14 +2302,13 @@ build_cltz_expr (tree src, bool leading,
{
  tree is_zero = fold_build2 (NE_EXPR, boolean_type_node, src,
  build_zero_cst (TREE_TYPE (src)));
- call = fold_build3(COND_EXPR, integer_type_node, is_zero, call,
-build_int_cst (integer_type_node, prec));
+ call = fold_build3 (COND_EXPR, integer_type_node, is_zero, call,
+ build_int_cst (integer_type_node, prec));
}
 
   if (leading && prec < i_prec)
-   call = fold_build2(MINUS_EXPR, integer_type_node, call,
-  build_int_cst (integer_type_node,
- i_prec - prec));
+   call = fold_build2 (MINUS_EXPR, integer_type_node, call,
+   build_int_cst (integer_type_node, i_prec - prec));
 }
 
   return call;

Jakub
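
A reduced illustration of the pattern, outside GCC. The two macros and `mode_of` below are hypothetical stand-ins, not the real target macros. In a real build only one definition of a given macro exists; both are shown side by side here so the effect of each is visible.

```cpp
#include <cassert>

// Hypothetical stand-ins for a default definition that ignores both
// arguments and a target definition that uses them.
#define DEFINED_VALUE_AT_ZERO_DEFAULT(MODE, VALUE) 0
#define DEFINED_VALUE_AT_ZERO_TARGET(MODE, VALUE) ((VALUE) = 8 * (MODE), 2)

// Placeholder for a mode lookup; here the "mode" is just a byte count.
int mode_of(int bytes) { return bytes; }

int defined_at_zero(bool use_target, int bytes, int &val) {
  // The mode expression is passed directly.  If a macro drops its first
  // argument, no otherwise-unused temporary is left behind to warn about.
  return use_target ? DEFINED_VALUE_AT_ZERO_TARGET(mode_of(bytes), val)
                    : DEFINED_VALUE_AT_ZERO_DEFAULT(mode_of(bytes), val);
}
```

Had the code first stored `mode_of (bytes)` in a local and passed that local to the default macro, the local would never be read after preprocessing, which is the situation the warning reports.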



[committed] openmp: Fix up OpenMP expansion of non-rectangular loops [PR108459]

2023-01-19 Thread Jakub Jelinek via Gcc-patches
Hi!

expand_omp_for_init_counts was using fold_unary (NEGATE_EXPR, ...) for
the case where the collapse(2) inner loop has an init expression
dependent on a non-constant multiple of the outer iterator and the
condition upper bound expression doesn't depend on the outer iterator.
fold_unary just returns NULL if the expression can't be folded; we need
fold_build1 instead.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2023-01-19  Jakub Jelinek  

PR middle-end/108459
* omp-expand.cc (expand_omp_for_init_counts): Use fold_build1 rather
than fold_unary for NEGATE_EXPR.

* testsuite/libgomp.c/pr108459.c: New test.

--- gcc/omp-expand.cc.jj  2023-01-02 09:32:49.399894958 +0100
+++ gcc/omp-expand.cc   2023-01-19 12:01:05.103410564 +0100
@@ -2003,8 +2003,8 @@ expand_omp_for_init_counts (struct omp_f
t = fold_build2 (MINUS_EXPR, itype, unshare_expr (fd->loops[i].m2),
 unshare_expr (fd->loops[i].m1));
  else if (fd->loops[i].m1)
-   t = fold_unary (NEGATE_EXPR, itype,
-   unshare_expr (fd->loops[i].m1));
+   t = fold_build1 (NEGATE_EXPR, itype,
+unshare_expr (fd->loops[i].m1));
  else
t = unshare_expr (fd->loops[i].m2);
  tree m2minusm1
--- libgomp/testsuite/libgomp.c/pr108459.c.jj   2023-01-19 12:22:07.191038771 +0100
+++ libgomp/testsuite/libgomp.c/pr108459.c  2023-01-19 12:21:17.973755215 +0100
@@ -0,0 +1,41 @@
+/* PR middle-end/108459 */
+
+char a[17][17];
+
+__attribute__((noipa)) void
+foo (int x, int y)
+{
+  #pragma omp for collapse(2)
+  for (int i = 1; i <= 16; i++)
+for (int j = i * x + y; j <= 16; j++)
+  a[i][j] = 1;
+}
+
+int
+main ()
+{
+  #pragma omp parallel
+  foo (1, 1);
+  for (int i = 0; i <= 16; i++)
+for (int j = 0; j <= 16; j++)
+  if (i >= 1 && j >= i + 1)
+   {
+ if (a[i][j] != 1)
+   __builtin_abort ();
+ a[i][j] = 0;
+   }
+  else if (a[i][j])
+   __builtin_abort ();
+  #pragma omp parallel
+  foo (2, -2);
+  for (int i = 0; i <= 16; i++)
+for (int j = 0; j <= 16; j++)
+  if (i >= 1 && j >= 2 * i - 2)
+   {
+ if (a[i][j] != 1)
+   __builtin_abort ();
+   }
+  else if (a[i][j])
+   __builtin_abort ();
+  return 0;
+}

Jakub
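
The difference between a fold-only helper and a fold-or-build helper can be shown with a toy model. Everything below is an assumption for illustration: `Expr`, `try_fold_negate`, and `fold_or_build_negate` are hypothetical names that mimic the contract described above and are not GCC's fold_unary or fold_build1.

```cpp
#include <cassert>
#include <memory>

// Toy expression: either a constant, a variable, or a negation node.
struct Expr {
  bool is_const;                  // true for integer constants
  long value;                     // meaningful only when is_const
  std::shared_ptr<Expr> operand;  // non-null only for negation nodes
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr constant(long v) { return std::make_shared<Expr>(Expr{true, v, nullptr}); }
ExprPtr variable() { return std::make_shared<Expr>(Expr{false, 0, nullptr}); }

// Fold-only: returns null when the operand is not a constant.
ExprPtr try_fold_negate(const ExprPtr &op) {
  if (!op->is_const)
    return nullptr;
  return constant(-op->value);
}

// Fold-or-build: falls back to an explicit negation node, so the
// caller always gets a usable expression.
ExprPtr fold_or_build_negate(const ExprPtr &op) {
  if (ExprPtr folded = try_fold_negate(op))
    return folded;
  return std::make_shared<Expr>(Expr{false, 0, op});
}
```

A caller that uses the fold-only form without checking for null ends up with a missing expression whenever the operand is not constant, which matches the non-constant multiplier case in the new test.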



git out-of-order commit (was Re: [PATCH] Fortran: Remove unused declaration)

2023-01-19 Thread Jason Merrill via Gcc-patches
On Sat, Nov 12, 2022 at 4:24 PM Harald Anlauf via Gcc-patches
 wrote:
>
> On 12.11.22 at 22:05, Bernhard Reutner-Fischer via Gcc-patches wrote:
> > This function definition was removed years ago; remove its prototype.
> >
> > gcc/fortran/ChangeLog:
> >
> >   * gfortran.h (gfc_check_include): Remove declaration.
> > ---
> >   gcc/fortran/gfortran.h | 1 -
> >   1 file changed, 1 deletion(-)
> > ---
> > Regtests cleanly, ok for trunk?
> >
> > diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
> > index c4deec0d5b8..ce3ad61bb52 100644
> > --- a/gcc/fortran/gfortran.h
> > +++ b/gcc/fortran/gfortran.h
> > @@ -3208,7 +3208,6 @@ int gfc_at_eof (void);
> >   int gfc_at_bol (void);
> >   int gfc_at_eol (void);
> >   void gfc_advance_line (void);
> > -int gfc_check_include (void);
> >   int gfc_define_undef_line (void);
> >
> >   int gfc_wide_is_printable (gfc_char_t);
>
> OK, thanks.

Somehow this was applied with a CommitDate in 2021, breaking scripts
that assume monotonically increasing CommitDate.  Anyone know how that
could have happened?

Jason



[committed] analyzer: use dominator info in -Wanalyzer-deref-before-check [PR108455]

2023-01-19 Thread David Malcolm via Gcc-patches
My integration testing [1] of -fanalyzer in GCC 13 is showing a lot of
diagnostics from the new -Wanalyzer-deref-before-check warning on
real-world C projects, and most of these seem to be false positives.

This patch updates the warning to make it much less likely to fire:
- only intraprocedural cases are now reported
- reject cases in which there are control flow paths to the check
  that didn't come through the dereference, by looking at BB dominator
  information.  This fixes a false positive seen in git-2.39.0's
  pack-revindex.c: load_revindex_from_disk (PR analyzer/108455), in
  which a shared "cleanup:" section checks "data" for NULL, and
  depending on how much of the function is executed "data" might or
  might not have already been dereferenced.
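
A reduced sketch of the shared-cleanup pattern described above. This is a hypothetical function written for illustration, not the git source. The check under `cleanup:` is reached both by paths that dereferenced `data` and by paths that never did, so the dereference does not dominate the check and a warning would be a false positive.

```cpp
#include <cassert>
#include <cstdlib>

// Returns 0 on success and -1 on failure.
int load_first_byte(std::size_t size, unsigned char *out) {
  int ret = -1;
  unsigned char *data = nullptr;

  if (size == 0)
    goto cleanup;            // reaches the check with no dereference at all

  data = static_cast<unsigned char *>(std::malloc(size));
  if (!data)
    goto cleanup;            // also no dereference on this path

  data[0] = 42;              // dereference happens only on this path
  *out = data[0];
  ret = 0;

cleanup:
  if (data)                  // check is not dominated by the dereference
    std::free(data);
  return ret;
}
```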

The counts of -Wanalyzer-deref-before-check diagnostics in [1]
before/after this patch show this improvement:
  Known false positives:6 ->  0  (-6)
  Known true positives: 1 ->  1
  Unclassified positives: 123 -> 63 (-60)

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-5261-g0d6f7b1dd62e9c.

[1] https://github.com/davidmalcolm/gcc-analyzer-integration-tests

gcc/analyzer/ChangeLog:
PR analyzer/108455
* analyzer.h (class checker_event): New forward decl.
(class state_change_event): Indent.
(class warning_event): New forward decl.
* checker-event.cc (state_change_event::state_change_event): Add
"enode" param.
(warning_event::get_desc): Update for new param of
evdesc::final_event ctor.
* checker-event.h (state_change_event::state_change_event): Add
"enode" param.
(state_change_event::get_exploded_node): New accessor.
(state_change_event::m_enode): New field.
(warning_event::warning_event): New "enode" param.
(warning_event::get_exploded_node): New accessor.
(warning_event::m_enode): New field.
* diagnostic-manager.cc
(state_change_event_creator::on_global_state_change): Pass
src_node to state_change_event ctor.
(state_change_event_creator::on_state_change): Likewise.
(null_assignment_sm_context::set_next_state): Pass NULL for
new param of state_change_event ctor.
* infinite-recursion.cc
(infinite_recursion_diagnostic::add_final_event): Update for new
param of warning_event ctor.
* pending-diagnostic.cc (pending_diagnostic::add_final_event):
Pass enode to warning_event ctor.
* pending-diagnostic.h (evdesc::final_event): Add reference to
warning_event.
* sm-malloc.cc: Include "analyzer/checker-event.h" and
"analyzer/exploded-graph.h".
(deref_before_check::deref_before_check): Initialize new fields.
(deref_before_check::emit): Reject warnings in which we were
unable to determine the enodes of the dereference and the check.
Reject interprocedural warnings. Reject warnings in which
the dereference doesn't dominate the check.
(deref_before_check::describe_state_change): Set m_deref_enode.
(deref_before_check::describe_final_event): Set m_check_enode.
(deref_before_check::m_deref_enode): New field.
(deref_before_check::m_check_enode): New field.

gcc/testsuite/ChangeLog:
PR analyzer/108455
* gcc.dg/analyzer/deref-before-check-1.c: Add test coverage
involving dominance.
* gcc.dg/analyzer/deref-before-check-pr108455-1.c: New test.
* gcc.dg/analyzer/deref-before-check-pr108455-git-pack-revindex.c:
New test.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/analyzer.h   |   4 +-
 gcc/analyzer/checker-event.cc |   8 +-
 gcc/analyzer/checker-event.h  |  11 +-
 gcc/analyzer/diagnostic-manager.cc|  12 +-
 gcc/analyzer/infinite-recursion.cc|   3 +-
 gcc/analyzer/pending-diagnostic.cc|   1 +
 gcc/analyzer/pending-diagnostic.h |   6 +-
 gcc/analyzer/sm-malloc.cc |  35 -
 .../gcc.dg/analyzer/deref-before-check-1.c|  36 +
 .../analyzer/deref-before-check-pr108455-1.c  |  36 +
 ...-before-check-pr108455-git-pack-revindex.c | 133 ++
 11 files changed, 272 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/deref-before-check-pr108455-1.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/deref-before-check-pr108455-git-pack-revindex.c

diff --git a/gcc/analyzer/analyzer.h b/gcc/analyzer/analyzer.h
index bfd098b8613..8f79e4b5df5 100644
--- a/gcc/analyzer/analyzer.h
+++ b/gcc/analyzer/analyzer.h
@@ -93,7 +93,9 @@ class bounded_ranges_manager;
 class pending_diagnostic;
 class pending_note;
 struct event_loc_info;
-class state_change_event;
+class checker_event;
+  class state_change_event;
+  class warning_event;
 class checker_path;
 class extrinsic_state;
 class sm_state_map;
diff --git 

Re: [PATCH] c++: Fix up handling of non-dependent subscript with static operator[] [PR108437]

2023-01-19 Thread Jason Merrill via Gcc-patches

On 1/19/23 03:52, Jakub Jelinek wrote:

Hi!

As the following testcases show, when adding static operator[]
support I've missed that the 2 build_min_non_dep_op_overload functions
need to be adjusted.  The first one we only use for the single index
case, but as cp_tree_code_length (ARRAY_REF) is 2, we were running
into an assertion there which compared nargs and expected_nargs.



For ARRAY_REF, the operator[] is either a non-static member or newly
static member, never out of class and for the static member case
if user uses single index the operator[] needs to have a single
argument as well, but the function is called with 2 - the object
it is invoked on and the index.


This should be in a comment for the new special handling of ARRAY_REF.

OK with that change.


We need to evaluate side-effects
of the object and use just a single argument in the call - the index.
The other build_min_non_dep_op_overload overload has been added
solely for ARRAY_REF - CALL_EXPR is the other operator that accepts
variable number of operands but that one goes through different
routines.  There we asserted it is a METHOD_TYPE, so again
we shouldn't assert that but handle the case when it is not one
by making sure object's side-effects are evaluated if needed and
passing all the index arguments to the static operator[].

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-01-19  Jakub Jelinek  

PR c++/108437
* cp-tree.h (keep_unused_object_arg): Declare.
* call.cc (keep_unused_object_arg): No longer static.
* tree.cc (build_min_non_dep_op_overload): Handle ARRAY_REF
with overload being static member function.

* g++.dg/cpp23/subscript12.C: New test.
* g++.dg/cpp23/subscript13.C: New test.

--- gcc/cp/cp-tree.h.jj 2023-01-16 11:52:16.067734300 +0100
+++ gcc/cp/cp-tree.h  2023-01-18 16:47:55.258697839 +0100
@@ -6599,6 +6599,7 @@ inline tree build_new_op (const op_locat
return build_new_op (loc, code, flags, arg1, arg2, NULL_TREE, NULL_TREE,
   NULL, complain);
  }
+extern tree keep_unused_object_arg (tree, tree, tree);
  extern tree build_op_call (tree, vec **,
 tsubst_flags_t);
  extern tree build_op_subscript(const op_location_t &, 
tree,
--- gcc/cp/call.cc.jj   2023-01-16 11:52:16.059734418 +0100
+++ gcc/cp/call.cc  2023-01-18 16:48:02.966586328 +0100
@@ -5187,7 +5187,7 @@ build_operator_new_call (tree fnname, ve
 or static operator(), in which cases the source expression
 would be `obj[...]' or `obj(...)'.  */
  
-static tree

+tree
  keep_unused_object_arg (tree result, tree obj, tree fn)
  {
if (result == NULL_TREE
--- gcc/cp/tree.cc.jj   2023-01-16 11:52:16.093733917 +0100
+++ gcc/cp/tree.cc  2023-01-18 17:01:08.937242864 +0100
@@ -3693,14 +3693,14 @@ build_min_non_dep_op_overload (enum tree
  {
va_list p;
int nargs, expected_nargs;
-  tree fn, call;
+  tree fn, call, obj = NULL_TREE;
  
non_dep = extract_call_expr (non_dep);
  
nargs = call_expr_nargs (non_dep);
  
expected_nargs = cp_tree_code_length (op);

-  if (TREE_CODE (TREE_TYPE (overload)) == METHOD_TYPE)
+  if (TREE_CODE (TREE_TYPE (overload)) == METHOD_TYPE || op == ARRAY_REF)
  expected_nargs -= 1;
if ((op == POSTINCREMENT_EXPR
 || op == POSTDECREMENT_EXPR)
@@ -3715,6 +3715,8 @@ build_min_non_dep_op_overload (enum tree
if (TREE_CODE (TREE_TYPE (overload)) == FUNCTION_TYPE)
  {
fn = overload;
+  if (op == ARRAY_REF)
+   obj = va_arg (p, tree);
for (int i = 0; i < nargs; i++)
{
  tree arg = va_arg (p, tree);
@@ -3746,6 +3748,8 @@ build_min_non_dep_op_overload (enum tree
CALL_EXPR_ORDERED_ARGS (call_expr) = CALL_EXPR_ORDERED_ARGS (non_dep);
CALL_EXPR_REVERSE_ARGS (call_expr) = CALL_EXPR_REVERSE_ARGS (non_dep);
  
+  if (obj)

+return keep_unused_object_arg (call, obj, overload);
return call;
  }
  
@@ -3759,11 +3763,15 @@ build_min_non_dep_op_overload (tree non_

non_dep = extract_call_expr (non_dep);
  
unsigned int nargs = call_expr_nargs (non_dep);

-  gcc_assert (TREE_CODE (TREE_TYPE (overload)) == METHOD_TYPE);
-  tree binfo = TYPE_BINFO (TREE_TYPE (object));
-  tree method = build_baselink (binfo, binfo, overload, NULL_TREE);
-  tree fn = build_min (COMPONENT_REF, TREE_TYPE (overload),
-  object, method, NULL_TREE);
+  tree fn = overload;
+  if (TREE_CODE (TREE_TYPE (overload)) == METHOD_TYPE)
+{
+  tree binfo = TYPE_BINFO (TREE_TYPE (object));
+  tree method = build_baselink (binfo, binfo, overload, NULL_TREE);
+  fn = build_min (COMPONENT_REF, TREE_TYPE (overload),
+ object, method, NULL_TREE);
+  object = NULL_TREE;
+}
gcc_assert (vec_safe_length (args) == nargs);
  
tree call = build_min_non_dep_call_vec (non_dep, fn, args);

@@ -3774,6 +3782,8 @@ 

Re: [PATCH v2] c++: -Wdangling-reference with reference wrapper [PR107532]

2023-01-19 Thread Jason Merrill via Gcc-patches

On 1/18/23 20:13, Marek Polacek wrote:

On Wed, Jan 18, 2023 at 04:07:59PM -0500, Jason Merrill wrote:

On 1/18/23 12:52, Marek Polacek wrote:

Here, -Wdangling-reference triggers where it probably shouldn't, causing
some grief.  The code in question uses a reference wrapper with a member
function returning a reference to a subobject of a non-temporary object:

const Plane & meta = fm.planes().inner();

I've tried a few approaches, e.g., checking that the member function's
return type is the same as the type of the enclosing class (which is
the case for member functions returning *this), but that then breaks
Wdangling-reference4.C with std::optional.

So I figured that perhaps we want to look at the object we're invoking
the member function(s) on and see if that is a temporary, as in, don't
warn about

const Plane & meta = fm.planes().inner();

but do warn about

const Plane & meta = FrameMetadata().planes().inner();

It's ugly, but better than asking users to add #pragmas into their code.


Hmm, that doesn't seem right; the former is only OK because Ref is in fact a
reference-like type.  If planes() returned a class that held data, we would
want to warn.


Sure, it's always some kind of tradeoff with warnings :/.
  

In this case, we might recognize the reference-like class because it has a
reference member and a constructor taking the same reference type.


That occurred to me too, but then I found out that std::reference_wrapper
actually uses T*, not T&, as you say.  But here's a patch to do that
(I hope).
  

That wouldn't help with std::reference_wrapper or std::ref_view because they
have pointer members instead of references, but perhaps loosening the check
to include that case would make sense?


Sorry, I don't understand what you mean by loosening the check.  I could
hardcode std::reference_wrapper and std::ref_view but I don't think that's
what you meant.


Indeed that's not what I meant, but as I was saying in our meeting I 
think it's worth doing; the compiler has various tweaks to handle 
specific standard-library classes better.



Surely I cannot _not_ warn for any class that contains a T*.


I was thinking if a constructor takes a T& and the class has a T* that 
would be close enough, though this also wouldn't handle the standard 
library classes so the benefit is questionable.



Here's the patch so that we have some actual code to discuss...  Thanks.

-- >8 --
Here, -Wdangling-reference triggers where it probably shouldn't, causing
some grief.  The code in question uses a reference wrapper with a member
function returning a reference to a subobject of a non-temporary object:

   const Plane & meta = fm.planes().inner();

I've tried a few approaches, e.g., checking that the member function's
return type is the same as the type of the enclosing class (which is
the case for member functions returning *this), but that then breaks
Wdangling-reference4.C with std::optional.

Perhaps we want to look at the member function's enclosing class
to see if it's a reference wrapper class (meaning, has a reference
member and a constructor taking the same reference type) and don't
warn if so, supposing that the member function returns a reference
to a non-temporary object.

It's ugly, but better than asking users to add #pragmas into their code.

PR c++/107532

gcc/cp/ChangeLog:

* call.cc (do_warn_dangling_reference): Don't warn when the
member function comes from a reference wrapper class.


Let's factor the new code out into e.g. reference_like_class_p


gcc/testsuite/ChangeLog:

* g++.dg/warn/Wdangling-reference8.C: New test.
---
  gcc/cp/call.cc| 32 
  .../g++.dg/warn/Wdangling-reference8.C| 77 +++
  2 files changed, 109 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/warn/Wdangling-reference8.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index 0780b5840a3..b0670a21240 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -13832,6 +13832,38 @@ do_warn_dangling_reference (tree expr)
if (!(TYPE_REF_OBJ_P (rettype) || std_pair_ref_ref_p (rettype)))
  return NULL_TREE;
  
+	/* An attempt to reduce the number of -Wdangling-reference

+  false positives concerning reference wrappers (c++/107532).
+  If the enclosing class is a reference-like class, that is, has
+  a reference member and a constructor taking the same reference type,
+  we suppose that the member function is returning a reference
+  to a non-temporary object.  */
+   if (DECL_NONSTATIC_MEMBER_FUNCTION_P (fndecl)
+   && !DECL_OVERLOADED_OPERATOR_P (fndecl))
+ {
+   tree ctx = CP_DECL_CONTEXT (fndecl);
+   for (tree fields = TYPE_FIELDS (ctx);
+fields;
+fields = DECL_CHAIN (fields))
+ {
+   if (TREE_CODE (fields) != FIELD_DECL || DECL_ARTIFICIAL (fields))
+ continue;

Re: [PATCH] IPA: do not release body if still needed

2023-01-19 Thread Martin Jambor
Hi,

On Wed, Jan 18 2023, Jan Hubicka wrote:
>> The code removing function bodies when the last call graph clone of a
>> node is removed is too aggressive when there are nodes up the
>> clone_of chain which still need them.  Fixed by expanding the check.
>> 
>> gcc/ChangeLog:
>> 
>> 2023-01-18  Martin Jambor  
>> 
>>  PR ipa/107944
>>  * cgraph.cc (cgraph_node::remove): Check whether nodes up the
>>  clone_of chain also do not need the body.
>> ---
>>  gcc/cgraph.cc | 14 --
>>  1 file changed, 12 insertions(+), 2 deletions(-)
>> 
>> diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc
>> index 5e60c2b73db..5f72ace9b57 100644
>> --- a/gcc/cgraph.cc
>> +++ b/gcc/cgraph.cc
>> @@ -1893,8 +1893,18 @@ cgraph_node::remove (void)
>>else if (clone_of)
>>  {
>>clone_of->clones = next_sibling_clone;
>> -  if (!clone_of->analyzed && !clone_of->clones && !clones)
>> -clone_of->release_body ();
>> +  if (!clones)
>> +{
>> +  bool need_body = false;
>> +  for (cgraph_node *n = clone_of; n; n = n->clone_of)
>> +if (n->analyzed || n->clones)
>> +  {
>> +need_body = true;
> If you want to walk immediate clones and see if any of them is needed, I
> wonder why you don't also walk recursively clones of clones?

The intent is to avoid PR 100413.  When a node is being removed, we need
to figure out if it is the last one needing the body.  If a (possibly
indirect) clone_of has a clone, they are still to be materialized and so
the body is necessary.  If those other clones are all also going to be
removed as unreachable rather than materialized, then the last one will
release the body.

>
> Original idea was that the clones should be materialized and removed one
> by one (or proved unreachable and just removed) and once we remove last
> one, we should figure out that body is not needed. For that one does not
> need the walk at all.

So if you have clones of F like this

    F
   / \
  A   B
     / \
    C   D
       / \
      M   R

And then A and C are removed as unreachable or materialized, M is
materialized, and afterwards R is removed as unreachable then the
removal of R also has to trigger releasing the body.  In order not to
trigger the bug we are fixing, it needs to check that neither of D, B or
F need the body themselves or have any clones which need it.  Thus the
walk.
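
A toy model of that walk, as a sketch only. ToyNode is a hypothetical stand-in and not GCC's cgraph_node; in particular remaining_clones counts only clones that are still waiting to be materialized or removed, which simplifies the real clones list.

```cpp
#include <cassert>

// Toy stand-in for a call graph node (hypothetical, not cgraph_node).
struct ToyNode {
  ToyNode *clone_of = nullptr;  // node this one was cloned from
  int remaining_clones = 0;     // clones still pending (simplification)
  bool analyzed = false;        // node needs the body for its own sake
};

// Called when a node without clones of its own is removed: walk up the
// clone_of chain, starting at the node it was cloned from, and report
// whether anything on the way still needs the shared body.
bool body_still_needed(const ToyNode *origin) {
  for (const ToyNode *n = origin; n; n = n->clone_of)
    if (n->analyzed || n->remaining_clones > 0)
      return true;
  return false;
}
```

With the chain F <- B <- D from the picture, removing R asks body_still_needed (&D), and the body may only be released if that returns false.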

Now, the method has an alternative point where it releases the body a few
lines below, and having two looks a bit clumsy.  But it is not entirely
straightforward how to combine the conditions guarding the two places.

>
> How exactly we end up with clones that are not analyzed?

I hope I am not misremembering but analyzed gets cleared when a node is
there just to hold body for its clones and is no longer necessary for
any other reason, no?

Martin


>
> Honza
>> +break;
>> +  }
>> +  if (!need_body)
>> +clone_of->release_body ();
>> +}
>>  }
>>if (next_sibling_clone)
>>  next_sibling_clone->prev_sibling_clone = prev_sibling_clone;
>> -- 
>> 2.39.0
>> 


[PATCH] PR target/107299: Fix build issue when long double is IEEE 128-bit

2023-01-19 Thread Michael Meissner via Gcc-patches
This patch updates the IEEE 128-bit types used in libgcc.

At the moment, we cannot build GCC when the target uses IEEE 128-bit long
doubles, such as building the compiler for a native Fedora 36 system.  The
build dies when it is trying to build the _mulkc3.c and _divkc3.c modules.

This patch changes libgcc to use long double for the IEEE 128-bit base type if
long double is IEEE 128-bit, and it uses _Float128 otherwise.  The built-in
functions are adjusted to be the correct version based on the IEEE 128-bit base
type used.
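
As a rough illustration of the distinction the patch keys on, here is a sketch that classifies long double by its significand width and mirrors the preprocessor test. The macro __LONG_DOUBLE_IEEE128__ is the one used in the patch; classify_long_double, macro_says_ieee128 and copysign_ld are hypothetical names, and the builtin is assumed to be available (GCC or Clang). It does not reproduce the libgcc headers.

```cpp
#include <cassert>
#include <limits>

enum class LongDoubleFormat { Ieee128, IbmDoubleDouble, Other };

// IEEE binary128 has a 113-bit significand; IBM double-double reports 106.
constexpr LongDoubleFormat classify_long_double() {
  return std::numeric_limits<long double>::digits == 113
             ? LongDoubleFormat::Ieee128
         : std::numeric_limits<long double>::digits == 106
             ? LongDoubleFormat::IbmDoubleDouble
             : LongDoubleFormat::Other;
}

// Same test the patch uses to choose between the 'l' and 'f128' built-ins.
#ifdef __LONG_DOUBLE_IEEE128__
constexpr bool macro_says_ieee128 = true;
#else
constexpr bool macro_says_ieee128 = false;
#endif

// The long double flavour of the built-in, as selected in the #else branch
// of the patch when long double is IEEE 128-bit.
inline long double copysign_ld(long double x, long double y) {
  return __builtin_copysignl(x, y);
}
```

On targets other than PowerPC the macro is simply not defined, so only the digits-based classification says anything there.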

While it is desirable to ultimately have __float128 and _Float128 use the same
internal type and mode within GCC, at present if you use the option
-mabi=ieeelongdouble, the __float128 type will use the long double type and not
the _Float128 type.  We get an internal compiler error if we combine the
signbitf128 built-in with a long double type.

I've gone through several iterations of trying to fix this within GCC, and
there are various problems that have come up.  I developed this alternative
patch that changes libgcc so that it does not tickle the issue.  I hope we can
fix the compiler at some point, but right now, this is preventing people on
Fedora 36 systems from building compilers where the default long double is IEEE
128-bit.

I have built a GCC compiler tool chain on the following platforms and there
were no regressions caused by these patches.

*   Power10 little endian, IBM long double, --with-cpu=power10

*   Power9 little endian, IBM long double, --with-cpu=power9

*   Power8 big endian, IBM long double, --with-cpu=power8, both
32-bit/64-bit tests.

In addition, I have built a GCC compiler tool chain on the following systems
with IEEE 128-bit long double as the default.  Comparing the test suite runs to
the runs for the toolchain with IBM long double as the default, I only get the
expected differences (C++ modules tests fail on IEEE long double, 3 Fortran
tests pass on IEEE long double that fail on IBM long double, C test pr105334.c
fails, and C test fp128_conversions.c fails on power10):

*   Power10 little endian, IEEE long double, --with-cpu=power10

*   Power9 little endian, IEEE long double, --with-cpu=power9

Can I check this change into the master branch?

2023-01-19   Michael Meissner  

PR target/107299
* config/rs6000/_divkc3.c (COPYSIGN): Use the correct built-in based on
whether long double is IBM or IEEE.
(INFINITY): Likewise.
(FABS): Likewise.
* config/rs6000/_mulkc3.c (COPYSIGN): Likewise.
(INFINITY): Likewise.
* config/rs6000/quad-float128.h (TF): Remove definition.
(TFtype): Define to be long double or _Float128.
(TCtype): Define to be _Complex long double or _Complex _Float128.
* libgcc2.h (TFtype): Allow machine config files to override this.
(TCtype): Likewise.
* soft-fp/quad.h (TFtype): Likewise.
---
 libgcc/config/rs6000/_divkc3.c   |  8 
 libgcc/config/rs6000/_mulkc3.c   |  7 +++
 libgcc/config/rs6000/quad-float128.h | 19 ++-
 libgcc/libgcc2.h |  4 
 libgcc/soft-fp/quad.h|  2 ++
 5 files changed, 27 insertions(+), 13 deletions(-)

diff --git a/libgcc/config/rs6000/_divkc3.c b/libgcc/config/rs6000/_divkc3.c
index 59ab2137d1d..8eeb0f76ba4 100644
--- a/libgcc/config/rs6000/_divkc3.c
+++ b/libgcc/config/rs6000/_divkc3.c
@@ -26,9 +26,17 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #include "soft-fp.h"
 #include "quad-float128.h"
 
+#ifndef __LONG_DOUBLE_IEEE128__
 #define COPYSIGN(x,y) __builtin_copysignf128 (x, y)
 #define INFINITY __builtin_inff128 ()
 #define FABS __builtin_fabsf128
+
+#else
+#define COPYSIGN(x,y) __builtin_copysignl (x, y)
+#define INFINITY __builtin_infl ()
+#define FABS __builtin_fabsl
+#endif
+
 #define isnan __builtin_isnan
 #define isinf __builtin_isinf
 #define isfinite __builtin_isfinite
diff --git a/libgcc/config/rs6000/_mulkc3.c b/libgcc/config/rs6000/_mulkc3.c
index cfae81f8b5f..290dc89bbc1 100644
--- a/libgcc/config/rs6000/_mulkc3.c
+++ b/libgcc/config/rs6000/_mulkc3.c
@@ -26,8 +26,15 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #include "soft-fp.h"
 #include "quad-float128.h"
 
+#ifndef __LONG_DOUBLE_IEEE128__
 #define COPYSIGN(x,y) __builtin_copysignf128 (x, y)
 #define INFINITY __builtin_inff128 ()
+
+#else
+#define COPYSIGN(x,y) __builtin_copysignl (x, y)
+#define INFINITY __builtin_infl ()
+#endif
+
 #define isnan __builtin_isnan
 #define isinf __builtin_isinf
 
diff --git a/libgcc/config/rs6000/quad-float128.h b/libgcc/config/rs6000/quad-float128.h
index ae0622c744c..8332184348a 100644
--- a/libgcc/config/rs6000/quad-float128.h
+++ b/libgcc/config/rs6000/quad-float128.h
@@ -27,21 +27,14 @@
License along with the GNU C Library; if not, see
.  */
 
-/* quad.h defines the TFtype type by:
-   typedef float TFtype 

Re: [PATCH] modula-2, testsuite: Make libs and interfaces consistent.

2023-01-19 Thread Gaius Mulley via Gcc-patches
Iain Sandoe  writes:

> Tested on x86_64-linux-gnu (with a 32b multilib), powerpc, i686 and
> x86_64-darwin.  OK for trunk?
> thanks,
> Iain

LGTM, thank you

regards,
Gaius


[committed] wwwdocs: gcc-4.9: Adjust www.open-std.org links to https

2023-01-19 Thread Gerald Pfeifer
Pushed.

Gerald
---
 htdocs/gcc-4.9/changes.html | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/htdocs/gcc-4.9/changes.html b/htdocs/gcc-4.9/changes.html
index 274bd814..9090c0ea 100644
--- a/htdocs/gcc-4.9/changes.html
+++ b/htdocs/gcc-4.9/changes.html
@@ -222,7 +222,7 @@
   
 The G++ implementation of C++1y 
return type deduction for normal
 functions has been updated to conform to
-http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3638.html;>N3638,
 
+https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3638.html;>N3638,
 
 the proposal accepted into the working paper.  Most notably, it adds
 decltype(auto) for getting decltype semantics
 rather than the template argument deduction semantics of plain
@@ -312,7 +312,7 @@ auto add = [] wwwdocs: typename T (T a, T b) { 
return a + b; };
   
 G++ supports unconstrained generic functions as specified
 by 4.1.2 and 5.1.1 of
-http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n3889.pdf;>
+https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n3889.pdf;>
 N3889: Concepts Lite Specification.  Briefly,
 auto may be used as a type-specifier in a parameter
 declaration of any function declarator in order to introduce an
-- 
2.39.0


[committed] style: Tweak a comment

2023-01-19 Thread Gerald Pfeifer
Pushed.

Gerald
---
 htdocs/style.mhtml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/htdocs/style.mhtml b/htdocs/style.mhtml
index 8afaa1e1..1b778151 100644
--- a/htdocs/style.mhtml
+++ b/htdocs/style.mhtml
@@ -9,7 +9,7 @@
   
 >
 
-;;; Redefine the  tag so that we can add default  headers.
+;;; Redefine the  tag so that we can easily add some headers.
 
 
  
-- 
2.39.0


[Patch] OpenMP/Fortran: Partially fix non-rect loop nests [PR107424]

2023-01-19 Thread Tobias Burnus

This is all about non-rectangular loop nests in OpenMP.

The attached patch depends on the obvious fix for https://gcc.gnu.org/PR108459,
which is included, together with a nice testcase, in Jakub's WIP patch attached
to the PR; without it, gfortran.dg/gomp/canonical-loop-1.f90 fails with an ICE
(segfault).

My patch fixes part of the Fortran issues found. Namely, it ensures that a
"regular" non-rectangular loop nest actually works by passing the
outer-loop-var, the multiplier and offset in a TREE_VEC to the middle end.
It additionally avoids pointlessly creating a temporary variable for a
VAR_DECL (main advantage: dump looks cleaner and avoids some dependency
analysis) - and likewise for 'step' given that 'step' was evaluated before.
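
To make the (var, multiplier, offset) triple concrete, here is a small sketch in C++ rather than Fortran. It assumes the inner lower bound has the form a1 * i + a2, where i is the outer loop variable; count_nonrect_iterations is a hypothetical helper and says nothing about how the middle end represents the nest.

```cpp
#include <cassert>

// Counts the iterations of the non-rectangular nest
//   for (i = 0; i < n; ++i)
//     for (j = a1 * i + a2; j < n; ++j)
// where the inner lower bound depends on the outer variable i through
// the multiplier a1 and the offset a2.
long count_nonrect_iterations(int n, int a1, int a2) {
  long count = 0;
  for (int i = 0; i < n; ++i) {
    const int lower = a1 * i + a2;  // bound built from [i, a1, a2]
    for (int j = lower; j < n; ++j)
      ++count;
  }
  return count;
}
```

With a1 = 0 the nest is rectangular again; a1 = 1, a2 = 0 gives the usual triangular iteration space.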

There is an additional issue - not quite addressed in this patch: There are
cases when a loop variable is replaced by another variable ('count') and then
at the beginning of the loop body, the original variable gets the value from
the count variable. Obviously, this no longer works with non-rectangular loop
nests.
The 'count' appears in two cases: (a) when the iteration step is not 1 or -1
and (b) if the iteration variable is a pointer (scalar with allocatable,
pointer, optional argument or just a dummy argument; oddly, even if it has the
value attribute).
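
The 'count' rewrite can be pictured roughly as follows. This is only a sketch of the general idea, assuming the usual Fortran trip-count formula; iterate_via_count is a hypothetical name and the code is not a transcription of what gfortran emits.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Models 'do i = lb, ub, step' after normalisation: the loop actually
// runs over 'count', and 'i' is recomputed at the top of the body.
// 'step' must be nonzero.
std::vector<int> iterate_via_count(int lb, int ub, int step) {
  const int trips = std::max(0, (ub - lb + step) / step);
  std::vector<int> values;
  for (int count = 0; count < trips; ++count) {
    const int i = lb + count * step;  // original variable, set from count
    values.push_back(i);
  }
  return values;
}
```

An inner bound written in terms of i then refers to a variable that is no longer the real iteration variable of the outer loop, which is why the non-rectangular form cannot be passed on as is.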

There is pending work to be done in this case, as mentioned in comments 6 and 8
of the PR. This patch adds some 'sorry' messages for them. I hope and think
that I have not missed a case where 'count' is used which I did not catch, but
I should have all or at least most.

OK for mainline, once the other patch has been committed?

Tobias

PS: I still need to verify that everything is fine, once the other patch has
been committed. A flaky mainboard on the laptop causes multiple random freezes
per day, which makes testing + patch writing a bit harder. (At least the
mainboard replacement is scheduled for tomorrow :-) )
-
OpenMP/Fortran: Partially fix non-rect loop nests [PR107424]

This patch ensures that loop bounds depending on outer loop vars use the
proper TREE_VEC format. It additionally gives a sorry if such an outer
var has a non-one/non-minus-one increment as currently a count variable
is used in this case (see PR).

gcc/fortran/ChangeLog:

	PR fortran/107424
	* trans-openmp.cc (gfc_nonrect_loop_expr): New.
	(gfc_trans_omp_do): Call it for start/end loop bound
	for non-rectangular loop nests.

gcc/testsuite/

	PR fortran/107424
	* gfortran.dg/gomp/non-rectangular-loop-3.f90: New test.

libgomp/ChangeLog:

	PR fortran/107424
	* testsuite/libgomp.fortran/non-rectangular-loop-1.f90: New test.
	* testsuite/libgomp.fortran/non-rectangular-loop-1a.f90: New test.
	* testsuite/libgomp.fortran/non-rectangular-loop-2.f90: New test.

 gcc/fortran/trans-openmp.cc| 167 +-
 .../gfortran.dg/gomp/non-rectangular-loop-3.f90|  85 +++
 .../libgomp.fortran/non-rectangular-loop-1.f90 | 637 +
 .../libgomp.fortran/non-rectangular-loop-1a.f90| 374 
 .../libgomp.fortran/non-rectangular-loop-2.f90 | 243 
 5 files changed, 1495 insertions(+), 11 deletions(-)

diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc
index 87213de0918..73376894316 100644
--- a/gcc/fortran/trans-openmp.cc
+++ b/gcc/fortran/trans-openmp.cc
@@ -5120,6 +5120,136 @@ typedef struct dovar_init_d {
   tree init;
 } dovar_init;
 
+static bool
+gfc_nonrect_loop_expr (stmtblock_t *pblock, gfc_se *sep, int loop_n,
+		   gfc_code *code, gfc_expr *expr, vec<dovar_init> *inits)
+{
+  int i;
+  for (i = 0; i < loop_n; i++)
+{
+  gcc_assert (code->ext.iterator->var->expr_type == EXPR_VARIABLE);
+  if (gfc_find_sym_in_expr (code->ext.iterator->var->symtree->n.sym, expr))
+	break;
+  code = code->block->next;
+}
+  if (i >= loop_n)
+return false;
+
+  /* Canonic format: TREE_VEC with [var, multiplier, offset].  */
+  gfc_symbol *var = code->ext.iterator->var->symtree->n.sym;
+
+  gfc_se se;
+  tree tree_var, a1, a2;
+  a1 = integer_one_node;
+  a2 = integer_zero_node;
+
+  gfc_init_se (&se, NULL);
+  gfc_conv_expr_lhs (&se, code->ext.iterator->var);
+  gfc_add_block_to_block (pblock, &se.pre);
+  tree_var = se.expr;
+
+  {
+/* FIXME: Handle non-unity iterations, cf. PR fortran/107424.
+   The issue is that for those a 'count' variable is used.  */
+dovar_init *di;
+unsigned ix;
+tree t = tree_var;
+while (TREE_CODE (t) == INDIRECT_REF)
+  t = TREE_OPERAND (t, 0);
+FOR_EACH_VEC_ELT (*inits, ix, di)
+  {
+	tree t2 = di->var;
+	while (TREE_CODE (t2) == INDIRECT_REF)
+	  t2 = TREE_OPERAND (t2, 0);
+	if (t == t2)
+	  {
+	HOST_WIDE_INT intval;
+	if (gfc_extract_hwi 

[PATCH] RISC-V: Add vse.v C API intrinsics testcases

2023-01-19 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/vse-1.c: New test.
* gcc.target/riscv/rvv/base/vse-2.c: New test.
* gcc.target/riscv/rvv/base/vse-3.c: New test.
* gcc.target/riscv/rvv/base/vse_m-1.c: New test.
* gcc.target/riscv/rvv/base/vse_m-2.c: New test.
* gcc.target/riscv/rvv/base/vse_m-3.c: New test.

---
 .../gcc.target/riscv/rvv/base/vse-1.c | 345 ++
 .../gcc.target/riscv/rvv/base/vse-2.c | 345 ++
 .../gcc.target/riscv/rvv/base/vse-3.c | 345 ++
 .../gcc.target/riscv/rvv/base/vse_m-1.c   | 345 ++
 .../gcc.target/riscv/rvv/base/vse_m-2.c   | 345 ++
 .../gcc.target/riscv/rvv/base/vse_m-3.c   | 345 ++
 6 files changed, 2070 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/vse-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/vse-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/vse-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/vse_m-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/vse_m-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/vse_m-3.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/vse-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/vse-1.c
new file mode 100644
index 000..c08e1e1265a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/vse-1.c
@@ -0,0 +1,345 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3 -fno-schedule-insns -fno-schedule-insns2" } */
+
+#include "riscv_vector.h"
+
+void
+test___riscv_vse8_v_i8mf8(int8_t* base,vint8mf8_t value,size_t vl)
+{
+  __riscv_vse8_v_i8mf8(base,value,vl);
+}
+
+void
+test___riscv_vse8_v_i8mf4(int8_t* base,vint8mf4_t value,size_t vl)
+{
+  __riscv_vse8_v_i8mf4(base,value,vl);
+}
+
+void
+test___riscv_vse8_v_i8mf2(int8_t* base,vint8mf2_t value,size_t vl)
+{
+  __riscv_vse8_v_i8mf2(base,value,vl);
+}
+
+void
+test___riscv_vse8_v_i8m1(int8_t* base,vint8m1_t value,size_t vl)
+{
+  __riscv_vse8_v_i8m1(base,value,vl);
+}
+
+void
+test___riscv_vse8_v_i8m2(int8_t* base,vint8m2_t value,size_t vl)
+{
+  __riscv_vse8_v_i8m2(base,value,vl);
+}
+
+void
+test___riscv_vse8_v_i8m4(int8_t* base,vint8m4_t value,size_t vl)
+{
+  __riscv_vse8_v_i8m4(base,value,vl);
+}
+
+void
+test___riscv_vse8_v_i8m8(int8_t* base,vint8m8_t value,size_t vl)
+{
+  __riscv_vse8_v_i8m8(base,value,vl);
+}
+
+void
+test___riscv_vse8_v_u8mf8(uint8_t* base,vuint8mf8_t value,size_t vl)
+{
+  __riscv_vse8_v_u8mf8(base,value,vl);
+}
+
+void
+test___riscv_vse8_v_u8mf4(uint8_t* base,vuint8mf4_t value,size_t vl)
+{
+  __riscv_vse8_v_u8mf4(base,value,vl);
+}
+
+void
+test___riscv_vse8_v_u8mf2(uint8_t* base,vuint8mf2_t value,size_t vl)
+{
+  __riscv_vse8_v_u8mf2(base,value,vl);
+}
+
+void
+test___riscv_vse8_v_u8m1(uint8_t* base,vuint8m1_t value,size_t vl)
+{
+  __riscv_vse8_v_u8m1(base,value,vl);
+}
+
+void
+test___riscv_vse8_v_u8m2(uint8_t* base,vuint8m2_t value,size_t vl)
+{
+  __riscv_vse8_v_u8m2(base,value,vl);
+}
+
+void
+test___riscv_vse8_v_u8m4(uint8_t* base,vuint8m4_t value,size_t vl)
+{
+  __riscv_vse8_v_u8m4(base,value,vl);
+}
+
+void
+test___riscv_vse8_v_u8m8(uint8_t* base,vuint8m8_t value,size_t vl)
+{
+  __riscv_vse8_v_u8m8(base,value,vl);
+}
+
+void
+test___riscv_vse16_v_i16mf4(int16_t* base,vint16mf4_t value,size_t vl)
+{
+  __riscv_vse16_v_i16mf4(base,value,vl);
+}
+
+void
+test___riscv_vse16_v_i16mf2(int16_t* base,vint16mf2_t value,size_t vl)
+{
+  __riscv_vse16_v_i16mf2(base,value,vl);
+}
+
+void
+test___riscv_vse16_v_i16m1(int16_t* base,vint16m1_t value,size_t vl)
+{
+  __riscv_vse16_v_i16m1(base,value,vl);
+}
+
+void
+test___riscv_vse16_v_i16m2(int16_t* base,vint16m2_t value,size_t vl)
+{
+  __riscv_vse16_v_i16m2(base,value,vl);
+}
+
+void
+test___riscv_vse16_v_i16m4(int16_t* base,vint16m4_t value,size_t vl)
+{
+  __riscv_vse16_v_i16m4(base,value,vl);
+}
+
+void
+test___riscv_vse16_v_i16m8(int16_t* base,vint16m8_t value,size_t vl)
+{
+  __riscv_vse16_v_i16m8(base,value,vl);
+}
+
+void
+test___riscv_vse16_v_u16mf4(uint16_t* base,vuint16mf4_t value,size_t vl)
+{
+  __riscv_vse16_v_u16mf4(base,value,vl);
+}
+
+void
+test___riscv_vse16_v_u16mf2(uint16_t* base,vuint16mf2_t value,size_t vl)
+{
+  __riscv_vse16_v_u16mf2(base,value,vl);
+}
+
+void
+test___riscv_vse16_v_u16m1(uint16_t* base,vuint16m1_t value,size_t vl)
+{
+  __riscv_vse16_v_u16m1(base,value,vl);
+}
+
+void
+test___riscv_vse16_v_u16m2(uint16_t* base,vuint16m2_t value,size_t vl)
+{
+  __riscv_vse16_v_u16m2(base,value,vl);
+}
+
+void
+test___riscv_vse16_v_u16m4(uint16_t* base,vuint16m4_t value,size_t vl)
+{
+  __riscv_vse16_v_u16m4(base,value,vl);
+}
+
+void
+test___riscv_vse16_v_u16m8(uint16_t* base,vuint16m8_t value,size_t vl)
+{
+  __riscv_vse16_v_u16m8(base,value,vl);
+}
+
+void
+test___riscv_vse32_v_i32mf2(int32_t* base,vint32mf2_t value,size_t vl)
+{
+  

Re: [PATCH 1/2] aarch64: fix ICE in aarch64_layout_arg [PR108411]

2023-01-19 Thread Christophe Lyon via Gcc-patches




On 1/19/23 10:22, Richard Sandiford wrote:

Christophe Lyon  writes:

The previous patch added an assert which should not be applied to PST
types (Pure Scalable Types) because alignment does not matter in this
case.  This patch moves the assert after the PST case is handled to
avoid the ICE.

PR target/108411
gcc/
* config/aarch64/aarch64.cc (aarch64_layout_arg): Improve
comment. Move assert about alignment a bit later.
---
  gcc/config/aarch64/aarch64.cc | 28 +---
  1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index d36b57341b3..7175b453b3a 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -7659,7 +7659,18 @@ aarch64_layout_arg (cumulative_args_t pcum_v, const function_arg_info &arg)
 && (currently_expanding_function_start
   || currently_expanding_gimple_stmt));
  
-  /* There are several things to note here:

+  /* HFAs and HVAs can have an alignment greater than 16 bytes.  For example:
+
+   typedef struct foo {
+ __Int8x16_t foo[2] __attribute__((aligned(32)));
+   } foo;
+
+ is still a HVA despite its larger-than-normal alignment.
+ However, such over-aligned HFAs and HVAs are guaranteed to have
+ no padding.
+
+ If we exclude HFAs and HVAs from the discussion below, then there
+ are several things to note:
  
   - Both the C and AAPCS64 interpretations of a type's alignment should

 give a value that is no greater than the type's size.
@@ -7704,12 +7715,6 @@ aarch64_layout_arg (cumulative_args_t pcum_v, const function_arg_info &arg)
 would treat the alignment as though it was *equal to* 16 bytes.
  
   Both behaviors were wrong, but in different cases.  */

-  unsigned int alignment
-= aarch64_function_arg_alignment (mode, type, &abi_break,
- &abi_break_packed);
-  gcc_assert (alignment <= 16 * BITS_PER_UNIT
- && (!alignment || abi_break < alignment)
- && (!abi_break_packed || alignment < abi_break_packed));
  
pcum->aapcs_arg_processed = true;
  
@@ -7780,6 +7785,15 @@ aarch64_layout_arg (cumulative_args_t pcum_v, const function_arg_info &arg)

 );
gcc_assert (!sve_p || !allocate_nvrn);
  
+  unsigned int alignment

+= aarch64_function_arg_alignment (mode, type, &abi_break,
+ &abi_break_packed);
+
+  gcc_assert (allocate_nvrn || (alignment <= 16 * BITS_PER_UNIT
+   && (!alignment || abi_break < alignment)
+   && (!abi_break_packed
+   || alignment < abi_break_packed)));


I think allocate_nvrn should only circumvent the first part, so:

   gcc_assert ((allocate_nvrn || alignment <= 16 * BITS_PER_UNIT)
  && (!alignment || abi_break < alignment)
  && (!abi_break_packed || alignment < abi_break_packed));
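
The difference between the two conditions can be checked in isolation. The sketch below copies both boolean expressions into plain functions; kBitsPerUnit is a stand-in for BITS_PER_UNIT, posted_check and reviewed_check are hypothetical names, and the argument values used to compare them are made up for illustration.

```cpp
#include <cassert>

constexpr unsigned kBitsPerUnit = 8;  // stand-in for BITS_PER_UNIT

// Condition as posted: allocate_nvrn bypasses all three checks.
constexpr bool posted_check(bool allocate_nvrn, unsigned alignment,
                            unsigned abi_break, unsigned abi_break_packed) {
  return allocate_nvrn
         || (alignment <= 16 * kBitsPerUnit
             && (!alignment || abi_break < alignment)
             && (!abi_break_packed || alignment < abi_break_packed));
}

// Condition as suggested in review: allocate_nvrn only relaxes the
// 16-byte limit; the abi_break consistency checks always apply.
constexpr bool reviewed_check(bool allocate_nvrn, unsigned alignment,
                              unsigned abi_break, unsigned abi_break_packed) {
  return (allocate_nvrn || alignment <= 16 * kBitsPerUnit)
         && (!alignment || abi_break < alignment)
         && (!abi_break_packed || alignment < abi_break_packed);
}
```

The two only disagree when allocate_nvrn is true and one of the abi_break relations is violated, so the reviewed form keeps those sanity checks for the SIMD/FP register case as well.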


OK with that change, and sorry for not thinking about this originally.


OK thanks, now committed with that change (and after checking the 
testsuite still passes :-) )


Christophe



Richard


+
/* allocate_ncrn may be false-positive, but allocate_nvrn is quite reliable.
   The following code thus handles passing by SIMD/FP registers first.  */
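
For the over-aligned aggregate mentioned in the new comment of the patch, the size and alignment claims can be checked with a portable stand-in. Vec16 below is a hypothetical replacement for __Int8x16_t so that the sketch compiles on any target; it says nothing about how AArch64 classifies the type.

```cpp
#include <cassert>
#include <cstddef>

// Portable stand-in for a 16-byte vector type (hypothetical).
struct Vec16 {
  unsigned char lanes[16];
};

// Same shape as the example in the comment: two 16-byte elements with a
// requested alignment of 32 bytes.
struct Foo {
  alignas(32) Vec16 foo[2];
};

// The aggregate is over-aligned relative to its elements, yet its size is
// exactly the size of its two elements, so there is no padding.
constexpr bool foo_has_no_padding = sizeof(Foo) == 2 * sizeof(Vec16);
static_assert(alignof(Foo) == 32, "alignment follows the alignas request");
static_assert(foo_has_no_padding, "no padding despite the larger alignment");
```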


Re: [PATCH] modula2/108144 - fix --enable-version-specific-runtime-libs

2023-01-19 Thread Gaius Mulley via Gcc-patches
Richard Biener  writes:

> The following fixes --enable-version-specific-runtime-libs for
> the modula2 target libraries.  The issue is that the install
> happens via for example
>
> toolexeclib_LTLIBRARIES = libm2cor.la
>
> and toolexeclibdir is set to $(toolexecdir)/$(gcc_version)$(MULTISUBDIR)
> but the Makefile.am do not define $(gcc_version) but instead
> $(version) which is used locally to define libsubdir.  The fix
> is to consistently define and use $(gcc_version), also properly
> supporting --with-gcc-major-version-only
>
> Built and installed on x86_64-unknown-linux-gnu with
> --enable-version-specific-runtime-libs and --with-gcc-major-version-only.
>
> OK?
>
> Thanks,
> Richard. 

yes LGTM and thanks for the fix!

regards,
Gaius


Re: [PATCH 1/2] gcc/file-prefix-map: Allow remapping of relative paths

2023-01-19 Thread Jakub Jelinek via Gcc-patches
On Tue, Nov 01, 2022 at 01:46:20PM -0600, Jeff Law via Gcc-patches wrote:
> 
> On 8/17/22 06:15, Richard Purdie via Gcc-patches wrote:
> > Relative paths currently aren't remapped by -ffile-prefix-map and friends.
> > When cross compiling with separate 'source' and 'build' directories, the
> > same relative paths between directories may not be available on target as
> > compared to build time.
> > 
> > In order to be able to remap these relative build paths to paths that would
> > work on target, resolve paths within the file-prefix-map function using
> > realpath().
> 
> Understood.
> 
> 
> > 
> > This does cause a change of behaviour if users were previously relying upon
> > symlinks or absolute paths not being resolved.
> 
> I'm not too worried about this scenario.

This breaks ccache testsuite and -fdebug-prefix-map behavior in directories
which are symlinks, see PR108464.  I can't see how the new behavior would be
correct in that case, user is asking to remap say /home/jakub/foobar2 to
some other path, but exactly /home/jakub/foobar2 appears in the debug info,
rather than the other path.
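
A small sketch of why resolving the path first defeats the mapping. remap_prefix is a hypothetical helper doing a plain textual prefix replacement; /home/jakub/foobar2 is the path from the paragraph above, while /src and /mnt/real/foobar2 are made-up values standing in for the requested replacement and for what realpath() might return for a symlinked directory.

```cpp
#include <cassert>
#include <string>

// Replace the leading 'from' of 'path' by 'to' if it matches textually;
// otherwise return the path unchanged.
std::string remap_prefix(const std::string &path, const std::string &from,
                         const std::string &to) {
  if (path.compare(0, from.size(), from) == 0)
    return to + path.substr(from.size());
  return path;
}
```

The user-visible spelling matches the requested prefix and is remapped, whereas the resolved spelling of the same file no longer starts with that prefix and passes through untouched.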

Jakub



Re: [PATCH] wwwdocs: Announce Solaris 11.3 obsoletion

2023-01-19 Thread Rainer Orth
Hi Gerald,

>> Btw., I noticed the -gz=zstd addition is listed under Caveats.  I don't
>> think this belongs here and probably only landed due to the -gz=zlib-gnu
>> removal above.
>
> Agreed. Can you address this on the way?

sure: done like so:

gcc-13: Move -gz=zstd to General Improvements

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
index 7047e742..ba42170c 100644
--- a/htdocs/gcc-13/changes.html
+++ b/htdocs/gcc-13/changes.html
@@ -45,7 +45,6 @@ a work-in-progress.
   supported, either.)
 Legacy debug info compression option -gz=zlib-gnu was removed
   and the option is ignored right now.
-New debug info compression option value -gz=zstd has been added.
 -Warray-bounds=2 will no longer issue warnings for out of
   bounds accesses to trailing struct members of one-element array type
   anymore. Instead it diagnoses accesses to trailing arrays according to
@@ -107,6 +106,7 @@ a work-in-progress.
 AddressSanitizer defaults to detect_stack_use_after_return=1 on GNU/Linux targets.
 For compatibility, it can be disabled with env ASAN_OPTIONS=detect_stack_use_after_return=0.
   
+  New debug info compression option value -gz=zstd has been added.
   
 Link-time optimization improvements:
 


Re: [aarch64] Use wzr/xzr for assigning vector element to 0

2023-01-19 Thread Prathamesh Kulkarni via Gcc-patches
On Wed, 18 Jan 2023 at 19:59, Richard Sandiford
 wrote:
>
> Prathamesh Kulkarni  writes:
> > On Tue, 17 Jan 2023 at 18:29, Richard Sandiford
> >  wrote:
> >>
> >> Prathamesh Kulkarni  writes:
> >> > Hi Richard,
> >> > For the following (contrived) test:
> >> >
> >> > int32x4_t foo(int32x4_t v)
> >> > {
> >> >   v[3] = 0;
> >> >   return v;
> >> > }
> >> >
> >> > -O2 code-gen:
> >> > foo:
> >> > fmov    s1, wzr
> >> > ins v0.s[3], v1.s[0]
> >> > ret
> >> >
> >> > I suppose we can instead emit the following code-gen ?
> >> > foo:
> >> >  ins v0.s[3], wzr
> >> >  ret
> >> >
> >> > combine produces:
> >> > Failed to match this instruction:
> >> > (set (reg:V4SI 95 [ v ])
> >> > (vec_merge:V4SI (const_vector:V4SI [
> >> > (const_int 0 [0]) repeated x4
> >> > ])
> >> > (reg:V4SI 97)
> >> > (const_int 8 [0x8])))
> >> >
> >> > So, I wrote the following pattern to match the above insn:
> >> > (define_insn "aarch64_simd_vec_set_zero"
> >> >   [(set (match_operand:VALL_F16 0 "register_operand" "=w")
> >> > (vec_merge:VALL_F16
> >> > (match_operand:VALL_F16 1 "const_dup0_operand" "w")
> >> > (match_operand:VALL_F16 3 "register_operand" "0")
> >> > (match_operand:SI 2 "immediate_operand" "i")))]
> >> >   "TARGET_SIMD"
> >> >   {
> >> > int elt = ENDIAN_LANE_N (, exact_log2 (INTVAL 
> >> > (operands[2])));
> >> > operands[2] = GEN_INT ((HOST_WIDE_INT) 1 << elt);
> >> > return "ins\\t%0.[%p2], wzr";
> >> >   }
> >> > )
> >> >
> >> > which now matches the above insn produced by combine.
> >> > However, in reload dump, it creates a new insn for assigning
> >> > register to (const_vector (const_int 0)),
> >> > which results in:
> >> > (insn 19 8 13 2 (set (reg:V4SI 33 v1 [99])
> >> > (const_vector:V4SI [
> >> > (const_int 0 [0]) repeated x4
> >> > ])) "wzr-test.c":8:1 1269 {*aarch64_simd_movv4si}
> >> >  (nil))
> >> > (insn 13 19 14 2 (set (reg/i:V4SI 32 v0)
> >> > (vec_merge:V4SI (reg:V4SI 33 v1 [99])
> >> > (reg:V4SI 32 v0 [97])
> >> > (const_int 8 [0x8]))) "wzr-test.c":8:1 1808
> >> > {aarch64_simd_vec_set_zerov4si}
> >> >  (nil))
> >> >
> >> > and eventually the code-gen:
> >> > foo:
> >> > movi    v1.4s, 0
> >> > ins v0.s[3], wzr
> >> > ret
> >> >
> >> > To get rid of redundant assignment of 0 to v1, I tried to split the
> >> > above pattern
> >> > as in the attached patch. This works to emit code-gen:
> >> > foo:
> >> > ins v0.s[3], wzr
> >> > ret
> >> >
> >> > However, I am not sure if this is the right approach. Could you suggest,
> >> > if it'd be possible to get rid of UNSPEC_SETZERO in the patch ?
> >>
> >> The problem is with the "w" constraint on operand 1, which tells LRA
> >> to force the zero into an FPR.  It should work if you remove the
> >> constraint.
> > Ah indeed, sorry about that, changing the constraint works.
>
> "i" isn't right though, because that's for scalar integers.
> There's no need for any constraint here -- the predicate does
> all of the work.
>
> > Does the attached patch look OK after bootstrap+test ?
> > Since we're in stage-4, shall it be OK to commit now, or queue it for 
> > stage-1 ?
>
> It needs tests as well. :-)
>
> Also:
>
> > Thanks,
> > Prathamesh
> >
> >
> >>
> >> Also, I think you'll need to use zr for the zero, so that
> >> it uses xzr for 64-bit elements.
> >>
> >> I think this and the existing patterns ought to test
> >> exact_log2 (INTVAL (operands[2])) >= 0 in the insn condition,
> >> since there's no guarantee that RTL optimisations won't form
> >> vec_merges that have other masks.
> >>
> >> Thanks,
> >> Richard
> >
> > [aarch64] Use wzr/xzr for assigning 0 to vector element.
> >
> > gcc/ChangeLog:
> >   * config/aarch64/aarch64-simd.md (aarch64_simd_vec_set_zero):
> >   New pattern.
> >   * config/aarch64/predicates.md (const_dup0_operand): New.
> >
> > diff --git a/gcc/config/aarch64/aarch64-simd.md 
> > b/gcc/config/aarch64/aarch64-simd.md
> > index 104088f67d2..8e54ee4e886 100644
> > --- a/gcc/config/aarch64/aarch64-simd.md
> > +++ b/gcc/config/aarch64/aarch64-simd.md
> > @@ -1083,6 +1083,20 @@
> >[(set_attr "type" "neon_ins, neon_from_gp, 
> > neon_load1_one_lane")]
> >  )
> >
> > +(define_insn "aarch64_simd_vec_set_zero"
> > +  [(set (match_operand:VALL_F16 0 "register_operand" "=w")
> > + (vec_merge:VALL_F16
> > + (match_operand:VALL_F16 1 "const_dup0_operand" "i")
> > + (match_operand:VALL_F16 3 "register_operand" "0")
> > + (match_operand:SI 2 "immediate_operand" "i")))]
> > +  "TARGET_SIMD && exact_log2 (INTVAL (operands[2])) >= 0"
> > +  {
> > +int elt = ENDIAN_LANE_N (, exact_log2 (INTVAL (operands[2])));
> > +operands[2] = GEN_INT ((HOST_WIDE_INT) 1 << elt);
> > +return "ins\\t%0.[%p2], zr";
> > +  }
> > +)
> > +
> >  (define_insn 

Re: [RFC] Introduce -finline-memset-loops

2023-01-19 Thread Richard Biener via Gcc-patches
On Thu, Jan 19, 2023 at 12:25 PM Alexandre Oliva  wrote:
>
> On Jan 16, 2023, Richard Biener  wrote:
>
> > On Sat, Jan 14, 2023 at 2:55 AM Alexandre Oliva  wrote:
> >> Target-specific code is great for tight optimizations, but the main
> >> purpose of this feature is not an optimization.  AFAICT it actually
> >> slows things down in general (due to code growth, and to conservative
> >> assumptions about alignment), except perhaps for some microbenchmarks.
> >> It's rather a means to avoid depending on the C runtime, particularly
> >> due to compiler-introduced memset calls.
>
> > OK, that's what I guessed but you didn't spell out.  So does it make sense
> > to mention -ffreestanding in the docs at least?  My fear is that we'd get
> > complaints that -O3 -finline-memset-loops turns nicely optimized memset
> > loops into dumb ones (via loop distribution and then stupid re-expansion).
> > So does it also make sense to turn off -floop-distribute-patterns[-memset]
> > with -finline-memset-loops?
>
> I don't think they should be tied together.  Verbose as it is, the
> expansion of memset is a sort of local optimum given what the compiler
> knows about length and alignment: minimizing what can be minimized,
> namely the compare count, by grouping stores in large straight-line
> blocks.
>
> Though an optimized memset could in theory perform better, whether
> through ifuncs or by bumping alignment, if you're tuning generated code
> for a specific target, and you know dest is aligned, the generated code
> can likely beat a general-purpose optimized memset, even if by a thin
> margin, such as the code that the general-purpose memset would have to
> run to detect the alignment and realize it doesn't need to be bumped,
> and to extend the byte value to be stored to wider modes.
>
> So I can envision cases in which -floop-distribute-patterns could turn
> highly inefficient stores into a memset with known-strict alignment and
> length multiplier, that could then be profitably expanded inline so as
> to take advantage of both for performance reasons.
>
> Indeed, when I started working on this, I thought the issue was
> performance, and this led me to pursue the store-by-multiple-pieces
> logic.  It can indeed bring about performance improvements, both over
> generic-target and highly-optimized memset implementations.  But it can
> also be put to use to avoid C runtime calls.  So while I wouldn't
> suggest enabling it by default at any optimization level, I wouldn't tie
> it to the single purpose of freestanding environments either.
>
>
> >> My initial goal was to be able to show that inline expansion would NOT
> >> bring about performance improvements, but performance was not the
> >> concern that led to the request.
> >>
> >> If the approach seems generally acceptable, I may even end up extending
> >> it to other such builtins.  I have a vague recollection that memcmp is
> >> also an issue for us.
>
> > The C/C++ runtime produce at least memmove, memcpy and memcmp as well.
>
> *nod*.  The others are far more challenging to expand inline in a way
> that could potentially be more performant:
>
> - memcmp can only do by_pieces when testing for equality, presumably
> because grouping multiple bytes into larger words to speed things up
> won't always get you the expected result if you just subtract the larger
> words; endianness reversal prior to subtracting might be required, which
> would harm performance.  I don't see that using similar
> power-of-two-sizes grouping strategies to minimize looping overheads
> would be so advantageous, if at all, given the need for testing and
> branching at every word.
>
> - memcpy seems doable, but all of the block moves other than cpymem
> assume non-overlapping memcpy.  Even if we were to output a test for
> overlap that a naïve expansion would break, and an alternate block to go
> backwards, all of the block copying logic would have to be "oriented" to
> proceed explicitly forward, backward, or don't-care, where currently we
> only have don't-care.
>
> Though my initial plan, when posting this patch, was to see how well the
> general approach was received, before thinking much about how to apply
> it to the other builtins, now I am concerned that extending it to them
> is more than I can tackle.
>
> Would it make more sense to extend it, even constrained by the
> limitations mentioned above, or handle memset only?  In the latter case,
> would it still make sense to adopt a command-line option that suggests a
> broader effect than it already has, even if it's only a hopeful future
> extension?  -finline-all-stringops[={memset,memcpy,...}], that you
> suggested, seems to be a reasonable and extensible one to adopt.

Well, if the main intent is to avoid relying on a C runtime for calls
generated by the compiler then yes!  Otherwise it would be incomplete.
In that light ...

> >> Is (optionally) tending to this (uncommon, I suppose) need (or
> >> preference?) not something GCC would like 

Re: [RFC] Introduce -finline-memset-loops

2023-01-19 Thread Alexandre Oliva via Gcc-patches
On Jan 16, 2023, Richard Biener  wrote:

> On Sat, Jan 14, 2023 at 2:55 AM Alexandre Oliva  wrote:
>> Target-specific code is great for tight optimizations, but the main
>> purpose of this feature is not an optimization.  AFAICT it actually
>> slows things down in general (due to code growth, and to conservative
>> assumptions about alignment), except perhaps for some microbenchmarks.
>> It's rather a means to avoid depending on the C runtime, particularly
>> due to compiler-introduced memset calls.

> OK, that's what I guessed but you didn't spell out.  So does it make sense
> to mention -ffreestanding in the docs at least?  My fear is that we'd get
> complaints that -O3 -finline-memset-loops turns nicely optimized memset
> loops into dumb ones (via loop distribution and then stupid re-expansion).
> So does it also make sense to turn off -floop-distribute-patterns[-memset]
> with -finline-memset-loops?

I don't think they should be tied together.  Verbose as it is, the
expansion of memset is a sort of local optimum given what the compiler
knows about length and alignment: minimizing what can be minimized,
namely the compare count, by grouping stores in large straight-line
blocks.

Though an optimized memset could in theory perform better, whether
through ifuncs or by bumping alignment, if you're tuning generated code
for a specific target, and you know dest is aligned, the generated code
can likely beat a general-purpose optimized memset, even if by a thin
margin, such as the code that the general-purpose memset would have to
run to detect the alignment and realize it doesn't need to be bumped,
and to extend the byte value to be stored to wider modes.

So I can envision cases in which -floop-distribute-patterns could turn
highly inefficient stores into a memset with known-strict alignment and
length multiplier, that could then be profitably expanded inline so as
to take advantage of both for performance reasons.

Indeed, when I started working on this, I thought the issue was
performance, and this led me to pursue the store-by-multiple-pieces
logic.  It can indeed bring about performance improvements, both over
generic-target and highly-optimized memset implementations.  But it can
also be put to use to avoid C runtime calls.  So while I wouldn't
suggest enabling it by default at any optimization level, I wouldn't tie
it to the single purpose of freestanding environments either.
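
To make the "large straight-line blocks" point concrete, here is a rough
sketch in plain C of the shape such an expansion could take.  It is not
what the patch emits: the function name is made up, and it assumes the
destination is 8-byte aligned and the length is a whole number of 64-bit
words, which is the kind of knowledge the compiler would need to have.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not GCC's actual expansion: store BYTE into
   NWORDS consecutive 64-bit words starting at DST.  Assumes DST is
   suitably aligned for uint64_t.  */
void
fill_words_sketch (uint64_t *dst, unsigned char byte, size_t nwords)
{
  /* Extend the byte value to the wider store mode once, up front.  */
  uint64_t word = byte * UINT64_C (0x0101010101010101);

  /* Largest block: four stores per length compare.  */
  while (nwords >= 4)
    {
      dst[0] = word;
      dst[1] = word;
      dst[2] = word;
      dst[3] = word;
      dst += 4;
      nwords -= 4;
    }

  /* Remaining power-of-two blocks, one compare each.  */
  if (nwords >= 2)
    {
      dst[0] = word;
      dst[1] = word;
      dst += 2;
      nwords -= 2;
    }
  if (nwords >= 1)
    dst[0] = word;
}
```

A general-purpose memset would have to discover the alignment and
widen the byte at run time; here both are settled before the first
store, which is where the thin margin mentioned above would come from.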


>> My initial goal was to be able to show that inline expansion would NOT
>> bring about performance improvements, but performance was not the
>> concern that led to the request.
>> 
>> If the approach seems generally acceptable, I may even end up extending
>> it to other such builtins.  I have a vague recollection that memcmp is
>> also an issue for us.

> The C/C++ runtime produce at least memmove, memcpy and memcmp as well.

*nod*.  The others are far more challenging to expand inline in a way
that could potentially be more performant:

- memcmp can only do by_pieces when testing for equality, presumably
because grouping multiple bytes into larger words to speed things up
won't always get you the expected result if you just subtract the larger
words; endianness reversal prior to subtracting might be required, which
would harm performance.  I don't see that using similar
power-of-two-sizes grouping strategies to minimize looping overheads
would be so advantageous, if at all, given the need for testing and
branching at every word.

- memcpy seems doable, but all of the block moves other than cpymem
assume non-overlapping memcpy.  Even if we were to output a test for
overlap that a naïve expansion would break, and an alternate block to go
backwards, all of the block copying logic would have to be "oriented" to
proceed explicitly forward, backward, or don't-care, where currently we
only have don't-care.
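
As an illustration of the memcmp point, the following sketch (all names
are made up for the example) compares two 4-byte buffers as whole words.
Loading the bytes in little-endian order gives the right answer for
equality but can give the wrong sign for ordering; loading them in
big-endian order, which is what a byte swap on a little-endian host
amounts to, matches memcmp.

```c
#include <stdint.h>

/* Assemble 4 bytes as a little-endian word, independent of the host.  */
static uint32_t
load_le32 (const unsigned char *p)
{
  return (uint32_t) p[0] | ((uint32_t) p[1] << 8)
	 | ((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24);
}

/* Assemble 4 bytes as a big-endian word, independent of the host.  */
static uint32_t
load_be32 (const unsigned char *p)
{
  return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16)
	 | ((uint32_t) p[2] << 8) | (uint32_t) p[3];
}

/* Three-way compare of two words: negative, zero or positive.  */
static int
cmp_u32 (uint32_t a, uint32_t b)
{
  return (a > b) - (a < b);
}

/* Word compare in little-endian order: fine for equality only.  */
int
word_compare_le (const unsigned char *a, const unsigned char *b)
{
  return cmp_u32 (load_le32 (a), load_le32 (b));
}

/* Word compare in big-endian order: agrees in sign with memcmp.  */
int
word_compare_be (const unsigned char *a, const unsigned char *b)
{
  return cmp_u32 (load_be32 (a), load_be32 (b));
}
```

With a = {1, 0, 0, 2} and b = {0, 0, 0, 3}, memcmp orders a after b
because the first bytes differ, while the little-endian word compare
orders a before b because the last byte became the most significant.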

Though my initial plan, when posting this patch, was to see how well the
general approach was received, before thinking much about how to apply
it to the other builtins, now I am concerned that extending it to them
is more than I can tackle.

Would it make more sense to extend it, even constrained by the
limitations mentioned above, or handle memset only?  In the latter case,
would it still make sense to adopt a command-line option that suggests a
broader effect than it already has, even if it's only a hopeful future
extension?  -finline-all-stringops[={memset,memcpy,...}], that you
suggested, seems to be a reasonable and extensible one to adopt.

>> Is (optionally) tending to this (uncommon, I suppose) need (or
>> preference?) not something GCC would like to do?

> Sure, I think for the specific intended purpose that would be fine.

Cool!

> It should also only apply to __builtin_memset calls, not to memset
> calls from user code?

I suppose it could be argued both ways.  The situations that I had in
mind either already are or could be made __builtin_memset calls, but I
can't think of reasons to prevent explicit memset calls 

[PATCH] modula2/108144 - fix --enable-version-specific-runtime-libs

2023-01-19 Thread Richard Biener via Gcc-patches
The following fixes --enable-version-specific-runtime-libs for
the modula2 target libraries.  The issue is that the install
happens via, for example,

toolexeclib_LTLIBRARIES = libm2cor.la

and toolexeclibdir is set to $(toolexecdir)/$(gcc_version)$(MULTISUBDIR),
but the Makefile.am files do not define $(gcc_version); they instead define
$(version), which is used locally to define libsubdir.  The fix
is to consistently define and use $(gcc_version), which also properly
supports --with-gcc-major-version-only.

Built and installed on x86_64-unknown-linux-gnu with
--enable-version-specific-runtime-libs and --with-gcc-major-version-only.

OK?

Thanks,
Richard. 

PR modula2/108144
libgm2/
* configure.ac: Add GCC_BASE_VER.
* configure: Re-generate.
* Makefile.am: Use @get_gcc_base_ver@ for gcc_version.
* libm2cor/Makefile.am: Likewise.  Use gcc_version instead
of version.
* libm2iso/Makefile.am: Likewise.
* libm2log/Makefile.am: Likewise.
* libm2min/Makefile.am: Likewise.
* libm2pim/Makefile.am: Likewise.
* Makefile.in: Re-generate.
* libm2cor/Makefile.in: Likewise.
* libm2iso/Makefile.in: Likewise.
* libm2log/Makefile.in: Likewise.
* libm2min/Makefile.in: Likewise.
* libm2pim/Makefile.in: Likewise.
---
 libgm2/Makefile.am  |  2 +-
 libgm2/Makefile.in  |  3 ++-
 libgm2/configure| 23 +--
 libgm2/configure.ac |  3 +++
 libgm2/libm2cor/Makefile.am |  4 ++--
 libgm2/libm2cor/Makefile.in |  5 +++--
 libgm2/libm2iso/Makefile.am |  4 ++--
 libgm2/libm2iso/Makefile.in |  5 +++--
 libgm2/libm2log/Makefile.am |  4 ++--
 libgm2/libm2log/Makefile.in |  5 +++--
 libgm2/libm2min/Makefile.am |  4 ++--
 libgm2/libm2min/Makefile.in |  5 +++--
 libgm2/libm2pim/Makefile.am |  4 ++--
 libgm2/libm2pim/Makefile.in |  5 +++--
 14 files changed, 52 insertions(+), 24 deletions(-)

diff --git a/libgm2/Makefile.am b/libgm2/Makefile.am
index 88d12ee325e..524ea6c7124 100644
--- a/libgm2/Makefile.am
+++ b/libgm2/Makefile.am
@@ -32,7 +32,7 @@ MAKEOVERRIDES=
 
 AM_CFLAGS = -I $(srcdir)/../libgcc -I $(MULTIBUILDTOP)../../gcc/include
 
-gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
+gcc_version := $(shell @get_gcc_base_ver@ $(top_srcdir)/../gcc/BASE-VER)
 TOP_GCCDIR := $(shell cd $(top_srcdir) && cd .. && pwd)
 
 GCC_DIR = $(TOP_GCCDIR)/gcc
diff --git a/libgm2/Makefile.in b/libgm2/Makefile.in
index ec9094b345d..ac01eafe45c 100644
--- a/libgm2/Makefile.in
+++ b/libgm2/Makefile.in
@@ -264,6 +264,7 @@ dvidir = @dvidir@
 enable_shared = @enable_shared@
 enable_static = @enable_static@
 exec_prefix = @exec_prefix@
+get_gcc_base_ver = @get_gcc_base_ver@
 host = @host@
 host_alias = @host_alias@
 host_cpu = @host_cpu@
@@ -336,7 +337,7 @@ ACLOCAL_AMFLAGS = -I . -I .. -I ../config
 # Multilib support.
 MAKEOVERRIDES = 
 AM_CFLAGS = -I $(srcdir)/../libgcc -I $(MULTIBUILDTOP)../../gcc/include
-gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
+gcc_version := $(shell @get_gcc_base_ver@ $(top_srcdir)/../gcc/BASE-VER)
 TOP_GCCDIR := $(shell cd $(top_srcdir) && cd .. && pwd)
 GCC_DIR = $(TOP_GCCDIR)/gcc
 GM2_SRC = $(GCC_DIR)/m2
diff --git a/libgm2/configure b/libgm2/configure
index 922b0715964..8b2c28cb163 100755
--- a/libgm2/configure
+++ b/libgm2/configure
@@ -634,6 +634,7 @@ ac_subst_vars='am__EXEEXT_FALSE
 am__EXEEXT_TRUE
 LTLIBOBJS
 LIBOBJS
+get_gcc_base_ver
 TARGET_DARWIN_FALSE
 TARGET_DARWIN_TRUE
 BUILD_LOGLIB_FALSE
@@ -805,6 +806,7 @@ with_pic
 enable_fast_install
 with_gnu_ld
 enable_libtool_lock
+with_gcc_major_version_only
 '
   ac_precious_vars='build_alias
 host_alias
@@ -1464,6 +1466,8 @@ Optional Packages:
   --with-pic  try to use only PIC/non-PIC objects [default=use
   both]
   --with-gnu-ld   assume the C compiler uses GNU ld [default=no]
+  --with-gcc-major-version-only
+  use only GCC major number in filesystem paths
 
 Some influential environment variables:
   CC  C compiler command
@@ -12700,7 +12704,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 12703 "configure"
+#line 12707 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -12806,7 +12810,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 12809 "configure"
+#line 12813 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -19696,6 +19700,21 @@ else
 fi
 
 
+# Determine what GCC version number to use in filesystem paths.
+
+  get_gcc_base_ver="cat"
+
+# Check whether --with-gcc-major-version-only was given.
+if test "${with_gcc_major_version_only+set}" = set; then :
+  withval=$with_gcc_major_version_only; if test x$with_gcc_major_version_only = xyes ; then
+get_gcc_base_ver="sed -e 's/^\([0-9]*\).*/\1/'"
+  fi
+
+fi
+
+
+
+
 
 

Re: [PATCH] wwwdocs: Announce Solaris 11.3 obsoletion

2023-01-19 Thread Gerald Pfeifer
On Wed, 18 Jan 2023, Rainer Orth wrote:
> Here's the changes.html patch corresponding to the Solaris 11.3
> obsoletion notice in
> 
>   https://gcc.gnu.org/pipermail/gcc/2022-December/240322.html
>   https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608384.html
> 
> Since this is the only obsoletion in GCC 13 so far, I haven't introduced
> a toplevel bulletpoint as in GCC 9.

That makes sense to me.

> Ok?

Yes, thank you!

> Btw., I noticed the -gz=zstd addition is listed under Caveats.  I don't
> think this belongs here and probably only landed due to the -gz=zlib-gnu
> removal above.

Agreed. Can you address this along the way?

Thanks,
Gerald


Re: [PATCH 1/2] select .rodata for const volatile variables.

2023-01-19 Thread Cupertino Miranda via Gcc-patches


Hi Jeff,

Kindly calling your attention to this thread.

Regards,
Cupertino

Cupertino Miranda via Gcc-patches writes:

> Richard Biener writes:
>
>> On Mon, Dec 5, 2022 at 7:07 PM Jeff Law via Gcc-patches
>>  wrote:
>>>
>>>
>>>
>>> On 12/2/22 10:52, Cupertino Miranda via Gcc-patches wrote:
>>> > Changed target code to select .rodata section for 'const volatile'
>>> > defined variables.
>>> > This change is in the context of the bugzilla #170181.
>>> >
>>> > gcc/ChangeLog:
>>> >
>>> >   v850.c(v850_select_section): Changed function.
>>> I'm not sure this is safe/correct.  ISTM that you need to look at the
>>> underlying TREE_TYPE to check for const-volatile rather than
>>> TREE_SIDE_EFFECTS.
>>
>> Just to quote tree.h:
>>
>> /* In any expression, decl, or constant, nonzero means it has side effects or
>>reevaluation of the whole expression could produce a different value.
>>This is set if any subexpression is a function call, a side effect or a
>>reference to a volatile variable.  In a ..._DECL, this is set only if the
>>declaration said `volatile'.  This will never be set for a constant.  */
>> #define TREE_SIDE_EFFECTS(NODE) \
>>   (NON_TYPE_CHECK (NODE)->base.side_effects_flag)
>>
>> so if exp is a decl then that's the volatile check.
>>
>
> Thank you Richard for the review.
> Jeff: Can you please let me know if Richard's comments address your
> concerns?
>
> Cupertino
>
>>> Of secondary importance is the ChangeLog.  Just saying "Changed
>>> function" provides no real information.  Something like this would be
>>> better:
>>>
>>> * config/v850/v850.c (v850_select_section): Put const volatile
>>> objects into read-only sections.
>>>
>>>
>>> Jeff
>>>
>>>
>>>
>>>
>>> > ---
>>> >   gcc/config/v850/v850.cc | 1 -
>>> >   1 file changed, 1 deletion(-)
>>> >
>>> > diff --git a/gcc/config/v850/v850.cc b/gcc/config/v850/v850.cc
>>> > index c7d432990ab..e66893fede4 100644
>>> > --- a/gcc/config/v850/v850.cc
>>> > +++ b/gcc/config/v850/v850.cc
>>> > @@ -2865,7 +2865,6 @@ v850_select_section (tree exp,
>>> >   {
>>> > int is_const;
>>> > if (!TREE_READONLY (exp)
>>> > -   || TREE_SIDE_EFFECTS (exp)
>>> > || !DECL_INITIAL (exp)
>>> > || (DECL_INITIAL (exp) != error_mark_node
>>> > && !TREE_CONSTANT (DECL_INITIAL (exp


[PATCH] tree-optimization/108449 - keep maybe_special_function_p behavior

2023-01-19 Thread Richard Biener via Gcc-patches
When we have a static declaration without definition we diagnose
that and turn it into an extern declaration.  That can alter
the outcome of maybe_special_function_p here and there's really
no point in doing that, so don't.

Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?

Thanks,
Richard.

PR tree-optimization/108449
* cgraphunit.cc (check_global_declaration): Do not turn
undefined statics into externs.

* gcc.dg/pr108449.c: New testcase.
---
 gcc/cgraphunit.cc   | 2 --
 gcc/testsuite/gcc.dg/pr108449.c | 5 +
 2 files changed, 5 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr108449.c

diff --git a/gcc/cgraphunit.cc b/gcc/cgraphunit.cc
index 59ce2708b7b..832818d651f 100644
--- a/gcc/cgraphunit.cc
+++ b/gcc/cgraphunit.cc
@@ -1087,8 +1087,6 @@ check_global_declaration (symtab_node *snode)
   else
warning (OPT_Wunused_function, "%q+F declared % but never "
   "defined", decl);
-  /* This symbol is effectively an "extern" declaration now.  */
-  TREE_PUBLIC (decl) = 1;
 }
 
   /* Warn about static fns or vars defined but not used.  */
diff --git a/gcc/testsuite/gcc.dg/pr108449.c b/gcc/testsuite/gcc.dg/pr108449.c
new file mode 100644
index 000..4a3ae5b3ed4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr108449.c
@@ -0,0 +1,5 @@
+/* { dg-do compile } */
+/* { dg-options "-O" } */
+
+static int vfork(); /* { dg-warning "used but never defined" } */
+void f() { vfork(); }
-- 
2.35.3


Re: [PATCH 5/8 v2] middle-end: Add cltz_complement idiom recognition

2023-01-19 Thread Richard Biener via Gcc-patches
On Thu, Jan 19, 2023 at 10:19 AM Jan-Benedict Glaw  wrote:
>
> On Thu, 2022-12-22 17:42:16 +, Andrew Carlotti via Gcc-patches 
>  wrote:
> > New patch below, bootstrapped and regression tested on
> > aarch64-unknown-linux-gnu and x86_64-pc-linux-gnu - ok to merge?
>
> > diff --git a/gcc/tree-ssa-loop-niter.cc b/gcc/tree-ssa-loop-niter.cc
> > index 
> > fece876099c1687569d6351e7d2416ea6acae5b5..ce2441f2a6dbdf2d8fe42755d5d1abd8a631bb5c
> >  100644
> > --- a/gcc/tree-ssa-loop-niter.cc
> > +++ b/gcc/tree-ssa-loop-niter.cc
> > @@ -42,6 +42,7 @@ along with GCC; see the file COPYING3.  If not see
> >  #include "tree-chrec.h"
> >  #include "tree-scalar-evolution.h"
> >  #include "tree-dfa.h"
> > +#include "internal-fn.h"
> >  #include "gimple-range.h"
> >
> >
> > @@ -2198,6 +2199,224 @@ number_of_iterations_popcount (loop_p loop, edge 
> > exit,
> >return true;
> >  }
> >
> > +/* Return an expression that counts the leading/trailing zeroes of src.
> > +
> > +   If define_at_zero is true, then the built expression will be defined to
> > +   return the precision of src when src == 0 (using either a conditional
> > +   expression or a suitable internal function).
> > +   Otherwise, we can elide the conditional expression and let src = 0 
> > invoke
> > +   undefined behaviour.  */
> > +
> > +static tree
> > +build_cltz_expr (tree src, bool leading, bool define_at_zero)
> > +{
> [...]
> > +
> > +  tree call;
> > +  if (use_ifn)
> > +{
> > +  call = build_call_expr_internal_loc (UNKNOWN_LOCATION, ifn,
> > +integer_type_node, 1, src);
> > +  int val;
> > +  scalar_int_mode mode = SCALAR_INT_TYPE_MODE (utype);
>  
>
> This will give us a new unused variable warning.

I wonder if hardening the defaults.h macros like

#define CLZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE)  (((MODE), (VALUE)), 0)

fixes that and makes sense (also to avoid losing side effects of the arguments).
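
A small standalone sketch of the idea, outside of GCC: the macro names
below are invented for the example, and the (void) casts are an
addition of this sketch (not part of the suggestion above) to keep
-Wunused-value quiet.  The plain form drops its arguments entirely, so
a variable passed only to it looks unused and any side effect in the
arguments is lost; the hardened form evaluates both and still yields 0.

```c
/* Hypothetical stand-ins for a defaults.h-style fallback macro.  */
#define DEFINED_AT_ZERO_PLAIN(MODE, VALUE) 0
#define DEFINED_AT_ZERO_HARDENED(MODE, VALUE) \
  ((void) (MODE), (void) (VALUE), 0)

/* Counts how often the MODE argument is actually evaluated.  */
static int evaluations;

static int
next_mode (void)
{
  evaluations++;
  return 1;
}

/* With the plain macro neither argument survives preprocessing, so
   VAL is never referenced and next_mode is never called.  */
int
evaluations_with_plain (void)
{
  int val = 0;
  evaluations = 0;
  (void) DEFINED_AT_ZERO_PLAIN (next_mode (), val);
  return evaluations;
}

/* With the hardened macro both arguments are evaluated once.  */
int
evaluations_with_hardened (void)
{
  int val = 0;
  evaluations = 0;
  (void) DEFINED_AT_ZERO_HARDENED (next_mode (), val);
  return evaluations;
}
```

Compiling the first function with -Wall is expected to produce the
same kind of unused-variable warning reported above, while the second
should compile cleanly.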

Richard.


> > +  int optab_defined_at_zero
> > + = leading ? CLZ_DEFINED_VALUE_AT_ZERO (mode, val)
> > +   : CTZ_DEFINED_VALUE_AT_ZERO (mode, val);
> > +  if (define_at_zero && !(optab_defined_at_zero == 2 && val == prec))
> > + {
> > +   tree is_zero = fold_build2 (NE_EXPR, boolean_type_node, src,
> > +   build_zero_cst (TREE_TYPE (src)));
> > +   call = fold_build3(COND_EXPR, integer_type_node, is_zero, call,
> > +  build_int_cst (integer_type_node, prec));
> > + }
> > +}
>
> MfG, JBG
>
> --


Re: [PATCH] forwprop: Further fixes for simplify_rotate [PR108440]

2023-01-19 Thread Aldy Hernandez via Gcc-patches




On 1/19/23 09:41, Jakub Jelinek wrote:


+ range_query *q = get_range_query (cfun);
+ if (q == get_global_range_query ())
+   q = enable_ranger (cfun);


Oh, neat.  Clever.  I hadn't thought about that.


+ if (!q->range_of_expr (r, rotcnt, check_range_stmt))
+   {
+ if (check_range > 0)
+   return false;
+ r.set_varying (TREE_TYPE (rotcnt));
+   }
  int prec = TYPE_PRECISION (TREE_TYPE (rotcnt));
  signop sign = TYPE_SIGN (TREE_TYPE (rotcnt));
  wide_int min = wide_int::from (TYPE_PRECISION (rtype), prec, sign);
  wide_int max = wide_int::from (wider_prec - 1, prec, sign);
- int_range<2> r2 (TREE_TYPE (rotcnt), min, max);
+ if (check_range < 0)
+   max = min;
+ int_range<1> r2 (TREE_TYPE (rotcnt), min, max);

>  r.intersect (r2);

Currently int_range<1> is a legacy range (anti ranges and such 
internally).  It's better to use <2> as the use of r2 will have to be 
converted to a multi-range before intersecting.  FYI, <2> is the 
smallest multi-range.


This is really an implementation detail, so don't bother changing it, 
even though it's slightly slower.  In the next release we'll nuke 
legacy, and <1> will mean what you think it means...the smallest range 
with one sub-range (and none of that anti range business internally).


Thanks.
Aldy



[committed] wwwdocs: gitwrite: Structure a section some more

2023-01-19 Thread Gerald Pfeifer
Along the way, properly mark up a command-line option.

Pushed. (The diff looks quite a bit bigger than it actually is.)

Gerald
---
 htdocs/gitwrite.html | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/htdocs/gitwrite.html b/htdocs/gitwrite.html
index e4dadb27..1ffda77a 100644
--- a/htdocs/gitwrite.html
+++ b/htdocs/gitwrite.html
@@ -412,12 +412,15 @@ chosen).  You can also push an already existing branch 
using git
 push users/me me/branch.  Beware that if you have more than one
 personal branch set up locally, simply typing git push
 users/me will potentially push all personal branches based on
-that remote.  Use --dry-run to check that what will be pushed is what
-you intend.  The script contrib/git-add-user-branch.sh
-can be used to create a new personal branch which can be pushed and
-pulled from the users/me remote.
+that remote.
+
+Use --dry-run to check that what will be pushed is what
+you intend.
 
-The script also defines a few useful aliases that can be used with the
+The script contrib/git-add-user-branch.sh
+can be used to create a new personal branch which can be pushed and
+pulled from the users/me remote.
+The script also defines a few useful aliases that can be used with the
 repository:
 
 
-- 
2.39.0


[committed] wwwdocs: gcc-3.3: Adjust www.open-std.org links to https

2023-01-19 Thread Gerald Pfeifer
Pushed.

Gerald

---
 htdocs/gcc-3.3/changes.html | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/htdocs/gcc-3.3/changes.html b/htdocs/gcc-3.3/changes.html
index 93d96e65..da4165f3 100644
--- a/htdocs/gcc-3.3/changes.html
+++ b/htdocs/gcc-3.3/changes.html
@@ -560,7 +560,7 @@ Detailed release notes for the GCC 3.3 release follow.
 https://gcc.gnu.org/PR9424;>9424 
i/ostream::operator/(streambuf*) drops 
characters
 https://gcc.gnu.org/PR9425;>9425 
filebuf::pbackfail broken (DUP: https://gcc.gnu.org/PR9439;>9439)
 https://gcc.gnu.org/PR9474;>9474 GCC freezes in compiling a 
weird code mixing iostream and 
iostream.h
-https://gcc.gnu.org/PR9548;>9548 Incorrect results from 
setf(ios::fixed) and precision(-1) http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#231;>[DR 
231] wwwdocs:
+https://gcc.gnu.org/PR9548;>9548 Incorrect results from 
setf(ios::fixed) and precision(-1) https://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#231;>[DR 
231] wwwdocs:
 https://gcc.gnu.org/PR9555;>9555 ostream 
inserters fail to set badbit on exception
 https://gcc.gnu.org/PR9561;>9561 ostream 
inserters rethrow exception of wrong type
 https://gcc.gnu.org/PR9563;>9563 ostream::sentry 
returns true after a failed preparation
@@ -1124,21 +1124,21 @@ the relevant defect report.
 https://gcc.gnu.org/PR9371;>9371 Bad exception handling in 
i/ostream::operator/(streambuf*)
 https://gcc.gnu.org/PR9546;>9546 bad exception handling in 
ostream members
 https://gcc.gnu.org/PR10081;>10081 
basic_ios::_M_cache_locale leaves NULL members in the face of 
unknown locales
-https://gcc.gnu.org/PR10093;>10093 http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#61;>[DR 61] 
wwwdocs: Setting failbit in exceptions doesn't work
+https://gcc.gnu.org/PR10093;>10093 https://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#61;>[DR 61] 
wwwdocs: Setting failbit in exceptions doesn't work
 https://gcc.gnu.org/PR10095;>10095 
istream::operator>(int) sets ios::badbit 
when ios::failbit is set.
 https://gcc.gnu.org/PR11554;>11554 Warning about reordering 
of initializers doesn't mention location of constructor
 https://gcc.gnu.org/PR12297;>12297 
istream::sentry::sentry() handles eof() 
incorrectly.
 https://gcc.gnu.org/PR12352;>12352 Exception safety problems 
in src/localename.cc
 https://gcc.gnu.org/PR12438;>12438 Memory leak in 
locale::combine()
 https://gcc.gnu.org/PR12540;>12540 Memory leak in 
locale::locale(const char*)
-https://gcc.gnu.org/PR12594;>12594 DRs http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#60;>60 [TC] 
wwwdocs: and http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#63;>63 
[TC] not implemented
-https://gcc.gnu.org/PR12657;>12657 Resolution of http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#292;>DR 292 
(WP) still unimplemented
+https://gcc.gnu.org/PR12594;>12594 DRs https://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#60;>60 [TC] 
wwwdocs: and https://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#63;>63 
[TC] not implemented
+https://gcc.gnu.org/PR12657;>12657 Resolution of https://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#292;>DR 292 
(WP) still unimplemented
 https://gcc.gnu.org/PR12696;>12696 memory eating infinite 
loop in diagnostics (error recovery problem)
 https://gcc.gnu.org/PR12815;>12815 Code compiled with 
optimization behaves unexpectedly
 https://gcc.gnu.org/PR12862;>12862 Conflicts between 
typedefs/enums and namespace member declarations
 https://gcc.gnu.org/PR12926;>12926 Wrong value after 
assignment in initialize list using bit-fields
-https://gcc.gnu.org/PR12967;>12967 Resolution of http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#;>DR 300 
[WP] wwwdocs: still unimplemented
-https://gcc.gnu.org/PR12971;>12971 Resolution of http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#328;>DR 328 
[WP] wwwdocs: still unimplemented
+https://gcc.gnu.org/PR12967;>12967 Resolution of https://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#;>DR 300 
[WP] wwwdocs: still unimplemented
+https://gcc.gnu.org/PR12971;>12971 Resolution of https://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#328;>DR 328 
[WP] wwwdocs: still unimplemented
 https://gcc.gnu.org/PR13007;>13007 
basic_streambuf::pubimbue, imbue wrong
 https://gcc.gnu.org/PR13009;>13009 Implicitly-defined 
assignment operator writes to wrong memory
 https://gcc.gnu.org/PR13057;>13057 regparm 
attribute not applied to destructor
-- 
2.39.0


Re: [Patch] Resolve bugzilla #108150 and #108192 for mingw

2023-01-19 Thread Jonathan Yong via Gcc-patches

On 1/11/23 09:56, Jonathan Yong wrote:

Are the patches and changelogs OK?


Ping1.



Re: [PATCH 2/2] aarch64: add -fno-stack-protector to some tests [PR108411]

2023-01-19 Thread Richard Sandiford via Gcc-patches
Christophe Lyon  writes:
> As discussed in the PR, these recently added tests fail when the
> testsuite is executed with -fstack-protector-strong.  To avoid this,
> this patch adds -fno-stack-protector to dg-options.
>
>   PR target/108411
>   gcc/testsuite
>   * g++.target/aarch64/bitfield-abi-warning-align16-O2-extra.C: Add
>   -fno-stack-protector.
>   * g++.target/aarch64/bitfield-abi-warning-align16-O2.C: Likewise.
>   * g++.target/aarch64/bitfield-abi-warning-align32-O2-extra.C: Likewise.
>   * g++.target/aarch64/bitfield-abi-warning-align32-O2.C: Likewise.
>   * g++.target/aarch64/bitfield-abi-warning-align8-O2.C: Likewise.
>   * gcc.target/aarch64/bitfield-abi-warning-align16-O2-extra.c: Likewise.
>   * gcc.target/aarch64/bitfield-abi-warning-align16-O2.c: Likewise.
>   * gcc.target/aarch64/bitfield-abi-warning-align32-O2-extra.c: Likewise.
>   * gcc.target/aarch64/bitfield-abi-warning-align32-O2.c: Likewise.
>   * gcc.target/aarch64/bitfield-abi-warning-align8-O2.c: Likewise.

OK, thanks.

Richard

> ---
>  .../g++.target/aarch64/bitfield-abi-warning-align16-O2-extra.C  | 2 +-
>  .../g++.target/aarch64/bitfield-abi-warning-align16-O2.C| 2 +-
>  .../g++.target/aarch64/bitfield-abi-warning-align32-O2-extra.C  | 2 +-
>  .../g++.target/aarch64/bitfield-abi-warning-align32-O2.C| 2 +-
>  .../g++.target/aarch64/bitfield-abi-warning-align8-O2.C | 2 +-
>  .../gcc.target/aarch64/bitfield-abi-warning-align16-O2-extra.c  | 2 +-
>  .../gcc.target/aarch64/bitfield-abi-warning-align16-O2.c| 2 +-
>  .../gcc.target/aarch64/bitfield-abi-warning-align32-O2-extra.c  | 2 +-
>  .../gcc.target/aarch64/bitfield-abi-warning-align32-O2.c| 2 +-
>  .../gcc.target/aarch64/bitfield-abi-warning-align8-O2.c | 2 +-
>  10 files changed, 10 insertions(+), 10 deletions(-)
>
> diff --git 
> a/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align16-O2-extra.C 
> b/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align16-O2-extra.C
> index 443cd458b4c..52f9cdd1ee9 100644
> --- a/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align16-O2-extra.C
> +++ b/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align16-O2-extra.C
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -save-temps -Wno-narrowing" } */
> +/* { dg-options "-O2 -fno-stack-protector -save-temps -Wno-narrowing" } */
>  
>  #define ALIGN 16
>  //#define EXTRA
> diff --git 
> a/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align16-O2.C 
> b/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align16-O2.C
> index 76a7e3d0ad4..9ff4e46645b 100644
> --- a/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align16-O2.C
> +++ b/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align16-O2.C
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -save-temps -Wno-narrowing" } */
> +/* { dg-options "-O2 -fno-stack-protector -save-temps -Wno-narrowing" } */
>  
>  #define ALIGN 16
>  #define EXTRA
> diff --git 
> a/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align32-O2-extra.C 
> b/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align32-O2-extra.C
> index 6f8f54f41ff..55dcbfe4b7c 100644
> --- a/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align32-O2-extra.C
> +++ b/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align32-O2-extra.C
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -save-temps -Wno-narrowing" } */
> +/* { dg-options "-O2 -fno-stack-protector -save-temps -Wno-narrowing" } */
>  
>  #define ALIGN 32
>  //#define EXTRA
> diff --git 
> a/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align32-O2.C 
> b/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align32-O2.C
> index 6b8ad5fbea1..6bb8778ee90 100644
> --- a/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align32-O2.C
> +++ b/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align32-O2.C
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -save-temps -Wno-narrowing" } */
> +/* { dg-options "-O2 -fno-stack-protector -save-temps -Wno-narrowing" } */
>  
>  #define ALIGN 32
>  #define EXTRA
> diff --git 
> a/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align8-O2.C 
> b/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align8-O2.C
> index b1764d97ea0..41bcc894a2b 100644
> --- a/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align8-O2.C
> +++ b/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align8-O2.C
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -save-temps -Wno-narrowing" } */
> +/* { dg-options "-O2 -fno-stack-protector -save-temps -Wno-narrowing" } */
>  
>  #define ALIGN 8
>  #define EXTRA
> diff --git 
> a/gcc/testsuite/gcc.target/aarch64/bitfield-abi-warning-align16-O2-extra.c 
> b/gcc/testsuite/gcc.target/aarch64/bitfield-abi-warning-align16-O2-extra.c
> index f248a129509..3b2c932ac23 100644
> --- 

Re: [PATCH 1/2] aarch64: fix ICE in aarch64_layout_arg [PR108411]

2023-01-19 Thread Richard Sandiford via Gcc-patches
Christophe Lyon  writes:
> The previous patch added an assert which should not be applied to PST
> types (Pure Scalable Types) because alignment does not matter in this
> case.  This patch moves the assert after the PST case is handled to
> avoid the ICE.
>
>   PR target/108411
>   gcc/
>   * config/aarch64/aarch64.cc (aarch64_layout_arg): Improve
>   comment. Move assert about alignment a bit later.
> ---
>  gcc/config/aarch64/aarch64.cc | 28 +---
>  1 file changed, 21 insertions(+), 7 deletions(-)
>
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index d36b57341b3..7175b453b3a 100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -7659,7 +7659,18 @@ aarch64_layout_arg (cumulative_args_t pcum_v, const 
> function_arg_info )
> && (currently_expanding_function_start
>  || currently_expanding_gimple_stmt));
>  
> -  /* There are several things to note here:
> +  /* HFAs and HVAs can have an alignment greater than 16 bytes.  For example:
> +
> +   typedef struct foo {
> + __Int8x16_t foo[2] __attribute__((aligned(32)));
> +   } foo;
> +
> + is still a HVA despite its larger-than-normal alignment.
> + However, such over-aligned HFAs and HVAs are guaranteed to have
> + no padding.
> +
> + If we exclude HFAs and HVAs from the discussion below, then there
> + are several things to note:
>  
>   - Both the C and AAPCS64 interpretations of a type's alignment should
> give a value that is no greater than the type's size.
> @@ -7704,12 +7715,6 @@ aarch64_layout_arg (cumulative_args_t pcum_v, const 
> function_arg_info )
> would treat the alignment as though it was *equal to* 16 bytes.
>  
>   Both behaviors were wrong, but in different cases.  */
> -  unsigned int alignment
> -= aarch64_function_arg_alignment (mode, type, &abi_break,
> -   &abi_break_packed);
> -  gcc_assert (alignment <= 16 * BITS_PER_UNIT
> -   && (!alignment || abi_break < alignment)
> -   && (!abi_break_packed || alignment < abi_break_packed));
>  
>pcum->aapcs_arg_processed = true;
>  
> @@ -7780,6 +7785,15 @@ aarch64_layout_arg (cumulative_args_t pcum_v, const 
> function_arg_info )
>);
>gcc_assert (!sve_p || !allocate_nvrn);
>  
> +  unsigned int alignment
> += aarch64_function_arg_alignment (mode, type, &abi_break,
> +   &abi_break_packed);
> +
> +  gcc_assert (allocate_nvrn || (alignment <= 16 * BITS_PER_UNIT
> + && (!alignment || abi_break < alignment)
> + && (!abi_break_packed
> + || alignment < abi_break_packed)));

I think allocate_nvrn should only circumvent the first part, so:

  gcc_assert ((allocate_nvrn || alignment <= 16 * BITS_PER_UNIT)
  && (!alignment || abi_break < alignment)
  && (!abi_break_packed || alignment < abi_break_packed));
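
The difference between the two assert forms can be seen in a small
standalone sketch.  The helper names posted_check and suggested_check and
the local BITS_PER_UNIT constant are hypothetical stand-ins, not GCC code;
they only mirror the boolean structure of the two conditions.

```cpp
#include <cassert>

// Hypothetical stand-in for GCC's BITS_PER_UNIT (assumed to be 8 here).
constexpr unsigned BITS_PER_UNIT = 8;

// Condition as posted: allocate_nvrn short-circuits every check.
bool posted_check (bool allocate_nvrn, unsigned alignment,
                   unsigned abi_break, unsigned abi_break_packed)
{
  return allocate_nvrn
         || (alignment <= 16 * BITS_PER_UNIT
             && (!alignment || abi_break < alignment)
             && (!abi_break_packed || alignment < abi_break_packed));
}

// Condition as suggested: allocate_nvrn only waives the 16-byte limit;
// the abi_break consistency checks still apply.
bool suggested_check (bool allocate_nvrn, unsigned alignment,
                      unsigned abi_break, unsigned abi_break_packed)
{
  return (allocate_nvrn || alignment <= 16 * BITS_PER_UNIT)
         && (!alignment || abi_break < alignment)
         && (!abi_break_packed || alignment < abi_break_packed);
}
```

With allocate_nvrn true and an inconsistent abi_break (equal to the
alignment), posted_check accepts the input while suggested_check rejects it.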


OK with that change, and sorry for not thinking about this originally.

Richard

> +
>/* allocate_ncrn may be false-positive, but allocate_nvrn is quite 
> reliable.
>   The following code thus handles passing by SIMD/FP registers first.  */


Re: [PATCH 5/8 v2] middle-end: Add cltz_complement idiom recognition

2023-01-19 Thread Jan-Benedict Glaw
On Thu, 2022-12-22 17:42:16 +, Andrew Carlotti via Gcc-patches 
 wrote:
> New patch below, bootstrapped and regression tested on
> aarch64-unknown-linux-gnu and x86_64-pc-linux-gnu - ok to merge?

> diff --git a/gcc/tree-ssa-loop-niter.cc b/gcc/tree-ssa-loop-niter.cc
> index 
> fece876099c1687569d6351e7d2416ea6acae5b5..ce2441f2a6dbdf2d8fe42755d5d1abd8a631bb5c
>  100644
> --- a/gcc/tree-ssa-loop-niter.cc
> +++ b/gcc/tree-ssa-loop-niter.cc
> @@ -42,6 +42,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "tree-chrec.h"
>  #include "tree-scalar-evolution.h"
>  #include "tree-dfa.h"
> +#include "internal-fn.h"
>  #include "gimple-range.h"
>  
>  
> @@ -2198,6 +2199,224 @@ number_of_iterations_popcount (loop_p loop, edge exit,
>return true;
>  }
>  
> +/* Return an expression that counts the leading/trailing zeroes of src.
> +
> +   If define_at_zero is true, then the built expression will be defined to
> +   return the precision of src when src == 0 (using either a conditional
> +   expression or a suitable internal function).
> +   Otherwise, we can elide the conditional expression and let src = 0 invoke
> +   undefined behaviour.  */
> +
> +static tree
> +build_cltz_expr (tree src, bool leading, bool define_at_zero)
> +{
[...]
> +
> +  tree call;
> +  if (use_ifn)
> +{
> +  call = build_call_expr_internal_loc (UNKNOWN_LOCATION, ifn,
> +integer_type_node, 1, src);
> +  int val;
> +  scalar_int_mode mode = SCALAR_INT_TYPE_MODE (utype);
 

This will give us a new unused variable warning.

> +  int optab_defined_at_zero
> + = leading ? CLZ_DEFINED_VALUE_AT_ZERO (mode, val)
> +   : CTZ_DEFINED_VALUE_AT_ZERO (mode, val);
> +  if (define_at_zero && !(optab_defined_at_zero == 2 && val == prec))
> + {
> +   tree is_zero = fold_build2 (NE_EXPR, boolean_type_node, src,
> +   build_zero_cst (TREE_TYPE (src)));
> +   call = fold_build3(COND_EXPR, integer_type_node, is_zero, call,
> +  build_int_cst (integer_type_node, prec));
> + }
> +}

MfG, JBG



[PATCH] c++: Fix up handling of non-dependent subscript with static operator[] [PR108437]

2023-01-19 Thread Jakub Jelinek via Gcc-patches
Hi!

As the following testcases show, when adding static operator[]
support I missed that the two build_min_non_dep_op_overload functions
need to be adjusted.  The first one is only used for the single-index
case, but as cp_tree_code_length (ARRAY_REF) is 2, we were running
into an assertion there which compares nargs and expected_nargs.
For ARRAY_REF, operator[] is either a non-static member or, newly,
a static member, never an out-of-class function.  In the static member
case with a single index, operator[] needs to have a single
argument as well, but the function is called with 2: the object
it is invoked on and the index.  We need to evaluate the side-effects
of the object and pass just a single argument in the call, the index.
The other build_min_non_dep_op_overload overload was added
solely for ARRAY_REF; CALL_EXPR is the other operator that accepts
a variable number of operands, but that one goes through different
routines.  There we asserted that the overload is a METHOD_TYPE, so again
we shouldn't assert that but handle the case when it is not one,
by making sure the object's side-effects are evaluated if needed and
passing all the index arguments to the static operator[].
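
The "evaluate the object's side-effects, but pass only the index" rule can
be illustrated without C++23 by calling an ordinary static member function
through an object expression, which follows the same language rule.  This
is only an analogy under that assumption; Table, lookup, get_table, use and
evaluations are hypothetical names, not part of the patch or its testcases.

```cpp
#include <cassert>

// Counts how often the object expression is evaluated.
static int evaluations = 0;

struct Table
{
  // A static member needs no object.  In C++23 this could be spelled
  // `static int operator[] (int i)` and called as `get_table ()[21]`.
  static int lookup (int i) { return i * 2; }
};

// Object expression with a visible side-effect.
Table &get_table ()
{
  static Table t;
  ++evaluations;
  return t;
}

template <typename T>
int use (T)
{
  // get_table () does not depend on T, so the call is non-dependent.
  // The object expression is still evaluated, even though the static
  // member only receives the index argument.
  return get_table ().lookup (21);
}
```

Calling use (0) returns 42 and increments evaluations once, which is the
behaviour the static operator[] call has to preserve.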

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-01-19  Jakub Jelinek  

PR c++/108437
* cp-tree.h (keep_unused_object_arg): Declare.
* call.cc (keep_unused_object_arg): No longer static.
* tree.cc (build_min_non_dep_op_overload): Handle ARRAY_REF
with overload being static member function.

* g++.dg/cpp23/subscript12.C: New test.
* g++.dg/cpp23/subscript13.C: New test.

--- gcc/cp/cp-tree.h.jj 2023-01-16 11:52:16.067734300 +0100
+++ gcc/cp/cp-tree.h 2023-01-18 16:47:55.258697839 +0100
@@ -6599,6 +6599,7 @@ inline tree build_new_op (const op_locat
   return build_new_op (loc, code, flags, arg1, arg2, NULL_TREE, NULL_TREE,
   NULL, complain);
 }
+extern tree keep_unused_object_arg (tree, tree, tree);
 extern tree build_op_call  (tree, vec<tree, va_gc> **,
 tsubst_flags_t);
 extern tree build_op_subscript (const op_location_t &, tree,
--- gcc/cp/call.cc.jj   2023-01-16 11:52:16.059734418 +0100
+++ gcc/cp/call.cc  2023-01-18 16:48:02.966586328 +0100
@@ -5187,7 +5187,7 @@ build_operator_new_call (tree fnname, ve
or static operator(), in which cases the source expression
would be `obj[...]' or `obj(...)'.  */
 
-static tree
+tree
 keep_unused_object_arg (tree result, tree obj, tree fn)
 {
   if (result == NULL_TREE
--- gcc/cp/tree.cc.jj   2023-01-16 11:52:16.093733917 +0100
+++ gcc/cp/tree.cc  2023-01-18 17:01:08.937242864 +0100
@@ -3693,14 +3693,14 @@ build_min_non_dep_op_overload (enum tree
 {
   va_list p;
   int nargs, expected_nargs;
-  tree fn, call;
+  tree fn, call, obj = NULL_TREE;
 
   non_dep = extract_call_expr (non_dep);
 
   nargs = call_expr_nargs (non_dep);
 
   expected_nargs = cp_tree_code_length (op);
-  if (TREE_CODE (TREE_TYPE (overload)) == METHOD_TYPE)
+  if (TREE_CODE (TREE_TYPE (overload)) == METHOD_TYPE || op == ARRAY_REF)
 expected_nargs -= 1;
   if ((op == POSTINCREMENT_EXPR
|| op == POSTDECREMENT_EXPR)
@@ -3715,6 +3715,8 @@ build_min_non_dep_op_overload (enum tree
   if (TREE_CODE (TREE_TYPE (overload)) == FUNCTION_TYPE)
 {
   fn = overload;
+  if (op == ARRAY_REF)
+   obj = va_arg (p, tree);
   for (int i = 0; i < nargs; i++)
{
  tree arg = va_arg (p, tree);
@@ -3746,6 +3748,8 @@ build_min_non_dep_op_overload (enum tree
   CALL_EXPR_ORDERED_ARGS (call_expr) = CALL_EXPR_ORDERED_ARGS (non_dep);
   CALL_EXPR_REVERSE_ARGS (call_expr) = CALL_EXPR_REVERSE_ARGS (non_dep);
 
+  if (obj)
+return keep_unused_object_arg (call, obj, overload);
   return call;
 }
 
@@ -3759,11 +3763,15 @@ build_min_non_dep_op_overload (tree non_
   non_dep = extract_call_expr (non_dep);
 
   unsigned int nargs = call_expr_nargs (non_dep);
-  gcc_assert (TREE_CODE (TREE_TYPE (overload)) == METHOD_TYPE);
-  tree binfo = TYPE_BINFO (TREE_TYPE (object));
-  tree method = build_baselink (binfo, binfo, overload, NULL_TREE);
-  tree fn = build_min (COMPONENT_REF, TREE_TYPE (overload),
-  object, method, NULL_TREE);
+  tree fn = overload;
+  if (TREE_CODE (TREE_TYPE (overload)) == METHOD_TYPE)
+{
+  tree binfo = TYPE_BINFO (TREE_TYPE (object));
+  tree method = build_baselink (binfo, binfo, overload, NULL_TREE);
+  fn = build_min (COMPONENT_REF, TREE_TYPE (overload),
+ object, method, NULL_TREE);
+  object = NULL_TREE;
+}
   gcc_assert (vec_safe_length (args) == nargs);
 
   tree call = build_min_non_dep_call_vec (non_dep, fn, args);
@@ -3774,6 +3782,8 @@ build_min_non_dep_op_overload (tree non_
   CALL_EXPR_ORDERED_ARGS (call_expr) = CALL_EXPR_ORDERED_ARGS (non_dep);
   CALL_EXPR_REVERSE_ARGS (call_expr) = CALL_EXPR_REVERSE_ARGS (non_dep);
 
+ 

Re: [PATCH] forwprop: Further fixes for simplify_rotate [PR108440]

2023-01-19 Thread Richard Biener via Gcc-patches
On Thu, 19 Jan 2023, Jakub Jelinek wrote:

> Hi!
> 
> As mentioned in the simplify_rotate comment, for e.g.
>((T) ((T2) X << (Y & (B - 1)))) | ((T) ((T2) X >> ((-Y) & (B - 1))))
> we already emit
>X r<< (Y & (B - 1))
> as replacement.  This PR is about the
>((T) ((T2) X << Y)) OP ((T) ((T2) X >> (B - Y)))
>((T) ((T2) X << (int) Y)) OP ((T) ((T2) X >> (int) (B - Y)))
> forms if T2 is wider than T.  Unlike e.g.
>(X << Y) OP (X >> (B - Y))
> which is valid just for Y in [1, B - 1], the above 2 forms are actually
> valid and do the rotates for Y in [0, B]: for Y == 0 the X value is preserved
> by the left shift and the right logical shift by B adds just zeros (because
> the shift is done in the wider precision, B is still a valid shift count),
> while for Y equal to B, X is preserved through the latter shift and the
> former adds just zeros.
> Now, it is unclear whether we in the middle-end treat rotates with a rotate
> count equal to or larger than the precision as UB or not; unlike shifts,
> there are fewer reasons to do so, but e.g. the expansion of X r<< Y, if
> there is no rotate optab for the mode, is emitted as
> (X << Y) | (((unsigned) X) >> ((-Y) & (B - 1))) and so has UB on Y == B.
> 
> The following patch does multiple things:
> 1) for the above 2, asks the ranger if Y could be equal to B and if so,
>instead of using X r<< Y uses X r<< (Y & (B - 1))
> 2) for the
>((T) ((T2) X << Y)) | ((T) ((T2) X >> ((-Y) & (B - 1))))
>((T) ((T2) X << (int) Y)) | ((T) ((T2) X >> (int) ((-Y) & (B - 1))))
>forms that were fixed 2 days ago it only punts if Y might be in the
>[B,B2-1] range but isn't known to be in the
>[0,B][2*B,2*B][3*B,3*B]... range.  Because for Y which is a multiple
>of B but smaller than B2 it acts as a rotate too, left shift provides
>0 and (-Y) & (B - 1) is 0 and so preserves X.  Though, for the cases
>where Y is not known to be in [0,B-1] the patch also uses
>X r<< (Y & (B - 1)) rather than X r<< Y
> 3) as discussed with Aldy, instead of using global ranger it uses a pass
>specific copy but lazily created on first simplify_rotate that needs it;
>this e.g. handles rotate inside of if body where the guarding condition
>limits the shift count to some range which will not work with the
>global ranger (unless there is some SSA_NAME to attach the range to).
> 
> Note, e.g. on x86 X r<< (Y & (B - 1)) and X r<< Y actually emit the
> same assembly because rotates work the same even for larger rotate counts,
> but that is handled only during combine.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks
Richard.

> 2023-01-19  Jakub Jelinek  
> 
>   PR tree-optimization/108440
>   * tree-ssa-forwprop.cc: Include gimple-range.h.
>   (simplify_rotate): For the forms with T2 wider than T and shift counts 
> of
>   Y and B - Y add & (B - 1) masking for the rotate count if Y could be 
> equal
>   to B.  For the forms with T2 wider than T and shift counts of
>   Y and (-Y) & (B - 1), don't punt if range could be [B, B2], but only if
>   range doesn't guarantee Y < B or Y = N * B.  If range doesn't guarantee
>   Y < B, also add & (B - 1) masking for the rotate count.  Use lazily 
> created
>   pass specific ranger instead of get_global_range_query.
>   (pass_forwprop::execute): Disable that ranger at the end of pass if it 
> has
>   been created.
> 
>   * c-c++-common/rotate-10.c: New test.
>   * c-c++-common/rotate-11.c: New test.
> 
> --- gcc/tree-ssa-forwprop.cc.jj   2023-01-17 12:14:15.845088330 +0100
> +++ gcc/tree-ssa-forwprop.cc  2023-01-18 13:30:59.337914945 +0100
> @@ -52,6 +52,7 @@ along with GCC; see the file COPYING3.
>  #include "internal-fn.h"
>  #include "cgraph.h"
>  #include "tree-ssa.h"
> +#include "gimple-range.h"
>  
>  /* This pass propagates the RHS of assignment statements into use
> sites of the LHS of the assignment.  It's basically a specialized
> @@ -1837,8 +1838,12 @@ defcodefor_name (tree name, enum tree_co
> ((T) ((T2) X << Y)) | ((T) ((T2) X >> ((-Y) & (B - 1))))
> ((T) ((T2) X << (int) Y)) | ((T) ((T2) X >> (int) ((-Y) & (B - 1))))
>  
> -   transform these into (last 2 only if ranger can prove Y < B):
> +   transform these into (last 2 only if ranger can prove Y < B
> +   or Y = N * B):
> X r<< Y
> +   or
> +   X r<< (Y & (B - 1))
> +   The latter for the forms with T2 wider than T if ranger can't prove Y < B.
>  
> Or for:
> (X << (Y & (B - 1))) | (X >> ((-Y) & (B - 1)))
> @@ -1868,6 +1873,7 @@ simplify_rotate (gimple_stmt_iterator *g
>gimple *g;
>gimple *def_arg_stmt[2] = { NULL, NULL };
>int wider_prec = 0;
> +  bool add_masking = false;
>  
>arg[0] = gimple_assign_rhs1 (stmt);
>arg[1] = gimple_assign_rhs2 (stmt);
> @@ -1995,7 +2001,7 @@ simplify_rotate (gimple_stmt_iterator *g
>tree cdef_arg1[2], cdef_arg2[2], def_arg2_alt[2];
>enum tree_code cdef_code[2];
>gimple 

[PATCH] forwprop: Further fixes for simplify_rotate [PR108440]

2023-01-19 Thread Jakub Jelinek via Gcc-patches
Hi!

As mentioned in the simplify_rotate comment, for e.g.
   ((T) ((T2) X << (Y & (B - 1)))) | ((T) ((T2) X >> ((-Y) & (B - 1))))
we already emit
   X r<< (Y & (B - 1))
as replacement.  This PR is about the
   ((T) ((T2) X << Y)) OP ((T) ((T2) X >> (B - Y)))
   ((T) ((T2) X << (int) Y)) OP ((T) ((T2) X >> (int) (B - Y)))
forms if T2 is wider than T.  Unlike e.g.
   (X << Y) OP (X >> (B - Y))
which is valid just for Y in [1, B - 1], the above 2 forms are actually
valid and do the rotates for Y in [0, B]: for Y == 0 the X value is preserved
by the left shift and the right logical shift by B adds just zeros (because
the shift is done in the wider precision, B is still a valid shift count),
while for Y equal to B, X is preserved through the latter shift and the
former adds just zeros.
Now, it is unclear whether we in the middle-end treat rotates with a rotate
count equal to or larger than the precision as UB or not; unlike shifts,
there are fewer reasons to do so, but e.g. the expansion of X r<< Y, if
there is no rotate optab for the mode, is emitted as
(X << Y) | (((unsigned) X) >> ((-Y) & (B - 1))) and so has UB on Y == B.
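
As a rough illustration of why the wider-type forms are valid for the whole
[0, B] range, here is a minimal sketch assuming T is uint32_t, T2 is
uint64_t and B is 32.  The function names rotl_wide and rotl_masked are
made up for this example and do not come from the patch or its testcases.

```cpp
#include <cassert>
#include <cstdint>

// ((T) ((T2) X << Y)) | ((T) ((T2) X >> (B - Y))) with T = uint32_t,
// T2 = uint64_t, B = 32.  Both shifts happen in 64 bits, so shift
// counts 0 and 32 are valid; y must be in [0, 32].
uint32_t rotl_wide (uint32_t x, unsigned y)
{
  return (uint32_t) ((uint64_t) x << y)
         | (uint32_t) ((uint64_t) x >> (32 - y));
}

// Reference rotate with the count masked to [0, 31], i.e. the
// X r<< (Y & (B - 1)) replacement written out with shifts.
uint32_t rotl_masked (uint32_t x, unsigned y)
{
  y &= 31;
  return (x << y) | (x >> ((-y) & 31));
}
```

For y == 0 and y == 32 rotl_wide returns x unchanged, matching a rotate by
y & 31, which is why the masked rotate is a safe replacement when Y may
equal B.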

The following patch does multiple things:
1) for the above 2, asks the ranger if Y could be equal to B and if so,
   instead of using X r<< Y uses X r<< (Y & (B - 1))
2) for the
   ((T) ((T2) X << Y)) | ((T) ((T2) X >> ((-Y) & (B - 1))))
   ((T) ((T2) X << (int) Y)) | ((T) ((T2) X >> (int) ((-Y) & (B - 1))))
   forms that were fixed 2 days ago it only punts if Y might be in the
   [B,B2-1] range but isn't known to be in the
   [0,B][2*B,2*B][3*B,3*B]... range.  Because for Y which is a multiple
   of B but smaller than B2 it acts as a rotate too, left shift provides
   0 and (-Y) & (B - 1) is 0 and so preserves X.  Though, for the cases
   where Y is not known to be in [0,B-1] the patch also uses
   X r<< (Y & (B - 1)) rather than X r<< Y
3) as discussed with Aldy, instead of using global ranger it uses a pass
   specific copy but lazily created on first simplify_rotate that needs it;
   this e.g. handles rotate inside of if body where the guarding condition
   limits the shift count to some range which will not work with the
   global ranger (unless there is some SSA_NAME to attach the range to).
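
Point 2 can be checked with a small sketch, assuming T is uint8_t, T2 is
uint32_t, B is 8 and B2 is 32.  The names wide_neg_form and rotl8 are
hypothetical and only serve this illustration.

```cpp
#include <cassert>
#include <cstdint>

// ((T) ((T2) X << Y)) | ((T) ((T2) X >> ((-Y) & (B - 1)))) with
// T = uint8_t, T2 = uint32_t, B = 8.  y must be below B2 = 32.
uint8_t wide_neg_form (uint8_t x, unsigned y)
{
  return (uint8_t) ((uint8_t) ((uint32_t) x << y)
                    | (uint8_t) ((uint32_t) x >> ((-y) & 7)));
}

// Reference 8-bit rotate left by y & (B - 1).
uint8_t rotl8 (uint8_t x, unsigned y)
{
  y &= 7;
  return (uint8_t) ((x << y) | (x >> ((8 - y) & 7)));
}
```

For y in [0, 7] and for y equal to 8, 16 or 24 the two functions agree.
For other counts such as y == 9 the left shift contributes only zeros and
the result is no longer a rotate, which is the case where the pattern must
still punt.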

Note, e.g. on x86 X r<< (Y & (B - 1)) and X r<< Y actually emit the
same assembly because rotates work the same even for larger rotate counts,
but that is handled only during combine.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-01-19  Jakub Jelinek  

PR tree-optimization/108440
* tree-ssa-forwprop.cc: Include gimple-range.h.
(simplify_rotate): For the forms with T2 wider than T and shift counts 
of
Y and B - Y add & (B - 1) masking for the rotate count if Y could be 
equal
to B.  For the forms with T2 wider than T and shift counts of
Y and (-Y) & (B - 1), don't punt if range could be [B, B2], but only if
range doesn't guarantee Y < B or Y = N * B.  If range doesn't guarantee
Y < B, also add & (B - 1) masking for the rotate count.  Use lazily 
created
pass specific ranger instead of get_global_range_query.
(pass_forwprop::execute): Disable that ranger at the end of pass if it 
has
been created.

* c-c++-common/rotate-10.c: New test.
* c-c++-common/rotate-11.c: New test.

--- gcc/tree-ssa-forwprop.cc.jj 2023-01-17 12:14:15.845088330 +0100
+++ gcc/tree-ssa-forwprop.cc 2023-01-18 13:30:59.337914945 +0100
@@ -52,6 +52,7 @@ along with GCC; see the file COPYING3.
 #include "internal-fn.h"
 #include "cgraph.h"
 #include "tree-ssa.h"
+#include "gimple-range.h"
 
 /* This pass propagates the RHS of assignment statements into use
sites of the LHS of the assignment.  It's basically a specialized
@@ -1837,8 +1838,12 @@ defcodefor_name (tree name, enum tree_co
((T) ((T2) X << Y)) | ((T) ((T2) X >> ((-Y) & (B - 1))))
((T) ((T2) X << (int) Y)) | ((T) ((T2) X >> (int) ((-Y) & (B - 1))))
 
-   transform these into (last 2 only if ranger can prove Y < B):
+   transform these into (last 2 only if ranger can prove Y < B
+   or Y = N * B):
X r<< Y
+   or
+   X r<< (Y & (B - 1))
+   The latter for the forms with T2 wider than T if ranger can't prove Y < B.
 
Or for:
(X << (Y & (B - 1))) | (X >> ((-Y) & (B - 1)))
@@ -1868,6 +1873,7 @@ simplify_rotate (gimple_stmt_iterator *g
   gimple *g;
   gimple *def_arg_stmt[2] = { NULL, NULL };
   int wider_prec = 0;
+  bool add_masking = false;
 
   arg[0] = gimple_assign_rhs1 (stmt);
   arg[1] = gimple_assign_rhs2 (stmt);
@@ -1995,7 +2001,7 @@ simplify_rotate (gimple_stmt_iterator *g
   tree cdef_arg1[2], cdef_arg2[2], def_arg2_alt[2];
   enum tree_code cdef_code[2];
   gimple *def_arg_alt_stmt[2] = { NULL, NULL };
-  bool check_range = false;
+  int check_range = 0;
   gimple *check_range_stmt = NULL;
   /* Look through conversion of the shift count argument.
 The C/C++ FE cast any shift count argument to