Re: [PATCH] c++/modules: Stream unmergeable temporaries by value again [PR114856]

2024-05-06 Thread Nathaniel Shead
On Thu, May 02, 2024 at 01:53:44PM -0400, Jason Merrill wrote:
> On 5/2/24 10:40, Patrick Palka wrote:
> > On Thu, 2 May 2024, Nathaniel Shead wrote:
> > 
> > > Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk/14.2?
> > > 
> > > Another alternative would be to stream such !DECL_NAME temporaries with
> > > a merge key of MK_unique rather than attempting to find the matching
> > > (nonexistent) field of the class context.
> > 
> > Both approaches sound good to me, hard to say which one is preferable..
> > 
> > The handling of function-scope vs class-scope temporaries seems to start
> > diverging in:
> > 
> > @@ -8861,28 +8861,6 @@ trees_out::decl_node (tree decl, walk_kind ref)
> >         return false;
> >       }
> > !  tree ctx = CP_DECL_CONTEXT (decl);
> > !  depset *dep = NULL;
> > !  if (streaming_p ())
> > !    dep = dep_hash->find_dependency (decl);
> > !  else if (TREE_CODE (ctx) != FUNCTION_DECL
> > !           || TREE_CODE (decl) == TEMPLATE_DECL
> > !           || DECL_IMPLICIT_TYPEDEF_P (decl)
> > !           || (DECL_LANG_SPECIFIC (decl)
> > !               && DECL_MODULE_IMPORT_P (decl)))
> > !    {
> > !      auto kind = (TREE_CODE (decl) == NAMESPACE_DECL
> > !                   && !DECL_NAMESPACE_ALIAS (decl)
> > !                   ? depset::EK_NAMESPACE : depset::EK_DECL);
> > !      dep = dep_hash->add_dependency (decl, kind);
> > !    }
> > !
> > !  if (!dep)
> > !    {
> > !      /* Some internal entity of context.  Do by value.  */
> > !      decl_value (decl, NULL);
> > !      return false;
> > !    }
> >    if (dep->get_entity_kind () == depset::EK_REDIRECT)
> >      {
> > 
> > where for a class-scope temporary we add a dependency for it, stream
> > it by reference, and then stream it by value separately, which seems
> > unnecessary.
> > 
> > So if we decide to keep the create_temporary_var change, we probably
> > would want to unify this code path's handling of temporaries (i.e.
> > don't add_dependency a temporary regardless of its context).
> > 
> > If we decide to partially revert the create_temporary_var change,
> > your patch LGTM.
> 
> Streaming by value sounds right, but as noted an important difference
> between reference temps and others is DECL_NAME.  Perhaps the code Patrick
> quotes could look at that as well as the context?
> 
> Jason
> 

As far as I can see, with my patch class-scope temporaries would no longer
go through the code that Patrick quotes; we would instead first hit the
following code in 'tree_node':


  if (DECL_P (t))
    {
      if (DECL_TEMPLATE_PARM_P (t))
        {
          tpl_parm_value (t);
          goto done;
        }

      if (!DECL_CONTEXT (t))
        {
          /* There are a few cases of decls with no context.  We'll write
             these by value, but first assert they are cases we expect.  */
          gcc_checking_assert (ref == WK_normal);
          switch (TREE_CODE (t))
            {
            default: gcc_unreachable ();

            case LABEL_DECL:
              /* CASE_LABEL_EXPRs contain uncontexted LABEL_DECLs.  */
              gcc_checking_assert (!DECL_NAME (t));
              break;

            case VAR_DECL:
              /* AGGR_INIT_EXPRs cons up anonymous uncontexted VAR_DECLs.  */
              gcc_checking_assert (!DECL_NAME (t)
                                   && DECL_ARTIFICIAL (t));
              break;

            case PARM_DECL:
              /* REQUIRES_EXPRs have a tree list of uncontexted
                 PARM_DECLS.  It'd be nice if they had a
                 distinguishing flag to double check.  */
              break;
            }
          goto by_value;
        }
    }

 skip_normal:
  if (DECL_P (t) && !decl_node (t, ref))
    goto done;

  /* Otherwise by value */
 by_value:
  tree_value (t);


I think modifying what Patrick pointed out should only be necessary if
we maintain these nameless temporaries as having a class context; for
clarity, is that the direction you'd prefer me to go in to solve this?

Thanks,
Nathaniel


Re: Re: [PATCH 1/1] RISC-V: Add Zfbfmin extension to the -march= option

2024-05-06 Thread Xiao Zeng
2024-05-07 06:40  Jeff Law  wrote:
>
>On 4/11/24 9:32 PM, Xiao Zeng wrote:
>> This patch adds a new sub-extension (Zfbfmin) to the -march= option.
>> It introduces a new data type, BF16.
>>
>> 1 The Zfbfmin extension depends on 'F' and on the FLH, FSH, FMV.X.H, and
>> FMV.H.X instructions as defined in the Zfh extension.
>>
>> 2 The Zfhmin extension includes the following instructions from the
>> Zfh extension: FLH, FSH, FMV.X.H, FMV.H.X, FCVT.S.H, and FCVT.H.S.
>>
>> 3 The Zfhmin extension depends on 'F'.
>>
>> 4 Simply put, this patch just makes Zfbfmin depend on Zfhmin.
>>
>> Perhaps in the future we could propose making the FLH, FSH, FMV.X.H, and
>> FMV.H.X instructions an independent extension, so that the dependencies
>> of Zfbfmin can be expressed precisely.
>>
>> More information about Zfbfmin can be found in the spec document below.
>>
>> 
>>
>> The following tests pass with this patch:
>>  * The full riscv regression test.
>I wrote a suitable ChangeLog entry and pushed this patch to the trunk. 
Thanks, jeff

>
>Thanks,
>jeff
>
 
Thanks
Xiao Zeng
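
The dependency chain described in the quoted patch (Zfbfmin implies Zfhmin, and Zfhmin implies 'F') can be illustrated with a small transitive-closure sketch. This is only an illustration: the table `kImplied` and the function `expand_extensions` are hypothetical names and are not taken from GCC's -march= handling.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical table: each extension lists the extensions it directly implies.
// Only the two edges discussed in the thread are modelled here.
static const std::map<std::string, std::vector<std::string>> kImplied = {
    {"zfbfmin", {"zfhmin"}},
    {"zfhmin", {"f"}},
};

// Expand a requested extension set to its transitive closure.
std::set<std::string> expand_extensions(const std::set<std::string>& requested) {
  std::set<std::string> result;
  std::vector<std::string> worklist(requested.begin(), requested.end());
  while (!worklist.empty()) {
    std::string ext = worklist.back();
    worklist.pop_back();
    assert(!ext.empty());  // extension names are expected to be non-empty
    if (!result.insert(ext).second)
      continue;  // already expanded, skip to avoid repeated work
    auto it = kImplied.find(ext);
    if (it != kImplied.end())
      for (const auto& dep : it->second)
        worklist.push_back(dep);
  }
  return result;
}
```

With this model, requesting only "zfbfmin" also yields "zfhmin" and "f", which is the effect of item 4 in the quoted description.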



[RISC-V][V2] Fix incorrect if-then-else nesting of Zbs usage in constant synthesis

2024-05-06 Thread Jeff Law
Reposting without the patch that ignores whitespace.  The CI system 
doesn't like including both patches, that'll generate a failure to apply 
and none of the tests actually get run.


So I managed to goof the if-then-else level of the bseti bits last week. 
 They were supposed to be a last ditch effort to improve the result, 
but ended up inside a conditional where they don't really belong.  I 
almost always use Zba, Zbb and Zbs together, so it slipped by.


So it's NFC if you always test with Zbb and Zbs enabled together.  But 
if you enabled Zbs without Zbb you'd see a failure to use bseti.


Planning to commit once pre-commit CI passes.

jeff
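
For readers who want the shape of the "last ditch" step without reading the diff, here is a stand-alone sketch of the LUI + ADDI + BSETI attempt. It is a simplified model, not the code in riscv.cc: `Step`, `Kind`, and `try_bseti_sequence` are hypothetical names, and the two masks are my assumption about which bits the first two instructions are allowed to cover.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the UNKNOWN / IOR / PLUS codes used in the patch.
enum class Kind { Load, Ior, Plus };
struct Step { Kind kind; uint64_t value; };

constexpr uint64_t kLuiMask = 0x7ffff800;  // assumption: bits handled by LUI
constexpr uint64_t kAddiMask = 0x7ff;      // assumption: bits handled by ADDI

// Try to build VALUE in fewer than COST steps.  Returns the alternative
// sequence when it is strictly shorter, otherwise an empty vector.
std::vector<Step> try_bseti_sequence(uint64_t value, int cost) {
  assert(cost > 0);
  std::vector<Step> alt;
  // The first step always loads a constant; later steps combine into it.
  auto push = [&alt](Kind later, uint64_t bits) {
    alt.push_back({alt.empty() ? Kind::Load : later, bits});
  };
  if (value & kLuiMask) {
    push(Kind::Ior, value & kLuiMask);
    value &= ~kLuiMask;
  }
  if (value & kAddiMask) {
    push(Kind::Plus, value & kAddiMask);
    value &= ~kAddiMask;
  }
  // Any residual bits are set one at a time, as bseti would.
  while (static_cast<int>(alt.size()) < cost && value) {
    uint64_t bit = value & (~value + 1);  // isolate the lowest set bit
    push(Kind::Ior, bit);
    value &= ~bit;
  }
  if (value != 0 || static_cast<int>(alt.size()) >= cost)
    alt.clear();  // not an improvement, keep the existing sequence
  return alt;
}
```

The point of the fix is that this attempt only depends on Zbs being available and on the current cost, so it should not sit inside a branch that is taken only when another extension is enabled.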

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 6f1c67bf3f7..dddb7f8d673 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -869,50 +869,51 @@ riscv_build_integer_1 (struct riscv_integer_op codes[RISCV_MAX_INTEGER_OPS],
           codes[1].use_uw = false;
           cost = 2;
         }
-      /* Final cases, particularly focused on bseti.  */
-      else if (cost > 2 && TARGET_ZBS)
-        {
-          int i = 0;
+    }
 
-          /* First handle any bits set by LUI.  Be careful of the
-             SImode sign bit!.  */
-          if (value & 0x7ffff800)
-            {
-              alt_codes[i].code = (i == 0 ? UNKNOWN : IOR);
-              alt_codes[i].value = value & 0x7ffff800;
-              alt_codes[i].use_uw = false;
-              value &= ~0x7ffff800;
-              i++;
-            }
+  /* Final cases, particularly focused on bseti.  */
+  if (cost > 2 && TARGET_ZBS)
+    {
+      int i = 0;
 
-          /* Next, any bits we can handle with addi.  */
-          if (value & 0x7ff)
-            {
-              alt_codes[i].code = (i == 0 ? UNKNOWN : PLUS);
-              alt_codes[i].value = value & 0x7ff;
-              alt_codes[i].use_uw = false;
-              value &= ~0x7ff;
-              i++;
-            }
+      /* First handle any bits set by LUI.  Be careful of the
+         SImode sign bit!.  */
+      if (value & 0x7ffff800)
+        {
+          alt_codes[i].code = (i == 0 ? UNKNOWN : IOR);
+          alt_codes[i].value = value & 0x7ffff800;
+          alt_codes[i].use_uw = false;
+          value &= ~0x7ffff800;
+          i++;
+        }
 
-          /* And any residuals with bseti.  */
-          while (i < cost && value)
-            {
-              HOST_WIDE_INT bit = ctz_hwi (value);
-              alt_codes[i].code = (i == 0 ? UNKNOWN : IOR);
-              alt_codes[i].value = 1UL << bit;
-              alt_codes[i].use_uw = false;
-              value &= ~(1ULL << bit);
-              i++;
-            }
+      /* Next, any bits we can handle with addi.  */
+      if (value & 0x7ff)
+        {
+          alt_codes[i].code = (i == 0 ? UNKNOWN : PLUS);
+          alt_codes[i].value = value & 0x7ff;
+          alt_codes[i].use_uw = false;
+          value &= ~0x7ff;
+          i++;
+        }
 
-          /* If LUI+ADDI+BSETI resulted in a more efficient
-             sequence, then use it.  */
-          if (i < cost)
-            {
-              memcpy (codes, alt_codes, sizeof (alt_codes));
-              cost = i;
-            }
+      /* And any residuals with bseti.  */
+      while (i < cost && value)
+        {
+          HOST_WIDE_INT bit = ctz_hwi (value);
+          alt_codes[i].code = (i == 0 ? UNKNOWN : IOR);
+          alt_codes[i].value = 1UL << bit;
+          alt_codes[i].use_uw = false;
+          value &= ~(1ULL << bit);
+          i++;
+        }
+
+      /* If LUI+ADDI+BSETI resulted in a more efficient
+         sequence, then use it.  */
+      if (i < cost)
+        {
+          memcpy (codes, alt_codes, sizeof (alt_codes));
+          cost = i;
         }
     }
 


Re: [PATCH v5 2/5] C++: Support clang compatible [[musttail]] (PR83324)

2024-05-06 Thread Jason Merrill

On 5/5/24 14:14, Andi Kleen wrote:

This patch implements a clang compatible [[musttail]] attribute for
returns.


Thanks.


musttail is useful as an alternative to computed goto for interpreters.
With computed goto the interpreter function usually ends up very big
which causes problems with register allocation and other per function
optimizations not scaling. With musttail the interpreter can be instead
written as a sequence of smaller functions that call each other. To
avoid unbounded stack growth this requires forcing a sibling call, which
this attribute does. It guarantees an error if the call cannot be tail
called which allows the programmer to fix it instead of risking a stack
overflow. Unlike computed goto it is also type-safe.

It turns out that David Malcolm had already implemented middle/backend
support for a musttail attribute back in 2016, but it wasn't exposed
to any frontend other than a special plugin.

This patch adds a [[gnu::musttail]] attribute for C++ that can be added
to return statements. The return statement must be a direct call
(it does not follow dependencies), which is similar to what clang
implements. It then uses the existing must tail infrastructure.

For compatibility it also detects clang::musttail

One problem is that tree-tailcall usually fails when optimization
is disabled, which implies the attribute only really works with
optimization on. But that seems to be a reasonable limitation.

Passes bootstrap and full test

PR83324

gcc/cp/ChangeLog:

* cp-tree.h (finish_return_stmt): Add musttail_p.
(check_return_expr): Ditto.
* parser.cc (cp_parser_statement): Handle [[musttail]].
(cp_parser_std_attribute): Ditto.
(cp_parser_init_statement): Ditto.
(cp_parser_jump_statement): Ditto.
* semantics.cc (finish_return_stmt): Ditto.
* typeck.cc (check_return_expr): Handle musttail_p flag.
---
  gcc/cp/cp-tree.h|  4 ++--
  gcc/cp/parser.cc| 30 --
  gcc/cp/semantics.cc |  6 +++---
  gcc/cp/typeck.cc| 20 ++--
  4 files changed, 47 insertions(+), 13 deletions(-)

@@ -12734,9 +12734,27 @@ cp_parser_statement (cp_parser* parser, tree in_statement_expr,
                                      NULL_TREE, false);
           break;
 
+        case RID_RETURN:
+          {
+            bool musttail_p = false;
+            std_attrs = process_stmt_hotness_attribute (std_attrs, attrs_loc);
+            if (lookup_attribute ("gnu", "musttail", std_attrs))
+              {
+                musttail_p = true;
+                std_attrs = remove_attribute ("gnu", "musttail", std_attrs);
+              }
+            // support this for compatibility
+            if (lookup_attribute ("clang", "musttail", std_attrs))
+              {
+                musttail_p = true;
+                std_attrs = remove_attribute ("clang", "musttail", std_attrs);
+              }
+            statement = cp_parser_jump_statement (parser, musttail_p);


It seems to me that if we were to pass &std_attrs to
cp_parser_jump_statement, we could handle this entirely in that function
rather than adding a flag to finish_return_stmt and check_return_stmt.



@@ -30189,7 +30207,7 @@ cp_parser_std_attribute (cp_parser *parser, tree attr_ns)
       /* Maybe we don't expect to see any arguments for this attribute.  */
       const attribute_spec *as
         = lookup_attribute_spec (TREE_PURPOSE (attribute));
-      if (as && as->max_length == 0)
+      if ((as && as->max_length == 0) || is_attribute_p ("musttail", attr_id))


I'd prefer to add an attribute to the table, rather than special-case it 
here; apart from consistency, it seems likely that someone will later 
want to apply it to a function.


You need a template testcase; I expect it doesn't work in templates with 
the current patch.  It's probably enough to copy it in tsubst_expr where 
we currently propagate CALL_EXPR_OPERATOR_SYNTAX.


You also need a testcase where the function returns a class; in that 
case the call will often appear as AGGR_INIT_EXPR rather than CALL_EXPR, 
so you'll need to handle that as well.  And see the places that copy 
flags like CALL_EXPR_OPERATOR_SYNTAX between CALL_EXPR and AGGR_INIT_EXPR.


Jason
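
To make the interpreter use case from the patch description concrete, here is a minimal sketch of handlers that tail call each other through direct calls in return statements. The opcode set, the `Insn` layout, and the `MUSTTAIL` macro are all made up for this illustration. The macro expands to whichever attribute spelling the compiler reports via `__has_cpp_attribute`, and to nothing otherwise, in which case the calls are ordinary calls and the stack-depth guarantee is lost.

```cpp
#include <cassert>

// Use the attribute only when the compiler advertises it (assumption: the
// feature-test result reflects real support for the statement attribute).
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::musttail)
#    define MUSTTAIL [[clang::musttail]]
#  elif __has_cpp_attribute(gnu::musttail)
#    define MUSTTAIL [[gnu::musttail]]
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL
#endif

// Hypothetical toy bytecode: an accumulator machine with three opcodes.
enum class Op { Add, Mul, Halt };
struct Insn { Op op; int imm; };

static int dispatch(const Insn* pc, int acc);

// Every handler has the same signature as dispatch, so each return is a
// direct call that can be turned into a sibling call.
static int op_add(const Insn* pc, int acc) {
  MUSTTAIL return dispatch(pc + 1, acc + pc->imm);
}

static int op_mul(const Insn* pc, int acc) {
  MUSTTAIL return dispatch(pc + 1, acc * pc->imm);
}

static int dispatch(const Insn* pc, int acc) {
  assert(pc != nullptr);
  if (pc->op == Op::Add) {
    MUSTTAIL return op_add(pc, acc);
  }
  if (pc->op == Op::Mul) {
    MUSTTAIL return op_mul(pc, acc);
  }
  return acc;  // Op::Halt ends the chain of tail calls
}

int run(const Insn* program) { return dispatch(program, 0); }
```

Running the program {Add 2, Mul 5, Add 1, Halt} through `run` computes (0 + 2) * 5 + 1. The handlers return int; a variant returning a class type would exercise the AGGR_INIT_EXPR case mentioned in the review.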



[PATCHv2] Value range: Add range op for __builtin_isfinite

2024-05-06 Thread HAO CHEN GUI
Hi,
  The previous patch adds an isfinite optab for __builtin_isfinite.
https://gcc.gnu.org/pipermail/gcc-patches/2024-April/649339.html

  As a result, the builtin might not be folded in the front end, so a range
op for isfinite is needed for value range analysis. This patch adds it.

  Compared to the last version, this version fixes a typo.

  Bootstrapped and tested on x86 and powerpc64-linux BE and LE with no
regressions. Is it OK for the trunk?

Thanks
Gui Haochen

ChangeLog
Value Range: Add range op for builtin isfinite

The former patch adds optab for builtin isfinite. Thus builtin isfinite might
not be folded at front end.  So the range op for isfinite is needed for value
range analysis.  This patch adds range op for builtin isfinite.

gcc/
* gimple-range-op.cc (class cfn_isfinite): New.
(op_cfn_finite): New variables.
(gimple_range_op_handler::maybe_builtin_call): Handle
CFN_BUILT_IN_ISFINITE.

gcc/testsuite/
* gcc/testsuite/gcc.dg/tree-ssa/range-isfinite.c: New test.

patch.diff
diff --git a/gcc/gimple-range-op.cc b/gcc/gimple-range-op.cc
index 9de130b4022..99c511728d3 100644
--- a/gcc/gimple-range-op.cc
+++ b/gcc/gimple-range-op.cc
@@ -1192,6 +1192,56 @@ public:
   }
 } op_cfn_isinf;

+//Implement range operator for CFN_BUILT_IN_ISFINITE
+class cfn_isfinite : public range_operator
+{
+public:
+  using range_operator::fold_range;
+  using range_operator::op1_range;
+  virtual bool fold_range (irange &r, tree type, const frange &op1,
+                           const irange &, relation_trio) const override
+  {
+    if (op1.undefined_p ())
+      return false;
+
+    if (op1.known_isfinite ())
+      {
+        r.set_nonzero (type);
+        return true;
+      }
+
+    if (op1.known_isnan ()
+        || op1.known_isinf ())
+      {
+        r.set_zero (type);
+        return true;
+      }
+
+    return false;
+  }
+  virtual bool op1_range (frange &r, tree type, const irange &lhs,
+                          const frange &, relation_trio) const override
+  {
+    if (lhs.zero_p ())
+      {
+        // The range is [-INF,-INF][+INF,+INF] NAN, but it can't be represented.
+        // Set range to varying
+        r.set_varying (type);
+        return true;
+      }
+
+    if (!range_includes_zero_p (&lhs))
+      {
+        nan_state nan (false);
+        r.set (type, real_min_representable (type),
+               real_max_representable (type), nan);
+        return true;
+      }
+
+    return false;
+  }
+} op_cfn_isfinite;
+
 // Implement range operator for CFN_BUILT_IN_
 class cfn_parity : public range_operator
 {
@@ -1288,6 +1338,11 @@ gimple_range_op_handler::maybe_builtin_call ()
       m_operator = &op_cfn_isinf;
       break;
 
+    case CFN_BUILT_IN_ISFINITE:
+      m_op1 = gimple_call_arg (call, 0);
+      m_operator = &op_cfn_isfinite;
+      break;
+
 CASE_CFN_COPYSIGN_ALL:
   m_op1 = gimple_call_arg (call, 0);
   m_op2 = gimple_call_arg (call, 1);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/range-isfinite.c b/gcc/testsuite/gcc.dg/tree-ssa/range-isfinite.c
new file mode 100644
index 0000000..f5dce0a0486
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/range-isfinite.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-evrp" } */
+
+#include 
+void link_error();
+
+void test1 (double x)
+{
+  if (x < __DBL_MAX__ && x > -__DBL_MAX__ && !__builtin_isfinite (x))
+    link_error ();
+}
+
+void test2 (float x)
+{
+  if (x < __FLT_MAX__ && x > -__FLT_MAX__ && !__builtin_isfinite (x))
+    link_error ();
+}
+
+void test3 (double x)
+{
+  if (__builtin_isfinite (x) && __builtin_isinf (x))
+    link_error ();
+}
+
+void test4 (float x)
+{
+  if (__builtin_isfinite (x) && __builtin_isinf (x))
+    link_error ();
+}
+
+/* { dg-final { scan-tree-dump-not "link_error" "evrp" } } */


[PATCH-6, rs6000] Split setcc to two insns after reload

2024-05-06 Thread HAO CHEN GUI
Hi,
  It's the sixth patch of a series of patches optimizing CC modes on
rs6000.

  This patch splits setcc to two separate insns after reload so that
other insns can be inserted between them. It should increase the
parallelism.

  The rotate_cr pattern still needs the info of the number of cr fields
as the pass pro_and_epilogue might change the cr register.

  Bootstrapped and tested on powerpc64-linux BE and LE with no
regressions. Is it OK for the trunk?

Thanks
Gui Haochen


ChangeLog
rs6000: Split setcc to two insns after reload

This patch splits setcc to two separate insns after reload so that other
insns can be inserted between them.

gcc/
* config/rs6000/rs6000.md (c_enum unspec): Add UNSPEC_MFCR and
UNSPEC_ROTATE_CR.
(*move_from_cr): New.
(insn set_cc): Remove.
(*rotate_cr): New.
(insn_and_split set_cc): New.

patch.diff
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index ccf392b6409..0ad08e3111e 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -159,6 +159,8 @@ (define_c_enum "unspec"
UNSPEC_XXSPLTIW_CONST
UNSPEC_FMAX
UNSPEC_FMIN
+   UNSPEC_MFCR
+   UNSPEC_ROTATE_CR
   ])

 ;;
@@ -12744,26 +12746,51 @@ (define_insn_and_split "*cmp_internal2"
 }
 })
 
-;; Now we have the scc insns.  We can do some combinations because of the
-;; way the machine works.
-;;
-;; Note that this is probably faster if we can put an insn between the
-;; mfcr and rlinm, but this is tricky.  Let's leave it for now.  In most
-;; cases the insns below which don't use an intermediate CR field will
-;; be used instead.
-(define_insn "set_cc"
+
+(define_insn "*move_from_cr"
   [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
-   (match_operator:GPR 1 "scc_comparison_operator"
-   [(match_operand 2 "cc_reg_operand" "y")
-(const_int 0)]))]
+   (unspec:GPR [(match_operand 1 "cc_reg_operand" "y")]
+   UNSPEC_MFCR))]
   ""
-  "mfcr %0%Q2\;rlwinm %0,%0,%J1,1"
+  "mfcr %0%Q1"
   [(set (attr "type")
  (cond [(match_test "TARGET_MFCRF")
(const_string "mfcrf")
   ]
-   (const_string "mfcr")))
-   (set_attr "length" "8")])
+   (const_string "mfcr")))])
+
+;; Split the insn after reload so that other insns can be inserted
+;; between mfcr and rlinm.
+(define_insn_and_split "set_cc"
+  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
+   (match_operator:GPR 1 "scc_comparison_operator"
+   [(match_operand 2 "cc_reg_operand" "y")
+(const_int 0)]))]
+  "!TARGET_POWER10
+   || (GET_MODE (operands[2]) != CCmode
+   && GET_MODE (operands[2]) != CCUNSmode)"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 0)
+   (unspec:GPR [(match_dup 2)]
+   UNSPEC_MFCR))
+   (set (match_dup 0)
+   (unspec:GPR [(match_dup 0)
+(match_dup 1)]
+   UNSPEC_ROTATE_CR))]
+  ""
+  [(set_attr "length" "8")])
+
+(define_insn "*rotate_cr"
+  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
+   (unspec:GPR [(match_operand:GPR 3 "gpc_reg_operand" "r")
+(match_operator:GPR 1 "scc_comparison_operator"
+   [(match_operand 2 "cc_reg_operand" "y")
+(const_int 0)])]
+   UNSPEC_ROTATE_CR))]
+  ""
+  "rlwinm %0,%3,%J1,1"
+)

 (define_insn_and_split "*set_rev"
   [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
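
As a reading aid for the split, the two resulting insns can be modelled as two small functions: one that copies the condition register into a GPR, and one that rotates the chosen bit into the least significant position and masks it. This is a hedged sketch with hypothetical names. It assumes the rotate count is the CR bit number plus one (bit 0 being the most significant bit), and it does not model comparisons whose result needs the bit inverted.

```cpp
#include <cassert>
#include <cstdint>

// Step 1, corresponding to mfcr: copy the 32-bit CR image into a GPR.
uint32_t move_from_cr(uint32_t cr) { return cr; }

// Step 2, corresponding to rlwinm with a mask of 1: rotate left by SHIFT
// and keep only the least significant bit.
uint32_t rotate_and_mask(uint32_t gpr, unsigned shift) {
  assert(shift < 32);
  uint32_t rotated =
      shift == 0 ? gpr : (gpr << shift) | (gpr >> (32 - shift));
  return rotated & 1u;
}

// Extract CR bit BIT, numbered from the most significant bit (bit 0).
// Rotating left by BIT + 1 moves that bit to the least significant position.
uint32_t set_from_cr_bit(uint32_t cr, unsigned bit) {
  assert(bit < 32);
  return rotate_and_mask(move_from_cr(cr), (bit + 1) % 32);
}
```

Because the second function only reads the GPR produced by the first, unrelated instructions can be scheduled between the two steps, which is the stated motivation for the split.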


Re: [PATCH][GCC 13] RISC-V: Fix vsetvli local eliminate [PR114747]

2024-05-06 Thread Kito Cheng
Committed to gcc 13 branch, thanks:)

On Tue, May 7, 2024 at 9:20 AM juzhe.zh...@rivai.ai
 wrote:
>
> LGTM。
>
> 
> juzhe.zh...@rivai.ai
>
>
> From: Kito Cheng
> Date: 2024-05-07 09:17
> To: gcc-patches; kito.cheng; palmer; jeffreyalaw; rdapp; juzhe.zhong; pan2.li
> CC: Kito Cheng
> Subject: [PATCH][GCC 13] RISC-V: Fix vsetvli local eliminate [PR114747]


Re: [PATCH][GCC 13] RISC-V: Fix vsetvli local eliminate [PR114747]

2024-05-06 Thread juzhe.zh...@rivai.ai
LGTM。



juzhe.zh...@rivai.ai
 
 


[PATCH][GCC 13] RISC-V: Fix vsetvli local eliminate [PR114747]

2024-05-06 Thread Kito Cheng
The local vsetvli elimination considers only the current demand instead of
the full demand, and it uses that incomplete info to remove a vsetvli.

Take the following example from PR114747:

vsetvli a5,a1,e8,m4,ta,mu   # 57, ratio=2, sew=8, lmul=4
vsetvli zero,a5,e16,m8,ta,ma# 58, ratio=2, sew=16, lmul=8
vle8.v  v8,0(a0)# 13, demand ratio=2
vzext.vf2   v24,v8  # 14, demand sew=16 and lmul=8

Insn #58 will be removed because #57 already satisfies the demand of #13,
but that does not take #14 into account.

Ideally we would do more demand analysis, but this bug is present only on
the GCC 13 branch, and we should not change too much on a release branch, so
the best way is to make the check more conservative: remove the vsetvli only
if the target vsetvl_discard_result has the same SEW and LMUL as the source
vsetvli.

gcc/ChangeLog:

PR target/114747
* config/riscv/riscv-vsetvl.cc (local_eliminate_vsetvl_insn):
Check that the target vsetvl_discard_result and the source vsetvli
have the same SEW and LMUL.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vsetvl/pr114747.c: New.
---
 gcc/config/riscv/riscv-vsetvl.cc   | 10 ++
 .../gcc.target/riscv/rvv/vsetvl/pr114747.c | 18 ++
 2 files changed, 28 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr114747.c

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 587c6975a70..e6606b1e4de 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -1106,6 +1106,16 @@ local_eliminate_vsetvl_insn (const vector_insn_info &dem)
  if (!new_info.skip_avl_compatible_p (dem))
return;
 
+ /* Be more conservative here since we don't really get full
+demand info for following instructions, also that instruction
+isn't exist in RTL-SSA yet so we need parse that by low level
+API rather than vector_insn_info::parse_insn, see PR114747.  */
+ unsigned last_vsetvli_sew = ::get_sew (PREV_INSN (i->rtl ()));
+ unsigned last_vsetvli_lmul = ::get_vlmul (PREV_INSN (i->rtl ()));
+ if (new_info.get_sew() != last_vsetvli_sew ||
+ new_info.get_vlmul() != last_vsetvli_lmul)
+   return;
+
  new_info.set_avl_info (dem.get_avl_info ());
  new_info = dem.merge (new_info, LOCAL_MERGE);
  change_vsetvl_insn (insn, new_info);
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr114747.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr114747.c
new file mode 100644
index 0000000..c478405e8d6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr114747.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-tree-vectorize -fno-schedule-insns -fno-schedule-insns2" } */
+
+#include "riscv_vector.h"
+
+typedef unsigned short char16_t;
+
+size_t convert_latin1_to_utf16le(const char *src, size_t len, char16_t *dst) {
+  char16_t *beg = dst;
+  for (size_t vl; len > 0; len -= vl, src += vl, dst += vl) {
+vl = __riscv_vsetvl_e8m4(len);
+vuint8m4_t v = __riscv_vle8_v_u8m4((uint8_t*)src, vl);
+__riscv_vse16_v_u16m8((uint16_t*)dst, __riscv_vzext_vf2_u16m8(v, vl), vl);
+  }
+  return dst - beg;
+}
+
+/* { dg-final { scan-assembler {vsetvli\s+[a-z0-9]+,\s*[a-x0-9]+,\s*e16,\s*m8,\s*t[au],\s*m[au]} } } */
-- 
2.34.1
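
The example in the description shows why matching the SEW/LMUL ratio alone is too weak a condition. A small stand-alone sketch, with hypothetical names and integer LMUL only (fractional LMUL is not modelled), makes the difference between the two checks explicit:

```cpp
#include <cassert>

// Hypothetical, simplified view of a vsetvli configuration.
struct VsetvlConfig {
  unsigned sew;   // element width in bits
  unsigned lmul;  // register group multiplier (integer values only here)
};

unsigned ratio(const VsetvlConfig& c) {
  assert(c.lmul != 0);
  return c.sew / c.lmul;
}

// The weaker condition: enough for a consumer that only demands the ratio.
bool ratio_compatible(const VsetvlConfig& a, const VsetvlConfig& b) {
  return ratio(a) == ratio(b);
}

// The conservative condition described in the patch: both fields must match.
bool same_sew_lmul(const VsetvlConfig& a, const VsetvlConfig& b) {
  return a.sew == b.sew && a.lmul == b.lmul;
}
```

For insns #57 (e8, m4) and #58 (e16, m8), `ratio_compatible` holds while `same_sew_lmul` does not, so under the conservative check #58 is kept and the demand of #14 is still met.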



Re: [PATCH v3] c++/modules: Fix dangling pointer with imported_temploid_friends

2024-05-06 Thread Jason Merrill

On 5/6/24 18:53, Patrick Palka wrote:

On Mon, 6 May 2024, Jason Merrill wrote:


On 5/3/24 07:17, Nathaniel Shead wrote:

On Thu, May 02, 2024 at 02:05:38PM -0400, Jason Merrill wrote:

On 5/1/24 21:34, Nathaniel Shead wrote:

On Thu, May 02, 2024 at 12:15:44AM +1000, Nathaniel Shead wrote:

On Wed, May 01, 2024 at 09:57:38AM -0400, Patrick Palka wrote:


On Wed, 1 May 2024, Nathaniel Shead wrote:


Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk (and
later 14.2)?  I don't think making it a GTY root is necessary but I felt
perhaps better to be safe than sorry.

Potentially another approach would be to use DECL_UID instead like how
entity_map does; would that be preferable?

-- >8 --

I got notified by Linaro CI and by checking testresults that there seems
to be some occasional failures in tpl-friend-4_b.C on some architectures
and standards modes since r15-59-gb5f6a56940e708.  I haven't been able
to reproduce but looking at the backtrace I suspect the issue is that
we're adding to the 'imported_temploid_friend' map a decl that is
ultimately discarded, which then has its address reused by a later decl
causing a failure in the assert in 'set_originating_module'.

This patch attempts to fix the issue in two ways: by ensuring that we
only store the decl if we know it's a new decl (and hence won't be
discarded), and by making the imported_temploid_friends map a GTY root
so that even if the decl does get discarded later the address isn't
reused.

gcc/cp/ChangeLog:

* module.cc (imported_temploid_friends): Mark GTY, and...
(init_modules): ...allocate from GGC.
(trees_in::decl_value): Only write to imported_temploid_friends
for new decls.

Signed-off-by: Nathaniel Shead 
---
gcc/cp/module.cc | 7 ---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index 5b8ff5bc483..37d38bb9654 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -2731,7 +2731,7 @@ static keyed_map_t *keyed_table;
   need to be attached to the same module as the temploid.
This maps
   these decls to the temploid they are instantiated them, as
there is
   no other easy way to get this information.  */
-static hash_map<tree, tree> *imported_temploid_friends;
+static GTY(()) hash_map<tree, tree> *imported_temploid_friends;
//
/* Tree streaming.   The tree streaming is very specific to the
tree
@@ -8327,7 +8327,8 @@ trees_in::decl_value ()
  if (TREE_CODE (inner) == FUNCTION_DECL
  || TREE_CODE (inner) == TYPE_DECL)
if (tree owner = tree_node ())
-  imported_temploid_friends->put (decl, owner);
+  if (is_new)
+   imported_temploid_friends->put (decl, owner);


Hmm, I'm not seeing this code path getting reached for tpl-friend-4_b.C.
It seems we're instead adding to imported_temploid_friends from
propagate_defining_module, during tsubst_friend_function.

What seems to be happening is that we first tsubst_friend_function
'foo' from TPL, and then we tsubst_friend_function 'foo' from DEF,
which ends up calling duplicate_decls, which ggc_frees this 'foo'
redeclaration that is still present in the imported_temploid_friends map.

So I don't think marking imported_temploid_friends as a GC root would
help with this situation.  If we want to keep imported_temploid_friends
as a tree -> tree map, I think we just need to ensure that a decl
is removed from the map upon getting ggc_free'd from e.g. duplicate_decls.
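
To illustrate the suggestion above outside of GCC, here is a minimal
sketch of a side table keyed by object address that drops its entry
before the object is freed, so a later allocation reusing the address
cannot pick up stale data.  Decl, owner_map, free_decl and has_owner are
hypothetical stand-ins for illustration only, not GCC APIs.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a declaration node; not GCC's tree type.
struct Decl
{
  std::string name;
};

// Side table keyed by address, analogous in spirit to a tree -> tree map.
static std::unordered_map<const Decl *, std::string> owner_map;

// Free a decl, erasing any side-table entry first so that a later
// allocation reusing the same address cannot see a stale mapping.
void free_decl (Decl *d)
{
  owner_map.erase (d);
  delete d;
}

// Return true if the side table currently has an entry for D.
bool has_owner (const Decl *d)
{
  return owner_map.count (d) != 0;
}
```

With this shape the table never holds a key for a dead object, at the
cost of every freeing site having to know about the table.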


Could we instead move the call to propagate_defining_module down a few
lines, after the pushdecl?


Doing that for tsubst_friend_class seems to work OK with my current test
cases, but for tsubst_friend_function doing so causes ICEs in
'module_may_redeclare' within duplicate_decls because the function is
already marked as attached but the originating module information hasn't
been set up yet.


It's unfortunate that we need to add a hash table entry in order to make it
through duplicate_decls, at which point it becomes dead.  Ah, well.


I suppose with tsubst_friend_class it works though because we can't ever
take the pushdecl branch if an existing type exists that we would call
duplicate_decls on.


But it seems simpler to use DECL_UID as the key instead, since those
never get reused even after the decl gets ggc_free'd IIUC.


It still means garbage entries in the hash_map, which is undesirable even if
it doesn't cause the same kind of breakage.

Incidentally, decl_tree_map is preferable to hash_map<tree, tree> when the
key is always a decl.


Ah thanks, didn't know about decl_tree_map.  I feel that I prefer using
DECL_UIDs explicitly here though; it's also consistent with the existing
usage in entity_map_t, and it looks like decl_tree_map is still perhaps
vulnerable to the original issue here (since DECL_UID is only used for
hashing and not for equality, it looks like?).

Though that said, 'decl_constraints' in constraints.cc seems to be using
it fine 

Re: [PATCH] arm: Support -mfdpic for more targets

2024-05-06 Thread Fangrui Song
On Wed, Mar 6, 2024 at 1:54 AM Richard Earnshaw (lists)
 wrote:
>
> On 06/03/2024 05:07, Fangrui Song wrote:
> > On Fri, Feb 23, 2024 at 7:33 PM Fangrui Song  wrote:
> >>
> >> From: Fangrui Song 
> >>
> >> Targets that are not arm*-*-uclinuxfdpiceabi can use -S -mfdpic, but -c
> >> -mfdpic does not pass --fdpic to gas.  This is an unnecessary
> >> restriction.  Just define the ASM_SPEC in bpabi.h.
> >>
> >> Additionally, use armelf[b]_linux_fdpiceabi emulations for -mfdpic in
> >> linux-eabi.h.  This will allow a future musl fdpic port to use the
> >> desired BFD emulation.
> >>
> >> gcc/ChangeLog:
> >>
> >> * config/arm/bpabi.h (TARGET_FDPIC_ASM_SPEC): Transform -mfdpic.
> >> * config/arm/linux-eabi.h (TARGET_FDPIC_LINKER_EMULATION): Define.
> >> (SUBTARGET_EXTRA_LINK_SPEC): Use TARGET_FDPIC_LINKER_EMULATION
> >> if -mfdpic.
> >> ---
> >>  gcc/config/arm/bpabi.h  | 2 +-
> >>  gcc/config/arm/linux-eabi.h | 5 -
> >>  2 files changed, 5 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/gcc/config/arm/bpabi.h b/gcc/config/arm/bpabi.h
> >> index 7a279f3ed3c..6778be1a8bf 100644
> >> --- a/gcc/config/arm/bpabi.h
> >> +++ b/gcc/config/arm/bpabi.h
> >> @@ -55,7 +55,7 @@
> >>  #define TARGET_FIX_V4BX_SPEC " %{mcpu=arm8|mcpu=arm810|mcpu=strongarm*"\
> >>"|march=armv4|mcpu=fa526|mcpu=fa626:--fix-v4bx}"
> >>
> >> -#define TARGET_FDPIC_ASM_SPEC ""
> >> +#define TARGET_FDPIC_ASM_SPEC "%{mfdpic: --fdpic}"
> >>
> >>  #define BE8_LINK_SPEC  \
> >>"%{!r:%{!mbe32:%:be8_linkopt(%{mlittle-endian:little}"   \
> >> diff --git a/gcc/config/arm/linux-eabi.h b/gcc/config/arm/linux-eabi.h
> >> index eef791f6a02..0c5c58e4928 100644
> >> --- a/gcc/config/arm/linux-eabi.h
> >> +++ b/gcc/config/arm/linux-eabi.h
> >> @@ -46,12 +46,15 @@
> >>  #undef  TARGET_LINKER_EMULATION
> >>  #if TARGET_BIG_ENDIAN_DEFAULT
> >>  #define TARGET_LINKER_EMULATION "armelfb_linux_eabi"
> >> +#define TARGET_FDPIC_LINKER_EMULATION "armelfb_linux_fdpiceabi"
> >>  #else
> >>  #define TARGET_LINKER_EMULATION "armelf_linux_eabi"
> >> +#define TARGET_FDPIC_LINKER_EMULATION "armelf_linux_fdpiceabi"
> >>  #endif
> >>
> >>  #undef  SUBTARGET_EXTRA_LINK_SPEC
> >> -#define SUBTARGET_EXTRA_LINK_SPEC " -m " TARGET_LINKER_EMULATION
> >> +#define SUBTARGET_EXTRA_LINK_SPEC " -m %{mfdpic: " \
> >> +  TARGET_FDPIC_LINKER_EMULATION ";:" TARGET_LINKER_EMULATION "}"
> >>
> >>  /* GNU/Linux on ARM currently supports three dynamic linkers:
> >> - ld-linux.so.2 - for the legacy ABI
> >> --
> >> 2.44.0.rc1.240.g4c46232300-goog
> >>
> >
> > Ping:)
> >
>
> We're in stage4 at present and this is new material.  I'll look at it after 
> the branch has been cut.
>
> R.

refs/heads/releases/gcc-14 has been cut :)
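
For anyone skimming the quoted patch, the effect of the new
SUBTARGET_EXTRA_LINK_SPEC can be modeled roughly as below.  This is only
a sketch, assuming the usual "%{S:X;:D}" spec semantics (X if -S was
given, otherwise D); link_emulation_args is a hypothetical helper, not
part of the driver.

```cpp
#include <string>

// Model of the quoted spec:
//   " -m %{mfdpic: " TARGET_FDPIC_LINKER_EMULATION ";:" TARGET_LINKER_EMULATION "}"
// MFDPIC says whether -mfdpic was passed; BIG_ENDIAN mirrors
// TARGET_BIG_ENDIAN_DEFAULT in the quoted linux-eabi.h hunk.
std::string link_emulation_args (bool mfdpic, bool big_endian)
{
  const std::string base = big_endian ? "armelfb_linux_" : "armelf_linux_";
  return "-m " + base + (mfdpic ? "fdpiceabi" : "eabi");
}
```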



-- 
宋方睿


Re: [PATCH v3] c++/modules: Fix dangling pointer with imported_temploid_friends

2024-05-06 Thread Patrick Palka
On Mon, 6 May 2024, Jason Merrill wrote:

> On 5/3/24 07:17, Nathaniel Shead wrote:
> > On Thu, May 02, 2024 at 02:05:38PM -0400, Jason Merrill wrote:
> > > On 5/1/24 21:34, Nathaniel Shead wrote:
> > > > On Thu, May 02, 2024 at 12:15:44AM +1000, Nathaniel Shead wrote:
> > > > > On Wed, May 01, 2024 at 09:57:38AM -0400, Patrick Palka wrote:
> > > > > > 
> > > > > > On Wed, 1 May 2024, Nathaniel Shead wrote:
> > > > > > 
> > > > > > > Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk
> > > > > > > (and
> > > > > > > later 14.2)?  I don't think making it a GTY root is necessary but
> > > > > > > I felt
> > > > > > > perhaps better to be safe than sorry.
> > > > > > > 
> > > > > > > Potentially another approach would be to use DECL_UID instead like
> > > > > > > how
> > > > > > > entity_map does; would that be preferable?
> > > > > > > 
> > > > > > > -- >8 --
> > > > > > > 
> > > > > > > I got notified by Linaro CI and by checking testresults that there
> > > > > > > seems
> > > > > > > to be some occasional failures in tpl-friend-4_b.C on some
> > > > > > > architectures
> > > > > > > and standards modes since r15-59-gb5f6a56940e708.  I haven't been
> > > > > > > able
> > > > > > > to reproduce but looking at the backtrace I suspect the issue is
> > > > > > > that
> > > > > > > we're adding to the 'imported_temploid_friend' map a decl that is
> > > > > > > ultimately discarded, which then has its address reused by a later
> > > > > > > decl
> > > > > > > causing a failure in the assert in 'set_originating_module'.
> > > > > > > 
> > > > > > > This patch attempts to fix the issue in two ways: by ensuring that
> > > > > > > we
> > > > > > > only store the decl if we know it's a new decl (and hence won't be
> > > > > > > discarded), and by making the imported_temploid_friends map a GTY
> > > > > > > root
> > > > > > > so that even if the decl does get discarded later the address
> > > > > > > isn't
> > > > > > > reused.
> > > > > > > 
> > > > > > > gcc/cp/ChangeLog:
> > > > > > > 
> > > > > > >   * module.cc (imported_temploid_friends): Mark GTY, and...
> > > > > > >   (init_modules): ...allocate from GGC.
> > > > > > >   (trees_in::decl_value): Only write to
> > > > > > > imported_temploid_friends
> > > > > > >   for new decls.
> > > > > > > 
> > > > > > > Signed-off-by: Nathaniel Shead 
> > > > > > > ---
> > > > > > >gcc/cp/module.cc | 7 ---
> > > > > > >1 file changed, 4 insertions(+), 3 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
> > > > > > > index 5b8ff5bc483..37d38bb9654 100644
> > > > > > > --- a/gcc/cp/module.cc
> > > > > > > +++ b/gcc/cp/module.cc
> > > > > > > @@ -2731,7 +2731,7 @@ static keyed_map_t *keyed_table;
> > > > > > >   need to be attached to the same module as the temploid.
> > > > > > > This maps
> > > > > > >   these decls to the temploid they are instantiated them, as
> > > > > > > there is
> > > > > > >   no other easy way to get this information.  */
> > > > > > > > -static hash_map<tree, tree> *imported_temploid_friends;
> > > > > > > > +static GTY(()) hash_map<tree, tree> *imported_temploid_friends;
> > > > > > >
> > > > > > > //
> > > > > > >/* Tree streaming.   The tree streaming is very specific to the
> > > > > > > tree
> > > > > > > @@ -8327,7 +8327,8 @@ trees_in::decl_value ()
> > > > > > >  if (TREE_CODE (inner) == FUNCTION_DECL
> > > > > > >  || TREE_CODE (inner) == TYPE_DECL)
> > > > > > >if (tree owner = tree_node ())
> > > > > > > -  imported_temploid_friends->put (decl, owner);
> > > > > > > +  if (is_new)
> > > > > > > + imported_temploid_friends->put (decl, owner);
> > > > > > 
> > > > > > Hmm, I'm not seeing this code path getting reached for
> > > > > > tpl-friend-4_b.C.
> > > > > > It seems we're instead adding to imported_temploid_friends from
> > > > > > propagate_defining_module, during tsubst_friend_function.
> > > > > > 
> > > > > > > What seems to be happening is that we first
> > > > > > tsubst_friend_function
> > > > > > 'foo' from TPL, and then we tsubst_friend_function 'foo' from
> > > > > > DEF,
> > > > > > which ends up calling duplicate_decls, which ggc_frees this 'foo'
> > > > > > redeclaration that is still present in the imported_temploid_friends
> > > > > > map.
> > > > > > 
> > > > > > So I don't think marking imported_temploid_friends as a GC root
> > > > > > would
> > > > > > help with this situation.  If we want to keep
> > > > > > imported_temploid_friends
> > > > > > as a tree -> tree map, I think we just need to ensure that a decl
> > > > > > is removed from the map upon getting ggc_free'd from e.g.
> > > > > > duplicate_decls.
> > > 
> > > Could we instead move the call to propagate_defining_module down a few
> > > lines, after the pushdecl?
> > 
> > Doing that for tsubst_friend_class seems to work OK with my current test
> > cases, but for tsubst_friend_function doing so causes 

Re: [PATCH 1/1] RISC-V: Add Zfbfmin extension to the -march= option

2024-05-06 Thread Jeff Law




On 4/11/24 9:32 PM, Xiao Zeng wrote:

This patch adds a new sub-extension (Zfbfmin) to the -march= option.
It introduces a new data type, BF16.

1 The Zfbfmin extension depends on 'F' and on the FLH, FSH, FMV.X.H, and
FMV.H.X instructions as defined in the Zfh extension.

2 The Zfhmin extension includes the following instructions from the
Zfh extension: FLH, FSH, FMV.X.H, FMV.H.X, FCVT.S.H, and FCVT.H.S.

3 The Zfhmin extension depends on 'F'.

4 Simply put, just make Zfbfmin dependent on Zfhmin.

Perhaps in the future, we could propose making the FLH, FSH, FMV.X.H, and
FMV.H.X instructions an independent extension to achieve precise dependency
relationships for the Zfbfmin.
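
The dependency chain described above (Zfbfmin implies Zfhmin, which
implies F) amounts to a transitive closure over an implication table.  A
small standalone sketch follows; the table and the expand function are
hypothetical, cover only the relationships stated in this message, and
are not the actual -march= parsing code.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical implication table: only the relationships stated above.
static const std::map<std::string, std::vector<std::string>> implied = {
  {"zfbfmin", {"zfhmin"}},
  {"zfhmin", {"f"}},
};

// Expand EXTS to its transitive closure under the `implied` table.
std::set<std::string> expand (std::set<std::string> exts)
{
  std::vector<std::string> work (exts.begin (), exts.end ());
  while (!work.empty ())
    {
      std::string e = work.back ();
      work.pop_back ();
      auto it = implied.find (e);
      if (it == implied.end ())
	continue;
      // Queue each newly added dependency so its own implications
      // are followed as well.
      for (const std::string &dep : it->second)
	if (exts.insert (dep).second)
	  work.push_back (dep);
    }
  return exts;
}
```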

You can find more information about Zfbfmin in the spec doc below.



The following test passed for this patch:
 * The full riscv regression test.

I wrote a suitable ChangeLog entry and pushed this patch to the trunk.

Thanks,
jeff




Re: [PATCH] c++: replace tf_norm with a local flag

2024-05-06 Thread Jason Merrill

On 5/3/24 11:26, Patrick Palka wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK
for trunk?


OK.


-- >8 --

The tf_norm flag controlling whether to build diagnostic information
during constraint normalization doesn't need to be a global tsubst flag,
and is confusingly named.  This patch replaces it with a boolean flag
local to normalization.

gcc/cp/ChangeLog:

* constraint.cc (norm_info::norm_info): Take a boolean parameter
instead of tsubst_flags_t.
(norm_info::generate_diagnostics): Turn this predicate function
into a data member.
(normalize_logical_operation): Adjust after norm_info changes.
(normalize_concept_check): Likewise.
(normalize_atom): Likewise.
(get_normalized_constraints_from_info): Likewise.
(normalize_concept_definition): Likewise.
(normalize_constraint_expression): Likewise.
(normalize_placeholder_type_constraints): Likewise.
(satisfy_nondeclaration_constraints): Likewise.
* cp-tree.h (enum tsubst_flags): Remove tf_norm.
---
  gcc/cp/constraint.cc | 40 
  gcc/cp/cp-tree.h |  3 +--
  2 files changed, 21 insertions(+), 22 deletions(-)

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 8a3b5d80ba7..3f0dab79bcd 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -622,33 +622,29 @@ parameter_mapping_equivalent_p (tree t1, tree t2)
  
  struct norm_info : subst_info

  {
-  explicit norm_info (tsubst_flags_t cmp)
-: norm_info (NULL_TREE, cmp)
+  explicit norm_info (bool diag)
+: norm_info (NULL_TREE, diag)
{}
  
/* Construct a top-level context for DECL.  */
  
-  norm_info (tree in_decl, tsubst_flags_t complain)

-: subst_info (tf_warning_or_error | complain, in_decl)
+  norm_info (tree in_decl, bool diag)
+: subst_info (tf_warning_or_error, in_decl),
+  generate_diagnostics (diag)
{
  if (in_decl)
{
initial_parms = DECL_TEMPLATE_PARMS (in_decl);
-   if (generate_diagnostics ())
+   if (generate_diagnostics)
  context = build_tree_list (NULL_TREE, in_decl);
}
  else
initial_parms = current_template_parms;
}
  
-  bool generate_diagnostics() const

-  {
-return complain & tf_norm;
-  }
-
void update_context(tree expr, tree args)
{
-if (generate_diagnostics ())
+if (generate_diagnostics)
{
tree map = build_parameter_mapping (expr, args, ctx_parms ());
context = tree_cons (map, expr, context);
@@ -679,6 +675,10 @@ struct norm_info : subst_info
   template parameters of ORIG_DECL.  */
  
tree initial_parms = NULL_TREE;

+
+  /* Whether to build diagnostic information during normalization.  */
+
+  bool generate_diagnostics;
  };
  
  static tree normalize_expression (tree, tree, norm_info);

@@ -693,7 +693,7 @@ normalize_logical_operation (tree t, tree args, tree_code c, norm_info info)
tree t1 = normalize_expression (TREE_OPERAND (t, 1), args, info);
  
/* Build a new info object for the constraint.  */

-  tree ci = info.generate_diagnostics()
+  tree ci = info.generate_diagnostics
  ? build_tree_list (t, info.context)
  : NULL_TREE;
  
@@ -777,7 +777,7 @@ normalize_concept_check (tree check, tree args, norm_info info)

if (!norm_cache)
  norm_cache = hash_table::create_ggc (31);
norm_entry *entry = nullptr;
-  if (!info.generate_diagnostics ())
+  if (!info.generate_diagnostics)
  {
/* Cache the normal form of the substituted concept-id (when not
 diagnosing).  */
@@ -831,7 +831,7 @@ normalize_atom (tree t, tree args, norm_info info)
if (info.in_decl && concept_definition_p (info.in_decl))
  ATOMIC_CONSTR_EXPR_FROM_CONCEPT_P (atom) = true;
  
-  if (!info.generate_diagnostics ())

+  if (!info.generate_diagnostics)
  {
/* Cache the ATOMIC_CONSTRs that we return, so that sat_hasher::equal
 later can cheaply compare two atoms using just pointer equality.  */
@@ -910,7 +910,7 @@ get_normalized_constraints_from_info (tree ci, tree in_decl, bool diag = false)
  
/* Substitution errors during normalization are fatal.  */

++processing_template_decl;
-  norm_info info (in_decl, diag ? tf_norm : tf_none);
+  norm_info info (in_decl, diag);
tree t = get_normalized_constraints (CI_ASSOCIATED_CONSTRAINTS (ci), info);
--processing_template_decl;
  
@@ -1012,7 +1012,7 @@ normalize_concept_definition (tree tmpl, bool diag)

gcc_assert (TREE_CODE (tmpl) == TEMPLATE_DECL);
tree def = get_concept_definition (DECL_TEMPLATE_RESULT (tmpl));
++processing_template_decl;
-  norm_info info (tmpl, diag ? tf_norm : tf_none);
+  norm_info info (tmpl, diag);
tree norm = get_normalized_constraints (def, info);
--processing_template_decl;
  
@@ -1035,7 +1035,7 @@ normalize_constraint_expression (tree expr, norm_info info)

if (!expr || expr == error_mark_node)
  

Re: [PATCH v3] c++/modules: Fix dangling pointer with imported_temploid_friends

2024-05-06 Thread Jason Merrill

On 5/3/24 07:17, Nathaniel Shead wrote:

On Thu, May 02, 2024 at 02:05:38PM -0400, Jason Merrill wrote:

On 5/1/24 21:34, Nathaniel Shead wrote:

On Thu, May 02, 2024 at 12:15:44AM +1000, Nathaniel Shead wrote:

On Wed, May 01, 2024 at 09:57:38AM -0400, Patrick Palka wrote:


On Wed, 1 May 2024, Nathaniel Shead wrote:


Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk (and
later 14.2)?  I don't think making it a GTY root is necessary but I felt
perhaps better to be safe than sorry.

Potentially another approach would be to use DECL_UID instead like how
entity_map does; would that be preferable?

-- >8 --

I got notified by Linaro CI and by checking testresults that there seems
to be some occasional failures in tpl-friend-4_b.C on some architectures
and standards modes since r15-59-gb5f6a56940e708.  I haven't been able
to reproduce but looking at the backtrace I suspect the issue is that
we're adding to the 'imported_temploid_friend' map a decl that is
ultimately discarded, which then has its address reused by a later decl
causing a failure in the assert in 'set_originating_module'.

This patch attempts to fix the issue in two ways: by ensuring that we
only store the decl if we know it's a new decl (and hence won't be
discarded), and by making the imported_temploid_friends map a GTY root
so that even if the decl does get discarded later the address isn't
reused.

gcc/cp/ChangeLog:

* module.cc (imported_temploid_friends): Mark GTY, and...
(init_modules): ...allocate from GGC.
(trees_in::decl_value): Only write to imported_temploid_friends
for new decls.

Signed-off-by: Nathaniel Shead 
---
   gcc/cp/module.cc | 7 ---
   1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index 5b8ff5bc483..37d38bb9654 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -2731,7 +2731,7 @@ static keyed_map_t *keyed_table;
  need to be attached to the same module as the temploid.  This maps
  these decls to the temploid they are instantiated them, as there is
  no other easy way to get this information.  */
-static hash_map<tree, tree> *imported_temploid_friends;
+static GTY(()) hash_map<tree, tree> *imported_temploid_friends;
   //
   /* Tree streaming.   The tree streaming is very specific to the tree
@@ -8327,7 +8327,8 @@ trees_in::decl_value ()
 if (TREE_CODE (inner) == FUNCTION_DECL
 || TREE_CODE (inner) == TYPE_DECL)
   if (tree owner = tree_node ())
-  imported_temploid_friends->put (decl, owner);
+  if (is_new)
+   imported_temploid_friends->put (decl, owner);


Hmm, I'm not seeing this code path getting reached for tpl-friend-4_b.C.
It seems we're instead adding to imported_temploid_friends from
propagate_defining_module, during tsubst_friend_function.

What seems to be happening is that we first tsubst_friend_function
'foo' from TPL, and then we tsubst_friend_function 'foo' from DEF,
which ends up calling duplicate_decls, which ggc_frees this 'foo'
redeclaration that is still present in the imported_temploid_friends map.

So I don't think marking imported_temploid_friends as a GC root would
help with this situation.  If we want to keep imported_temploid_friends
as a tree -> tree map, I think we just need to ensure that a decl
is removed from the map upon getting ggc_free'd from e.g.  duplicate_decls.


Could we instead move the call to propagate_defining_module down a few
lines, after the pushdecl?


Doing that for tsubst_friend_class seems to work OK with my current test
cases, but for tsubst_friend_function doing so causes ICEs in
'module_may_redeclare' within duplicate_decls because the function is
already marked as attached but the originating module information hasn't
been set up yet.


It's unfortunate that we need to add a hash table entry in order to make 
it through duplicate_decls, at which point it becomes dead.  Ah, well.



I suppose with tsubst_friend_class it works though because we can't ever
take the pushdecl branch if an existing type exists that we would call
duplicate_decls on.


But it seems simpler to use DECL_UID as the key instead, since those
never get reused even after the decl gets ggc_free'd IIUC.
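
As a rough illustration of the DECL_UID idea, here is a standalone
sketch in which the side table is keyed by a monotonically increasing id
rather than by address.  Decl, make_decl, owner_by_uid and lookup_owner
are hypothetical names for illustration; only the "ids are never reused"
property is taken from the discussion.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a decl carrying a never-reused id,
// playing the role DECL_UID plays in the discussion.
struct Decl
{
  unsigned uid;
  std::string name;
};

static unsigned next_uid = 1;

// Allocate a decl with a fresh id.
Decl *make_decl (const std::string &name)
{
  return new Decl{next_uid++, name};
}

// Side table keyed by uid rather than by address.
static std::unordered_map<unsigned, std::string> owner_by_uid;

// Return the recorded owner for D, or nullptr if there is none.
const std::string *lookup_owner (const Decl *d)
{
  auto it = owner_by_uid.find (d->uid);
  return it == owner_by_uid.end () ? nullptr : &it->second;
}
```

A decl allocated after an earlier one is freed gets a new id, so it
cannot match the old entry even if the allocator hands back the same
address; the old entry does stay in the table, though.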


It still means garbage entries in the hash_map, which is undesirable even if
it doesn't cause the same kind of breakage.

Incidentally, decl_tree_map is preferable to hash_map<tree, tree> when the
key is always a decl.


Ah thanks, didn't know about decl_tree_map.  I feel that I prefer using
DECL_UIDs explicitly here though; it's also consistent with the existing
usage in entity_map_t, and it looks like decl_tree_map is still perhaps
vulnerable to the original issue here (since DECL_UID is only used for
hashing and not for equality, it looks like?).

Though that said, 'decl_constraints' in constraints.cc seems to be using
it fine (well, with a GTY marking) by using 'remove_constraints' within
duplicate_decls to clear 

Re: [PATCH] RISC-V: Add zero_extract support for rv64gc

2024-05-06 Thread Jeff Law




On 5/6/24 3:42 PM, Vineet Gupta wrote:



On 5/6/24 13:40, Christoph Müllner wrote:

The combiner attempts to optimize a zero-extension of a logical right shift
using zero_extract. We already utilize this optimization for those cases
that result in a single instruction.  Let's add an insn_and_split
pattern that also matches the generic case, where we can emit an
optimized sequence of a slli/srli.

...

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index d4676507b45..80cbecb78e8 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -2792,6 +2792,36 @@ (define_insn "*lshrsi3_zero_extend_3"
[(set_attr "type" "shift")
 (set_attr "mode" "SI")])
  
+;; Canonical form for a zero-extend of a logical right shift.

+;; Special cases are handled above.
+;; Skip for single-bit extraction (Zbs/XTheadBs) and th.extu (XTheadBb)


Dumb question: Why not for Zbs: Zb[abs] is going to be very common going
fwd and will end up being unused.

Zbs only handles single-bit extractions.  The pattern rejects that case,
allowing the single-bit patterns from bitmanip.md and thead.md to match
them.


Jeff




Re: [PATCH] RISC-V: Add zero_extract support for rv64gc

2024-05-06 Thread Vineet Gupta



On 5/6/24 13:40, Christoph Müllner wrote:
> The combiner attempts to optimize a zero-extension of a logical right shift
> using zero_extract. We already utilize this optimization for those cases
> that result in a single instruction.  Let's add an insn_and_split
> pattern that also matches the generic case, where we can emit an
> optimized sequence of a slli/srli.
>
> ...
>
> diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> index d4676507b45..80cbecb78e8 100644
> --- a/gcc/config/riscv/riscv.md
> +++ b/gcc/config/riscv/riscv.md
> @@ -2792,6 +2792,36 @@ (define_insn "*lshrsi3_zero_extend_3"
>[(set_attr "type" "shift")
> (set_attr "mode" "SI")])
>  
> +;; Canonical form for a zero-extend of a logical right shift.
> +;; Special cases are handled above.
> +;; Skip for single-bit extraction (Zbs/XTheadBs) and th.extu (XTheadBb)

Dumb question: Why not for Zbs: Zb[abs] is going to be very common going
fwd and will end up being unused.

> +(define_insn_and_split "*lshr3_zero_extend_4"
> +  [(set (match_operand:GPR 0 "register_operand" "=r")
> +  (zero_extract:GPR
> +   (match_operand:GPR 1 "register_operand" " r")
> +   (match_operand 2 "const_int_operand")
> +   (match_operand 3 "const_int_operand")))
> +   (clobber (match_scratch:GPR  4 "="))]
> +  "!((TARGET_ZBS || TARGET_XTHEADBS) && (INTVAL (operands[2]) == 1))
> +   && !TARGET_XTHEADBB"
> +  "#"
> +  "&& reload_completed"
> +  [(set (match_dup 4)
> + (ashift:GPR (match_dup 1) (match_dup 2)))
> +   (set (match_dup 0)
> + (lshiftrt:GPR (match_dup 4) (match_dup 3)))]
> +{
> +  int regbits = GET_MODE_BITSIZE (GET_MODE (operands[0])).to_constant ();
> +  int sizebits = INTVAL (operands[2]);
> +  int startbits = INTVAL (operands[3]);
> +  int lshamt = regbits - sizebits - startbits;
> +  int rshamt = lshamt + startbits;
> +  operands[2] = GEN_INT (lshamt);
> +  operands[3] = GEN_INT (rshamt);
> +}
> +  [(set_attr "type" "shift")
> +   (set_attr "mode" "")])
> +
>  ;; Handle AND with 2^N-1 for N from 12 to XLEN.  This can be split into
>  ;; two logical shifts.  Otherwise it requires 3 instructions: lui,
>  ;; xor/addi/srli, and.
> diff --git a/gcc/testsuite/gcc.target/riscv/pr111501.c b/gcc/testsuite/gcc.target/riscv/pr111501.c
> new file mode 100644
> index 000..9355be242e7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/pr111501.c
> @@ -0,0 +1,32 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target rv64 } */
> +/* { dg-options "-march=rv64gc" { target { rv64 } } } */
> +/* { dg-skip-if "" { *-*-* } {"-O0" "-Os" "-Og" "-Oz" "-flto" } } */
> +/* { dg-final { check-function-bodies "**" "" } } */

Is the function body check really needed? Isn't a count of srli and slli
each sufficient?
Last year we saw a lot of false failures from unrelated scheduling
changes tripping these up.

> +/* { dg-allow-blank-lines-in-output 1 } */
> +
> +/*
> +**do_shift:
> +**...
> +**slli\ta[0-9],a[0-9],16
> +**srli\ta[0-9],a[0-9],48
> +**...
> +*/
> +unsigned int
> +do_shift(unsigned long csum)
> +{
> +  return (unsigned short)(csum >> 32);
> +}
> +
> +/*
> +**do_shift2:
> +**...
> +**slli\ta[0-9],a[0-9],16
> +**srli\ta[0-9],a[0-9],48
> +**...
> +*/
> +unsigned int
> +do_shift2(unsigned long csum)
> +{
> +  return (csum << 16) >> 48;
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/zero-extend-rshift-32.c b/gcc/testsuite/gcc.target/riscv/zero-extend-rshift-32.c
> new file mode 100644
> index 000..2824d6fe074
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/zero-extend-rshift-32.c
> @@ -0,0 +1,37 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target rv32 } */
> +/* { dg-options "-march=rv32gc" } */
> +/* { dg-skip-if "" { *-*-* } {"-O0" "-Os" "-Og" "-Oz" "-flto" } } */
> +/* { dg-final { check-function-bodies "**" "" } } */

Same as above, counts where possible.
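
For reference, the shift amounts computed in the quoted splitter can be
checked with a small standalone model.  This is just a sketch of the
arithmetic from the patch; Shifts, extract_shifts and zero_extract64 are
hypothetical names, and the model assumes a 64-bit register.

```cpp
#include <cassert>
#include <cstdint>

// Left/right shift amounts, named as in the quoted splitter.
struct Shifts
{
  int left;
  int right;
};

// Same arithmetic as the quoted preparation statements:
//   lshamt = regbits - sizebits - startbits
//   rshamt = lshamt + startbits
Shifts extract_shifts (int regbits, int sizebits, int startbits)
{
  assert (sizebits >= 1 && startbits >= 0
	  && sizebits + startbits <= regbits);
  int lshamt = regbits - sizebits - startbits;
  int rshamt = lshamt + startbits;
  return {lshamt, rshamt};
}

// Extract SIZEBITS bits starting at bit STARTBITS of X, zero-extended,
// as a left shift (slli) followed by a logical right shift (srli).
uint64_t zero_extract64 (uint64_t x, int sizebits, int startbits)
{
  Shifts s = extract_shifts (64, sizebits, startbits);
  return (x << s.left) >> s.right;
}
```

For the do_shift test case (16 bits starting at bit 32) this gives shift
amounts of 16 and 48, matching the slli/srli pair in the expected output.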

-Vineet



Re: [NOT CODE REVIEW] [PATCH v3 1/1] [RISC-V] Add support for _Bfloat16

2024-05-06 Thread Jeff Law




On 5/5/24 6:38 PM, Xiao Zeng wrote:

1 At point ,
   BF16 has already been completed "post public review".

2 LLVM has also added support for RISCV BF16 in
    and
   .

3 According to the discussion 
,
   this uses __bf16 and uses DF16b in riscv_mangle_type like x86.

The following test passed for this patch:
 * The full riscv regression test.

gcc/ChangeLog:

* config/riscv/iterators.md: New mode iterator HFBF.
* config/riscv/riscv-builtins.cc (riscv_init_builtin_types):
Initialize data type _Bfloat16.
* config/riscv/riscv-modes.def (FLOAT_MODE): New.
(ADJUST_FLOAT_FORMAT): New.
* config/riscv/riscv.cc (riscv_mangle_type): Support for BFmode.
(riscv_scalar_mode_supported_p): Ditto.
(riscv_libgcc_floating_mode_supported_p): Ditto.
(riscv_init_libfuncs): Set the conversion method for BFmode and
HFmode.
(riscv_block_arith_comp_libfuncs_for_mode): Set the arithmetic
and comparison libfuncs for the mode.
* config/riscv/riscv.md (mode" ): Add BF.
(movhf): Support for BFmode.
(mov): Ditto.
(*movhf_softfloat): Ditto.
(*mov_softfloat): Ditto.

libgcc/ChangeLog:

* config/riscv/sfp-machine.h (_FP_NANFRAC_B): New.
(_FP_NANSIGN_B): Ditto.
* config/riscv/t-softfp32: Add support for BF16 libfuncs.
* config/riscv/t-softfp64: Ditto.
* soft-fp/floatsibf.c: For si -> bf16.
* soft-fp/floatunsibf.c: For unsi -> bf16.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/bf16_arithmetic.c: New test.
* gcc.target/riscv/bf16_call.c: New test.
* gcc.target/riscv/bf16_comparison.c: New test.
* gcc.target/riscv/bf16_float_libcall_convert.c: New test.
* gcc.target/riscv/bf16_integer_libcall_convert.c: New test.

Given we were only looking to have the CI system check the formatting
nit, and that has passed, I've pushed this to the trunk.


jeff



Re: [PATCH] RISC-V: Document -mcmodel=large

2024-05-06 Thread Jeff Law




On 12/20/23 11:13 AM, Jeff Law wrote:



On 12/20/23 11:08, Palmer Dabbelt wrote:

This slipped through the cracks.  Probably also NEWS-worthy.

gcc/ChangeLog:

* doc/invoke.texi (RISC-V): Add -mcmodel=large.

OK.

And yes, I think we're going to need to do a new/changes update for the
port as a whole as part of the gcc-14 process.

This never got committed as far as I can tell.  So I pushed it.

Jeff


Re: [RFA][RISC-V] Use "uw" forms for constant synthesis

2024-05-06 Thread Jeff Law




On 5/4/24 6:53 PM, Jeff Law wrote:


So another constant synthesis improvement.

In this patch we're looking at cases where we'd like to be able to use 
lui+slli, but can't because of the sign extending nature of lui on 
TARGET_64BIT.  For example: 0x800110020UL.  The trunk currently 
generates 4 instructions for that constant, when it can be done with 3 
(lui+slli.uw+addi).


When Zba is enabled, we can use lui+slli.uw as the slli.uw masks off the 
bits 32..63 before shifting, giving us the precise semantics we want.
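
To make the difference concrete, here is a small standalone model of the
instructions involved, using one decomposition of 0x800110020 that works
out on paper (lui 0x80011; slli.uw by 4; addi 0x20).  The helper
functions model RV64 semantics as described above and are illustrative
only; the exact sequence the compiler emits may differ.

```cpp
#include <cstdint>

// lui: place a 20-bit immediate in bits 31..12, then sign-extend the
// 32-bit result to 64 bits (RV64 behavior).
uint64_t lui (uint32_t imm20)
{
  return (uint64_t) (int64_t) (int32_t) (imm20 << 12);
}

// slli: plain 64-bit logical left shift.
uint64_t slli (uint64_t rs1, unsigned shamt)
{
  return rs1 << shamt;
}

// slli.uw (Zba): zero-extend the low 32 bits of rs1, then shift left.
uint64_t slli_uw (uint64_t rs1, unsigned shamt)
{
  return (rs1 & 0xffffffffULL) << shamt;
}

// addi: add a sign-extended 12-bit immediate.
uint64_t addi (uint64_t rs1, int32_t imm12)
{
  return rs1 + (uint64_t) (int64_t) imm12;
}
```

With plain slli the sign bits produced by lui survive the shift, so the
upper bits of the result are wrong; slli.uw discards them first.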


I strongly suspect we'll want to do the same for a set of constants with 
lui+add.uw, lui+shNadd.uw, so you'll see the beginnings of generalizing 
support for lui followed by a "uw" instruction.


The new test just tests the set of cases that showed up while exploring 
a particular space of the constant synthesis problem.  It's not meant to 
be exhaustive (failure to use shadd when profitable).


Tested on rv64gc and rv32gcv.  OK for the trunk assuming it passes CI?

I pushed this after fixing the two over-length lines.

jeff



[PATCH] Mention that some options are turned on by `-Ofast` in their descriptions [PR97263]

2024-05-06 Thread Andrew Pinski
As was done for -ffast-math in r0-105946-ga570fc16fa8056, we should
document that -Ofast enables -fno-math-errno, -funsafe-math-optimizations,
-ffinite-math-only, -fno-trapping-math in their documentation.

Note this changes the stronger "must not" to be "is not" for -fno-trapping-math
since we do enable it for -Ofast already.

OK?

gcc/ChangeLog:

PR middle-end/97263
* doc/invoke.texi (fmath-errno): Document it is turned on
with -Ofast.
(funsafe-math-optimizations): Likewise.
(ffinite-math-only): Likewise.
(fno-trapping-math): Likewise and use less strong language.

Signed-off-by: Andrew Pinski 
---
 gcc/doc/invoke.texi | 41 ++---
 1 file changed, 22 insertions(+), 19 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 9456ced468a..14ff4d25da7 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -14996,11 +14996,12 @@ with a single instruction, e.g., @code{sqrt}.  A program that relies on
 IEEE exceptions for math error handling may want to use this flag
 for speed while maintaining IEEE arithmetic compatibility.
 
-This option is not turned on by any @option{-O} option since
-it can result in incorrect output for programs that depend on
-an exact implementation of IEEE or ISO rules/specifications for
-math functions. It may, however, yield faster code for programs
-that do not require the guarantees of these specifications.
+This option is not turned on by any @option{-O} option besides
+@option{-Ofast} since it can result in incorrect output for
+programs that depend on an exact implementation of IEEE or
+ISO rules/specifications for math functions. It may, however,
+yield faster code for programs that do not require the guarantees
+of these specifications.
 
 The default is @option{-fmath-errno}.
 
@@ -15017,11 +15018,12 @@ ANSI standards.  When used at link time, it may include libraries
 or startup files that change the default FPU control word or other
 similar optimizations.
 
-This option is not turned on by any @option{-O} option since
-it can result in incorrect output for programs that depend on
-an exact implementation of IEEE or ISO rules/specifications for
-math functions. It may, however, yield faster code for programs
-that do not require the guarantees of these specifications.
+This option is not turned on by any @option{-O} option besides
+@option{-Ofast} since it can result in incorrect output
+for programs that depend on an exact implementation of IEEE
+or ISO rules/specifications for math functions. It may, however,
+yield faster code for programs that do not require the guarantees
+of these specifications.
 Enables @option{-fno-signed-zeros}, @option{-fno-trapping-math},
 @option{-fassociative-math} and @option{-freciprocal-math}.
 
@@ -15061,11 +15063,12 @@ The default is @option{-fno-reciprocal-math}.
 Allow optimizations for floating-point arithmetic that assume
 that arguments and results are not NaNs or +-Infs.
 
-This option is not turned on by any @option{-O} option since
-it can result in incorrect output for programs that depend on
-an exact implementation of IEEE or ISO rules/specifications for
-math functions. It may, however, yield faster code for programs
-that do not require the guarantees of these specifications.
+This option is not turned on by any @option{-O} option besides
+@option{-Ofast} since it can result in incorrect output
+for programs that depend on an exact implementation of IEEE or
+ISO rules/specifications for math functions. It may, however,
+yield faster code for programs that do not require the guarantees
+of these specifications.
 
 The default is @option{-fno-finite-math-only}.
 
@@ -15089,10 +15092,10 @@ underflow, inexact result and invalid operation.  This option requires
 that @option{-fno-signaling-nans} be in effect.  Setting this option may
 allow faster code if one relies on ``non-stop'' IEEE arithmetic, for example.
 
-This option should never be turned on by any @option{-O} option since
-it can result in incorrect output for programs that depend on
-an exact implementation of IEEE or ISO rules/specifications for
-math functions.
+This option is not turned on by any @option{-O} option besides
+@option{-Ofast} since it can result in incorrect output for programs
+that depend on an exact implementation of IEEE or ISO rules/specifications
+for math functions.
 
 The default is @option{-ftrapping-math}.
 
-- 
2.43.0



Re: [PATCH] RISC-V: Add zero_extract support for rv64gc

2024-05-06 Thread Jeff Law




On 5/6/24 2:40 PM, Christoph Müllner wrote:

The combiner attempts to optimize a zero-extension of a logical right shift
using zero_extract. We already utilize this optimization for those cases
that result in a single instruction.  Let's add an insn_and_split
pattern that also matches the generic case, where we can emit an
optimized sequence of a slli/srli.

Tested with SPEC CPU 2017 (rv64gc).

PR 111501

gcc/ChangeLog:

* config/riscv/riscv.md (*lshr3_zero_extend_4): New
pattern for zero-extraction.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/pr111501.c: New test.
* gcc.target/riscv/zero-extend-rshift-32.c: New test.
* gcc.target/riscv/zero-extend-rshift-64.c: New test.
* gcc.target/riscv/zero-extend-rshift.c: New test.

So I had Lyut looking in this space as well, mostly because there's a 
desire to avoid the srl+and approach and instead represent this stuff as 
shifts (which are fusible in our uarch).  So I've already got some state...





Signed-off-by: Christoph Müllner 
---
  gcc/config/riscv/riscv.md |  30 +
  gcc/testsuite/gcc.target/riscv/pr111501.c |  32 +
  .../gcc.target/riscv/zero-extend-rshift-32.c  |  37 ++
  .../gcc.target/riscv/zero-extend-rshift-64.c  |  63 ++
  .../gcc.target/riscv/zero-extend-rshift.c | 119 ++
  5 files changed, 281 insertions(+)
  create mode 100644 gcc/testsuite/gcc.target/riscv/pr111501.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/zero-extend-rshift-32.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/zero-extend-rshift-64.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/zero-extend-rshift.c

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index d4676507b45..80cbecb78e8 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -2792,6 +2792,36 @@ (define_insn "*lshrsi3_zero_extend_3"
[(set_attr "type" "shift")
 (set_attr "mode" "SI")])
  
+;; Canonical form for a zero-extend of a logical right shift.
+;; Special cases are handled above.
+;; Skip for single-bit extraction (Zbs/XTheadBs) and th.extu (XTheadBb)
+(define_insn_and_split "*lshr3_zero_extend_4"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+(zero_extract:GPR
+   (match_operand:GPR 1 "register_operand" " r")
+   (match_operand 2 "const_int_operand")
+   (match_operand 3 "const_int_operand")))
+   (clobber (match_scratch:GPR  4 "="))]
+  "!((TARGET_ZBS || TARGET_XTHEADBS) && (INTVAL (operands[2]) == 1))
+   && !TARGET_XTHEADBB"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 4)
+ (ashift:GPR (match_dup 1) (match_dup 2)))
+   (set (match_dup 0)
+ (lshiftrt:GPR (match_dup 4) (match_dup 3)))]
Consider adding support for signed extractions as well.  You just need 
an iterator across zero_extract/sign_extract and suitable selection of 
arithmetic vs logical right shift step.
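
For illustration only, here is a small C model of that suggestion (not the machine description itself): the left shift is shared, and only the right-shift step differs between the zero and sign variants.  The function names are made up for this sketch, a 64-bit register is assumed, and the signed variant relies on GCC's behaviour for converting to and right-shifting negative signed values, which ISO C leaves implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, assuming a 64-bit register (rv64).  SIZE is the
   width of the field and START its lowest bit, as in zero_extract.  */

static uint64_t
extract_zero (uint64_t x, int size, int start)
{
  assert (size > 0 && start >= 0 && size + start <= 64);
  int lshamt = 64 - size - start;   /* slli: move the field to the top */
  int rshamt = lshamt + start;      /* srli: logical shift back down */
  return (x << lshamt) >> rshamt;
}

static int64_t
extract_sign (uint64_t x, int size, int start)
{
  assert (size > 0 && start >= 0 && size + start <= 64);
  int lshamt = 64 - size - start;   /* same slli as above */
  int rshamt = lshamt + start;      /* srai: arithmetic shift back down */
  /* Conversion to int64_t and >> on a negative value are
     implementation-defined in ISO C; GCC wraps and sign-propagates.  */
  return (int64_t) (x << lshamt) >> rshamt;
}
```

With size 16 and start 32, both variants shift left by 16 and right by 48; only the bits filled in from the top differ.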


A nit on the condition.  Bring the && INTVAL (operands[2]) == 1 down to 
a new line like you've done with !TARGET_XTHEADBB.


You also want to make sure the condition rejects the cases handled by 
this pattern (or merge your pattern with this one):



;; Canonical form for a zero-extend of a logical right shift.
(define_insn "*lshrsi3_zero_extend_2" 
  [(set (match_operand:DI   0 "register_operand" "=r")
(zero_extract:DI (match_operand:DI  1 "register_operand" " r")
 (match_operand 2 "const_int_operand")
 (match_operand 3 "const_int_operand")))]
  "(TARGET_64BIT && (INTVAL (operands[3]) > 0)
&& (INTVAL (operands[2]) + INTVAL (operands[3]) == 32))"
{
  return "srliw\t%0,%1,%3";
}
  [(set_attr "type" "shift")
   (set_attr "mode" "SI")])


So generally going the right direction.  But needs another iteration.

Jeff



Re: [PATCH v2 1/1] [RISC-V] Add support for _Bfloat16

2024-05-06 Thread Jeff Law




On 5/4/24 8:08 PM, Xiao Zeng wrote:



https://github.com/ewlu/gcc-precommit-ci/issues/1412#issuecomment-2031568644

In the future, my patch will strictly adhere to the formatting suggestions 
provided by CI.
No worries.  Even those of us who have been working on the project for 
30+ years still goof this stuff up from time to time.   In fact, it 
complained about one of my patches over the weekend ;-)




With that fixed, this is fine for the trunk.  No need to repost,
go ahead and commit.

Currently, I do not have commit permission. Can I have this permission?


Use this form:

https://sourceware.org/cgi-bin/pdw/ps_form.cgi


And use my email address as your sponsor:  j...@ventanamicro.com

I'll go ahead and commit the Bfloat16 patch.  But if you plan on 
contributing regularly, it's definitely easier to have write access.


Jeff


[PATCH] RISC-V: Add zero_extract support for rv64gc

2024-05-06 Thread Christoph Müllner
The combiner attempts to optimize a zero-extension of a logical right shift
using zero_extract. We already utilize this optimization for those cases
that result in a single instruction.  Let's add an insn_and_split
pattern that also matches the generic case, where we can emit an
optimized sequence of a slli/srli.
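
As a sanity check of the shift amounts, here is a small C model of the slli/srli sequence next to the usual shift-and-mask form.  It is only a sketch: the function names are invented for the example and a 64-bit register is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Extract SIZE bits starting at bit START from X, zero-extended, using
   a left shift followed by a logical right shift (64-bit register
   assumed).  */
static uint64_t
zero_extract_shifts (uint64_t x, int size, int start)
{
  const int regbits = 64;
  assert (size > 0 && start >= 0 && size + start <= regbits);
  int lshamt = regbits - size - start;  /* push the field to the top */
  int rshamt = lshamt + start;          /* pull it back down to bit 0 */
  return (x << lshamt) >> rshamt;
}

/* Reference form: shift right, then mask.  */
static uint64_t
zero_extract_mask (uint64_t x, int size, int start)
{
  assert (size > 0 && start >= 0 && size + start <= 64);
  uint64_t mask = size == 64 ? ~UINT64_C (0) : (UINT64_C (1) << size) - 1;
  return (x >> start) & mask;
}
```

For a 16-bit field at bit 32 this gives a left shift of 16 and a right shift of 48.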

Tested with SPEC CPU 2017 (rv64gc).

PR 111501

gcc/ChangeLog:

* config/riscv/riscv.md (*lshr3_zero_extend_4): New
pattern for zero-extraction.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/pr111501.c: New test.
* gcc.target/riscv/zero-extend-rshift-32.c: New test.
* gcc.target/riscv/zero-extend-rshift-64.c: New test.
* gcc.target/riscv/zero-extend-rshift.c: New test.

Signed-off-by: Christoph Müllner 
---
 gcc/config/riscv/riscv.md |  30 +
 gcc/testsuite/gcc.target/riscv/pr111501.c |  32 +
 .../gcc.target/riscv/zero-extend-rshift-32.c  |  37 ++
 .../gcc.target/riscv/zero-extend-rshift-64.c  |  63 ++
 .../gcc.target/riscv/zero-extend-rshift.c | 119 ++
 5 files changed, 281 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr111501.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zero-extend-rshift-32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zero-extend-rshift-64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zero-extend-rshift.c

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index d4676507b45..80cbecb78e8 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -2792,6 +2792,36 @@ (define_insn "*lshrsi3_zero_extend_3"
   [(set_attr "type" "shift")
(set_attr "mode" "SI")])
 
+;; Canonical form for a zero-extend of a logical right shift.
+;; Special cases are handled above.
+;; Skip for single-bit extraction (Zbs/XTheadBs) and th.extu (XTheadBb)
+(define_insn_and_split "*lshr3_zero_extend_4"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+(zero_extract:GPR
+   (match_operand:GPR 1 "register_operand" " r")
+   (match_operand 2 "const_int_operand")
+   (match_operand 3 "const_int_operand")))
+   (clobber (match_scratch:GPR  4 "="))]
+  "!((TARGET_ZBS || TARGET_XTHEADBS) && (INTVAL (operands[2]) == 1))
+   && !TARGET_XTHEADBB"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 4)
+ (ashift:GPR (match_dup 1) (match_dup 2)))
+   (set (match_dup 0)
+ (lshiftrt:GPR (match_dup 4) (match_dup 3)))]
+{
+  int regbits = GET_MODE_BITSIZE (GET_MODE (operands[0])).to_constant ();
+  int sizebits = INTVAL (operands[2]);
+  int startbits = INTVAL (operands[3]);
+  int lshamt = regbits - sizebits - startbits;
+  int rshamt = lshamt + startbits;
+  operands[2] = GEN_INT (lshamt);
+  operands[3] = GEN_INT (rshamt);
+}
+  [(set_attr "type" "shift")
+   (set_attr "mode" "")])
+
 ;; Handle AND with 2^N-1 for N from 12 to XLEN.  This can be split into
 ;; two logical shifts.  Otherwise it requires 3 instructions: lui,
 ;; xor/addi/srli, and.
diff --git a/gcc/testsuite/gcc.target/riscv/pr111501.c b/gcc/testsuite/gcc.target/riscv/pr111501.c
new file mode 100644
index 000..9355be242e7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr111501.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target rv64 } */
+/* { dg-options "-march=rv64gc" { target { rv64 } } } */
+/* { dg-skip-if "" { *-*-* } {"-O0" "-Os" "-Og" "-Oz" "-flto" } } */
+/* { dg-final { check-function-bodies "**" "" } } */
+/* { dg-allow-blank-lines-in-output 1 } */
+
+/*
+**do_shift:
+**...
+**slli\ta[0-9],a[0-9],16
+**srli\ta[0-9],a[0-9],48
+**...
+*/
+unsigned int
+do_shift(unsigned long csum)
+{
+  return (unsigned short)(csum >> 32);
+}
+
+/*
+**do_shift2:
+**...
+**slli\ta[0-9],a[0-9],16
+**srli\ta[0-9],a[0-9],48
+**...
+*/
+unsigned int
+do_shift2(unsigned long csum)
+{
+  return (csum << 16) >> 48;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/zero-extend-rshift-32.c b/gcc/testsuite/gcc.target/riscv/zero-extend-rshift-32.c
new file mode 100644
index 000..2824d6fe074
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zero-extend-rshift-32.c
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target rv32 } */
+/* { dg-options "-march=rv32gc" } */
+/* { dg-skip-if "" { *-*-* } {"-O0" "-Os" "-Og" "-Oz" "-flto" } } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#define URT_ZE_UCT_RSHIFT_N_UAT(RT,CT,N,AT)\
+unsigned RT u##RT##_ze_u##CT##_rshift_##N##_u##AT(unsigned AT v)   \
+{  \
+return (unsigned CT)(v >> N);  \
+}
+
+#define ULONG_ZE_USHORT_RSHIFT_N_ULONG(N) URT_ZE_UCT_RSHIFT_N_UAT(long,short,N,long)
+#define ULONG_ZE_UINT_RSHIFT_N_ULONG(N) URT_ZE_UCT_RSHIFT_N_UAT(long,int,N,long)
+
+/*
+**ulong_ze_ushort_rshift_9_ulong:
+**slli\ta[0-9],a[0-9],7
+**

[RISC-V] Fix incorrect if-then-else nesting of Zbs usage in constant synthesis

2024-05-06 Thread Jeff Law
So I managed to goof the if-then-else level of the bseti bits last week. 
 They were supposed to be a last ditch effort to improve the result, 
but ended up inside a conditional where they don't really belong.  I 
almost always use Zba, Zbb and Zbs together, so it slipped by.


So it's NFC if you always test with Zbb and Zbs enabled together.  But 
if you enabled Zbs without Zbb you'd see a failure to use bseti.
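
To make the fallback concrete, here is a rough C model of the LUI+ADDI+BSETI counting step.  It is a sketch under assumptions, not the riscv.cc code: the function names are invented, and the two masks (bits 11..30 for LUI, bits 0..10 for ADDI) are choices made for this example.  The point of the fix is that this step should run whenever the current cost is above 2 and Zbs is available, independent of what the earlier cases tried.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed masks for this sketch: bits 11..30 via LUI (staying clear of
   the SImode sign bit) and bits 0..10 via ADDI.  */
#define LUI_BITS  UINT64_C (0x7ffff800)
#define ADDI_BITS UINT64_C (0x7ff)

/* Index of the lowest set bit; V must be nonzero.  */
static int
lowest_set_bit (uint64_t v)
{
  int n = 0;
  while ((v & 1) == 0)
    {
      v >>= 1;
      n++;
    }
  return n;
}

/* Return the cost of LUI+ADDI+BSETI if it beats COST, else COST.  */
static int
bseti_fallback_cost (uint64_t value, int cost)
{
  int i = 0;

  if (value & LUI_BITS)
    {
      value &= ~LUI_BITS;
      i++;
    }
  if (value & ADDI_BITS)
    {
      value &= ~ADDI_BITS;
      i++;
    }
  /* One bseti per remaining bit, giving up once we cannot win.  */
  while (i < cost && value)
    {
      value &= ~(UINT64_C (1) << lowest_set_bit (value));
      i++;
    }
  return i < cost ? i : cost;
}
```

A constant with an LUI part, an ADDI part and one stray high bit comes out at three instructions in this model, provided the incoming cost was higher than that.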


Planning to commit once pre-commit CI passes.

I'm attaching the actual patch (P) and a diff with whitespace ignored 
(P2) so it's easier to see what actually changed.


Jeff

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 6f1c67bf3f7..dddb7f8d673 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -869,50 +869,51 @@ riscv_build_integer_1 (struct riscv_integer_op codes[RISCV_MAX_INTEGER_OPS],
  codes[1].use_uw = false;
  cost = 2;
}
-  /* Final cases, particularly focused on bseti.  */
-  else if (cost > 2 && TARGET_ZBS)
-   {
- int i = 0;
+}
 
- /* First handle any bits set by LUI.  Be careful of the
-SImode sign bit!.  */
- if (value & 0x7800)
-   {
- alt_codes[i].code = (i == 0 ? UNKNOWN : IOR);
- alt_codes[i].value = value & 0x7800;
- alt_codes[i].use_uw = false;
- value &= ~0x7800;
- i++;
-   }
+  /* Final cases, particularly focused on bseti.  */
+  if (cost > 2 && TARGET_ZBS)
+{
+  int i = 0;
 
- /* Next, any bits we can handle with addi.  */
- if (value & 0x7ff)
-   {
- alt_codes[i].code = (i == 0 ? UNKNOWN : PLUS);
- alt_codes[i].value = value & 0x7ff;
- alt_codes[i].use_uw = false;
- value &= ~0x7ff;
- i++;
-   }
+  /* First handle any bits set by LUI.  Be careful of the
+SImode sign bit!.  */
+  if (value & 0x7800)
+   {
+ alt_codes[i].code = (i == 0 ? UNKNOWN : IOR);
+ alt_codes[i].value = value & 0x7800;
+ alt_codes[i].use_uw = false;
+ value &= ~0x7800;
+  i++;
+   }
 
- /* And any residuals with bseti.  */
- while (i < cost && value)
-   {
- HOST_WIDE_INT bit = ctz_hwi (value);
- alt_codes[i].code = (i == 0 ? UNKNOWN : IOR);
- alt_codes[i].value = 1UL << bit;
- alt_codes[i].use_uw = false;
- value &= ~(1ULL << bit);
- i++;
-   }
+  /* Next, any bits we can handle with addi.  */
+  if (value & 0x7ff)
+   {
+ alt_codes[i].code = (i == 0 ? UNKNOWN : PLUS);
+ alt_codes[i].value = value & 0x7ff;
+ alt_codes[i].use_uw = false;
+ value &= ~0x7ff;
+ i++;
+   }
 
- /* If LUI+ADDI+BSETI resulted in a more efficient
-sequence, then use it.  */
- if (i < cost)
-   {
- memcpy (codes, alt_codes, sizeof (alt_codes));
- cost = i;
-   }
+  /* And any residuals with bseti.  */
+  while (i < cost && value)
+   {
+ HOST_WIDE_INT bit = ctz_hwi (value);
+ alt_codes[i].code = (i == 0 ? UNKNOWN : IOR);
+ alt_codes[i].value = 1UL << bit;
+ alt_codes[i].use_uw = false;
+ value &= ~(1ULL << bit);
+ i++;
+   }
+
+  /* If LUI+ADDI+BSETI resulted in a more efficient
+sequence, then use it.  */
+  if (i < cost)
+   {
+ memcpy (codes, alt_codes, sizeof (alt_codes));
+ cost = i;
}
 }
 
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 6f1c67bf3f7..dddb7f8d673 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -869,8 +869,10 @@ riscv_build_integer_1 (struct riscv_integer_op codes[RISCV_MAX_INTEGER_OPS],
  codes[1].use_uw = false;
  cost = 2;
}
+}
+
   /* Final cases, particularly focused on bseti.  */
-  else if (cost > 2 && TARGET_ZBS)
+  if (cost > 2 && TARGET_ZBS)
 {
   int i = 0;
 
@@ -914,7 +916,6 @@ riscv_build_integer_1 (struct riscv_integer_op codes[RISCV_MAX_INTEGER_OPS],
  cost = i;
}
 }
-}
 
   gcc_assert (cost <= RISCV_MAX_INTEGER_OPS);
   return cost;


[PATCH] Fortran: improve attribute conflict checking [PR93635]

2024-05-06 Thread Harald Anlauf
Dear all,

I've been contemplating whether to submit the attached patch.
It addresses an ICE-on-invalid as reported in the PR, and also
fixes an accepts-invalid (see testcase), plus maybe some more
related issues caused by incomplete checking of symbol attribute conflicts.

The fix does not fully address the general issue, which is
analyzed by Steve: some of the checks do depend on the selected
Fortran standard, and under circumstances such as in the testcase
the checking of other, standard-version-independent conflicts
simply does not occur.

Steve's solution would fix that, but unfortunately leads to issues
with error recovery in notoriously fragile parts of the FE: e.g.
testcase pr87907.f90 needs adjusting, and minor variations
of it will lead to various other horrendous ICEs reminiscent
of existing PRs where parsing or resolution goes sideways.

I therefore propose a much simpler approach: move - if possible -
selected standard-version-dependent checks after the
version-independent ones.  I think this could help in getting more
consistent error reporting and recovery.  However, I did *not*
move those checks that are critical when processing interfaces.
(-> pr87907.f90 / (sub)modules)
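
To illustrate why the ordering matters, here is a deliberately simplified C model (not the gfortran code; all names in it are made up).  It assumes that a standard-dependent check, once it fires, decides the outcome on its own, so that anything placed after it is never looked at.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the real data structures.  */
enum std_level { STD_F95 = 1, STD_F2003 = 2, STD_F2018 = 3 };
enum verdict { V_OK, V_HARD_CONFLICT, V_STD_CONFLICT };

struct attrs
{
  bool allocatable;
  bool in_namelist;
  bool in_equivalence;
};

/* Old ordering: the standard-dependent check comes first and leaves
   the function either way, so the hard conflict below is skipped.  */
static enum verdict
check_std_first (const struct attrs *a, enum std_level selected)
{
  if (a->in_namelist && a->allocatable)
    return selected >= STD_F2003 ? V_OK : V_STD_CONFLICT;
  if (a->in_equivalence && a->allocatable)
    return V_HARD_CONFLICT;
  return V_OK;
}

/* Proposed ordering: standard-independent conflicts first.  */
static enum verdict
check_std_last (const struct attrs *a, enum std_level selected)
{
  if (a->in_equivalence && a->allocatable)
    return V_HARD_CONFLICT;
  if (a->in_namelist && a->allocatable)
    return selected >= STD_F2003 ? V_OK : V_STD_CONFLICT;
  return V_OK;
}
```

For an allocatable namelist member that also appears in an EQUIVALENCE, the first ordering accepts the code under F2003 and later, while the second reports the hard conflict regardless of the selected standard.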

The patch therefore does not require any testsuite update and
should not give any other surprises, so it should be very safe.
The plan is also to leave the PR open for the time being.

Regtesting on x86_64-pc-linux-gnu.  OK for mainline?

Thanks,
Harald

From c55cb36a6ad00996b5efb33c0c5357fc5fa9919c Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Mon, 6 May 2024 20:57:29 +0200
Subject: [PATCH] Fortran: improve attribute conflict checking [PR93635]

gcc/fortran/ChangeLog:

	PR fortran/93635
	* symbol.cc (gfc_check_conflict): Move some attribute conflict
	checks that depend on the selected version of the Fortran standard
	so that error reporting gets more consistent.

gcc/testsuite/ChangeLog:

	PR fortran/93635
	* gfortran.dg/pr93635.f90: New test.
---
 gcc/fortran/symbol.cc | 30 ---
 gcc/testsuite/gfortran.dg/pr93635.f90 | 19 +
 2 files changed, 32 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/pr93635.f90

diff --git a/gcc/fortran/symbol.cc b/gcc/fortran/symbol.cc
index 8f7deac1d1e..ed17291c53e 100644
--- a/gcc/fortran/symbol.cc
+++ b/gcc/fortran/symbol.cc
@@ -459,22 +459,6 @@ gfc_check_conflict (symbol_attribute *attr, const char *name, locus *where)
   if (where == NULL)
 where = _current_locus;

-  if (attr->pointer && attr->intent != INTENT_UNKNOWN)
-{
-  a1 = pointer;
-  a2 = intent;
-  standard = GFC_STD_F2003;
-  goto conflict_std;
-}
-
-  if (attr->in_namelist && (attr->allocatable || attr->pointer))
-{
-  a1 = in_namelist;
-  a2 = attr->allocatable ? allocatable : pointer;
-  standard = GFC_STD_F2003;
-  goto conflict_std;
-}
-
   /* Check for attributes not allowed in a BLOCK DATA.  */
   if (gfc_current_state () == COMP_BLOCK_DATA)
 {
@@ -579,10 +563,12 @@ gfc_check_conflict (symbol_attribute *attr, const char *name, locus *where)
 return false;

   conf (allocatable, pointer);
+
+  /* Moving these checks past the function/subroutine conflict check may
+ cause trouble with minor variations of testcase pr87907.f90.  */
   conf_std (allocatable, dummy, GFC_STD_F2003);
   conf_std (allocatable, function, GFC_STD_F2003);
   conf_std (allocatable, result, GFC_STD_F2003);
-  conf_std (elemental, recursive, GFC_STD_F2018);

   conf (in_common, dummy);
   conf (in_common, allocatable);
@@ -911,6 +897,16 @@ gfc_check_conflict (symbol_attribute *attr, const char *name, locus *where)
   break;
 }

+  /* Conflict checks depending on the selected version of the Fortran
+ standard are preferably applied after standard-independent ones, so
+ that one gets more consistent error reporting and recovery.  */
+  if (attr->pointer && attr->intent != INTENT_UNKNOWN)
+conf_std (pointer, intent, GFC_STD_F2003);
+
+  conf_std (in_namelist, allocatable, GFC_STD_F2003);
+  conf_std (in_namelist, pointer, GFC_STD_F2003);
+  conf_std (elemental, recursive, GFC_STD_F2018);
+
   return true;

 conflict:
diff --git a/gcc/testsuite/gfortran.dg/pr93635.f90 b/gcc/testsuite/gfortran.dg/pr93635.f90
new file mode 100644
index 000..4ef33fecf2b
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr93635.f90
@@ -0,0 +1,19 @@
+! { dg-do compile }
+! PR fortran/93635
+!
+! Test that some attribute conflicts are properly diagnosed
+
+program p
+  implicit none
+  character(len=:),allocatable :: r,s
+  namelist /args/ r,s
+  equivalence(r,s) ! { dg-error "EQUIVALENCE attribute conflicts with ALLOCATABLE" }
+  allocate(character(len=1024) :: r)
+end
+
+subroutine sub (p, q)
+  implicit none
+  real, pointer, intent(inout) :: p(:), q(:)
+  namelist /nml/ p,q
+  equivalence(p,q) ! { dg-error "EQUIVALENCE attribute conflicts with DUMMY" }
+end
--
2.35.3



[COMMITTED] aarch64: Fix gcc.target/aarch64/sve/loop_add_6.c for LLP64 targets

2024-05-06 Thread Andrew Pinski
Even though the aarch64-mingw32 support has not been committed yet,
we should fix some of the testcases. In this case
gcc.target/aarch64/sve/loop_add_6.c is easy to fix. We should use
__SIZETYPE__ instead of `unsigned long` for the variables that will be
used for pointer plus.

Committed as obvious after a quick test on aarch64-linux-gnu.
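
For context, a plain C sketch of the same idea (the testcase itself is GIMPLE, so this is only an analogy with invented function names): on an LLP64 target `unsigned long` is 32 bits while pointers are 64 bits, so offsets that feed pointer arithmetic are better typed as size_t, which is the type GCC predefines as __SIZETYPE__.

```c
#include <assert.h>
#include <stddef.h>

/* Holds on the usual LP64 and LLP64 ABIs; ISO C does not require it.  */
_Static_assert (sizeof (size_t) == sizeof (void *),
		"size_t is expected to be pointer-sized here");

/* Byte offset of element I, computed in the pointer-sized type.  */
static size_t
byte_offset (int i)
{
  assert (i >= 0);
  return (size_t) i * sizeof (double);
}

/* Load element I through an explicit pointer-plus-offset step.  */
static double
load_at (const double *x, int i)
{
  const char *p = (const char *) x + byte_offset (i);
  return *(const double *) p;
}
```

Written this way, the offset has the same width as the pointer on both LP64 and LLP64 targets.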

gcc/testsuite/ChangeLog:

PR testsuite/114177
* gcc.target/aarch64/sve/loop_add_6.c: Use __SIZETYPE__ instead
of `unsigned long` for index and offset variables.

Signed-off-by: Andrew Pinski 
---
 gcc/testsuite/gcc.target/aarch64/sve/loop_add_6.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gcc.target/aarch64/sve/loop_add_6.c b/gcc/testsuite/gcc.target/aarch64/sve/loop_add_6.c
index e7416ebcded..a530998f54b 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/loop_add_6.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/loop_add_6.c
@@ -5,8 +5,8 @@ double __GIMPLE (ssa, startwith("loop"))
 neg_xi (double *x)
 {
   int i;
-  long unsigned int index;
-  long unsigned int offset;
+  __SIZETYPE__ index;
+  __SIZETYPE__ offset;
   double * xi_ptr;
   double xi;
   double neg_xi;
@@ -20,8 +20,8 @@ neg_xi (double *x)
   res_1 = __PHI (__BB5: 0.0, __BB3: res_2);
   i_4 = __PHI (__BB5: 0, __BB3: i_5);
   ivtmp_6 = __PHI (__BB5: 100U, __BB3: ivtmp_7);
-  index = (long unsigned int) i_4;
-  offset = index * 8UL;
+  index = (__SIZETYPE__ ) i_4;
+  offset = index * _Literal (__SIZETYPE__) 8;
   xi_ptr = x_8(D) + offset;
   xi = *xi_ptr;
   neg_xi = -xi;
-- 
2.43.0



Re: [PATCH v2] Fix auto deduction for template specialization scopes [PR114915]

2024-05-06 Thread Patrick Palka
On Sat, 4 May 2024, Seyed Sajad Kahani wrote:

> The limitations of the initial patch (checking specialization template 
> usage) have been discussed.
> 
> > I realized that for the case where we have a member function template
> > of a class template, and a specialization of the enclosing class only
> > (like below),
> >
> > template <>
> > template 
> > void S::f() {
> >   // some constrained auto
> > }
> >
> > When using S::f, DECL_TEMPLATE_INFO(fn) is non-zero, and
> > DECL_TEMPLATE_SPECIALIZATION(fn) is zero, while
> > DECL_TEMPLATE_SPECIALIZATION(DECL_TI_TEMPLATE(fn)) is non-zero.
> > So it means that the patch will extract DECL_TI_ARGS(fn) as
> > outer_targs, and it would be   while the type of the
> > constrained auto will be as template  ... and will not be
> > dependent on the parameters of the enclosing class.
> > This means that again (outer_targs + targs) will have more depth than
> > auto_node levels.
> > This means that for the case where the function is not an explicit
> > specialization, but it is defined in an explicitly specialized scope,
> > the patch will not work.

Ah yes, good point!  This demonstrates that it doesn't suffice to
handle the TEMPLATE_TYPE_ORIG_LEVEL == 1 case contrary to what I
suggested earlier.

> 
> As described in more detail below, this patch attempts to resolve this issue 
> by trimming full_targs.
> 
> > > Another more context-unaware approach to fix this might be to only
> > > use the innermost level from 'full_targs' for satisfaction if
> > > TEMPLATE_TYPE_ORIG_LEVEL is 1 (which means this constrained auto
> > > appeared in a context that doesn't have template parameters such as an
> > > explicit specialization or ordinary non-template function, and so
> > > only the level corresponding to the deduced type is needed for
> > > satisfaction.)
> > >
> > > Generalizing on that, another approach might be to handle
> > > missing_levels < 0 by removing -missing_levels from full_targs
> > > via get_innermost_template_args.
> > > But AFAICT it should suffice to handle the TEMPLATE_TYPE_ORIG_LEVEL == 1
> > > case specifically.
> >
> > I was unable to understand why you think that it might not handle
> > TEMPLATE_TYPE_ORIG_LEVEL > 1 cases, so I tried to formulate my
> > reasoning as follows.

Yes, sorry about that misleading suggestion.

> >
> > Assuming contexts adc_variable_type, adc_return_type, adc_decomp_type:
> > For any case where missing_level < 0, it means that the type depends
> > on fewer levels than the template arguments used to materialize it.
> > This can only happen when the type is defined in an explicit
> > specialization scope. This explicit specialization might not occur in
> > its immediate scope.
> > Note that partial specialization (while changing the set of
> > parameters) cannot reduce the number of levels for the type.
> > Because of the fact that the enclosing scope of any explicit
> > specialization is explicitly specialized
> > (https://eel.is/c++draft/temp.expl.spec#16), the type will not depend
> > on template parameters outside of its innermost explicit specialized
> > scope.
> > Assuming that there are no real missing levels, by removing those
> > levels, missing_level should be = 0. As a result, by roughly doing
> >
> > if (missing_levels < 0) {
> >   tree trimmed_full_args = get_innermost_template_args(full_targs,
> > TEMPLATE_TYPE_ORIG_LEVEL(auto_node));
> >   full_targs = trimmed_full_args;
> > }
> > in pt.cc:31262, where we calculate and check missing_levels, we would
> > be able to fix the errors.
> > Note that, for the case where there are real missing levels, we are
> > putting irrelevant template arguments for the missing levels instead
> > of make_tree_vec(0). By this change:
> > - If the type is independent of those missing levels: it works fine
> >   either way.
> > - If the type is dependent on those missing levels: Instead of raising
> > an ICE, the compiler exhibits undefined behavior.

Makes sense.
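
Just to pin down the arithmetic, here is a toy C model of the trimming step.  It is not GCC code: the struct, the function names and the level labels are invented for the sketch, with argument levels stored outermost first and `innermost_levels` standing in for get_innermost_template_args.

```c
#include <assert.h>

/* Toy stand-in for a multi-level template argument vector,
   outermost level first.  Not GCC's tree representation.  */
struct targs
{
  const char *const *levels;
  int depth;
};

/* Keep only the innermost N levels.  */
static struct targs
innermost_levels (struct targs full, int n)
{
  assert (n >= 0 && n <= full.depth);
  struct targs r = { full.levels + (full.depth - n), n };
  return r;
}

/* If the arguments are deeper than the level the auto was written at,
   drop the extra outer levels; otherwise leave them alone.  */
static struct targs
trim_for_auto (struct targs full, int auto_orig_level)
{
  int missing_levels = auto_orig_level - full.depth;
  if (missing_levels < 0)
    return innermost_levels (full, auto_orig_level);
  return full;
}

/* Hypothetical labels for a three-level argument vector.  */
static const char *const demo_levels[] = { "enclosing", "member", "deduced" };
```

With three levels of arguments and an auto written at level 2, the model drops the outermost level and keeps the other two.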

> ---
>  gcc/cp/pt.cc  | 14 ++--
>  .../g++.dg/cpp2a/concepts-placeholder14.C | 19 +++
>  .../g++.dg/cpp2a/concepts-placeholder15.C | 15 +
>  .../g++.dg/cpp2a/concepts-placeholder16.C | 33 +++
>  4 files changed, 78 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-placeholder14.C
>  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-placeholder15.C
>  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-placeholder16.C
> 
> diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
> index 3b2106dd3..bdf03a1a7 100644
> --- a/gcc/cp/pt.cc
> +++ b/gcc/cp/pt.cc
> @@ -31044,7 +31044,8 @@ unparenthesized_id_or_class_member_access_p (tree init)
> OUTER_TARGS is used during template argument deduction (context == adc_unify)
> to properly substitute the result.  It's also used in the adc_unify and
> adc_requirement contexts to communicate the necessary template arguments
> -   to satisfaction.  

[PATCH] aarch64: Add fcsel to cmov integer and csel to float cmov [PR98477]

2024-05-06 Thread Andrew Pinski
This patch adds an fcsel alternative to the integer cmov and a csel
alternative to the floating-point cmov, so that we avoid some more
moves between the GPRs and the SIMD registers.

PR target/98477

gcc/ChangeLog:

* config/aarch64/aarch64.md (*cmov_insn[GPI]): Add 'w'
alternative.
(*cmov_insn[GPF]): Add 'r' alternative.
* config/aarch64/iterators.md (wv): New mode attr.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/csel_1.c: New test.
* gcc.target/aarch64/fcsel_2.c: New test.

Signed-off-by: Andrew Pinski 
---
 gcc/config/aarch64/aarch64.md  | 13 +++
 gcc/config/aarch64/iterators.md|  4 
 gcc/testsuite/gcc.target/aarch64/csel_1.c  | 27 ++
 gcc/testsuite/gcc.target/aarch64/fcsel_2.c | 20 
 4 files changed, 59 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/csel_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/fcsel_2.c

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 2bdd443e71d..a6cedd0f1b8 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -4404,6 +4404,7 @@ (define_insn "*cmov_insn"
  [ r, Ui1 , rZ  ; csel] csinc\t%0, %4, zr, %M1
  [ r, UsM , UsM ; mov_imm ] mov\t%0, -1
  [ r, Ui1 , Ui1 ; mov_imm ] mov\t%0, 1
+ [ w, w   , w   ; fcsel   ] fcsel\t%0, %3, %4, %m1
   }
 )
 
@@ -4464,15 +4465,17 @@ (define_insn "*cmovdi_insn_uxtw"
 )
 
 (define_insn "*cmov_insn"
-  [(set (match_operand:GPF 0 "register_operand" "=w")
+  [(set (match_operand:GPF 0 "register_operand" "=r,w")
(if_then_else:GPF
 (match_operator 1 "aarch64_comparison_operator"
  [(match_operand 2 "cc_register" "") (const_int 0)])
-(match_operand:GPF 3 "register_operand" "w")
-(match_operand:GPF 4 "register_operand" "w")))]
+(match_operand:GPF 3 "register_operand" "r,w")
+(match_operand:GPF 4 "register_operand" "r,w")))]
   "TARGET_FLOAT"
-  "fcsel\\t%0, %3, %4, %m1"
-  [(set_attr "type" "fcsel")]
+  "@
+   csel\t%0, %3, %4, %m1
+   fcsel\\t%0, %3, %4, %m1"
+  [(set_attr "type" "csel,fcsel")]
 )
 
 (define_expand "movcc"
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 99cde46f1ba..42303f2ec02 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -1147,6 +1147,10 @@ (define_mode_attr e [(CCFP "") (CCFPE "e")])
 ;; 32-bit version and "%x0" in the 64-bit version.
 (define_mode_attr w [(QI "w") (HI "w") (SI "w") (DI "x") (SF "s") (DF "d")])
 
+;; For cmov template to be used with fcsel instruction
+(define_mode_attr wv [(QI "s") (HI "s") (SI "s") (DI "d") (SF "s") (DF "d")])
+
+
 ;; The size of access, in bytes.
 (define_mode_attr ldst_sz [(SI "4") (DI "8")])
 ;; Likewise for load/store pair.
diff --git a/gcc/testsuite/gcc.target/aarch64/csel_1.c b/gcc/testsuite/gcc.target/aarch64/csel_1.c
new file mode 100644
index 000..5848e5be2ff
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/csel_1.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-ssa-phiopt" } */
+/* PR target/98477 */
+
+/* We should be able to produce csel followed by a store
+   and not move between the GPRs and simd registers. */
+/* Note -fno-ssa-phiopt is needed, otherwise the tree level
+   does the VCE after the cmov which allowed to use the csel
+   instruction. */
+_Static_assert (sizeof(long long) == sizeof(double));
+void
+foo (int a, double *b, long long c, long long d)
+{
+  double ct;
+  double dt;
+  __builtin_memcpy(, , sizeof(long long));
+  __builtin_memcpy(, , sizeof(long long));
+  double t = a ? ct : dt;
+  *b = t;
+}
+
+/* { dg-final { scan-assembler-not "\tfcsel\t"  } } */
+/* { dg-final { scan-assembler-times "\tcsel\t" 1 } } */
+/* The store should still happen from the GPRs */
+/* { dg-final { scan-assembler-not "\tstr\td"  } } */
+/* { dg-final { scan-assembler-times "\tstr\tx" 1 } } */
+/* { dg-final { scan-assembler-not "\tfmov\t" } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/fcsel_2.c b/gcc/testsuite/gcc.target/aarch64/fcsel_2.c
new file mode 100644
index 000..309e8cbe37f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/fcsel_2.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* PR target/98477 */
+
+#define vector16 __attribute__((vector_size(16)))
+/* We should be able to produce fcsel followed by a store
+   and not move between the GPRs and simd registers. */
+void
+foo (int a, int *b, vector16 int c, vector16 int d)
+{
+  int t = a ? c[0] : d[0];
+  *b = t;
+}
+
+/* { dg-final { scan-assembler-times "\tfcsel\t" 1 } } */
+/* { dg-final { scan-assembler-not "\tcsel\t" } } */
+/* The store should still happen from the simd register */
+/* { dg-final { scan-assembler-times "\tstr\ts" 1 } } */
+/* { dg-final { scan-assembler-not "\tstr\tw" } } */
+/* { dg-final { scan-assembler-not "\tfmov\t" } } */
-- 
2.43.0



Re: [PATCH v2 1/6] ctf, btf: restructure CTF/BTF emission

2024-05-06 Thread David Faust



On 5/3/24 2:02 PM, Indu Bhagat wrote:
> On 5/2/24 10:11, David Faust wrote:
>> This commit makes some structural changes to the CTF/BTF debug info
>> emission.  In particular:
>>
>>   a) CTF is now always fully generated and emitted before any
>>  BTF-related procedures are run.  This means that BTF-related
>>  functions can change, even irreversibly, the shared in-memory
>>  representation used by the two formats without issue.
>>
>>   b) BTF generation has fewer entry points, and is cleanly divided
>>  into early_finish and finish.
>>
>>   c) BTF is now always emitted at finish (called from dwarf2out_finish),
>>  rather than being emitted at early_finish for targets other than
>>  BPF CO-RE.  Note that this change alone does not alter the contents
>>  of BTF at all, regardless of whether it would have previously been
>>  emitted at early_finish or finish.
>>
> 
> This will necessitate that we disallow -gbtf with -flto for non-BPF 
> targets.  Emitting BTF always at dwarf2out_finish will not work with LTO.

Yes, you're right.  I was not thinking about LTO when I wrote this.

I suppose the obvious fix is to undo "c)" above.  That change is not
really strictly necessary for the rest of the refactor in this series,
though we would lose the fix for PR debug/113566 in patch 4, since
fixing that requires us to generate some of the BTF at late finish time.
Then BTF emission would be the same as it is currently:

BTF for BPF target, non-LTO    -> (late) finish
BTF for BPF target, LTO        -> forbidden by BPF back-end
BTF for non-BPF target         -> early_finish

Alternatively, we could undo "c)" only for LTO builds, i.e.:

BTF for BPF target, non-LTO    -> (late) finish
BTF for BPF target, LTO        -> forbidden by BPF back-end
BTF for other targets, non-LTO -> (late) finish
BTF for other targets, LTO     -> early_finish

which would allow fixing debug/113566 for non-LTO builds.  Then the
differences in BTF related to debug/113566 should only appear between
LTO and non-LTO builds, rather than between BPF and non-BPF targets.

IMO the latter (move emission back to early_finish for LTO builds) is a
little better.
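
For reference, the second table above can be written down as a small
decision function.  This is only a sketch: the enum, its values and
btf_emit_point_for are hypothetical names invented for illustration and
do not exist in GCC; in the series the outcome is determined by which
of the BTF routines ends up called at early or late finish.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names, for illustration only; none of these exist in GCC.  */
enum btf_emit_point
{
  BTF_EMIT_EARLY_FINISH,
  BTF_EMIT_LATE_FINISH,
  BTF_EMIT_FORBIDDEN
};

/* Second proposal: emit at (late) finish unless this is an LTO build.
   BPF + LTO stays rejected by the BPF back-end.  */
enum btf_emit_point
btf_emit_point_for (bool bpf_target, bool lto)
{
  if (bpf_target)
    return lto ? BTF_EMIT_FORBIDDEN : BTF_EMIT_LATE_FINISH;
  return lto ? BTF_EMIT_EARLY_FINISH : BTF_EMIT_LATE_FINISH;
}

/* Check the four rows of the table.  */
void
check_btf_emit_table (void)
{
  assert (btf_emit_point_for (true, false) == BTF_EMIT_LATE_FINISH);
  assert (btf_emit_point_for (true, true) == BTF_EMIT_FORBIDDEN);
  assert (btf_emit_point_for (false, false) == BTF_EMIT_LATE_FINISH);
  assert (btf_emit_point_for (false, true) == BTF_EMIT_EARLY_FINISH);
}
```

Written this way, the only difference from the first table is the
non-BPF, non-LTO row, which moves from early_finish to (late) finish.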

Either way, I think the majority of code related to BTF generation from
later in this series should not need to change much if at all - only
whether those routines end up called at early or late finish.

> 
>> The changes are transparent to both CTF and BTF emission.
>>
>> gcc/
>>  * btfout.cc (btf_init_postprocess): Rename to...
>>  (btf_early_finish): ...this.
>>  (btf_output): Rename to...
>>  (btf_finish): ...this.
>>  * ctfc.h: Analogous changes.
>>  * dwarf2ctf.cc (ctf_debug_early_finish): Conditionally call
>>  btf_early_finish or ctf_finalize as appropriate.
>>  (ctf_debug_finish): Always call btf_finish here if generating
>>  BTF info.
>>  (ctf_debug_finalize, ctf_debug_init_postprocess): Delete.
>>  * dwarf2out.cc (dwarf2out_early_finish): Remove call to
>>  ctf_debug_init_postprocess.
>> ---
>>   gcc/btfout.cc| 28 +
>>   gcc/ctfc.h   |  4 ++--
>>   gcc/dwarf2ctf.cc | 54 +++-
>>   gcc/dwarf2out.cc |  2 --
>>   4 files changed, 42 insertions(+), 46 deletions(-)
>>
>> diff --git a/gcc/btfout.cc b/gcc/btfout.cc
>> index 07f066a4706..1b6a9ed811f 100644
>> --- a/gcc/btfout.cc
>> +++ b/gcc/btfout.cc
>> @@ -1491,6 +1491,34 @@ btf_finalize (void)
>> tu_ctfc = NULL;
>>   }
>>   
>> +/* Initial entry point of BTF generation, called at early_finish () after
>> +   CTF information has possibly been output.  Translate all CTF information
>> +   to BTF, and do any processing that must be done early, such as creating
>> +   BTF_KIND_FUNC records.  */
>> +
>> +void
>> +btf_early_finish (void)
>> +{
>> +  btf_init_postprocess ();
>> +}
>> +
>> +/* Late entry point for BTF generation, called from dwarf2out_finish ().
>> +   Complete and emit BTF information.  */
>> +
>> +void
>> +btf_finish (const char * filename)
>> +{
>> +  btf_output (filename);
>> +
>> +  /* If compiling for BPF with CO-RE info, we cannot deallocate until after
>> + CO-RE information is created, which happens very late in BPF backend.
>> + Therefore, the deallocation (i.e. btf_finalize ()) is delayed until
>> + TARGET_ASM_FILE_END for BPF CO-RE.  */
>> +  if (!btf_with_core_debuginfo_p ())
>> +btf_finalize ();
>> +}
>> +
>> +
>>   /* Traversal function for all BTF_KIND_FUNC type records.  */
>>   
>>   bool
>> diff --git a/gcc/ctfc.h b/gcc/ctfc.h
>> index fa188bf2f5a..e7bd93901cf 100644
>> --- a/gcc/ctfc.h
>> +++ b/gcc/ctfc.h
>> @@ -384,8 +384,8 @@ extern void ctf_init (void);
>>   extern void ctf_output (const char * filename);
>>   extern void ctf_finalize (void);
>>   
>> -extern void btf_output (const char * filename);
>> -extern void btf_init_postprocess (void);
>> +extern void btf_early_finish (void);
>> +extern void btf_finish (const char * filename);

Re: [PATCH, libgfortran] aix: Fix building fat library for AIX

2024-05-06 Thread FX Coudert
> libgfortran/ChangeLog:
> * config/t-aix (all-local, libcaf_single): Explicitly reference
> caf/.libs/single.o

OK, and sorry for the breakage.

FX



Re: [PATCH] c++: Allow IS_FAKE_BASE_TYPE for union types [PR114954]

2024-05-06 Thread Jason Merrill

On 5/6/24 02:32, Nathaniel Shead wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?

-- >8 --

In some circumstances, unions can also have an __as_base type;


Hmm, even though unions can't be bases I guess that is needed for 
something like


union U {
  int i;
private:
  char c[5];
};

struct A {
  [[no_unique_address]] U u;
  char d;
};

static_assert (sizeof (A) == sizeof (U));

The patch is OK.


we need to make sure that IS_FAKE_BASE_TYPE correctly recognises this.

PR c++/114954

gcc/cp/ChangeLog:

* cp-tree.h (IS_FAKE_BASE_TYPE): Also apply to unions.

gcc/testsuite/ChangeLog:

* g++.dg/modules/pr114954.C: New test.

Signed-off-by: Nathaniel Shead 
---
  gcc/cp/cp-tree.h|  2 +-
  gcc/testsuite/g++.dg/modules/pr114954.C | 14 ++
  2 files changed, 15 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/modules/pr114954.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 933504b4821..fa24217eb2b 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -2616,7 +2616,7 @@ struct GTY(()) lang_type {
  
  /* True iff NODE is the CLASSTYPE_AS_BASE version of some type.  */

  #define IS_FAKE_BASE_TYPE(NODE)   \
-  (TREE_CODE (NODE) == RECORD_TYPE \
+  (RECORD_OR_UNION_TYPE_P (NODE)   \
 && TYPE_CONTEXT (NODE) && CLASS_TYPE_P (TYPE_CONTEXT (NODE))   \
 && CLASSTYPE_AS_BASE (TYPE_CONTEXT (NODE)) == (NODE))
  
diff --git a/gcc/testsuite/g++.dg/modules/pr114954.C b/gcc/testsuite/g++.dg/modules/pr114954.C

new file mode 100644
index 000..a9787140808
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/pr114954.C
@@ -0,0 +1,14 @@
+// PR c++/114954
+// { dg-additional-options "-fmodules-ts" }
+// { dg-module-cmi main }
+
+export module main;
+
+template <int N>
+union U {
+private:
+  char a[N + 1];
+  int b;
+};
+
+U<4> p;




ping [PATCH] contrib: add cxx-dr-table.sh

2024-05-06 Thread Marek Polacek
A patch to keep the C++ DR table up-to-date:


Marek



Re: [PATCH] Fix PR c++/105760: ICE in build_deduction_guide for invalid template

2024-05-06 Thread Jason Merrill

On 5/6/24 09:20, Simon Martin wrote:

Hi,

We currently ICE upon the following invalid snippet because we fail to 
properly handle tsubst_arg_types returning error_mark_node in 
build_deduction_guide.


== cut ==
template
struct A { A(Ts...); };
A a;
== cut ==

This patch fixes this, and has been successfully tested on 
x86_64-pc-linux-gnu. OK for trunk?


OK, thanks.


Thanks!

-- Simon

diff --git a/gcc/cp/ChangeLog b/gcc/cp/ChangeLog
index a78d9d546d6..9acef73e7ac 100644
--- a/gcc/cp/ChangeLog
+++ b/gcc/cp/ChangeLog
@@ -1,3 +1,9 @@
+2024-05-06  Simon Martin  
+
+   PR c++/105760
+   * pt.cc (build_deduction_guide): Check for error_mark_node
+   result from tsubst_arg_types.
+
  2024-05-03  Jason Merrill  

     PR c++/114935
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index d68d688016d..da5d9b8a665 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -30018,6 +30018,8 @@ build_deduction_guide (tree type, tree ctor, 
tree outer_args, tsubst_flags_t com

  references to members of an unknown specialization.  */
   cp_evaluated ev;
   fparms = tsubst_arg_types (fparms, targs, NULL_TREE, 
complain, ctor);

+ if (fparms == error_mark_node)
+   ok = false;
   fargs = tsubst (fargs, targs, complain, ctor);
   if (ci)
     {
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 03c88bbed07..8c606a8fb4f 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2024-05-06  Simon Martin  
+
+   PR c++/105760
+   * g++.dg/parse/error66.C: New test.
+
  2024-05-05  Harald Anlauf  

     PR fortran/114827
diff --git a/gcc/testsuite/g++.dg/parse/error66.C b/gcc/testsuite/g++.dg/parse/error66.C

new file mode 100644
index 000..82f4b8b8a53
--- /dev/null
+++ b/gcc/testsuite/g++.dg/parse/error66.C
@@ -0,0 +1,6 @@
+// PR c++/105760
+// { dg-do compile { target c++17 } }
+
+template // { dg-error "must be at the end of the 
template parameter list" }

+struct A { A(Ts...); };
+A a;





Re: [PATCH] contrib/gcc-changelog/git_check_commit.py: Implement --num-commits

2024-05-06 Thread Ken Matsui
On Mon, May 6, 2024 at 9:20 AM Jason Merrill  wrote:
>
> On 5/6/24 09:25, Ken Matsui wrote:
> > On Thu, Mar 14, 2024 at 12:57 AM Ken Matsui  
> > wrote:
> >>
> >> On Fri, Mar 8, 2024 at 8:42 AM Patrick Palka  wrote:
> >>>
> >>> On Wed, 28 Feb 2024, Ken Matsui wrote:
> >>>
>  This patch implements a --num-commits (-n) flag for shorthand for
>  the range of hash~N..hash commits.
> >>
> >> Ping.
> >
> > Ping.  Ok for trunk?
>
> OK.

Thank you!

>
> 
>  contrib/ChangeLog:
> 
> * gcc-changelog/git_check_commit.py: Implement --num-commits.
> >>>
> >>> LGTM
> >>>
> 
>  Signed-off-by: Ken Matsui 
>  ---
>    contrib/gcc-changelog/git_check_commit.py | 15 +++
>    1 file changed, 15 insertions(+)
> 
>  diff --git a/contrib/gcc-changelog/git_check_commit.py 
>  b/contrib/gcc-changelog/git_check_commit.py
>  index 8cca9f439a5..22e032e8b38 100755
>  --- a/contrib/gcc-changelog/git_check_commit.py
>  +++ b/contrib/gcc-changelog/git_check_commit.py
>  @@ -22,6 +22,12 @@ import argparse
> 
>    from git_repository import parse_git_revisions
> 
>  +def nonzero_uint(value):
>  +ivalue = int(value)
>  +if ivalue <= 0:
>  +raise argparse.ArgumentTypeError('%s is not a non-zero positive 
>  integer' % value)
>  +return ivalue
>  +
>    parser = argparse.ArgumentParser(description='Check git ChangeLog 
>  format '
> 'of a commit')
>    parser.add_argument('revisions', default='HEAD', nargs='?',
>  @@ -33,8 +39,17 @@ parser.add_argument('-p', '--print-changelog', 
>  action='store_true',
>    help='Print final changelog entires')
>    parser.add_argument('-v', '--verbose', action='store_true',
>    help='Print verbose information')
>  +parser.add_argument('-n', '--num-commits', type=nonzero_uint, default=1,
>  +help='Number of commits to check (i.e. shorthand 
>  for '
>  +'hash~N..hash)')
>    args = parser.parse_args()
> 
>  +if args.num_commits > 1:
>  +if '..' in args.revisions:
>  +print('ERR: --num-commits and range of revisions are mutually 
>  exclusive')
>  +exit(1)
>  +args.revisions = '{0}~{1}..{0}'.format(args.revisions, 
>  args.num_commits)
>  +
>    retval = 0
>    for git_commit in parse_git_revisions(args.git_path, args.revisions):
>    res = 'OK' if git_commit.success else 'FAILED'
>  --
>  2.44.0
> 
> 
> >>>
> >
>


Re: [PATCH] contrib/gcc-changelog/git_check_commit.py: Implement --num-commits

2024-05-06 Thread Jason Merrill

On 5/6/24 09:25, Ken Matsui wrote:

On Thu, Mar 14, 2024 at 12:57 AM Ken Matsui  wrote:


On Fri, Mar 8, 2024 at 8:42 AM Patrick Palka  wrote:


On Wed, 28 Feb 2024, Ken Matsui wrote:


This patch implements a --num-commits (-n) flag for shorthand for
the range of hash~N..hash commits.


Ping.


Ping.  Ok for trunk?


OK.



contrib/ChangeLog:

   * gcc-changelog/git_check_commit.py: Implement --num-commits.


LGTM



Signed-off-by: Ken Matsui 
---
  contrib/gcc-changelog/git_check_commit.py | 15 +++
  1 file changed, 15 insertions(+)

diff --git a/contrib/gcc-changelog/git_check_commit.py b/contrib/gcc-changelog/git_check_commit.py
index 8cca9f439a5..22e032e8b38 100755
--- a/contrib/gcc-changelog/git_check_commit.py
+++ b/contrib/gcc-changelog/git_check_commit.py
@@ -22,6 +22,12 @@ import argparse

  from git_repository import parse_git_revisions

+def nonzero_uint(value):
+ivalue = int(value)
+if ivalue <= 0:
+raise argparse.ArgumentTypeError('%s is not a non-zero positive 
integer' % value)
+return ivalue
+
  parser = argparse.ArgumentParser(description='Check git ChangeLog format '
   'of a commit')
  parser.add_argument('revisions', default='HEAD', nargs='?',
@@ -33,8 +39,17 @@ parser.add_argument('-p', '--print-changelog', 
action='store_true',
  help='Print final changelog entires')
  parser.add_argument('-v', '--verbose', action='store_true',
  help='Print verbose information')
+parser.add_argument('-n', '--num-commits', type=nonzero_uint, default=1,
+help='Number of commits to check (i.e. shorthand for '
+'hash~N..hash)')
  args = parser.parse_args()

+if args.num_commits > 1:
+if '..' in args.revisions:
+print('ERR: --num-commits and range of revisions are mutually 
exclusive')
+exit(1)
+args.revisions = '{0}~{1}..{0}'.format(args.revisions, args.num_commits)
+
  retval = 0
  for git_commit in parse_git_revisions(args.git_path, args.revisions):
  res = 'OK' if git_commit.success else 'FAILED'
--
2.44.0










[PATCH, libgfortran] aix: Fix building fat library for AIX

2024-05-06 Thread David Edelsohn
aix: Fix building fat library for AIX

With the change in subdirectories, the code for libgfortran fat
libraries needs to be adjusted to explicitly reference the
subdirectory.  AIX creates fat library archives, and the compiler
itself can be built as either a 32-bit or 64-bit application and
default code generation.  For the two alternate versions of the
compiler to interoperate, GCC needs to construct the fat libraries
manually.

The Makefile fragment had been trying to leverage as much of the
existing targets and macros as possible.  With the subdirectory change,
the location of single.o is more obscured and cannot be determined
without libtool.  This patch references the location of the real object
file more explicitly.

Utilizing subst seems like overkill and unnecessary obscuration for a
single object file.  Either way, it's digging below the libtool
abstraction layer.

This also fixes Fortran bootstrap on AIX.

Bootstrapped on powerpc-ibm-aix7.3.0.0

libgfortran/ChangeLog:
* config/t-aix (all-local, libcaf_single): Explicitly reference
caf/.libs/single.o

diff --git a/libgfortran/config/t-aix b/libgfortran/config/t-aix
index 0e50501d10e..099fc5d8b3a 100644
--- a/libgfortran/config/t-aix
+++ b/libgfortran/config/t-aix
@@ -7,6 +7,6 @@ ARX=$(shell echo $(AR) | sed -e 's/-X[^ ]*//g')
 all-local:
 	$(ARX) -X$(BITS) rc .libs/$(PACKAGE).a ../ppc$(BITS)/$(PACKAGE)/.libs/$(PACKAGE).so.$(MAJOR)
 	$(ARX) -X$(BITS) rc ../pthread/$(PACKAGE)/.libs/$(PACKAGE).a ../pthread/ppc$(BITS)/$(PACKAGE)/.libs/$(PACKAGE).so.$(MAJOR)
-	$(ARX) -X$(BITS) rc .libs/libcaf_single.a ../ppc$(BITS)/$(PACKAGE)/.libs/$(libcaf_single_la_OBJECTS:.lo=.o)
-	$(ARX) -X$(BITS) rc ../pthread/$(PACKAGE)/.libs/libcaf_single.a ../pthread/ppc$(BITS)/$(PACKAGE)/.libs/$(libcaf_single_la_OBJECTS:.lo=.o)
+	$(ARX) -X$(BITS) rc .libs/libcaf_single.a ../ppc$(BITS)/$(PACKAGE)/caf/.libs/single.o
+	$(ARX) -X$(BITS) rc ../pthread/$(PACKAGE)/.libs/libcaf_single.a ../pthread/ppc$(BITS)/$(PACKAGE)/caf/.libs/single.o
 endif


[PATCH v4 3/3] RISC-V: Implement IFN SAT_ADD for both the scalar and vector

2024-05-06 Thread pan2 . li
From: Pan Li 

This patch depends on the middle-end enabling patches below for scalar and vector.

https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650822.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650823.html

The patch also implements SAT_ADD in the riscv backend as a sample
for both scalar and vector.  Take the vector case below as an
example:

void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n)
{
  unsigned i;

  for (i = 0; i < n; i++)
    out[i] = (x[i] + y[i]) | (- (uint64_t)((uint64_t)(x[i] + y[i]) < x[i]));
}

Before this patch:
vec_sat_add_u64:
  ...
  vsetvli a5,a3,e64,m1,ta,ma
  vle64.v v0,0(a1)
  vle64.v v1,0(a2)
  slli a4,a5,3
  sub a3,a3,a5
  add a1,a1,a4
  add a2,a2,a4
  vadd.vv v1,v0,v1
  vmsgtu.vv   v0,v0,v1
  vmerge.vim  v1,v1,-1,v0
  vse64.v v1,0(a0)
  ...

After this patch:
vec_sat_add_u64:
  ...
  vsetvli a5,a3,e64,m1,ta,ma
  vle64.v v1,0(a1)
  vle64.v v2,0(a2)
  slli a4,a5,3
  sub a3,a3,a5
  add a1,a1,a4
  add a2,a2,a4
  vsaddu.vv   v1,v1,v2  <=  Vector Single-Width Saturating Add
  vse64.v v1,0(a0)
  ...

The following test suites pass with this patch.
* The riscv full regression tests.
* The aarch64 full regression tests.
* The x86 bootstrap tests.
* The x86 full regression tests.

PR target/51492
PR target/112600

gcc/ChangeLog:

* config/riscv/autovec.md (usadd<mode>3): New pattern expand for
the unsigned SAT_ADD in vector mode.
* config/riscv/riscv-protos.h (riscv_expand_usadd): New func decl
to expand usadd<mode>3 pattern.
(expand_vec_usadd): Ditto but for vector.
* config/riscv/riscv-v.cc (emit_vec_saddu): New func impl to emit
the vsadd insn.
(expand_vec_usadd): New func impl to expand usadd<mode>3 for vector.
* config/riscv/riscv.cc (riscv_expand_usadd): New func impl to
expand usadd<mode>3 for scalar.
* config/riscv/riscv.md (usadd<mode>3): New pattern expand for
the unsigned SAT_ADD in scalar mode.
* config/riscv/vector.md: Allow VLS mode for vsaddu.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/vec_sat_binary.h: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-1.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-2.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-3.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-4.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-2.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-3.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-4.c: New test.
* gcc.target/riscv/sat_arith.h: New test.
* gcc.target/riscv/sat_u_add-1.c: New test.
* gcc.target/riscv/sat_u_add-2.c: New test.
* gcc.target/riscv/sat_u_add-3.c: New test.
* gcc.target/riscv/sat_u_add-4.c: New test.
* gcc.target/riscv/sat_u_add-run-1.c: New test.
* gcc.target/riscv/sat_u_add-run-2.c: New test.
* gcc.target/riscv/sat_u_add-run-3.c: New test.
* gcc.target/riscv/sat_u_add-run-4.c: New test.
* gcc.target/riscv/scalar_sat_binary.h: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/autovec.md   | 17 +
 gcc/config/riscv/riscv-protos.h   |  2 +
 gcc/config/riscv/riscv-v.cc   | 16 
 gcc/config/riscv/riscv.cc | 47 
 gcc/config/riscv/riscv.md | 11 +++
 gcc/config/riscv/vector.md| 12 +--
 .../riscv/rvv/autovec/binop/vec_sat_binary.h  | 33 
 .../riscv/rvv/autovec/binop/vec_sat_u_add-1.c | 19 +
 .../riscv/rvv/autovec/binop/vec_sat_u_add-2.c | 20 +
 .../riscv/rvv/autovec/binop/vec_sat_u_add-3.c | 20 +
 .../riscv/rvv/autovec/binop/vec_sat_u_add-4.c | 20 +
 .../rvv/autovec/binop/vec_sat_u_add-run-1.c   | 75 +++
 .../rvv/autovec/binop/vec_sat_u_add-run-2.c   | 75 +++
 .../rvv/autovec/binop/vec_sat_u_add-run-3.c   | 75 +++
 .../rvv/autovec/binop/vec_sat_u_add-run-4.c   | 75 +++
 gcc/testsuite/gcc.target/riscv/sat_arith.h| 31 
 gcc/testsuite/gcc.target/riscv/sat_u_add-1.c  | 19 +
 gcc/testsuite/gcc.target/riscv/sat_u_add-2.c  | 21 ++
 gcc/testsuite/gcc.target/riscv/sat_u_add-3.c  | 18 +
 gcc/testsuite/gcc.target/riscv/sat_u_add-4.c  | 17 +
 .../gcc.target/riscv/sat_u_add-run-1.c| 25 +++
 .../gcc.target/riscv/sat_u_add-run-2.c| 25 +++
 .../gcc.target/riscv/sat_u_add-run-3.c| 25 +++
 .../gcc.target/riscv/sat_u_add-run-4.c| 25 +++
 .../gcc.target/riscv/scalar_sat_binary.h  | 27 +++
 25 files changed, 744 insertions(+), 6 deletions(-)
 create mode 100644 

[PATCH v4 2/3] VECT: Support new IFN SAT_ADD for unsigned vector int

2024-05-06 Thread pan2 . li
From: Pan Li 

This patch depends on the scalar enabling patch below:

https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650822.html

For vectorization, we leverage the existing vect pattern recognition to
find the same pattern as in the scalar case, and let the vectorizer
handle the rest via the standard name usadd<mode>3 in vector mode.
The riscv vector backend has the insn "Vector Single-Width Saturating
Add and Subtract", which can be leveraged when expanding usadd<mode>3
in vector mode.  For example:

void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n)
{
  unsigned i;

  for (i = 0; i < n; i++)
    out[i] = (x[i] + y[i]) | (- (uint64_t)((uint64_t)(x[i] + y[i]) < x[i]));
}

Before this patch:
void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n)
{
  ...
  _80 = .SELECT_VL (ivtmp_78, POLY_INT_CST [2, 2]);
  ivtmp_58 = _80 * 8;
  vect__4.7_61 = .MASK_LEN_LOAD (vectp_x.5_59, 64B, { -1, ... }, _80, 0);
  vect__6.10_65 = .MASK_LEN_LOAD (vectp_y.8_63, 64B, { -1, ... }, _80, 0);
  vect__7.11_66 = vect__4.7_61 + vect__6.10_65;
  mask__8.12_67 = vect__4.7_61 > vect__7.11_66;
  vect__12.15_72 = .VCOND_MASK (mask__8.12_67, { 18446744073709551615, ... }, vect__7.11_66);
  .MASK_LEN_STORE (vectp_out.16_74, 64B, { -1, ... }, _80, 0, vect__12.15_72);
  vectp_x.5_60 = vectp_x.5_59 + ivtmp_58;
  vectp_y.8_64 = vectp_y.8_63 + ivtmp_58;
  vectp_out.16_75 = vectp_out.16_74 + ivtmp_58;
  ivtmp_79 = ivtmp_78 - _80;
  ...
}

After this patch:
void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n)
{
  ...
  _62 = .SELECT_VL (ivtmp_60, POLY_INT_CST [2, 2]);
  ivtmp_46 = _62 * 8;
  vect__4.7_49 = .MASK_LEN_LOAD (vectp_x.5_47, 64B, { -1, ... }, _62, 0);
  vect__6.10_53 = .MASK_LEN_LOAD (vectp_y.8_51, 64B, { -1, ... }, _62, 0);
  vect__12.11_54 = .SAT_ADD (vect__4.7_49, vect__6.10_53);
  .MASK_LEN_STORE (vectp_out.12_56, 64B, { -1, ... }, _62, 0, vect__12.11_54);
  ...
}
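
As a cross-check of the two dumps above, the compare-and-select form
(add, mask = x > sum, VCOND_MASK with all-ones) computes the same
per-element value as the OR/negate form in the source loop.  The sketch
below emulates both in plain scalar C; vec_sat_add_u64_select and
check_vec_sat_add_forms are hypothetical names used only for this
illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* The loop from the commit message, unchanged in behavior.  */
void
vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n)
{
  unsigned i;

  for (i = 0; i < n; i++)
    out[i] = (x[i] + y[i]) | (- (uint64_t)((uint64_t)(x[i] + y[i]) < x[i]));
}

/* Hypothetical scalar emulation of the "before" GIMPLE:
   sum = x + y; mask = x > sum; result = mask ? all-ones : sum.  */
void
vec_sat_add_u64_select (uint64_t *out, const uint64_t *x,
			const uint64_t *y, unsigned n)
{
  unsigned i;

  for (i = 0; i < n; i++)
    {
      uint64_t sum = x[i] + y[i];	/* wraps modulo 2^64 */
      out[i] = x[i] > sum ? UINT64_MAX : sum;
    }
}

/* Compare both forms on a few boundary inputs.  */
void
check_vec_sat_add_forms (void)
{
  uint64_t x[4] = { 0, 1, UINT64_MAX, UINT64_MAX - 1 };
  uint64_t y[4] = { 0, UINT64_MAX, UINT64_MAX, 1 };
  uint64_t a[4], b[4];
  unsigned i;

  vec_sat_add_u64 (a, x, y, 4);
  vec_sat_add_u64_select (b, x, y, 4);
  for (i = 0; i < 4; i++)
    assert (a[i] == b[i]);
}
```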

The following test suites pass with this patch.
* The riscv full regression tests.
* The aarch64 full regression tests.
* The x86 bootstrap tests.
* The x86 full regression tests.

PR target/51492
PR target/112600

gcc/ChangeLog:

* tree-vect-patterns.cc (gimple_unsigned_integer_sat_add): New func
decl generated by match.pd match.
(vect_recog_sat_add_pattern): New func impl to recog the pattern
for unsigned SAT_ADD.

Signed-off-by: Pan Li 
---
 gcc/tree-vect-patterns.cc | 51 +++
 1 file changed, 51 insertions(+)

diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 87c2acff386..8ffcaf71d5c 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -4487,6 +4487,56 @@ vect_recog_mult_pattern (vec_info *vinfo,
   return pattern_stmt;
 }
 
+extern bool gimple_unsigned_integer_sat_add (tree, tree*, tree (*)(tree));
+
+/*
+ * Try to detect saturation add pattern (SAT_ADD), aka below gimple:
+ *   _7 = _4 + _6;
+ *   _8 = _4 > _7;
+ *   _9 = (long unsigned int) _8;
+ *   _10 = -_9;
+ *   _12 = _7 | _10;
+ *
+ * And then simplified to
+ *   _12 = .SAT_ADD (_4, _6);
+ */
+
+static gimple *
+vect_recog_sat_add_pattern (vec_info *vinfo, stmt_vec_info stmt_vinfo,
+   tree *type_out)
+{
+  gimple *last_stmt = STMT_VINFO_STMT (stmt_vinfo);
+
+  if (!is_gimple_assign (last_stmt))
+return NULL;
+
+  tree res_ops[2];
+  tree lhs = gimple_assign_lhs (last_stmt);
+
+  if (gimple_unsigned_integer_sat_add (lhs, res_ops, NULL))
+{
+  tree itype = TREE_TYPE (res_ops[0]);
+  tree vtype = get_vectype_for_scalar_type (vinfo, itype);
+
+  if (vtype != NULL_TREE && direct_internal_fn_supported_p (
+   IFN_SAT_ADD, vtype, OPTIMIZE_FOR_SPEED))
+   {
+ *type_out = vtype;
+ gcall *call = gimple_build_call_internal (IFN_SAT_ADD, 2, res_ops[0],
+   res_ops[1]);
+
+ gimple_call_set_lhs (call, vect_recog_temp_ssa_var (itype, NULL));
+ gimple_call_set_nothrow (call, /* nothrow_p */ false);
+ gimple_set_location (call, gimple_location (last_stmt));
+
+ vect_pattern_detected ("vect_recog_sat_add_pattern", last_stmt);
+ return call;
+   }
+}
+
+  return NULL;
+}
+
 /* Detect a signed division by a constant that wouldn't be
otherwise vectorized:
 
@@ -6987,6 +7037,7 @@ static vect_recog_func vect_vect_recog_func_ptrs[] = {
   { vect_recog_vector_vector_shift_pattern, "vector_vector_shift" },
   { vect_recog_divmod_pattern, "divmod" },
   { vect_recog_mult_pattern, "mult" },
+  { vect_recog_sat_add_pattern, "sat_add" },
   { vect_recog_mixed_size_cond_pattern, "mixed_size_cond" },
   { vect_recog_gcond_pattern, "gcond" },
   { vect_recog_bool_pattern, "bool" },
-- 
2.34.1



[PATCH v4 1/3] Internal-fn: Support new IFN SAT_ADD for unsigned scalar int

2024-05-06 Thread pan2 . li
From: Pan Li 

This patch adds the middle-end representation for saturating add,
i.e. the result of the add is set to the type's maximum on overflow.
It matches a pattern similar to the one below.

SAT_ADD (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x))

Taking uint8_t as an example, we have:

* SAT_ADD (1, 254)   => 255.
* SAT_ADD (1, 255)   => 255.
* SAT_ADD (2, 255)   => 255.
* SAT_ADD (255, 255) => 255.
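
The four uint8_t cases above can be reproduced with a few lines of C.
This is a minimal sketch of the formula with TYPE = uint8_t;
sat_add_u8 and check_sat_add_u8 are hypothetical helper names and are
not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: branchless unsigned saturating add following
   SAT_ADD (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x)).  */
uint8_t
sat_add_u8 (uint8_t x, uint8_t y)
{
  uint8_t sum = (uint8_t) (x + y);	/* wraps modulo 256 */
  uint8_t mask = (uint8_t) -(sum < x);	/* 0xff on overflow, else 0 */
  return sum | mask;
}

/* The four examples from the commit message, plus one non-saturating
   case.  */
void
check_sat_add_u8 (void)
{
  assert (sat_add_u8 (1, 254) == 255);
  assert (sat_add_u8 (1, 255) == 255);
  assert (sat_add_u8 (2, 255) == 255);
  assert (sat_add_u8 (255, 255) == 255);
  assert (sat_add_u8 (3, 4) == 7);
}
```

Note that only the first case reaches 255 without wrapping; in the
other three the wrapped sum is smaller than x, so the mask is all-ones.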

Given the example below for the unsigned scalar integer uint64_t:

uint64_t sat_add_u64 (uint64_t x, uint64_t y)
{
  return (x + y) | (- (uint64_t)((uint64_t)(x + y) < x));
}

Before this patch:
uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
{
  long unsigned int _1;
  _Bool _2;
  long unsigned int _3;
  long unsigned int _4;
  uint64_t _7;
  long unsigned int _10;
  __complex__ long unsigned int _11;

;;   basic block 2, loop depth 0
;;pred:   ENTRY
  _11 = .ADD_OVERFLOW (x_5(D), y_6(D));
  _1 = REALPART_EXPR <_11>;
  _10 = IMAGPART_EXPR <_11>;
  _2 = _10 != 0;
  _3 = (long unsigned int) _2;
  _4 = -_3;
  _7 = _1 | _4;
  return _7;
;;succ:   EXIT

}

After this patch:
uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
{
  uint64_t _7;

;;   basic block 2, loop depth 0
;;pred:   ENTRY
  _7 = .SAT_ADD (x_5(D), y_6(D)); [tail call]
  return _7;
;;succ:   EXIT
}

We perform the transform during widen_mult because the sub-expr of
SAT_ADD would otherwise be optimized to .ADD_OVERFLOW.  We need to try
the .SAT_ADD pattern first and then .ADD_OVERFLOW, or we may never
catch the .SAT_ADD pattern.  Meanwhile, the isel pass runs after
widen_mult, so we cannot perform the .SAT_ADD pattern match there, as
the sub-expr will already have been optimized to .ADD_OVERFLOW.
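
To illustrate why the ordering matters: once the compare in the
source form has been turned into an overflow-checking add, the code
looks like the second function below rather than the first, and a
matcher written for the first shape no longer applies.  This is only a
source-level sketch of the two shapes; the function names are
hypothetical, and __builtin_add_overflow is the GCC/Clang builtin used
here to stand in for .ADD_OVERFLOW.

```c
#include <assert.h>
#include <stdint.h>

/* Shape matched by the new pattern: (x + y) | -((x + y) < x).  */
uint64_t
sat_add_u64_branchless (uint64_t x, uint64_t y)
{
  return (x + y) | (- (uint64_t)((uint64_t)(x + y) < x));
}

/* Shape after the compare has become an overflow-checking add, roughly
   mirroring the "before" dump: sum | -(overflow flag).  */
uint64_t
sat_add_u64_overflow (uint64_t x, uint64_t y)
{
  uint64_t sum;
  uint64_t ovf = __builtin_add_overflow (x, y, &sum);	/* 0 or 1 */
  return sum | -ovf;
}

/* Both shapes compute the same value.  */
void
check_sat_add_u64_shapes (void)
{
  assert (sat_add_u64_branchless (1, 2) == sat_add_u64_overflow (1, 2));
  assert (sat_add_u64_branchless (UINT64_MAX, 1)
	  == sat_add_u64_overflow (UINT64_MAX, 1));
  assert (sat_add_u64_overflow (UINT64_MAX, UINT64_MAX) == UINT64_MAX);
}
```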

The following tests pass with this patch:
1. The riscv full regression tests.
2. The aarch64 full regression tests.
3. The x86 bootstrap tests.
4. The x86 full regression tests.

PR target/51492
PR target/112600

gcc/ChangeLog:

* internal-fn.cc (commutative_binary_fn_p): Add type IFN_SAT_ADD
to the return true switch case(s).
* internal-fn.def (SAT_ADD):  Add new signed optab SAT_ADD.
* match.pd: Add unsigned SAT_ADD match.
* optabs.def (OPTAB_NL): Remove fixed-point limitation for us/ssadd.
* tree-ssa-math-opts.cc (gimple_unsigned_integer_sat_add): New extern
func decl generated in match.pd match.
(match_saturation_arith): New func impl to match the saturation arith.
(math_opts_dom_walker::after_dom_children): Try match saturation
arith.

Signed-off-by: Pan Li 
---
 gcc/internal-fn.cc|  1 +
 gcc/internal-fn.def   |  2 ++
 gcc/match.pd  | 28 
 gcc/optabs.def|  4 ++--
 gcc/tree-ssa-math-opts.cc | 46 +++
 5 files changed, 79 insertions(+), 2 deletions(-)

diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 0a7053c2286..73045ca8c8c 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -4202,6 +4202,7 @@ commutative_binary_fn_p (internal_fn fn)
 case IFN_UBSAN_CHECK_MUL:
 case IFN_ADD_OVERFLOW:
 case IFN_MUL_OVERFLOW:
+case IFN_SAT_ADD:
 case IFN_VEC_WIDEN_PLUS:
 case IFN_VEC_WIDEN_PLUS_LO:
 case IFN_VEC_WIDEN_PLUS_HI:
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 848bb9dbff3..25badbb86e5 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -275,6 +275,8 @@ DEF_INTERNAL_SIGNED_OPTAB_FN (MULHS, ECF_CONST | 
ECF_NOTHROW, first,
 DEF_INTERNAL_SIGNED_OPTAB_FN (MULHRS, ECF_CONST | ECF_NOTHROW, first,
  smulhrs, umulhrs, binary)
 
+DEF_INTERNAL_SIGNED_OPTAB_FN (SAT_ADD, ECF_CONST, first, ssadd, usadd, binary)
+
 DEF_INTERNAL_COND_FN (ADD, ECF_CONST, add, binary)
 DEF_INTERNAL_COND_FN (SUB, ECF_CONST, sub, binary)
 DEF_INTERNAL_COND_FN (MUL, ECF_CONST, smul, binary)
diff --git a/gcc/match.pd b/gcc/match.pd
index d401e7503e6..7058e4cbe29 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3043,6 +3043,34 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
|| POINTER_TYPE_P (itype))
   && wi::eq_p (wi::to_wide (int_cst), wi::max_value (itype))
 
+/* Unsigned Saturation Add */
+(match (usadd_left_part @0 @1)
+ (plus:c @0 @1)
+ (if (INTEGRAL_TYPE_P (type)
+  && TYPE_UNSIGNED (TREE_TYPE (@0))
+  && types_match (type, TREE_TYPE (@0))
+  && types_match (type, TREE_TYPE (@1)))))
+
+(match (usadd_right_part @0 @1)
+ (negate (convert (lt (plus:c @0 @1) @0)))
+ (if (INTEGRAL_TYPE_P (type)
+  && TYPE_UNSIGNED (TREE_TYPE (@0))
+  && types_match (type, TREE_TYPE (@0))
+  && types_match (type, TREE_TYPE (@1)))))
+
+(match (usadd_right_part @0 @1)
+ (negate (convert (gt @0 (plus:c @0 @1))))
+ (if (INTEGRAL_TYPE_P (type)
+  && TYPE_UNSIGNED (TREE_TYPE (@0))
+  && types_match (type, TREE_TYPE (@0))
+  && types_match (type, TREE_TYPE (@1)))))
+
+/* Unsigned saturation add, case 1 (branchless):
+   SAT_U_ADD = (X + Y) | - ((X + Y) < X) or
+   

Re: [V2][PATCH] gcc-14/changes.html: Deprecate a GCC C extension on flexible array members.

2024-05-06 Thread Qing Zhao
Hi, Sebastian,

It looks like the behavior you described is correct.
What is your main concern?  (I am a little confused.)

Qing

On May 6, 2024, at 09:29, Sebastian Huber  
wrote:

On 06.05.24 09:08, Richard Biener wrote:
On Sat, 4 May 2024, Sebastian Huber wrote:
On 07.08.23 16:22, Qing Zhao via Gcc-patches wrote:
Hi,

This is the 2nd version of the patch.
Compared to the 1st version, the only change is to address Richard's
comment on referring to a warning option for diagnosing deprecated behavior.


Okay for committing?

thanks.

Qing

==

*htdocs/gcc-14/changes.html (Caveats): Add notice about deprecating a C
extension about flexible array members.
---
  htdocs/gcc-14/changes.html | 13 -
  1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html
index dad1ba53..eae25f1a 100644
--- a/htdocs/gcc-14/changes.html
+++ b/htdocs/gcc-14/changes.html
@@ -30,7 +30,18 @@ a work-in-progress.
  
  Caveats
  
-  ...
+  C:
+  Support for the GCC extension, a structure containing a C99 flexible
array
+  member, or a union containing such a structure, is not the last field
of
+  another structure, is deprecated. Refer to
+  https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html;>
+  Zero Length Arrays.
+  Any code relying on this extension should be modifed to ensure that
+  C99 flexible array members only end up at the ends of structures.
+  Please use the warning option
+  https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wflex-array-member-not-at-end;>-Wflex-array-member-not-at-end
to
+  identify all such cases in the source code and modify them.
+  
  

I have a question with respect to the static initialization of flexible array
members. According to the documentation this is supported by GCC:

https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html

"GCC allows static initialization of flexible array members. This is
equivalent to defining a new structure containing the original structure
followed by an array of sufficient size to contain the data. E.g. in the
following, f1 is constructed as if it were declared like f2.

struct f1 {
  int x; int y[];
} f1 = { 1, { 2, 3, 4 } };

struct f2 {
  struct f1 f1; int data[3];
} f2 = { { 1 }, { 2, 3, 4 } };
"
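
The documented behaviour quoted above can be exercised with a short
program.  This is a minimal sketch that relies on the GNU extension for
statically initializing a flexible array member (so it is not strict
ISO C); f1_obj, fam_total and check_fam_static_init are hypothetical
names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

struct f1 {
  int x; int y[];
};

/* GNU extension: the flexible array member gets three elements of
   static storage placed right after the structure.  */
struct f1 f1_obj = { 1, { 2, 3, 4 } };

/* Sum x and the three initialized elements of y.  */
int
fam_total (const struct f1 *p)
{
  return p->x + p->y[0] + p->y[1] + p->y[2];
}

void
check_fam_static_init (void)
{
  assert (f1_obj.x == 1);
  assert (f1_obj.y[0] == 2 && f1_obj.y[1] == 3 && f1_obj.y[2] == 4);
  /* sizeof does not count the array elements; they live past the
     offset of y.  */
  assert (sizeof (struct f1) >= offsetof (struct f1, y));
}
```

No dynamic allocation is involved here, which is the property that
matters for the embedded use case.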

However, when I compile this code, I get a warning like this:

flex-array.c:6:13: warning: structure containing a flexible array member is
not at the end of another structure [-Wflex-array-member-not-at-end]
6 |   struct f1 f1; int data[3];
  |

In general, I agree that flexible array members should be at the end, however
the support for static initialization is quite important from my point of view
especially for applications for embedded systems. Here, dynamic allocations
may not be allowed or feasible.
I do not get a diagnostic for this on trunk?  And I agree there shouldn't
be any.

It seems that this warning is not enabled by -Wall and -Wextra. I tried this:

gcc -Wflex-array-member-not-at-end -S -o - flex-array.c
   .file   "flex-array.c"
flex-array.c:6:13: warning: structure containing a flexible array member is not 
at the end of another structure [-Wflex-array-member-not-at-end]
   6 |   struct f1 f1; int data[3];
 | ^~
   .text
   .globl  f1
   .data
   .align 4
   .type   f1, @object
   .size   f1, 16
f1:
   .long   1
   .long   2
   .long   3
   .long   4
   .globl  f2
   .align 16
   .type   f2, @object
   .size   f2, 16
f2:
   .long   1
   .long   2
   .long   3
   .long   4
   .ident  "GCC: (GNU) 15.0.0 20240506 (experimental) [master ec1cdad89af]"
   .section        .note.GNU-stack,"",@progbits

--
embedded brains GmbH & Co. KG
Herr Sebastian HUBER
Dornierstr. 4
82178 Puchheim
Germany
email: sebastian.hu...@embedded-brains.de
phone: +49-89-18 94 741 - 16
fax:   +49-89-18 94 741 - 08

Court of registration: Amtsgericht München
Registration number: HRB 157899
Managing directors authorized to represent the company: Peter Rasmussen, Thomas Dörfler
Our privacy policy can be found here:
https://embedded-brains.de/datenschutzerklaerung/



Re: [V2][PATCH] gcc-14/changes.html: Deprecate a GCC C extension on flexible array members.

2024-05-06 Thread Sebastian Huber

On 06.05.24 09:08, Richard Biener wrote:

On Sat, 4 May 2024, Sebastian Huber wrote:


On 07.08.23 16:22, Qing Zhao via Gcc-patches wrote:

Hi,

This is the 2nd version of the patch.
Compared to the 1st version, the only change is to address Richard's
comment on referring to a warning option for diagnosing deprecated behavior.


Okay for committing?

thanks.

Qing

==

*htdocs/gcc-14/changes.html (Caveats): Add notice about deprecating a C
extension about flexible array members.
---
   htdocs/gcc-14/changes.html | 13 ++++++++++++-
   1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html
index dad1ba53..eae25f1a 100644
--- a/htdocs/gcc-14/changes.html
+++ b/htdocs/gcc-14/changes.html
@@ -30,7 +30,18 @@ a work-in-progress.
   
   Caveats
   
-  ...
+  C:
+  Support for the GCC extension, a structure containing a C99 flexible array
+  member, or a union containing such a structure, is not the last field of
+  another structure, is deprecated. Refer to
+  https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html;>
+  Zero Length Arrays.
+  Any code relying on this extension should be modifed to ensure that
+  C99 flexible array members only end up at the ends of structures.
+  Please use the warning option
+  https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wflex-array-member-not-at-end;>-Wflex-array-member-not-at-end to
+  identify all such cases in the source code and modify them.
+  
   


I have a question with respect to the static initialization of flexible array
members. According to the documentation this is supported by GCC:

https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html

"GCC allows static initialization of flexible array members. This is
equivalent to defining a new structure containing the original structure
followed by an array of sufficient size to contain the data. E.g. in the
following, f1 is constructed as if it were declared like f2.

struct f1 {
   int x; int y[];
} f1 = { 1, { 2, 3, 4 } };

struct f2 {
   struct f1 f1; int data[3];
} f2 = { { 1 }, { 2, 3, 4 } };
"

However, when I compile this code, I get a warning like this:

flex-array.c:6:13: warning: structure containing a flexible array member is
not at the end of another structure [-Wflex-array-member-not-at-end]
 6 |   struct f1 f1; int data[3];
   |

In general, I agree that flexible array members should be at the end, however
the support for static initialization is quite important from my point of view
especially for applications for embedded systems. Here, dynamic allocations
may not be allowed or feasible.


I do not get a diagnostic for this on trunk?  And I agree there shouldn't
be any.


It seems that this warning is not enabled by -Wall and -Wextra. I tried 
this:


gcc -Wflex-array-member-not-at-end -S -o - flex-array.c
.file   "flex-array.c"
flex-array.c:6:13: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
6 |   struct f1 f1; int data[3];
  | ^~
.text
.globl  f1
.data
.align 4
.type   f1, @object
.size   f1, 16
f1:
.long   1
.long   2
.long   3
.long   4
.globl  f2
.align 16
.type   f2, @object
.size   f2, 16
f2:
.long   1
.long   2
.long   3
.long   4
.ident  "GCC: (GNU) 15.0.0 20240506 (experimental) [master ec1cdad89af]"
.section        .note.GNU-stack,"",@progbits

--
embedded brains GmbH & Co. KG
Herr Sebastian HUBER
Dornierstr. 4
82178 Puchheim
Germany
email: sebastian.hu...@embedded-brains.de
phone: +49-89-18 94 741 - 16
fax:   +49-89-18 94 741 - 08

Court of registration: Amtsgericht München
Registration number: HRB 157899
Managing directors authorized to represent the company: Peter Rasmussen, Thomas Dörfler
Our privacy policy can be found here:
https://embedded-brains.de/datenschutzerklaerung/


Re: [PATCH] contrib/gcc-changelog/git_check_commit.py: Implement --num-commits

2024-05-06 Thread Ken Matsui
On Thu, Mar 14, 2024 at 12:57 AM Ken Matsui  wrote:
>
> On Fri, Mar 8, 2024 at 8:42 AM Patrick Palka  wrote:
> >
> > On Wed, 28 Feb 2024, Ken Matsui wrote:
> >
> > > This patch implements a --num-commits (-n) flag as shorthand for
> > > the range of hash~N..hash commits.
>
> Ping.

Ping.  Ok for trunk?



>
> > >
> > > contrib/ChangeLog:
> > >
> > >   * gcc-changelog/git_check_commit.py: Implement --num-commits.
> >
> > LGTM
> >
> > >
> > > Signed-off-by: Ken Matsui 
> > > ---
> > >  contrib/gcc-changelog/git_check_commit.py | 15 +++++++++++++++
> > >  1 file changed, 15 insertions(+)
> > >
> > > diff --git a/contrib/gcc-changelog/git_check_commit.py b/contrib/gcc-changelog/git_check_commit.py
> > > index 8cca9f439a5..22e032e8b38 100755
> > > --- a/contrib/gcc-changelog/git_check_commit.py
> > > +++ b/contrib/gcc-changelog/git_check_commit.py
> > > @@ -22,6 +22,12 @@ import argparse
> > >
> > >  from git_repository import parse_git_revisions
> > >
> > > +def nonzero_uint(value):
> > > +    ivalue = int(value)
> > > +    if ivalue <= 0:
> > > +        raise argparse.ArgumentTypeError('%s is not a non-zero positive integer' % value)
> > > +    return ivalue
> > > +
> > >  parser = argparse.ArgumentParser(description='Check git ChangeLog format '
> > >                                               'of a commit')
> > >  parser.add_argument('revisions', default='HEAD', nargs='?',
> > > @@ -33,8 +39,17 @@ parser.add_argument('-p', '--print-changelog', action='store_true',
> > >                      help='Print final changelog entires')
> > >  parser.add_argument('-v', '--verbose', action='store_true',
> > >                      help='Print verbose information')
> > > +parser.add_argument('-n', '--num-commits', type=nonzero_uint, default=1,
> > > +                    help='Number of commits to check (i.e. shorthand for '
> > > +                    'hash~N..hash)')
> > >  args = parser.parse_args()
> > >
> > > +if args.num_commits > 1:
> > > +    if '..' in args.revisions:
> > > +        print('ERR: --num-commits and range of revisions are mutually exclusive')
> > > +        exit(1)
> > > +    args.revisions = '{0}~{1}..{0}'.format(args.revisions, args.num_commits)
> > > +
> > >  retval = 0
> > >  for git_commit in parse_git_revisions(args.git_path, args.revisions):
> > >      res = 'OK' if git_commit.success else 'FAILED'
> > > --
> > > 2.44.0
> > >
> > >
> >
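
For readers skimming the archive, the range expansion described in the quoted
patch can be tried in isolation.  The following is only a minimal sketch and
not the script from the patch: expand_revisions and build_parser are
hypothetical helper names introduced here for illustration, and the
mutual-exclusion error is raised as ValueError (an assumption made so the
logic can be exercised without a git checkout) rather than printed before
calling exit(1).

```python
import argparse


def nonzero_uint(value):
    """argparse type callable: accept only integers >= 1."""
    ivalue = int(value)
    if ivalue <= 0:
        raise argparse.ArgumentTypeError(
            '%s is not a non-zero positive integer' % value)
    return ivalue


def expand_revisions(revisions, num_commits):
    """Hypothetical helper mirroring the patch's hash~N..hash expansion."""
    if num_commits > 1:
        # A revision that is already a range cannot be combined with -n.
        if '..' in revisions:
            raise ValueError('--num-commits and range of revisions are '
                             'mutually exclusive')
        return '{0}~{1}..{0}'.format(revisions, num_commits)
    # The default of 1 leaves the revision untouched.
    return revisions


def build_parser():
    """Hypothetical, cut-down parser with only the two relevant arguments."""
    parser = argparse.ArgumentParser(description='sketch of --num-commits')
    parser.add_argument('revisions', default='HEAD', nargs='?')
    parser.add_argument('-n', '--num-commits', type=nonzero_uint, default=1)
    return parser


if __name__ == '__main__':
    args = build_parser().parse_args(['-n', '3'])
    print(expand_revisions(args.revisions, args.num_commits))
```

With -n 3 and the default revision this expands to HEAD~3..HEAD, which on a
linear history covers the three most recent commits.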


Re: [PATCH v2] gcc, libcpp: Add warning switch for "#pragma once in main file" [PR89808]

2024-05-06 Thread Ken Matsui
On Thu, Mar 14, 2024 at 1:01 AM Ken Matsui  wrote:
>
> On Sat, Mar 2, 2024 at 5:04 AM Ken Matsui  wrote:
> >
> > This patch adds a warning switch for "#pragma once in main file".  The
> > warning option name is Wpragma-once-outside-header, which is the same
> > as Clang.
>
> Ping.

Ping.  Ok for trunk?



>
> >
> > PR preprocessor/89808
> >
> > gcc/c-family/ChangeLog:
> >
> > * c-opts.cc (c_common_handle_option): Handle
> > OPT_Wpragma_once_outside_header.
> > * c.opt (Wpragma_once_outside_header): Define new option.
> >
> > gcc/ChangeLog:
> >
> > * doc/invoke.texi (Warning Options): Document
> > -Wno-pragma-once-outside-header.
> >
> > libcpp/ChangeLog:
> >
> > * include/cpplib.h (struct cpp_options): Define
> > cpp_warn_pragma_once_outside_header.
> > * directives.cc (do_pragma_once): Use
> > cpp_warn_pragma_once_outside_header.
> > * init.cc (cpp_create_reader): Handle
> > cpp_warn_pragma_once_outside_header.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * g++.dg/Wpragma-once-outside-header.C: New test.
> > * g++.dg/warn/Wno-pragma-once-outside-header.C: New test.
> > * g++.dg/warn/Wpragma-once-outside-header.C: New test.
> >
> > Signed-off-by: Ken Matsui 
> > ---
> >  gcc/c-family/c-opts.cc                             |  9 +++++++++
> >  gcc/c-family/c.opt                                 |  4 ++++
> >  gcc/doc/invoke.texi                                | 10 ++++++++--
> >  gcc/testsuite/g++.dg/Wpragma-once-outside-header.C |  5 +++++
> >  .../g++.dg/warn/Wno-pragma-once-outside-header.C   |  5 +++++
> >  .../g++.dg/warn/Wpragma-once-outside-header.C      |  5 +++++
> >  libcpp/directives.cc                               |  8 ++++++--
> >  libcpp/include/cpplib.h                            |  4 ++++
> >  libcpp/init.cc                                     |  1 +
> >  9 files changed, 47 insertions(+), 4 deletions(-)
> >  create mode 100644 gcc/testsuite/g++.dg/Wpragma-once-outside-header.C
> >  create mode 100644 gcc/testsuite/g++.dg/warn/Wno-pragma-once-outside-header.C
> >  create mode 100644 gcc/testsuite/g++.dg/warn/Wpragma-once-outside-header.C
> >
> > diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
> > index be3058dca63..4edd8c6c515 100644
> > --- a/gcc/c-family/c-opts.cc
> > +++ b/gcc/c-family/c-opts.cc
> > @@ -430,6 +430,15 @@ c_common_handle_option (size_t scode, const char *arg, HOST_WIDE_INT value,
> >cpp_opts->warn_num_sign_change = value;
> >break;
> >
> > +case OPT_Wpragma_once_outside_header:
> > +  if (value == 0)
> > +   cpp_opts->cpp_warn_pragma_once_outside_header = 0;
> > +  else if (kind == DK_ERROR)
> > +   cpp_opts->cpp_warn_pragma_once_outside_header = 2;
> > +  else
> > +   cpp_opts->cpp_warn_pragma_once_outside_header = 1;
> > +  break;
> > +
> >  case OPT_Wunknown_pragmas:
> >/* Set to greater than 1, so that even unknown pragmas in
> >  system headers will be warned about.  */
> > diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
> > index b7a4a1a68e3..6841a5a5e81 100644
> > --- a/gcc/c-family/c.opt
> > +++ b/gcc/c-family/c.opt
> > @@ -1180,6 +1180,10 @@ Wpragmas
> >  C ObjC C++ ObjC++ Var(warn_pragmas) Init(1) Warning
> >  Warn about misuses of pragmas.
> >
> > +Wpragma-once-outside-header
> > +C ObjC C++ ObjC++ Var(warn_pragma_once_outside_header) Init(1) Warning
> > +Warn about #pragma once outside of a header.
> > +
> >  Wprio-ctor-dtor
> >  C ObjC C++ ObjC++ Var(warn_prio_ctor_dtor) Init(1) Warning
> >  Warn if constructor or destructors with priorities from 0 to 100 are used.
> > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > index bdf05be387d..eeb8954bcdf 100644
> > --- a/gcc/doc/invoke.texi
> > +++ b/gcc/doc/invoke.texi
> > @@ -391,8 +391,8 @@ Objective-C and Objective-C++ Dialects}.
> >  -Wpacked  -Wno-packed-bitfield-compat  -Wpacked-not-aligned  -Wpadded
> >  -Wparentheses  -Wno-pedantic-ms-format
> >  -Wpointer-arith  -Wno-pointer-compare  -Wno-pointer-to-int-cast
> > --Wno-pragmas  -Wno-prio-ctor-dtor  -Wredundant-decls
> > --Wrestrict  -Wno-return-local-addr  -Wreturn-type
> > +-Wno-pragmas  -Wno-pragma-once-outside-header  -Wno-prio-ctor-dtor
> > +-Wredundant-decls  -Wrestrict  -Wno-return-local-addr  -Wreturn-type
> >  -Wno-scalar-storage-order  -Wsequence-point
> >  -Wshadow  -Wshadow=global  -Wshadow=local  -Wshadow=compatible-local
> >  -Wno-shadow-ivar
> > @@ -7955,6 +7955,12 @@ Do not warn about misuses of pragmas, such as incorrect parameters,
> >  invalid syntax, or conflicts between pragmas.  See also
> >  @option{-Wunknown-pragmas}.
> >
> > +@opindex Wno-pragma-once-outside-header
> > +@opindex Wpragma-once-outside-header
> > +@item -Wno-pragma-once-outside-header
> > +Do not warn when @code{#pragma once} is used in a file that is not a header
> > +file, such as a main file.
> > +
> >  @opindex 

[PATCH] Fix PR c++/105760: ICE in build_deduction_guide for invalid template

2024-05-06 Thread Simon Martin

Hi,

We currently ICE upon the following invalid snippet because we fail to 
properly handle tsubst_arg_types returning error_mark_node in 
build_deduction_guide.


== cut ==
template
struct A { A(Ts...); };
A a;
== cut ==

This patch fixes this, and has been successfully tested on 
x86_64-pc-linux-gnu. OK for trunk?


Thanks!

-- Simon

diff --git a/gcc/cp/ChangeLog b/gcc/cp/ChangeLog
index a78d9d546d6..9acef73e7ac 100644
--- a/gcc/cp/ChangeLog
+++ b/gcc/cp/ChangeLog
@@ -1,3 +1,9 @@
+2024-05-06  Simon Martin  
+
+   PR c++/105760
+   * pt.c (build_deduction_guide): Check for error_mark_node
+   result from tsubst_arg_types.
+
 2024-05-03  Jason Merrill  

PR c++/114935
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index d68d688016d..da5d9b8a665 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -30018,6 +30018,8 @@ build_deduction_guide (tree type, tree ctor, tree outer_args, tsubst_flags_t com
 references to members of an unknown specialization.  */
  cp_evaluated ev;
  fparms = tsubst_arg_types (fparms, targs, NULL_TREE, complain, ctor);
+ if (fparms == error_mark_node)
+   ok = false;
  fargs = tsubst (fargs, targs, complain, ctor);
  if (ci)
{
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 03c88bbed07..8c606a8fb4f 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2024-05-06  Simon Martin  
+
+   PR c++/105760
+   * g++.dg/parse/error66.C: New test.
+
 2024-05-05  Harald Anlauf  

PR fortran/114827
diff --git a/gcc/testsuite/g++.dg/parse/error66.C b/gcc/testsuite/g++.dg/parse/error66.C
new file mode 100644
index 000..82f4b8b8a53
--- /dev/null
+++ b/gcc/testsuite/g++.dg/parse/error66.C
@@ -0,0 +1,6 @@
+// PR c++/105760
+// { dg-do compile { target c++17 } }
+
+template // { dg-error "must be at the end of the template parameter list" }
+struct A { A(Ts...); };
+A a;


[PATCH] tree-optimization/100923 - re-do VN with contextual PTA info fix

2024-05-06 Thread Richard Biener
The following implements the gist of the PR100923 fix in a leaner
(and more complete) way by realizing that all ao_ref_init_from_vn_reference
uses need to have an SSA name in the base valueized with availability
in mind.  Instead of re-valueizing the whole chain of operands we can
simply only and always valueize the SSA name we put in the base.

This handles also two omitted places in vn_reference_lookup_3.

Bootstrapped and tested on x86_64-unknown-linux-gnu, will push later.

Richard.

PR tree-optimization/100923
* tree-ssa-sccvn.cc (ao_ref_init_from_vn_reference): Valueize
base SSA_NAME.
(vn_reference_lookup_3): Adjust vn_context_bb around calls
to ao_ref_init_from_vn_reference.
(vn_reference_lookup_pieces): Revert original PR100923 fix.
(vn_reference_lookup): Likewise.
---
 gcc/tree-ssa-sccvn.cc | 58 +++
 1 file changed, 25 insertions(+), 33 deletions(-)

diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
index fbbfa557833..726e9d88b8f 100644
--- a/gcc/tree-ssa-sccvn.cc
+++ b/gcc/tree-ssa-sccvn.cc
@@ -1201,11 +1201,17 @@ ao_ref_init_from_vn_reference (ao_ref *ref,
case STRING_CST:
  /* This can show up in ARRAY_REF bases.  */
case INTEGER_CST:
-   case SSA_NAME:
  *op0_p = op->op0;
  op0_p = NULL;
  break;
 
+   case SSA_NAME:
+ /* SSA names we have to get at one available since it contains
+flow-sensitive info.  */
+ *op0_p = vn_valueize (op->op0);
+ op0_p = NULL;
+ break;
+
/* And now the usual component-reference style ops.  */
case BIT_FIELD_REF:
  offset += wi::to_poly_offset (op->op1);
@@ -2725,7 +2731,6 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void *data_,
  copy_reference_ops_from_ref (lhs, &lhs_ops);
  valueize_refs_1 (&lhs_ops, &valueized_anything, true);
}
-  vn_context_bb = saved_rpo_bb;
   ao_ref_init (&lhs_ref, lhs);
   lhs_ref_ok = true;
   if (valueized_anything
@@ -2734,9 +2739,11 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void *data_,
ao_ref_base_alias_set (&lhs_ref), TREE_TYPE (lhs), lhs_ops)
  && !refs_may_alias_p_1 (ref, &lhs_ref, data->tbaa_p))
{
{
+ vn_context_bb = saved_rpo_bb;
  *disambiguate_only = TR_VALUEIZE_AND_DISAMBIGUATE;
  return NULL;
}
+  vn_context_bb = saved_rpo_bb;
 
   /* When the def is a CLOBBER we can optimistically disambiguate
 against it since any overlap it would be undefined behavior.
@@ -3634,13 +3641,19 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void *data_,
   /* Adjust *ref from the new operands.  */
   ao_ref rhs1_ref;
   ao_ref_init (&rhs1_ref, rhs1);
+  basic_block saved_rpo_bb = vn_context_bb;
+  vn_context_bb = gimple_bb (def_stmt);
   if (!ao_ref_init_from_vn_reference (&r,
  force_no_tbaa ? 0
  : ao_ref_alias_set (&rhs1_ref),
  force_no_tbaa ? 0
  : ao_ref_base_alias_set (&rhs1_ref),
  vr->type, vr->operands))
-   return (void *)-1;
+   {
+ vn_context_bb = saved_rpo_bb;
+ return (void *)-1;
+   }
+  vn_context_bb = saved_rpo_bb;
   /* This can happen with bitfields.  */
   if (maybe_ne (ref->size, r.size))
{
@@ -3839,8 +3852,14 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void *data_,
return data->finish (0, 0, val);
 
   /* Adjust *ref from the new operands.  */
+  basic_block saved_rpo_bb = vn_context_bb;
+  vn_context_bb = gimple_bb (def_stmt);
   if (!ao_ref_init_from_vn_reference (&r, 0, 0, vr->type, vr->operands))
-   return (void *)-1;
+   {
+ vn_context_bb = saved_rpo_bb;
+ return (void *)-1;
+   }
+  vn_context_bb = saved_rpo_bb;
   /* This can happen with bitfields.  */
   if (maybe_ne (ref->size, r.size))
return (void *)-1;
@@ -3928,31 +3947,13 @@ vn_reference_lookup_pieces (tree vuse, alias_set_type set,
   unsigned limit = param_sccvn_max_alias_queries_per_access;
   vn_walk_cb_data data (&vr1, NULL_TREE, NULL, kind, true, NULL_TREE,
false);
-  vec<vn_reference_op_s> ops_for_ref;
-  if (!valueized_p)
-   ops_for_ref = vr1.operands;
-  else
-   {
- /* For ao_ref_from_mem we have to ensure only available SSA names
-end up in base and the only convenient way to make this work
-for PRE is to re-valueize with that in mind.  */
- ops_for_ref.create (operands.length ());
- ops_for_ref.quick_grow (operands.length ());
- memcpy (ops_for_ref.address (),
- operands.address (),
- sizeof (vn_reference_op_s)
- * operands.length ());
- 

[PATCH] Complete ao_ref_init_from_vn_reference for all refs

2024-05-06 Thread Richard Biener
This makes sure we can create ao_refs from all VN operands we create.

Bootstrapped and tested on x86_64-unknown-linux-gnu.  Will push later.

Richard.

* tree-ssa-sccvn.cc (ao_ref_init_from_vn_reference): Add
TARGET_MEM_REF support.  Handle more bases.
---
 gcc/tree-ssa-sccvn.cc | 51 ---
 1 file changed, 33 insertions(+), 18 deletions(-)

diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
index 02c3bd5f538..fbbfa557833 100644
--- a/gcc/tree-ssa-sccvn.cc
+++ b/gcc/tree-ssa-sccvn.cc
@@ -1148,8 +1148,29 @@ ao_ref_init_from_vn_reference (ao_ref *ref,
 {
   switch (op->opcode)
{
-   /* These may be in the reference ops, but we cannot do anything
-  sensible with them here.  */
+   case CALL_EXPR:
+ return false;
+
+   /* Record the base objects.  */
+   case MEM_REF:
+ *op0_p = build2 (MEM_REF, op->type,
+  NULL_TREE, op->op0);
+ MR_DEPENDENCE_CLIQUE (*op0_p) = op->clique;
+ MR_DEPENDENCE_BASE (*op0_p) = op->base;
+ op0_p = &TREE_OPERAND (*op0_p, 0);
+ break;
+
+   case TARGET_MEM_REF:
+ *op0_p = build5 (TARGET_MEM_REF, op->type,
+  NULL_TREE, op->op2, op->op0,
+  op->op1, ops[i+1].op0);
+ MR_DEPENDENCE_CLIQUE (*op0_p) = op->clique;
+ MR_DEPENDENCE_BASE (*op0_p) = op->base;
+ op0_p = &TREE_OPERAND (*op0_p, 0);
+ ++i;
+ break;
+
+   /* Unwrap some of the wrapped decls.  */
case ADDR_EXPR:
  /* Apart from ADDR_EXPR arguments to MEM_REF.  */
  if (base != NULL_TREE
@@ -1170,21 +1191,16 @@ ao_ref_init_from_vn_reference (ao_ref *ref,
  break;
}
  /* Fallthru.  */
-   case CALL_EXPR:
- return false;
-
-   /* Record the base objects.  */
-   case MEM_REF:
- *op0_p = build2 (MEM_REF, op->type,
-  NULL_TREE, op->op0);
- MR_DEPENDENCE_CLIQUE (*op0_p) = op->clique;
- MR_DEPENDENCE_BASE (*op0_p) = op->base;
- op0_p = &TREE_OPERAND (*op0_p, 0);
- break;
-
-   case VAR_DECL:
case PARM_DECL:
+   case CONST_DECL:
case RESULT_DECL:
+ /* ???  We shouldn't see these, but un-canonicalize what
+copy_reference_ops_from_ref does when visiting MEM_REF.  */
+   case VAR_DECL:
+ /* ???  And for this only have DECL_HARD_REGISTER.  */
+   case STRING_CST:
+ /* This can show up in ARRAY_REF bases.  */
+   case INTEGER_CST:
case SSA_NAME:
  *op0_p = op->op0;
  op0_p = NULL;
@@ -1234,13 +1250,12 @@ ao_ref_init_from_vn_reference (ao_ref *ref,
case VIEW_CONVERT_EXPR:
  break;
 
-   case STRING_CST:
-   case INTEGER_CST:
+   case POLY_INT_CST:
case COMPLEX_CST:
case VECTOR_CST:
case REAL_CST:
+   case FIXED_CST:
case CONSTRUCTOR:
-   case CONST_DECL:
  return false;
 
default:
-- 
2.35.3


Re: [PATCH v2] testsuite: Verify r0-r3 are extended with CMSE

2024-05-06 Thread Torbjorn SVENSSON

Hi,

Forgot to mention when I sent the patch that I would like to commit it 
to the following branches:


- releases/gcc-11
- releases/gcc-12
- releases/gcc-13
- releases/gcc-14
- trunk

Kind regards,
Torbjörn

On 2024-05-02 12:50, Torbjörn SVENSSON wrote:

Add regression test to the existing zero/sign extend tests for CMSE to
verify that r0, r1, r2 and r3 are properly extended, not just r0.

boolCharShortEnumSecureFunc test is done using -O0 to ensure the
instructions are in a predictable order.

gcc/testsuite/ChangeLog:

* gcc.target/arm/cmse/extend-param.c: Add regression test. Add
  -fshort-enums.
* gcc.target/arm/cmse/extend-return.c: Add -fshort-enums option.

Signed-off-by: Torbjörn SVENSSON 
---
  .../gcc.target/arm/cmse/extend-param.c| 21 +++
  .../gcc.target/arm/cmse/extend-return.c   |  4 ++--
  2 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/cmse/extend-param.c b/gcc/testsuite/gcc.target/arm/cmse/extend-param.c
index 01fac786238..d01ef87e0be 100644
--- a/gcc/testsuite/gcc.target/arm/cmse/extend-param.c
+++ b/gcc/testsuite/gcc.target/arm/cmse/extend-param.c
@@ -1,5 +1,5 @@
  /* { dg-do compile } */
-/* { dg-options "-mcmse" } */
+/* { dg-options "-mcmse -fshort-enums" } */
  /* { dg-final { check-function-bodies "**" "" "" } } */
  
  #include 

@@ -78,7 +78,6 @@ __attribute__((cmse_nonsecure_entry)) char enumSecureFunc (enum offset index) {
if (index >= ARRAY_SIZE)
  return 0;
return array[index];
-
  }
  
  /*

@@ -88,9 +87,23 @@ __attribute__((cmse_nonsecure_entry)) char enumSecureFunc (enum offset index) {
  **...
  */
  __attribute__((cmse_nonsecure_entry)) char boolSecureFunc (bool index) {
-
if (index >= ARRAY_SIZE)
  return 0;
return array[index];
+}
  
-}
\ No newline at end of file
+/*
+**__acle_se_boolCharShortEnumSecureFunc:
+** ...
+** uxtb r0, r0
+** uxtb r1, r1
+** uxth r2, r2
+** uxtb r3, r3
+** ...
+*/
+__attribute__((cmse_nonsecure_entry,optimize(0))) char boolCharShortEnumSecureFunc (bool a, unsigned char b, unsigned short c, enum offset d) {
+  size_t index = a + b + c + d;
+  if (index >= ARRAY_SIZE)
+return 0;
+  return array[index];
+}
diff --git a/gcc/testsuite/gcc.target/arm/cmse/extend-return.c b/gcc/testsuite/gcc.target/arm/cmse/extend-return.c
index cf731ed33df..081de0d699f 100644
--- a/gcc/testsuite/gcc.target/arm/cmse/extend-return.c
+++ b/gcc/testsuite/gcc.target/arm/cmse/extend-return.c
@@ -1,5 +1,5 @@
  /* { dg-do compile } */
-/* { dg-options "-mcmse" } */
+/* { dg-options "-mcmse -fshort-enums" } */
  /* { dg-final { check-function-bodies "**" "" "" } } */
  
  #include 

@@ -89,4 +89,4 @@ unsigned char __attribute__((noipa)) enumNonsecure0 (ns_enum_foo_t * ns_foo_p)
  unsigned char boolNonsecure0 (ns_bool_foo_t * ns_foo_p)
  {
return ns_foo_p ();
-}
\ No newline at end of file
+}


[PING gcc-14?][PATCH v2] docs: Update function multiversioning documentation

2024-05-06 Thread Andrew Carlotti
Is this patch ok? I was hoping to get it merged before 14.1 releases, if it's
not yet too late for that.

On Tue, Apr 30, 2024 at 05:10:45PM +0100, Andrew Carlotti wrote:
> Add target_version attribute to Common Function Attributes and update
> target and target_clones documentation.  Move shared detail and examples
> to the Function Multiversioning page.  Add target-specific details to
> target-specific pages.
> 
> ---
> 
> Changes since v1:
> - Various typo fixes.
> - Reordered content in 'Function multiversioning' section to put implementation
>   details at the end (as suggested in review).
> - Dropped links to outdated wiki page, and a couple of other unhelpful
>   sentences that the previous version preserved.
> 
> I've built and rechecked the info output.  Ok for master?  And is this ok for
> the GCC-14 branch too?
> 
> gcc/ChangeLog:
> 
>   * doc/extend.texi (Common Function Attributes): Update target
>   and target_clones documentation, and add target_version.
>   (AArch64 Function Attributes): Add ACLE reference and list
>   supported features.
>   (PowerPC Function Attributes): List supported features.
>   (x86 Function Attributes): Mention function multiversioning.
>   (Function Multiversioning): Update, and move shared detail here.
> 
> 
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index e290265d68d33f86a7e7ee9882cc0fd6bed00143..fefac70b5fffc350bf23db74a8fc88fa3bb99bd5 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -4178,17 +4178,16 @@ and @option{-Wanalyzer-tainted-size}.
>  Multiple target back ends implement the @code{target} attribute
>  to specify that a function is to
>  be compiled with different target options than specified on the
> -command line.  The original target command-line options are ignored.
> -One or more strings can be provided as arguments.
> -Each string consists of one or more comma-separated suffixes to
> -the @code{-m} prefix jointly forming the name of a machine-dependent
> -option.  @xref{Submodel Options,,Machine-Dependent Options}.
> -
> +command line.  One or more strings can be provided as arguments.
>  The @code{target} attribute can be used for instance to have a function
>  compiled with a different ISA (instruction set architecture) than the
> -default.  @samp{#pragma GCC target} can be used to specify target-specific
> -options for more than one function.  @xref{Function Specific Option Pragmas},
> -for details about the pragma.
> +default.
> +
> +The options supported by the @code{target} attribute are specific to each
> +target; refer to @ref{x86 Function Attributes}, @ref{PowerPC Function
> +Attributes}, @ref{ARM Function Attributes}, @ref{AArch64 Function Attributes},
> +@ref{Nios II Function Attributes}, and @ref{S/390 Function Attributes}
> +for details.
>  
>  For instance, on an x86, you could declare one function with the
>  @code{target("sse4.1,arch=core2")} attribute and another with
> @@ -4211,39 +4210,26 @@ multiple options is equivalent to separating the option suffixes with
>  a comma (@samp{,}) within a single string.  Spaces are not permitted
>  within the strings.
>  
> -The options supported are specific to each target; refer to @ref{x86
> -Function Attributes}, @ref{PowerPC Function Attributes},
> -@ref{ARM Function Attributes}, @ref{AArch64 Function Attributes},
> -@ref{Nios II Function Attributes}, and @ref{S/390 Function Attributes}
> -for details.
> +@samp{#pragma GCC target} can be used to specify target-specific
> +options for more than one function.  @xref{Function Specific Option Pragmas},
> +for details about the pragma.
> +
> +On x86, the @code{target} attribute can also be used to create multiple
> +versions of a function, compiled with different target-specific options.
> +@xref{Function Multiversioning} for more details.
>  
>  @cindex @code{target_clones} function attribute
>  @item target_clones (@var{options})
>  The @code{target_clones} attribute is used to specify that a function
> -be cloned into multiple versions compiled with different target options
> -than specified on the command line.  The supported options and restrictions
> -are the same as for @code{target} attribute.
> -
> -For instance, on an x86, you could compile a function with
> -@code{target_clones("sse4.1,avx")}.  GCC creates two function clones,
> -one compiled with @option{-msse4.1} and another with @option{-mavx}.
> -
> -On a PowerPC, you can compile a function with
> -@code{target_clones("cpu=power9,default")}.  GCC will create two
> -function clones, one compiled with @option{-mcpu=power9} and another
> -with the default options.  GCC must be configured to use GLIBC 2.23 or
> -newer in order to use the @code{target_clones} attribute.
> -
> -It also creates a resolver function (see
> -the @code{ifunc} attribute above) that dynamically selects a clone
> -suitable for current architecture.  The resolver is created only if there
> -is a usage of a function with 

Re: [PATCH] middle-end/114931 - type_hash_canon and structual equality types

2024-05-06 Thread Martin Uecker
On Monday, 06.05.2024 at 11:07 +0200, Richard Biener wrote:
> On Mon, 6 May 2024, Martin Uecker wrote:
> 
> > On Monday, 06.05.2024 at 09:00 +0200, Richard Biener wrote:
> > > On Sat, 4 May 2024, Martin Uecker wrote:
> > > 
> > > > On Friday, 03.05.2024 at 21:16 +0200, Jakub Jelinek wrote:
> > > > > > On Fri, May 03, 2024 at 09:11:20PM +0200, Martin Uecker wrote:
> > > > > > > > > > TYPE_CANONICAL as used by the middle-end cannot express 
> > > > > > > > > > this but
> > > > > > > > 
> > > > > > > > Hm. so how does it work now for arrays?
> > > > > > 
> > > > > > Do you have a testcase which doesn't work correctly with the arrays?
> > > > 
> > > > I am mostly trying to understand better how this works. But
> > > > if I am not mistaken, the following example would indeed
> > > > indicate that we do incorrect aliasing decisions for types
> > > > derived from arrays:
> > > > 
> > > > https://godbolt.org/z/rTsE3PhKc
> > > 
> > > This example is about pointer-to-array types, int (*)[2] and
> > > int (*)[1] are supposed to be compatible as in receive the same alias
> > > set. 
> > 
> > In C, char (*)[2] and char (*)[1] are not compatible. But with
> > COMPAT set, the example operates^1 with char (*)[] and char (*)[1]
> > which are compatible.  If we form equivalence classes, then
> > all three types would need to be treated as equivalent. 
> > 
> > ^1 Actually, pointer to functions returning pointers
> > to arrays. Probably this example can still be simplified...
> > 
> > >  This is ensured by get_alias_set POINTER_TYPE_P handling,
> > > the alias set is supposed to be the same as that of int *.  It seems
> > > we do restrict the handling a bit, the code does
> > > 
> > >   /* Unnest all pointers and references.
> > >      We also want to make pointer to array/vector equivalent to pointer to
> > >      its element (see the reasoning above). Skip all those types, too.  */
> > >   for (p = t; POINTER_TYPE_P (p)
> > >|| (TREE_CODE (p) == ARRAY_TYPE
> > >&& (!TYPE_NONALIASED_COMPONENT (p)
> > >|| !COMPLETE_TYPE_P (p)
> > >|| TYPE_STRUCTURAL_EQUALITY_P (p)))
> > >|| TREE_CODE (p) == VECTOR_TYPE;
> > >p = TREE_TYPE (p))
> > > 
> > > where the comment doesn't exactly match the code - but C should
> > > never have TYPE_NONALIASED_COMPONENT (p).
> > > 
> > > But maybe I misread the example or it goes wrong elsewhere.
> > 
> > If I am not confusing myself too much, the example shows that
> > aliasing analysis treats the types as incompatible in
> > both cases, because it does not reload *a with -O2. 
> > 
> > For char (*)[1] and char (*)[2] this would be correct (but an
> > implementation exploiting this would need to do structural
> > comparisons and not equivalence classes) but for 
> > char (*)[2] and char (*)[] it is not.
> 
> Oh, these are function pointers, so it's about the alias set of
> a pointer to FUNCTION_TYPE.  I don't see any particular code
> trying to make char[] * (*)() and char[1] *(*)() inter-operate
> for TBAA iff the FUNCTION_TYPEs themselves are not having the
> same TYPE_CANONICAL.
> 
> Can you open a bugreport and please point to the relevant parts
> of the C standard that tells how pointer-to FUNCTION_TYPE TBAA
> is supposed to work?

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114959

Martin
> 

> Thanks,
> Richard.
> 
> > Martin
> > 
> > 
> > > 
> > > Richard.
> > > 
> > > > Martin
> > > > 
> > > > > > 
> > > > > > E.g. same_type_for_tbaa has
> > > > > >   type1 = TYPE_MAIN_VARIANT (type1);
> > > > > >   type2 = TYPE_MAIN_VARIANT (type2);
> > > > > > 
> > > > > >   /* Handle the most common case first.  */
> > > > > >   if (type1 == type2)
> > > > > > return 1;
> > > > > > 
> > > > > >   /* If we would have to do structural comparison bail out.  */
> > > > > >   if (TYPE_STRUCTURAL_EQUALITY_P (type1)
> > > > > >   || TYPE_STRUCTURAL_EQUALITY_P (type2))
> > > > > > return -1;
> > > > > > 
> > > > > >   /* Compare the canonical types.  */
> > > > > >   if (TYPE_CANONICAL (type1) == TYPE_CANONICAL (type2))
> > > > > > return 1;
> > > > > > 
> > > > > >   /* ??? Array types are not properly unified in all cases as we 
> > > > > > have
> > > > > >  spurious changes in the index types for example.  Removing this
> > > > > >  causes all sorts of problems with the Fortran frontend.  */
> > > > > >   if (TREE_CODE (type1) == ARRAY_TYPE
> > > > > >   && TREE_CODE (type2) == ARRAY_TYPE)
> > > > > > return -1;
> > > > > > ...
> > > > > > and later compares alias sets and the like.
> > > > > > So, even if int[] and int[0] have different TYPE_CANONICAL, they
> > > > > > will be considered maybe the same.  Also, guess get_alias_set
> > > > > > has some ARRAY_TYPE handling...
> > > > > > 
> > > > > > Anyway, I think we should just go with Richi's patch.
> > > > > > 
> > > > > > Jakub
> > > > > > 
> > > > 
> > > > 
> > > > 
> > 
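
The point about equivalence classes in the quoted discussion can be
checked with a small C sketch.  It assumes a compiler that provides the
__builtin_types_compatible_p extension (GCC and Clang do; it is not ISO C),
and the typedef and function names are made up for illustration.  It only
shows that type compatibility is not transitive for these pointer types;
it says nothing about what the alias analysis actually does.

```c
/* Pointers to arrays of char with unknown, 1 and 2 elements.  */
typedef char (*ptr_unknown)[];
typedef char (*ptr_one)[1];
typedef char (*ptr_two)[2];

/* Each helper returns 1 if the two types are compatible, else 0.
   __builtin_types_compatible_p is a GCC/Clang extension (C only).  */
int compat_unknown_one (void)
{
  return __builtin_types_compatible_p (ptr_unknown, ptr_one);
}

int compat_unknown_two (void)
{
  return __builtin_types_compatible_p (ptr_unknown, ptr_two);
}

int compat_one_two (void)
{
  return __builtin_types_compatible_p (ptr_one, ptr_two);
}

/* Returns 1 if "A~B and A~C implies B~C" holds for these three types.  */
int transitive_for_these_types (void)
{
  return !(compat_unknown_one () && compat_unknown_two ())
         || compat_one_two ();
}
```

Because the unknown-bound type is compatible with both of the others while
those two are not compatible with each other, putting all three into one
equivalence class would merge more than a structural comparison would.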

[PATCH] tree-optimization/114921 - _Float16 -> __bf16 isn't noop fixup

2024-05-06 Thread Richard Biener
The following further strengthens the check for which convert expressions
we allow to vectorize as a simple copy, by resorting to
tree_nop_conversion_p on the vector components.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

PR tree-optimization/114921
* tree-vect-stmts.cc (vectorizable_assignment): Use
tree_nop_conversion_p to identify converts we can vectorize
with a simple assignment.
---
 gcc/tree-vect-stmts.cc | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 7e571968a59..21e8fe98e44 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -5957,15 +5957,15 @@ vectorizable_assignment (vec_info *vinfo,
 
   /* We can handle VIEW_CONVERT conversions that do not change the number
  of elements or the vector size or other conversions when the component
- mode keeps the same.  */
+ types are nop-convertible.  */
   if (!vectype_in
   || maybe_ne (TYPE_VECTOR_SUBPARTS (vectype_in), nunits)
   || (code == VIEW_CONVERT_EXPR
  && maybe_ne (GET_MODE_SIZE (TYPE_MODE (vectype)),
   GET_MODE_SIZE (TYPE_MODE (vectype_in
   || (CONVERT_EXPR_CODE_P (code)
- && (TYPE_MODE (TREE_TYPE (vectype))
- != TYPE_MODE (TREE_TYPE (vectype_in)
+ && !tree_nop_conversion_p (TREE_TYPE (vectype),
+TREE_TYPE (vectype_in
 return false;
 
   if (VECTOR_BOOLEAN_TYPE_P (vectype) != VECTOR_BOOLEAN_TYPE_P (vectype_in))
-- 
2.35.3
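
As a rough illustration of why a same-size component type does not imply
a no-op conversion (the subject line's _Float16 -> __bf16 case), the
following C sketch encodes a float into both 16-bit formats by hand.  The
helper names are made up, float is assumed to be IEEE binary32, and the
binary16 encoder truncates and only handles zero and values in the normal
range.  It is not how GCC implements either conversion.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Raw bits of F, assuming float is IEEE binary32.  */
static uint32_t float_bits (float f)
{
  uint32_t u;
  memcpy (&u, &f, sizeof u);
  return u;
}

/* bfloat16 by truncation: sign, 8 exponent bits, top 7 mantissa bits.  */
uint16_t to_bf16_bits (float f)
{
  return (uint16_t) (float_bits (f) >> 16);
}

/* IEEE binary16 by truncation: sign, 5 exponent bits, 10 mantissa bits.
   Only zero and values in the binary16 normal range are supported.  */
uint16_t to_half_bits (float f)
{
  uint32_t u = float_bits (f);
  uint16_t sign = (uint16_t) ((u >> 16) & 0x8000u);
  int32_t exp = (int32_t) ((u >> 23) & 0xffu) - 127 + 15; /* rebias */
  uint16_t mant = (uint16_t) ((u >> 13) & 0x3ffu);

  if ((u & 0x7fffffffu) == 0)
    return sign;
  assert (exp > 0 && exp < 31);
  return (uint16_t) (sign | (uint16_t) (exp << 10) | mant);
}
```

For 1.5f the two encoders give different bit patterns even though both
results occupy 16 bits, so copying the bits unchanged would alter the value.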


Re: [PATCH] Driver: Reject output filenames with the same suffixes as source files [PR80182]

2024-05-06 Thread Richard Biener
On Mon, May 6, 2024 at 10:29 AM Peter0x44  wrote:
>
> On Mon May 6, 2024 at 8:14 AM BST, Richard Biener wrote:
> > On Sat, May 4, 2024 at 9:36 PM Peter Damianov  wrote:
> > >
> > > Currently, commands like:
> > > gcc -o file.c -lm
> > > will delete the user's code.
> >
> > Since there's an error from the linker in the end (missing 'main'), I
> > wonder if the linker can avoid truncating/opening the output file instead?
> > A trivial solution might be to open a temporary file first and only
> > atomically replace the output file with the temporary file when there were
> > no errors?
> I think this is a great idea! The only concern I have is that I think
> for mingw targets it would be necessary to be careful to append .exe if
> the file has no suffix when moving the temporary file to the output
> file. Maybe some other targets have similar concerns.
> >
> > > This patch checks the suffix of the output, and errors if the output
> > > ends in any of the suffixes listed in default_compilers.
> > >
> > > Unfortunately, I couldn't come up with a better heuristic to diagnose
> > > this case more specifically, so it is now not possible to directly make
> > > executables with said suffixes. I am unsure if any users are depending
> > > on this.
> >
> > A way to provide a workaround would be to require the file not existing.  So
> > change the heuristic to only trigger if the output file exists (and is
> > non-empty?).
> I guess this could work, and has a lower chance of breaking anyone
> depending on this behavior, but I think it would still be confusing to
> anyone who did rely on this behavior, since then it wouldn't be allowed
> to overwrite an executable with the ".c" name. If anyone did rely on
> this behavior, their build would succeed once, and then error for every
> subsequent invocation, which would be confusing. It seems to me it is
> not a meaningful improvement.

That's true and the behavior would be confusing.

> With your previous suggestion, this whole heuristic becomes unnecessary
> anyway, so I think I will just forego it.

It of course wouldn't handle the case if there isn't a link error like

gcc -o file -lm -r

but it should still be an improvement.  And yes, I typoed a wrong -o myself
a few times ...

Richard.
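
The temporary-file idea discussed above could look roughly like the
following C sketch.  It assumes a POSIX system (mkstemp, fdopen, rename);
the function name and the simulate_error parameter are hypothetical and
only stand in for "the link step failed".  It is not taken from any linker
or from the driver.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Write DATA to PATH through a temporary file next to it.  PATH is only
   replaced (via rename) when everything succeeded, so a failure leaves
   an existing file at PATH untouched.  Returns 0 on success, -1 on error.  */
int write_output_atomically (const char *path, const char *data,
                             int simulate_error)
{
  char tmp[4096];
  int n = snprintf (tmp, sizeof tmp, "%s.XXXXXX", path);
  if (n < 0 || (size_t) n >= sizeof tmp)
    return -1;

  int fd = mkstemp (tmp);       /* creates and opens a unique file */
  if (fd < 0)
    return -1;

  FILE *f = fdopen (fd, "w");
  if (f == NULL)
    {
      close (fd);
      unlink (tmp);
      return -1;
    }

  int ok = fputs (data, f) >= 0;
  ok = (fclose (f) == 0) && ok;

  if (!ok || simulate_error || rename (tmp, path) != 0)
    {
      unlink (tmp);             /* drop the partial output */
      return -1;
    }
  return 0;
}
```

Note that mkstemp creates the file with restrictive permissions, so a real
implementation would still have to set the mode of the final output, and
a target that appends a suffix such as .exe would have to do so when
choosing the rename destination.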

> >
> > Richard.
> >
> > > PR driver/80182
> > > * gcc.cc (process_command): fatal_error if the output has the 
> > > suffix of
> > >   a source file.
> > > (have_c): Change type to bool.
> > > (have_O): Change type to bool.
> > > (have_E): Change type to bool.
> > > (have_S): New global variable.
> > > (driver_handle_option): Assign have_S
> > >
> > > Signed-off-by: Peter Damianov 
> > > ---
> > >  gcc/gcc.cc | 29 ++---
> > >  1 file changed, 26 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/gcc/gcc.cc b/gcc/gcc.cc
> > > index 830a4700a87..53169c16460 100644
> > > --- a/gcc/gcc.cc
> > > +++ b/gcc/gcc.cc
> > > @@ -2127,13 +2127,16 @@ static vec at_file_argbuf;
> > >  static bool in_at_file = false;
> > >
> > >  /* Were the options -c, -S or -E passed.  */
> > > -static int have_c = 0;
> > > +static bool have_c = false;
> > >
> > >  /* Was the option -o passed.  */
> > > -static int have_o = 0;
> > > +static bool have_o = false;
> > >
> > >  /* Was the option -E passed.  */
> > > -static int have_E = 0;
> > > +static bool have_E = false;
> > > +
> > > +/* Was the option -S passed.  */
> > > +static bool have_S = false;
> > >
> > >  /* Pointer to output file name passed in with -o. */
> > >  static const char *output_file = 0;
> > > @@ -4593,6 +4596,10 @@ driver_handle_option (struct gcc_options *opts,
> > >have_E = true;
> > >break;
> > >
> > > +case OPT_S:
> > > +  have_S = true;
> > > +  break;
> > > +
> > >  case OPT_x:
> > >spec_lang = arg;
> > >if (!strcmp (spec_lang, "none"))
> > > @@ -5058,6 +5065,22 @@ process_command (unsigned int 
> > > decoded_options_count,
> > >output_file);
> > >  }
> > >
> > > +  /* Reject output file names that have the same suffix as a source
> > > + file. This is to catch mistakes like: gcc -o file.c -lm
> > > + that could delete the user's code. */
> > > +  if (have_o && output_file != NULL && !have_E && !have_S)
> > > +{
> > > +  const char* filename = lbasename(output_file);
> > > +  const char* suffix = strchr(filename, '.');
> > > +  if (suffix != NULL)
> > > +   for (int i = 0; i < n_default_compilers; ++i)
> > > + if (!strcmp(suffix, default_compilers[i].suffix))
> > > +   fatal_error (input_location,
> > > +"output file suffix %qs could be a source file",
> > > +suffix);
> > > +}
> > > +
> > > +
> > >if (output_file != NULL && output_file[0] == '\0')
> > >  fatal_error (input_location, "output filename may not be empty");
> > >
> > > --
> > > 2.39.2
> > >
>
> 
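
For reference, the suffix heuristic from the quoted patch can be tried
outside the driver with a standalone C sketch like the one below.  The
suffix table and both function names are hypothetical stand-ins: the real
patch walks default_compilers and uses lbasename, neither of which is
reproduced here, and only '/' is treated as a directory separator.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the suffixes listed in default_compilers.  */
static const char *const source_suffixes[] =
  { ".c", ".cc", ".cpp", ".i", ".s", ".S" };

/* Simplified basename: everything after the last '/'.  */
static const char *base_name (const char *path)
{
  const char *slash = strrchr (path, '/');
  return slash ? slash + 1 : path;
}

/* Returns 1 if OUTPUT_FILE ends in one of the source suffixes.  As in the
   patch, the suffix starts at the first '.' of the base name.  */
int output_looks_like_source (const char *output_file)
{
  const char *suffix = strchr (base_name (output_file), '.');
  size_t count = sizeof source_suffixes / sizeof source_suffixes[0];

  if (suffix == NULL)
    return 0;
  for (size_t i = 0; i < count; i++)
    if (strcmp (suffix, source_suffixes[i]) == 0)
      return 1;
  return 0;
}
```

Since the comparison starts at the first dot, a name such as file.tar.c
yields the suffix ".tar.c" and is not flagged by this sketch.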

[COMMITTED] ada: Allow use of writable parameters inside function with side-effects

2024-05-06 Thread Marc Poulhiès
From: Piotr Trojanek 

Writable parameters can be used as global outputs inside functions with
side-effects.

gcc/ada/

* sem_prag.adb (Collect_Global_Item): Handle functions with
side-effects.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_prag.adb | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/ada/sem_prag.adb b/gcc/ada/sem_prag.adb
index 25a98cb414e..5764992237b 100644
--- a/gcc/ada/sem_prag.adb
+++ b/gcc/ada/sem_prag.adb
@@ -31656,8 +31656,9 @@ package body Sem_Prag is
--  outputs when the related type is access-to-variable.
 
if Ekind (Formal) = E_In_Parameter
- and then Ekind (Spec_Id) not in E_Function
-   | E_Generic_Function
+ and then (Ekind (Spec_Id) not in E_Function
+| E_Generic_Function
+ or else Is_Function_With_Side_Effects (Spec_Id))
  and then Is_Access_Variable (Etype (Formal))
then
   Append_New_Elmt (Formal, Subp_Outputs);
-- 
2.43.2
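
A loose C analogy for why an IN parameter of an access-to-variable type can
act as an output: the pointer is passed by value and cannot be reassigned,
but the object it designates is still writable.  The function name is made
up, and this only mirrors the idea; it does not model Ada or SPARK
semantics.

```c
/* COUNTER itself is const (roughly like an IN parameter), yet the callee
   can modify the designated object, so the caller observes a side effect
   in addition to the returned value.  */
int bump_counter (int *const counter)
{
  *counter += 1;
  return *counter;
}
```

A caller that passes the address of a variable sees that variable changed
after the call, which is the situation the Global/Depends contracts need to
account for.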



[COMMITTED] ada: Cleanup collecting of implicit outputs

2024-05-06 Thread Marc Poulhiès
From: Piotr Trojanek 

Move handling of IN parameters to where both IN and IN OUT parameters
are handled. This makes the code marginally more efficient and
symmetrical to handling of implicit inputs.

gcc/ada/

* sem_prag.adb (Collect_Global_Item): Move code.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_prag.adb | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/gcc/ada/sem_prag.adb b/gcc/ada/sem_prag.adb
index ab60a8ad1d5..25a98cb414e 100644
--- a/gcc/ada/sem_prag.adb
+++ b/gcc/ada/sem_prag.adb
@@ -31651,6 +31651,17 @@ package body Sem_Prag is
  while Present (Formal) loop
 if Ekind (Formal) in E_In_Out_Parameter | E_In_Parameter then
Append_New_Elmt (Formal, Subp_Inputs);
+
+   --  IN parameters of procedures and protected entries can act as
+   --  outputs when the related type is access-to-variable.
+
+   if Ekind (Formal) = E_In_Parameter
+ and then Ekind (Spec_Id) not in E_Function
+   | E_Generic_Function
+ and then Is_Access_Variable (Etype (Formal))
+   then
+  Append_New_Elmt (Formal, Subp_Outputs);
+   end if;
 end if;
 
 if Ekind (Formal) in E_In_Out_Parameter | E_Out_Parameter then
@@ -31667,17 +31678,6 @@ package body Sem_Prag is
end if;
 end if;
 
---  IN parameters of procedures and protected entries can act as
---  outputs when the related type is access-to-variable.
-
-if Ekind (Formal) = E_In_Parameter
-  and then Ekind (Spec_Id) not in E_Function
-| E_Generic_Function
-  and then Is_Access_Variable (Etype (Formal))
-then
-   Append_New_Elmt (Formal, Subp_Outputs);
-end if;
-
 Next_Formal (Formal);
  end loop;
 
-- 
2.43.2



[COMMITTED] ada: Tweak discriminant source locations

2024-05-06 Thread Marc Poulhiès
From: Ronan Desplanques 

This patch changes the source location information for the default
expressions of discriminants to better represent the fact that they're
evaluated at the point of object declaration, in the cases where
a Build_Default_Subtype optimization is performed. This fixes a
regression with CodePeer diagnostics introduced by a recent change
around Build_Default_Subtype optimizations.

gcc/ada/

* sem_util.adb (Build_Default_Subtype): Tweak source location
information.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_util.adb | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
index c47904f168c..18c9de05cf9 100644
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -1780,7 +1780,8 @@ package body Sem_Util is
   begin
  while Present (Disc) loop
 Append_To (Constraints,
-  New_Copy_Tree (Discriminant_Default_Value (Disc)));
+   New_Copy_Tree
+ (Discriminant_Default_Value (Disc), New_Sloc => Loc));
 Next_Discriminant (Disc);
  end loop;
 
-- 
2.43.2



[COMMITTED] ada: Give error for reference to nonvisible library unit

2024-05-06 Thread Marc Poulhiès
From: Bob Duff 

This patch fixes a bug where the compiler would allow
a name X to refer to a library unit that is not visible.
In particular, this happens when the name X occurs in the
private part of a library package, and the parent of that
package contains an instantiation of a generic package, and the
spec of that generic package has "private with X;",
but there is no "private with X;" or "with X;" that applies
to the place where the name X occurs.

Also misc cleanup.

gcc/ada/

* sem_ch10.adb (Expand_With_Clause): Misc cleanup.
(Install_Private_With_Clauses): Avoid installing a private
with_clause that comes from an instantiated generic
(it is marked as Implicit_With, but doesn't come from a parent
with). Fix typo in comment, and other minor cleanups.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_ch10.adb | 49 ++--
 1 file changed, 29 insertions(+), 20 deletions(-)

diff --git a/gcc/ada/sem_ch10.adb b/gcc/ada/sem_ch10.adb
index 43adbbc54bf..7fc623b6278 100644
--- a/gcc/ada/sem_ch10.adb
+++ b/gcc/ada/sem_ch10.adb
@@ -3425,17 +3425,15 @@ package body Sem_Ch10 is
   --  Local variables
 
   Ent   : constant Entity_Id  := Entity (Nam);
-  Withn : Node_Id;
+  Withn : constant Node_Id :=
+Make_With_Clause
+  (Loc, Name => Build_Unit_Name (Nam),
+   First_Name => True, Last_Name => True);
 
--  Start of processing for Expand_With_Clause
 
begin
-  Withn :=
-Make_With_Clause (Loc,
-  Name => Build_Unit_Name (Nam));
-
   Set_Corresponding_Spec (Withn, Ent);
-  Set_First_Name (Withn);
   Set_Implicit_With  (Withn);
   Set_Library_Unit   (Withn, Parent (Unit_Declaration_Node (Ent)));
   Set_Parent_With(Withn);
@@ -3570,7 +3568,6 @@ package body Sem_Ch10 is
   P  : constant Node_Id:= Parent_Spec (Child_Unit);
   P_Unit : Node_Id := Unit (P);
   P_Name : constant Entity_Id  := Get_Parent_Entity (P_Unit);
-  Withn  : Node_Id;
 
   function Build_Ancestor_Name (P : Node_Id) return Node_Id;
   --  Build prefix of child unit name. Recurse if needed
@@ -3655,21 +3652,25 @@ package body Sem_Ch10 is
  return;
   end if;
 
-  Withn := Make_With_Clause (Loc, Name => Build_Unit_Name);
+  declare
+ Withn : constant Node_Id :=
+   Make_With_Clause
+ (Loc, Name => Build_Unit_Name,
+  First_Name => True, Last_Name => True);
+  begin
+ Set_Corresponding_Spec (Withn, P_Name);
+ Set_Implicit_With  (Withn);
+ Set_Library_Unit   (Withn, P);
+ Set_Parent_With(Withn);
 
-  Set_Corresponding_Spec (Withn, P_Name);
-  Set_First_Name (Withn);
-  Set_Implicit_With  (Withn);
-  Set_Library_Unit   (Withn, P);
-  Set_Parent_With(Withn);
+ --  Node is placed at the beginning of the context items, so that
+ --  subsequent use clauses on the parent can be validated.
 
-  --  Node is placed at the beginning of the context items, so that
-  --  subsequent use clauses on the parent can be validated.
+ Prepend (Withn, Context_Items (N));
+ Mark_Rewrite_Insertion (Withn);
 
-  Prepend (Withn, Context_Items (N));
-  Mark_Rewrite_Insertion (Withn);
-
-  Install_With_Clause (Withn);
+ Install_With_Clause (Withn);
+  end;
 
   if Is_Child_Spec (P_Unit) then
  Implicit_With_On_Parent (P_Unit, N);
@@ -4524,13 +4525,21 @@ package body Sem_Ch10 is
   if Nkind (Parent (Decl)) = N_Compilation_Unit then
  Item := First (Context_Items (Parent (Decl)));
  while Present (Item) loop
+--  If Item is a private with clause, install it, but do not
+--  install implicit private with's that come from (for example)
+--  with's on instantiated generics. DO install implicit private
+--  with's that come from parents, which is necessary in general,
+--  but ???not quite right if the former (generic) case also
+--  applies.
+
 if Nkind (Item) = N_With_Clause
   and then Private_Present (Item)
+  and then (not Implicit_With (Item) or else Parent_With (Item))
 then
--  If the unit is an ancestor of the current one, it is the
--  case of a private limited with clause on a child unit, and
--  the compilation of one of its descendants, in that case the
-   --  limited view is errelevant.
+   --  limited view is irrelevant.
 
if Limited_Present (Item) then
   if not Limited_View_Installed (Item)
-- 
2.43.2



[COMMITTED] ada: Excess finalization on return of call to dispatching constructor

2024-05-06 Thread Marc Poulhiès
From: Gary Dismukes 

The compiler expands a too-early finalization call for the result
object of an extended return statement that returns a call to a
dispatching constructor function for a limited interface type,
resulting in premature (and extra) finalization of the result.
The temporary object that the compiler creates to hold the result
of the build-in-place call loses the fact that it comes from a
return, and the wrong BIP allocation form may be passed in the
call to the dispatching constructor, and the later code for dealing
with finalization in Exp_Ch7.Build_Finalizer incorrectly finalizes
the result object.

gcc/ada/

* exp_ch6.adb
(Make_Build_In_Place_Iface_Call_In_Object_Declaration): Set the
Is_Return_Object flag on the entity of the temp object created to
hold the BIP call result, from the flag on the passed-in object
declaration's entity. Update copyright notice to 2024.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_ch6.adb | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/gcc/ada/exp_ch6.adb b/gcc/ada/exp_ch6.adb
index fcfd1d7f0bf..a89c9af0bb2 100644
--- a/gcc/ada/exp_ch6.adb
+++ b/gcc/ada/exp_ch6.adb
@@ -9393,6 +9393,16 @@ package body Exp_Ch6 is
   Insert_Action (Obj_Decl, Tmp_Decl);
   Expander_Mode_Restore;
 
+  --  Inherit Is_Return_Object from the parent object to the temp object,
+  --  so that Make_In_Build_Place_Call_In_Object_Declaration will handle
+  --  the temp properly in cases where there's a BIP_Alloc_Form formal of
+  --  an enclosing function that should be passed along (and which also
+  --  ensures that if the BIP call is used as a function result and it
+  --  requires finalization, then it will not be finalized prematurely
+  --  or redundantly).
+
+  Set_Is_Return_Object (Tmp_Id, Is_Return_Object (Obj_Id));
+
   Make_Build_In_Place_Call_In_Object_Declaration
 (Obj_Decl  => Tmp_Decl,
  Function_Call => Expression (Tmp_Decl));
-- 
2.43.2



[COMMITTED] ada: Deconstruct support for abstract states with Relaxed_Initialization

2024-05-06 Thread Marc Poulhiès
From: Piotr Trojanek 

GNATprove never implemented support for abstract states with aspect
Relaxed_Initialization, so the frontend support is now deconstructed.

gcc/ada/

* einfo-utils.adb (Is_Relaxed_Initialization_State): Remove.
* einfo-utils.ads (Is_Relaxed_Initialization_State): Remove.
* einfo.ads: Remove description of removed aspect.
* fe.h (Is_Relaxed_Initialization_State): Remove.
* sem_prag.adb (Analyze_Abstract_State): Remove support for
Relaxed_Initialization.
* sem_util.adb (Has_Relaxed_Initialization): Likewise.
* sem_util.ads (Has_Relaxed_Initialization): Likewise.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/einfo-utils.adb | 14 --
 gcc/ada/einfo-utils.ads |  1 -
 gcc/ada/einfo.ads   |  5 -
 gcc/ada/fe.h|  3 ---
 gcc/ada/sem_prag.adb| 25 +
 gcc/ada/sem_util.adb|  5 -
 gcc/ada/sem_util.ads|  6 +++---
 7 files changed, 12 insertions(+), 47 deletions(-)

diff --git a/gcc/ada/einfo-utils.adb b/gcc/ada/einfo-utils.adb
index 00799eb9bee..438868ac757 100644
--- a/gcc/ada/einfo-utils.adb
+++ b/gcc/ada/einfo-utils.adb
@@ -1649,20 +1649,6 @@ package body Einfo.Utils is
   and then Is_Protected_Type (Corresponding_Concurrent_Type (Id));
end Is_Protected_Record_Type;
 
-   -
-   -- Is_Relaxed_Initialization_State --
-   -
-
-   function Is_Relaxed_Initialization_State (Id : E) return B is
-   begin
-  --  To qualify, the abstract state must appear with simple option
-  --  "Relaxed_Initialization" (SPARK RM 6.10).
-
-  return
-Ekind (Id) = E_Abstract_State
-  and then Has_Option (Id, Name_Relaxed_Initialization);
-   end Is_Relaxed_Initialization_State;
-

-- Is_Standard_Character_Type --

diff --git a/gcc/ada/einfo-utils.ads b/gcc/ada/einfo-utils.ads
index 701d8ce59fb..d87a3e34f49 100644
--- a/gcc/ada/einfo-utils.ads
+++ b/gcc/ada/einfo-utils.ads
@@ -201,7 +201,6 @@ package Einfo.Utils is
function Is_Protected_Component (Id : E) return B with Inline;
function Is_Protected_Interface (Id : E) return B;
function Is_Protected_Record_Type (Id : E) return B with Inline;
-   function Is_Relaxed_Initialization_State (Id : E) return B;
function Is_Standard_Character_Type (Id : E) return B;
function Is_Standard_String_Type (Id : E) return B;
function Is_String_Type (Id : E) return B with Inline;
diff --git a/gcc/ada/einfo.ads b/gcc/ada/einfo.ads
index 6f563d5e62c..e3bfdb3507d 100644
--- a/gcc/ada/einfo.ads
+++ b/gcc/ada/einfo.ads
@@ -3219,10 +3219,6 @@ package Einfo is
 --   Applies to all entities, true for record types and subtypes,
 --   includes class-wide types and subtypes (which are also records).
 
---Is_Relaxed_Initialization_State (synthesized)
---   Applies to all entities, true for abstract states that are subject to
---   option Relaxed_Initialization.
-
 --Is_Remote_Call_Interface
 --   Defined in all entities. Set in E_Package and E_Generic_Package
 --   entities to which a pragma Remote_Call_Interface is applied, and
@@ -5129,7 +5125,6 @@ package Einfo is
--Has_Null_Visible_Refinement (synth)
--Is_External_State   (synth)
--Is_Null_State   (synth)
-   --Is_Relaxed_Initialization_State (synth)
--Is_Synchronized_State   (synth)
--Partial_Refinement_Constituents (synth)
 
diff --git a/gcc/ada/fe.h b/gcc/ada/fe.h
index 6aaa3fdc4d3..397045ea583 100644
--- a/gcc/ada/fe.h
+++ b/gcc/ada/fe.h
@@ -501,9 +501,6 @@ B Is_Protected_Interface  (E Id);
 #define Is_Protected_Record_Type einfo__utils__is_protected_record_type
 B Is_Protected_Record_Type(E Id);
 
-#define Is_Relaxed_Initialization_State 
einfo__utils__is_relaxed_initialization_state
-B Is_Relaxed_Initialization_State (E Id);
-
 #define Is_Standard_Character_Type einfo__utils__is_standard_character_type
 B Is_Standard_Character_Type  (E Id);
 
diff --git a/gcc/ada/sem_prag.adb b/gcc/ada/sem_prag.adb
index 39005aaea05..299e388167f 100644
--- a/gcc/ada/sem_prag.adb
+++ b/gcc/ada/sem_prag.adb
@@ -12181,16 +12181,15 @@ package body Sem_Prag is
 is
--  Flags used to verify the consistency of options
 
-   AR_Seen : Boolean := False;
-   AW_Seen : Boolean := False;
-   ER_Seen : Boolean := False;
-   EW_Seen : Boolean := False;
-   External_Seen   : Boolean := False;
-   Ghost_Seen  : Boolean := False;
-   Others_Seen : Boolean := False;
-   Part_Of_Seen: 

[COMMITTED] ada: Do not attempt to generate finalization actions with restricted profile

2024-05-06 Thread Marc Poulhiès
From: Eric Botcazou 

These actions are not supported with this profile, but we were nevertheless
attempting to generate them for protected objects.

gcc/ada/

* exp_ch7.adb (Build_Finalizer.Process_Declarations): Do not call
Processing_Actions for simple protected objects if the profile is
restricted.
* exp_util.adb (Requires_Cleanup_Actions): Do not return True for
simple protected objects if the profile is restricted.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_ch7.adb  | 6 +-
 gcc/ada/exp_util.adb | 8 ++--
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/gcc/ada/exp_ch7.adb b/gcc/ada/exp_ch7.adb
index 7a8457683c5..99142a527fa 100644
--- a/gcc/ada/exp_ch7.adb
+++ b/gcc/ada/exp_ch7.adb
@@ -2526,9 +2526,12 @@ package body Exp_Ch7 is
then
   Processing_Actions (Decl);
 
-   --  Simple protected objects which use type System.Tasking.
+   --  Simple protected objects which use the type System.Tasking.
--  Protected_Objects.Protection to manage their locks should
--  be treated as controlled since they require manual cleanup.
+   --  but not for restricted run-time libraries (Ravenscar), see
+   --  also Cleanup_Protected_Object.
+
--  The only exception is illustrated in the following example:
 
-- package Pkg is
@@ -2561,6 +2564,7 @@ package body Exp_Ch7 is
elsif Ekind (Obj_Id) = E_Variable
  and then not In_Library_Level_Package_Body (Obj_Id)
  and then Has_Simple_Protected_Object (Obj_Typ)
+ and then not Restricted_Profile
then
   Processing_Actions (Decl, Is_Protected => True);
end if;
diff --git a/gcc/ada/exp_util.adb b/gcc/ada/exp_util.adb
index 732a02fc5d8..533127f26c2 100644
--- a/gcc/ada/exp_util.adb
+++ b/gcc/ada/exp_util.adb
@@ -12999,9 +12999,12 @@ package body Exp_Util is
 then
return True;
 
---  Simple protected objects which use type System.Tasking.
+--  Simple protected objects which use the type System.Tasking.
 --  Protected_Objects.Protection to manage their locks should be
---  treated as controlled since they require manual cleanup.
+--  treated as controlled since they require manual cleanup, but
+--  not for restricted run-time libraries (Ravenscar), see also
+--  Cleanup_Protected_Object in Exp_Ch7.
+
 --  The only exception is illustrated in the following example:
 
 -- package Pkg is
@@ -13034,6 +13037,7 @@ package body Exp_Util is
 elsif Ekind (Obj_Id) = E_Variable
   and then not In_Library_Level_Package_Body (Obj_Id)
   and then Has_Simple_Protected_Object (Obj_Typ)
+  and then not Restricted_Profile
 then
return True;
 end if;
-- 
2.43.2



[COMMITTED] ada: Support writable parameters in Depends with side-effects

2024-05-06 Thread Marc Poulhiès
From: Piotr Trojanek 

Functions with side-effects can modify writable parameters of mode IN,
so these parameters must be allowed to appear in their Depends aspects.

gcc/ada/

* sem_prag.adb (Find_Role): Handle functions with side-effects
like procedures.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_prag.adb | 21 +
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/gcc/ada/sem_prag.adb b/gcc/ada/sem_prag.adb
index 299e388167f..ab60a8ad1d5 100644
--- a/gcc/ada/sem_prag.adb
+++ b/gcc/ada/sem_prag.adb
@@ -1383,10 +1383,11 @@ package body Sem_Prag is
(Item_Is_Input  : out Boolean;
 Item_Is_Output : out Boolean)
  is
---  A constant or an IN parameter of a procedure or a protected
---  entry, if it is of an access-to-variable type, should be
---  handled like a variable, as the underlying memory pointed-to
---  can be modified. Use Adjusted_Kind to do this adjustment.
+--  A constant or an IN parameter of a protected entry, procedure,
+--  or function with side-effects, if it is of an
+--  access-to-variable type, should be handled like a variable, as
+--  the underlying memory pointed-to can be modified. Use
+--  Adjusted_Kind to do this adjustment.
 
 Adjusted_Kind : Entity_Kind := Ekind (Item_Id);
 
@@ -1394,11 +1395,15 @@ package body Sem_Prag is
 if (Ekind (Item_Id) in E_Constant | E_Generic_In_Parameter
   or else
   (Ekind (Item_Id) = E_In_Parameter
- and then Ekind (Scope (Item_Id))
-not in E_Function | E_Generic_Function))
+ and then
+   (Ekind (Scope (Item_Id)) not in E_Function
+ | E_Generic_Function
+  or else
+Is_Function_With_Side_Effects (Scope (Item_Id)
   and then Is_Access_Variable (Etype (Item_Id))
-  and then Ekind (Spec_Id) not in E_Function
-| E_Generic_Function
+  and then (Ekind (Spec_Id) not in E_Function
+ | E_Generic_Function
+  or else Is_Function_With_Side_Effects (Spec_Id))
 then
Adjusted_Kind := E_Variable;
 end if;
-- 
2.43.2



[COMMITTED] ada: Small cleanup in C/C++ front-end interface

2024-05-06 Thread Marc Poulhiès
From: Eric Botcazou 

The fe.h header file is supposed to contain only the declarations needed
by the code in the gcc-interface repository.

gcc/ada/

* fe.h: Remove unused declarations and add 'extern' to others.

no-issue-check

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/fe.h | 272 ---
 1 file changed, 41 insertions(+), 231 deletions(-)

diff --git a/gcc/ada/fe.h b/gcc/ada/fe.h
index 397045ea583..692c29a70af 100644
--- a/gcc/ada/fe.h
+++ b/gcc/ada/fe.h
@@ -377,305 +377,114 @@ extern Boolean Get_Warn_On_Questionable_Layout (void);
 
 // The following corresponds to Ada code in Einfo.Utils.
 
-typedef Boolean B;
-typedef Component_Alignment_Kind C;
+typedef Boolean   B;
 typedef Entity_Id E;
-typedef Mechanism_Type M;
-typedef Node_Id N;
-typedef Uint U;
-typedef Ureal R;
-typedef Elist_Id L;
-typedef List_Id S;
-
-#define Is_Access_Object_Type einfo__utils__is_access_object_type
-B Is_Access_Object_Type   (E Id);
-
-#define Is_Named_Access_Type einfo__utils__is_named_access_type
-B Is_Named_Access_Type(E Id);
+typedef Node_Id   N;
 
 #define Address_Clause einfo__utils__address_clause
-N Address_Clause  (E Id);
-
-#define Aft_Value einfo__utils__aft_value
-U Aft_Value   (E Id);
+extern N Address_Clause (E Id);
 
 #define Alignment_Clause einfo__utils__alignment_clause
-N Alignment_Clause(E Id);
+extern N Alignment_Clause (E Id);
 
 #define Base_Type einfo__utils__base_type
-E Base_Type   (E Id);
+extern E Base_Type (E Id);
 
 #define Declaration_Node einfo__utils__declaration_node
-N Declaration_Node(E Id);
+extern N Declaration_Node (E Id);
 
 #define Designated_Type einfo__utils__designated_type
-E Designated_Type (E Id);
-
-#define First_Component einfo__utils__first_component
-E First_Component (E Id);
-
-#define First_Component_Or_Discriminant 
einfo__utils__first_component_or_discriminant
-E First_Component_Or_Discriminant (E Id);
+extern E Designated_Type (E Id);
 
 #define First_Formal einfo__utils__first_formal
-E First_Formal(E Id);
+extern E First_Formal (E Id);
 
 #define First_Formal_With_Extras einfo__utils__first_formal_with_extras
-E First_Formal_With_Extras(E Id);
-
-#define Has_Attach_Handler einfo__utils__has_attach_handler
-B Has_Attach_Handler  (E Id);
-
-#define Has_Entries einfo__utils__has_entries
-B Has_Entries (E Id);
+extern E First_Formal_With_Extras (E Id);
 
 #define Has_Foreign_Convention einfo__utils__has_foreign_convention
-B Has_Foreign_Convention  (E Id);
-
-#define Has_Interrupt_Handler einfo__utils__has_interrupt_handler
-B Has_Interrupt_Handler   (E Id);
-
-#define Has_Non_Limited_View einfo__utils__has_non_limited_view
-B Has_Non_Limited_View(E Id);
-
-#define Has_Non_Null_Abstract_State einfo__utils__has_non_null_abstract_state
-B Has_Non_Null_Abstract_State (E Id);
-
-#define Has_Non_Null_Visible_Refinement 
einfo__utils__has_non_null_visible_refinement
-B Has_Non_Null_Visible_Refinement (E Id);
-
-#define Has_Null_Abstract_State einfo__utils__has_null_abstract_state
-B Has_Null_Abstract_State (E Id);
-
-#define Has_Null_Visible_Refinement einfo__utils__has_null_visible_refinement
-B Has_Null_Visible_Refinement (E Id);
+extern B Has_Foreign_Convention (E Id);
 
 #define Implementation_Base_Type einfo__utils__implementation_base_type
-E Implementation_Base_Type(E Id);
-
-#define Is_Base_Type einfo__utils__is_base_type
-B Is_Base_Type(E Id);
+extern E Implementation_Base_Type (E Id);
 
 #define Is_Boolean_Type einfo__utils__is_boolean_type
-B Is_Boolean_Type (E Id);
-
-#define Is_Constant_Object einfo__utils__is_constant_object
-B Is_Constant_Object  (E Id);
-
-#define Is_Controlled einfo__utils__is_controlled
-B Is_Controlled   (E Id);
-
-#define Is_Discriminal einfo__utils__is_discriminal
-B Is_Discriminal  (E Id);
-
-#define Is_Dynamic_Scope einfo__utils__is_dynamic_scope
-B Is_Dynamic_Scope(E Id);
-
-#define Is_Elaboration_Target einfo__utils__is_elaboration_target
-B Is_Elaboration_Target   (E Id);
-
-#define Is_External_State einfo__utils__is_external_state
-B Is_External_State   (E Id);
-
-#define Is_Finalizer einfo__utils__is_finalizer
-B Is_Finalizer(E Id);
-
-#define Is_Null_State einfo__utils__is_null_state
-B Is_Null_State   (E Id);
-
-#define Is_Package_Or_Generic_Package 
einfo__utils__is_package_or_generic_package
-B Is_Package_Or_Generic_Package   (E Id);
-
-#define Is_Packed_Array einfo__utils__is_packed_array
-B Is_Packed_Array   

[COMMITTED] ada: Fix wrong Finalization_Size for No_Heap_Finalization objects

2024-05-06 Thread Marc Poulhiès
From: Eric Botcazou 

When an access type is subject to the No_Heap_Finalization pragma, no header
is added in front of objects allocated through it, and the value returned by
Finalization_Size is defined to be the size of this header.

gcc/ada/

* exp_attr.adb (Expand_N_Attribute_Reference) :
Return 0 if the prefix is a dereference of an access value subject
to the No_Heap_Finalization pragma.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_attr.adb | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/gcc/ada/exp_attr.adb b/gcc/ada/exp_attr.adb
index 614f1fbe14d..a8e06f0005e 100644
--- a/gcc/ada/exp_attr.adb
+++ b/gcc/ada/exp_attr.adb
@@ -3563,6 +3563,14 @@ package body Exp_Attr is
   --  Start of processing for Finalization_Size
 
   begin
+ --  If the prefix is the dereference of an access value subject to
+ --  pragma No_Heap_Finalization, then no header has been added.
+
+ if Nkind (Pref) = N_Explicit_Dereference
+   and then No_Heap_Finalization (Etype (Prefix (Pref)))
+ then
+Rewrite (N, Make_Integer_Literal (Loc, 0));
+
  --  An object of a class-wide type first requires a runtime check to
  --  determine whether it is actually controlled or not. Depending on
  --  the outcome of this check, the Finalization_Size of the object
@@ -3578,7 +3586,7 @@ package body Exp_Attr is
  --
  --  and the attribute reference is replaced with a reference to Size.
 
- if Is_Class_Wide_Type (Ptyp) then
+ elsif Is_Class_Wide_Type (Ptyp) then
 Size := Make_Temporary (Loc, 'S');
 
 Insert_Actions (N, New_List (
-- 
2.43.2



[COMMITTED] ada: Spurious reference warning on qualified expression

2024-05-06 Thread Marc Poulhiès
From: Justin Squirek 

Incremental improvement/clean up.

gcc/ada/

* sem_warn.adb (Within_Postcondition): Add/modify comments to
document various cases.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_warn.adb | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/ada/sem_warn.adb b/gcc/ada/sem_warn.adb
index 8317b449021..54d8920a943 100644
--- a/gcc/ada/sem_warn.adb
+++ b/gcc/ada/sem_warn.adb
@@ -1973,6 +1973,8 @@ package body Sem_Warn is
  begin
 Nod := Parent (N);
 while Present (Nod) loop
+   --  General contract / predicate related pragma
+
if Nkind (Nod) = N_Pragma
  and then
Pragma_Name_Unmapped (Nod)
@@ -1992,6 +1994,8 @@ package body Sem_Warn is
then
   return True;
 
+   --  Deal with special 'Ensures' Test_Case component
+
elsif Present (Parent (Nod)) then
   P := Parent (Nod);
 
-- 
2.43.2



[COMMITTED] ada: Spurious reference warning on qualified expression

2024-05-06 Thread Marc Poulhiès
From: Justin Squirek 

This patch fixes an error in the compiler whereby an assignment to an out
formal (whose type requires a predicate check) can lead to spurious
"value may be referenced before it has a value" warnings when the RHS is a
qualified expression.
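
For illustration only, source of the kind described above might look like the sketch below. The names Predicate_Out_Demo, Even and Set are made up for this note and do not come from the patch or its testcase, and it is an assumption that this exact shape triggered the spurious warning before the fix.

```ada
with Ada.Text_IO;

procedure Predicate_Out_Demo is

   --  Hypothetical subtype whose predicate has to be checked on assignment
   subtype Even is Integer
     with Dynamic_Predicate => Even mod 2 = 0;

   --  Assigns a qualified expression to an out formal. With assertions
   --  enabled (-gnata), a predicate check is performed for the assignment.
   procedure Set (X : out Even) is
   begin
      X := Even'(4);
   end Set;

   V : Even;

begin
   Set (V);
   Ada.Text_IO.Put_Line (Integer'Image (V));
end Predicate_Out_Demo;
```

The point of the sketch is only that X is never read by the user code before it is assigned, so a "referenced before it has a value" warning on it would be spurious.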

gcc/ada/

* sem_warn.adb (Within_Postcondition): Add case to ignore
references within generated predicate function calls.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_warn.adb | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/gcc/ada/sem_warn.adb b/gcc/ada/sem_warn.adb
index eaf9a257ba0..0a54b3eda50 100644
--- a/gcc/ada/sem_warn.adb
+++ b/gcc/ada/sem_warn.adb
@@ -1982,6 +1982,16 @@ package body Sem_Warn is
then
   return True;
 
+   --  Verify we are not within a generated predicate
+   --  function call.
+
+   elsif Nkind (Nod) = N_Function_Call
+ and then Is_Entity_Name (Name (Nod))
+ and then Is_Predicate_Function
+(Entity (Name (Nod)))
+   then
+  return True;
+
elsif Present (Parent (Nod)) then
   P := Parent (Nod);
 
-- 
2.43.2



[COMMITTED] ada: Spurious reference warning on qualified expression

2024-05-06 Thread Marc Poulhiès
From: Justin Squirek 

Incremental improvement/clean up.

gcc/ada/

* sem_warn.adb (Within_Postcondition): Add coverage for
Preconditions.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_warn.adb | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/ada/sem_warn.adb b/gcc/ada/sem_warn.adb
index 54d8920a943..57bdee65356 100644
--- a/gcc/ada/sem_warn.adb
+++ b/gcc/ada/sem_warn.adb
@@ -1978,7 +1978,8 @@ package body Sem_Warn is
if Nkind (Nod) = N_Pragma
  and then
Pragma_Name_Unmapped (Nod)
-in Name_Postcondition
+in Name_Precondition
+ | Name_Postcondition
  | Name_Refined_Post
  | Name_Contract_Cases
then
-- 
2.43.2



[COMMITTED] ada: Fix detection of (Un)Hide_Info pragma in GNATprove mode

2024-05-06 Thread Marc Poulhiès
From: Yannick Moy 

The spec or body may not be in a list in the case of a subunit.

gcc/ada/

* inline.adb (Can_Be_Inlined_In_GNATprove_Mode): Add guard.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/inline.adb | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/gcc/ada/inline.adb b/gcc/ada/inline.adb
index 98bed860760..b7a6cc90cd2 100644
--- a/gcc/ada/inline.adb
+++ b/gcc/ada/inline.adb
@@ -1819,6 +1819,7 @@ package body Inline is
 
   begin
  if Present (Spec_Id)
+   and then Is_List_Member (Unit_Declaration_Node (Spec_Id))
and then Has_Hide_Unhide_Pragma
  (Next (Unit_Declaration_Node (Spec_Id)))
  then
@@ -1829,7 +1830,9 @@ package body Inline is
Subp_Body : constant N_Subprogram_Body_Id :=
  Unit_Declaration_Node (Body_Id);
 begin
-   return Has_Hide_Unhide_Pragma (Next (Subp_Body))
+   return
+ (Is_List_Member (Subp_Body)
+   and then Has_Hide_Unhide_Pragma (Next (Subp_Body)))
  or else
Has_Hide_Unhide_Pragma (First (Declarations (Subp_Body)));
 end;
-- 
2.43.2



[COMMITTED] ada: Spurious reference warning on qualified expression

2024-05-06 Thread Marc Poulhiès
From: Justin Squirek 

Incremental improvement/clean up.

gcc/ada/

* sem_warn.adb (Within_Postcondition): Renamed to
Within_Contract_Or_Predicate.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_warn.adb | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/gcc/ada/sem_warn.adb b/gcc/ada/sem_warn.adb
index 0a54b3eda50..8317b449021 100644
--- a/gcc/ada/sem_warn.adb
+++ b/gcc/ada/sem_warn.adb
@@ -1958,16 +1958,16 @@ package body Sem_Warn is
  SR : Entity_Id;
  SE : constant Entity_Id := Scope (E);
 
- function Within_Postcondition return Boolean;
- --  Returns True if N is within a Postcondition, a
- --  Refined_Post, an Ensures component in a Test_Case,
- --  or a Contract_Cases.
+ function Within_Contract_Or_Predicate  return Boolean;
+ --  Returns True if N is within a contract or predicate,
+ --  an Ensures component in a Test_Case, or a
+ --  Contract_Cases.
 
- --
- -- Within_Postcondition --
- --
+ --
+ -- Within_Contract_Or_Predicate --
+ --
 
- function Within_Postcondition return Boolean is
+ function Within_Contract_Or_Predicate return Boolean is
 Nod, P : Node_Id;
 
  begin
@@ -2012,7 +2012,7 @@ package body Sem_Warn is
 end loop;
 
 return False;
- end Within_Postcondition;
+ end Within_Contract_Or_Predicate;
 
   --  Start of processing for Potential_Unset_Reference
 
@@ -2136,7 +2136,7 @@ package body Sem_Warn is
  --  postcondition, since the expression occurs in a
  --  place unrelated to the actual test.
 
- if not Within_Postcondition then
+ if not Within_Contract_Or_Predicate then
 
 --  Here we definitely have a case for giving a warning
 --  for a reference to an unset value. But we don't
-- 
2.43.2



[COMMITTED] ada: Replace references to PO_Simple by Protected_Objects in comments

2024-05-06 Thread Marc Poulhiès
From: Eric Botcazou 

The child unit was renamed a while ago.

gcc/ada/

* libgnarl/s-taprob.ads (Protection): Add cross-reference to the
counterpart in System.Tasking.Protected_Objects.Entries.
* libgnarl/s-taskin.ads (Locking Rules): Replace PO_Simple by
Protected_Objects.
* libgnarl/s-tpoben.ads (Protection_Entries): Likewise.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/libgnarl/s-taprob.ads | 6 ++
 gcc/ada/libgnarl/s-taskin.ads | 4 ++--
 gcc/ada/libgnarl/s-tpoben.ads | 4 ++--
 3 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/gcc/ada/libgnarl/s-taprob.ads b/gcc/ada/libgnarl/s-taprob.ads
index 34cfcc7a9ba..e94ec71e768 100644
--- a/gcc/ada/libgnarl/s-taprob.ads
+++ b/gcc/ada/libgnarl/s-taprob.ads
@@ -207,6 +207,12 @@ package System.Tasking.Protected_Objects is
--  lock and allowed to return from the Lock or Lock_Read_Only call.
 
 private
+   --  The following type contains the GNARL state of a protected object.
+   --  The application-defined portion of the state (i.e. private objects)
+   --  is maintained by the compiler-generated code. Note that there is
+   --  another version declared in System.Tasking.Protected_Objects.Entries
+   --  that handles the case with entries and is controlled.
+
type Protection is record
   L : aliased Task_Primitives.Lock;
   --  Lock used to ensure mutual exclusive access to the protected object
diff --git a/gcc/ada/libgnarl/s-taskin.ads b/gcc/ada/libgnarl/s-taskin.ads
index 949fb7e6607..1bae7e114cf 100644
--- a/gcc/ada/libgnarl/s-taskin.ads
+++ b/gcc/ada/libgnarl/s-taskin.ads
@@ -70,9 +70,9 @@ package System.Tasking is
-- Unlock (Y);
 
--  Locks with lower (smaller) level number cannot be locked
-   --  while holding a lock with a higher level number. (The level
+   --  while holding a lock with a higher level number.
 
-   --  1. System.Tasking.PO_Simple.Protection.L (any PO lock)
+   --  1. System.Tasking.Protected_Objects.Protection.L (any PO lock)
--  2. System.Tasking.Initialization.Global_Task_Lock (in body)
--  3. System.Task_Primitives.Operations.Single_RTS_Lock
--  4. System.Tasking.Ada_Task_Control_Block.LL.L (any TCB lock)
diff --git a/gcc/ada/libgnarl/s-tpoben.ads b/gcc/ada/libgnarl/s-tpoben.ads
index d1e6b8533d2..a7091184223 100644
--- a/gcc/ada/libgnarl/s-tpoben.ads
+++ b/gcc/ada/libgnarl/s-tpoben.ads
@@ -79,8 +79,8 @@ package System.Tasking.Protected_Objects.Entries is
--  The following type contains the GNARL state of a protected object.
--  The application-defined portion of the state (i.e. private objects)
--  is maintained by the compiler-generated code. Note that there is a
-   --  simplified version of this type declared in System.Tasking.PO_Simple
-   --  that handle the simple case (no entries).
+   --  simplified version declared in System.Tasking.Protected_Objects that
+   --  handles the simple case (no entries) and is not controlled.
 
type Protection_Entries (Num_Entries : Protected_Entry_Index) is new
  Ada.Finalization.Limited_Controlled
-- 
2.43.2



[COMMITTED] ada: Replace redundant conditions with assertions

2024-05-06 Thread Marc Poulhiès
From: Piotr Trojanek 

Fix warnings from CodePeer. The code structure is essentially:

  if A and B then ...
  elsif not A and not B then ...
  elsif A then ...
  elsif B then ...  --  this condition is redundant
  end if;

and it causes CodePeer to say "exception is raised in a conditional
branch", which most likely means that the condition is redundant.
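
A minimal, self-contained sketch of the resulting shape is given below. The function Diagnose and its Boolean parameters are hypothetical stand-ins for the null-path tests in the patched units, and the strings are shortened. It only shows the last alternative becoming an else with an assertion.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Redundant_Condition_Demo is

   --  Hypothetical stand-in for the RTS path checks: each flag says
   --  whether the corresponding directory is missing.
   function Diagnose (Src_Missing, Lib_Missing : Boolean) return String is
   begin
      if Src_Missing and Lib_Missing then
         return "missing adainclude and adalib directories";

      elsif not Src_Missing and not Lib_Missing then
         return "RTS path is valid";

      elsif Src_Missing then
         return "missing adainclude directory";

      else
         --  Only one combination is left at this point, so it is stated
         --  as an invariant instead of being tested by a fourth elsif.
         pragma Assert (Lib_Missing);
         return "missing adalib directory";
      end if;
   end Diagnose;

begin
   Put_Line (Diagnose (True, True));
   Put_Line (Diagnose (False, False));
   Put_Line (Diagnose (True, False));
   Put_Line (Diagnose (False, True));
end Redundant_Condition_Demo;
```

The assertion is only checked when assertions are enabled (e.g. with -gnata); otherwise the else branch simply behaves as the former last elsif did.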

gcc/ada/

* make.adb (Scan_Make_Arg): Remove redundant condition.
* switch-b.adb (Scan_Debug_Switches): Likewise.
* switch-c.adb (Scan_Front_End_Switches): Likewise.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/make.adb | 2 +-
 gcc/ada/switch-b.adb | 2 +-
 gcc/ada/switch-c.adb | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/ada/make.adb b/gcc/ada/make.adb
index 01d3ccab8af..24b2d099bfe 100644
--- a/gcc/ada/make.adb
+++ b/gcc/ada/make.adb
@@ -4508,7 +4508,7 @@ package body Make is
  Make_Failed
("RTS path not valid: missing adainclude directory");
 
-  elsif Lib_Path_Name = null then
+  else pragma Assert (Lib_Path_Name = null);
  Make_Failed
("RTS path not valid: missing adalib directory");
   end if;
diff --git a/gcc/ada/switch-b.adb b/gcc/ada/switch-b.adb
index 2c4fc0c6039..8d8dc58937c 100644
--- a/gcc/ada/switch-b.adb
+++ b/gcc/ada/switch-b.adb
@@ -703,7 +703,7 @@ package body Switch.B is
  elsif Src_Path_Name = null then
 Osint.Fail
   ("RTS path not valid: missing adainclude directory");
- elsif Lib_Path_Name = null then
+ else pragma Assert (Lib_Path_Name = null);
 Osint.Fail
   ("RTS path not valid: missing adalib directory");
  end if;
diff --git a/gcc/ada/switch-c.adb b/gcc/ada/switch-c.adb
index 7668fce885a..43b69f1dde1 100644
--- a/gcc/ada/switch-c.adb
+++ b/gcc/ada/switch-c.adb
@@ -274,7 +274,7 @@ package body Switch.C is
  Osint.Fail ("RTS path not valid: missing "
  & "adainclude directory");
 
-  elsif RTS_Lib_Path_Name = null then
+  else pragma Assert (RTS_Lib_Path_Name = null);
  Osint.Fail ("RTS path not valid: missing "
  & "adalib directory");
   end if;
-- 
2.43.2



[COMMITTED] ada: Fix missing associated node for packed array itypes

2024-05-06 Thread Marc Poulhiès
From: Piotr Trojanek 

After decoration, itypes should have their associated node set.

gcc/ada/

* exp_pakd.adb (Create_Packed_Array_Impl_Type): Set associated
node for the packed array itype.
* exp_util.adb (Possible_Side_Effect_In_SPARK): Remove
workaround for a missing associated node.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_pakd.adb |  4 
 gcc/ada/exp_util.adb | 15 ++-
 2 files changed, 6 insertions(+), 13 deletions(-)

diff --git a/gcc/ada/exp_pakd.adb b/gcc/ada/exp_pakd.adb
index 628a3e38a78..3f26c3527fa 100644
--- a/gcc/ada/exp_pakd.adb
+++ b/gcc/ada/exp_pakd.adb
@@ -541,8 +541,12 @@ package body Exp_Pakd is
 
  if Is_Itype (Typ) then
 Set_Parent (Decl, Associated_Node_For_Itype (Typ));
+Set_Associated_Node_For_Itype
+  (PAT, Associated_Node_For_Itype (Typ));
  else
 Set_Parent (Decl, Declaration_Node (Typ));
+Set_Associated_Node_For_Itype
+  (PAT, Declaration_Node (Typ));
  end if;
 
  if Scope (Typ) /= Current_Scope then
diff --git a/gcc/ada/exp_util.adb b/gcc/ada/exp_util.adb
index 25190a65ebf..e7573277b61 100644
--- a/gcc/ada/exp_util.adb
+++ b/gcc/ada/exp_util.adb
@@ -11917,7 +11917,7 @@ package body Exp_Util is
 
 --  When this routine is called while the itype
 --  is being created, the entity might not yet be
---  decorated with the associated node, but should
+--  decorated with the associated node, but will
 --  have the related expression.
 
 if Present (Associated_Node_For_Itype (Subt)) then
@@ -11925,21 +11925,10 @@ package body Exp_Util is
  Possible_Side_Effect_In_SPARK
(Associated_Node_For_Itype (Subt));
 
-elsif Present (Related_Expression (Subt)) then
+else
return
  Possible_Side_Effect_In_SPARK
(Related_Expression (Subt));
-
---  When the itype doesn't have any indication of its
---  origin (which currently only happens for packed
---  array types created by freezing that shouldn't
---  be picked by GNATprove anyway), then we can
---  conservatively assume that the expression can
---  be kept as it appears in the source code.
-
-else
-   pragma Assert (Is_Packed_Array_Impl_Type (Subt));
-   return False;
 end if;
  else
 return True;
-- 
2.43.2



[COMMITTED] ada: Rework processing of special objects needing finalization

2024-05-06 Thread Marc Poulhiès
From: Eric Botcazou 

This reworks the processing of special objects needing finalization in the
new implementation.  These special objects, i.e. return object in extended
return statements and transient objects, cannot be automatically handled by
the post-processing phase because they have additional requirements, either
conditional finalization for the former or immediate finalization for the
latter and, therefore, a specific processing during expansion is needed for
them before the post-processing phase can complete the work.

The previous scheme used to do minimal processing during expansion, leaving
the bulk of the work to the post-processing phase. Unfortunately this scheme
turned out not to be stable for Expression_With_Actions nodes under copying
by means of New_Copy_Tree or equivalent devices. The new scheme moves a bit
more processing to the expansion, namely the generation of the attachment to
the master node, whose result can then be naturally copied by New_Copy_Tree.

A side effect is to further simplify the implementation of Build_Finalizer
in Exp_Ch7, which has one fewer special case to deal with.
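
For readers less familiar with the two kinds of special objects mentioned above, here is a small user-level sketch. The package Tracked, the type Obj and the function Make are invented for this note. It shows where a return object of an extended return statement and a transient object arise in source code, and says nothing about how the expander handles them.

```ada
with Ada.Finalization;
with Ada.Text_IO; use Ada.Text_IO;

procedure Special_Objects_Demo is

   package Tracked is
      type Obj is new Ada.Finalization.Controlled with record
         Id : Integer := 0;
      end record;

      overriding procedure Finalize (X : in out Obj);

      function Make (Id : Integer) return Obj;
   end Tracked;

   package body Tracked is

      overriding procedure Finalize (X : in out Obj) is
      begin
         Put_Line ("Finalize" & Integer'Image (X.Id));
      end Finalize;

      function Make (Id : Integer) return Obj is
      begin
         --  R is the return object of an extended return statement: it
         --  needs to be finalized here only if the return statement does
         --  not complete normally.
         return R : Obj do
            R.Id := Id;
         end return;
      end Make;

   end Tracked;

begin
   --  The result of Make is a transient object in this statement: it is
   --  to be finalized as soon as the statement is done with it.
   Put_Line ("Id =" & Integer'Image (Tracked.Make (1).Id));
   Put_Line ("done");
end Special_Objects_Demo;
```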

gcc/ada/

* einfo.ads (Finalization_Master_Node_Or_Object): Rename into...
(Finalization_Master_Node): ...this and adjust description.
* exp_ch4.adb (Process_Transient_In_Expression): Attach the object
to its master node here.
* exp_ch7.ads (Attach_Object_To_Master_Node): New declaration.
* exp_ch7.adb (Attach_Object_To_Master_Node): New procedure.
(Build_Finalizer.Process_Declarations): Examine the type of a
variable to spot master nodes.
(Build_Finalizer.Process_Object_Declaration): Look only at the
object and deal specifically with the case of a master node.
(Build_Finalizer.Build_BIP_Cleanup_Stmts): Move to child function
of Attach_Object_To_Master_Node.
(Build_Finalizer.Make_Address_For_Finalize): Move to...
(Insert_Actions_In_Scope_Around.Process_Transient_In_Scope): Attach
the object to its master node here.
(Make_Address_For_Finalize): ...here.
(Make_Master_Node_Declaration): Adjust to above renaming and set
Finalization_Master_Node only on the object.
(Make_Suppress_Object_Finalize_Call): Adjust to above renaming and
attach the object to its master node here.
* exp_util.adb (Requires_Cleanup_Actions): Examine the type of a
variable to spot master nodes.
* gen_il-fields.ads (Opt_Field_Enum): Adjust to above renaming.
* gen_il-gen-gen_entities.adb (Allocatable_Kind): Likewise.
* rtsfind.ads (RE_Id): Add RE_Chain_Node_To_Master.
(RE_Unit_Table): Add entry for RE_Chain_Node_To_Master.
* libgnat/s-finpri.ads (Chain_Node_To_Master): New declaration.
* libgnat/s-finpri.adb (Chain_Node_To_Master): New procedure.
(Attach_Object_To_Master): Call it.
(Finalize_Master): Do not raise Program_Error on null addresses.
(Finalize_Object): Add assertion that the address is not null.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/einfo.ads   |   8 +-
 gcc/ada/exp_ch4.adb |   4 +
 gcc/ada/exp_ch7.adb | 804 
 gcc/ada/exp_ch7.ads |   7 +
 gcc/ada/exp_util.adb|   7 +-
 gcc/ada/gen_il-fields.ads   |   2 +-
 gcc/ada/gen_il-gen-gen_entities.adb |   2 +-
 gcc/ada/libgnat/s-finpri.adb|  32 +-
 gcc/ada/libgnat/s-finpri.ads|  12 +-
 gcc/ada/rtsfind.ads |   2 +
 10 files changed, 500 insertions(+), 380 deletions(-)

diff --git a/gcc/ada/einfo.ads b/gcc/ada/einfo.ads
index 24964004c05..6f563d5e62c 100644
--- a/gcc/ada/einfo.ads
+++ b/gcc/ada/einfo.ads
@@ -1305,15 +1305,13 @@ package Einfo is
 --   type. Empty for access-to-subprogram types. Empty for access types
 --   whose designated type does not need finalization actions.
 
---Finalization_Master_Node_Or_Object
+--Finalization_Master_Node
 --   Defined in variables and constants that require finalization actions.
 --   The field contains the entity of an object (called a Master_Node) that
 --   contains the address of the finalizable object, along with an access
 --   value denoting the finalizable object's finalization procedure. The
 --   Master_Node may be attached to a finalization list associated with
 --   either the global scope or some dynamic scope (block or subprogram).
---   Conversely, for a Master_Node entity, the field contains the entity
---   of the finalizable object.
 
 --Finalize_Storage_Only [base type only]
 --   Defined in all types. Set on direct controlled types to which a
@@ -5304,7 +5302,7 @@ package Einfo is
--Related_Type  (constants only)
--Initialization_Statements
--BIP_Initialization_Call
-   --

[COMMITTED] ada: Fix spurious warning emission

2024-05-06 Thread Marc Poulhiès
From: Ronan Desplanques 

This patch fixes a bug where GNAT would emit incorrect warnings
about obsolescent syntax for array aggregates with generics and
particular arrangements of Ada version pragmas.

This patch also removes a syntactic field that was introduced to
support the emission of this warning, but is no longer required.
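
As a reminder of the two syntaxes involved, a minimal sketch follows. The unit and type names are made up for this note. It does not reproduce the generic/pragma arrangement that triggered the bug, which needs several units compiled in different Ada versions.

```ada
pragma Ada_2022;

with Ada.Text_IO;

procedure Aggregate_Syntax_Demo is

   type Vec is array (1 .. 3) of Integer;

   --  Parenthesized form: still accepted, but it is the form that the
   --  obsolescent-syntax warning is about in Ada 2022 mode.
   Old_Style : constant Vec := (1, 2, 3);

   --  Square-bracket form introduced by Ada 2022
   New_Style : constant Vec := [1, 2, 3];

begin
   if Old_Style = New_Style then
      Ada.Text_IO.Put_Line ("same value");
   end if;
end Aggregate_Syntax_Demo;
```

Both declarations denote the same array value; only the first one is a candidate for the warning, and only when warnings on obsolescent features are enabled.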

gcc/ada/

* exp_imgv.adb (Append_Table_To): Remove reference to removed
field.
* gen_il-fields.ads: Remove Is_Enum_Array_Aggregate field.
* gen_il-gen-gen_nodes.adb: Likewise.
* sem_aggr.adb: Tweak warning emission condition.
* sinfo.ads: Remove documentation for Is_Enum_Array_Aggregate.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_imgv.adb |  3 +--
 gcc/ada/gen_il-fields.ads|  1 -
 gcc/ada/gen_il-gen-gen_nodes.adb |  1 -
 gcc/ada/sem_aggr.adb | 13 -
 gcc/ada/sinfo.ads|  5 -
 5 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/gcc/ada/exp_imgv.adb b/gcc/ada/exp_imgv.adb
index ec0126a7da2..6dc59f2c6f3 100644
--- a/gcc/ada/exp_imgv.adb
+++ b/gcc/ada/exp_imgv.adb
@@ -159,8 +159,7 @@ package body Exp_Imgv is
Make_Component_Definition (Loc,
  Aliased_Present=> False,
  Subtype_Indication => New_Occurrence_Of (Ctyp, Loc))),
- Expression  => Make_Aggregate (Loc, Expressions => V,
-  Is_Enum_Array_Aggregate => True)));
+ Expression  => Make_Aggregate (Loc, Expressions => V)));
   end Append_Table_To;
 
--  Start of Build_Enumeration_Image_Tables
diff --git a/gcc/ada/gen_il-fields.ads b/gcc/ada/gen_il-fields.ads
index cdd9b9577e2..7cf6a38faa3 100644
--- a/gcc/ada/gen_il-fields.ads
+++ b/gcc/ada/gen_il-fields.ads
@@ -259,7 +259,6 @@ package Gen_IL.Fields is
   Is_Generic_Contract_Pragma,
   Is_Homogeneous_Aggregate,
   Is_Parenthesis_Aggregate,
-  Is_Enum_Array_Aggregate,
   Is_Ignored,
   Is_Ignored_Ghost_Pragma,
   Is_In_Discriminant_Check,
diff --git a/gcc/ada/gen_il-gen-gen_nodes.adb b/gcc/ada/gen_il-gen-gen_nodes.adb
index 72280025498..d7cc39bc048 100644
--- a/gcc/ada/gen_il-gen-gen_nodes.adb
+++ b/gcc/ada/gen_il-gen-gen_nodes.adb
@@ -492,7 +492,6 @@ begin -- Gen_IL.Gen.Gen_Nodes
 Sy (Null_Record_Present, Flag),
 Sy (Is_Parenthesis_Aggregate, Flag),
 Sy (Is_Homogeneous_Aggregate, Flag),
-Sy (Is_Enum_Array_Aggregate, Flag),
 Sm (Aggregate_Bounds_Or_Ancestor_Type, Node_Id),
 Sm (Entity_Or_Associated_Node, Node_Id), -- just Associated_Node
 Sm (Compile_Time_Known_Aggregate, Flag),
diff --git a/gcc/ada/sem_aggr.adb b/gcc/ada/sem_aggr.adb
index e381af101c8..508c86bc5de 100644
--- a/gcc/ada/sem_aggr.adb
+++ b/gcc/ada/sem_aggr.adb
@@ -2099,14 +2099,25 @@ package body Sem_Aggr is
 
   --  Disable the warning for GNAT Mode to allow for easier transition.
 
+  --  We don't warn about obsolescent usage of parentheses in generic
+  --  instances for two reasons:
+  --
+  --  1. An equivalent warning has been emitted in the corresponding
+  -- definition.
+  --  2. In cases where a generic definition specifies a version older than
+  -- Ada 2022 through a pragma and rightfully uses parentheses for
+  -- an array aggregate, an incorrect warning would be raised in
+  -- instances of that generic that are in Ada 2022 or later if we
+  -- didn't filter out the instance case.
+
   if Ada_Version_Explicit >= Ada_2022
 and then Warn_On_Obsolescent_Feature
 and then not GNAT_Mode
 and then not Is_Homogeneous_Aggregate (N)
-and then not Is_Enum_Array_Aggregate (N)
 and then Is_Parenthesis_Aggregate (N)
 and then Nkind (Parent (N)) /= N_Qualified_Expression
 and then Comes_From_Source (N)
+and then not In_Instance
   then
  Error_Msg_N
("?j?array aggregate using () is an" &
diff --git a/gcc/ada/sinfo.ads b/gcc/ada/sinfo.ads
index 803f5dfc759..4e977152cd0 100644
--- a/gcc/ada/sinfo.ads
+++ b/gcc/ada/sinfo.ads
@@ -1715,10 +1715,6 @@ package Sinfo is
--nodes which emulate the barrier function of a protected entry body.
--The flag is used when checking for incorrect use of Current_Task.
 
-   --  Is_Enum_Array_Aggregate
-   --A flag set on an aggregate created internally while building the
-   --images tables for enumerations.
-
--  Is_Expanded_Build_In_Place_Call
--This flag is set in an N_Function_Call node to indicate that the extra
--actuals to support a build-in-place style of call have been added to
@@ -4091,7 +4087,6 @@ package Sinfo is
   --  Compile_Time_Known_Aggregate
   --  Expansion_Delayed
   --  Has_Self_Reference
-  --  Is_Enum_Array_Aggregate
   --  Is_Homogeneous_Aggregate
   --  

[COMMITTED] ada: Don't propagate convention to internal subprograms

2024-05-06 Thread Marc Poulhiès
From: Richard Kenner 

AI95-117 requires that all new primitives of a tagged type must
inherit the convention of the full view of the type. However, we need
not do this for primitives that are internally-generated, such as for
finalization. There are issues with GNAT LLVM when primitives have
convention C since the UC from that subprogram type to the type used
in the dispatch table will generate a warning. We're not doing
anything here about the case where the convention C is explicit or
there are user-specified primitives on a type with convention C, but
let's not make the problem worse by putting convention C on internal
subprograms.

gcc/ada/

* freeze.adb (Freeze_Entity): When changing the convention of a
primitive to match that of the type, only do this for user-specified
primitives.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/freeze.adb | 28 +---
 1 file changed, 17 insertions(+), 11 deletions(-)

diff --git a/gcc/ada/freeze.adb b/gcc/ada/freeze.adb
index d032b75f1f2..4cb5979b016 100644
--- a/gcc/ada/freeze.adb
+++ b/gcc/ada/freeze.adb
@@ -7327,17 +7327,21 @@ package body Freeze is
 
  if Is_Composite_Type (E) then
 
---  AI95-117 requires that all new primitives of a tagged type must
---  inherit the convention of the full view of the type. Inherited
---  and overriding operations are defined to inherit the convention
---  of their parent or overridden subprogram (also specified in
---  AI-117), which will have occurred earlier (in Derive_Subprogram
---  and New_Overloaded_Entity). Here we set the convention of
+--  AI95-117 requires that all new primitives of a tagged type
+--  must inherit the convention of the full view of the
+--  type. Inherited and overriding operations are defined to
+--  inherit the convention of their parent or overridden
+--  subprogram (also specified in AI-117), which will have
+--  occurred earlier (in Derive_Subprogram and
+--  New_Overloaded_Entity). Here we set the convention of
 --  primitives that are still convention Ada, which will ensure
---  that any new primitives inherit the type's convention. Class-
---  wide types can have a foreign convention inherited from their
---  specific type, but are excluded from this since they don't have
---  any associated primitives.
+--  that any new primitives inherit the type's convention. We
+--  don't do this for primitives that are internal to avoid
+--  potential problems in the case of nested subprograms and
+--  convention C. In addition, class-wide types can have a
+--  foreign convention inherited from their specific type, but
+--  are excluded from this since they don't have any associated
+--  primitives.
 
 if Is_Tagged_Type (E)
   and then not Is_Class_Wide_Type (E)
@@ -7350,7 +7354,9 @@ package body Freeze is
begin
   Prim := First_Elmt (Prim_List);
   while Present (Prim) loop
- if Convention (Node (Prim)) = Convention_Ada then
+ if Convention (Node (Prim)) = Convention_Ada
+   and then Comes_From_Source (Node (Prim))
+ then
 Set_Convention (Node (Prim), Convention (E));
  end if;
 
-- 
2.43.2



[COMMITTED] ada: Fix RM reference in comment

2024-05-06 Thread Marc Poulhiès
From: Ronan Desplanques 

The RM 2.2(15) that the comment mentioned before this patch doesn't
exist. It's pretty clear that the comment meant to refer to
RM 2.2(14) instead.

gcc/ada/

* hostparm.ads: Fix reference to RM clause.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/hostparm.ads | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/ada/hostparm.ads b/gcc/ada/hostparm.ads
index 11eef359d30..b2d2f814b32 100644
--- a/gcc/ada/hostparm.ads
+++ b/gcc/ada/hostparm.ads
@@ -48,7 +48,7 @@ package Hostparm is
--  have a valid Column_Number equal to Max_Line_Length to represent
--  the location of a "line too long" error.
--
-   --  200 is the minimum value required (RM 2.2(15)). The value set here
+   --  200 is the minimum value required (RM 2.2(14)). The value set here
--  can be reduced by the explicit use of the -gnatyM style switch.
 
Max_Name_Length : constant := 1024;
-- 
2.43.2



[COMMITTED] ada: Fix memory leak in 'Image

2024-05-06 Thread Marc Poulhiès
From: Bob Duff 

Fix memory leak in 'Image by managing the secondary stack
in scopes that call the new Ada 2022 'Image, which calls 'Put_Image
and then Get, which returns on the secondary stack.
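
A sketch of user code that goes through this path is shown below. The names Image_Demo and Point are invented for this note, and it is an assumption that this particular shape leaked before the fix. It merely shows 'Image applied to a non-scalar object inside a scope that is entered repeatedly.

```ada
pragma Ada_2022;

with Ada.Text_IO;

procedure Image_Demo is

   type Point is record
      X, Y : Integer;
   end record;

   P : constant Point := (X => 1, Y => 2);

begin
   --  Each evaluation of P'Image builds a String result; per the
   --  description above, that result is returned on the secondary stack,
   --  which therefore has to be managed by the enclosing scope.
   for I in 1 .. 3 loop
      Ada.Text_IO.Put_Line (P'Image);
   end loop;
end Image_Demo;
```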

gcc/ada/

* exp_put_image.adb (Build_Image_Call): Call Set_Uses_Sec_Stack on
the current scope. We don't do this at all call sites, because
there are three; better to do it here.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_put_image.adb | 8 
 1 file changed, 8 insertions(+)

diff --git a/gcc/ada/exp_put_image.adb b/gcc/ada/exp_put_image.adb
index c298163f36f..182497fb6e8 100644
--- a/gcc/ada/exp_put_image.adb
+++ b/gcc/ada/exp_put_image.adb
@@ -1290,6 +1290,14 @@ package body Exp_Put_Image is
  Actions := New_List (Sink_Decl, Put_Im, Result_Decl);
   end if;
 
+  --  To avoid leaks, we need to manage the secondary stack, because Get is
+  --  returning a String allocated thereon. It might be cleaner to let the
+  --  normal mechanisms for functions returning on the secondary stack call
+  --  Set_Uses_Sec_Stack, but this expansion of 'Image is happening too
+  --  late for that.
+
+  Set_Uses_Sec_Stack (Current_Scope);
+
   return Make_Expression_With_Actions (Loc,
 Actions=> Actions,
 Expression => New_Occurrence_Of (Result_Entity, Loc));
-- 
2.43.2



[COMMITTED] ada: Fix non-idiomatic construct

2024-05-06 Thread Marc Poulhiès
From: Eric Botcazou 

gcc/ada/

* exp_ch3.adb (Expand_Freeze_Class_Wide_Type): Use No instead of
not Present.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_ch3.adb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/ada/exp_ch3.adb b/gcc/ada/exp_ch3.adb
index 7a137dda3f7..09551b22154 100644
--- a/gcc/ada/exp_ch3.adb
+++ b/gcc/ada/exp_ch3.adb
@@ -5021,7 +5021,7 @@ package body Exp_Ch3 is
   --  Create the body of TSS primitive Finalize_Address. This automatically
   --  sets the TSS entry for the class-wide type.
 
-  if not Present (Finalize_Address (Typ)) then
+  if No (Finalize_Address (Typ)) then
  Make_Finalize_Address_Body (Typ);
   end if;
end Expand_Freeze_Class_Wide_Type;
-- 
2.43.2



[COMMITTED] ada: Make a couple of comment tweaks

2024-05-06 Thread Marc Poulhiès
From: Eric Botcazou 

This removes a reference to a mechanism that didn't make it into the final
implementation and completes the description of another.

gcc/ada/

* libgnat/s-finpri.ads (Finalize_Master): Remove obsolete reference
in the description.
(Finalize_Object): Document the effects of repeated calls.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/libgnat/s-finpri.ads | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/gcc/ada/libgnat/s-finpri.ads b/gcc/ada/libgnat/s-finpri.ads
index de775caee91..ab79ea2c664 100644
--- a/gcc/ada/libgnat/s-finpri.ads
+++ b/gcc/ada/libgnat/s-finpri.ads
@@ -92,13 +92,12 @@ package System.Finalization_Primitives with Preelaborate is
 
procedure Finalize_Master (Master : in out Finalization_Scope_Master);
--  Finalizes each of the controlled objects associated with Master, in the
-   --  reverse of the order in which they were attached, and releases the space
-   --  that was allocated on the secondary stack if Master.SS_Mark is not null.
-   --  Calls to this procedure with a Master that has already been finalized
-   --  have no effects.
+   --  reverse of the order in which they were attached. Calls to the procedure
+   --  with a Master that has already been finalized have no effects.
 
procedure Finalize_Object (Node : in out Master_Node);
-   --  Finalizes the controlled object attached to Node
+   --  Finalizes the controlled object attached to Node. Calls to the procedure
+   --  with a Node that has already been finalized have no effects.
 
procedure Suppress_Object_Finalize_At_End (Node : in out Master_Node);
--  Changes the state of Node to effectively suppress a call to Node's
-- 
2.43.2



[COMMITTED] ada: Do not inline in GNATprove the subprograms with (Un)Hide_Info

2024-05-06 Thread Marc Poulhiès
From: Yannick Moy 

The annotations Hide_Info and Unhide_Info in GNATprove are meant to
give special visibility in the corresponding scope to the precise definition
of some entities. Hence, such scopes should not be inlined in GNATprove.

gcc/ada/

* inline.adb (Can_Be_Inlined_In_GNATprove_Mode): Adapt checking.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/inline.adb | 89 --
 1 file changed, 86 insertions(+), 3 deletions(-)

diff --git a/gcc/ada/inline.adb b/gcc/ada/inline.adb
index f23100dbb13..2ec92ca9dff 100644
--- a/gcc/ada/inline.adb
+++ b/gcc/ada/inline.adb
@@ -1503,6 +1503,12 @@ package body Inline is
   --  an unconstrained record type with per-object constraints on component
   --  types.
 
+  function Has_Hide_Unhide_Annotation
+(Spec_Id, Body_Id : Entity_Id)
+ return Boolean;
+  --  Returns whether the subprogram has an annotation Hide_Info or
+  --  Unhide_Info on its spec or body.
+
   function Has_Skip_Proof_Annotation (Id : Entity_Id) return Boolean;
   --  Returns True if subprogram Id has an annotation Skip_Proof or
   --  Skip_Flow_And_Proof.
@@ -1705,6 +1711,76 @@ package body Inline is
  return False;
   end Has_Formal_With_Discriminant_Dependent_Fields;
 
+  --------------------------------
+  -- Has_Hide_Unhide_Annotation --
+  --------------------------------
+
+  function Has_Hide_Unhide_Annotation
+(Spec_Id, Body_Id : Entity_Id)
+ return Boolean
+  is
+ function Has_Hide_Unhide_Pragma (Prag : Node_Id) return Boolean;
+ --  Return whether a pragma Hide/Unhide is present in the list of
+ --  pragmas starting with Prag.
+
+ ----------------------------
+ -- Has_Hide_Unhide_Pragma --
+ ----------------------------
+
+ function Has_Hide_Unhide_Pragma (Prag : Node_Id) return Boolean is
+Decl : Node_Id := Prag;
+ begin
+while Present (Decl)
+  and then Nkind (Decl) = N_Pragma
+loop
+   if Get_Pragma_Id (Decl) = Pragma_Annotate
+ and then List_Length (Pragma_Argument_Associations (Decl)) = 4
+   then
+  declare
+ Arg1  : constant Node_Id :=
+   First (Pragma_Argument_Associations (Decl));
+ Arg2  : constant Node_Id := Next (Arg1);
+ Arg1_Name : constant Name_Id :=
+   Chars (Get_Pragma_Arg (Arg1));
+ Arg2_Name : constant String :=
+   Get_Name_String (Chars (Get_Pragma_Arg (Arg2)));
+  begin
+ if Arg1_Name = Name_Gnatprove
+   and then Arg2_Name in "hide_info" | "unhide_info"
+ then
+return True;
+ end if;
+  end;
+   end if;
+
+   Next (Decl);
+end loop;
+
+return False;
+ end Has_Hide_Unhide_Pragma;
+
+  begin
+ if Present (Spec_Id)
+   and then Has_Hide_Unhide_Pragma
+ (Next (Unit_Declaration_Node (Spec_Id)))
+ then
+return True;
+
+ elsif Present (Body_Id) then
+declare
+   Subp_Body : constant N_Subprogram_Body_Id :=
+ Unit_Declaration_Node (Body_Id);
+begin
+   return Has_Hide_Unhide_Pragma (Next (Subp_Body))
+ or else
+   Has_Hide_Unhide_Pragma (First (Declarations (Subp_Body)));
+end;
+
+ else
+return False;
+ end if;
+  end Has_Hide_Unhide_Annotation;
+
   -------------------------------
   -- Has_Skip_Proof_Annotation --
   -------------------------------
@@ -1725,12 +1801,12 @@ package body Inline is
   Arg1  : constant Node_Id :=
 First (Pragma_Argument_Associations (Decl));
   Arg2  : constant Node_Id := Next (Arg1);
-  Arg1_Name : constant String :=
-Get_Name_String (Chars (Get_Pragma_Arg (Arg1)));
+  Arg1_Name : constant Name_Id :=
+Chars (Get_Pragma_Arg (Arg1));
   Arg2_Name : constant String :=
 Get_Name_String (Chars (Get_Pragma_Arg (Arg2)));
begin
-  if Arg1_Name = "gnatprove"
+  if Arg1_Name = Name_Gnatprove
 and then Arg2_Name in "skip_proof" | "skip_flow_and_proof"
   then
  return True;
@@ -1952,6 +2028,13 @@ package body Inline is
   elsif Has_Skip_Proof_Annotation (Id) then
  return False;
 
+  --  Do not inline subprograms with the Hide_Info or Unhide_Info
+  --  annotation, since their scope has special 

[COMMITTED] ada: Prevent inlining in GNATprove for memory leaks

2024-05-06 Thread Marc Poulhiès
From: Yannick Moy 

In some cases, inlining a call in GNATprove could lead to
missing a memory leak. Recognize such cases and do not inline
such calls.

gcc/ada/

* inline.adb (Call_Can_Be_Inlined_In_GNATprove_Mode):
Add case to prevent inlining of call.
* inline.ads: Likewise.
* sem_res.adb (Resolve_Call): Update comment and message.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/inline.adb  | 58 +
 gcc/ada/inline.ads  |  5 ++--
 gcc/ada/sem_res.adb |  5 ++--
 3 files changed, 64 insertions(+), 4 deletions(-)

diff --git a/gcc/ada/inline.adb b/gcc/ada/inline.adb
index 2ec92ca9dff..98bed860760 100644
--- a/gcc/ada/inline.adb
+++ b/gcc/ada/inline.adb
@@ -1460,10 +1460,47 @@ package body Inline is
 (N: Node_Id;
  Subp : Entity_Id) return Boolean
is
+  function Has_Dereference (N : Node_Id) return Boolean;
+  --  Return whether N contains an explicit dereference
+
+  ---------------------
+  -- Has_Dereference --
+  ---------------------
+
+  function Has_Dereference (N : Node_Id) return Boolean is
+
+ function Process (N : Node_Id) return Traverse_Result;
+ --  Process one node in search for dereference
+
+ -------------
+ -- Process --
+ -------------
+
+ function Process (N : Node_Id) return Traverse_Result is
+ begin
+if Nkind (N) = N_Explicit_Dereference then
+   return Abandon;
+else
+   return OK;
+end if;
+ end Process;
+
+ function Traverse is new Traverse_Func (Process);
+ --  Traverse tree to look for dereference
+
+  begin
+ return Traverse (N) = Abandon;
+  end Has_Dereference;
+
+  --  Local variables
+
   F : Entity_Id;
   A : Node_Id;
 
begin
+  --  Check if inlining may lead to missing a check on type conversion of
+  --  input parameters otherwise.
+
   F := First_Formal (Subp);
   A := First_Actual (N);
   while Present (F) loop
@@ -1480,6 +1517,27 @@ package body Inline is
  Next_Actual (A);
   end loop;
 
+  --  Check if inlining may lead to introducing temporaries of access type,
+  --  which can lead to missing checks for memory leaks. This can only
+  --  come from an (IN-)OUT parameter transformed into a renaming by SPARK
+  --  expansion, whose side-effects are removed, and a dereference in the
+  --  corresponding actual. If the formal itself is of a deep type (it has
+  --  access subcomponents), the subprogram already cannot be inlined in
+  --  GNATprove mode.
+
+  F := First_Formal (Subp);
+  A := First_Actual (N);
+  while Present (F) loop
+ if Ekind (F) /= E_In_Parameter
+   and then Has_Dereference (A)
+ then
+return False;
+ end if;
+
+ Next_Formal (F);
+ Next_Actual (A);
+  end loop;
+
   return True;
end Call_Can_Be_Inlined_In_GNATprove_Mode;
 
diff --git a/gcc/ada/inline.ads b/gcc/ada/inline.ads
index 3df0a01b65d..bc90c0ce6d8 100644
--- a/gcc/ada/inline.ads
+++ b/gcc/ada/inline.ads
@@ -146,8 +146,9 @@ package Inline is
 (N: Node_Id;
  Subp : Entity_Id) return Boolean;
--  Returns False if the call in node N to subprogram Subp cannot be inlined
-   --  in GNATprove mode, because it may lead to missing a check on type
-   --  conversion of input parameters otherwise. Returns True otherwise.
+   --  in GNATprove mode, because it may otherwise lead to missing a check
+   --  on type conversion of input parameters, or a missing memory leak on
+   --  an output parameter. Returns True otherwise.
 
function Can_Be_Inlined_In_GNATprove_Mode
  (Spec_Id : Entity_Id;
diff --git a/gcc/ada/sem_res.adb b/gcc/ada/sem_res.adb
index 075c0d85ccd..67062c6b32b 100644
--- a/gcc/ada/sem_res.adb
+++ b/gcc/ada/sem_res.adb
@@ -7329,11 +7329,12 @@ package body Sem_Res is
 ("cannot inline & (in while loop condition)?", N, Nam_UA);
 
--  Do not inline calls which would possibly lead to missing a
-   --  type conversion check on an input parameter.
+   --  type conversion check on an input parameter or a memory leak
+   --  on an output parameter.
 
elsif not Call_Can_Be_Inlined_In_GNATprove_Mode (N, Nam) then
   Cannot_Inline
-("cannot inline & (possible check on input parameters)?",
+("cannot inline & (possible check on parameters)?",
  N, Nam_UA);
 
--  Otherwise, inline the call, issuing an info message when
-- 
2.43.2



[COMMITTED] ada: Adjust source location for degenerate scope master

2024-05-06 Thread Marc Poulhiès
From: Eric Botcazou 

When the finalization scope master degenerates into a simple master node,
the latter must inherit the source location that the former would have had.

gcc/ada/

* exp_ch7.adb (Build_Finalizer.Process_Object_Declaration): Adjust
the Sloc of the master node declaration in the degenerate case.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_ch7.adb | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/gcc/ada/exp_ch7.adb b/gcc/ada/exp_ch7.adb
index 75c9e223956..4382de9b6b2 100644
--- a/gcc/ada/exp_ch7.adb
+++ b/gcc/ada/exp_ch7.adb
@@ -2875,6 +2875,7 @@ package body Exp_Ch7 is
  Master_Node_Decl   : Node_Id;
  Master_Node_Id : Entity_Id;
  Master_Node_Ins: Node_Id;
+ Master_Node_Loc: Source_Ptr;
  Obj_Ref: Node_Id;
 
   --  Start of processing for Process_Object_Declaration
@@ -2936,11 +2937,20 @@ package body Exp_Ch7 is
 end if;
 
  else
+--  For one object, use the Sloc the scope master would have had
+
+if Counter_Val = 1 then
+   Master_Node_Loc := Sloc (N);
+else
+   Master_Node_Loc := Loc;
+end if;
+
 Master_Node_Id :=
-  Make_Defining_Identifier (Loc,
+  Make_Defining_Identifier (Master_Node_Loc,
 Chars => New_External_Name (Chars (Obj_Id), Suffix => "MN"));
 Master_Node_Decl :=
-  Make_Master_Node_Declaration (Loc, Master_Node_Id, Obj_Id);
+  Make_Master_Node_Declaration (Master_Node_Loc,
+Master_Node_Id, Obj_Id);
 
 Push_Scope (Scope (Obj_Id));
 if Counter_Val = 1 then
-- 
2.43.2



Re: [PATCH] sra: Do not leave work for DSE (that it can sometimes not perform)

2024-05-06 Thread Richard Biener
On Fri, 3 May 2024, Martin Jambor wrote:

> Hi,
> 
> when looking again at the g++.dg/tree-ssa/pr109849.C testcase we
> discovered that it generates terrible store-to-load forwarding stalls
> because SRA was leaving behind aggregate loads but all the stores were
> by scalar parts and DSE failed to remove the useless load.  SRA has
> all the knowledge to remove the statement even now, so this small
> patch makes it do so.
> 
> With this patch, the g++.dg/tree-ssa/pr109849.C micro-benchmark runs 9
> times faster (on an AMD EPYC 75F3 machine).
> 
> Bootstrapped and tested on x86_64.  OK for master?

OK.

> Given that the patch is simple but can sometimes have large benefit,
> could it possibly be backported to gcc-14 branch even if it is not a
> regression (at least not in the last decade) in a few weeks?

Sounds reasonable.  We have some more leeway for X.2 releases.

Thanks,
Richard.

> Thanks,
> 
> Martin
> 
> 
> gcc/ChangeLog:
> 
> 2024-04-18  Martin Jambor  
> 
>   * tree-sra.cc (sra_modify_assign): Remove the original statement
>   also when dealing with a store to a fully covered aggregate from a
>   non-candidate.
> 
> gcc/testsuite/ChangeLog:
> 
> 2024-04-23  Martin Jambor  
> 
>   * g++.dg/tree-ssa/pr109849.C: Also check that the aggregate store
>   to cur disappears.
>   * gcc.dg/tree-ssa/ssa-dse-26.c: Instead of relying on DSE,
>   check that the unwanted stores were removed at early SRA time.
> ---
>  gcc/testsuite/g++.dg/tree-ssa/pr109849.C   |  3 ++-
>  gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c |  6 +++---
>  gcc/tree-sra.cc| 14 --
>  3 files changed, 17 insertions(+), 6 deletions(-)
> 
> diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr109849.C 
> b/gcc/testsuite/g++.dg/tree-ssa/pr109849.C
> index cd348c0f590..d06dbb10482 100644
> --- a/gcc/testsuite/g++.dg/tree-ssa/pr109849.C
> +++ b/gcc/testsuite/g++.dg/tree-ssa/pr109849.C
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -fdump-tree-sra" } */
> +/* { dg-options "-O2 -fdump-tree-sra -fdump-tree-optimized" } */
>  
>  #include 
>  typedef unsigned int uint32_t;
> @@ -29,3 +29,4 @@ main()
>  }
>  
>  /* { dg-final { scan-tree-dump "Created a replacement for stack offset" 
> "sra"} } */
> +/* { dg-final { scan-tree-dump-not "cur = MEM" "optimized"} } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c
> index 43152de5616..1d01392c595 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -fdump-tree-dse1-details -fno-short-enums 
> -fno-tree-fre" } */
> +/* { dg-options "-O2 -fdump-tree-esra -fno-short-enums -fno-tree-fre" } */
>  /* { dg-skip-if "we want a BIT_FIELD_REF from fold_truth_andor" { ! lp64 } } 
> */
>  /* { dg-skip-if "temporary variable names are not x and y" { 
> mmix-knuth-mmixware } } */
>  
> @@ -31,5 +31,5 @@ constraint_equal (struct constraint a, struct constraint b)
>  && constraint_expr_equal (a.rhs, b.rhs);
>  }
>  
> -/* { dg-final { scan-tree-dump-times "Deleted dead store: x = " 2 "dse1" } } 
> */
> -/* { dg-final { scan-tree-dump-times "Deleted dead store: y = " 2 "dse1" } } 
> */
> +/* { dg-final { scan-tree-dump-not "x = " "esra" } } */
> +/* { dg-final { scan-tree-dump-not "y = " "esra" } } */
> diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc
> index 32fa28911f2..8040b0c5645 100644
> --- a/gcc/tree-sra.cc
> +++ b/gcc/tree-sra.cc
> @@ -4854,8 +4854,18 @@ sra_modify_assign (gimple *stmt, gimple_stmt_iterator 
> *gsi)
>But use the RHS aggregate to load from to expose more
>optimization opportunities.  */
> if (access_has_children_p (lacc))
> - generate_subtree_copies (lacc->first_child, rhs, lacc->offset,
> -  0, 0, gsi, true, true, loc);
> + {
> +   generate_subtree_copies (lacc->first_child, rhs, lacc->offset,
> +0, 0, gsi, true, true, loc);
> +   if (lacc->grp_covered)
> + {
> +   unlink_stmt_vdef (stmt);
> +   gsi_remove (& orig_gsi, true);
> +   release_defs (stmt);
> +   sra_stats.deleted++;
> +   return SRA_AM_REMOVED;
> + }
> + }
>   }
>  
>return SRA_AM_NONE;
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


Re: [PATCH] middle-end/114931 - type_hash_canon and structual equality types

2024-05-06 Thread Richard Biener
On Mon, 6 May 2024, Martin Uecker wrote:

> Am Montag, dem 06.05.2024 um 09:00 +0200 schrieb Richard Biener:
> > On Sat, 4 May 2024, Martin Uecker wrote:
> > 
> > > Am Freitag, dem 03.05.2024 um 21:16 +0200 schrieb Jakub Jelinek:
> > > > > On Fri, May 03, 2024 at 09:11:20PM +0200, Martin Uecker wrote:
> > > > > > > > > TYPE_CANONICAL as used by the middle-end cannot express this 
> > > > > > > > > but
> > > > > > > 
> > > > > > > Hm. so how does it work now for arrays?
> > > > > 
> > > > > Do you have a testcase which doesn't work correctly with the arrays?
> > > 
> > > I am mostly trying to understand better how this works. But
> > > if I am not mistaken, the following example would indeed
> > > indicate that we do incorrect aliasing decisions for types
> > > derived from arrays:
> > > 
> > > https://godbolt.org/z/rTsE3PhKc
> > 
> > This example is about pointer-to-array types, int (*)[2] and
> > int (*)[1] are supposed to be compatible as in receive the same alias
> > set. 
> 
> In C, char (*)[2] and char (*)[1] are not compatible. But with
> COMPAT set, the example operates^1 with char (*)[] and char (*)[1]
> which are compatible.  If we form equivalence classes, then
> all three types would need to be treated as equivalent. 
> 
> ^1 Actually, pointer to functions returning pointers
> to arrays. Probably this example can still be simplified...
> 
> >  This is ensured by get_alias_set POINTER_TYPE_P handling,
> > the alias set is supposed to be the same as that of int *.  It seems
> > we do restrict the handling a bit, the code does
> > 
> >   /* Unnest all pointers and references.
> >  We also want to make pointer to array/vector equivalent to 
> > pointer to
> >  its element (see the reasoning above). Skip all those types, too.  
> > */
> >   for (p = t; POINTER_TYPE_P (p)
> >|| (TREE_CODE (p) == ARRAY_TYPE
> >&& (!TYPE_NONALIASED_COMPONENT (p)
> >|| !COMPLETE_TYPE_P (p)
> >|| TYPE_STRUCTURAL_EQUALITY_P (p)))
> >|| TREE_CODE (p) == VECTOR_TYPE;
> >p = TREE_TYPE (p))
> > 
> > where the comment doesn't exactly match the code - but C should
> > never have TYPE_NONALIASED_COMPONENT (p).
> > 
> > But maybe I misread the example or it goes wrong elsewhere.
> 
> If I am not confusing myself too much, the example shows that
> > aliasing analysis treats the types as incompatible in
> both cases, because it does not reload *a with -O2. 
> 
> For char (*)[1] and char (*)[2] this would be correct (but an
> implementation exploiting this would need to do structural
> comparisons and not equivalence classes) but for 
> char (*)[2] and char (*)[] it is not.

Oh, these are function pointers, so it's about the alias set of
a pointer to FUNCTION_TYPE.  I don't see any particular code
trying to make char[] * (*)() and char[1] *(*)() inter-operate
for TBAA iff the FUNCTION_TYPEs themselves are not having the
same TYPE_CANONICAL.

Can you open a bug report and please point to the relevant parts
of the C standard that tell how pointer-to FUNCTION_TYPE TBAA
is supposed to work?

Thanks,
Richard.

> Martin
> 
> 
> > 
> > Richard.
> > 
> > > Martin
> > > 
> > > > > 
> > > > > E.g. same_type_for_tbaa has
> > > > >   type1 = TYPE_MAIN_VARIANT (type1);
> > > > >   type2 = TYPE_MAIN_VARIANT (type2);
> > > > > 
> > > > >   /* Handle the most common case first.  */
> > > > >   if (type1 == type2)
> > > > > return 1;
> > > > > 
> > > > >   /* If we would have to do structural comparison bail out.  */
> > > > >   if (TYPE_STRUCTURAL_EQUALITY_P (type1)
> > > > >   || TYPE_STRUCTURAL_EQUALITY_P (type2))
> > > > > return -1;
> > > > > 
> > > > >   /* Compare the canonical types.  */
> > > > >   if (TYPE_CANONICAL (type1) == TYPE_CANONICAL (type2))
> > > > > return 1;
> > > > > 
> > > > >   /* ??? Array types are not properly unified in all cases as we have
> > > > >  spurious changes in the index types for example.  Removing this
> > > > >  causes all sorts of problems with the Fortran frontend.  */
> > > > >   if (TREE_CODE (type1) == ARRAY_TYPE
> > > > >   && TREE_CODE (type2) == ARRAY_TYPE)
> > > > > return -1;
> > > > > ...
> > > > > and later compares alias sets and the like.
> > > > > So, even if int[] and int[0] have different TYPE_CANONICAL, they
> > > > > will be considered maybe the same.  Also, guess get_alias_set
> > > > > has some ARRAY_TYPE handling...
> > > > > 
> > > > > Anyway, I think we should just go with Richi's patch.
> > > > > 
> > > > >   Jakub
> > > > > 
> > > 
> > > 
> > > 
> > 
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Re: [PATCH] testsuite: c++: Skip g++.dg/analyzer on Solaris [PR111475]

2024-05-06 Thread Richard Biener
On Sun, 5 May 2024, Rainer Orth wrote:

> Rainer Orth  writes:
> 
> >> On Fri, May 03, 2024 at 09:31:08AM -0400, David Malcolm wrote:
> >>> Jakub, Richi, Rainer: this is a non-trivial change that cleans up
> >>> analyzer C++ testsuite results on Solaris, but has a slight risk of
> >>> affecting analyzer behavior on other targets.  As such, I was thinking
> >>> to hold off on backporting it to GCC 14 until after 14.1 is released.
> >>> Is that a good plan?
> >>
> >> Agreed 14.2 is better target than 14.1 for this, especially if committed
> >> shortly after 14.1 goes out.
> >
> > fully agreed: this is way too risky this close to the 14.1 release.  As
> > a stop-gap measure, one might consider just skipping the C++ analyzer
> > tests on Solaris to avoid the immense number of testsuite failures.
> 
> How about this?
> 
> Almost 1400 C++ analyzer tests FAIL on Solaris.  The patch is too risky
> to apply so close to the GCC 14.1.0 release, so disable the tests on
> Solaris instead to reduce the noise.
> 
> Tested on i386-pc-solaris2.11, sparc-sun-solaris2.11, and
> x86_64-pc-linux-gnu.
> 
> Ok for gcc-14 branch?

OK.

Richard.


Re: [PATCH] Driver: Reject output filenames with the same suffixes as source files [PR80182]

2024-05-06 Thread Peter0x44
On Mon May 6, 2024 at 8:14 AM BST, Richard Biener wrote:
> On Sat, May 4, 2024 at 9:36 PM Peter Damianov  wrote:
> >
> > Currently, commands like:
> > gcc -o file.c -lm
> > will delete the user's code.
>
> Since there's an error from the linker in the end (missing 'main'), I wonder
> if the linker can avoid truncating/opening the output file instead?  A trivial
> solution might be to open a temporary file first and only atomically replace
> the output file with the temporary file when there were no errors?
I think this is a great idea! The only concern I have is that I think
for mingw targets it would be necessary to be careful to append .exe if
the file has no suffix when moving the temporary file to the output
file. Maybe some other targets have similar concerns.
>
> > This patch checks the suffix of the output, and errors if the output ends in
> > any of the suffixes listed in default_compilers.
> >
> > Unfortunately, I couldn't come up with a better heuristic to diagnose this
> > case more specifically, so it is now not possible to directly make
> > executables with said suffixes. I am unsure if any users are depending on
> > this.
>
> A way to provide a workaround would be to require the file not existing.  So
> change the heuristic to only trigger if the output file exists (and is
> non-empty?).
I guess this could work, and has a lower chance of breaking anyone
depending on this behavior, but I think it would still be confusing to
anyone who did rely on this behavior, since then it wouldn't be allowed
to overwrite an executable with the ".c" name. If anyone did rely on
this behavior, their build would succeed once, and then error for every
subsequent invocation, which would be confusing. It seems to me it is
not a meaningful improvement.

With your previous suggestion, this whole heuristic becomes unnecessary
anyway, so I think I will just forego it.
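
A minimal sketch of the temporary-file idea discussed above, assuming a POSIX system. Nothing here is GCC or linker code: `replace_output_on_success`, `emit_fn`, and the two demo emitters are hypothetical names used only for illustration. The output is written to a temporary file next to the destination, and the destination is replaced with `rename` only when writing succeeded, so a failed link would leave an existing `file.c` untouched.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Callback that produces the output; returns 0 on success.  */
typedef int (*emit_fn) (FILE *out, void *ctx);

/* Hypothetical helper: run EMIT into a temporary file created next to
   PATH and replace PATH only if EMIT succeeded.  Returns 0 on success,
   -1 on failure; on failure PATH is left exactly as it was.  */
static int
replace_output_on_success (const char *path, emit_fn emit, void *ctx)
{
  static const char suffix[] = ".XXXXXX";
  size_t n;
  char *tmp;
  FILE *out;
  int fd, ok;

  assert (path != NULL && emit != NULL);

  /* Build the mkstemp template "<path>.XXXXXX" so that the temporary
     file lives in the same directory as the destination.  */
  n = strlen (path);
  tmp = malloc (n + sizeof suffix);
  if (tmp == NULL)
    return -1;
  memcpy (tmp, path, n);
  memcpy (tmp + n, suffix, sizeof suffix);

  fd = mkstemp (tmp);
  if (fd < 0)
    {
      free (tmp);
      return -1;
    }

  out = fdopen (fd, "wb");
  if (out == NULL)
    {
      close (fd);
      unlink (tmp);
      free (tmp);
      return -1;
    }

  ok = emit (out, ctx) == 0;
  if (fclose (out) != 0)
    ok = 0;
  /* Only now touch the real output name.  */
  if (ok && rename (tmp, path) != 0)
    ok = 0;
  if (!ok)
    unlink (tmp);

  free (tmp);
  return ok ? 0 : -1;
}

/* Demo emitter standing in for a successful link.  */
static int
emit_image (FILE *out, void *ctx)
{
  (void) ctx;
  return fputs ("linked image\n", out) < 0 ? -1 : 0;
}

/* Demo emitter standing in for a link error after partial output.  */
static int
emit_link_error (FILE *out, void *ctx)
{
  (void) ctx;
  fputs ("partial", out);
  return -1;
}
```

A real implementation would have more to handle than this sketch does: `mkstemp` creates the file with restrictive permissions, so an executable would need its mode set before the rename, and for targets such as mingw the final name (with `.exe` appended when there is no suffix) would have to be computed before renaming.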
>
> Richard.
>
> > PR driver/80182
> > * gcc.cc (process_command): fatal_error if the output has the
> >   suffix of a source file.
> > (have_c): Change type to bool.
> > (have_o): Change type to bool.
> > (have_E): Change type to bool.
> > (have_S): New global variable.
> > (driver_handle_option): Assign have_S
> >
> > Signed-off-by: Peter Damianov 
> > ---
> >  gcc/gcc.cc | 29 ++---
> >  1 file changed, 26 insertions(+), 3 deletions(-)
> >
> > diff --git a/gcc/gcc.cc b/gcc/gcc.cc
> > index 830a4700a87..53169c16460 100644
> > --- a/gcc/gcc.cc
> > +++ b/gcc/gcc.cc
> > @@ -2127,13 +2127,16 @@ static vec at_file_argbuf;
> >  static bool in_at_file = false;
> >
> >  /* Were the options -c, -S or -E passed.  */
> > -static int have_c = 0;
> > +static bool have_c = false;
> >
> >  /* Was the option -o passed.  */
> > -static int have_o = 0;
> > +static bool have_o = false;
> >
> >  /* Was the option -E passed.  */
> > -static int have_E = 0;
> > +static bool have_E = false;
> > +
> > +/* Was the option -S passed.  */
> > +static bool have_S = false;
> >
> >  /* Pointer to output file name passed in with -o. */
> >  static const char *output_file = 0;
> > @@ -4593,6 +4596,10 @@ driver_handle_option (struct gcc_options *opts,
> >have_E = true;
> >break;
> >
> > +case OPT_S:
> > +  have_S = true;
> > +  break;
> > +
> >  case OPT_x:
> >spec_lang = arg;
> >if (!strcmp (spec_lang, "none"))
> > @@ -5058,6 +5065,22 @@ process_command (unsigned int decoded_options_count,
> >output_file);
> >  }
> >
> > +  /* Reject output file names that have the same suffix as a source
> > + file. This is to catch mistakes like: gcc -o file.c -lm
> > + that could delete the user's code. */
> > +  if (have_o && output_file != NULL && !have_E && !have_S)
> > +{
> > +  const char* filename = lbasename(output_file);
> > +  const char* suffix = strchr(filename, '.');
> > +  if (suffix != NULL)
> > +   for (int i = 0; i < n_default_compilers; ++i)
> > + if (!strcmp(suffix, default_compilers[i].suffix))
> > +   fatal_error (input_location,
> > +"output file suffix %qs could be a source file",
> > +suffix);
> > +}
> > +
> > +
> >if (output_file != NULL && output_file[0] == '\0')
> >  fatal_error (input_location, "output filename may not be empty");
> >
> > --
> > 2.39.2
> >

Thanks for the feedback,
Peter D.


[PING 4][PATCH v3] rs6000/p8swap: Fix incorrect lane extraction by vec_extract() [PR106770]

2024-05-06 Thread Surya Kumari Jangala
Ping

On 08/01/24 11:19 am, Surya Kumari Jangala wrote:
> Ping
> 
> On 28/11/23 6:24 pm, Surya Kumari Jangala wrote:
>> Ping
>>
>> On 10/11/23 12:27 pm, Surya Kumari Jangala wrote:
>>> Ping
>>>
>>> On 03/11/23 1:14 pm, Surya Kumari Jangala wrote:
 Hi Segher,
 I have incorporated changes in the code as per the review comments 
 provided by you 
 for version 2 of the patch. Please review.

 Regards,
 Surya


 rs6000/p8swap: Fix incorrect lane extraction by vec_extract() [PR106770]

 In the routine rs6000_analyze_swaps(), special handling of swappable
 instructions is done even if the webs that contain the swappable 
 instructions
 are not optimized, i.e., the webs do not contain any permuting load/store
 instructions along with the associated register swap instructions. Doing 
 special
 handling in such webs will result in the extracted lane being adjusted
 unnecessarily for vec_extract.

 Another issue is that existing code treats non-permuting loads/stores as 
 special
 swappables. Non-permuting loads/stores (that have not yet been split into a
 permuting load/store and a swap) are handled by converting them into a 
 permuting
 load/store (which effectively removes the swap). As a result, if special
 swappables are handled only in webs containing permuting loads/stores, then
 non-optimal code is generated for non-permuting loads/stores.

 Hence, in this patch, all webs containing either permuting loads/ stores or
 non-permuting loads/stores are marked as requiring special handling of
 swappables. Swaps associated with permuting loads/stores are marked for 
 removal,
 and non-permuting loads/stores are converted to permuting loads/stores. 
 Then the
 special swappables in the webs are fixed up.

 This patch also ensures that swappable instructions are not modified in the
 following webs as it is incorrect to do so:
  - webs containing permuting load/store instructions and associated swap
instructions that are transformed by converting the permuting memory
instructions into non-permuting instructions and removing the swap
instructions.
  - webs where swap(load(vector constant)) instructions are replaced with
load(swapped vector constant).

 2023-09-10  Surya Kumari Jangala  

 gcc/
PR rtl-optimization/PR106770
* config/rs6000/rs6000-p8swap.cc (non_permuting_mem_insn): New function.
(handle_non_permuting_mem_insn): New function.
(rs6000_analyze_swaps): Handle swappable instructions only in certain
webs.
(web_requires_special_handling): New instance variable.
(handle_special_swappables): Remove handling of non-permuting load/store
instructions.

 gcc/testsuite/
PR rtl-optimization/PR106770
* gcc.target/powerpc/pr106770.c: New test.
 ---

 diff --git a/gcc/config/rs6000/rs6000-p8swap.cc 
 b/gcc/config/rs6000/rs6000-p8swap.cc
 index 0388b9bd736..02ea299bc3d 100644
 --- a/gcc/config/rs6000/rs6000-p8swap.cc
 +++ b/gcc/config/rs6000/rs6000-p8swap.cc
 @@ -179,6 +179,13 @@ class swap_web_entry : public web_entry_base
unsigned int special_handling : 4;
/* Set if the web represented by this entry cannot be optimized.  */
unsigned int web_not_optimizable : 1;
 +  /* Set if the swappable insns in the web represented by this entry
 + have to be fixed. Swappable insns have to be fixed in:
 +   - webs containing permuting loads/stores and the swap insns
 +   in such webs have been marked for removal
 +   - webs where non-permuting loads/stores have been converted
 +   to permuting loads/stores  */
 +  unsigned int web_requires_special_handling : 1;
/* Set if this insn should be deleted.  */
unsigned int will_delete : 1;
  };
 @@ -1468,14 +1475,6 @@ handle_special_swappables (swap_web_entry 
 *insn_entry, unsigned i)
if (dump_file)
fprintf (dump_file, "Adjusting subreg in insn %d\n", i);
break;
 -case SH_NOSWAP_LD:
 -  /* Convert a non-permuting load to a permuting one.  */
 -  permute_load (insn);
 -  break;
 -case SH_NOSWAP_ST:
 -  /* Convert a non-permuting store to a permuting one.  */
 -  permute_store (insn);
 -  break;
  case SH_EXTRACT:
/* Change the lane on an extract operation.  */
adjust_extract (insn);
 @@ -2401,6 +2400,25 @@ recombine_lvx_stvx_patterns (function *fun)
free (to_delete);
  }
  
 +/* Return true if insn is a non-permuting load/store.  */
 +static bool
 +non_permuting_mem_insn (swap_web_entry *insn_entry, unsigned int i)
 +{
 +  return insn_entry[i].special_handling == SH_NOSWAP_LD
 +   || 

[PATCH] i386: fix ix86_hardreg_mov_ok with lra_in_progress

2024-05-06 Thread Kong, Lingling
Hi,
Originally eliminate_regs_in_insn will transform 
(parallel [
  (set (reg:QI 130)
(plus:QI (subreg:QI (reg:DI 19 frame) 0)
  (const_int 96)))
  (clobber (reg:CC 17 flag))]) {*addqi_1} 
to 
(set (reg:QI 130) 
  (subreg:QI (reg:DI 19 frame) 0)) {*movqi_internal}
when verify_changes.

But with No Flags add, it transforms
(set (reg:QI 5 di)
  (plus:QI (subreg:QI (reg:DI 19 frame) 0)
   (const_int 96))) {*addqi_1_nf}
to
(set (reg:QI 5 di)
 (subreg:QI (reg:DI 19 frame) 0)) {*addqi_1_nf}.
There are no extra clobbers at the end, and the destination register is a hard
register, so ix86_hardreg_mov_ok returns false. The insn then fails to be
updated, which causes an ICE when it is transformed to movqi_internal.

However, the move is actually OK and safe for ix86_hardreg_mov_ok when
lra_in_progress is set.

SPEC2017 was also tested; performance was not affected.
Bootstrapped and regtested on x86_64-pc-linux-gnu. OK for trunk?

gcc/ChangeLog:

* config/i386/i386.cc (ix86_hardreg_mov_ok): Relax
hard reg mov restriction when lra in progress.
---
 gcc/config/i386/i386.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 4d6b2b98761..ca4348a18bf 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -20357,7 +20357,8 @@ ix86_hardreg_mov_ok (rtx dst, rtx src)
   ? standard_sse_constant_p (src, GET_MODE (dst))
   : x86_64_immediate_operand (src, GET_MODE (dst)))
   && ix86_class_likely_spilled_p (REGNO_REG_CLASS (REGNO (dst)))
-  && !reload_completed)
+  && !reload_completed
+  && !lra_in_progress)
 return false;
   return true;
 }
--
2.31.1



Re: [PATCH] libgfortran: Fix libgfortran.so versioning on Solaris with subdirs

2024-05-06 Thread Rainer Orth
Hi FX,

>> This patch fixes this by allowing for the new structure.
>> Tested on i386-pc-solaris2.11 and sparc-sun-solaris2.11.
>> 
>> Ok for trunk?
>
> OK to push, given it’s localised inside LIBGFOR_USE_SYMVER_SUN.
>
> I find it weird though that .libs is hardcoded there. If we look at all the
> lib*/Makefile.am in gcc, the only thing that ever needs to specify .libs is
> for Solaris versioning. It feels like it should be more generic, as you say
> (but that’s for longer term).

look again ;-) libgo/Makefile.am has other unrelated instances for both
setting LD_LIBRARY_PATH and related to AIX.

It seems that libtool has no provision for operations other than compile
(create .lo from sources) and link (create executable from libtool
objects/archives).  If you need something else, there's no way but to
reach below the abstraction.  I believe libtool could provide something
like this, but apparently it doesn't.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] middle-end/114931 - type_hash_canon and structual equality types

2024-05-06 Thread Martin Uecker
Am Montag, dem 06.05.2024 um 09:00 +0200 schrieb Richard Biener:
> On Sat, 4 May 2024, Martin Uecker wrote:
> 
> > Am Freitag, dem 03.05.2024 um 21:16 +0200 schrieb Jakub Jelinek:
> > > > On Fri, May 03, 2024 at 09:11:20PM +0200, Martin Uecker wrote:
> > > > > > > > TYPE_CANONICAL as used by the middle-end cannot express this but
> > > > > > 
> > > > > > Hm. so how does it work now for arrays?
> > > > 
> > > > Do you have a testcase which doesn't work correctly with the arrays?
> > 
> > I am mostly trying to understand better how this works. But
> > if I am not mistaken, the following example would indeed
> > indicate that we do incorrect aliasing decisions for types
> > derived from arrays:
> > 
> > https://godbolt.org/z/rTsE3PhKc
> 
> This example is about pointer-to-array types, int (*)[2] and
> int (*)[1] are supposed to be compatible as in receive the same alias
> set. 

In C, char (*)[2] and char (*)[1] are not compatible. But with
COMPAT set, the example operates^1 with char (*)[] and char (*)[1]
which are compatible.  If we form equivalence classes, then
all three types would need to be treated as equivalent. 

^1 Actually, pointer to functions returning pointers
to arrays. Probably this example can still be simplified...

>  This is ensured by get_alias_set POINTER_TYPE_P handling,
> the alias set is supposed to be the same as that of int *.  It seems
> we do restrict the handling a bit, the code does
> 
>   /* Unnest all pointers and references.
>      We also want to make pointer to array/vector equivalent to pointer to
>      its element (see the reasoning above). Skip all those types, too.  */
>   for (p = t; POINTER_TYPE_P (p)
>        || (TREE_CODE (p) == ARRAY_TYPE
>            && (!TYPE_NONALIASED_COMPONENT (p)
>                || !COMPLETE_TYPE_P (p)
>                || TYPE_STRUCTURAL_EQUALITY_P (p)))
>        || TREE_CODE (p) == VECTOR_TYPE;
>        p = TREE_TYPE (p))
> 
> where the comment doesn't exactly match the code - but C should
> never have TYPE_NONALIASED_COMPONENT (p).
> 
> But maybe I misread the example or it goes wrong elsewhere.

If I am not confusing myself too much, the example shows that
aliasing analysis treats the types as incompatible in
both cases, because it does not reload *a with -O2. 

For char (*)[1] and char (*)[2] this would be correct (but an
implementation exploiting this would need to do structural
comparisons and not equivalence classes) but for 
char (*)[2] and char (*)[] it is not.
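
To make the non-transitivity concrete, here is a small sketch (not taken from the godbolt example) that asks GCC directly via __builtin_types_compatible_p. The typedef and function names are made up for illustration. It only shows C type compatibility as the front end sees it; it says nothing about which alias sets the middle-end ends up assigning.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative typedefs; the names are hypothetical.  */
typedef char (*ptr_to_unknown)[];
typedef char (*ptr_to_one)[1];
typedef char (*ptr_to_two)[2];

/* char (*)[] versus char (*)[1].  */
static bool
unknown_matches_one (void)
{
  return __builtin_types_compatible_p (ptr_to_unknown, ptr_to_one);
}

/* char (*)[] versus char (*)[2].  */
static bool
unknown_matches_two (void)
{
  return __builtin_types_compatible_p (ptr_to_unknown, ptr_to_two);
}

/* char (*)[1] versus char (*)[2].  */
static bool
one_matches_two (void)
{
  return __builtin_types_compatible_p (ptr_to_one, ptr_to_two);
}

/* Compatibility is not transitive: the unknown-bound type is compatible
   with both sized types, but the sized types are not compatible with
   each other.  */
static void
check_compatibility (void)
{
  assert (unknown_matches_one ());
  assert (unknown_matches_two ());
  assert (!one_matches_two ());
}
```

Since the relation is not transitive, any scheme based on equivalence classes has to either merge all three types or fall back to structural comparison.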

Martin


> 
> Richard.
> 
> > Martin
> > 
> > > > 
> > > > E.g. same_type_for_tbaa has
> > > >   type1 = TYPE_MAIN_VARIANT (type1);
> > > >   type2 = TYPE_MAIN_VARIANT (type2);
> > > > 
> > > >   /* Handle the most common case first.  */
> > > >   if (type1 == type2)
> > > > return 1;
> > > > 
> > > >   /* If we would have to do structural comparison bail out.  */
> > > >   if (TYPE_STRUCTURAL_EQUALITY_P (type1)
> > > >   || TYPE_STRUCTURAL_EQUALITY_P (type2))
> > > > return -1;
> > > > 
> > > >   /* Compare the canonical types.  */
> > > >   if (TYPE_CANONICAL (type1) == TYPE_CANONICAL (type2))
> > > > return 1;
> > > > 
> > > >   /* ??? Array types are not properly unified in all cases as we have
> > > >  spurious changes in the index types for example.  Removing this
> > > >  causes all sorts of problems with the Fortran frontend.  */
> > > >   if (TREE_CODE (type1) == ARRAY_TYPE
> > > >   && TREE_CODE (type2) == ARRAY_TYPE)
> > > > return -1;
> > > > ...
> > > > and later compares alias sets and the like.
> > > > So, even if int[] and int[0] have different TYPE_CANONICAL, they
> > > > will be considered maybe the same.  Also, guess get_alias_set
> > > > has some ARRAY_TYPE handling...
> > > > 
> > > > Anyway, I think we should just go with Richi's patch.
> > > > 
> > > > Jakub
> > > > 
> > 
> > 
> > 
> 

-- 
Univ.-Prof. Dr. rer. nat. Martin Uecker
Graz University of Technology
Institute of Biomedical Imaging




Re: [PATCH] Driver: Reject output filenames with the same suffixes as source files [PR80182]

2024-05-06 Thread Richard Biener
On Sat, May 4, 2024 at 9:36 PM Peter Damianov  wrote:
>
> Currently, commands like:
> gcc -o file.c -lm
> will delete the user's code.

Since there's an error from the linker in the end (missing 'main'), I wonder if
the linker can avoid truncating/opening the output file instead?  A trivial
solution might be to open a temporary file first and only atomically replace
the output file with the temporary file when there were no errors?

> This patch checks the suffix of the output, and errors if the output ends in
> any of the suffixes listed in default_compilers.
>
> Unfortunately, I couldn't come up with a better heuristic to diagnose this case
> more specifically, so it is now not possible to directly make executables with
> said suffixes. I am unsure if any users are depending on this.

A way to provide a workaround would be to require the file not existing.  So
change the heuristic to only trigger if the output file exists (and is
non-empty?).
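
Sketched in isolation, the narrower heuristic could look like the following. The suffix table and both function names are hypothetical stand-ins (the patch walks default_compilers instead), and unlike the patch this sketch matches on the last '.' of the basename rather than the first.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical sample of source suffixes, for illustration only.  */
static const char *const sample_suffixes[] = { ".c", ".cc", ".f90" };

/* True if OUTPUT's basename ends in one of the N suffixes in SUFFIXES.  */
static bool
has_source_suffix (const char *output, const char *const *suffixes, size_t n)
{
  assert (output != NULL);
  const char *slash = strrchr (output, '/');
  const char *base = slash ? slash + 1 : output;
  const char *suffix = strrchr (base, '.');
  if (suffix == NULL)
    return false;
  for (size_t i = 0; i < n; i++)
    if (strcmp (suffix, suffixes[i]) == 0)
      return true;
  return false;
}

/* Only complain when OUTPUT looks like a source file AND already exists
   with non-zero size, so a fresh "-o prog.c" would still be allowed.  */
static bool
would_clobber_source (const char *output, const char *const *suffixes,
                      size_t n)
{
  struct stat st;
  if (!has_source_suffix (output, suffixes, n))
    return false;
  return stat (output, &st) == 0 && st.st_size > 0;
}
```

The driver would then call fatal_error only when would_clobber_source returns true, which keeps the workaround of naming a not-yet-existing output with a source-like suffix.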

Richard.

> PR driver/80182
> * gcc.cc (process_command): fatal_error if the output has the suffix of
>   a source file.
> (have_c): Change type to bool.
> (have_O): Change type to bool.
> (have_E): Change type to bool.
> (have_S): New global variable.
> (driver_handle_option): Assign have_S
>
> Signed-off-by: Peter Damianov 
> ---
>  gcc/gcc.cc | 29 ++---
>  1 file changed, 26 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/gcc.cc b/gcc/gcc.cc
> index 830a4700a87..53169c16460 100644
> --- a/gcc/gcc.cc
> +++ b/gcc/gcc.cc
> @@ -2127,13 +2127,16 @@ static vec at_file_argbuf;
>  static bool in_at_file = false;
>
>  /* Were the options -c, -S or -E passed.  */
> -static int have_c = 0;
> +static bool have_c = false;
>
>  /* Was the option -o passed.  */
> -static int have_o = 0;
> +static bool have_o = false;
>
>  /* Was the option -E passed.  */
> -static int have_E = 0;
> +static bool have_E = false;
> +
> +/* Was the option -S passed.  */
> +static bool have_S = false;
>
>  /* Pointer to output file name passed in with -o. */
>  static const char *output_file = 0;
> @@ -4593,6 +4596,10 @@ driver_handle_option (struct gcc_options *opts,
>have_E = true;
>break;
>
> +case OPT_S:
> +  have_S = true;
> +  break;
> +
>  case OPT_x:
>spec_lang = arg;
>if (!strcmp (spec_lang, "none"))
> @@ -5058,6 +5065,22 @@ process_command (unsigned int decoded_options_count,
>output_file);
>  }
>
> +  /* Reject output file names that have the same suffix as a source
> + file. This is to catch mistakes like: gcc -o file.c -lm
> + that could delete the user's code. */
> +  if (have_o && output_file != NULL && !have_E && !have_S)
> +{
> +  const char* filename = lbasename(output_file);
> +  const char* suffix = strchr(filename, '.');
> +  if (suffix != NULL)
> +   for (int i = 0; i < n_default_compilers; ++i)
> + if (!strcmp(suffix, default_compilers[i].suffix))
> +   fatal_error (input_location,
> +"output file suffix %qs could be a source file",
> +suffix);
> +}
> +
> +
>if (output_file != NULL && output_file[0] == '\0')
>  fatal_error (input_location, "output filename may not be empty");
>
> --
> 2.39.2
>


Re: [V2][PATCH] gcc-14/changes.html: Deprecate a GCC C extension on flexible array members.

2024-05-06 Thread Richard Biener
On Sat, 4 May 2024, Sebastian Huber wrote:

> On 07.08.23 16:22, Qing Zhao via Gcc-patches wrote:
> > Hi,
> > 
> > This is the 2nd version of the patch.
> > Compared to the 1st version, the only change is to address Richard's
> > comment on referring to a warning option for diagnosing deprecated behavior.
> > 
> > 
> > Okay for committing?
> > 
> > thanks.
> > 
> > Qing
> > 
> > ==
> > 
> > *htdocs/gcc-14/changes.html (Caveats): Add notice about deprecating a C
> > extension about flexible array members.
> > ---
> >   htdocs/gcc-14/changes.html | 13 -
> >   1 file changed, 12 insertions(+), 1 deletion(-)
> > 
> > diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html
> > index dad1ba53..eae25f1a 100644
> > --- a/htdocs/gcc-14/changes.html
> > +++ b/htdocs/gcc-14/changes.html
> > @@ -30,7 +30,18 @@ a work-in-progress.
> >   
> >   Caveats
> >   
> > -  ...
> > +  C:
> > +  Support for the GCC extension, a structure containing a C99 flexible
> > array
> > +  member, or a union containing such a structure, is not the last field
> > of
> > +  another structure, is deprecated. Refer to
> > +  https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html;>
> > +  Zero Length Arrays.
> > +  Any code relying on this extension should be modifed to ensure that
> > +  C99 flexible array members only end up at the ends of structures.
> > +  Please use the warning option
> > +   > href="https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wflex-array-member-not-at-end;>-Wflex-array-member-not-at-end
> > to
> > +  identify all such cases in the source code and modify them.
> > +  
> >   
> 
> I have a question with respect to the static initialization of flexible array
> members. According to the documentation this is supported by GCC:
> 
> https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
> 
> "GCC allows static initialization of flexible array members. This is
> equivalent to defining a new structure containing the original structure
> followed by an array of sufficient size to contain the data. E.g. in the
> following, f1 is constructed as if it were declared like f2.
> 
> struct f1 {
>   int x; int y[];
> } f1 = { 1, { 2, 3, 4 } };
> 
> struct f2 {
>   struct f1 f1; int data[3];
> } f2 = { { 1 }, { 2, 3, 4 } };
> "
> 
> However, when I compile this code, I get a warning like this:
> 
> flex-array.c:6:13: warning: structure containing a flexible array member is
> not at the end of another structure [-Wflex-array-member-not-at-end]
> 6 |   struct f1 f1; int data[3];
>   |
> 
> In general, I agree that flexible array members should be at the end, however
> the support for static initialization is quite important from my point of view
> especially for applications for embedded systems. Here, dynamic allocations
> may not be allowed or feasible.

I do not get a diagnostic for this on trunk?  And I agree there shouldn't
be any.
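
For reference, a minimal sketch of the static-initialization case on its own, without the illustrative f2 wrapper from the documentation. The variable and function names are made up; this relies on the GNU C extension quoted above, so it is not strictly conforming ISO C.

```c
#include <assert.h>
#include <stddef.h>

struct f1 {
  int x;
  int y[];          /* C99 flexible array member, last field of f1.  */
};

/* GNU C extension: statically initialize the flexible array member.
   No dynamic allocation is involved.  */
static struct f1 table = { 1, { 2, 3, 4 } };

/* Sum x and the first N elements of y; N must not exceed the three
   initializers given above.  */
static int
sum_table (size_t n)
{
  assert (n <= 3);
  int sum = table.x;
  for (size_t i = 0; i < n; i++)
    sum += table.y[i];
  return sum;
}
```

Here the flexible array member is at the end of the only structure involved, so -Wflex-array-member-not-at-end has nothing to report. In the quoted warning it is the f2 declaration, where struct f1 is followed by another field, that gets flagged.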

Richard.


Re: [PATCH] middle-end/114931 - type_hash_canon and structual equality types

2024-05-06 Thread Richard Biener
On Sat, 4 May 2024, Martin Uecker wrote:

> Am Freitag, dem 03.05.2024 um 21:16 +0200 schrieb Jakub Jelinek:
> > > On Fri, May 03, 2024 at 09:11:20PM +0200, Martin Uecker wrote:
> > > > > > > TYPE_CANONICAL as used by the middle-end cannot express this but
> > > > > 
> > > > > Hm. so how does it work now for arrays?
> > > 
> > > Do you have a testcase which doesn't work correctly with the arrays?
> 
> I am mostly trying to understand better how this works. But
> if I am not mistaken, the following example would indeed
> indicate that we do incorrect aliasing decisions for types
> derived from arrays:
> 
> https://godbolt.org/z/rTsE3PhKc

This example is about pointer-to-array types, int (*)[2] and
int (*)[1] are supposed to be compatible as in receive the same alias
set.  This is ensured by get_alias_set POINTER_TYPE_P handling,
the alias set is supposed to be the same as that of int *.  It seems
we do restrict the handling a bit, the code does

  /* Unnest all pointers and references.
     We also want to make pointer to array/vector equivalent to pointer to
     its element (see the reasoning above). Skip all those types, too.  */
  for (p = t; POINTER_TYPE_P (p)
       || (TREE_CODE (p) == ARRAY_TYPE
           && (!TYPE_NONALIASED_COMPONENT (p)
               || !COMPLETE_TYPE_P (p)
               || TYPE_STRUCTURAL_EQUALITY_P (p)))
       || TREE_CODE (p) == VECTOR_TYPE;
       p = TREE_TYPE (p))

where the comment doesn't exactly match the code - but C should
never have TYPE_NONALIASED_COMPONENT (p).

But maybe I misread the example or it goes wrong elsewhere.

Richard.

> Martin
> 
> > > 
> > > E.g. same_type_for_tbaa has
> > >   type1 = TYPE_MAIN_VARIANT (type1);
> > >   type2 = TYPE_MAIN_VARIANT (type2);
> > > 
> > >   /* Handle the most common case first.  */
> > >   if (type1 == type2)
> > > return 1;
> > > 
> > >   /* If we would have to do structural comparison bail out.  */
> > >   if (TYPE_STRUCTURAL_EQUALITY_P (type1)
> > >   || TYPE_STRUCTURAL_EQUALITY_P (type2))
> > > return -1;
> > > 
> > >   /* Compare the canonical types.  */
> > >   if (TYPE_CANONICAL (type1) == TYPE_CANONICAL (type2))
> > > return 1;
> > > 
> > >   /* ??? Array types are not properly unified in all cases as we have
> > >  spurious changes in the index types for example.  Removing this
> > >  causes all sorts of problems with the Fortran frontend.  */
> > >   if (TREE_CODE (type1) == ARRAY_TYPE
> > >   && TREE_CODE (type2) == ARRAY_TYPE)
> > > return -1;
> > > ...
> > > and later compares alias sets and the like.
> > > So, even if int[] and int[0] have different TYPE_CANONICAL, they
> > > will be considered maybe the same.  Also, guess get_alias_set
> > > has some ARRAY_TYPE handling...
> > > 
> > > Anyway, I think we should just go with Richi's patch.
> > > 
> > >   Jakub
> > > 
> 
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


[PATCH] c++: Allow IS_FAKE_BASE_TYPE for union types [PR114954]

2024-05-06 Thread Nathaniel Shead
Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?

-- >8 --

In some circumstances, unions can also have an __as_base type; we need
to make sure that IS_FAKE_BASE_TYPE correctly recognises this.

PR c++/114954

gcc/cp/ChangeLog:

* cp-tree.h (IS_FAKE_BASE_TYPE): Also apply to unions.

gcc/testsuite/ChangeLog:

* g++.dg/modules/pr114954.C: New test.

Signed-off-by: Nathaniel Shead 
---
 gcc/cp/cp-tree.h|  2 +-
 gcc/testsuite/g++.dg/modules/pr114954.C | 14 ++
 2 files changed, 15 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/modules/pr114954.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 933504b4821..fa24217eb2b 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -2616,7 +2616,7 @@ struct GTY(()) lang_type {
 
 /* True iff NODE is the CLASSTYPE_AS_BASE version of some type.  */
 #define IS_FAKE_BASE_TYPE(NODE)\
-  (TREE_CODE (NODE) == RECORD_TYPE \
+  (RECORD_OR_UNION_TYPE_P (NODE)   \
&& TYPE_CONTEXT (NODE) && CLASS_TYPE_P (TYPE_CONTEXT (NODE))\
&& CLASSTYPE_AS_BASE (TYPE_CONTEXT (NODE)) == (NODE))
 
diff --git a/gcc/testsuite/g++.dg/modules/pr114954.C b/gcc/testsuite/g++.dg/modules/pr114954.C
new file mode 100644
index 000..a9787140808
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/pr114954.C
@@ -0,0 +1,14 @@
+// PR c++/114954
+// { dg-additional-options "-fmodules-ts" }
+// { dg-module-cmi main }
+
+export module main;
+
+template 
+union U {
+private:
+  char a[N + 1];
+  int b;
+};
+
+U<4> p;
-- 
2.43.2



Re: [PATCH] x86: Fix cmov cost model issue [PR109549]

2024-05-06 Thread Uros Bizjak
On Mon, May 6, 2024 at 5:20 AM Hongtao Liu  wrote:
>
> CC uros.
>
> > On Mon, May 6, 2024 at 11:03 AM Kong, Lingling wrote:
> >
> > Hi,
> > (if_then_else:SI (eq (reg:CCZ 17 flags)
> > (const_int 0 [0]))
> > (reg/v:SI 101 [ e ])
> > (reg:SI 102))
> > > The cost for this rtx is 8, of which 4 comes from
> > > (eq (reg:CCZ 17 flags) (const_int 0 [0])). But that is just the comparison
> > > operator of the cmov, so its cost does not need to be computed separately.
> It looks like a reasonable change to me, for cmov, the first operand
> of if_then_else is not a mask.
> >
> > Bootstrapped and regtested on x86_64-pc-linux-gnu.
> > OK for trunk?
> >
> > gcc/ChangeLog:
> >
> > PR target/109549
> > * config/i386/i386.cc (ix86_rtx_costs): The XEXP (x, 0) for cmov
> > is an operator do not need to compute cost.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/i386/cmov6.c: Fixed.

OK.

BTW: I'd like to point out PR85559 [1] that collects some persistent
issues with x86 CMOV insn, especially [2].

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=cmov
[2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56309

Uros.

> > ---
> >  gcc/config/i386/i386.cc   | 2 +-
> >  gcc/testsuite/gcc.target/i386/cmov6.c | 5 +
> >  2 files changed, 2 insertions(+), 5 deletions(-)
> >
> > > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > > index 4d6b2b98761..59b4ce3bfbf 100644
> > --- a/gcc/config/i386/i386.cc
> > +++ b/gcc/config/i386/i386.cc
> > @@ -22237,7 +22237,7 @@ ix86_rtx_costs (rtx x, machine_mode mode, int 
> > outer_code_i, int opno,
> > {
> >   /* cmov.  */
> >   *total = COSTS_N_INSNS (1);
> > - if (!REG_P (XEXP (x, 0)))
> > + if (!COMPARISON_P (XEXP (x, 0)) && !REG_P (XEXP (x, 0)))
> > *total += rtx_cost (XEXP (x, 0), mode, code, 0, speed);
> >   if (!REG_P (XEXP (x, 1)))
> > > *total += rtx_cost (XEXP (x, 1), mode, code, 1, speed);
> > > diff --git a/gcc/testsuite/gcc.target/i386/cmov6.c b/gcc/testsuite/gcc.target/i386/cmov6.c
> > index 5111c8a9099..535326e4c2a 100644
> > --- a/gcc/testsuite/gcc.target/i386/cmov6.c
> > +++ b/gcc/testsuite/gcc.target/i386/cmov6.c
> > @@ -1,9 +1,6 @@
> >  /* { dg-do compile } */
> >  /* { dg-options "-O2 -march=k8" } */
> > -/* if-converting this sequence would require two cmov
> > -   instructions and seems to always cost more independent
> > -   of the TUNE_ONE_IF_CONV setting.  */
> > -/* { dg-final { scan-assembler-not "cmov\[^6\]" } } */
> > +/* { dg-final { scan-assembler "cmov\[^6\]" } } */
> >
> > >  /* Verify that blocks are converted to conditional moves.  */
> > >  extern int bar (int, int);
> > --
> > 2.31.1
> >
>
>
> --
> BR,
> Hongtao


[PATCH-5, rs6000] Replace explicit CC bit reverse with common format

2024-05-06 Thread HAO CHEN GUI
Hi,
  It's the fifth patch of a series of patches optimizing CC modes on
rs6000.

  There are some explicit CR6 bit reverse (mfcr/xor) expansions in vector.md.
As the fourth patch optimized the CC bit reverse implementation, this patch
changes the explicit format to the common format (testing if the bit is not
set). With the common format, it can match different implementations on
different sub-targets. On Power10, it should be setbcr. On Power9, it's isel.
On Power8 and below, it's mfcr/xor.
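
The equivalence the patch relies on can be sketched in plain C. The bit encoding and function names below are a toy model for illustration, not how the backend represents CR6: "test the bit, then xor with 1" always yields the same value as "test that the bit is not set".

```c
#include <assert.h>

/* Toy encoding of one 4-bit CR field; LT is the most significant bit.  */
enum { TOY_CR_LT = 8, TOY_CR_GT = 4, TOY_CR_EQ = 2, TOY_CR_SO = 1 };

/* Old explicit form: extract the bit, then reverse it with xor.  */
static int
bit_set_then_reverse (unsigned cr, unsigned bit)
{
  int r = (cr & bit) != 0;
  return r ^ 1;
}

/* Common form: directly test that the bit is not set.  */
static int
bit_not_set (unsigned cr, unsigned bit)
{
  return (cr & bit) == 0;
}

/* Check both forms agree for every field value and each tested bit.  */
static void
check_forms_agree (void)
{
  for (unsigned cr = 0; cr < 16; cr++)
    {
      assert (bit_set_then_reverse (cr, TOY_CR_LT)
              == bit_not_set (cr, TOY_CR_LT));
      assert (bit_set_then_reverse (cr, TOY_CR_EQ)
              == bit_not_set (cr, TOY_CR_EQ));
    }
}
```

Expressing the second form directly in RTL is what lets each sub-target pick its own instruction for it.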

  Bootstrapped and tested on powerpc64-linux BE and LE with no
regressions. Is it OK for the trunk?

Thanks
Gui Haochen

ChangeLog
rs6000: Replace explicit CC bit reverse with common format

This patch replaces explicit CC bit reverse (mfcr/xor) with the common format
so that it can match setbcr on Power 10, isel on Power 9 and mfcr/xor on
other sub-targets.

gcc/
* config/rs6000/vector.md (vector_ae__p): Replace explicit CC
bit reverse with common format.
(vector_ae_v2di_p): Likewise.
(vector_ae_v1ti_p): Likewise.
(vector_ae__p): Likewise.
(cr6_test_for_zero): Likewise.
(cr6_test_for_lt): Likewise.

gcc/testsuite/
* gcc.target/powerpc/vsu/vec-any-eq-10.c: Replace rlwinm with isel.
* gcc.target/powerpc/vsu/vec-any-eq-14.c: Replace rlwinm with isel.
* gcc.target/powerpc/vsu/vec-any-eq-7.c: Replace rlwinm with isel.
* gcc.target/powerpc/vsu/vec-any-eq-8.c: Replace rlwinm with isel.
* gcc.target/powerpc/vsu/vec-any-eq-9.c: Replace rlwinm with isel.

patch.diff
diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md
index f86c1f2990e..b1bbf9bac2d 100644
--- a/gcc/config/rs6000/vector.md
+++ b/gcc/config/rs6000/vector.md
@@ -942,11 +942,8 @@ (define_expand "vector_ae__p"
  (ne:VI (match_dup 1)
 (match_dup 2)))])
(set (match_operand:SI 0 "register_operand" "=r")
-   (lt:SI (reg:CCLTEQ CR6_REGNO)
-  (const_int 0)))
-   (set (match_dup 0)
-   (xor:SI (match_dup 0)
-   (const_int 1)))]
+   (ge:SI (reg:CCLTEQ CR6_REGNO)
+  (const_int 0)))]
   "TARGET_P9_VECTOR"
 {
   operands[3] = gen_reg_rtx (mode);
@@ -1027,11 +1024,8 @@ (define_expand "vector_ae_v2di_p"
  (eq:V2DI (match_dup 1)
   (match_dup 2)))])
(set (match_operand:SI 0 "register_operand" "=r")
-   (eq:SI (reg:CCLTEQ CR6_REGNO)
-  (const_int 0)))
-   (set (match_dup 0)
-   (xor:SI (match_dup 0)
-   (const_int 1)))]
+   (ne:SI (reg:CCLTEQ CR6_REGNO)
+  (const_int 0)))]
   "TARGET_P9_VECTOR"
 {
   operands[3] = gen_reg_rtx (V2DImode);
@@ -1048,11 +1042,8 @@ (define_expand "vector_ae_v1ti_p"
  (eq:V1TI (match_dup 1)
   (match_dup 2)))])
(set (match_operand:SI 0 "register_operand" "=r")
-   (eq:SI (reg:CCLTEQ CR6_REGNO)
-  (const_int 0)))
-   (set (match_dup 0)
-   (xor:SI (match_dup 0)
-   (const_int 1)))]
+   (ne:SI (reg:CCLTEQ CR6_REGNO)
+  (const_int 0)))]
   "TARGET_POWER10"
 {
   operands[3] = gen_reg_rtx (V1TImode);
@@ -1095,11 +1086,8 @@ (define_expand "vector_ae__p"
  (eq:VEC_F (match_dup 1)
(match_dup 2)))])
(set (match_operand:SI 0 "register_operand" "=r")
-   (eq:SI (reg:CCLTEQ CR6_REGNO)
-  (const_int 0)))
-   (set (match_dup 0)
-   (xor:SI (match_dup 0)
-   (const_int 1)))]
+   (ne:SI (reg:CCLTEQ CR6_REGNO)
+  (const_int 0)))]
   "TARGET_P9_VECTOR"
 {
   operands[3] = gen_reg_rtx (mode);
@@ -1172,11 +1160,8 @@ (define_expand "cr6_test_for_zero"
 ;; integer constant first argument equals one (aka __CR6_EQ_REV in altivec.h).
 (define_expand "cr6_test_for_zero_reverse"
   [(set (match_operand:SI 0 "register_operand" "=r")
-   (eq:SI (reg:CCLTEQ CR6_REGNO)
-  (const_int 0)))
-   (set (match_dup 0)
-   (xor:SI (match_dup 0)
-   (const_int 1)))]
+   (ne:SI (reg:CCLTEQ CR6_REGNO)
+  (const_int 0)))]
   "TARGET_ALTIVEC || TARGET_VSX"
   "")

@@ -1198,11 +1183,8 @@ (define_expand "cr6_test_for_lt"
 ;; (aka __CR6_LT_REV in altivec.h).
 (define_expand "cr6_test_for_lt_reverse"
   [(set (match_operand:SI 0 "register_operand" "=r")
-   (lt:SI (reg:CCLTEQ CR6_REGNO)
-  (const_int 0)))
-   (set (match_dup 0)
-   (xor:SI (match_dup 0)
-   (const_int 1)))]
+   (ge:SI (reg:CCLTEQ CR6_REGNO)
+  (const_int 0)))]
   "TARGET_ALTIVEC || TARGET_VSX"
   "")

diff --git a/gcc/testsuite/gcc.target/powerpc/vsu/vec-any-eq-10.c b/gcc/testsuite/gcc.target/powerpc/vsu/vec-any-eq-10.c
index 30dfc83a97b..9743a496fb5 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsu/vec-any-eq-10.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsu/vec-any-eq-10.c
@@ -15,4 +15,4 @@ test_any_equal (vector unsigned long long *arg1_p,
 }

 /* { dg-final { scan-assembler "vcmpequd." } } */
-/* { dg-final { scan-assembler "rlwinm